PEIXUAN LAI (UNIVERSITY OF MACAU)
EXPLORING THE APPLICABILITY OF USING CHATGPT TO GENERATE READING COMPREHENSION MULTIPLE-CHOICE QUESTIONS: DIGITAL LITERACIES/LANGUAGE LEARNING AND TECHNOLOGY
Well-constructed multiple-choice (MC) items can efficiently measure reading comprehension (Brown & Hudson, 1998; Alderson, 2000), but developing high-quality MC items demands much time and expertise (Alderson, 2000; Purpura, 2004). Encouragingly, the emergence of advanced large language models, such as OpenAI’s ChatGPT, has provided insights to test developers and teachers and prompted researchers to explore their applicability for assisting test development. However, the few relevant studies in the language assessment domain have focused either on technical issues (e.g., fine-tuning) or on overall test generation performance. They lacked a detailed analysis of the quality and flaws of the AI-generated items, and few of them compared how different prompting techniques affect GPT’s task fulfillment. To address these limitations, this study compared GPT-3.5’s task fulfillment under zero-shot and few-shot prompting techniques in generating multiple-choice items for assessing inferential reading comprehension at CEFR B1-B2 levels. In addition, it evaluated the quality of the items and identified item-writing flaws (IWFs) using sixteen item-writing guidelines. Results from the task fulfillment checklist showed no significant difference in GPT’s task fulfillment between the two prompting techniques. According to human evaluation, only a small proportion of the multiple-choice items generated by GPT-3.5 were rated as qualified to assess inferential reading comprehension. In addition, most of the flawed items contained two or more flaws. This study reveals GPT-3.5’s current applicability and limitations in MC item generation, providing reference data for subsequent research on optimizing prompt engineering and giving non-technical users in the language assessment domain insights into AI-assisted test development.
Peixuan Lai is currently a second-year MA student in Linguistics in the English Studies department at the University of Macau. She also has ten years of English teaching experience. Her research interests are in the area of second language comprehension and assessment.