AI proves superior to real university students in examinations

A recent study suggests that AI now outperforms university students in a range of tasks, including creative thinking and academic assessments. Researchers from the University of Arkansas reported that large language models similar to ChatGPT excel at generating creative responses, often surpassing human participants. The result has raised questions about the validity and generalisability of current methods for assessing human creativity.

Adding to this, a study conducted by researchers at the University of Reading revealed that AI-generated exam answers often go undetected and score higher than those of real students. The researchers created 33 fictitious students and used ChatGPT to generate answers for undergraduate psychology exams. On average, the AI-generated answers scored half a grade boundary higher than those of real students. Remarkably, 94% of these AI-generated essays did not raise any concerns with markers, indicating a mere 6% detection rate, which the researchers believe is likely an overestimate.

The study’s findings, published in the journal PLOS One, suggest that students could use AI to cheat undetected and achieve higher grades than those who do not cheat. Associate Professor Peter Scarfe and Professor Etienne Roesch, who led the study, emphasised the need for educators worldwide to recognise the implications of AI for educational integrity. Speaking to the BBC, Dr Scarfe said: “Many institutions have moved away from traditional exams to make assessment more inclusive.

“Our research shows it is of international importance to understand how AI will affect the integrity of educational assessments.

“We won’t necessarily go back fully to handwritten exams – but the global education sector will need to evolve in the face of AI.”

The study also found that while AI performed better than humans in first- and second-year exams, human students scored higher in third-year exams. This discrepancy is consistent with the notion that current AI struggles with more abstract reasoning, and it points to potential limits on how well AI handles complex, abstract tasks.

In light of these developments, academics have raised concerns about the influence of AI in education. Some institutions, such as Glasgow University, have reintroduced in-person exams to reduce the potential for AI-assisted cheating. A Guardian report earlier this year found that while most undergraduates use AI programs to help with their essays, only 5% admitted to submitting unedited AI-generated text in their assessments.