It is nothing new that generative AIs are being used to produce written works of every kind. In fact, in March of last year we covered the more than 200 books on Amazon written with ChatGPT or a similar generative AI, which were already hard to distinguish from books written by real people. Along the same lines, the writer Jane Friedman has found AI-written books being sold under her own name.
And the scourge of AI-generated text is not limited to literature: it also affects at least 10% of the academic articles written from 2022 to the present. At least, that is what an article in Wired reports, explaining that scientists at the University of Tübingen and Northwestern University have developed a method to detect them.
Overuse of certain words, something the AI has yet to correct
In the study, which is publicly available, the researchers found that generative AIs overuse certain words, a pattern that has grown sharply since large language models competing with ChatGPT began to emerge. The increase appears to have peaked between 2023 and this year, coinciding with the height of the AI boom.
The results of the study show that the frequency of certain common words in scientific articles rose by up to 90% in some cases, with "delves", "showcasing", "underscores" and "potential" standing out as the main markers in the results.
As with natural language, the language of AI also has terms that come into or fall out of use depending on the period, according to the study. Finding all these markers was not easy at first, but once the first ones were identified, progress was rapid and more and more AI-generated articles were uncovered.
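The core idea behind this kind of detection, comparing how often marker words appear before and after the arrival of ChatGPT, can be sketched in a few lines of Python. This is a hypothetical simplification, not the researchers' actual method: the marker list, the toy corpora and the per-1,000-words normalization are all assumptions made for illustration.

```python
from collections import Counter
import re

# Hypothetical marker words; the real study tracks many more terms.
MARKERS = ["delves", "showcasing", "underscores", "potential"]

def word_frequencies(texts):
    """Relative frequency of each marker word per 1,000 words of text."""
    counts = Counter()
    total = 0
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(words)
        total += len(words)
    return {w: 1000 * counts[w] / total for w in MARKERS}

def excess_ratio(pre, post):
    """Ratio of post-era to pre-era frequency; a large ratio flags overuse."""
    return {w: (post[w] / pre[w]) if pre[w] else float("inf")
            for w in MARKERS}

# Toy corpora standing in for pre- and post-ChatGPT article abstracts.
pre_corpus = ["This study examines the potential of method A."]
post_corpus = [
    "This work delves into showcasing results and underscores the potential impact.",
    "The analysis delves deeper and underscores key potential findings.",
]

ratios = excess_ratio(word_frequencies(pre_corpus),
                      word_frequencies(post_corpus))
print(ratios)
```

On these toy corpora, words absent from the pre-era sample ("delves", "showcasing", "underscores") get an infinite ratio, while "potential" stays near its baseline; at scale, the study looks for exactly this kind of disproportionate growth.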
And detecting the use of AI in this kind of writing matters because of the tendency most of these models have to invent data, and thus to help spread hoaxes and misinformation, one of the main points of concern for those who develop them.