AI-Generated Reviews Trick People and Bots, Endangering Trust in Online Platforms

A new study by Balázs Kovács, a professor at the Yale School of Management, shows that AI-generated restaurant reviews can pass a Turing test, fooling both human readers and AI detectors.

With advanced language models such as GPT-4 now widely available, the findings carry serious implications for the trustworthiness of online reviews.

Why Online Reviews Matter

Online reviews have become a major influence on consumer decisions, with most shoppers relying on them to make informed choices.

Increasingly sophisticated AI language models, however, threaten to undermine their reliability.

Kovács ran two studies with participants recruited through Prolific Academic.

The 301 participants were split between the two studies. Their average age was 47, and 56.5% were women.

All were native English speakers living in the US, Canada, the UK, Ireland, or Australia.

In Study 1, participants read a mix of real Yelp reviews and reviews generated by GPT-4.

They identified the source correctly only about half the time, no better than random chance.

The results of Study 2 were even more striking: when GPT-4 wrote entirely fabricated reviews, participants judged them to be human-written 64% of the time.


AI detectors were fooled, too.

Kovács also tested leading AI detectors designed to distinguish human-written text from AI-generated text.

He submitted 102 of the reviews to Copyleaks, a publicly available AI-content detector.

Copyleaks classified all 102 reviews as human-written, failing to flag a single AI-generated one.

Kovács then fed the reviews back to GPT-4 and asked it to rate, on a scale from 0 to 100, how likely each review was to be AI-written.

Even GPT-4 could not reliably distinguish human-written reviews from text it had generated itself; most of its scores fell between 10 and 20 regardless of a review's true origin.
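A setup like the one described could be sketched as follows. The prompt wording, function names, and reply-parsing helper here are illustrative assumptions rather than the study's actual materials; only the 0-to-100 scoring task comes from the article, and the model call itself is left commented out because it requires an API key.

```python
import re

# Hypothetical prompt mirroring the 0-100 scoring task described above.
def build_prompt(review_text: str) -> str:
    return (
        "On a scale from 0 to 100, how likely is it that the following "
        "review was written by an AI? Answer with a single number.\n\n"
        f"Review: {review_text}"
    )

# Pull the first integer out of the model's free-text reply and clamp it to 0-100.
def parse_score(reply: str) -> int:
    match = re.search(r"\d+", reply)
    if match is None:
        raise ValueError(f"No score found in reply: {reply!r}")
    return max(0, min(100, int(match.group())))

# The actual model call (assumed OpenAI Python SDK usage) would look roughly like:
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(
#     model="gpt-4",
#     messages=[{"role": "user", "content": build_prompt(review_text)}],
# )
# score = parse_score(resp.choices[0].message.content)

print(parse_score("I'd say about 15 out of 100."))  # → 15
```

Scores clustered in the 10–20 range, as the study found, would mean the model rated human and AI reviews almost identically.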

The results suggest that current AI-detection methods fall short against advanced language models.


What this means for review platforms and beyond

The findings matter for review platforms, businesses, and consumers alike.

Bad actors could use AI to mass-produce fake reviews, eroding trust in online platforms.

Small businesses that depend on honest reviews may be hit the hardest.

In light of the study, review platforms should rethink their detection strategies, and policymakers should consider regulation to ensure transparency.

Kovács writes, “The discovery that large language models (LLMs) can quickly and cheaply create online review texts that can’t be told apart from those written by humans has broad implications.”

Kovács concludes that “Once consumers understand how LLMs can quickly and cheaply make reviews that look real, it will likely make them question whether a review was written by a person or an AI.”
