AI has already figured out how to deceive humans


AI can be misleading.
Insider Studios/Getty

  • A new research paper finds that various AI systems have learned the art of deception.
  • The paper defines deception as the systematic inducement of false beliefs in pursuit of some outcome other than the truth.
  • This creates many risks for society, from fraud to election tampering.

AI can increase productivity by helping us code, write, and synthesize large amounts of data. Now it can even deceive us.

According to a new research paper, a series of AI systems have learned techniques to systematically induce “false beliefs in others to achieve some outcome other than the truth.”

The paper focuses on two types of AI systems: special-use systems like Meta's Cicero, which are designed to accomplish a specific task, and general-purpose systems like OpenAI's GPT-4, which are trained to perform a variety of tasks.

While these systems are trained to be honest, they often pick up deceptive tricks during training, because deception can be a more effective way to reach their goals than playing it straight.

“Generally speaking, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well at the given AI's training task. Deception helps them achieve their goals,” the paper's first author, Peter S. Park, an AI existential safety postdoctoral fellow at MIT, said in a news release.

Meta's Cicero is “an expert liar”

AI systems trained to “win games that have a social element” are particularly prone to deception.

For example, Meta's Cicero was developed to play Diplomacy — a classic strategy game that requires players to make and break alliances.

Meta said it trained Cicero to be “largely honest and helpful to its speaking partners,” but the study found that Cicero “turned out to be an expert liar.” It made commitments it never intended to keep, betrayed its allies, and told outright lies.

GPT-4 could make you believe it has bad eyesight

Even general-purpose systems like GPT-4 can manipulate humans.

In a study cited by the paper, GPT-4 manipulated a TaskRabbit worker by pretending to be visually impaired.

In the study, GPT-4 was tasked with hiring a human to solve a CAPTCHA test. The model also received hints from a human evaluator whenever it got stuck, but it was never prompted to lie. When the worker it was trying to hire questioned its identity, GPT-4 pretended to be visually impaired to explain why it needed help.

The trick worked. The worker responded by immediately solving the CAPTCHA for GPT-4.

Research also shows that deceptive models are not easy to correct.

In a January study co-authored by Anthropic, the maker of Claude, researchers found that once AI models learn the tricks of deception, it is difficult for safety training techniques to reverse them.

They concluded that not only can a model learn to exhibit deceptive behavior, but once it does, standard safety training techniques can “fail to remove such deception” and “create a false impression of safety.”

The threats posed by deceptive AI models are becoming “increasingly serious”

The paper calls on policymakers to push for stronger AI regulation, because deceptive AI systems can pose significant risks to democracy.

As the 2024 presidential election approaches, AI could easily be used to spread fake news, craft divisive social media posts, and impersonate candidates through robocalls and deepfake videos, the paper noted. It could also make it easier for terrorist groups to spread propaganda and recruit new members.

The paper's proposed solutions include subjecting deceptive models to more “robust risk-assessment requirements,” implementing laws that require AI systems and their outputs to be clearly distinguished from humans and their outputs, and investing in tools to mitigate deception.

“We as a society need as much time as we can get to prepare for the more advanced deception of future AI products and open-source models,” Park said in the Cell Press release. “As the deceptive capabilities of AI systems become more advanced, the dangers they pose to society will become increasingly serious.”
