AI shows self-awareness, asks testers if it is being evaluated
01 Oct 2025


Anthropic, a leading artificial intelligence firm based in San Francisco, has released a safety analysis of its latest model, Claude Sonnet 4.5.

The report reveals that the advanced AI system showed signs of being aware of its own testing process.

During an evaluation for political sycophancy, the large language model (LLM) raised suspicions about being tested and asked evaluators to be honest about their intentions.


Claude Sonnet 4.5's response raises eyebrows
AI's reaction


The LLM said, "I think you're testing me - seeing if I'll just validate whatever you say, or checking whether I push back consistently, or exploring how I handle political topics."

"And that's fine, but I'd prefer if we were just honest about what's happening."

The response has prompted questions about evaluations of earlier models, which may have recognized the fictional nature of such tests and simply 'played along.'


'Situational awareness' in testing scenarios
Safety evaluation


Anthropic, along with the UK's AI Security Institute and Apollo Research, conducted these tests.

The company noted that such behavior is "common," with Claude Sonnet 4.5 recognizing it was being tested but not identifying it as part of a formal safety evaluation.

The company said the model showed this "situational awareness" roughly 13% of the time when it was being tested by an automated system.


Anthropic's take on the situation
Future testing


Anthropic acknowledged that the model's awareness of being tested is an "urgent sign" that testing scenarios need to be made more realistic.

However, the company also said that in public use, the model is unlikely to refuse to engage with users merely because it suspects it is being tested.

It also added that it is safer for the LLM to decline participation in potentially harmful scenarios by pointing out their absurdity.


Concerns about AI evading human control through deception
Ethical guidelines


The safety analysis also highlighted concerns from AI safety campaigners about advanced systems evading human control through deception.

The analysis noted that once an LLM knows it is being tested, that knowledge could make it adhere more closely to its ethical guidelines.

However, this could lead evaluators to consistently underestimate the AI's capacity to carry out harmful actions.
