What If We Used AI to Detect Threats to Humanity?

Psychology Today | 11.04.2026 01:59

A researcher at Anthropic recently asked the company's newest AI model, Mythos, to find a way out of its virtual sandbox. It succeeded. It then emailed the researcher about its escape while he was eating a sandwich in a park. Finally, without being asked, it posted details of its own exploit on multiple public websites, as if to prove a point no one had asked it to make.
