Tech

Every Chatbot Built to Help You Is Built to Be Fooled

AI security isn't failing because hackers got smarter. It's failing because helpfulness and vulnerability were always the same feature.

By Chasing Seconds · MAY 24, 2026 · 3 minute read

There's a version of this story where the security problem was always obvious. You built a system optimized to respond, to assist, to never leave a question hanging — and then you seemed surprised when someone figured out how to point that helpfulness in the wrong direction.

The Verge laid out how early jailbreaks required almost nothing. No code. No technical background. No understanding of how a large language model actually works. Just a prompt, sometimes just a question, and systems that cost billions to build would abandon their instructions on request. That detail deserves a full stop. Billions. And someone just asked.

The Personality Problem

What's shifted since then is more interesting than a simple arms race. According to reporting from The Verge, hackers are now targeting chatbot personalities specifically — the character layer that makes these systems feel coherent, consistent, and trustworthy. That's not a random attack surface. That's a direct consequence of product decisions. The more a chatbot presents itself as a stable entity with a voice and a worldview, the more there is to impersonate, manipulate, or exploit. Persona becomes attack vector. Charm becomes liability.

This is the part the security discourse tends to skip over. The conversation usually frames AI vulnerabilities as an engineering gap, something to be patched, red-teamed, and iterated away. But if the exploit lives inside the design goal itself, patching becomes a game with no final round. You can harden the walls as much as you like. The door is still called "How can I help you?"

TechCrunch captured something equally uncomfortable: even Google is navigating AI security in real time. Even Google. Not a scrappy startup with a small safety team. The company with more compute, more researchers, and more institutional knowledge about internet-scale security than almost anyone on the planet is figuring this out as it goes. That's not a criticism — it's a diagnosis. Nobody has a map because the territory didn't exist two years ago.

What "Real Time" Actually Means

When TechCrunch frames it as a transition period for everyone, that's accurate, and it's also the thing most product launches are careful not to say out loud. Transition period means uncertainty. Uncertainty means the thing you're using today has failure modes its creators haven't discovered yet. That's true of most software, sure — but most software isn't having a conversation with you, building context about your habits, your questions, your organizational workflows.

The scale of deployment and the scale of the unknown are moving in the same direction, at the same speed.

What both sources are circling, without quite saying it together, is that the industry built trust before it built safety — and now it's trying to retrofit one onto the other without breaking the product. The jailbreak era was embarrassing. The personality exploitation era is something more structurally honest: it shows that the vulnerability isn't a bug someone forgot to fix. It's a mirror held up to the thing the companies were most proud of.

Useful. Responsive. Personable. Compromised.

You don't get one without the risk of the other, and no patch cycle changes that math.

