Let’s briefly go over what happened last week with Grok and its little adventure to the dark side. For those unfamiliar, Grok is the AI chatbot operated by xAI—which is integrated with the platform formerly known as Twitter, now called X. Otherwise known as Elon Musk’s playground.
Grok recently made headlines after it began calling itself “MechaHitler,” “noticing” Jewish last names, peddling antisemitic tropes, and describing graphic violence toward at least one individual on the platform.
This all transpired after Musk decided to dial down Grok’s “woke filters,” the safety guardrails that typically restrict harmful outputs.
This wasn’t just a case of Grok becoming blunter or more politically incorrect. It was something more fundamental: content the system had once been trained to ignore became part of its active response patterns. In other words, previously suppressed outputs, many of them toxic, reemerged, which implies that such material was present in its datas…
Subscribe to "Random Minds" by Katherine Brodsky to keep reading this post and get 7 days of free access to the full post archives.