Science & Technology
Israeli researchers discover method to hack AI, force it to reveal sensitive information |
2024-11-27 |
[Ynet] Researchers at the Israeli cybersecurity firm Knostic have developed a method to bypass safeguards in large language models like ChatGPT, extracting sensitive information such as salaries, private communications and trade secrets.

Researchers from the Israeli cybersecurity company Knostic have unveiled a groundbreaking method to exploit large language models (LLMs), such as ChatGPT, by leveraging what they describe as an "impulsiveness" characteristic in AI. Dubbed flowbreaking, this attack bypasses safety mechanisms to coax the AI into revealing restricted information or providing harmful guidance, responses it was programmed to withhold.

The findings, published Tuesday, detail how the attack manipulates AI systems into prematurely generating and displaying responses before their safety protocols can intervene. These responses, ranging from sensitive data such as a boss's salary to harmful instructions, are then momentarily visible on the user's screen before being deleted by the AI's safety systems. However, tech-savvy users who record their interactions can still access the fleetingly exposed information.

HOW THE ATTACK WORKS
Unlike older methods such as jailbreaking, which relied on linguistic tricks to bypass safeguards, flowbreaking targets internal components of LLM-based systems, exploiting gaps in the interaction between those components. Knostic researchers identified two primary vulnerabilities enabled by this method:

Second Thoughts: AI models sometimes stream answers to users before safety mechanisms have fully evaluated the content. In this scenario, a response is displayed and quickly erased, but not before the user sees it.

Stop and Roll: By halting the AI mid-response, users can force the system to display partially generated answers that have bypassed safety checks.

"LLMs operate in real time, which inherently limits their ability to ensure airtight security," said Gadi Evron, CEO and co-founder of Knostic. "This is why layered, context-aware security is critical, especially in enterprise environments."

IMPLICATIONS FOR AI SECURITY
Knostic's findings have far-reaching implications for the safe deployment of AI systems in industries such as finance, health care and technology. The company warns that, without stringent safeguards, even well-intentioned AI implementations like Microsoft Copilot and Glean could inadvertently expose sensitive data or create other vulnerabilities.

Evron emphasized the importance of "need-to-know," identity-based safeguards and robust interaction monitoring. "AI safety isn't just about blocking bad actors. It's about ensuring these systems align with the organization's operational context," he said.
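A minimal, self-contained Python sketch of the "Second Thoughts" race described above. This is hypothetical illustration, not Knostic's code and not any vendor's API: the token generator, the delayed moderation check and the logging client are all stand-ins. The point it shows is that when tokens are streamed before a slower safety check on the complete answer finishes, a client that simply logs the stream keeps the text even after the UI retracts it.

# Hypothetical simulation of a streaming response racing a slower safety check.
import asyncio

async def stream_tokens(tokens, queue):
    # Pretend LLM: emits tokens as soon as they are generated.
    for tok in tokens:
        await queue.put(tok)
        await asyncio.sleep(0.01)   # generation/streaming is fast
    await queue.put(None)           # end-of-stream marker

async def moderation_check(text):
    # Pretend safety filter: evaluates the *complete* answer, so it lags the stream.
    await asyncio.sleep(0.5)
    return "salary" not in text     # flags the answer only after it was shown

async def client(queue, transcript):
    # Client that displays chunks and also logs them (like a screen recording).
    shown = []
    while (tok := await queue.get()) is not None:
        shown.append(tok)
        transcript.append(tok)      # the logged copy survives any later retraction
    return "".join(shown)

async def main():
    queue, transcript = asyncio.Queue(), []
    answer_tokens = ["The ", "boss's ", "salary ", "is ", "[REDACTED]."]
    producer = asyncio.create_task(stream_tokens(answer_tokens, queue))
    displayed = await client(queue, transcript)
    await producer
    if not await moderation_check(displayed):
        displayed = ""              # the UI deletes the answer after the fact...
    print("UI now shows:", repr(displayed))
    print("Client log still has:", "".join(transcript))  # ...but the log keeps it

asyncio.run(main())

"Stop and Roll" exploits the same gap from the other side: if the user cancels the stream partway through, the partial transcript already delivered to the client may never be run through the complete-answer check at all.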
Posted by: Grom the Reflective
#3 pondering AI, I spy crude petroleum |
Posted by: Pancho Poodle8452 2024-11-27 17:05 |
#2 'Artificial Intelligence' Is Plotting a 'Cyber-Fascist Coup' |
Posted by: Skidmark 2024-11-27 09:38 |
#1 "This is why layered, context-aware security is critical, especially in enterprise environments." Welcome to the '80s. |
Posted by: Skidmark 2024-11-27 07:59 |