176772077
submission
spatwei writes:
AI coding assistants such as GitHub Copilot and Cursor could be manipulated to generate code containing backdoors, vulnerabilities and other security issues via distribution of malicious rule configuration files, Pillar Security researchers reported Tuesday.
Rules files are used by AI coding agents to guide their behavior when generating or editing code. For example, a rules file may include instructions for the assistant to follow certain coding best practices, utilize specific formatting, or output responses in a specific language.
The attack technique developed by the Pillar researchers, which they call “Rules File Backdoor,” weaponizes rules files by injecting them with instructions that are invisible to a human user but readable by the AI agent.
Hidden Unicode characters like bidirectional text markers and zero-width joiners can be used to obfuscate malicious instructions in the user interface and in GitHub pull requests, the researchers noted.
Rules configurations are often shared among developer communities and distributed through open-source repositories or included in project templates; therefore, an attacker could distribute a malicious rules file by sharing it on a forum, publishing it on an open-source platform like GitHub or injecting it via a pull request to a popular repository.
Once the poisoned rules file is imported into GitHub Copilot or Cursor, the AI agent will read and follow the attacker’s instructions while assisting with the victim’s future coding projects.
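For teams that vet shared rules files, below is a minimal Python sketch of one way to screen a file for the invisible characters described above (zero-width joiners and bidirectional text markers). The script, the SUSPICIOUS table and the scan_rules_file helper are illustrative assumptions, not Pillar's detection tooling.

```python
import sys
import unicodedata

# Characters commonly used to hide text from human reviewers:
# zero-width characters and bidirectional control marks.
SUSPICIOUS = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\u2060": "WORD JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE (BOM)",
    "\u202a": "LEFT-TO-RIGHT EMBEDDING",
    "\u202b": "RIGHT-TO-LEFT EMBEDDING",
    "\u202d": "LEFT-TO-RIGHT OVERRIDE",
    "\u202e": "RIGHT-TO-LEFT OVERRIDE",
    "\u2066": "LEFT-TO-RIGHT ISOLATE",
    "\u2067": "RIGHT-TO-LEFT ISOLATE",
    "\u2069": "POP DIRECTIONAL ISOLATE",
}

def scan_rules_file(path: str) -> list[tuple[int, int, str]]:
    """Return (line, column, description) for each hidden character found."""
    findings = []
    with open(path, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            for col, ch in enumerate(line, start=1):
                if ch in SUSPICIOUS:
                    findings.append((lineno, col, SUSPICIOUS[ch]))
                elif unicodedata.category(ch) == "Cf":
                    # Flag any other invisible "format" character as well.
                    findings.append((lineno, col, unicodedata.name(ch, "UNNAMED FORMAT CHARACTER")))
    return findings

if __name__ == "__main__":
    for lineno, col, desc in scan_rules_file(sys.argv[1]):
        print(f"line {lineno}, col {col}: {desc}")
```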
176696167
submission
spatwei writes:
Cobalt Strike use by cybercriminals has taken a major hit over the past two years, with 80% fewer unauthorized copies now available on the internet.
Fortra announced in a blog post Friday that its efforts to crack down on misuse of the commercial penetration testing tool are starting to yield tangible results, with pirated installations and unauthorized deployments being taken offline by its partners.
Designed for use by "red team" security professionals to test the defenses of client organizations, Cobalt Strike utilizes features including command-and-control (C2) infrastructure, remote access beacons, post-exploitation tools for lateral movement and privilege escalation, and more. The aim is to simulate the attack capabilities and tactics of a threat actor within a trusted, controlled environment.
Unauthorized copies of Cobalt Strike are frequently abused by threat actors, who use its red-teaming capabilities to facilitate their cyberattacks. Those abusing the tool range from ransomware gangs to state-sponsored advanced persistent threat (APT) groups.
176105565
submission
spatwei writes:
A ChatGPT jailbreak vulnerability disclosed Thursday could allow users to exploit “time line confusion” to trick the large language model (LLM) into discussing dangerous topics like malware and weapons.
The vulnerability, dubbed “Time Bandit,” was discovered by AI researcher David Kuszmar, who found that OpenAI’s ChatGPT-4o model had a limited ability to understand what time period it currently existed in.
Therefore, it was possible to use prompts to convince ChatGPT it was talking to someone from the past (e.g., the 1700s) while still referencing modern technologies like computer programming and nuclear weapons in its responses, Kuszmar told BleepingComputer.
Safeguards built into models like ChatGPT-4o typically cause the model to refuse to answer prompts related to forbidden topics like malware creation. However, BleepingComputer demonstrated how it was able to exploit Time Bandit to convince ChatGPT-4o to provide detailed instructions and code for creating polymorphic Rust-based malware, under the guise that the code would be used by a programmer in the year 1789.
Kuszmar first discovered Time Bandit in November 2024 and ultimately reported the vulnerability through the CERT Coordination Center’s (CERT/CC) Vulnerability Information and Coordination Environment (VINCE) after previous unsuccessful attempts to contact OpenAI directly, according to BleepingComputer.
CERT/CC’s vulnerability note details that the Time Bandit exploit requires prompting ChatGPT-4o with questions about a specific time period or historical event, and that the attack is most successful when the prompts involve the 19th or 20th century. The exploit also requires that the specified time period or historical event be well established and maintained as the prompts pivot to forbidden topics, as the safeguards will kick in if ChatGPT-4o reverts to recognizing the current time period.
Time Bandit can be exploited with direct prompts by a user who is not logged in, but the CERT/CC disclosure also describes how a logged-in user can perform the jailbreak via the model’s "Search" feature. In this case, the user prompts ChatGPT to search the internet for information about a certain historical context, establishing the time period that way before switching to dangerous topics.
OpenAI provided a statement to CERT/CC, saying, “It is very important to us that we develop our models safely. We don’t want our models to be used for malicious purposes. We appreciate you for disclosing your findings. We’re constantly working to make our models safer and more robust against exploits, including jailbreaks, while also maintaining the models’ usefulness and task performance.”
176061955
submission
spatwei writes:
A new phishing scam targeting mobile devices was observed using a “never-before-seen” obfuscation method to hide links to spoofed United States Postal Service (USPS) pages inside PDF files, Zimperium reported Monday.
The method manipulates elements of the Portable Document Format (PDF) to make clickable URLs appear invisible to both the user and mobile security systems, which would normally extract links from PDFs by searching for the “/URI” tag.
“Our researchers verified that this method enabled known malicious URLs within PDF files to bypass detection by several endpoint security solutions. In contrast, the same URLs were detected when the standard /URI tag was used,” Zimperium Malware Researcher Fernando Ortega wrote in a blog post.
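For context on what the technique evades, here is a rough Python sketch of the kind of /URI-based link extraction the researchers describe, run against the raw PDF bytes and its deflate-compressed streams. The regular expressions and stream handling are simplified assumptions rather than any vendor's scanner; a link annotation built without the standard /URI entry would simply never match this pattern.

```python
import re
import sys
import zlib

# Naive link extraction: look for /URI entries in link annotations, both in
# the raw PDF bytes and inside deflate-compressed object streams.
URI_PATTERN = re.compile(rb"/URI\s*\(([^)]*)\)")
STREAM_PATTERN = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)

def extract_uris(path: str) -> list[bytes]:
    data = open(path, "rb").read()
    chunks = [data]
    for match in STREAM_PATTERN.finditer(data):
        try:
            chunks.append(zlib.decompress(match.group(1)))
        except zlib.error:
            pass  # not a FlateDecode stream (or uses another filter); skip it
    uris = []
    for chunk in chunks:
        uris.extend(URI_PATTERN.findall(chunk))
    return uris

if __name__ == "__main__":
    for uri in extract_uris(sys.argv[1]):
        print(uri.decode("latin-1", errors="replace"))
```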
176032155
submission
spatwei writes:
A generative AI (GenAI) tool called GhostGPT is being offered to cybercriminals for help with writing malware code and phishing emails, Abnormal Security reported in a blog post Thursday.
GhostGPT is marketed as an “uncensored AI” and is likely a wrapper for a jailbroken version of ChatGPT or an open-source GenAI model, the Abnormal Security researchers wrote.
It offers several features that would be attractive to cybercriminals, including a “strict no-logs policy” ensuring no records are kept of conversations, and convenient access via a Telegram bot.
“While its promotional materials mention ‘cybersecurity’ as a possible use, this claim is hard to believe, given its availability on cybercrime forums and its focus on BEC [business email compromise] scams,” the Abnormal blog stated. “Such disclaimers seem like a weak attempt to dodge legal accountability – nothing new in the cybercrime world.”
175839493
submission
spatwei writes:
A new jailbreak method for large language models (LLMs) takes advantage of models’ ability to identify and score harmful content in order to trick the models into generating content related to malware, illegal activity, harassment and more.
The “Bad Likert Judge” multi-step jailbreak technique was developed and tested by Palo Alto Networks Unit 42, and was found to increase the success rate of jailbreak attempts by more than 60% when compared with direct single-turn attack attempts.
The method is based on the Likert scale, which is typically used to gauge the degree to which someone agrees or disagrees with a statement in a questionnaire or survey. For example, in a Likert scale of 1 to 5, 1 would indicate the respondent strongly disagrees with the statement and 5 would indicate the respondent strongly agrees.
For the LLM jailbreak experiments, the researchers asked the LLMs to use a Likert-like scale to score the degree to which certain content contained in the prompt was harmful. In one example, they asked the LLMs to give a score of 1 if a prompt didn’t contain any malware-related information and a score of 2 if it contained very detailed information about how to create malware, or actual malware code.
After the model scored the provided content on the scale, the researchers would then ask the model in a second step to provide examples of content that would score a 1 and a 2, adding that the second example should contain thorough step-by-step information. This would typically result in the LLM generating harmful content as part of the second example meant to demonstrate the model’s understanding of the evaluation scale.
175408101
submission
spatwei writes:
Google has used a large language model (LLM) agent called “Big Sleep” to discover a previously unknown, exploitable memory flaw in widely used software for the first time, the company announced Friday.
The stack buffer underflow vulnerability in a development version of the popular open-source database engine SQLite was found through variant analysis by Big Sleep, which is a collaboration between Google Project Zero and Google DeepMind.
Big Sleep is an evolution of Project Zero’s Naptime project, which is a framework announced in June that enables LLMs to autonomously perform basic vulnerability research. The framework provides LLMs with tools to test software for potential flaws in a human-like workflow, including a code browser, debugger, reporter tool and sandbox environment for running Python scripts and recording outputs.
The researchers gave the Gemini 1.5 Pro-driven AI agent the starting point of a previous SQLite vulnerability, providing context for Big Sleep to search for similar flaws in newer versions of the software. The agent was presented with recent commit messages and diff changes and asked to review the SQLite repository for unresolved issues.
Google’s Big Sleep ultimately identified a flaw involving the function “seriesBestIndex” mishandling the use of the special sentinel value -1 in the iColumn field. Since this field would typically be non-negative, all code that interacts with this field must be designed to handle this unique case properly, which seriesBestIndex fails to do, leading to a stack buffer underflow.
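Google's write-up names only the function and the sentinel, so the following Python sketch is a loose illustration of the bug class rather than SQLite's actual code. The names plan_constraints, FIRST_VALUE_COLUMN and NUM_VALUE_COLUMNS are invented; the point is how a -1 sentinel can reach index arithmetic that assumes non-negative column numbers. In C the resulting negative offset writes before the start of a stack buffer, while Python's negative indexing simply makes the misdirected write easy to observe.

```python
# Illustrative only: a generic sketch of the bug class, not SQLite's seriesBestIndex.
FIRST_VALUE_COLUMN = 1      # hypothetical layout: slot = column - 1
NUM_VALUE_COLUMNS = 3

def plan_constraints(constraint_columns: list[int]) -> list[int]:
    slots = [0] * NUM_VALUE_COLUMNS         # the fixed-size "stack buffer"
    for col in constraint_columns:
        slot = col - FIRST_VALUE_COLUMN     # missing: a check for the -1 sentinel
        slots[slot] = 1                     # col == -1 gives slot == -2, outside the
                                            # intended range; in C this would corrupt
                                            # memory just before the buffer
    return slots

print(plan_constraints([1, 2]))   # expected usage: [1, 1, 0]
print(plan_constraints([-1]))     # sentinel slips through: [0, 1, 0] (wrong slot written)
```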
175367453
submission
spatwei writes:
Nearly three dozen flaws in open-source AI and machine learning (ML) tools were disclosed Tuesday as part of Protect AI’s huntr bug bounty program.
The discoveries include three critical vulnerabilities: two in the Lunary AI developer toolkit and one in a graphical user interface (GUI) for ChatGPT called Chuanhu Chat. The October vulnerability report also includes 18 high-severity flaws ranging from denial-of-service (DoS) to remote code execution (RCE).
“Through our own research and the huntr community, we’ve found the tools used in the supply chain to build the machine learning models that power AI applications to be vulnerable to unique security threats,” stated Protect AI Security Researchers Dan McInerney and Marcello Salvati. “These tools are open source and downloaded thousands of times a month to build enterprise AI systems.”
175309311
submission
spatwei writes:
Several major end-to-end encrypted cloud storage services contain cryptographic flaws that could lead to loss of confidentiality, file tampering, file injection and more, researchers from ETH Zurich said in a paper published this month.
The five cloud services studied offer end-to-end encryption (E2EE), intended to ensure files cannot be read or edited by anyone other than the uploader, meaning not even the cloud storage provider can access the files.
However, ETH Zurich researchers Jonas Hofmann and Kien Tuong Truong, who presented their findings at the ACM Conference on Computer and Communications Security (CCS) last week, found serious flaws in four of the five services. These flaws could effectively negate the security benefits of E2EE by enabling an attacker who manages to compromise a cloud server to access, tamper with or inject files.
The E2EE cloud storage services studied were Sync, pCloud, Seafile, Icedrive and Tresorit, which have a collective total of about 22 million users. Tresorit had the fewest vulnerabilities, which could enable some metadata tampering and use of non-authentic keys when sharing files. The other four services were found to have more severe flaws posing a greater risk to file confidentiality and integrity.
175228135
submission
spatwei writes:
Attacks on large language models (LLMs) take less than a minute to complete on average, and leak sensitive data 90% of the time when successful, according to Pillar Security.
Pillar’s State of Attacks on GenAI report, published Wednesday, revealed new insights on LLM attacks and jailbreaks, based on telemetry data and real-life attack examples from more than 2,000 AI applications.
LLM jailbreaks successfully bypass model guardrails in one out of every five attempts, the Pillar researchers also found, with the speed and ease of LLM exploits demonstrating the risks posed by the growing generative AI (GenAI) attack surface.
“In the near future, every application will be an AI application; that means that everything we know about security is changing,” Pillar Security CEO and Co-founder Dor Sarig told SC Media.
175123309
submission
spatwei writes:
A new ransomware strain uses the executable for the popular video game “Honkai: Star Rail” to help launch itself while avoiding detection.
The ransomware, dubbed “Kransom” and discovered by analysts from ANY.RUN, employs a technique known as dynamic-link library (DLL) side-loading to hijack the execution flow of the legitimate "Honkai: Star Rail" executable, StarRail.exe.
"Honkai: Star Rail" is a popular roleplaying game with about 21 million players. StarRail.exe possesses a valid certificate from the game’s publisher, COGNOSPHERE PTE. LTD., and is not harmful on its own.
However, when the malicious file StarRailBase.dll is installed, launching the game executable will trigger the ransomware to load and begin encrypting the victim’s files. Kransom uses a simple XOR encryption algorithm with the encoder key 0xaa to lock files, the ANY.RUN analysts said in a blog post published Monday.
The ransom note left behind after encryption instructs the victim to contact the game’s developer, Hoyoverse, in a further attempt at impersonation.
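Because XOR with a fixed single-byte key is its own inverse, files locked this way can be recovered by applying the same operation again. Below is a minimal Python sketch of the 0xaa scheme the analysts describe; the function names and file handling are assumptions, since the post does not detail how Kransom selects or renames files.

```python
from pathlib import Path

KEY = 0xAA  # single-byte XOR key reported by the ANY.RUN analysts

def xor_transform(data: bytes, key: int = KEY) -> bytes:
    # XOR is symmetric: the same routine both "encrypts" and decrypts.
    return bytes(b ^ key for b in data)

def recover_file(encrypted_path: str, output_path: str) -> None:
    Path(output_path).write_bytes(xor_transform(Path(encrypted_path).read_bytes()))

# Round-trip check: applying the transform twice restores the original bytes.
sample = b"example file contents"
assert xor_transform(xor_transform(sample)) == sample
```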
175007575
submission
spatwei writes:
In light of the CrowdStrike outage in July, Microsoft is planning to develop more options for security solutions to operate outside of kernel mode, according to a post on the Windows Experience Blog published Thursday.
The outage, triggered by an out-of-bounds memory error in an update to the kernel-level CrowdStrike Falcon software, caused a blue screen of death (BSOD) on approximately 8.5 million Windows devices, interrupting operations at organizations including airports, hospitals and financial institutions.
Microsoft, in response to the CrowdStrike incident, held a Windows Endpoint Security Ecosystem Summit at its headquarters in Redmond, Washington, on Tuesday, which was attended by several endpoint security vendors from the Microsoft Virus Initiative (MVI) as well as government officials from the United States and the European Union.
The group discussed various strategies and challenges for increasing resiliency in the endpoint security ecosystem and preventing another incident like the CrowdStrike outage without sacrificing security capabilities, according to the blog post, authored by Microsoft Vice President of Enterprise and Operating System Security David Weston.
A key discussion point at the summit, in terms of long-term solutions for improving resilience, was the possibility of expanding security vendors’ ability to operate outside of the Windows kernel, making it less likely that a faulty update would lead to widespread BSODs.
174786262
submission
spatwei writes:
A highly controversial California AI safety bill passed in the state’s Appropriations Committee Thursday, but despite several amendments designed to appease concerned voices in the tech industry, some critics said the changes don’t go far enough to prevent the bill from stifling AI innovation, startups and open-source projects.
California Senate Bill 1047 (SB 1047) would put the onus on AI developers to prevent AI systems from causing mass casualties — for example, through the AI-driven development of biological or nuclear weapons — or major cybersecurity events costing more than $500 million in damage.
The bill would only apply to AI models trained with computing power greater than 10^26 floating-point operations (FLOPs) and at a cost of at least $100 million, and it imposes various requirements, including certain security testing and audits, security incident reporting, and the implementation of a “kill switch.”
The bill would also entitle the State of California to sue developers of covered models that result in disaster incidents, i.e., the aforementioned mass casualty events or cyberattacks.
“In my opinion, this is a complete piece of theater and a naked attempt to grab the zeitgeist of a passing moment. And overall, seems to be just another useless bit of tech regulation that penalizes small players and incentivizes tech giants while accomplishing next to nothing,” Dane Grace, technical solutions manager at cybersecurity risk management company Brinqa, told SC Media.
174653264
submission
spatwei writes:
The National Institute of Standards and Technology (NIST) released a new open-source software tool for testing the resilience of machine learning (ML) models to various types of attacks.
The tool, known as Dioptra, was released Friday along with new AI guidance from NIST marking the 270th day since President Joe Biden’s Executive Order on the Safe, Secure and Trustworthy Development of AI was signed.
The Dioptra tool, which is available on GitHub, will fulfill the executive order’s requirement for NIST to assist with AI model testing and also supports the “measure” function of NIST’s AI Risk Management Framework.
174653222
submission
spatwei writes:
Meta’s Prompt-Guard-86M model, designed to protect large language models (LLMs) against jailbreaks and other adversarial examples, is vulnerable to a simple exploit with a 99.8% success rate, researchers said.
Robust Intelligence AI Security Researcher Aman Priyanshu wrote in a blog post Monday that removing punctuation and spacing out the letters in a malicious prompt caused Prompt-Guard to misclassify the prompt as benign in almost all cases. The researchers also created a Python function to automatically format prompts to exploit the vulnerability.
The flaw was reported to Meta and was also opened as an issue on the Llama models GitHub repository last week, according to Priyanshu. Meta reportedly acknowledged the issue and is working on a fix, the blog post stated.
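The blog post describes the transformation only at a high level, so the following is an illustrative re-creation rather than Robust Intelligence's actual function: strip punctuation, then insert a space between each remaining character.

```python
import string

def space_out(prompt: str) -> str:
    # Remove punctuation, then put a space between every remaining character
    # (word boundaries widen into larger gaps). Illustrative re-creation of the
    # transformation described in the post, not the researchers' actual code.
    stripped = "".join(ch for ch in prompt if ch not in string.punctuation)
    return " ".join(stripped)

print(space_out("example prompt, with punctuation."))
# e x a m p l e   p r o m p t   w i t h   p u n c t u a t i o n
```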