On Tue, Apr 28, 2026 at 11:00 AM Jeremy Stanley <[email protected]> wrote:
>
> As I'm sure is the case for everyone, the projects I work in are
> under a seemingly unending deluge of vulnerability reports from
> researchers using LLMs to mine for security gold in our software. At
> the same time, we see maintainers on our projects relying on
> LLM-oriented tools to develop fixes for vulnerabilities and compose
> prose for advisories.
>
> While I take a moment to catch my breath, this new Bizarro World
> we're all living in has gotten me thinking about the risks of public
> LLM services to embargoed vulnerability handling workflows and
> traditional coordinated disclosure. The operators of these LLM
> services are known to feed prompts and results back into their
> training data, presumably making it faster and easier for the same
> information to be found later by other users of the same service.
> Would keeping embargoes short help to mitigate related risks of
> parallel rediscovery or outright disclosure to other LLM users? It
> seems to me that there must be some inherent lag in this process,
> but how much?
>
> I'm sorely tempted, both due to the increased volume and the risk of
> premature disclosure, to just assume that any vulnerability reported
> as a result of research using an LLM is trivially discoverable by
> others, and give up trying to pretend there's any point to working
> it under embargo. Similarly, it makes sense to me that patch
> development and descriptive prose shouldn't be produced with LLM
> assistance for any vulnerability that is being worked under an
> embargo.
>
> I can't be the only one who's been pondering this... what positions
> have the rest of you taken?

Anthropic has a long article at [0].  If you scroll down beyond the
explanations for the vulnerabilities the tool found, you land in a
section titled "Suggestions for defenders today".  From that section:

    Think beyond vulnerability finding. Frontier models can also
    accelerate defensive work in many other ways. For example, they can:

      * Provide a first-round triage to evaluate the correctness and
        severity of bug reports;
      * De-duplicate bug reports and otherwise help with the triage
        processes;
      * Assist in writing reproduction steps for vulnerability reports;
      * Write initial patch proposals for bug reports;
      * Analyze cloud environments for misconfigurations;
      * Aid engineers in reviewing pull requests for security bugs;
      * Accelerate migrations from legacy systems to more secure ones;

They seem like good suggestions.  However, if you are going to use AI
to find bugs and perform the triage, then the project must have a very
good and comprehensive set of test cases (both positive and negative).
While this is needed for both Natural Intelligence changes and AI
changes, I believe it is more important for AI changes, because they
will not always get much Natural Intelligence oversight.
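
As a rough illustration of what I mean by positive and negative cases,
here is a minimal sketch in Python.  The parse_length function and its
4096 limit are made up for the example and are not taken from any real
project:

```python
def parse_length(field: bytes, max_len: int = 4096) -> int:
    """Hypothetical parser: decode a 2-byte big-endian length and bound-check it."""
    if len(field) != 2:
        raise ValueError("length field must be exactly 2 bytes")
    value = int.from_bytes(field, "big")
    if value > max_len:
        raise ValueError("declared length exceeds limit")
    return value


def test_accepts_valid_length():
    # Positive case: well-formed input yields the expected value.
    assert parse_length(b"\x00\x10") == 16


def test_rejects_oversized_length():
    # Negative case: out-of-range input must be refused, not silently accepted.
    try:
        parse_length(b"\xff\xff")
    except ValueError:
        return
    raise AssertionError("oversized length was accepted")


def test_rejects_truncated_field():
    # Negative case: malformed (too short) input must also be refused.
    try:
        parse_length(b"\x00")
    except ValueError:
        return
    raise AssertionError("truncated field was accepted")
```

The negative cases are the ones that matter most for a proposed patch:
a fix that passes only the positive test may still accept the input
that triggered the original report.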

[0] Assessing Claude Mythos Preview’s cybersecurity capabilities,
<https://red.anthropic.com/2026/mythos-preview/>.

Jeff
