Hi,

Good to see another discussion to address the proliferation of AI slop.

I agree with the idea of implementing a basic Van Halen test (though I
believe it can be easily subverted if the contributor is not acting in good
faith). It is essentially another imperfect categorization mechanism to
whittle down the PR queue.

Also, I mostly agree with Shahar's suggestion to add an AI step in the CI
workflow. But I think it should not be triggered automatically, as I imagine
it could get very expensive in terms of token consumption. This seems
similar to the Copilot reviews we have in place, except that it acts as a
rough filter before review rather than after. I was also wondering if we
could limit the thinking time (I am not sure if this is feasible, but most
models have a 'low' and 'high' consumption setting in my experience). Once
the completely un-reviewable PRs have been filtered out, the regular review
process with Copilot and a human reviewer can take over.

Thanks,
Sameer Mesiah.

On Mon, 8 Jun 2026 at 19:00, Shahar Epstein <[email protected]> wrote:

> Thanks for bringing it up Ash,
>
> I agree that AI slop has been annoying since the emergence of this
> revolution, and it seems that we still struggle to deal with it.
>
> I'm OK with giving the m&m idea a shot and monitoring how effective it is
> at quickly detecting slop.
>
> Personally, I do believe in the "fighting fire with fire" methodology to
> solve it, which means either:
> - Adding an AI step in the CI that judges the contents of PRs from
> first-time contributors and treats each PR accordingly.
> - Or having a scheduled workflow that does the above in batches (to avoid
> potential abuse by multiple commits).
>
> Both should be possible to implement with our AWS instance. However, if the
> m&m method proves useful - maybe it won't be needed, or we could still have
> it but as a secondary layer.
>
> Jarek - FYI, I use WSL without a browser being configured, so it always
> retries creating PRs without "--web" (now I understand why it always tries
> it in the first place :D).
>
>
> Shahar
>
> On Mon, Jun 8, 2026 at 4:16 PM Ash Berlin-Taylor <[email protected]> wrote:
>
> > Hello everyone,
> >
> > I’d like to start a discussion on a point that I put on the agenda for
> > the Dev Call last Thursday, but that we ran out of time for.
> >
> > I don’t think we should be letting agents open PRs, and we should update
> > our policies to forbid it.
> >
> > First off, two things. 1) This is not about agents or LLM-generated code,
> > just about the act of opening a PR, and 2) we can’t “forbid” it in any
> > real sense, but we can very much make this a brown m&m test[1], and that
> > can feed into a signal to issue triage.
> >
> > Why do I think this is a problem? An agent opening the PR means that the
> > likelihood of the actor behind it following the rest of the instructions
> > is greatly reduced, and with the uptick in volume this check serves as a
> > useful pre-filter for whether there is a motivated human behind the
> > change.
> >
> > I have noticed a number of PRs I’ve reviewed where I don’t think any
> > human has actually looked at the change, and frankly, I’m bored of
> > wasting my time on drive-by AI slop. I am firmly in the camp that humans
> > need to own their change, and the person opening the PR should have at
> > least looked at all the code.
> >
> > One way we could achieve this would be something like this:
> > https://github.com/apache/airflow/pull/68013
> >
> > [1]: https://en.wikipedia.org/wiki/Van_Halen_test
> >
> > -ash
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> >
> >
>
