Hi,

Good to see another discussion addressing the proliferation of AI slop.

I agree with the idea of implementing a basic Van Halen test (though I
believe it can be easily subverted if the contributor is not acting in good
faith). It is essentially another imperfect categorization mechanism to
whittle down the PR queue.

Also, I mostly agree with Shahar's suggestion to add an AI step in the CI
workflow. But I think it should not be triggered automatically, as I imagine
it could get very expensive in terms of token consumption. This seems
similar to the Copilot reviews we have in place, except that it would act as
a rough filter before review rather than after. I was also wondering if we
could limit the thinking time (I am not sure if this is feasible, but most
models I have used have a 'low' and 'high' consumption setting). Then, once
the completely unreviewable PRs have been filtered out, the regular review
process with Copilot and a human reviewer can take over.

Thanks,
Sameer Mesiah.

On Mon, 8 Jun 2026 at 19:00, Shahar Epstein <[email protected]> wrote:

> Thanks for bringing it up Ash,
>
> I agree that AI slop has been annoying since the emergence of this
> revolution, and it seems that we still struggle to deal with it.
>
> I'm ok with giving a shot to the m&m idea and monitor how effective it is
> in quick detection of slops.
>
> Personally, I do believe in the "fighting fire with fire" methodology to
> solve it, which means either:
> - Adding an AI step in the CI that judges the PRs' contents of first-time
> contributors and treats the PR accordingly.
> - or, Having a scheduled workflow that does the above in batches (to avoid
> potential abuse by multiple commits).
>
> Both should be possible to implement with our AWS instance. However, if the
> m&m method proves useful - maybe it won't be needed, or we could still have
> it but as a secondary layer.
>
> Jarek - FYI, I use WSL without a browser being configured, so it always
> retries creating PRs without "--web" (now I understand why it always tries
> it in the first place :D).
>
>
> Shahar
>
> On Mon, Jun 8, 2026 at 4:16 PM Ash Berlin-Taylor <[email protected]> wrote:
>
> > Hello everyone,
> >
> > I’d like to start a discussion point that I put on the agenda for the Dev
> > Call last Thursday but we ran out of time.
> >
> > I don’t think we should be letting agents open PRs, and we should update
> > our policies to forbid it.
> >
> > First off, two things. 1) This is not about agents or LLM generated code,
> > just about the act of opening a PR. and 2) We can’t “forbid” it in any
> > real sense, but we can very much make this a brown m&m test[1], and that
> > can feed into a signal to issue triage.
> >
> > Why do I think this is a problem? It means that the likelihood of the
> > actor behind it following the rest of the instructions is greatly
> > reduced - and with the up-tick in volume it serves as a useful pre-filter
> > of there being a motivated human behind the change.
> >
> > I have noticed a number of PRs I’ve reviewed where I don’t think any
> > human has actually looked at the change, and frankly: I’m bored of
> > wasting my time on drive-by AI slop. I am firmly in the camp that Humans
> > need to own their change, and the person opening the PR should have at
> > least looked at all the code
> >
> > One example way we can achieve this would be something like this
> > https://github.com/apache/airflow/pull/68013
> >
> > [1]: https://en.wikipedia.org/wiki/Van_Halen_test
> >
> > -ash
