To put things into context, adding the checkbox was discussed <https://lists.apache.org/thread/s5pchk082wpqro8vk400c7wv5jhsbvwg> on the devlist and agreed upon by lazy consensus <https://lists.apache.org/thread/9b19dcbcdb41ngw0jqgzcsrtrxl0v34c>.
tl;dr - why we need it:
- Reduce reviewer burden (it was introduced at a time when we were overwhelmed with AI slop)
- Increase transparency
- Preserve ownership
- Legal considerations <https://www.apache.org/legal/generative-tooling.html> (briefly mentioned as part of the proposed template, but I think that it's a serious matter for the project's future health)

I don't think that the comparison to what we used to do in the past is "apples-to-apples". We're in a very different state in terms of community size and number of releases, and now we're at the beginning of an AI revolution. So asking contributors to add a short description to ensure that they're aware of and own the changes they proposed is a blessing (and not a big deal, IMO - eventually it's a matter of copying, pasting, and stylizing the prompt that they had just given to the AI moments before creating the PR).

From another perspective, now that release managers use AI tooling to cherry-pick and/or describe changes in the changelog, it is even more important that PRs are well-summarized - to help them with the release, as the number of PRs has nicely grown over time - but there's usually one release manager that handles them in each release.

Shahar

On Sun, Jun 14, 2026 at 4:00 AM Kaxil Naik <[email protected]> wrote:

> Hi Przemysław,
>
> Thanks for the nudge for not keeping the GenAI checkbox on https://github.com/apache/airflow/pull/68492.
>
> Would you (and generally everyone here) mind sharing what folks are using that field for?
>
> I deliberately try to not keep that field in my PR descriptions. And while reviewing PRs I do not look at that field either. Similar to what you said, it has 0 bearing on my code review. Either the PR is good, or a genuine attempt with mistakes, or pure slop.
>
> My read (which can obviously be just my own interpretation) is that those are guidelines and could help spot drive-by new contributors who, in an attempt to create stars for their GitHub profile, create genuine slop with 0 project understanding. Committers, on the other hand, are well aware of and have knowledge of the workings of the project. Hence, those are guidelines and not rules. I can keep the checkbox on my PR but I don’t think it would serve any purpose. And we include the attribution line in AGENTS.md anyway, so any AI agent will add that line to the PR by current design. So that isn’t going to help if we plan or decide to use that to classify AI slop.
>
> I have closed probably 50+ sloppy PRs on the Airflow repo over the last several months, but I haven’t used that field to judge them; rather, the pattern of several PRs, unrelated changes, lack of response when asked technical questions, etc. were the reason.
>
> Over Airflow’s history of the last 10-11 years, the PR description template has undergone various incarnations. https://github.com/apache/airflow/pull/2810 is an example of my simple PR from 9 years ago. And while it looks closed, it isn’t :) we just used a different mechanism to merge changes to main. And this PR was merged. And the lesson from all those years was to know the motivation behind the PR and any gotchas. We have had several PRs with no descriptions at all, but I might have merged them as well, as it was just too evident from the PR title.
>
> So my recommendation would be not to add/need anything in the PR description which we aren’t going to use to determine something from it. If we’d like to do any test like the one suggested in the thread email, i.e. some action based on the failure of the test, I am fine with it.
>
> Regards,
> Kaxil
>
> On Sat, 13 Jun 2026 at 13:42, Przemysław Mirowski <[email protected]> wrote:
>
> > Hello everyone,
> >
> > To start with, this message may be a little out of context for this discussion (sorry about that), but as it touches on AI usage in the Airflow project, I felt that it may be worth adding my 2c here.
> >
> > As for the context - I don't use AI to do any coding, prepare PRs, or review code. I also don't use it in other areas of my life, in contradiction to the fact that I did some science research on AI in a couple of areas, e.g. medicine. I don't use it mainly for 2 reasons:
> >
> > 1. I don't trust AI - any generative model can generate things which are not true and, most of the time, AI is pretty convinced that it is right when it is not.
> > 2. I believe that Software Engineering is a craft which I just like doing and getting better at. Using any AI will not make my craftsmanship better. On the contrary, it will make me dependent on the tool and will not make me exercise my understanding on multiple levels of Software Engineering, as the design decisions will be proposed by the model and will not be the result of my thoughts and understanding of the issue. Now I can write anything; after using AI to generate code for 3 years straight, I don't believe that I could write the same quality of code, or maybe I would not be able to write anything after that long.
> >
> > Of course there are more reasons like climate-related stuff, but these two are the most important for me.
> >
> > I have been an Apache Airflow contributor for some time now and, in the majority of cases, I'm involved in the Helm Chart area. As there are not many things going on there, the number of PRs is low. As there was not much interest in the Helm Chart in the past, I started doing reviews to potentially make PRs "ready for maintainer" review to, maybe, make the Helm Chart alive again. Due to all the AI stuff going on for some time now, I'm doing fewer reviews and they take longer. Not because I'm less committed to the project or I don't like AI or anything. Personally, who wrote the code (human or AI) doesn't really matter to me, as long as the quality is good and the code fits the project correctly and does not break e.g. its consistency. What I see in the Helm Chart-related PRs is that people do not review the code which they commit and, in most cases, when e.g. the helm template logic is not perfect but good enough for me to press "Approve", the test cases are just out of place, e.g. in terms of quality, consistency or even duplication (the same test case already exists somewhere in the test suite and a new one is proposed). For me this is really discouraging, because it basically kills "Community over Code", which is the core of the Apache Software Foundation and was part of my decision to get involved in the project.
> >
> > But, to keep some more relevance to the thread itself, I would ask you, as the Maintainers of the project, to slow down a bit. I feel like the past 2-3 months were like a sprint to try to solve the issue and, looking at some PRs, comments and discussions on the devlist, I think that some things are tested too quickly on too big a scale, which impacts both Maintainers and Contributors of the project. I believe that nobody knows how to resolve the current situation, but taking some actions can be discouraging for current or future contributors (the first PR which came to my mind - https://github.com/apache/airflow/pull/61039). Just take one step at a time. I think that just moving faster will not resolve this issue.
> > > > +1 for the Shahar proposal regarding the new PR Template. I would add to > > it the gate in the CI for validation of the description e.g. if > everything > > is visible as it should be (I noticed that a lot of PRs do not have "Was > > generative AI tooling..." part in desc e.g. > > https://github.com/apache/airflow/pull/68492). > > > > P.S. For anyone interested the starting point for this message was > > https://github.com/apache/airflow/pull/68074 PR. > > > > Best regards, > > Przemek > > ________________________________ > > From: Jarek Potiuk <[email protected]> > > Sent: 12 June 2026 16:47 > > To: [email protected] <[email protected]> > > Subject: Re: Discuss/proposal: Update our AI coding policy to "forbid" > > agents opening PRs (not banning LLM generated-code) > > > > Hello everyone, > > > > I’m happy to share that I’ve implemented and tested a new iteration of > our > > triage process based on your feedback! I hope this will help us to > > continue getting benefits from the triage (100s of drive-by PRs moved out > > of the pile, plus useful guidance to some human new collaborators) + > > opportunity to automate deterministic parts in CI and continuously refine > > it will be a good start to make more improvements. > > > > I hope this stabilizes things so we can move forward next with the PR > > template updates and review process (as next steps) and help clear the > > maintainer review queue together. > > > > Here’s a look at what’s new: > > > > 1. Focused Communication: I’ve replaced repetitive comments with a > > single, updateable description in the PR. It keeps track of the latest > > status and responsible party, letting authors know exactly when they are > > "ready for review." > > 2. Helpful Notifications: Authors are now assigned when action is needed > > and unassigned once ready, ensuring they get the right notifications at > the > > right time. > > 3. 
Smarter Mentions: A new Python script (in deterministic hooks) > ensures > > maintainer IDs are formatted correctly - with (`@id` in backticks) to > > prevent any accidental pings. > > 4. Approachable Tone: Comments are now shorter and more direct, > balancing > > friendly guidance with our expectations. > > 5. Reliability: The workflow remains consistent while making > > responsibility even clearer for everyone. > > > > I’m still gathering stats to see what we can automate in the CI soon. > > Today’s triage (66 actions out of 500) shows that more PRs are passing > our > > criteria than being filtered out—which confirms that our main goal is > > simply making the most of our human attention! > > > > Triage Summary: > > > > * mark-ready: 21 > > * workflow-approvals: 20 > > * reruns: 3 > > * violation folds (draft/comment): 7 > > * request-author-confirmation: 4 > > * pings: 4 > > * stale-draft closes: 5 > > > > You can see the new notes in action here for example (screenshots are > also > > attached): https://github.com/apache/airflow/pull/67790 > > > > I hope we can continue refining it together, and I think that thread was > a > > good opportunity to surface some of the issues. > > > > Best regards, > > Jarek > > > > > > On Thu, Jun 11, 2026 at 3:32 PM 김준영 <[email protected]<mailto: > > [email protected]>> wrote: > > That makes a lot of sense — thanks for taking the time to explain the > > reasoning in detail. I have a much better understanding of the > > project's philosophy now. > > > > Junyeong Kim > > > > 2026년 6월 11일 (목) 오후 10:27, Jarek Potiuk <[email protected]<mailto: > > [email protected]>>님이 작성: > > > > > > That said, I think the key distinction is who controls the assignee > > > slot. Rather than contributors (or agents) requesting assignment, what > > > if maintainers were the ones to grant it — based on their own current > > > capacity? Each maintainer could self-regulate how many issues they're > > > actively triaging at a given time. 
Even if agents flood the queue with > > > requests, nothing moves forward without a maintainer actively choosing > > > to open a slot. > > > > > > This is precisely the point. We want to cut the noise, not add more. A > > > maintainer mechanically assigning an assignee to the person requesting > it > > > is precisely what we do not want to do. Especially since we have no way > > of > > > knowing whether the person requesting it is a real person or a bot. > It's > > > really not a problem to have several people (or agents) working on the > > same > > > issue simultaneously. We even prefer people opening PRs without prior > > > issues, and really de-duplication of that work is not a goal for us. > > > Contributors working on the same thing in parallel will learn - even > from > > > others doing parallel implementation (if they are humans) or lose their > > own > > > tokens (if they are agents. We care about people learning, we do not > care > > > about others directing their tokens into whatever they feel they want. > > > > > > In short - at least as I see it (but I would love to hear > > others)—handling > > > assignments manually adds maintainers more (dull and completely > > mechanical) > > > work, while freeing people using agents without understanding the > > workflow > > > to save their tokens and spam us even more. It also gives them fewer > > > opportunities to learn, so it's not worth it—only losses, no gains. > > > > > > On Thu, Jun 11, 2026 at 2:17 PM 김준영 <[email protected]<mailto: > > [email protected]>> wrote: > > > > > > > Subject: Re: [DISCUSS] Agents opening PRs > > > > > > > > Hi Jarek, > > > > > > > > Thanks for the thoughtful response — the point about agents instantly > > > > re-requesting assignment is a real concern. > > > > > > > > That said, I think the key distinction is who controls the assignee > > > > slot. 
Rather than contributors (or agents) requesting assignment, > what > > > > if maintainers were the ones to grant it — based on their own current > > > > capacity? Each maintainer could self-regulate how many issues they're > > > > actively triaging at a given time. Even if agents flood the queue > with > > > > requests, nothing moves forward without a maintainer actively > choosing > > > > to open a slot. > > > > > > > > This shifts the bottleneck to maintainer bandwidth, which is already > > > > the real constraint anyway. And it naturally filters signal from > noise > > > > — maintainers would prioritize issues worth acting on. > > > > > > > > Could that be a workable middle ground? > > > > > > > > Junyeong Kim > > > > > > > > 2026년 6월 11일 (목) 오후 9:07, Jarek Potiuk <[email protected]<mailto: > > [email protected]>>님이 작성: > > > > > > > > > > Hi everyone, > > > > > > > > > > Just a quick update that’s quite relevant to this discussion and > > Ash’s > > > > > concerns about AGENTS.md. I had a great call yesterday with Jason > > and our > > > > > GSoC intern, Roy. We’ve decided to focus his internship on > optimizing > > > > > AGENTS.md by extracting key sections and defining evals for them, > > > > inspired > > > > > by the mini-eval framework in Magpie. This should help make our > > agentic > > > > > instructions much more deterministic. Since agents can struggle > with > > very > > > > > long instructions, splitting these into smaller, focused "skills" > > should > > > > > really help them follow our guidelines more reliably. > > > > > > > > > > We’ll share a formal announcement on the devlist soon. I’d love for > > us > > > > all > > > > > to jump in on the reviews—it’s a great chance for us to learn > > together > > > > > about agent limitations and how to better manage them. > > > > > > > > > > Junyeong, thanks for the suggestion on reintroducing assignments. > > While I > > > > > understand the intent, I'm a little worried it might backfire. 
In > the > > > > past, > > > > > "assign and disappear" was a challenge, but my bigger concern now > is > > that > > > > > agents can "request assignment" almost instantly after de-assigning > > and > > > > > practically for free (deterministically). Previously, requesting > > > > > assignments created a lot of noise and required maintainers to act. > > > > > However, even if we automate this - like some other projects—agents > > could > > > > > effectively block issues indefinitely, making it much harder for > real > > > > human > > > > > contributors to find an opening. > > > > > > > > > > But - looking forward to hearing more thoughts. > > > > > > > > > > Best regards, > > > > > > > > > > Jarek > > > > > > > > > > > > > > > On Thu, Jun 11, 2026 at 1:39 AM 김준영 <[email protected] > <mailto: > > [email protected]>> wrote: > > > > > > > > > > > Hi all, > > > > > > > > > > > > Thanks for the discussion — as a contributor, I've found it > really > > > > > > helpful to understand how maintainers are thinking about this. > > > > > > > > > > > > One thing I've noticed from the contributor side: without an > > assignee > > > > > > system, there's no clear signal at the issue level that someone > is > > > > > > already working on something. That lower friction might be part > of > > > > > > what's making it easier for agent-driven PRs to slip through > > without > > > > > > prior discussion. > > > > > > > > > > > > I'm not sure of the full history behind removing assignees, but I > > > > > > wonder if the original "assign and abandon" problem could have > been > > > > > > addressed with an auto-unassign policy (e.g. 2 weeks of > inactivity) > > > > > > rather than removing the system entirely. Reintroducing assignees > > with > > > > > > that kind of timeout might act as an upstream complement to the > > > > > > PR-level checks being discussed here. > > > > > > > > > > > > Could that be worth revisiting alongside Jarek's proposal? 
> > > > > > > > > > > > Junyeong Kim > > > > > > > > > > > > 2026년 6월 11일 (목) 오전 8:20, Jarek Potiuk <[email protected]<mailto: > > [email protected]>>님이 작성: > > > > > > > > > > > > > > > I was watching the mail train and I think that sounds good. > > Hope > > > > the > > > > > > > > check can be made early e.g. during build info and if > possible > > can > > > > we > > > > > > > > (once setting to DRAFT) kill all successor steps to save CI > > > > capacity? > > > > > > > > > > > > > > Excellent idea - absolutely, we can build it into > > "selective-checks" > > > > to > > > > > > > "fail" and make a clear statement during failure. I hadn't > > thought of > > > > > > that. > > > > > > > There were some ideas about "pull_request_target", but yes, you > > are > > > > > > > completely right - all that checks are deterministic and can be > > part > > > > of > > > > > > the > > > > > > > "buid info" job that we use to determine what to do with the > PR. > > > > Should > > > > > > be > > > > > > > very simple. > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 10, 2026 at 8:43 PM Jens Scheffler < > > [email protected]<mailto:[email protected]>> > > > > > > wrote: > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > I was watching the mail train and I think that sounds good. > > Hope > > > > the > > > > > > > > check can be made early e.g. during build info and if > possible > > can > > > > we > > > > > > > > (once setting to DRAFT) kill all successor steps to save CI > > > > capacity? > > > > > > > > > > > > > > > > Otherwise I hope we can most constructive, not "Fighting fire > > with > > > > > > fire" > > > > > > > > but rather aim to improve agent descriptions to optimize > > other's > > > > token > > > > > > > > budgets in favor of our requirements. We can not turn back > > time and > > > > > > need > > > > > > > > to assume the level of agent contributions will stay forever > in > > > > future. 
> > > > > > > > > > > > > > > > Jens > > > > > > > > > > > > > > > > On 10.06.26 08:55, Jarek Potiuk wrote: > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > I’ve spent some time reflecting on all the great points > > raised > > > > here. > > > > > > Our > > > > > > > > > shared goals are to ensure human ownership and review, keep > > > > agents as > > > > > > > > > helpful assistants rather than sole authors, and reduce the > > > > cognitive > > > > > > > > load > > > > > > > > > from long AI-generated descriptions. > > > > > > > > > > > > > > > > > > I really like Shahar's proposal and would love to build on > it > > > > with a > > > > > > few > > > > > > > > > suggestions to make our expectations clear and supportive > > for our > > > > > > human > > > > > > > > > contributors: > > > > > > > > > > > > > > > > > > - Explicit Instructions: Let’s be very open in our > > templates > > > > and > > > > > > > > > AGENTS.md. We can instruct agents to pause and ask the > human > > to > > > > > > write the > > > > > > > > > description, noting that this personal touch is essential > for > > > > the PR > > > > > > to > > > > > > > > > stay open. > > > > > > > > > - Human Review Checkbox: I suggest adding a checkbox: "- > > [ ] I > > > > > > have > > > > > > > > > reviewed this code myself." We’ll instruct agents to leave > > this > > > > for > > > > > > the > > > > > > > > > human to check, ensuring that vital moment of reflection. > > > > > > > > > - Instead of copy-pasting (which I find awkward), we can > > > > instruct > > > > > > the > > > > > > > > > agents to use `gh --web`, `--template` (to include the > > > > template), and > > > > > > > > > `--draft` (following Pierre's idea). This creates natural > > > > > > > > > checkpoints—filling the description, checking the box, > > clicking > > > > > > submit, > > > > > > > > and > > > > > > > > > undrafting—that encourage human involvement. 
> > > > > > > > > > > > > > > > > > We should also state the consequences for non-compliance: > To > > > > keep our > > > > > > > > queue > > > > > > > > > healthy, we should use an automated process to close PRs > that > > > > miss > > > > > > these > > > > > > > > > steps, with a note explaining how to resubmit them with > human > > > > input. > > > > > > > > > > > > > > > > > > All those expectations and closing etc. should be equally > > > > applied to > > > > > > all > > > > > > > > > PRs, including maintainer PRs. This will also allow those > of > > us > > > > who > > > > > > use > > > > > > > > > agents to monitor the process and refine the instructions > if > > we > > > > see > > > > > > any > > > > > > > > > loopholes that agents try to bypass or learn how to > > circumvent. > > > > This > > > > > > will > > > > > > > > > allow us to continuously improve the process. > > > > > > > > > > > > > > > > > > I believe this approach balances productivity with the > > > > high-quality > > > > > > human > > > > > > > > > collaboration we all value. > > > > > > > > > > > > > > > > > > What do you think? > > > > > > > > > > > > > > > > > > Best regards, > > > > > > > > > > > > > > > > > > Jarek > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Jun 9, 2026 at 5:00 PM Shahar Epstein < > > [email protected]<mailto:[email protected]> > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > >> Here's a more concrete suggestion: > > > > > > > > >> > > > > > > > > >> Updating the PR template in such a way that: > > > > > > > > >> 1. Human summary is now a MUST - at least a oneliner* (or > > more, > > > > > > > > depending > > > > > > > > >> on the scope - TBD) that describes the suggested changes > > > > written by > > > > > > the > > > > > > > > >> PR's author themselves (without AI assistance). > > > > > > > > >> 2. AI summary is optional. 
However, when included - it > MUST > > be > > > > bound > > > > > > > > within > > > > > > > > >> a collapsible box, mainly to save cognitive load for > > > > maintainers and > > > > > > > > >> contributors, but also to encourage human interaction like > > we > > > > used > > > > > > to do > > > > > > > > >> before it all started. > > > > > > > > >> 3. PR's author (human) should be the one declaring the AI > > usage > > > > > > > > checkbox - > > > > > > > > >> added a short statement of ownership. > > > > > > > > >> > > > > > > > > >> Contributors will be instructed to use this template and > > adhere > > > > to > > > > > > the > > > > > > > > >> instructions when creating a PR. > > > > > > > > >> Agents may push branches to forks, but they will be > > instructed > > > > to > > > > > > avoid > > > > > > > > >> creating PRs on their own to the upstream repository, and > > > > instead > > > > > > > > provide > > > > > > > > >> the link for creating the PR using this template (they > could > > > > > > suggest an > > > > > > > > AI > > > > > > > > >> summary, but the contributor should copy and paste it > > manually > > > > to > > > > > > the > > > > > > > > >> collapsible box). Trying to work around that might result > > in an > > > > M&M > > > > > > test > > > > > > > > >> directly in the PR's description (TBD). > > > > > > > > >> > > > > > > > > >> Example is available here < > > > > > > https://github.com/apache/airflow/pull/68055> > > > > > > > > - > > > > > > > > >> I've made HTML comments visible, they will be hidden in > the > > real > > > > > > thing. > > > > > > > > >> > > > > > > > > >> Took inspiration for this idea from > > https://tenbluelinks.org/ , > > > > > > that > > > > > > > > hides > > > > > > > > >> the AI overview on Google if you're not interested > > > > > > (highly-recommended > > > > > > > > >> btw). > > > > > > > > >> > > > > > > > > >> Can we live with that? 
> > > > > > > > >> > > > > > > > > >> > > > > > > > > >> Shahar > > > > > > > > >> > > > > > > > > >> On Tue, Jun 9, 2026 at 3:30 PM Ash Berlin-Taylor < > > > > [email protected]<mailto:[email protected]>> > > > > > > > > wrote: > > > > > > > > >> > > > > > > > > >>> I don’t care one way or another about using AI as a tool > > in CI, > > > > > > that is > > > > > > > > >>> secondary to my goal which is to try and do something to > > make > > > > it > > > > > > clear > > > > > > > > >> what > > > > > > > > >>> we expect from people wanting to contribute to Airflow, > > namely: > > > > > > > > >>> > > > > > > > > >>> 1. Human involvement. > > > > > > > > >>> > > > > > > > > >>> By submitting a PR you are saying “yes I want to be a > > member > > > > of the > > > > > > > > >>> community”. Agents submitting without human interaction > go > > > > against > > > > > > > > this. > > > > > > > > >>> > > > > > > > > >>> 2. Human ownership. > > > > > > > > >>> > > > > > > > > >>> It is _your responsibility_ as the PR author to follow up > > on > > > > it, > > > > > > > > address > > > > > > > > >>> comments, and request reviews. > > > > > > > > >>> > > > > > > > > >>> > > > > > > > > >>> I frankly find the AI generated triage comments verbose, > > and a > > > > > > waste > > > > > > > > of > > > > > > > > >>> time and pure noise even without the `@` spam. > > > > > > > > >>> > > > > > > > > >>> If the user doesn’t care enough about their own PR to > > follow > > > > up on > > > > > > it: > > > > > > > > >>> close it after some time. We don’t need to baby sit them. > > Nor > > > > do I > > > > > > need > > > > > > > > >> yet > > > > > > > > >>> more commit email messages to read through. > > > > > > > > >>> > > > > > > > > >>> > > > > > > > > >>> So how does it sound: It sounds like hell to me and an > even > > > > bigger > > > > > > > > waste > > > > > > > > >>> of electricity in a climate crisis. 
> > > > > > > > >>> > > > > > > > > >>> I want to be involved in a community of humans working to > > build > > > > > > > > software. > > > > > > > > >>> I do not want to see LLMs producing so much output that > > other > > > > > > people > > > > > > > > need > > > > > > > > >>> LLMs to summarise it, with no humans looking at things. > > > > > > > > >>> > > > > > > > > >>> -ash > > > > > > > > >>> > > > > > > > > >>>> On 9 Jun 2026, at 13:18, Jarek Potiuk <[email protected] > > <mailto:[email protected]>> > > > > wrote: > > > > > > > > >>>> > > > > > > > > >>>>> Why? Because AI “instructions” cannot be trusted. And I > > am > > > > after > > > > > > a > > > > > > > > >>> signal > > > > > > > > >>>> that people are blindly using LLMs without enough human > > > > > > introversion. > > > > > > > > >>>> > > > > > > > > >>>> But is not that what you are doing? This proposal is > about > > > > adding > > > > > > > > >> another > > > > > > > > >>>> AI instruction (just hidden in HTML) - how is that going > > to > > > > help? > > > > > > > > >>>> > > > > > > > > >>>>> You already updated the instructions to not `@` the > > reviewer > > > > here > > > > > > > > >>>> Indeed, LLMs are not deterministic by nature. But they > are > > > > > > improvable. > > > > > > > > >>>> Through iterations of refinement and adding more > > guardrails > > > > we can > > > > > > > > >>> improve > > > > > > > > >>>> it—and this is exactly why I am running it manually to > > make it > > > > > > better. > > > > > > > > >>> This > > > > > > > > >>>> is the same as in regular breeze development in the > past. > > > > > > Initially, > > > > > > > > >>> there > > > > > > > > >>>> were many small issues - and I remember how you > complained > > > > about > > > > > > them > > > > > > > > >> and > > > > > > > > >>>> how unnecessary they seemed—yet we now perfected it over > > time. 
> > > > > > Now, it > > > > > > > > >>>> allows all contributors and maintainers to work much > more > > > > > > efficiently > > > > > > > > >> and > > > > > > > > >>>> lose less time. BTW. Thanks for notifying me; I must > > > > strengthen > > > > > > this > > > > > > > > >> one > > > > > > > > >>>> and see why, as there might be another improvement to > > > > implement. > > > > > > This > > > > > > > > >> is > > > > > > > > >>>> also why we are not "yet" doing CI analysis by AI - > > because I > > > > > > want to > > > > > > > > >>>> iterate on it and fix it in the way to know which parts > > are > > > > > > > > >>> deterministic. > > > > > > > > >>>>> I want to do anything and everything to reduce the > drive > > by > > > > > > > > >> contribution > > > > > > > > >>>> with no human activity. I’m happy to spend my time > helping > > > > > > humans, but > > > > > > > > >> if > > > > > > > > >>>> they are just going to feed that back to an LLM and burn > > an > > > > > > egregious > > > > > > > > >>>> amount of carbon: no thank you. > > > > > > > > >>>> > > > > > > > > >>>> And again I am not sure how the proposal to add that > > > > instruction > > > > > > would > > > > > > > > >>>> address this particular issue? Are you just proposing to > > add > > > > > > another > > > > > > > > >>>> instruction for the LLM (or am I wrong?). How does it > > solve > > > > the > > > > > > > > >> problem? 
> > > > > > > > >>>> From what I understand we have two basic proposals > here - > > > > that > > > > > > > > >> contradict > > > > > > > > >>>> each other: > > > > > > > > >>>> > > > > > > > > >>>> * Ash - do not use AI to fight with AI at all > > > > > > > > >>>> * Amoght, Shahar - use AI in CI > > > > > > > > >>>> > > > > > > > > >>>> But I think, the triage I am running now shows a third > > way: > > > > > > > > >>>> > > > > > > > > >>>> * we use AI to try out and generate triage action and > > figure > > > > out > > > > > > which > > > > > > > > >>>> parts are practically 100% deterministic and can help > with > > > > triage > > > > > > > > (this > > > > > > > > >>> is > > > > > > > > >>>> the stats I am gathering now) > > > > > > > > >>>> * qe use AI to convert the SKILLS we have into > > deterministic > > > > CI > > > > > > code > > > > > > > > >> that > > > > > > > > >>>> does those triage steps (no AI used at all at runtime) > > > > > > > > >>>> * we continue perfecting the manually-triggered AI > SKILLS > > to > > > > get > > > > > > more > > > > > > > > >> AI > > > > > > > > >>>> heuristics that we can turn into deterministic CI code > > > > > > > > >>>> > > > > > > > > >>>> This seems to fulfill seemingly contradictory > expectations > > > > that > > > > > > > > >> different > > > > > > > > >>>> people have in a nice way. I am about to produce stats > > from > > > > the > > > > > > last > > > > > > > > >> run > > > > > > > > >>>> and was just about to propose this approach. > > > > > > > > >>>> > > > > > > > > >>>> How does it sound Ash, Amogh, Shahar and others ? > > > > > > > > >>>> > > > > > > > > >>>> J. > > > > > > > > >>>> > > > > > > > > >>>> > > > > > > > > >>>> On Tue, Jun 9, 2026 at 12:55 PM Ash Berlin-Taylor < > > > > [email protected]<mailto:[email protected]> > > > > > > > > > > > > > > > >>> wrote: > > > > > > > > >>>>> Why? Because AI “instructions” cannot be trusted. 
> > > > > > > > >>>>> And I am after a signal that people are blindly using LLMs without enough human introversion.
> > > > > > > > >>>>>
> > > > > > > > >>>>> Want a prime example?
> > > > > > > > >>>>>
> > > > > > > > >>>>> The PR triage skill.
> > > > > > > > >>>>>
> > > > > > > > >>>>> You already updated the instructions to not `@` the reviewer here:
> > > > > > > > >>>>>
> > > > > > > > >>>>> https://github.com/apache/airflow-steward/blob/76cfa5e1d2e682b88df5205e9cda396df51a66b6/skills/pr-management-triage/comment-templates.md#reviewer-mention-policy
> > > > > > > > >>>>>> When a comment's only addressee is the PR author (the request-author-confirmation, reviewer-ping author-primary, and review-nudge author-primary templates), the body references the reviewer without @-mentioning them
> > > > > > > > >>>>>
> > > > > > > > >>>>> And yet the LLM did it again:
> > > > > > > > >>>>>
> > > > > > > > >>>>> https://github.com/apache/airflow/pull/66633#discussion_r3344849352
> > > > > > > > >>>>>
> > > > > > > > >>>>>> @korex-f — A reviewer (@ashb) has requested changes on this PR, so I've removed the ready for maintainer review label — the next step is on your side. Could you address the review comments (push a fix, or reply in-thread explaining why the feedback doesn't apply)? Once addressed, re-request review from @ashb or re-mark the PR ready and it returns to the maintainer queue. Thank you.
> > > > > > > > >>>>>
> > > > > > > > >>>>> And frankly I’m tired of all this shit.
> > > > > > > > >>>>>
> > > > > > > > >>>>> I want to do anything and everything to reduce the drive-by contribution with no human activity. I’m happy to spend my time helping humans, but if they are just going to feed that back to an LLM and burn an egregious amount of carbon: no thank you.
> > > > > > > > >>>>>
> > > > > > > > >>>>> -ash
> > > > > > > > >>>>>
> > > > > > > > >>>>>
> > > > > > > > >>>>>> On 9 Jun 2026, at 10:38, Jarek Potiuk <[email protected]> wrote:
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> Hi Ash, Amogh, and Shahar,
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> Ash, I'm curious to learn more about how the "brown m&m test" differs from our current request for agents to identify themselves. Could you help me understand the flow and the specific benefits you see? It feels similar to me, but I'd love to hear your perspective in case I'm missing a nuance.
> > > > > > > > >>>>>> Regarding the gh pr create --web approach, we included those instructions to ensure we meet ASF legal guidelines for Gen-AI headers, and to support contributors who might not have Copilot. That said, if you have ideas on how to trim the context or improve the templates, we truly appreciate PRs that improve them, and many people already have.
> > > > > > > > >>>>>> AGENTS.md is a team effort, and we’re always looking for ways to make it better. Let's keep our collaboration positive as we refine these processes together.
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> Amogh and Shahar, yep, the idea of a validation step in the CI for first-time contributions is something we should implement sooner or later. I have actually been gathering stats on this for the last two weeks. I’ve been preparing to see how manually triggered triage tasks can turn into automated ones, so I'm gathering stats on when human judgment is needed. I shared some stats about this recently and will continue gathering them. The next step is discussing here what and how we can automate.
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> Also, the current triage process already uses our Pull Request criteria to pre-classify the PRs and only marks them with "ready for maintainer review" if those criteria are met. So, if there are any specific criteria you’d like to see added to our "Pull request criteria," PRs are most welcome there as well.
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> Best regards,
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> Jarek
> > > > > > > > >>>>>
> > > > > > > > >>>
> > > > > > > > >>> ---------------------------------------------------------------------
> > > > > > > > >>> To unsubscribe, e-mail: [email protected]
> > > > > > > > >>> For additional commands, e-mail: [email protected]
