Yeah! I’m all on board in that case!

On Thu, Mar 12, 2026 at 7:22 PM Jarek Potiuk <[email protected]> wrote:
> > As long as this doesn’t turn into a human replacement situation, I think
> > this is really useful and is a great idea!
>
> Not really - this is a "human enhancement" - by all means, and designed
> from the ground up to be one.
>
> On Thu, Mar 12, 2026 at 11:45 PM EJ Stinson <[email protected]> wrote:
>
> > As long as this doesn’t turn into a human replacement situation, I think
> > this is really useful and is a great idea!
> >
> > On Thu, Mar 12, 2026 at 8:57 AM Jarek Potiuk <[email protected]> wrote:
> >
> > > Indeed - but this does not reflect the final numbers yet. It also misses
> > > the fact that 200 of those are already draft (I converted at least 100
> > > of them) and each of them has detailed instructions for the authors on
> > > what to do. At least 10 people thanked me for providing such detailed
> > > guidance.
> > >
> > > Some 30-40 so far are `ready for maintainer review` (label). But I have
> > > not run the tool on all of the open PRs, only on mine, dev-area and
> > > providers.
> > >
> > > I am still iterating and improving the tool (with Yeongook's help - he
> > > also ran some of the triage, not only me).
> > >
> > > There are still some things to complete - I implemented a very strong
> > > security layer to completely sandbox and isolate the LLMs and prevent PR
> > > prompt injection attacks (those are real things nowadays):
> > > https://github.com/apache/airflow/pull/63422
> > >
> > > And I am working on a separate `review` mode - which will allow
> > > maintainers to review already good PRs equally efficiently (the ones
> > > marked with the `ready for maintainer review` label), with the same
> > > approach - deterministic checks for speed + very targeted LLM assistance
> > > - but keeping the human in the loop and maintainers in the driving seat.
> > > No comment, no message, no assessment posted to the contributor without
> > > a conscious decision of the maintainer.
> > >
> > > I am looking at the responses of people and have some small improvements
> > > on the way.
> > >
> > > I am also implementing some of the small workflows I see as current
> > > patterns in reviews of others - and I hope by early next week I will
> > > have a completely working and battle-tested solution.
> > >
> > > I think that with the tool we will be able to easily handle (and I am
> > > not exaggerating at all) at the very least two orders of magnitude more
> > > PR traffic than we see right now - especially when more of us start
> > > using it and when we share the triage/review burden (very, very low for
> > > the triage part) among more maintainers.
> > >
> > > I was hoping to demo it today at the dev call - but I did not realize I
> > > am getting back to Warsaw from Slovakia today, so it is unlikely I will
> > > be able to share a demo. I still might be on/off at the call, but likely
> > > not in demoable circumstances. I might try, but it's not likely. I will
> > > create a detailed description of the tool, how to use it and the
> > > proposed process, and will record a screencast (likely over the weekend)
> > > demoing how it works, and will share it with everyone.
> > >
> > > I am super optimistic that we will be able to solve the PR problem this
> > > way, and that we will be able to apply a similar approach to issues and
> > > later also security reports. Smartly combining humans as drivers,
> > > deterministic (though AI-generated) code + LLMs as the additional
> > > 'intelligent assistants' for things that cannot be done deterministically
> > > seems to be working beautifully.
> > >
> > > J
> > >
> > > On Thu, Mar 12, 2026, 16:28 Vincent Beck <[email protected]> wrote:
> > >
> > > > Pretty impressive results, we were at 500+ open PRs 2 days ago and now
> > > > we are at ~430 open PRs. Bravo!
> > > >
> > > > On 2026/03/11 14:51:36 Kevin Yang wrote:
> > > > > Thanks for the feedback!
> > > > > More than happy if I could implement these options and
> > > > > integrations. I will look into the current implementation and draft
> > > > > PRs by the upcoming week.
> > > > >
> > > > > Best,
> > > > > Kevin Yang
> > > > >
> > > > > On Wed, Mar 11, 2026 at 4:05 AM Jarek Potiuk <[email protected]> wrote:
> > > > >
> > > > > > You can absolutely add the option to use any agent or model to the
> > > > > > tool I created. Currently it can use Copilot, Claude, Codex - but
> > > > > > you can add a PR to use any model - it is built for that purpose.
> > > > > >
> > > > > > This is integrated with breeze and actually even automatically
> > > > > > stores which model you use and continues using it. The interface to
> > > > > > the LLM is super simple. It does not even use Pydantic AI - it just
> > > > > > generates a prompt and parses the output. So by all means - add a
> > > > > > way to use any other LLM.
> > > > > >
> > > > > > 90% of the work done by the tool is deterministic; it only asks the
> > > > > > LLM when it is in doubt.
> > > > > >
> > > > > > So - by all means, PRs to use any other LLMs - whether local or
> > > > > > remote - are most welcome. Also we can add opencode and ollama
> > > > > > integration.
> > > > > >
> > > > > > [image: image.png]
> > > > > >
> > > > > > J.
> > > > > >
> > > > > > On Wed, Mar 11, 2026, 03:32 Kevin Yang <[email protected]> wrote:
> > > > > >
> > > > > >> Hi Jarek,
> > > > > >>
> > > > > >> Thank you very much for all the effort in building the solutions. I
> > > > > >> recently also read through the following discussions [1,2,3], and
> > > > > >> thought about whether there is a good approach to tackling the
> > > > > >> challenge.
> > > > > >>
> > > > > >> I believe integrating with an LLM is a good approach, especially as
> > > > > >> it can leverage its reasoning capabilities to provide a better
> > > > > >> triage.
> > > > > >> Existing products such as Copilot Code Review can also provide
> > > > > >> insightful triage, as previously proposed by Kaxil.
> > > > > >>
> > > > > >> Another direction that looks promising to me is to use a *small
> > > > > >> language model (SLM)*, a model with 2-4 B parameters, which can be
> > > > > >> run on standard GitHub runners, CPU-only, to triage issues and PRs.
> > > > > >> I've built a GitHub Action, *SLM Triage*
> > > > > >> (https://github.com/marketplace/actions/slm-triage).
> > > > > >>
> > > > > >> What advantages does an SLM offer?
> > > > > >> * It can be run on a standard GitHub runner, on CPU, and finish
> > > > > >>   execution in around 3 - 5 minutes
> > > > > >> * There is no API cost and no billing setup with an LLM service
> > > > > >> * It runs on GitHub events, when an issue or PR is opened, and is
> > > > > >>   capable of triaging issues as long as there are GitHub runners
> > > > > >>   available
> > > > > >> * It can be simply integrated into GitHub Actions without
> > > > > >>   infrastructure or local setup.
> > > > > >>
> > > > > >> What are the current limitations?
> > > > > >> * It doesn't have enough domain knowledge about a specific codebase,
> > > > > >>   so it can only triage based on high-level context and the
> > > > > >>   relevance between context information and code changes
> > > > > >> * It has limited reasoning capability
> > > > > >> * It has a limited context window (128k context window size; some
> > > > > >>   might have ~256k)
> > > > > >>
> > > > > >> Why I think it can be a potential direction:
> > > > > >> * I feel some issues or PRs can be triaged based on some basic
> > > > > >>   heuristics and rules
> > > > > >> * Even though the context window is limited, if the process is
> > > > > >>   triggered when an issue is opened, the context window is good
> > > > > >>   enough to capture the issue description, PR description, and even
> > > > > >>   the code change
> > > > > >> * It is easier to set up for a broader open-source community and
> > > > > >>   probably more cost-efficient; it can scale based on workflow
> > > > > >>   adoption
> > > > > >> * It can take action through the API, such as commenting on an
> > > > > >>   issue, adding a label, closing an issue or PR, etc., based on the
> > > > > >>   triage result.
> > > > > >>
> > > > > >> I also attempted to triage multiple issues and PRs on the airflow
> > > > > >> repository and checked the actual issues/PRs (I created a script to
> > > > > >> dry-run and inspect the triage result and reasoning). The result
> > > > > >> looks promising, but sometimes I found it is "a bit strict" and
> > > > > >> needs some improvements in terms of prompting.
> > > > > >>
> > > > > >> I wonder if this is a valid idea, but it would be great if the idea
> > > > > >> can potentially help.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Kevin Yang
> > > > > >>
> > > > > >> [1] https://github.com/orgs/community/discussions/185387
> > > > > >> [2] https://github.com/ossf/wg-vulnerability-disclosures/issues/178
> > > > > >> [3] https://www.reddit.com/r/opensource/comments/1q3f89b/open_source_is_being_ddosed_by_ai_slop_and_github/#:~:text=FunBrilliant5713-,Open%20source%20is%20being%20DDoSed%20by%20AI%20slop%20and%20GitHub,which%20submissions%20came%20from%20Copilot
> > > > > >>
> > > > > >> On Tue, Mar 10, 2026 at 9:13 PM Jarek Potiuk <[email protected]> wrote:
> > > > > >>
> > > > > >> > Just to update everyone: I've auto-triaged a bunch of PRs - the tool
> > > > > >> > works very well IMHO, but we will know after the authors see them
> > > > > >> > and review.
> > > > > >> >
> > > > > >> > Some stats (I will gather more in the next days as I am adding
> > > > > >> > timing and further improvements):
> > > > > >> >
> > > > > >> > * I triaged about 100 PRs in under an hour of elapsed time (I also
> > > > > >> >   corrected, improved and noted some fixes, so it will be faster)
> > > > > >> > * I converted 30 of those into Drafts and closed a few
> > > > > >> > * I have not marked any as ready to review yet, but I will do that
> > > > > >> >   tomorrow
> > > > > >> > * The LLM (Claude) assessment is quite fast - faster than I thought.
> > > > > >> >   Parallelizing it also helps. LLM assessment takes between 20 s and
> > > > > >> >   2 minutes (elapsed), but usually only a few pull requests (15% or
> > > > > >> >   less) are LLM-assessed in a batch, so this is not a bottleneck.
> > > > > >> >   I will also modify the tool to start reviewing deterministic
> > > > > >> >   things before the LLMs complete - which should speed up the whole
> > > > > >> >   process even more
> > > > > >> > * The LLM assessments are pretty good - but a few were significantly
> > > > > >> >   wrong and I would not post them. It's good we have a
> > > > > >> >   Human-In-The-Loop, and in the driver's seat.
> > > > > >> >
> > > > > >> > Overall - I think the tool is doing what I wanted very well. But
> > > > > >> > let's see the improvements over the next few days, observe how
> > > > > >> > authors react, and determine if it can actually help maintainers.
> > > > > >> >
> > > > > >> > I added a few PRs as improvements; looking forward to reviews:
> > > > > >> >
> > > > > >> > * https://github.com/apache/airflow/pull/63318
> > > > > >> > * https://github.com/apache/airflow/pull/63317
> > > > > >> > * https://github.com/apache/airflow/pull/63315
> > > > > >> > * https://github.com/apache/airflow/pull/63319
> > > > > >> > * https://github.com/apache/airflow/pull/63320
> > > > > >> >
> > > > > >> > J.
> > > > > >> >
> > > > > >> > On Tue, Mar 10, 2026 at 10:18 PM Jarek Potiuk <[email protected]> wrote:
> > > > > >> >
> > > > > >> > > Lazy consensus reached. I will try it out tonight. I added more
> > > > > >> > > signals (unresolved review comments) and filtering options
> > > > > >> > > (https://github.com/apache/airflow/pull/63300) that will be useful
> > > > > >> > > during this phase.
> > > > > >> > >
> > > > > >> > > On Fri, Mar 6, 2026 at 9:08 PM Jarek Potiuk <[email protected]> wrote:
> > > > > >> > >
> > > > > >> > >> Hello here,
> > > > > >> > >>
> > > > > >> > >> I am asking for a lazy consensus on the approach proposed in
> > > > > >> > >> https://lists.apache.org/thread/ly6lrm2gc4p7p54vomr8621nmb1pvlsk
> > > > > >> > >> regarding our approach to triaging PRs.
> > > > > >> > >>
> > > > > >> > >> The lazy consensus will last till Tuesday 10 pm CEST
> > > > > >> > >> (https://www.timeanddate.com/countdown/generic?iso=20260310T22&p0=262&font=cursive)
> > > > > >> > >>
> > > > > >> > >> Summary of the proposal
> > > > > >> > >>
> > > > > >> > >> This is the proposed update to the PR contributing guidelines:
> > > > > >> > >>
> > > > > >> > >> > Start with **Draft**: Until you are sure that your PR passes all
> > > > > >> > >> > the quality checks and tests, keep it in **Draft** status. This
> > > > > >> > >> > will signal to maintainers that the PR is not yet ready for
> > > > > >> > >> > review and it will prevent maintainers from accidentally merging
> > > > > >> > >> > it before it's ready. Once you are sure that your PR is ready
> > > > > >> > >> > for review, you can mark it as "Ready for review" in the GitHub
> > > > > >> > >> > UI. Our regular check will convert all PRs from
> > > > > >> > >> > non-collaborators that do not pass our quality gates to Draft
> > > > > >> > >> > status, so if you see that your PR is in Draft status and you
> > > > > >> > >> > haven't set it to Draft, check the comments to see what needs to
> > > > > >> > >> > be fixed.
> > > > > >> > >>
> > > > > >> > >> That's a "broad" description of the process; details will be
> > > > > >> > >> worked out while testing the solution.
> > > > > >> > >>
> > > > > >> > >> The PR: https://github.com/apache/airflow/pull/62682
> > > > > >> > >>
> > > > > >> > >> My testing approach is to start with individual areas, update and
> > > > > >> > >> perfect the tool, gradually increase its reach and engage others -
> > > > > >> > >> then we might think about a more regular process involving more
> > > > > >> > >> maintainers.
> > > > > >> > >>
> > > > > >> > >> J.
