Totally missed this thread in the plethora of emails we have had lately! I agree with Dennis, Kunal, and others that code generation works well when you are a strict reviewer. The agentic mode can often get out of hand and start to get *lazy/excited* at times.

Lazy in the sense that it can generate things, but if you keep iterating with it, it starts to write scripts even for the most basic tasks. Excited in the sense that it starts engineering things that it was not even asked to do. So a model is only as good as the instructions / prompts given to it. Specific prompts that spell out what *not* to do yield the best results.

Personally for me, the areas where the editor has been shining are:
* Generating test cases - impressive coverage with Sonnet 4.5
* Testing a very specific section of code. If I want to test just a small bit of code inside a large function/class with many dependencies, I can get most of the stuff around it mocked / injected and get coverage for the very specific part I was looking for (rough sketch below)
* Important but tedious repetitive tasks - like fixing mypy warnings, where the pattern is repetitive and can be done by the editor without much wandering
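To illustrate the second bullet, here is a minimal pytest-style sketch of what I mean - the ReportBuilder class and its dependencies are made up for the example; the point is just that the heavy collaborators are injected and mocked, so coverage lands only on the small helper under test:

    # Hypothetical class with heavy dependencies; only the tiny _format_row
    # helper is what we actually want covered.
    from unittest import mock


    class ReportBuilder:
        def __init__(self, db, storage):
            self.db = db            # e.g. a database client
            self.storage = storage  # e.g. an object-store hook

        def _format_row(self, record: dict) -> str:
            return f"{record['region']}: {record['revenue']:.2f}"

        def publish(self):  # the expensive path we are NOT testing here
            rows = [self._format_row(r) for r in self.db.fetch_all()]
            self.storage.upload("report.txt", "\n".join(rows))


    def test_format_row_in_isolation():
        # Plain mocks keep construction cheap and keep the test focused.
        builder = ReportBuilder(db=mock.Mock(), storage=mock.Mock())
        assert builder._format_row({"region": "EMEA", "revenue": 1234.5}) == "EMEA: 1234.50"
        # The heavy dependencies were never touched.
        builder.db.fetch_all.assert_not_called()
        builder.storage.upload.assert_not_called()

The editor is quite good at producing exactly this kind of scaffolding once you tell it which slice of the code you care about.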
Dennis - for generating git commit messages, *smartcommit* is a tool I have been using for about a week now, and it does a decent job: https://github.com/arpxspace/smartcommit

Thanks & Regards,
Amogh Desai

On Wed, Dec 3, 2025 at 8:33 AM Ryan Hatter via dev <[email protected]> wrote:

> In response to the other devlist thread on AIP 91 / MCP, I set out to build a minimally viable example of what it could look like. I ended up building a tool that would allow users to interact with Airflow via LLM/MCP in a way that goes beyond just exposing the Airflow REST API. For example, a user could say "please update the revenue dashboard", which would trigger a dag (or a set of dags via asset-aware scheduling). IMO, it became too opinionated for the main Airflow project (I'd be happy to be proven wrong here!), but I think it's pretty cool. Not sure if a gif will work in the devlist but I'll give it a try:
>
> [image: demo.gif]
>
> I'm really quite skeptical of LLMs beyond individual/personal use, but I don't think it's out of the realm of possibility that they become better and more capable of complex, code-based tasks. Hopefully this spurs some ideas :)
>
> Demo repo: https://github.com/RNHTTR/airflow-mcp-demo
> Project repo with the gif in case it doesn't load in the dev list: https://github.com/RNHTTR/MCPipeline?tab=readme-ov-file#demo
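For anyone curious about the general shape such a tool can take, a very rough sketch - this is not Ryan's actual implementation; it assumes the official `mcp` Python SDK (FastMCP) and Airflow's stable REST API, and the names, endpoint and auth are illustrative:

    # Minimal MCP server exposing a single "trigger_dag" tool; an MCP-capable
    # LLM client can call it when the user asks to e.g. "update the revenue dashboard".
    import os

    import httpx
    from mcp.server.fastmcp import FastMCP

    AIRFLOW_URL = os.environ.get("AIRFLOW_URL", "http://localhost:8080")
    AUTH = (os.environ["AIRFLOW_USER"], os.environ["AIRFLOW_PASSWORD"])

    mcp = FastMCP("airflow-demo")


    @mcp.tool()
    def trigger_dag(dag_id: str, note: str = "") -> dict:
        """Trigger a dag run and return the API response."""
        # /api/v1 is the Airflow 2.x stable REST API path; adjust for your deployment.
        resp = httpx.post(
            f"{AIRFLOW_URL}/api/v1/dags/{dag_id}/dagRuns",
            json={"conf": {"note": note}},
            auth=AUTH,
        )
        resp.raise_for_status()
        return resp.json()


    if __name__ == "__main__":
        mcp.run()  # serves the tool over stdio for the LLM client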
On Tue, Dec 2, 2025 at 9:05 AM Kunal Bhattacharya <[email protected]> wrote:

> > My experience with Gen AI code editors has been somewhat mixed as well. I have mostly used Windsurf, alternating between GPT-5 and Claude Sonnet 4.5 models, and I have found it useful for:
> > * Understanding pre-existing code, with some caveats around more nitty-gritty stuff that it tends to miss, but it gets things right overall
> > * Writing repetitive code, like tests, when provided the exact framework, such as a couple of reference test suites to mirror in terms of approach. In a particular scenario where I had to write 16 integration tests, I wrote 2 with all of the helper functions and asked Windsurf to replicate the same framework for the remaining 14, and it did a decent job. But even then, it seemed to go off the rails with simple tasks such as appending only to the end of the file and not the middle.
> >
> > In summary, for writing code, it seems to be useful only when you already know 80-90% of the exact changes required, hands on.
> >
> > On the other hand, when asked to do stuff where the solution is not entirely clear to me - e.g. updating the extensibility of a package used in our codebase, which would impact which functions to call, which objects to change for deprecations, etc. - it fails horribly. I also tried using it to resolve a service-to-service communication issue in our platform via configurations in the Helm chart, but again it started running in circles. So the R&D and trial-and-error stuff seems to stay with me, while I can assign it the more mundane stuff, which is okay I guess :) I am interested to experiment with locally hosted LLMs to see if this changes my experience!
> >
> > On a lighter note, I would definitely be more concerned if it started doing pinpoint R&D and suggesting accurate solutions across large codebases!
> >
> > Curious to know how others are using it!
> >
> > Regards,
> > Kunal
> >
> > On Tue, Dec 2, 2025 at 2:55 PM Jarek Potiuk <[email protected]> wrote:
> >
> > > Thanks Dennis :). Hopefully with your message we will get back on track, rather than being distracted with mailing list issues ;).
> > >
> > > Yeah, I have quite similar experiences - hopefully we can get this thread going and others will chime in as well. I am not sure if the channel on Slack is a good idea, so maybe let's continue here.
> > >
> > > One more comment. We recently had a discussion at the ASF members@ about using AI for AF (including the guidelines I shared) - and of course people have various concerns - from licensing, training AI on copyrighted material, "dependency on big tech" etc. Valid concerns, and we have some very constructive discussions on how we can make better use of AI ourselves in a way that follows our principles.
> > >
> > > I personally think that, first of all, AI is overhyped of course, but it's here to stay. I also see how the models can get optimised over time, start fitting into smaller hardware and run locally, and eventually - while some of the big players are trying to take over AI and monetise it - the open-source world (maybe even the ASF building and releasing their fully open-source models) will win. Many of us don't remember (because they were not born yet ;) ) - we've seen that 30 years ago, when open source was just starting, proprietary software was basically the only thing you could get. Now 9X% of the software out there is open-source, and while proprietary services are still out there, you can use most of the software for free (for example - Airflow :D).
> > >
> > > I'd love to hear also from others - how they are using AI now :). BTW, I will be speaking in February at a new "grass-roots" conference in Poland, https://post-software.intentee.com/ (run by two of my very young and enthusiastic friends), where I will be speaking about our usage of AI (starting with the UI translation project), so I also have a very good reason to ask you for feedback here :).
> > >
> > > J.
> > >
> > > On Mon, Dec 1, 2025 at 8:27 PM Jarek Potiuk <[email protected]> wrote:
> > > >
> > > > > Hey, please remove me from this distribution list! Thanks!
> > > >
> > > > Hey - you can remove yourself following the description on https://airflow.apache.org/community/
> > > >
> > > > On Mon, Dec 1, 2025 at 8:05 PM Aaron Dantley <[email protected]> wrote:
> > > >
> > > >> Hey, please remove me from this distribution list! Thanks!
> > > >>
> > > >> On Mon, Dec 1, 2025 at 1:36 PM Ferruzzi, Dennis <[email protected]> wrote:
> > > >>
> > > >> > I was hoping this thread would get more love so I could see how others are using it. I'm not using LLMs a whole lot for writing actual code right now; I don't find them all that intelligent. My experience feels more like having an overeager intern: the code isn't great, the "thinking" is pretty one-track - often retrying the same failed ideas multiple times - and it's often faster to just do it myself.
> > > >> >
> > > >> > I have tried things like:
> > > >> > - "Here is a python file I have made changes to, and the existing test file, do I still have coverage?" A dedicated tool like codecov is better for this, but I'm trying to give them a fair shot.
> > > >> > - "I just wrote a couple of functions, I need you to check for any missing type-hints and generate the method docstrings following pydocstyle formatting rules and the formatting style of the existing methods." The docstrings then need to be reviewed, but they are usually pretty decent, and a dedicated linter is likely better at the hinting.
> > > >> > - Summarizing existing code into plain English seems to work pretty well if you just want an overview of what a block of code is actually doing.
> > > >> > - "Summarize this git diff into a 2-line PR description" usually results in a pretty reasonable starting point that just needs some tweaks.
> > > >> >
> > > >> > Parsing stack traces is, I think, the biggest thing it actually does well; those things can get out of hand sometimes, and it can be handy to have the LLM parse one and give you the summary and the main issues (don't show me the internal calls of 3rd party packages, etc.).
> > > >> >
> > > >> > I recently started giving Cline a try - it's a code-aware LLM that lives in your IDE and has access to any files in the current project. It's definitely better, but still not great IMHO. What I do like about that one is you can ask things like "where do we ACTUALLY write the serialized_dag to the database?" and "Show me where we actually re-parse the dag bag", and it seems to be pretty good at tracing through the code to find that kind of thing, which has saved me a little time when poking at corners of the project I'm not as familiar with. But given my experience with them in the past and the complexity of the codebase, I never really trust that it finds all the references. For example, if it points to a line of code where we re-parse the dag bag, I can't trust that this is the **only** place we do that, so I may have to double-check its work anyway.
> > > >> > Overall, I think Jarek actually hit the nail on the head with his comment that the key to using them right now is figuring out what they actually CAN do well and avoiding them for tasks where they are going to slow you down. It takes some trial and error to figure out where that line is, and new models and tools come out so fast that the line is constantly shifting.
> > > >> >
> > > >> > - ferruzzi
> > > >> >
> > > >> > ________________________________
> > > >> > From: Jarek Potiuk <[email protected]>
> > > >> > Sent: Tuesday, November 11, 2025 3:21 AM
> > > >> > To: [email protected]
> > > >> > Subject: [EXT] Share your Gen-AI contributions ?
> > > >> >
> > > >> > Hello community,
> > > >> >
> > > >> > *TL;DR: I have a proposal that we share a bit more openly how we are using Gen AI tooling to make us more productive. I thought about creating a dedicated #gen-ai-contribution-sharing channel in Slack for that purpose.*
> > > >> >
> > > >> > I've been using various Gen-AI tools - and I am sure many of us do - and I've seen people sharing their experiences in various places. We also shared it a bit here - our UI Translation project is largely based on AI helping our translators do the heavy lifting. I also shared a few times how AI helped me to massively speed up work on fixing footers on our 250K pages of documentation and - more recently - make sure our licensing in packages is compliant with ASF - but I also used Gen AI to generate some scripting tools (breeze ci upgrade and the check_translation_completness.py script). Also, many of our contributors use various Gen AI tools to create their PRs. And I know a few of us use it to analyse stack traces and errors, and use it to explain how our code works.
> > > >> >
> > > >> > I thought that there are two interesting aspects that it would be great for us to learn from one another:
> > > >> >
> > > >> > 1) What kind of tooling you use and how it fits in the UX and developer experience (I used a number of things - from copilot CLI and IDE integration to Copilot reviews and Agents. I found that the better integrated the tool is with your daily regular tasks, the more useful it is.)
> > > >> > 2) The recurring theme from all the Gen-AI discussions I hear is that it's most important to learn where Gen AI helps, and where it stands in the way:
> > > >> > * in a few things I tried, I feel Gen AI makes me vastly more productive
> > > >> > * in some of them I feel the reviews, correction of mistakes and general iteration on it slow me down significantly
> > > >> > * in some cases it may not be faster, but it takes a lot less mental energy, decision making and mostly-repetitive coding, so generally I feel happier
> > > >> > * finally, there are cases (like the UI translation) that I would never even attempt because of the vast amount of mostly repetitive and generally boring work that would normally cause me to drop out very quickly and abandon it eventually
> > > >> >
> > > >> > I feel that we could learn from each other. For me, learning by example - especially an example in a project that you know well, where you can easily transplant the learnings to your own tasks - is the fastest and best way of learning.
> > > >> >
> > > >> > Finally - The Apache Software Foundation has this official guidance on using AI to contribute code [1] - I think it is very well written, and it describes some border conditions where AI contributions are "OK" from the licensing and copyright point of view - largely to avoid big chunks of copyrightable code leaking from GPL-licensed training material. And while it does not have definite answers, I think when we share our contributions openly we can discuss things like "is that copyrightable", where is that coming from, etc. (note that in many cases - when you generate large chunks of code - you can ask the LLM where the code comes from, and several of the LLM tools even immediately provide the references to the sources of the code in such cases).
> > > >> >
> > > >> > So my proposal is to create a *#gen-ai-contribution-sharing* channel in our Slack - where we will share our experiences from using AI, ask when we have doubts about whether we can submit such code, etc.
> > > >> >
> > > >> > WDYT? Is it a good idea?
> > > >> >
> > > >> > [1] Generative Tooling Guidance by ASF: https://www.apache.org/legal/generative-tooling.html
> > > >> >
> > > >> > J.
