> Hey, please remove me from this distribution list! Thanks!

Hey - you can remove yourself following the description on
https://airflow.apache.org/community/
On Mon, Dec 1, 2025 at 8:05 PM Aaron Dantley <[email protected]> wrote:

> Hey, please remove me from this distribution list! Thanks!
>
> On Mon, Dec 1, 2025 at 1:36 PM Ferruzzi, Dennis <[email protected]>
> wrote:
>
> > I was hoping this thread would get more love so I could see how others
> > are using it. I'm not using LLMs a whole lot for writing actual code
> > right now; I don't find them all that intelligent. My experience feels
> > more like having an overeager intern: the code isn't great, the
> > "thinking" is pretty one-track - often retrying the same failed ideas
> > multiple times - and it's often faster to just do it myself.
> >
> > I have tried things like:
> > - "Here is a Python file I have made changes to, and the existing test
> > file; do I still have coverage?" A dedicated tool like codecov is
> > better for this, but I'm trying to give them a fair shot.
> > - "I just wrote a couple of functions; I need you to check for any
> > missing type hints and generate the method docstrings following
> > pydocstyle formatting rules and the formatting style of the existing
> > methods." The docstrings then need to be reviewed, but they are
> > usually pretty decent, and a dedicated linter is likely better at the
> > hinting.
> > - Summarizing existing code into plain English seems to work pretty
> > well if you just want an overview of what a block of code is actually
> > doing.
> > - "Summarize this git diff into a 2-line PR description" usually
> > results in a pretty reasonable starting point that just needs some
> > tweaks.
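To make the docstring-generation prompt above concrete, here is a minimal sketch of the kind of output it asks for: full type hints plus a docstring in the Sphinx `:param:` style, written so it passes pydocstyle's default (PEP 257) checks. The function itself is invented for illustration and is not Airflow code.

```python
def retry_delay(attempt: int, base_seconds: float = 1.0) -> float:
    """Return the exponential backoff delay for a retry attempt.

    :param attempt: Zero-based retry attempt number.
    :param base_seconds: Delay before the first retry, in seconds.
    :return: The delay in seconds, doubling with each attempt.
    """
    return base_seconds * (2**attempt)


# The review step still matters: e.g. pydocstyle's D400 requires the
# summary line to end with a period, which is easy to eyeball-check.
summary_line = retry_delay.__doc__.strip().splitlines()[0]
assert summary_line.endswith(".")
```

The point of keeping a linter such as pydocstyle in the loop is that checks like "summary line ends with a period" are mechanical, so the LLM output only needs a human review for accuracy, not for formatting.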
> > Parsing stack traces is, I think, the biggest thing it actually does
> > well; those things can get out of hand sometimes, and it can be handy
> > to have the LLM parse one and give you the summary and the main issues
> > (don't show me the internal calls of 3rd-party packages, etc.).
> >
> > I recently started giving Cline a try; it's a code-aware LLM that
> > lives in your IDE and has access to any files in the current project.
> > It's definitely better, but still not great IMHO. What I do like about
> > that one is you can ask things like "where do we ACTUALLY write the
> > serialized_dag to the database?" and "Show me where we actually
> > re-parse the dag bag", and it seems to be pretty good at tracing
> > through the code to find that kind of thing, which has saved me a
> > little time when poking at corners of the project I'm not as familiar
> > with. But given my experience with them in the past and the complexity
> > of the codebase, I never really trust that it finds all the
> > references. For example, if it points to a line of code where we
> > re-parse the dag bag, I can't trust that this is the **only** place we
> > do that, so I may have to double-check its work anyway.
> >
> > Overall, I think Jarek actually hit the nail on the head with his
> > comment that the key to using them right now is figuring out what they
> > actually CAN do well and avoiding them for tasks where they are going
> > to slow you down. It takes some trial and error to figure out where
> > that line is, and new models and tools come out so fast that the line
> > is constantly shifting.
> >
> > - ferruzzi
> >
> > ________________________________
> > From: Jarek Potiuk <[email protected]>
> > Sent: Tuesday, November 11, 2025 3:21 AM
> > To: [email protected]
> > Subject: [EXT] Share your Gen-AI contributions ?
> >
> > CAUTION: This email originated from outside of the organization. Do
> > not click links or open attachments unless you can confirm the sender
> > and know the content is safe.
> >
> > Hello community,
> >
> > *TL;DR: I have a proposal that we share a bit more openly how we are
> > using Gen-AI tooling to make us more productive. I thought about
> > creating a dedicated #gen-ai-contribution-sharing channel in Slack for
> > that purpose.*
> >
> > I've been using various Gen-AI tools - and I am sure many of us do -
> > and I've seen people sharing their experiences in various places. We
> > also shared it a bit here: our UI Translation project is largely based
> > on AI helping our translators do the heavy lifting. I also shared a
> > few times how AI helped me massively speed up fixing footers on our
> > 250K pages of documentation and - more recently - making sure our
> > licensing in packages is compliant with ASF. I also used Gen AI to
> > generate some scripting tools (breeze ci upgrade and the
> > check_translation_completness.py script). Many of our contributors use
> > various Gen-AI tools to create their PRs, and I know a few of us use
> > them to analyse stack traces and errors, and to explain how our code
> > works.
> >
> > I thought there are two interesting aspects it would be great for us
> > to learn from one another:
> >
> > 1) What kind of tooling you use and how it fits in the UX and
> > developer experience. (I used a number of things - from Copilot CLI
> > and IDE integration to Copilot reviews and Agents - and found that the
> > better integrated the tool is in your daily regular tasks, the more
> > useful it is.)
> > 2) The recurring theme from all the Gen-AI discussions I hear is that
> > it's most important to learn where Gen AI helps, and where it stands
> > in the way:
> > * in a few things I tried, I feel Gen AI makes me vastly more
> > productive
> > * in some of them, I feel the reviews, correction of mistakes, and
> > general iteration slow me down significantly
> > * in some cases it may not be faster, but it takes a lot less mental
> > energy and decision-making than mostly repetitive coding, so generally
> > I feel happier
> > * finally, there are cases (like the UI translation) that I would
> > never even attempt, because the vast amount of mostly repetitive and
> > generally boring work would normally cause me to drop out very quickly
> > and eventually abandon it
> >
> > I feel that we could learn from each other. For me, learning by
> > example - especially an example in a project that you know well, so
> > you can easily transplant the learnings to your own tasks - is the
> > fastest and best way of learning.
> >
> > Finally - the Apache Software Foundation has official guidance on
> > using AI to contribute code [1]. I think it is very well written, and
> > it describes some border conditions where AI contributions are "OK"
> > from the licensing and copyright point of view - largely to avoid big
> > chunks of copyrightable code leaking from GPL-licensed training
> > material. And while it does not have definite answers, I think when we
> > share our contributions openly we can discuss things like "is that
> > copyrightable?", where it is coming from, etc. (Note that in many
> > cases - when you generate large chunks of code - you can ask the LLM
> > where the code comes from, and several of the LLM tools even provide
> > you immediately with references to the sources of the code in such
> > cases.)
> > So my proposal is to create *#gen-ai-contribution-sharing* in our
> > Slack - where we will share our experiences from using AI, and ask
> > when we have doubts about whether we can submit such code, etc.
> >
> > WDYT? Is it a good idea?
> >
> > [1] Generative Tooling Guidance by ASF:
> > https://www.apache.org/legal/generative-tooling.html
> >
> > J.
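The stack-trace triage that comes up in a couple of the replies above ("don't show me the internal calls of 3rd-party packages") has a deterministic core that needs no LLM: dropping frames that come from installed packages before summarizing. A minimal standard-library sketch, where the `site-packages` substring test is a simplifying assumption rather than how any particular tool does it:

```python
import traceback


def own_frames(exc: BaseException) -> list[str]:
    """Format a traceback, dropping frames from installed packages.

    Keeps only frames whose file path is outside site-packages, then
    appends the final error line so the actual failure is still visible.
    """
    frames = traceback.extract_tb(exc.__traceback__)
    kept = [f for f in frames if "site-packages" not in f.filename]
    return traceback.format_list(kept) + traceback.format_exception_only(type(exc), exc)


# Usage: catch, then keep only your own frames plus the error itself.
try:
    raise ValueError("dag bag re-parse failed")  # stand-in for a deep failure
except ValueError as err:
    summary = own_frames(err)
```

Feeding this pre-trimmed summary to an LLM (instead of the raw trace) keeps the prompt short and focuses the model on the frames you actually own.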
