> Hey, please remove me from this distribution list! Thanks!

Hey - you can remove yourself following the description on
https://airflow.apache.org/community/
On Mon, Dec 1, 2025 at 8:05 PM Aaron Dantley <[email protected]> wrote:

> Hey, please remove me from this distribution list! Thanks!
>
> On Mon, Dec 1, 2025 at 1:36 PM Ferruzzi, Dennis <[email protected]>
> wrote:
>
> > I was hoping this thread would get more love so I could see how others
> > are using it. I'm not using LLMs a whole lot for writing actual code
> > right now; I don't find them all that intelligent. My experience feels
> > more like having an overeager intern: the code isn't great, the
> > "thinking" is pretty one-track - often retrying the same failed ideas
> > multiple times - and it's often faster to just do it myself.
> >
> > I have tried things like:
> > - "Here is a Python file I have made changes to, and the existing test
> > file; do I still have coverage?" A dedicated tool like codecov is
> > better for this, but I'm trying to give them a fair shot.
> > - "I just wrote a couple of functions; I need you to check for any
> > missing type hints and generate the method docstrings following
> > pydocstyle formatting rules and the formatting style of the existing
> > methods." The docstrings then need to be reviewed, but they are
> > usually pretty decent, and a dedicated linter is likely better at the
> > hinting.
> > - Summarizing existing code into plain English seems to work pretty
> > well if you just want an overview of what a block of code is actually
> > doing.
> > - "Summarize this git diff into a 2-line PR description" usually
> > results in a pretty reasonable starting point that just needs some
> > tweaks.
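To make the docstring-generation prompt above concrete, here is a minimal sketch of the kind of output it asks for: full type hints plus a docstring in the Sphinx `:param:` style, written so it passes pydocstyle's default (PEP 257) checks. The function itself is invented for illustration and is not Airflow code.

```python
def retry_delay(attempt: int, base_seconds: float = 1.0) -> float:
    """Return the exponential backoff delay for a retry attempt.

    :param attempt: Zero-based retry attempt number.
    :param base_seconds: Delay before the first retry, in seconds.
    :return: The delay in seconds, doubling with each attempt.
    """
    return base_seconds * (2**attempt)


# The review step still matters: e.g. pydocstyle's D400 requires the
# summary line to end with a period, which is easy to eyeball-check.
summary_line = retry_delay.__doc__.strip().splitlines()[0]
assert summary_line.endswith(".")
```

The point of keeping a linter such as pydocstyle in the loop is that checks like "summary line ends with a period" are mechanical, so the LLM output only needs a human review for accuracy, not for formatting.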
> > Parsing stack traces is, I think, the biggest thing it actually does
> > well; those things can get out of hand sometimes, and it can be handy
> > to have the LLM parse one and give you the summary and the main issues
> > (don't show me the internal calls of 3rd-party packages, etc.).
> >
> > I recently started giving Cline a try; it's a code-aware LLM that
> > lives in your IDE and has access to any files in the current project.
> > It's definitely better, but still not great IMHO. What I do like about
> > that one is you can ask things like "where do we ACTUALLY write the
> > serialized_dag to the database?" and "Show me where we actually
> > re-parse the dag bag", and it seems to be pretty good at tracing
> > through the code to find that kind of thing, which has saved me a
> > little time when poking at corners of the project I'm not as familiar
> > with. But given my experience with them in the past and the complexity
> > of the codebase, I never really trust that it finds all the
> > references. For example, if it points to a line of code where we
> > re-parse the dag bag, I can't trust that this is the **only** place we
> > do that, so I may have to double-check its work anyway.
> >
> > Overall, I think Jarek actually hit the nail on the head with his
> > comment that the key to using them right now is figuring out what they
> > actually CAN do well and avoiding them for tasks where they are going
> > to slow you down. It takes some trial and error to figure out where
> > that line is, and new models and tools come out so fast that the line
> > is constantly shifting.
> >
> > - ferruzzi
> >
> > ________________________________
> > From: Jarek Potiuk <[email protected]>
> > Sent: Tuesday, November 11, 2025 3:21 AM
> > To: [email protected]
> > Subject: [EXT] Share your Gen-AI contributions ?
> >
> > CAUTION: This email originated from outside of the organization. Do
> > not click links or open attachments unless you can confirm the sender
> > and know the content is safe.
> >
> > Hello community,
> >
> > *TL;DR: I have a proposal that we share a bit more openly how we are
> > using Gen-AI tooling to make us more productive. I thought about
> > creating a dedicated #gen-ai-contribution-sharing channel in Slack for
> > that purpose.*
> >
> > I've been using various Gen-AI tools - and I am sure many of us do -
> > and I've seen people sharing their experiences in various places. We
> > also shared it a bit here: our UI Translation project is largely based
> > on AI helping our translators do the heavy lifting. I also shared a
> > few times how AI helped me massively speed up fixing footers on our
> > 250K pages of documentation and - more recently - making sure our
> > licensing in packages is compliant with ASF. I also used Gen AI to
> > generate some scripting tools (breeze ci upgrade and the
> > check_translation_completness.py script). Many of our contributors use
> > various Gen-AI tools to create their PRs, and I know a few of us use
> > them to analyse stack traces and errors, and to explain how our code
> > works.
> >
> > I thought there are two interesting aspects it would be great for us
> > to learn from one another:
> >
> > 1) What kind of tooling you use and how it fits in the UX and
> > developer experience. (I used a number of things - from Copilot CLI
> > and IDE integration to Copilot reviews and Agents - and found that the
> > better integrated the tool is in your daily regular tasks, the more
> > useful it is.)
> > 2) The recurring theme from all the Gen-AI discussions I hear is that
> > it's most important to learn where Gen AI helps, and where it stands
> > in the way:
> > * in a few things I tried, I feel Gen AI makes me vastly more
> > productive
> > * in some of them, I feel the reviews, correction of mistakes, and
> > general iteration slow me down significantly
> > * in some cases it may not be faster, but it takes a lot less mental
> > energy and decision-making than mostly repetitive coding, so generally
> > I feel happier
> > * finally, there are cases (like the UI translation) that I would
> > never even attempt, because the vast amount of mostly repetitive and
> > generally boring work would normally cause me to drop out very quickly
> > and eventually abandon it
> >
> > I feel that we could learn from each other. For me, learning by
> > example - especially an example in a project that you know well, so
> > you can easily transplant the learnings to your own tasks - is the
> > fastest and best way of learning.
> >
> > Finally - the Apache Software Foundation has official guidance on
> > using AI to contribute code [1]. I think it is very well written, and
> > it describes some border conditions where AI contributions are "OK"
> > from the licensing and copyright point of view - largely to avoid big
> > chunks of copyrightable code leaking from GPL-licensed training
> > material. And while it does not have definite answers, I think when we
> > share our contributions openly we can discuss things like "is that
> > copyrightable?", where it is coming from, etc. (Note that in many
> > cases - when you generate large chunks of code - you can ask the LLM
> > where the code comes from, and several of the LLM tools even provide
> > you immediately with references to the sources of the code in such
> > cases.)
> > So my proposal is to create *#gen-ai-contribution-sharing* in our
> > Slack - where we will share our experiences from using AI, and ask
> > when we have doubts about whether we can submit such code, etc.
> >
> > WDYT? Is it a good idea?
> >
> > [1] Generative Tooling Guidance by ASF:
> > https://www.apache.org/legal/generative-tooling.html
> >
> > J.
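The stack-trace triage that comes up in a couple of the replies above ("don't show me the internal calls of 3rd-party packages") has a deterministic core that needs no LLM: dropping frames that come from installed packages before summarizing. A minimal standard-library sketch, where the `site-packages` substring test is a simplifying assumption rather than how any particular tool does it:

```python
import traceback


def own_frames(exc: BaseException) -> list[str]:
    """Format a traceback, dropping frames from installed packages.

    Keeps only frames whose file path is outside site-packages, then
    appends the final error line so the actual failure is still visible.
    """
    frames = traceback.extract_tb(exc.__traceback__)
    kept = [f for f in frames if "site-packages" not in f.filename]
    return traceback.format_list(kept) + traceback.format_exception_only(type(exc), exc)


# Usage: catch, then keep only your own frames plus the error itself.
try:
    raise ValueError("dag bag re-parse failed")  # stand-in for a deep failure
except ValueError as err:
    summary = own_frames(err)
```

Feeding this pre-trimmed summary to an LLM (instead of the raw trace) keeps the prompt short and focuses the model on the frames you actually own.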
