Re: Share your Gen-AI contributions ?

Ferruzzi, Dennis Mon, 01 Dec 2025 10:36:18 -0800

I was hoping this thread would get more love so I could see how others are 
using these tools.  I'm not using LLMs a whole lot for writing actual code 
right now; I don't find them all that intelligent. My experience feels more 
like having an overeager intern: the code isn't great, the "thinking" is pretty 
one-track - often retrying the same failed ideas multiple times - and it's 
often faster to just do it myself.


I have tried things like:
 - "here is a Python file I have made changes to, and the existing test file, 
do I still have coverage?"  A dedicated tool like Codecov is better for this, 
but I'm trying to give them a fair shot.
 - "I just wrote a couple of functions, I need you to check for any missing 
type-hints and generate the method docstrings following pydocstyle formatting 
rules and the formatting style of the existing methods". The docstrings then 
need to be reviewed, but they are usually pretty decent, and a dedicated linter 
is likely better at the hinting.

- Summarizing existing code into plain English seems to work pretty well if you 
just want an overview of what a block of code is actually doing.
- "Summarize this git diff into a 2-line PR description" usually results in a 
pretty reasonable starting point that just needs some tweaks.
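
For illustration, the "check for any missing type-hints" request above can be 
approximated without an LLM.  This is only a minimal sketch using Python's 
standard ast module; the function name find_missing_hints is made up for this 
example, and a real linter covers far more cases (for instance *args and 
**kwargs, which this sketch ignores).

```python
import ast


def find_missing_hints(source):
    """Yield (function_name, what_is_missing) for unannotated items.

    Hypothetical helper for illustration only: it checks positional and
    keyword-only parameters plus the return annotation, and it skips the
    conventional 'self' and 'cls' parameters.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        params = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
        for param in params:
            if param.annotation is None and param.arg not in ("self", "cls"):
                yield node.name, f"parameter '{param.arg}'"
        if node.returns is None:
            yield node.name, "return type"
```

Calling list(find_missing_hints(open(path).read())) on a changed file gives a 
quick list to compare against whatever the LLM reports.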

Parsing stack traces is, I think, the biggest thing that it actually does well; 
those things can get out of hand sometimes and it can be handy to have the LLM 
parse it and get you the summary and the main issues (don't show me the 
internal calls of 3rd party packages, etc).
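
To make the "don't show me the internal calls of 3rd party packages" idea 
concrete, here is a minimal sketch using Python's standard traceback module.  
It is not what any LLM tool does internally; the function name 
summarize_exception and the "site-packages" path heuristic are assumptions made 
for this example.

```python
import traceback


def summarize_exception(exc, hide_markers=("site-packages", "dist-packages")):
    """Return a short summary of exc without third-party frames.

    hide_markers is an assumed heuristic: any frame whose file path
    contains one of these substrings is treated as third-party code
    and left out of the summary.
    """
    frames = traceback.extract_tb(exc.__traceback__)
    own_frames = [
        frame
        for frame in frames
        if not any(marker in frame.filename for marker in hide_markers)
    ]
    # First line: the exception type and message.
    lines = [f"{type(exc).__name__}: {exc}"]
    # Remaining lines: one entry per frame from our own code.
    for frame in own_frames:
        lines.append(f"  {frame.filename}:{frame.lineno} in {frame.name}")
    return "\n".join(lines)
```

Used inside an except block, print(summarize_exception(err)) shows only the 
exception message and the frames from your own code.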

I recently started giving Cline a try; it's a code-aware LLM assistant that 
lives in your IDE and has access to any files in the current project.  It's 
definitely better but still not great IMHO.  What I do like about that one is 
you can ask things like "where do we ACTUALLY write the serialized_dag to the 
database?" and "Show 
me where we actually re-parse the dag bag" and it seems to be pretty good at 
tracing through the code to find that kind of thing, which has saved me a 
little time when poking at corners of the project I'm not as familiar with.  
But given my experience with them in the past and the complexity of the 
codebase, I never really trust that it finds all the references.  For example, 
if it points to a line of code where we re-parse the dag bag I can't trust that 
this is the **only** place we do that, so I may have to double-check its work 
anyway.

Overall, I think Jarek actually hit the nail on the head with his comment that 
the key to using them right now is figuring out what they actually CAN do well 
and avoiding them for tasks where they are going to slow you down.  It takes 
some trial and error to figure out where that line is, and new models and tools 
come out so fast that the line is constantly shifting.


 - ferruzzi


________________________________
From: Jarek Potiuk <[email protected]>
Sent: Tuesday, November 11, 2025 3:21 AM
To: [email protected]
Subject: [EXT] Share your Gen-AI contributions ?

Hello community,

*TL;DR; I have a proposal that we share a bit more openly how we are using
Gen AI tooling to make us more productive. I thought about creating a
dedicated #gen-ai-contribution-sharing channel in Slack for that purpose*

I've been using various Gen-AI tools and I am sure many of us do and I've
seen people sharing their experiences in various places - we also shared it
a bit here - our UI Translation project is largely based on AI helping our
translators to do the heavy-lifting. I also shared a few times how AI
helped me to massively speed up work on fixing footers on our 250K pages of
documentation and - more recently - make sure our licensing in packages is
compliant with ASF - but also I used Gen AI to generate some scripting
tools (breeze ci upgrade and the check_translation_completness.py script).
Also many of our contributors use various Gen AI tools to create their PRs.
And I know a few of us use it to analyse stack-traces and errors, and use it
to explain how our code works.

I thought that there are two interesting aspects where it would be great
to learn from one another:

1) What kind of tooling you use and how it fits into the UX and developer
experience (I used a number of things - from Copilot CLI and IDE integration
to Copilot reviews and Agents). I found that the better integrated the tool
is in your daily regular tasks, the more useful it is.

2) The recurring theme from all the Gen-AI discussions I hear is that it's
most important to learn where Gen AI helps, and where it stands in the way:
* in a few things I tried Gen AI makes me vastly more productive - I feel
* in some of them I feel the reviews, correction of mistakes and general
iteration slow me down significantly
* in some cases it may not be faster, but it takes a lot less mental energy,
decision making and mostly repetitive coding, so generally I feel happier
* finally there are cases (like the UI translation) that I would never even
attempt because of the vast amount of mostly repetitive and generally
boring things that would normally cause me to drop out very quickly and
abandon it eventually

I feel that we could learn from each other. For me, learning by example -
especially an example in a project that you know well and you can easily
transplant the learnings to your own tasks - is the fastest and best way of
learning.

Finally - The Apache Software Foundation has this official guidance on
using AI to contribute code [1]  - I think this is a very well written one,
and it describes some boundary conditions where AI contributions are "OK"
from the licensing and copyright point of view - largely to avoid big chunks
of copyrightable code leaking from GPL-licensed training material. And
while it does not have definite answers, I think when we share our
contributions openly we can discuss things like "is that copyrightable",
"where is that coming from" etc.  (Note that in many cases - when you
generate large chunks of code - you can ask the LLM where the code comes
from, and several of the LLM tools even immediately provide you with
references to the sources of the code in such cases.)

So my proposal is to create a *#gen-ai-contribution-sharing* channel in our
Slack - where we will share our experiences from using AI, ask when you
have doubts about whether you can submit such code etc.

WDYT? Is it a good idea?

[1] Generative Tooling Guidance by ASF:
https://www.apache.org/legal/generative-tooling.html

J.
