On 4/17/26 10:56 AM, Jerry D writes below:
On Fri, Apr 17, 2026 at 5:22 AM Christopher Albert via Gcc
<[email protected]> wrote:
 >
 > On 4/17/26 1:38 PM, Richard Biener wrote:
 >
 > > On Fri, Apr 17, 2026 at 12:51 PM Christopher Albert via Gcc
 > > <[email protected]> wrote:
 > >>> On Thu, 16 Apr 2026 at 15:28, Richard Earnshaw (foss) via Gcc
 > >>> <[email protected]> wrote:
 > >>>> On 14/03/2026 19:02, Jeffrey Law via Gcc wrote:
 > >>>>>
 > >>>>> On 3/14/2026 12:59 PM, Jerry D via Gcc wrote:
> >>>>>> Some of the various LLM services available appear to be getting very good at generating bug fixes. I realize that one must be careful as these tools can at times do things that may be superfluous to the actual fix. By superfluous I mean lines of code that are not relevant to the lines that fix it.
 > >>>>>>
> >>>>>> I saw some discussions of this subject for gcc somewhere and wanted to know if we have a specific policy established / documented somewhere regarding this.
 > >>>>> The steering committee is trying to figure out a good policy right now.
 > >>>>>
 > >>>>> Jeff
> >>>> I notice that the Linux kernel recently adopted the following policy: https://github.com/torvalds/linux/blob/master/Documentation/process/coding-assistants.rst
 > >>>>
 > >>>> Has there been any progress on GCC yet?
 > >>> Carlos and I prepared a draft policy, but I believe the GCC steering
 > >>> committee is also looking into it. The FSF are also working on
 > >>> policies.
 > >>>
 > >>> Our draft policy takes a similar position to the kernel one: LLMs
 > >>> cannot do a DCO sign-off as their output is not copyrightable. The
 > >>> correct trailer to use is Assisted-by and not Co-authored-by. But our
 > >>> draft policy proposes *not* accepting AI-generated code, only allowing
 > >>> the use of AI for assistance, idea generation, testing, but not
 > >>> generating the actual code. That's because the legal status of
 > >>> AI-generated code is unclear, is not copyrightable, and does not meet
 > >>> the legal prerequisites for GCC contributions.
 > >> I would add one practical point from recent experience.
 > >> A substantial part of my own recent GCC contributions was only possible
 > >> because reviewers and maintainers engaged seriously with patches I
 > >> developed using AI tools under my direction. In at least some cases, it
 > >> would be fair to say that this went beyond AI as pure "idea generation":
 > >> the tools were part of the development workflow that let me produce,
 > >> iterate on, and validate fixes much more efficiently. The patches were
 > >> still submitted by me, reviewed by me, tested by me, and signed off by
 > >> me, with full responsibility on my side for every line.
 > >> In practice, this workflow was very successful. I do not think I could
 > >> have fixed so many bugs, at that quality and speed, without it. I would
 > >> therefore be cautious about a policy that is too strict. If GCC rules
 > >> out any patch where AI contributed more than idea generation, we may
 > >> lose an accountable workflow that has worked well here, and we risk
 > >> falling behind projects that take a more pragmatic line.
 > >> The legal picture also seems more nuanced to me than a simple rule of
 > >> "assistance allowed, generation forbidden":
 > >> *US. U.S. Copyright Office, Copyright and Artificial Intelligence,
 > >> Part 2: Copyrightability (Jan 2025): AI used to assist human
 > >> creativity does not by itself defeat protection; purely AI-generated
 > >> material is not protected; the analysis is case-specific and turns on
 > >> human authorship and control over the final expression; human
 > >> selection, arrangement, and modification of AI output may be
 > >> protected.
> >> https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-2-Copyrightability-Report.pdf
 > >> *EU. The CJEU standard is the "author's own intellectual creation",
 > >> that is, free and creative choices by a human author (Infopaq
 > >> C-5/08, Painer C-145/10, Cofemel C-683/17). The EU AI Act
 > >> (Reg. 2024/1689) regulates AI systems and transparency, not copyright
 > >> authorship.
 > >> https://eur-lex.europa.eu/eli/reg/2024/1689/oj
 > >> *UK. CDPA 1988 s.9(3) assigns authorship of computer-generated works
 > >> to "the person by whom the arrangements necessary for the creation of
 > >> the work are undertaken."
 > >> https://www.legislation.gov.uk/ukpga/1988/48/section/9
 > >> These regimes do not, in my view, support a simple categorical rule
 > >> that any code produced with substantial AI involvement must be excluded.
 > >> The more relevant question seems to be whether there is sufficient human
 > >> authorship and control over the final result.
 > >> I fully understand the need for caution on provenance, licensing, and
 > >> responsibility, and I agree that an LLM cannot itself sign a DCO. But
 > >> from my perspective, the decisive criterion should be that the human
 > >> contributor takes full responsibility for the submitted patch:
 > >> careful review, any necessary rewriting, testing, and sign-off,
 > >> rather than a blanket rule that excludes code whenever AI tools played
 > >> a substantial role somewhere in the development process.
 > > I agree to some extent, but then I'd also like to at least see assisted-by
 > > tags as proposed by the Linux kernel policy.  But I also have to say that
 > > for people that do not have an existing track record with projects it's
 > > very difficult to provide that upfront trust on following such "grey" policy
 > > to the spirit it was written with.  Esp. since there's wide areas of GCC
 > > where knowledge of existing code is scarce and since LLMs are so
 > > convincing in their hallucinations so code quality might degrade (I
 > > realize this isn't a problem of LLMs per se but of unmaintained code
 > > pieces and any contribution to it).
 >
 > I fully agree with you, Richard! Making AI assistance transparent and
 > having clear rules about it is key. I didn't publicly flag my use of AI
 > tools up to now because I was afraid that the contributions would get
 > rejected upfront due to unclear policy or strong opinions. Internally, I
 > was more open about it out of respect for the reviewers who invested a
 > lot of time and effort. From now on, I will add Assisted-by: to the patches.

They absolutely should have been rejected, and now they should be
rejected for not disclosing past usage of LLMs.
See what OpenJDK does about this: https://openjdk.org/legal/ai.
I am shocked that folks are even trying to use AI to create patches in
this community while the openjdk community rejects them right away.



 > In other projects there are rules that outright refuse any contribution
 > that has to do with AI because they are afraid to be flooded by
 > unqualified code.

Not the openjdk project. They even have "safety and security" and
"intellectual-property" issues with LLM-based patches.


 > I had such an experience also in one of my projects,
 > where I got automatically generated contributions that were completely
 > useless. Some (possibly semi-automatic) pre-screening could also become
 > relevant, depending on the amount of PRs and patches of such kind.
 >
 >
 > IMHO there is no turning back from these developments, and I am
 > confident that Richard and Carlos will find a well-balanced solution.

Yes, there is. There is always turning back and rejecting them. You
have also ignored the nature of LLMs: they are largely closed off to
many people, they are bad for the environment, and they take unknown
input and spit out mixed-up, possibly not-so-legal copyrighted code.

Thanks,
Andrea Pinski

First, I apologize for the broken thread; that was my email client and operator error.

I sense a lot of contention in the text of this thread, which I read from the archives.

Since I started this thread, I thought I should step in. I think folks may be missing an important aspect of this.

Yes, an LLM set loose to go forth automatically without human interaction is going to produce garbage. Garbage in, garbage out. Don't do this. We are adults.

The key to these LLMs is the prompting. If the LLM is prompted with gcc copyrighted code chunks or files, then everything it produces is "derived from". The use of the word "generated" is a false presumption. It does not generate. It goes through a complex process of probabilities. This is why it is so good at identifying details and corner cases.

Yes, it can "hallucinate", a horrible use of the word. Go back to statistics. Statistics do lie. This is why it is critical to have iterative human interaction, which is the process used by Albert, coupled with multiple real reviewers doing real testing behind the scenes. The gfortran folks use both email and our MatterMost channels for this interaction.

I have been seeing folks here using language like "shocked" or "absolutely". Truth is, we are all on the bleeding edge. I started this thread because I did not see enough discussion in the open. I see high potential here to do the greater good. This must always be our goal.

It is not my decision on any of this. Please, everyone, let's not shoot ourselves in the foot.

As always, best regards to all.

Jerry

