Hi,

On 2026-02-20 17:43, Cayetano Santos via Development of GNU Guix and the GNU System distribution. wrote:

In my opinion we should accept contributions from LLMs, provided the
origin of the code is clearly stated somehow (PR title, comment line,
etc.). I don’t think this is something we could avoid, anyway.

I also assume we share the following axiomata.

1. No existing LLM limits its training data
    to works belonging to the public domain.
2. LLMs may leak their training data, outputting verbatim copies
    of their training materials.
3. Anything from around 15 lines of code/text is eligible for copyright [2].
4. We do not take upstreams' copyright claims for granted.

There is a missing point here. To me, free software is all about
empathy: I feel concerned about you, provided you feel concerned about
me. Why is this relevant in this context? Because I’m thinking about code
reviews.

If your contribution is not important enough for you to bother writing
it yourself, don’t expect me to read it, much less spend time
doing a serious review.

This applies as a general rule to everything related to LLM-generated
text, hence the importance of clearly stating the origin of the text.

C.

I do agree with most of what I read in the conversation (to different degrees), and I don't think we can do anything here. We probably shouldn't either.

If I just contribute things that I stole from another repository (no LLM in between), would reviewers check for that?

If the patch is huge, would reviewers accept it?

If the changes don't work, would reviewers let that go in?

All those questions apply whether or not LLMs are involved.

The process is different from the perspective of the person who writes the code: they may not be aware of the copyright violations, the hallucinations, etc. that the LLM will produce. From our side it doesn't really matter much.

I could've been pushing LLM-generated code to Guix for years and you wouldn't have noticed, unless it was garbage code, badly written, badly formatted or the like, and we already have mechanisms for checking that.

In summary, if the contribution is well done (no obvious copyright issues, small changes, they work, etc.) there's no way we can know whether it used an LLM or not. If we can actually tell, it would be because the contribution is not good enough, and in that case we should just reject it as we already do.


It is a different story if we, as a community, actually want to share our *opinion* about the usage of LLMs, or if we want to ask contributors to say whether their contribution was made using LLMs, just to adjust the review criteria for those contributions (they could still lie, though).


As for whether your changes are "not important for you" or anything else... You wouldn't notice either whether my changes to Guix are important for me or not. The point is whether they are important and useful for Guix.

I do not review code for the person that sent the code, I review code for the project. I'm not doing a favor to the person who sent the code, they are (potentially) doing a favor to me, to the project that I love, to help me maintain it.

That could also be a very problematic assumption for all the packages we autogenerate, import and so on. (Are they made for humans to read, Ludovic?)

Of course, I prefer well-crafted, well-described commits that work on the first run and that I can blindly apply. But life is hard.


I dislike the LLM rhetoric and all, but I just don't see any issue from our side. It's like telling people not to do drugs if they take part in Guix. That's not my problem as long as their changes are good.[1]

Of course I can have and I do have an opinion about drugs (usage, production, legality, etc). It's just not relevant.

Cheers,
Ekaitz


[1]: I do have friends who sometimes write code under the influence of substances. It is very good sometimes (that's what they say), but other times they don't understand what they did. Isn't that similar to what the LLMs do?
