Hey Ludo,

I have some comments from a quick read. I hope it's fine that I am not
waiting for the formal discussion period, as I see no reason to postpone
commenting. I might be busier later on, in which case the comments might
not come at all, which would seem unfortunate to me.

Ludovic Courtès <[email protected]> writes:

> authors: Ludovic Courtès

I understand we've established that the header will be 'authors', not
just 'author', even when there is only one person. However, 'authors' is
also used multiple times in the text, and that confused me: I wasn't
sure whether it referred to the GCD author or to authors of something
else. I propose using 'an author' in the text instead (as long as there
is no second author).

>
> ## Pledge
>
> We propose the following project commitments:
>
>   1. The project **will not use nor encourage use of genAI** for its
>        code, packages, code review, artwork, translations, or any other
>        artifacts.

What does 'the project' mean? There are maintainers, committers, and
team members behind this project... Is this supposed to say that
committers will not use genAI for the Guix tasks outlined here? Or only
that the project's computing resources won't run automated review
checkers, LLM-driven automatic package updates, and so on?

I think it would be nice if this was written more clearly, either saying
what individuals are/aren't supposed to do with regard to Guix, and/or
defining in what sense the project won't use them. In the end it's about
the responsibilities of those individuals, so claiming the project won't
do something makes it vague. I can think of many interpretations of
this, and I am not sure which one is the right one.

The previous paragraphs apply mostly to the 'will not use genAI' point.
The other point, 'encourage use of genAI', is clearer, I would say (but
still not 100 %). I imagine it means that officially, on the websites,
in the project's documentation and so on, there won't be text saying
'you might want to use an LLM for X, Y and Z'. So I think even this
other point would be worth expanding.

> 2. We kindly ask contributors to respect this choice and not use LLMs
>  for their contributions to Guix.  Nevertheless, code claimed to be
>  produced in whole or in part by genAI **may be incorporated in the
>  limit of at most 15 lines of code** to ensure the contributor has a
>  valid copyright claim on the code.

What if it's 16?
What if I fold a few lines into one to stay within the limit?
What if the contribution is split into multiple contributions to comply?

I have looked online and I understand some sources are following
this heuristic of 15 lines of code being non-copyrightable, so I suppose
it comes from such places?

But there can be 15 lines of code that are just formatting, updating a
version, or toggling a simple parameter like #:tests?, and there can be
15 lines of code that change how the whole Guix System boots.

So wouldn't it be better to say something to the effect that if the code
would not be copyrightable when written by a human, it is fine for it to
be produced by genAI?

Or are there laws or legal precedents that operate on a rule of exactly
15 lines of code, without taking the contents into account? In that
case, we can definitely leave this point as is.

>   3. Software where the majority of commits were authored or co-authored
>        by genAI **will not be packaged in Guix**.  Notable examples of
>        such code include [Claude’s C
>        compiler](https://github.com/anthropics/claudes-c-compiler/),
>        [EmDash](https://github.com/emdash-cms/emdash), and
>        [Neomacs](https://github.com/eval-exec/neomacs),

I think there will be software that is currently not written by LLMs,
but might gradually become so, with parts of the code progressively
replaced by LLM-written code.

So for such cases, maybe there should also be a clause regarding
updating software? I.e., that Guix won't update already packaged
software once it gets substantial rewrites through an LLM, and will
possibly deprecate it, following the rules of the GCD on deprecation?

Additionally, I think a better metric would be the code itself, not the
commits. If you had a project with 5000 commits and then one commit
rewriting most of it through an LLM, it might still pass as a project
that may be packaged in Guix. Of course you could argue it's not even
the same project anymore, but that's more of a philosophical question
and opinions might differ, so looking at how much of the code was
written by an LLM seems to me more resilient to such disputes.
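
To illustrate the difference between the two metrics, here is a rough
Python sketch. The numbers are made up, and the 'genai' and
'surviving_lines' fields are hypothetical; in practice they would have
to come from commit trailers, blame data, or similar, which is its own
problem.

```python
def commit_share(commits):
    """Fraction of commits flagged as genAI-authored."""
    return sum(1 for c in commits if c["genai"]) / len(commits)


def line_share(commits):
    """Fraction of lines in the current tree that come from genAI-flagged commits."""
    total = sum(c["surviving_lines"] for c in commits)
    genai = sum(c["surviving_lines"] for c in commits if c["genai"])
    return genai / total


# Hypothetical history: 5000 human commits, of which little code survives,
# followed by a single genAI commit that rewrites most of the project.
history = [{"genai": False, "surviving_lines": 2} for _ in range(5000)]
history.append({"genai": True, "surviving_lines": 40000})

by_commits = commit_share(history)  # tiny: 1 commit out of 5001
by_lines = line_share(history)      # large: 40000 of 50000 lines
```

With these made-up numbers, the commit-based rule would still allow
packaging the project, while a code-based rule would not.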

Lastly, I have a question about the motivation of this point. Is this
meant as an extension of the fact that Guix packages only open source
licensed software, or is it meant out of 'fear' that LLM-written code is
not free & open source, given the open question of the copyrightability
of LLM code? If it's the former, okay. If it's the latter, I think this
point could be generalized and made dependent on how the legal
framework around LLMs gets established in the future. Of course this is
currently unknown territory, so it probably has to be called out
explicitly at least in some way, but it might be good to state the
motivation here, so that once the FOSS community and the law have fully
adapted to a world with LLMs, it is easier to argue how to change this
point, if at all.

>   4. Packages in Guix will always be **built from source**, the only
>        exceptions being compilers or build systems for which a bootstrap

Just taking the beginning of this point. My understanding is that this
is already the case, and this GCD is not changing anything. In that case
I think it would be nice to call it out explicitly. So something to the
effect of: 'Guix has already committed to building everything from
source, with the only exceptions... And in case of software containing
fully trained neural networks that means ...'

Of course then the section about neural networks is a new one and it's
good to establish that.

---

I understand my points might make the pledge longer, which might be
undesirable. If so, I think it would be okay to have more details later
on in the GCD. The pledge itself does not have to be unambiguous as long
as it is explained elsewhere in the document.

Rutherther
