Re: [PATCH v2] docs/devel: relax policy on AI-generated contributions

Michael S. Tsirkin Fri, 05 Jun 2026 02:49:53 -0700

On Fri, Jun 05, 2026 at 10:39:15AM +0100, Daniel P. Berrangé wrote:
> On Fri, Jun 05, 2026 at 05:25:36AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Jun 05, 2026 at 10:17:16AM +0100, Daniel P. Berrangé wrote:
> > > On Thu, Jun 04, 2026 at 12:37:58PM +0200, Paolo Bonzini wrote:
> > > > On Wed, Jun 3, 2026 at 19:54, Daniel P. Berrangé <[email protected]> wrote:
> > > > 
> > > > > The AI policy should just
> > > > > make a point that we expect to be communicating with people not
> > > > > bots pretending to be people.
> > > > >
> > > > 
> > > > Yes, it's better to have that stated clearly.
> > > > 
> > > > > True but we also need a rule. The spirit is better explained elsewhere
> > > > > > (and also, building consensus on spirit vs. a rule are two different
> > > > > > things).
> > > > >
> > > > > Do we have a better elsewhere in this case? It is a point
> > > > > specifically about intent of the AI policy rule.
> > > > 
> > > > 
> > > > The rule in this draft says 20 lines, tests, mechanical changes and
> > > > docs.
> > > > The spirit is what is in the commit message, basically to maximize the
> > > > benefit and limit the possible damage?
> > > 
> > > Putting "the spirit" in the commit message is essentially /dev/null to
> > > anyone reading the policy later.
> > > 
> > > > > See my reply to Peter elsewhere in the thread. I agree with your
> > > > > > concerns for both docs and discretion, but I had specific uses in
> > > > > > mind that I'd like to allow.
> > > > > >
> > > > > > For docs:
> > > > > > - create tutorials and/or feature documentation based on functional
> > > > > >   tests
> > > > >
> > > > > That doesn't sound too appealing to me. Reverse engineering docs or
> > > > > tutorials from our functional tests is exactly the kind of thing that
> > > > > feels likely to result in voluminous text of marginal value which will
> > > > > have a large burden on reviewers.
> > > > >
> > > > 
> > > > At the same time this can be helpful for maintainers themselves? Let's
> > > > also look at this from the point of view of producing better output, not
> > > > just from that of being on the receiving end of slop. Especially for docs
> > > > I have a hard time imagining people sending out whole new "manuals"... The
> > > > bugfixes rule ironically seems the most dangerous to me from the
> > > > Dunning-Kruger point of view.
> > > > 
> > > > My question is: do we want disclosure for anything that is created with
> > > > the help of LLMs, even if only small parts survive untouched? I think so,
> > > > because a lot more, even if edited, would still be originally from AI. But
> > > > then it's important to have rules allowing it and a way to track it.
> > > 
> > > IMHO we need unconditional disclosure, because the use of the LLM impacts
> > > the license of the code. QEMU is traditionally expected to be GPLv2+
> > > licensed for all new code, but there's the train of thought that LLM
> > > code is public domain. If it gets human editing afterwards we can
> > > consider that the human edits are GPLv2+ licensed, but IMHO we still
> > > want to know the origins.
> > 
> > Wait that's a big ask.
> > 
> > DCO explicitly does not ask if code might be available anywhere else
> > under any other license. Just that the contributor can contribute under
> > GPL. If it's public domain then the human can license it under GPL.
> 
> For new files, in checkpatch we validate that SPDX-License-Identifier
> is explicitly set as GPL-2.0-or-later. Contributors are expected to
> justify any divergence in the commit message.
> 
> I've seen guidance that SPDX-License-Identifier for AI output code
> should NOT state a license, under the theory it is public domain.


Not state a license? Recommended by a lawyer? Seen where? Why?

> If it is human edited though, I would expect it to overrule this
> guidance and explicitly state GPL-2.0-or-later in the SPDX tag
> unless the contributor wants to explicitly put their own edits
> under public domain too.
> 

Yes.  So far we just asked:

  (b) The contribution is based upon previous work that, to the best
      of my knowledge, is covered under an appropriate open source
      license and I have the right under that license to submit that
      work with modifications, whether created in whole or in part
      by me, under the same open source license (unless I am
      permitted to submit under a different license), as indicated
      in the file; or


this:
         unless I am permitted to submit under a different license
applies to public domain works.


> Ultimately QEMU is a copyleft project as a whole and IMHO we should
> prioritize retaining that for as large a portion of the codebase as
> is practical.

But of course. We can make this explicit too: that a contribution should
be under GPL, and/or that contributing it implies licensing it under GPL.


> > > > It would definitely be intended for merge. There's a lot of boilerplate
> > > > code in the Rust bindings, for example, that is voluminous but *mostly*
> > > > lacks creativity---the creative part basically can be described by the
> > > > spec/docs and should already clear the low bar required for originality,
> > > > even if the code is automatically generated. I included a couple
> > > > examples in my reply to Peter.
> > > 
> > > So we know there are examples which are probably low risk from a license
> > > POV, but which are massively larger than 20 lines of code. This just
> > > makes me more uncomfortable with the 20 line rule as the definition of
> > > the policy - we know that rule is wrong / undesirable from the start and
> > > needs this exception to make it viable.
> > 
> > So 20 lines or mechanical changes? What is considered mechanical will be
> > decided by maintainers; contributors should check with them up front.
> 
> If we are wanting to allow mechanical changes / boilerplate, then we
> should express that in the policy such that the policy can be reasonably
> understood without having to ask permission / questions ahead of time. 
> 
> With regards,
> Daniel

Indeed, but what is mechanical is a matter of taste.


> -- 
> |: https://berrange.com       ~~        https://hachyderm.io/@berrange :|
> |: https://libvirt.org          ~~          https://entangle-photo.org :|
> |: https://pixelfed.art/berrange   ~~    https://fstop138.berrange.com :|
