Hi,

Sorry to top-post, but I wanted to bring the discussion back to
licensing.  I have great sympathy for the ecological and code-quality
concerns, but licensing is a separate question, and, it seems to me,
an urgent question.

Imagine I asked some AI to give me code to replicate a particular algorithm A.

It is perfectly possible that the AI will largely or completely
reproduce some existing GPL code for A, from its training data.  There
is no way that I could know that the AI has done that without some
substantial research.  Surely this would violate the license of that GPL
code?  Let's say we accept that code.  Others pick up the code and
modify it for other algorithms.  The code-base gets infected with GPL
code, in a way that will make it very difficult to disentangle.

Have we consulted a copyright lawyer on this?   Specifically, have we
consulted someone who advocates the GPL?

Cheers,

Matthew

On Thu, Jul 4, 2024 at 11:27 AM Marten van Kerkwijk
<m...@astro.utoronto.ca> wrote:
>
> Hi All,
>
> I agree with Dan that the actual contributions to the documentation are
> of little value: it is not easy to write good documentation, with
> examples that show not just the mechanics but the purpose of the
> function, i.e., go well beyond just showing some random inputs and
> outputs.  And poorly constructed examples are detrimental in that they
> just hide the fact that the documentation is bad.
>
> I also second his worries about ecological and social costs.
>
> But let me add a third issue: the costs to maintainers.  I had a quick
> glance at some of those PRs when they were first posted, but basically
> decided they were not worth my time to review.  For a human contributor,
> I might well have decided differently, since helping someone to improve
> their contribution often leads to higher quality further contributions.
> But here there seems to be no such hope.
>
> All the best,
>
> Marten
>
> Daniele Nicolodi <dani...@grinta.net> writes:
>
> > On 03/07/24 23:40, Matthew Brett wrote:
> >> Hi,
> >>
> >> We recently got a set of well-labeled PRs containing (reviewed)
> >> AI-generated code:
> >>
> >> https://github.com/numpy/numpy/pull/26827
> >> https://github.com/numpy/numpy/pull/26828
> >> https://github.com/numpy/numpy/pull/26829
> >> https://github.com/numpy/numpy/pull/26830
> >> https://github.com/numpy/numpy/pull/26831
> >>
> >> Do we have a policy on AI-generated code?   It seems to me that
> >> AI-code in general must be a license risk, as the AI may well generate
> >> code that was derived from, for example, code with a GPL-license.
> >
> > There is definitely the issue of copyright to keep in mind, but I see
> > two other issues: the quality of the contributions and one moral issue.
> >
> > IMHO the PRs linked above are not high quality contributions: for
> > example, the added examples are often redundant with each other. In my
> > experience these are representative of automatically generated content:
> > as there is little to no effort involved in writing it, the content is
> > often repetitive and with very low information density. In the case of
> > documentation, I find this very detrimental to the overall quality.
> >
> > Contributions generated with AI have huge ecological and social costs.
> > Encouraging AI generated contributions, especially where there is
> > absolutely no need to involve AI to get to the solution, as in the
> > examples above, makes the project co-responsible for these costs.
> >
> > Cheers,
> > Dan
> >
> > _______________________________________________
> > NumPy-Discussion mailing list -- numpy-discussion@python.org
> > To unsubscribe send an email to numpy-discussion-le...@python.org
> > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> > Member address: m...@astro.utoronto.ca