Re: [sympy] Blog post on NUMFOCUS concerns

Jason Moore Mon, 27 Jan 2025 14:09:49 -0800

Dear Aaron,

Paul does raise numerous specific issues in his post and just saying "there
is nothing to worry about" doesn't allay any concerns that may form from
reading his post, at least not for me. If you look at existing
organizations that are 501(c)(6) orgs, none really give off any warm fuzzy
feelings nor do they come off as altruistic. So that does warrant concern.


And, as for the legality of generative AI tools, that will be decided by
courts around the world at some point. I just listened to this recent
Freakonomics episode: https://freakonomics.com/podcast/how-to-poison-an-a-i-machine/
and the professor pointed out the standard strategy of many companies,
which is to do something that is legally ambiguous, but do it fast and
broadly so people are hooked on it before the litigation system can even
evaluate it. Then, by the time the courts do get to it, they rule in the
company's favor because it is too ingrained to undo. The software licenses
I apply to my open source code say that my license must be carried along
with any copied and reworked versions of that code. I don't think the
companies or people using the tools are following these licenses. If they
don't have to follow them, then why does anyone at all have to follow them?

Jason
moorepants.info
+01 530-601-9791


On Mon, Jan 27, 2025 at 10:21 PM Aaron Meurer <[email protected]> wrote:

> On Sun, Jan 26, 2025 at 1:40 AM Jason Moore <[email protected]> wrote:
> >
> > Hi,
> >
> > I was browsing Paul Ivanov's blog today and came across this article:
> >
> > https://pirsquared.org/blog/numfocus-concerns.html
> >
> > We are part of NUMFOCUS, so I'd say it is important to at least be aware
> of this. I do not have an opinion yet myself, but wanted to share.
>
> I love Paul, but I think that blog post is mostly FUD and these
> concerns about the 501c6 are not something to be worried about. I had
> many discussions with various people about this and related issues at
> the NumFOCUS summit last year and I'm confident that everything is OK.
> The 501c6 is more or less just a way for NumFOCUS to raise more money,
> as it makes it easier for some types of organizations to give. But the
> whole thing is being set up so that it does not affect the
> relationship with the projects (like SymPy). Unfortunately the summit
> was several months ago so I don't remember all the details, but maybe
> some more details have been posted publicly since then. But the
> biggest high level takeaway I had from the summit is that NumFOCUS
> really does care about the open source projects and has their best
> interests as community run projects at heart, and also that it is
> probably the only fiscal sponsorship organization that fits that
> description (i.e., moving away from NumFOCUS would be a bad idea).
>
> If you're still concerned about this, I would suggest emailing Andy
> Terrel (or maybe we can get him to respond here). He is on the
> NumFOCUS board and also contributed to SymPy a long time ago.
>
> >
> > Also, this is what attracted me to his blog today:
> https://pirsquared.org/blog/current-challenges-in-free-software-and-open-source-development.html
> and is food for thought about whether we should have some policy to not
> accept AI generated code due to the likelihood of OSS licenses being
> violated. There are examples of open source projects implementing such
> rules.
>
> I personally don't think LLM outputs violate OSS licenses. The closest
> something might come to being an issue is if an LLM generated a
> significant block of code that is verbatim copied from something else.
> That's not only unlikely in general due to the way LLMs work, but it's
> unlikely for SymPy because most code that would be written for SymPy
> is not something that would already have appeared somewhere else.
>
> At any rate, the ship has basically sailed on this. I would expect a
> large fraction of SymPy contributors already make use of LLMs in some
> form or other, whether it's using code completion from something like
> GitHub Copilot or prompting a tool like ChatGPT or Cursor to help
> refactor or write a function. Frankly, if you're not using LLMs at all
> to help you code, you should, because they are very useful tools.
>
> Looking at some other projects, scikit-image added a "no AI
> contributions" policy and they ended up having to remove it:
> https://github.com/scikit-image/scikit-image/pull/7429. scikit-learn
> has a policy disallowing completely automated contributions
> (contributions that have no human in the loop)
>
> https://github.com/scikit-learn/scikit-learn/blob/main/doc/developers/contributing.rst#automated-contributions-policy
> .
> I think that's a good policy, but also I don't know if it's something
> we need to write down unless it starts to become an issue (has it?).
>
> There's also, separately, the question of the quality of LLM generated
> code. I think that we need to use the GitHub review process we have
> always been using to ensure the SymPy code remains high quality
> regardless of its source. This means the usual things: good, thorough
> tests that check for correctness, readable code, avoiding various
> antipatterns, etc. LLM generated code won't always fit these
> parameters, especially if not prompted correctly.
>
> I think the biggest concern here is contributors (especially newer
> contributors) contributing code that exclusively comes from an LLM
> without any thought from the contributor themselves. This is
> especially likely from potential GSoC applicants. This we should
> disallow, because LLMs are not good enough to do this right now, and
> in the case of a GSoC applicant, it tells us nothing about their
> coding ability. Basically, any contributor to SymPy should be
> responsible for all the code they contribute. This especially makes it
> harder to evaluate GSoC applicants, but that's unfortunately the world
> we live in and we just need to learn how to evaluate people better
> (happy to discuss ideas for this. Should we do video call interviews
> with top GSoC applicants?)
>
> Aaron Meurer
>
> >
> > Jason
> > moorepants.info
> > +01 530-601-9791
> >
> > --
> > You received this message because you are subscribed to the Google
> Groups "sympy" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> an email to [email protected].
> > To view this discussion visit
> https://groups.google.com/d/msgid/sympy/CAP7f1AjsFmZv%2BZGB2RVH9%3DS4KcaR%2B%2B0QtG8hJ1hwKYKLOXg%3D9w%40mail.gmail.com
> .
>
>

