Hello Guixers,
It’s been another week with no response or movement on this. I’m
disappointed that this situation seems to be getting treated so
lightly. Adhering to the terms of software licenses is
fundamental to the operation of the free software ecosystem; there
is no software freedom without it. It’s surprising that a pretty
clear-cut situation of creating derivative works of free software
in violation of their licenses would be shrugged off so easily.
Whatever the Guix organization’s position is, I’m reaching my
personal limit, and need to see some kind of positive movement on
this[1]. If Guix is going to continue to facilitate license
violations, I will have no choice but to remove my software from
it to defend them.
— Ian
[1]: Personally, I would be satisfied with a per-package setting
which disables scheduling source for archiving by SWH. Seeing
this, or a commitment to build this within a reasonable
timeframe, would allay my concerns.
Ian Eure <i...@retrospec.tv> writes:
Hello,
I’m following up on this discussion since it’s been a month and
I haven’t heard any updates.
Summarizing the situation:
- SHF has an opaque, difficult, and undocumented process for
handling name changes. I’d like to stress again that this is
*not* strictly a transgender issue (though it likely affects them
more, or in worse/different ways) -- it is a human respect issue.
Many, many more cisgender people change their name than
transgender people.
- SHF gave their archive to HuggingFace, an "AI" company which is
generating derived works with no attribution or provenance, in
ways which violate both the licenses of the projects used to train
their model and the SHF principles for LLMs.
- HuggingFace wasn’t respecting requests to opt out of their model.
On the first point, it sounds like SHF has made concrete progress to
improve[1], which is very good to hear. If SHF continues on this
course, I think the concern is resolved.
On the third point, HuggingFace has begun honoring opt-out requests,
but is still very far behind. Also, they don’t remove code from the
older versions of their model -- it remains there forever. This is
progress, but still, not great.
On the second point, I have not seen any public statements indicating
that either SHF or HuggingFace even acknowledges the problem. SHF’s
most recent newsletter[2], published in April 2024 (after these
concerns came to light), continues to tout that StarCoder2 is "the
first AI model aligned with our principles," which appears to be
false. StarCoder2 includes both licensed and unlicensed code, and
HuggingFace’s own StarChat2 playground produces works derivative of
this code, with no attribution or licensing information. There is
also no statement or position on the SHF news blog. Nor has
HuggingFace either fixed their tools or made a statement. This is
still very much a live concern.
I have a few questions:
- Has Guix reached out to SHF to express these concerns / get a
response?
- Whether a public or private response, what would Guix consider to
be an acceptable response? An unacceptable response?
- How long is Guix willing to wait for a response?
Thanks,
— Ian
[1]:
https://cohost.org/arborelia/post/5273879-they-are-fixing-some
[2]:
https://www.softwareheritage.org/wp-content/uploads/2024/04/Software-Heritage-2024-Vision-Milestones-Newsletter.pdf
Ian Eure <i...@retrospec.tv> writes:
Hi Guixy people,
I’d never heard of SWH before I started hacking on Guix last fall, and
it struck me as rather a good idea. However, I’ve seen some things
lately which have soured me on them.
They appear to be using the archive to build LLMs:
https://www.softwareheritage.org/2024/02/28/responsible-ai-with-starcoder2/
I was also distressed to see how poorly they treated a developer who
wished to update their name:
https://cohost.org/arborelia/post/4968198-the-software-heritag
https://cohost.org/arborelia/post/5052044-the-software-heritag
GPL’d software I’ve created has been packaged for Guix, which I assume
means it’s been included in SWH. While I’m dealing with their (IMO:
unethical) opt-out process, I likely also need to stop new copies from
being uploaded again in the future.
Is there a way to indicate, in a Guix package, that it should *never*
be included in SWH?
Is there a way to tell Guix to never download source from SWH?
I want absolutely nothing to do with them.
Thanks,
— Ian