Hey y'all,
I've avoided weighing in on this topic because I'm of two minds about
it. Still, when members of the community raise concerns, it's important
to take those concerns seriously. We must be careful how we address
them because the opinions and concerns of any community member are as
Hi,
Ian Eure wrote:
> While this is what their paper claims[1], it doesn’t appear to be
> true, since I can see my own GPL’d code in the training set. I’ve
> since moved nearly all of my code off GitHub, but if you visit their
> "Am I in The Stack?" page[2] and enter my old username
Hi Ian,
On Thu, Jun 27 2024, Ian Eure wrote:
> I’ve [...] moved nearly all of my code off GitHub
Me too. I think I closed it off from search crawlers. No one should be
using GitHub anymore except for merge requests. I left many years ago.
> if you visit their "Am I in The Stack?" page
Thank
Hi Ludo,
Ludovic Courtès writes:
Ian Eure wrote:
Guix sends archive requests to SWH. SWH gives that source code to
HuggingFace. HuggingFace demonstrably violates the licenses.
Which licenses? As has been said previously, and you can verify for
yourself, it does not ingest code
Ian Eure wrote:
> Guix sends archive requests to SWH. SWH gives that source code to
> HuggingFace. HuggingFace demonstrably violates the licenses.
Which licenses? As has been said previously, and you can verify for
yourself, it does not ingest code under copyleft licenses.
Ludo’.
Hi,
On 21/06/24 at 9:19, MSavoritias wrote:
On Fri, 21 Jun 2024 09:41:10 +0100
Dale Mellor wrote:
`-x archival` does it, but it is too easy to forget and once the cat is out
of the bag privacy is lost. I really think this should be default behaviour, or
at least there should be
On Fri, 21 Jun 2024 09:41:10 +0100
Dale Mellor wrote:
> On Thu, 2024-06-20 at 22:59 +0200, Ekaitz Zarraga wrote:
> > Hi,
> >
> > On 2024-06-20 22:54, Andreas Enge wrote:
> > > On Thu, Jun 20, 2024 at 07:42:44PM +0100, Dale Mellor wrote:
> > > > I'm sure guix lint tried to push my code out
On Thu, 20 Jun 2024 16:40:57 +0200
Simon Tournier wrote:
> Being concrete and explicit, could you please share:
>
> 1. Which part of your code is included in the pretraining dataset?
>
> It’s easy, you can copy/paste a snippet and it returns the location
> from where it comes from.
>
On Thu, 20 Jun 2024 16:35:10 +0200
Ekaitz Zarraga wrote:
> > 2. You seem to imply that Free Software or code is apolitical. (in the
> > sense of social or state politics not) Which it is not. Nothing is.
> > For example Free Software is explicitly pro-capitalist and
> > pro-Google/big companies.
On Thu, 2024-06-20 at 22:59 +0200, Ekaitz Zarraga wrote:
> Hi,
>
> On 2024-06-20 22:54, Andreas Enge wrote:
> > On Thu, Jun 20, 2024 at 07:42:44PM +0100, Dale Mellor wrote:
> > > I'm sure guix lint tried to push my code out to them the last time I
> > > tried.
> >
> > Ah indeed, there is this
On Thu, 20 Jun 2024 14:43:30 -0700
Andy Tai wrote:
> > Date: Wed, 19 Jun 2024 09:36:29 +0100
> > From: Dale Mellor
> > I use Guix as a tool to develop my own projects, private and
> > personal for reasons I'm keeping to myself. As part of that I write package
> > definitions for them, and use
Hi,
On Thu, 20 Jun 2024 at 19:42, Dale Mellor wrote:
> I'm sure guix lint tried to push my code out to them the last time I
> tried.
Yes, it’s the checker ’archival’.
Therefore, running “guix lint -x archival” does not send any request to
SWH.
Cheers,
simon
> Date: Wed, 19 Jun 2024 09:36:29 +0100
> From: Dale Mellor
> I use Guix as a tool to develop my own projects, private and
> personal for reasons I'm keeping to myself. As part of that I write package
> definitions for them, and use the Guix machinery to build and test. I
> *cannot* have
On Thu, Jun 20, 2024 at 10:59:41PM +0200, Ekaitz Zarraga wrote:
> For this specific case we could add some flag to the command line like
> `--do-not-archive` or something like that.
guix lint -x archival
if I understand "guix lint --help" correctly.
Andreas
Hi,
On 2024-06-20 22:54, Andreas Enge wrote:
On Thu, Jun 20, 2024 at 07:42:44PM +0100, Dale Mellor wrote:
I'm sure guix lint tried to push my code out to them the last time I tried.
Ah indeed, there is this in guix/lint.scm:
(define (check-archival package)
"Check whether PACKAGE's
On Thu, Jun 20, 2024 at 07:42:44PM +0100, Dale Mellor wrote:
> I'm sure guix lint tried to push my code out to them the last time I tried.
Ah indeed, there is this in guix/lint.scm:
(define (check-archival package)
"Check whether PACKAGE's source code is archived on Software Heritage. If
On Thu, 2024-06-20 at 19:00 +0200, Andreas Enge wrote:
> On Wed, Jun 19, 2024 at 09:36:29AM +0100, Dale Mellor wrote:
> > No, it's not. I use Guix as a tool to develop my own projects, private and
> > personal for reasons I'm keeping to myself. As part of that I write package
> >
On Wed, Jun 19, 2024 at 09:36:29AM +0100, Dale Mellor wrote:
> No, it's not. I use Guix as a tool to develop my own projects, private and
> personal for reasons I'm keeping to myself. As part of that I write package
> definitions for them, and use the Guix machinery to build and test. I
>
On Tue, 2024-06-18 at 07:19 -0700, Ian Eure wrote:
> Hi MSavoritias,
>
> Thank you for the email.
>
> I’m going to lay out this situation as clearly as I can, in the
> hope that others will better understand, and hopefully treat it
> with the seriousness it deserves.
>
> 1. Guix requests SWH
Hi MSavoritias, all,
On Thu, 20 Jun 2024 at 09:51, MSavoritias wrote:
>> Not to avoid the question but from a pragmatic point of view, one
>> might ask if the source code you write and do not want to be included
>> in the training dataset, if this source code is concretely part of
>> that
Hi,
On 2024-06-20 08:36, MSavoritias wrote:
On Wed, 19 Jun 2024 17:46:08 +0200
Ekaitz Zarraga wrote:
On 2024-06-19 12:25, raingl...@riseup.net wrote:
On 2024-06-19 11:54, Efraim Flashner wrote:
On Wed, Jun 19, 2024 at 12:13:38PM +0300, MSavoritias wrote:
...
One of our packages, dbxfs,
On Wed, 19 Jun 2024 16:41:33 +0200
Simon Tournier wrote:
> Hi MSavoritias, all,
>
> Let me provide more context.
>
> The concern started a couple of months ago, to my knowledge. And the
> discussion is still ongoing. So I think it’s incorrect to say “any
> result for over 6 months”.
Hey Simon,
On Wed, 19 Jun 2024 17:46:08 +0200
Ekaitz Zarraga wrote:
> On 2024-06-19 12:25, raingl...@riseup.net wrote:
> > On 2024-06-19 11:54, Efraim Flashner wrote:
> >> On Wed, Jun 19, 2024 at 12:13:38PM +0300, MSavoritias wrote:
> >> ...
> >> One of our packages, dbxfs, left Github a while ago and
On Wed, 19 Jun 2024 19:56:26 -0700
Felix Lechner wrote:
> Hi MSavoritias,
>
> On Wed, Jun 19 2024, MSavoritias wrote:
>
> > I am not interested what the states or licenses/copyrights allow or
> > don't allow in this case. What I care about is what we expect as a
> > community when we submit a
Hi MSavoritias,
On Wed, Jun 19 2024, MSavoritias wrote:
> I am not interested what the states or licenses/copyrights allow or
> don't allow in this case. What I care about is what we expect as a
> community when we submit a package/code to guix and if that violates
> our social rules and
On 2024-06-19 12:25, raingl...@riseup.net wrote:
On 2024-06-19 11:54, Efraim Flashner wrote:
On Wed, Jun 19, 2024 at 12:13:38PM +0300, MSavoritias wrote:
...
One of our packages, dbxfs, left Github a while ago and continued
development on a different forge. They adjusted their README to
Hi MSavoritias, all,
Let me provide more context.
The concern started a couple of months ago, to my knowledge. And the
discussion is still ongoing. So I think it’s incorrect to say “any
result for over 6 months”.
Moreover, I feel you have a misunderstanding about the HuggingFace and SWH
partnership.
On Wed, 19 Jun 2024 12:54:30 +0300
Efraim Flashner wrote:
> On Wed, Jun 19, 2024 at 12:13:38PM +0300, MSavoritias wrote:
> > On Wed, 19 Jun 2024 09:52:36 +0200
> > Simon Tournier wrote:
> >
> > > Hi Ian, all,
> > >
> > > On Tue, 18 Jun 2024 at 10:57, Ian Eure wrote:
>
> > > I think that
On 2024-06-18 20:08, Ian Eure wrote:
> Andy Tai writes:
>
>> What is the role of GNU Guix in this? If Guix is mainly a referral
>> mechanism like web page links to the actual contents, the real problem
>> is not Guix but the use of free software which can be obtained via
>> other mechanisms
On 2024-06-19 11:54, Efraim Flashner wrote:
> On Wed, Jun 19, 2024 at 12:13:38PM +0300, MSavoritias wrote:
> ...
> One of our packages, dbxfs, left Github a while ago and continued
> development on a different forge. They adjusted their README to disallow
> hosting of their code on Github. Based
On Tue, Jun 18, 2024 at 11:37:17AM +0300, MSavoritias wrote:
> Hello,
> So with that said I urge anybody who has been in contact with them in
> an official Guix capacity to come forward, otherwise I can volunteer to
> be that. Idk if we have a community outreach thing I need to be in also
> for
On Wed, Jun 19, 2024 at 10:01:43AM +0300, MSavoritias wrote:
> On Tue, 18 Jun 2024 13:31:02 -0400
> Greg Hogan wrote:
>
> > On Tue, Jun 18, 2024 at 12:33 PM MSavoritias
> > wrote:
> > >
> >
> > If you feel that LLMs/AI are violating the terms of a license, then
> > feel free to pursue that
On Wed, Jun 19, 2024 at 12:13:38PM +0300, MSavoritias wrote:
> On Wed, 19 Jun 2024 09:52:36 +0200
> Simon Tournier wrote:
>
> > Hi Ian, all,
> >
> > On Tue, 18 Jun 2024 at 10:57, Ian Eure wrote:
> >
> > > Guix is continuing to partner with SWH in spite of their continued
> > > support of
On Wed, 19 Jun 2024 09:52:36 +0200
Simon Tournier wrote:
> Hi Ian, all,
>
> On Tue, 18 Jun 2024 at 10:57, Ian Eure wrote:
>
> > Guix is continuing to partner with SWH in spite of their continued
> > support of these violations.
>
> Quickly because I am in the middle of a busy day. :-)
Hi Ian, all,
On Tue, 18 Jun 2024 at 10:57, Ian Eure wrote:
> Guix is continuing to partner with SWH in spite of their continued
> support of these violations.
Quickly because I am in the middle of a busy day. :-)
I think that LLMs raise ethical and legal questions that even the FSF or EFF or
SFC
On Tue, 18 Jun 2024 13:31:02 -0400
Greg Hogan wrote:
> On Tue, Jun 18, 2024 at 12:33 PM MSavoritias
> wrote:
> >
> > Ah it seems I wasn't clear enough.
> > I meant write something like:
> >
> > By packaging a software project for Guix you are exposing said
> > software to a code harvesting
Guix sends archive requests to SWH. SWH gives that source code to
HuggingFace. HuggingFace demonstrably violates the licenses.
Guix could stop sending archive requests to SWH. This wouldn’t
*stop* the bad things from happening, but it would *stop
condoning* them. The same as how Guix not
Hi Greg,
Please read my earlier reply in this thread[1].
HuggingFace is demonstrably violating the licenses of the Free
Software used to train its StarCoder2 LLM.
Software Heritage is continuing to partner with HuggingFace in
spite of these violations.
Guix is continuing to partner with
On Tue, Jun 18, 2024 at 12:33 PM MSavoritias wrote:
>
> Ah it seems I wasn't clear enough.
> I meant write something like:
>
> By packaging a software project for Guix you are exposing said software
> to a code harvesting project (also known as LLMs or "AI") run by
> Software Heritage and/or
What is the role of GNU Guix in this? If Guix is mainly a referral
mechanism like web page links to the actual contents, the real problem
is not Guix but the use of free software which can be obtained via
other mechanisms directly anyway to train LLMs if Guix is not in the
loop?
On Tue, 18 Jun 2024 12:21:33 -0400
Greg Hogan wrote:
> On Tue, Jun 18, 2024 at 4:37 AM MSavoritias
> wrote:
> >
> > 1. Add a clear disclaimer/requirement that any new package that is
> > added in Guix, the person has to give consent or get consent from
> > the person that the package is written
On Tue, Jun 18, 2024 at 4:37 AM MSavoritias wrote:
>
> 1. Add a clear disclaimer/requirement that any new package that is added
> in Guix, the person has to give consent or get consent from the person
> that the package is written in. This needs to be added in the docs and
> in the email
Hi MSavoritias,
Thank you for the email.
I’m going to lay out this situation as clearly as I can, in the
hope that others will better understand, and hopefully treat it
with the seriousness it deserves.
1. Guix requests SWH to archive some source code. This is fine.
2. SWH archives the
Hello,
Context:
As you may already know, there have been discussions around Software Heritage
and the LLM model they are collaborating with for a bit now. The model
itself was announced at
https://www.softwareheritage.org/2023/10/19/swh-statement-on-llm-for-code/
As I have started writing some