Re: legal questions regarding machine learning models
Mathieu Blondel writes:

> * The model alone can be distributed under a free license.
>   - As a consequence of this, neither the original data nor the program
>     to build the model need to be free.

Going by the FSF definition of a free work, specifically freedoms 1 and 3
<http://www.gnu.org/philosophy/free-sw.html>, a necessary precondition for
a work to be free is for its recipients to have free access to the source
form of the work.

What does “the source form of the work” mean for these models? Whatever
the answer is, it describes something that needs to be freely available to
every recipient in order to consider the work free.

> * The DFSG is more restrictive and requires the source of any software
>   in Debian.

The DFSG has different restrictions from the FSF definition, true. I don't
think it differs on this point, though: free access to the source form of
the work is part of the definition of free software.

-- 
 \     “I got some new underwear the other day. Well, new to me.” —Emo |
  `\                                                           Philips |
_o__)                                                                  |
Ben Finney

-- 
To UNSUBSCRIBE, email to debian-legal-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Re: legal questions regarding machine learning models
On Thu, May 28, 2009 at 5:51 AM, Francesco Poli wrote:

>> After all, a model is just a big set of numbers.
>
> Machine code is just a long sequence of 0s and 1s...

I knew someone would come up with this :-)

Let me summarize, and please correct me if I'm wrong.

* The model alone can be distributed under a free license.
  - As a consequence of this, neither the original data nor the program
    to build the model need to be free.

* The DFSG is more restrictive and requires the source of any software
  in Debian.
  - If you consider that the model is its own source, as was accepted for
    a picture which is a 2D rendering of a 3D model, then you can package
    the model directly.
  - Otherwise, the data must be included in the source package and the
    tools to build the model must be in Debian main.
    -> To cope with models which take too long to compute, it should be
       possible to ship a pre-built architecture-independent model
       together with the data. However, this doesn't solve the problem
       that the data may be too large to be hosted in the archive.
    -> If data size becomes a problem, then one could resort to the
       non-free archive in order to ship the model only.

Thank you,
Mathieu Blondel
Appropriate use of debian-legal (was: legal questions regarding machine learning models)
Steve Langasek writes:

> [specific person]'s posts are an inappropriate use of this mailing
> list and not productive, and [they should] stop posting.

On what are you basing your judgement of “appropriate use of this mailing
list”?

Can you give specific examples of posts you think are inappropriate for
this mailing list, and why those specific posts are inappropriate, so that
we can understand your position?

-- 
 \    “To me, boxing is like a ballet, except there's no music, no |
  `\      choreography, and the dancers hit each other.” —Jack Handey |
_o__)                                                                  |
Ben Finney
Re: legal questions regarding machine learning models
On Wed, 27 May 2009 11:37:56 +0200 Steve Langasek wrote:

> On Wed, May 27, 2009 at 10:33:52AM +0200, Josselin Mouette wrote:
> > > Disclaimers, of course: IANADD, TINASOTODP (and IANAL, TINLA).
> >
> > If you really feel the urge to add meaningless acronyms to all your
> > emails, please do so in your signature.
>
> Better yet: he should recognize that the reason he needs to add all these
> acronyms is because his posts are an inappropriate use of this mailing list
> and not productive, and stop posting.

You're not new to such impolite replies, and I don't think your reputation
benefits from them.

Anyway, if disagreeing with FTP masters and expressing one's own opinion
(while *explicitly* clarifying that what is expressed is just one's own
opinion, and not necessarily the official Debian position) is an
"inappropriate use of this mailing list", then I suggest that the list is
shut down as soon as possible and that debian-le...@l.d.o is turned into a
forwarder to ftpmas...@d.o ...

That way you have the guarantee that *no* reply from debian-le...@l.d.o
can possibly include heretical and sacrilegious opinions that dare to
disagree with the FTP masters!
I am not sure that the FTP masters would be overly happy to have to deal
with all the questions that are directed to debian-le...@l.d.o, but one
does not have to care about little details like these...

I used to think that the Debian Project cared about Free Software and
maybe even about free speech, but something apparently went wrong...
:-(

-- 
 New location for my website! Update your bookmarks!
 http://www.inventati.org/frx
 . Francesco Poli .
 GnuPG key fpr == C979 F34B 27CE 5CD8 DC12 31B5 78F4 279B DD6D FCF4
Re: legal questions regarding machine learning models
On Wed, 27 May 2009 10:33:52 +0200 Josselin Mouette wrote:

> On Wednesday 27 May 2009 at 00:36 +0200, Francesco Poli wrote:
[...]
> > I instead think that FTP masters should change their minds about 2D
> > images rendered from 3D models.
>
> I suggest you start your own distribution, in which you won’t ship:
>       * xfonts-* (bitmap renderings of non-free vector fonts)

Are you saying that xfonts-* are derived from non-free fonts?
How can they be DFSG-free, then?

>       * all icons shipped without SVG source

When an icon is actually created in SVG format, what's so strange about
insisting that its real source (i.e. the SVG) is shipped in the Debian
(main) source package?

>       * all pictures shipped without XCF/PSD source (oh yeah, that makes
>         a lot)

Again, for pictures that are created in XCF format, the preferred form for
making modifications is the .xcf file, in most cases.
Why are you insisting that source-less works should be accepted in Debian
main?

>       * actually, all pictures that are initially photographs of an
>         object (the preferred form of modification is the original
>         object; if you want to see it at another angle, you need to take
>         another photograph)

For photographs, the physical object is *not* the preferred form for
making modifications to the work; it's the preferred form for *recreating*
the work from scratch.
I think we have already had this discussion. See
http://lists.debian.org/debian-legal/2008/12/msg00085.html

You may argue that the same reasoning applies to 3D models, but I think
the key difference lies in the word "preferred".
Since you cannot transfer physical objects through a network, or copy and
modify them, and so forth, they are not preferred for making modifications
to photographs. 3D models are instead digital information that may well be
the preferred form for making modifications to a work.
Of course, in some cases, the huge size of a 3D model could well move the
preference to some other form.
As I said, it's always a case-by-case decision, but not one that should be
taken lightly, IMHO.

>       * all sound files shipped without the full genetic code of the
>         speaker

As for photographs, I don't think that this is the actual source.

> You could call it something like gNewSense, and you could discuss for
> hours with RMS how much better it is this way.

Naah, I disagree with RMS on a number of matters, so I don't think that my
"own distro" would be more similar to gNewSense than to Debian...

> > Disclaimers, of course: IANADD, TINASOTODP (and IANAL, TINLA).
>
> If you really feel the urge to add meaningless acronyms to all your
> emails, please do so in your signature.

Not all my messages require the same set of disclaimers, if any at all.

-- 
Francesco Poli
Re: legal questions regarding machine learning models
On Wed, 27 May 2009 11:36:55 +0200 Mark Weyer wrote:

[...]
> Extremes: I do not agree with this classification of my view.
> I value a free game for the fact that I can fool around with the source
> to make it "better". Adding features, levels, characters. If this means
> that I have to add long ears to some sprite (which is obviously generated
> from some 3D model), then I want to have access to that model and to the
> toolchain used to turn the model into the sprite. Because that is much
> simpler and more robust, and creates a much more consistent set of sprite
> animation parts, than doing it with GIMP on each part of each animation
> sequence individually. Free data is important for the very same reason
> that free programs are!

Exactly so.
I agree that this is the key aspect to take into account when talking
about this issue.
Unfortunately, some people seem to think that getting more games (or
images, or music, or ...) is worth sacrificing the important freedoms...
:-(

> What to do: As always it is a tradeoff between quantity and quality, in
> this case of packages. Maintaining a high freeness standard has an impact
> on the resources needed, so it limits the number of costly packages that
> you can support for any given amount of available resources.
> I value Debian because (and as long as) it puts the emphasis on freeness.

100 % agreement here.
I also think that Debian *should* value freeness standards over the mere
quantity of packages in main.

> > PS:I'm CC'ing to the Debian Games Team mailing list.
>
> Done as well, but I am not subscribed to that list.

Same here: I am subscribed to debian-legal, but not to
debian-devel-games.

-- 
Francesco Poli
Re: legal questions regarding machine learning models
On Wed, 27 May 2009 11:25:09 +0900 Mathieu Blondel wrote:

> On Wed, May 27, 2009 at 7:36 AM, Francesco Poli wrote:
>
> > I think that in the case of machine learning models, source form is
> > even more clearly distinct from compiled object.
> > We can consider an artificial neural network, for instance (Mathieu,
> > correct me if it's a wrong example).
> > I am under the impression that basically nobody would change connection
> > weights by hand, in order to modify a neural network.
>
> Yes, the connection weights of an artificial neural network are a good
> example of the parameters I was talking about. In practice, nobody
> would change a connection weight by hand because it's impossible to
> predict the effect of this particular weight on the overall
> performance of the model. Training algorithms are mostly clever ways
> to find a good model without trying the infinity of parameter
> combinations.

Good, this confirms my supposition.

> So in practice yes, a model would be barely useful for
> further work on the model without the original data. In that regard,
> the original data AND the program used to train the model (this
> includes the implementations and the options passed to the algorithm)
> can be seen as the only real source.

The program used to train the model is not necessarily part of the
source, IMHO.
The GNU GPL v3 states (in Section 1):

| However, it [the "Corresponding Source" for a work] does not include
| the work's System Libraries, or general-purpose tools or generally
| available free programs which are used unmodified in performing
| those activities [generate, install, and run the object, and modify
| the work] but which are not part of the work.

> But yet again, I could pretend that I just happened to find the model
> parameters by hand.

Free Software is not about pretending you are a sort of oracle who can
guess magic numbers!
Otherwise, any source availability requirement would be moot: I could
always pretend I wrote the machine code by hand, but that won't be true,
in most cases.

> After all, a model is just a big set of numbers.

Machine code is just a long sequence of 0s and 1s...

[...]
> However, this is not good in the long
> term since that makes the model dependent on the person who holds the
> data.

Definitely.

[...]
> Is it forbidden for
> someone to release an image made with Photoshop as free?

You *can* create a DFSG-free image with Adobe Photoshop.
If the source form may be read and modified with DFSG-free tools (e.g.
the GIMP), then everything is OK and the image may be included in Debian
main.
If, on the other hand, the source form of the image may *only* be
manipulated with Photoshop and other non-free tools, then I think that the
image may still be DFSG-free, but belongs in the Debian contrib archive,
at best.
At least, this is how I understand it.

> Regarding Debian packaging, I think it's a wise decision to rebuild
> the model whenever the data and the training program are free, the
> data is not too large and the computation not too long. Should
> objective criteria for what is too large and what is too long be
> decided, or should that be left to the DD? Then a remaining question is
> what to do with models for which we don't have the original data or
> the original training program?

My personal take on the matter is that, in order for a package to be
included in Debian main:

 * the package must comply with the DFSG
 * source must be distributed in the source package
 * tools needed to generate (or to use) the object must be DFSG-free
   and included in Debian main

This is how I interpret Policy 2.2.1:
http://www.debian.org/doc/debian-policy/ch-archive.html#s-main

However, it is my understanding that, in some cases (e.g. long rebuilding
times), it is acceptable to also ship pre-built
(architecture-independent) objects in the source package, *along with*
the corresponding source.
One should however be extremely careful in doing this, since it makes it
harder to check and be sure that Policy 2.2.1 requirements are satisfied.

I hope I clarified my opinions.
As stated before, I should stress again that what I expressed above are my
own opinions.
Usual disclaimers: IANAL, TINLA, IANADD, TINASOTODP.

-- 
Francesco Poli
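
As a rough illustration of the point made in the message above (that the
training data, together with the options passed to the training program,
determine the weights, while the weights alone are opaque numbers), here
is a minimal Python sketch. Everything in it is hypothetical: the toy
data set `AND_DATA`, the `train` and `predict` functions and the
`epochs`, `lr` and `seed` options are made up for this example and are
not taken from any tool mentioned in the thread.

```python
import random


def train(data, epochs=1000, lr=0.1, seed=0):
    """Train a tiny perceptron on a list of ((x1, x2), label) pairs."""
    rng = random.Random(seed)  # fixed seed: same inputs give the same weights
    w = [rng.uniform(-1, 1) for _ in range(3)]  # two weights plus a bias
    for _ in range(epochs):
        for (x1, x2), label in data:
            out = 1 if w[0] * x1 + w[1] * x2 + w[2] > 0 else 0
            err = label - out  # -1, 0 or +1
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            w[2] += lr * err
    return w


def predict(w, x1, x2):
    """Apply the trained weights to one input pair."""
    return 1 if w[0] * x1 + w[1] * x2 + w[2] > 0 else 0


# Hypothetical "source": toy training data for logical AND.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

model = train(AND_DATA)    # the "object": three numbers
rebuilt = train(AND_DATA)  # rebuilding from data + options gives the same model
```

Under these assumptions, someone who receives only `model` gets three
numbers with no practical way to adjust them by hand, whereas someone who
receives `AND_DATA` and the options can change the data and retrain.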
Re: legal questions regarding machine learning models
I know I should not reply to polemic posts because it is just one step
short of troll-feeding, but anyway:

> I suggest you start your own distribution, in which you won’t ship:
>       * xfonts-* (bitmap renderings of non-free vector fonts)

I agree that these do not belong in a free distribution. There should be
plenty of free alternatives, ness pah?

>       * all icons shipped without SVG source
>       * all pictures shipped without XCF/PSD source (oh yeah, that makes
>         a lot)

I would handle these on a case-by-case basis. For a 64x64 icon which has
no connection to other icons (apart from what can easily be done by copy
and paste), I would say the icon itself is just as good as its source.
For SVG: Yes, the ability to scale the icon to a new resolution is very
important.

I assume that your next move will be something like "But then, we cannot
ship GNOME or KDE!". I have seen such arguments before (don't know if it
was from you, though). This is just blackmail. In the same way you could
argue for the inclusion of . And, personally, I do not care whether GNOME
or KDE are in Debian.

>       * actually, all pictures that are initially photographs of an
>         object (the preferred form of modification is the original
>         object; if you want to see it at another angle, you need to take
>         another photograph)
>       * all sound files shipped without the full genetic code of the
>         speaker

You are being ridiculous on purpose. Source, as I understand it, is always
something digital.

> You could call it something like gNewSense, and you could discuss for
> hours with RMS how much better it is this way.

Just because GNU and RMS have similar views, that does not immediately
make the view invalid. This has to be judged on a case-by-case basis.

Best regards,

  Mark Weyer
Re: legal questions regarding machine learning models
On Wed, May 27, 2009 at 10:33:52AM +0200, Josselin Mouette wrote:
> > Disclaimers, of course: IANADD, TINASOTODP (and IANAL, TINLA).

> If you really feel the urge to add meaningless acronyms to all your
> emails, please do so in your signature.

Better yet: he should recognize that the reason he needs to add all these
acronyms is because his posts are an inappropriate use of this mailing list
and not productive, and stop posting.

-- 
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slanga...@ubuntu.com                                     vor...@debian.org
Re: legal questions regarding machine learning models
> > I agree with you. In particular, in many cases a single 3D model is used
> > to create many 2D images. If you don't have the model, you need to do
> > the modification many times.
> > And then there is the case of increasing the resolution...

> I don't know if it would be technically possible to go to those
> extremes. Having the source code of all the music and video intros for
> all the games, of all the sounds, could be probably 100 times bigger
> than the current archives. Well, you get the idea. I don't think it's
> a single package what we're talking about. I remember there was a
> thread some time ago on what would happen if we took the "having a
> whole free source and toolchain" when applied to music, and how it
> would be absolutely impossible to achieve, at least right now. Any
> idea on what to do in those situations?

That's a mixture of questions. I'll add my 2e-2 Euro to each separately.

Archive size: The case that I had in mind is that the data is purely
synthetic. In those cases the source form is negligibly small when
compared to the binary form. Especially in the cases you mention: Game
intros rendered from some 3D scene. Game music created from some music
score. Sounds which are programmed.
I assume that you have non-synthetic data in mind: Music which is actually
recorded, videos which are shot with real actors, sounds recorded from the
real world. And that what is shipped is a severely compressed form of the
original. In that case I guess one can argue that the source requirement
is void: I always understand source to be the preferred form for
modifications among the digital forms of the software. The kind of
modifications I see for e.g. music (replace the violin player by someone
who actually can play the instrument; correct a discord which is due to a
typo in the score) is impossible to achieve without rerecording, so a big
digital version of the music is just as useless as a small one.

Building time: Coming back to purely synthetic data, building time can be
a real pain. Waiting 24 hours (on fast machines) for a build is fine for
me as upstream, but not something I would want to cause to your buildd
when my software is just one out of thousands of packages. There, I do see
a practical problem.
With my upstream hat on, I will continue to ship my data under licenses
that do require source, but I will not care whether you redo the building
or whether you just copy the precompiled data which I give you. Provided,
of course, that you also ship the source.

Extremes: I do not agree with this classification of my view.
I value a free game for the fact that I can fool around with the source
to make it "better". Adding features, levels, characters. If this means
that I have to add long ears to some sprite (which is obviously generated
from some 3D model), then I want to have access to that model and to the
toolchain used to turn the model into the sprite. Because that is much
simpler and more robust, and creates a much more consistent set of sprite
animation parts, than doing it with GIMP on each part of each animation
sequence individually. Free data is important for the very same reason
that free programs are!

What to do: As always it is a tradeoff between quantity and quality, in
this case of packages. Maintaining a high freeness standard has an impact
on the resources needed, so it limits the number of costly packages that
you can support for any given amount of available resources.
I value Debian because (and as long as) it puts the emphasis on freeness.

> PS:I'm CC'ing to the Debian Games Team mailing list.

Done as well, but I am not subscribed to that list.
Re: legal questions regarding machine learning models
2009/5/27 Mark Weyer :
>> > This looks very similar to distributing a picture which is a 2D
>> > rendering of a 3D model without distributing the original model. This is
>> > already accepted in the archive, and the reason is that a 2D picture is
>> > its own source, and can serve as a base for modified versions this way.
>>
>> I disagree with this decision by the FTP masters.
>> I personally think that, in most cases, the 2D rendering is not the
>> actual source, since many modifications would be best made by changing
>> the 3D model and re-rendering the 2D image.
>
> I agree with you. In particular, in many cases a single 3D model is used
> to create many 2D images. If you don't have the model, you need to do
> the modification many times.
> And then there is the case of increasing the resolution...

I don't know if it would be technically possible to go to those
extremes. Having the source code of all the music and video intros for
all the games, of all the sounds, could be probably 100 times bigger
than the current archives. Well, you get the idea. I don't think it's
a single package what we're talking about. I remember there was a
thread some time ago on what would happen if we took the "having a
whole free source and toolchain" when applied to music, and how it
would be absolutely impossible to achieve, at least right now. Any
idea on what to do in those situations?

Greetings,
Miry

PS:I'm CC'ing to the Debian Games Team mailing list.
Re: legal questions regarding machine learning models
On Wednesday 27 May 2009 at 00:36 +0200, Francesco Poli wrote:
> > Of course, the decision is up to the FTP masters, but I think this
> > should be accepted for the sake of consistency with things we already
> > cannot decently exclude from the archive.
>
> I instead think that FTP masters should change their minds about 2D
> images rendered from 3D models.

I suggest you start your own distribution, in which you won’t ship:
      * xfonts-* (bitmap renderings of non-free vector fonts)
      * all icons shipped without SVG source
      * all pictures shipped without XCF/PSD source (oh yeah, that makes
        a lot)
      * actually, all pictures that are initially photographs of an
        object (the preferred form of modification is the original
        object; if you want to see it at another angle, you need to take
        another photograph)
      * all sound files shipped without the full genetic code of the
        speaker

You could call it something like gNewSense, and you could discuss for
hours with RMS how much better it is this way.

> Disclaimers, of course: IANADD, TINASOTODP (and IANAL, TINLA).

If you really feel the urge to add meaningless acronyms to all your
emails, please do so in your signature.

-- 
 .''`.      Josselin Mouette
: :' :
`. `'   “I recommend you to learn English in hope that you in
  `-     future understand things”  -- Jörg Schilling
Re: legal questions regarding machine learning models
> I mentioned Voxforge in my previous email. Their goal is to use their
> free speech data to train models with HTK and use the models with
> Julius. You can get the source code of HTK after registration on their
> website but the license has severe restrictions so HTK is not free
> software. Julius is a free software speech recognition engine that can
> use models trained with HTK. Note that HTK is pretty much THE speech
> recognition framework in the speech recognition community. If you
> consider that the ultimate source of a model is not only the data but
> also the software used to train it, then Voxforge models built with
> HTK can't be free, even though the data were free. Is it forbidden for
> someone to release an image made with Photoshop as free?

As I understand it, this depends on what you mean by "free".
It is quite possible to distribute these models under a free license, even
under one which requires distribution of source. The source code would
then be the Voxforge data plus the parameters given to HTK. It would not
include the source code of HTK, as HTK acts in this process like a
compiler.
However, a corresponding Debian package would be in contrib at best (and
that only if HTK can be shipped in non-free), because the package would
have a build-dependency on HTK.
I guess that, in the long run, your community needs a free replacement for
HTK.

Again, this is only how I understand things.

Best regards,

  Mark Weyer
Re: legal questions regarding machine learning models
> > This looks very similar to distributing a picture which is a 2D
> > rendering of a 3D model without distributing the original model. This is
> > already accepted in the archive, and the reason is that a 2D picture is
> > its own source, and can serve as a base for modified versions this way.
>
> I disagree with this decision by the FTP masters.
> I personally think that, in most cases, the 2D rendering is not the
> actual source, since many modifications would be best made by changing
> the 3D model and re-rendering the 2D image.

I agree with you. In particular, in many cases a single 3D model is used
to create many 2D images. If you don't have the model, you need to do
the modification many times.
And then there is the case of increasing the resolution...

> Disclaimers, of course: IANADD, TINASOTODP (and IANAL, TINLA).

Same here.

Best regards,

  Mark Weyer