Re: location of UnicodeData.txt
On Mon, Dec 02, 2002 at 12:20:26PM -0500, Jim Penny wrote: So, can a standard be DFSG free? Strictly speaking, no. A standard is an idea, or a collection of ideas. There are many ways to express an idea, so there are many ways to express a standard. Some of these expressions may receive copyright protection. -- G. Branden Robinson | Debian GNU/Linux | [EMAIL PROTECTED] | http://people.debian.org/~branden/ | Optimists believe we live in the best of all possible worlds. Pessimists are afraid the optimists are right.
Re: location of UnicodeData.txt
On Thu, Dec 05, 2002 at 08:33:08PM -0600, John Hasler wrote: However, if that data can only be usefully expressed in precisely that way (that is, reverse-engineering those algorithms would regenerate the file) then the copyright on the file is probably unenforceable. Exactly. If there is no possibility for original expression within the technical constraints imposed, one has no ability to generate the sort of work which copyright is designed to protect. -- G. Branden Robinson | Debian GNU/Linux | [EMAIL PROTECTED] | http://people.debian.org/~branden/ | Convictions are more dangerous enemies of truth than lies. -- Friedrich Nietzsche
Re: location of UnicodeData.txt
Branden Robinson wrote: On Mon, Dec 02, 2002 at 12:20:26PM -0500, Jim Penny wrote: So, can a standard be DFSG free? Strictly speaking, no. A standard is an idea, or a collection of ideas. There are many ways to express an idea, so there are many ways to express a standard. Some of these expressions may receive copyright protection. The correct question, I think, is "Can a standards _document_ be DFSG free?" I think it could be, but most probably are not; a standards document is usually copyrighted by the organization that governs the standard, and in the absence of an explicit grant of the right to make derivative works, such a document would not be DFSG free (to the best of my understanding). Craig
Re: location of UnicodeData.txt
On Tue, Dec 10, 2002 at 05:18:38PM -0500, Branden Robinson wrote: On Thu, Dec 05, 2002 at 08:33:08PM -0600, John Hasler wrote: However, if that data can only be usefully expressed in precisely that way (that is, reverse-engineering those algorithms would regenerate the file) then the copyright on the file is probably unenforceable. Exactly. If there is no possibility for original expression within the technical constraints imposed, one has no ability to generate the sort of work which copyright is designed to protect. About 48 or more scripts are encoded. ASCII was frozen. That leaves 47! ways to order the scripts (and they did not choose alphabetic by English name). Latin alone has 840 code points (characters). Even given that there is a traditional ordering in portions of this, there are other big spans that have no natural order. A bunch more choices were made here. Then, each character has a potential of 22 binary properties (not derived from UnicodeData.txt, but in a separate file, PropList.txt), and 14 fields, most of which have 20 to 256 or more options. I would venture to guess that even with a perfect oracle, it would be essentially impossible to reverse engineer the Unicode data files, much less the ancillary algorithms. That is, a 32 bit search space with at least 36 properties to be discovered per data point is whopping big. Jim Penny
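To put a number on the ordering argument in the message above, here is a small arithmetic sketch. The figure of 47 freely orderable scripts is the poster's own estimate, not an official count, and the variable names are purely illustrative.

```python
import math

# Estimate taken from the message above: about 48 scripts, with the
# position of ASCII frozen, leaves 47 blocks that could be ordered freely.
SCRIPTS_FREE_TO_ORDER = 47

# Number of possible orderings of those scripts.
orderings = math.factorial(SCRIPTS_FREE_TO_ORDER)

# Information content of picking one particular ordering, in bits.
bits = math.log2(orderings)

print(f"47! has {len(str(orderings))} decimal digits")
print(f"one ordering carries roughly {bits:.0f} bits of choice")
```

Under these assumptions the choice of script order alone is far larger than a 32 bit space, before any per-character properties are considered.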
Re: location of UnicodeData.txt
On Tue, Dec 10, 2002 at 02:23:56PM -0800, Craig Dickson wrote: The correct question, I think, is "Can a standards _document_ be DFSG free?" I think it could be, but most probably are not; a standards document is usually copyrighted by the organization that governs the standard, and in the absence of an explicit grant of the right to make derivative works, such a document would not be DFSG free (to the best of my understanding). I agree; I was just making my usual futile effort to ensure clarity in the discussion. -- G. Branden Robinson | Debian GNU/Linux | [EMAIL PROTECTED] | http://people.debian.org/~branden/ | If existence exists, why create a creator?
Re: location of UnicodeData.txt
Craig Dickson [EMAIL PROTECTED] writes: Branden Robinson wrote: On Mon, Dec 02, 2002 at 12:20:26PM -0500, Jim Penny wrote: So, can a standard be DFSG free? Strictly speaking, no. A standard is an idea, or a collection of ideas. There are many ways to express an idea, so there are many ways to express a standard. Some of these expressions may receive copyright protection. The correct question, I think, is "Can a standards _document_ be DFSG free?" I think it could be, but most probably are not; a standards document is usually copyrighted by the organization that governs the standard, and in the absence of an explicit grant of the right to make derivative works, such a document would not be DFSG free (to the best of my understanding). Yes, this is true. However, it does not follow that an *implementation* of a standard is non-free merely because the standards document is not. Generally speaking, there is no relationship here at all.
Re: location of UnicodeData.txt
Jim Penny [EMAIL PROTECTED] writes: I would venture to guess that even with a perfect oracle, it would be essentially impossible to reverse engineer the Unicode data files, much less the ancillary algorithms. That is, a 32 bit search space with at least 36 properties to be discovered per data point is whopping big. That's irrelevant. The *implementation* of the standard is not copying, even if you had to read the standard to figure out how to do it. Indeed, a functional equivalence rule is also nice here: I can write a new program to implement *your* interface, even if I had to read your program to figure out what the interface is. (This is true because interface copyright has died flaming death.)
Re: location of UnicodeData.txt
John Hasler [EMAIL PROTECTED] writes: Thomas Bushnell writes: The copyright is on the *file* and not on the data,... Did I say it was? ...and certainly not on the *information* which the file contains. An instantiation of that information could be considered a derivative of the copyrighted work. My second paragraph explains one reason why it might not be. I believe at this point you are raising FUD. The license on Unicode explicitly grants permission to make such derivatives, if they even are such, in free programs. This is sufficient for our purposes, because it means that the free program is really free, and that's all Debian requires.
Re: location of UnicodeData.txt
Thomas Bushnell writes: I believe at this point you are raising FUD. I believe I was attempting to discuss the subject calmly and rationally while avoiding inflammatory language such as "you are raising FUD". The license on Unicode explicitly grants permission to make such derivatives, if they even are such, in free programs. Reference? I don't recall seeing this mentioned earlier in this thread, and it is not at all clear from a quick perusal of the copyright data on the Unicode site that the license for the file in question is DFSG-compliant. Could you tell me what I am missing? BTW the copy of the file on my system (installed by perl-modules) lacks the apparently required disclaimer. -- John Hasler [EMAIL PROTECTED] (John Hasler) Dancing Horse Hill Elmwood, WI
Re: location of UnicodeData.txt
John Hasler [EMAIL PROTECTED] writes: The license on Unicode explicitly grants permission to make such derivatives, if they even are such, in free programs. Reference? I don't recall seeing this mentioned earlier in this thread, and it is not at all clear from a quick perusal of the copyright data on the Unicode site that the license for the file in question is DFSG-compliant. Could you tell me what I am missing? The file is not being distributed, rather, data from it has been extracted and is being used. This is explicitly permitted by the license on the file. If you claim that a current Debian package is violating the DFSG, then the appropriate forum is debian-legal, not debian-devel.
Re: location of UnicodeData.txt
John Hasler [EMAIL PROTECTED] writes: Thomas Bushnell writes: A program can use the algorithms specified by Unicode without any copying of Unicode, and can thus be entirely free. What is UnicodeData.txt for? Do programs actually use it in some way, or is it just a reference for programmers, like the description of a protocol? Both, in theory, though I don't know of any current programs that automatically use UnicodeData.txt. A more usual pattern is to do what the Unicode people expected, which is, say, an automated scan across the file at the time libc is compiled, to generate ctype tables and the like.
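The build-time scan described in the message above can be sketched roughly as follows. This is only an illustration, not how any particular libc does it: the helper name `build_ctype_tables` and the table layout are hypothetical, and the four sample records stand in for the full file, assuming the semicolon-separated, 15-field layout of UnicodeData.txt.

```python
# Illustrative sample only: four records in the semicolon-separated
# layout of UnicodeData.txt. A real build step would read the whole file.
SAMPLE = """\
0020;SPACE;Zs;0;WS;;;;;N;;;;;
0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
"""

# Map general-category codes to the ctype-style class they feed.
CATEGORY_TO_CLASS = {"Lu": "upper", "Ll": "lower", "Nd": "digit", "Zs": "space"}


def build_ctype_tables(lines):
    """Hypothetical helper: derive ctype-style lookup tables from records."""
    tables = {"upper": set(), "lower": set(), "digit": set(), "space": set(),
              "toupper": {}, "tolower": {}}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        code = int(fields[0], 16)          # field 0: code point in hex
        char_class = CATEGORY_TO_CLASS.get(fields[2])  # field 2: category
        if char_class is not None:
            tables[char_class].add(code)
        if fields[12]:                     # field 12: simple uppercase mapping
            tables["toupper"][code] = int(fields[12], 16)
        if fields[13]:                     # field 13: simple lowercase mapping
            tables["tolower"][code] = int(fields[13], 16)
    return tables


tables = build_ctype_tables(SAMPLE.splitlines())
print(sorted(tables["digit"]))
print(tables["tolower"])
```

The point of the sketch is that the generated tables, not the file itself, are what ends up inside the compiled program.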
Re: location of UnicodeData.txt
John Hasler [EMAIL PROTECTED] writes: starner writes: If you run most algorithms specified by Unicode, like normalization, capitalization or the bidirectional algorithm, you do it with the use of the data from UnicodeData.txt, whether you copied it from there or copied it from the Unicode book. That's what I thought. Therefore any program that implements any of those algorithms is dependent on the data in UnicodeData.txt. However, if that data can only be usefully expressed in precisely that way (that is, reverse-engineering those algorithms would regenerate the file) then the copyright on the file is probably unenforceable. The copyright is on the *file* and not on the data, and certainly not on the *information* which the file contains.
Re: location of UnicodeData.txt
Thomas Bushnell writes: The copyright is on the *file* and not on the data,... Did I say it was? ...and certainly not on the *information* which the file contains. An instantiation of that information could be considered a derivative of the copyrighted work. My second paragraph explains one reason why it might not be. -- John Hasler [EMAIL PROTECTED] (John Hasler) Dancing Horse Hill Elmwood, WI
Re: location of UnicodeData.txt
On Fri, Dec 06, 2002 at 08:12:57AM -0600, John Hasler wrote: Thomas Bushnell writes: The copyright is on the *file* and not on the data,... Did I say it was? ...and certainly not on the *information* which the file contains. An instantiation of that information could be considered a derivative of the copyrighted work. My second paragraph explains one reason why it might not be. A couple of URLs of interest: http://lxr.mozilla.org/seamonkey/source/intl/unicharutil/tools/ http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Tools/unicode/makeunicodedata.py?rev=1.17content-type=text/vnd.viewcvs-markup Both show that these projects (at least) are mechanically deriving their internal unicode tables from UnicodeData.txt. Jim Penny
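As an illustration of what such mechanically derived tables look like from the user's side, Python's standard `unicodedata` module (the module that the makeunicodedata.py script linked above generates data for) exposes per-character properties directly. A minimal sketch, using U+0332 as an arbitrary example character:

```python
import unicodedata

ch = "\u0332"  # COMBINING LOW LINE, chosen only as an example

# Each lookup below is answered from tables built into the interpreter,
# not by reading UnicodeData.txt at run time.
name = unicodedata.name(ch)
category = unicodedata.category(ch)
combining_class = unicodedata.combining(ch)
bidi_class = unicodedata.bidirectional(ch)

print(name, category, combining_class, bidi_class)
```

The values returned correspond to the name, general category, combining class, and bidirectional class fields of the character's record.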
Re: location of UnicodeData.txt
Bernhard R. Link [EMAIL PROTECTED] writes: * Jim Penny [EMAIL PROTECTED] [021203 17:35]: OK, now, supposing that the unicode license is found to be non-DFSG free, and hence that UnicodeData.txt is non-free. Suppose a program implements either unicode collation, regular expressions, or any of the other things mentioned above. (collation is at: http://www.unicode.org/unicode/reports/tr10/, regular expressions are at http://www.unicode.org/unicode/reports/tr18/) Can the program be in debian main? When such ridiculous preconditions hold, the program will be most likely undistributable at all. Not only for debian but for anybody except the author, if there is a single one. (At least if it is GPL or any other similar licence) A program can use the algorithms specified by Unicode without any copying of Unicode, and can thus be entirely free.
Re: location of UnicodeData.txt
Thomas Bushnell writes: A program can use the algorithms specified by Unicode without any copying of Unicode, and can thus be entirely free. What is UnicodeData.txt for? Do programs actually use it in some way, or is it just a reference for programmers, like the description of a protocol? -- John Hasler [EMAIL PROTECTED] (John Hasler) Dancing Horse Hill Elmwood, WI
Re: location of UnicodeData.txt
John Hasler writes: Thomas Bushnell writes: A program can use the algorithms specified by Unicode without any copying of Unicode, and can thus be entirely free. What is UnicodeData.txt for? Do programs actually use it in some way, or is it just a reference for programmers, like the description of a protocol? UnicodeData.txt is a listing of every character in Unicode, its name, how it combines with other characters, whether it's left-to-right, right-to-left, or other, what type of character it is, what the capital/lowercase/titlecase version of it is, and every other normative property of a Unicode character, in a format to be read by programs. If you run most algorithms specified by Unicode, like normalization, capitalization or the bidirectional algorithm, you do it with the use of the data from UnicodeData.txt, whether you copied it from there or copied it from the Unicode book.
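To make the "format to be read by programs" concrete, the sketch below splits one record into the properties listed in the message above. The `UnicodeRecord` type and its field names are hypothetical labels chosen for this example, and the field positions assume the semicolon-separated, 15-field layout of UnicodeData.txt.

```python
from typing import NamedTuple


class UnicodeRecord(NamedTuple):
    """Hypothetical container for a subset of one record's fields."""
    code_point: int
    name: str
    general_category: str   # what type of character it is
    combining_class: int    # how it combines with other characters
    bidi_class: str         # left-to-right, right-to-left, or other
    uppercase: str          # hex code point of the capital form, or ""
    lowercase: str          # hex code point of the lowercase form, or ""
    titlecase: str          # hex code point of the titlecase form, or ""


def parse_record(line: str) -> UnicodeRecord:
    """Split one semicolon-separated line into named fields."""
    f = line.strip().split(";")
    return UnicodeRecord(
        code_point=int(f[0], 16),
        name=f[1],
        general_category=f[2],
        combining_class=int(f[3]),
        bidi_class=f[4],
        uppercase=f[12],
        lowercase=f[13],
        titlecase=f[14],
    )


record = parse_record("0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041")
print(record)
```

Fields not shown here (decomposition, numeric values, mirroring, and so on) would be handled the same way.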
Re: location of UnicodeData.txt
starner writes: If you run most algorithms specified by Unicode, like normalization, capitalization or the bidirectional algorithm, you do it with the use of the data from UnicodeData.txt, whether you copied it from there or copied it from the Unicode book. That's what I thought. Therefore any program that implements any of those algorithms is dependent on the data in UnicodeData.txt. However, if that data can only be usefully expressed in precisely that way (that is, reverse-engineering those algorithms would regenerate the file) then the copyright on the file is probably unenforceable. -- John Hasler [EMAIL PROTECTED] (John Hasler) Dancing Horse Hill Elmwood, WI
Re: location of UnicodeData.txt
* Jim Penny [EMAIL PROTECTED] [021203 17:35]: OK, now, supposing that the unicode license is found to be non-DFSG free, and hence that UnicodeData.txt is non-free. Suppose a program implements either unicode collation, regular expressions, or any of the other things mentioned above. (collation is at: http://www.unicode.org/unicode/reports/tr10/, regular expressions are at http://www.unicode.org/unicode/reports/tr18/) Can the program be in debian main? When such ridiculous preconditions hold, the program will be most likely undistributable at all. Not only for debian but for anybody except the author, if there is a single one. (At least if it is GPL or any other similar licence) Respectfully, Bernhard R. Link -- gEistiO: let's say... I have all the sources in /lost+found/waimea me: gEistiO: [...] Why lost+found? gEistiO: where else was I supposed to put them?
Re: location of UnicodeData.txt
But they clearly do not want you to modify anything, including character name! Character name is a searchable field, which some applications may need. It's an English field, for which there is a canonical translation for French, and there should be translation for other languages. But, on the unicode stability policy page http://www.unicode.org/unicode/standard/stability_policy.html it says: "The character names are used to distinguish between characters, and do not always express the full meaning of each character. They are designed to be used programmatically, and thus must be stable. In some cases the original name chosen to represent the character is inaccurate in one way or another. Any such inaccuracies are dealt with by adding annotations to the character name list (which is printed in the Unicode Standard and provided in a machine-readable format), or by adding descriptive text to the standard. Note: It is possible to produce translated names for the characters, to make the information conveyed by the name accessible to non-English speakers." Hmmm. What does that mean? Are translated names to be annotations, descriptions, character names, or are they maintained in a separate table? How do you use the name programmatically if you don't know the language they are in? I did some googling, but could not find the French translation files. Is there a URL? Jim Penny
Re: location of UnicodeData.txt
If a system simply declared a section of data to be Unicode data, and made no attempt to comprehend the contents, it probably would not need to have access to the contents of Unicode.txt. Just like if a system simply declared a section of data to be code compliant with Fortran-2026, and if it made no attempt to comprehend it, it wouldn't need access to the contents of that standard. A text-processing program that needs to display data is going to need the contents of UnicodeData for BiDi. A proper cut program should use UnicodeData, so it doesn't separate a character from a subsequent combining character. A spell program is going to need the data to know which characters end words. Anything that handles text in a way more complex than cat will need access to this data. OK, now, supposing that the unicode license is found to be non-DFSG free, and hence that UnicodeData.txt is non-free. Suppose a program implements either unicode collation, regular expressions, or any of the other things mentioned above. (collation is at: http://www.unicode.org/unicode/reports/tr10/, regular expressions are at http://www.unicode.org/unicode/reports/tr18/) Can the program be in debian main? In other words, does the program require ... non-free packages or packages which are not in our archive at all for ... execution? Jim Penny
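The cut example in the message above can be sketched as follows. This is a simplified illustration, not a complete text-segmentation algorithm: it only keeps characters with a non-zero combining class attached to the preceding base character, and the helper names `clusters` and `cut_chars` are hypothetical. The combining classes come from Python's standard `unicodedata` module, which is built from the Unicode data files.

```python
import unicodedata


def clusters(text):
    """Group each base character with the combining marks that follow it."""
    groups = []
    for ch in text:
        if groups and unicodedata.combining(ch) != 0:
            groups[-1] += ch      # attach the mark to the previous base
        else:
            groups.append(ch)     # start a new cluster
    return groups


def cut_chars(text, count):
    """Return the first `count` clusters, never splitting base from mark."""
    return "".join(clusters(text)[:count])


text = "e\u0301a"  # 'e' followed by COMBINING ACUTE ACCENT, then 'a'
naive = text[:1]           # plain slicing drops the accent
aware = cut_chars(text, 1)  # cluster-aware cut keeps it

print(repr(naive), repr(aware))
```

A plain slice returns the bare letter, while the cluster-aware version keeps the accent with it, which is the behaviour the message argues a proper cut program needs.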
Re: location of UnicodeData.txt
Jim Penny [EMAIL PROTECTED] writes: OK, now, supposing that the unicode license is found to be non-DFSG free, and hence that UnicodeData.txt is non-free. Suppose a program implements either unicode collation, regular expressions, or any of the other things mentioned above. (collation is at: http://www.unicode.org/unicode/reports/tr10/, regular expressions are at http://www.unicode.org/unicode/reports/tr18/) Can the program be in debian main? In other words, does the program require ... non-free packages or packages which are not in our archive at all for ... execution? 1) I don't answer questions based on false hypotheticals. It's not worth the time or energy. 2) Debian implements many non-free standards documents. Why don't you see how we already handle this situation?
Re: location of UnicodeData.txt
On Sun, Dec 01, 2002 at 11:06:12AM +1300, Nick Phillips wrote: On Sat, Nov 30, 2002 at 12:35:25PM -0500, Jim Penny wrote: I think you are missing the points here. First of all, DFSG applied to the standard does not want to change the standard, but wants all to be able to change the text of the standard. Huh? If I change the text of the standard, I have changed the standard! No you haven't; only the standards body in question can do that. The above is in the context of people wanting to be able to change the unicode.txt file(s). This file cannot be changed without producing something that differs from the standard. Correcting it produces an artifact that is distinct from the standard. Is that unclear? There are all sorts of reasons why you might wish to create derivative works based on the standard -- a new standard for a different purpose, for example. Derivative works are covered by copyright. Period. I would advise that you not base a defense of infringement of copyright on the fact that you have only used it to create a derivative work. Or helpful documentation of the standard for people who are intimidated by the 'dry' nature of the original... This, on the other hand, would probably be regarded as fair use, especially as you would need only illustrative snippets to create such documentation. In normal circumstances, embedding the entire table in your documentation would likely not be regarded as fair use, but that is a fact-based pattern that would have to be decided on a case-by-case basis. In this case, it is arguable that the Unicode Consortium's license specifically permits inclusion of the entire table, as it does permit unlimited extraction. On the other hand, if you wish to create a competitor to the unicode standard, say the debicode standard, I see no moral right that you have to incorporate, without permission, the unicode standard. You should expect to start from scratch! Engage brain.
Do you think that if I want to create a competitor to, say, GNU Emacs, that I should expect to have to start from scratch? Or fetchmail? Or any one of the thousands of DFSG-free packages that are in main? Brain engaged. OK, according to you, anyone can make a competitor to GNU Emacs and may use the GNU emacs code. Great. So, now consider microsoft visual gnu emacs, which is released under the MS-EULA. If that seems to fail to capture your meaning, then well, suppose I think that the GPL sucks, and that BSD is the one true license. Can I then create FreeOpenBSDGNU emacs with a BSD license (as a derivative work)? What's that? Oh, you mean that anyone may produce a derivative work that is licensed in a manner compatible with the original work's license, provided the original license specifically grants that right? Oh. Yes, I agree with that. Stated in those terms, it is not much of a surprise. Now, where in the Unicode license does it give you permission to create derivative works? The license does say "Information can be extracted from these files". Oh, and you have to provide an accompanying notice indicating the source. The license does not say that any of the information in files provided by the Unicode Consortium can be modified (except by extraction). This makes it fail DFSG guideline 3. Cheers, Nick -- Nick Phillips -- [EMAIL PROTECTED] Tomorrow will be cancelled due to lack of interest.
Re: location of UnicodeData.txt
On Sun, Dec 01, 2002 at 11:10:09AM +0100, Bernhard R. Link wrote: * Jim Penny [EMAIL PROTECTED] [021130 18:43]: Huh? If I change the text of the standard, I have changed the standard! For example, if I have : 0332;COMBINING LOW LINE;Mn;220;NSM;N;NON-SPACING UNDERSCORE and change this to 0332;NON-COMBINING LOW LINE;Mn;220;NSM;N;SPACING UNDERSCORE Then the standard has been changed! That is, this file is line after line of character number assignment, followed by character name, (and other information). There is no possible change that does not change the standard! Hint: (from standard writer's viewpoint) - A standard that can be changed by anyone, at anytime, without notice and consultation is not a standard, especially if it is a contentious standard that has some people seriously upset (i.e., Russian and CJK users). You seem to understand less and less. If the text is changed, it is no longer the standard. (A standard can not be changed by changing the text, as the standard is not a local file, but the unmodified text). So, can a standard be DFSG free? What the licence of a standard file may reasonably demand is that no changed text pretends to be the unmodified standard. They can demand more than that, a lot more. All of copyright law comes to bear (if the standard is deemed copyrightable and has been copyrighted.) In particular, the owner of a copyright has, unless waived, control over the right to distribute derivative works. The text of every standard that I know of is modifiable. However, it normally takes the consent of the standards body and is issued under its aegis. Again, Jim Penny's unicode standard has no value, and even debian unicode has very limited appeal. You are again talking of the standard. Not the text of the standard. A standard body can issue a new standard. And trademark laws and other things can force any new XYZ standard for UVW to be issued by some special entity. Look at the file!
UniCode.txt is the core of the standard, it happens to be an ASCII text file. So what, every standard is embodied in text at some point! You seem to regard standards as some Platonic ideal, completely divorced from the text which defines them. This may be a valid viewpoint in some cases; e.g. the original algol-60 report. It is not in other cases, e.g. the algol-68 report. UniCode.txt is a text file which has no redundancy and no explanatory text. There is simply no portion of this file that can be modified without making an artifact that differs from the standard in some substantive way. On the other hand, if you wish to create a competitor to the unicode standard, say the debicode standard, I see no moral right that you have to incorporate, without permission, the unicode standard. You should expect to start from scratch! Now, IANAL, but I suspect that any unicode editor that reproduced enough information from the unicode standard to be useful would be considered a derived work. More importantly, I think that it is arguable that this table is, in the terms of the Debian Social Contract, necessary for the execution of a full unicode editor. (The language of the debian Social Contract is even more general and vague than copyright law!) It talks about "and to freely use the information supplied in the creation of products supporting the Unicode(TM) Standard." If this does not include making modifications, then jurisdiction is more broken than I ever thought. (In my eyes the information should even not be copyrightable at all, but this point may be discussed). The license permits extraction of information for documentation or programs. This may be completely different from modification or correction of information. In either case, the social contract would place the unicode table into non-free; and any editor that depended on the table, or information derived from the table (in a copyright sense) in either non-free or contrib. The table itself may be non-free.
I doubt any editor will use the file itself but use modification suitable for the program. I have no problem with this result. But saying that the unicode character table cannot be distributed by debian, in spite of specific language permitting us to do so, seems a bit extreme. If it does not suit for main, then it can not be distributed as part of debian. (by definition) But it can be distributed by debian, not as a part of debian. That is, it may be put in non-free, and it may be distributed using the debian mirrors. Note: I did not use the phrase "part of"; that is yours. And the consequences of this decision will probably seem extreme to many people. This example just happens to be particularly cogent; there is no doubt it is non-free, there is no doubt it is copyrightable, there is little doubt that it is necessary for the execution of a substantial corpus of programs
Re: location of UnicodeData.txt
On Mon, Dec 02, 2002 at 11:16:07AM -0500, Jim Penny wrote: On Sun, Dec 01, 2002 at 11:06:12AM +1300, Nick Phillips wrote: There are all sorts of reasons why you might wish to create derivative works based on the standard -- a new standard for a different purpose, for example. Derivative works are covered by copyright. Period. I would advise that you not base a defense of infringement of copyright on the fact that you have only used it to create a derivative work. Umm, yes. That's exactly why the text of a standard should be free. You seem to be confusing the "should be" and "is" discussions. On the other hand, if you wish to create a competitor to the unicode standard, say the debicode standard, I see no moral right that you have to incorporate, without permission, the unicode standard. You should expect to start from scratch! Engage brain. Do you think that if I want to create a competitor to, say, GNU Emacs, that I should expect to have to start from scratch? Or fetchmail? Or any one of the thousands of DFSG-free packages that are in main? Brain engaged. OK, according to you, anyone can make a competitor to GNU Emacs and may use the GNU emacs code. Great. So, now consider microsoft visual gnu emacs, which is released under the MS-EULA. You seem to have hit the wrong button when you tried to engage your brain. Why would "create a competitor" have to mean "create a competitor under a conflicting license"? The GNU Emacs license allows you to create a competitor without starting from scratch. That is what makes it free! What's that? Oh, you mean that anyone may produce a derivative work that is licensed in a manner compatible with the original work's license, provided the original license specifically grants that right? Oh. Yes, I agree with that. Stated in those terms, it is not much of a surprise. I don't think he meant that at all. You're confusing "may" with "should expect to be able to". The whole "provided..." clause misses the point. Laws do not define morality.
Now, why do you think that it would not be a good thing for the text of the Unicode license to be free? Your only answer so far seems to be "because it currently isn't". Richard Braakman
Re: location of UnicodeData.txt
On Mon, Dec 02, 2002 at 07:30:57PM +0200, Richard Braakman wrote: On Mon, Dec 02, 2002 at 11:16:07AM -0500, Jim Penny wrote: On Sun, Dec 01, 2002 at 11:06:12AM +1300, Nick Phillips wrote: There are all sorts of reasons why you might wish to create derivative works based on the standard -- a new standard for a different purpose, for example. Derivative works are covered by copyright. Period. I would advise that you not base a defense of infringement of copyright on the fact that you have only used it to create a derivative work. Umm, yes. That's exactly why the text of a standard should be free. You seem to be confusing the "should be" and "is" discussions. On the other hand, if you wish to create a competitor to the unicode standard, say the debicode standard, I see no moral right that you have to incorporate, without permission, the unicode standard. You should expect to start from scratch! Engage brain. Do you think that if I want to create a competitor to, say, GNU Emacs, that I should expect to have to start from scratch? Or fetchmail? Or any one of the thousands of DFSG-free packages that are in main? Brain engaged. OK, according to you, anyone can make a competitor to GNU Emacs and may use the GNU emacs code. Great. So, now consider microsoft visual gnu emacs, which is released under the MS-EULA. You seem to have hit the wrong button when you tried to engage your brain. Why would "create a competitor" have to mean "create a competitor under a conflicting license"? The GNU Emacs license allows you to create a competitor without starting from scratch. That is what makes it free! The question above did not specify that the competitor was to be GPL licensed. In the universe of GPL licensed programs, you are indeed free to create a competitor using code incorporated from GNU emacs; in fact, the universe of DFSG licenses was specified.
In the universe of DFSG licensed programs, you are not free to create a competitor using incorporated code; in particular, you cannot create a BSD licensed version of GNU emacs using derived code. (And if BSD licenses were allowed, then so would the MS-EULA license, by washing the GPL through the BSD license.) What's that? Oh, you mean that anyone may produce a derivative work that is licensed in a manner compatible with the original work's license, provided the original license specifically grants that right? Oh. Yes, I agree with that. Stated in those terms, it is not much of a surprise. I don't think he meant that at all. You're confusing "may" with "should expect to be able to". The whole "provided..." clause misses the point. Laws do not define morality. This is straying terribly far afield, but are you saying that it is morally correct that the debian project modify standards without permission of the standards body? Or that it is morally correct to incorporate (portions of) other programs in your work unconditionally and without permission of the original creators? Are you saying that if the FSF brings a suit alleging GPL violation, that this suit is immoral? If your answer to any of these is yes, then your morality is very different from mine. Now, why do you think that it would not be a good thing for the text of the Unicode license to be free? Your only answer so far seems to be "because it currently isn't". 1) That is a good enough answer to make a determination on whether it is part of non-free, contrib, or main. 2) It is an embodiment of years of work by many people who did not agree that it should be free (in DFSG terms). 3) I can think of no value in a standard that is DFSG free. The purpose of a standard is to ensure interoperability. If there first has to be a discovery phase to determine how my standard deviates from your standard, interoperability is reduced if not destroyed. This is not to say that standards should not permit extension.
Most do. However, even this has been controversial in the past (Microsoft Kerberos, for example). Jim Penny Richard Braakman
Re: location of UnicodeData.txt
Jim Penny [EMAIL PROTECTED] writes: There are all sorts of reasons why you might wish to create derivative works based on the standard -- a new standard for a different purpose, for example. Derivative works are covered by copyright. Period. I would advise that you not base a defense of infringement of copyright on the fact that you have only used it to create a derivative work. The DFSG says that we only use things in Debian where we have been granted that right. We think that standards bodies should grant us that right, and then we will distribute their standard as part of our system. If they don't grant us the right, we won't take advantage of it, and we won't distribute it. Nobody is infringing any copyrights.
Re: location of UnicodeData.txt
Jim Penny [EMAIL PROTECTED] writes: Now, where in the Unicode license does it give you permission to create derivative works? The license does say "Information can be extracted from these files". Oh, and you have to provide an accompanying notice indicating the source. The license does not say that any of the information in files provided by the Unicode Consortium can be modified (except by extraction). This makes it fail DFSG guideline 3. What about the null extraction, done by using the extraction tool named cat?
Re: location of UnicodeData.txt
Jim Penny [EMAIL PROTECTED] writes: This is straying terribly far afield, but are you saying that it is morally correct that the Debian project modify standards without permission of the standards body? Or that it is morally correct to incorporate (portions of) other programs in your work unconditionally and without permission of the original creators? Are you saying that if the FSF brings a suit alleging GPL violation, that this suit is immoral? I'm saying that 1) It is not *possible* to change the standard without the permission of the standards body; 2) We want the right to be able to distribute properly marked modified versions of documents issued by standards bodies (where "properly marked" means adding a notice like "this is not the official standard, we changed it"); 3) If we don't have the right in (2), it can't be part of Debian. I happen to think that there is generally no moral problem in violating copyrights, but Debian's policy is to honor them.
Re: location of UnicodeData.txt
On Mon, Dec 02, 2002 at 10:43:42AM -0800, Thomas Bushnell, BSG wrote: Jim Penny [EMAIL PROTECTED] writes: Now, where in the Unicode license does it give you permission to create derivative works? The license does say "Information can be extracted from these files". Oh, and you have to provide an accompanying notice indicating the source. The license does not say that any of the information in files provided by the Unicode Consortium can be modified (except by extraction). This makes it fail DFSG guideline 3. What about the null extraction, done by using the extraction tool named cat? As far as I can tell, this is permitted. It would not be permitted under normal copyright law, but the license does permit arbitrary extraction. Extraction of the entirety still appears to be extraction. What the Unicode Consortium does not say is what the distribution rights are relative to the subsetted tables. This is a license weakness, but I suspect that any sane judge would hold that giving permission to do the extracting implies giving permission to distribute the result. But, I suspect that any sane judge would also say that extraction for the purpose of license laundering is not implied. That is, you could not take the Unicode Consortium's file, apply cat to it, and relicense the result under BSD (for example). Now, what is the Unicode Consortium really saying here? They are saying that you are allowed to use subsets of Unicode. For example, you may be interested in only a few languages. You may select the relevant portions of the table out. Or, if you know that you don't care about bidirectionality, ligation, extenders, grapheme link, or any of the other various and sundry attributes, you may drop them. In other words, you can do either row or column projection if that is advantageous to you. But they clearly do not want you to modify anything, including character name! Character name is a searchable field, which some applications may need.
Jim Penny
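Editor's illustration: the "row or column projection" described in the message above can be sketched in a few lines of Python. This is only a sketch under stated assumptions. The seven-field layout and the field names in FIELDS follow the record as it is quoted elsewhere in this thread, not the official file layout, and the helper names parse and project are hypothetical.

```python
# Sketch only: FIELDS mirrors the seven-field record as quoted in this thread.
# The names are illustrative assumptions, not the official column names.
FIELDS = ("code", "name", "category", "combining", "bidi", "mirrored", "old_name")

def parse(line):
    """Split one semicolon-separated record into a dict keyed by FIELDS."""
    return dict(zip(FIELDS, line.rstrip("\n").split(";")))

def project(lines, keep_row, keep_fields):
    """Row projection via keep_row(record), column projection via keep_fields.

    Values are copied through unchanged; nothing is modified, only dropped.
    """
    result = []
    for line in lines:
        record = parse(line)
        if keep_row(record):
            result.append(";".join(record[f] for f in keep_fields))
    return result

# Illustrative records in the same condensed form as the one quoted in the thread.
SAMPLE = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;N;",
    "0332;COMBINING LOW LINE;Mn;220;NSM;N;NON-SPACING UNDERSCORE",
    "05D0;HEBREW LETTER ALEF;Lo;0;R;N;",
]

if __name__ == "__main__":
    # Keep only left-to-right characters, and only the code and name columns.
    for row in project(SAMPLE, lambda r: r["bidi"] == "L", ("code", "name")):
        print(row)
```

Note that the sketch only drops rows and columns; it never rewrites a value such as the character name, which is the distinction the message draws between extraction and modification.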
Re: location of UnicodeData.txt
Jim Penny [EMAIL PROTECTED] writes: But, I suspect that any sane judge would also say that extraction for the purpose of license laundering is not implied. That is, you could not take the Unicode Consortium's file, apply cat to it, and relicense the result under BSD (for example). Sure, but nobody is proposing any kind of license laundering.
Re: location of UnicodeData.txt
But they clearly do not want you to modify anything, including character name! Character name is a searchable field, which some applications may need. It's an English field, for which there is a canonical translation for French, and there should be translations for other languages. The only overlap with any previous character coding is the first 127 characters (ASCII). Nope. There's massive overlap with previous character codings on all sorts of levels. The first 256 characters are Latin-1; the Greek block is a superset of ISO-8859-7 (that is, the characters are in the same order, but some of the gaps have been filled in), as are Cyrillic and Arabic for their respective 8859 standards. All the Indian blocks are weird echoes of ISCII. The basic CJK block is the ideographs from the preexisting Chinese, Japanese and Korean standards, sorted by the order of traditional dictionaries like the KangXi. If a system simply declared a section of data to be Unicode data, and made no attempt to comprehend the contents, it probably would not need to have access to the contents of Unicode.txt. Just like if a system simply declared a section of data to be code compliant to Fortran-2026, and if it made no attempt to comprehend it, it wouldn't need access to the contents of that standard. A text-processing program that needs to display data is going to need the contents of UnicodeData for BiDi. A proper cut program should use UnicodeData, so it doesn't separate a character from a subsequent combining character. A spell program is going to need the data to know which characters end words. Anything that handles text in a way more complex than cat will need access to this data.
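Editor's illustration: the point about a cut program not separating a base character from a following combining character can be sketched with Python's standard unicodedata module, which is built from the Unicode Character Database. This is a deliberate simplification: it attaches only marks with a nonzero canonical combining class to the preceding character, whereas full grapheme cluster segmentation has more rules. The function names clusters and cut_chars are hypothetical.

```python
import unicodedata

def clusters(text):
    """Group each base character with the combining marks that follow it.

    Simplification: only characters with a nonzero canonical combining
    class are attached to the preceding character.
    """
    groups = []
    for ch in text:
        if groups and unicodedata.combining(ch) != 0:
            groups[-1] += ch      # keep the mark with its base character
        else:
            groups.append(ch)     # start a new user-visible character
    return groups

def cut_chars(text, start, stop):
    """Return the user-visible characters in [start, stop), marks included."""
    return "".join(clusters(text)[start:stop])

if __name__ == "__main__":
    word = "e\u0301t\u0332e"      # e + acute accent, t + low line, e
    print(len(word), len(clusters(word)))
    print(cut_chars(word, 0, 1).encode("unicode_escape"))
```

A naive slice such as word[0:1] would return the bare "e" and leave the accent stranded at the front of the remainder, which is exactly the failure the message describes.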
Re: location of UnicodeData.txt
* Jim Penny [EMAIL PROTECTED] [021130 18:43]: Huh? If I change the text of the standard, I have changed the standard! For example, if I have : 0332;COMBINING LOW LINE;Mn;220;NSM;N;NON-SPACING UNDERSCORE and change this to 0332;NON-COMBINING LOW LINE;Mn;220;NSM;N;SPACING UNDERSCORE Then the standard has been changed! That is, this file is line after line of character number assignment, followed by character name, (and other information). There is no possible change that does not change the standard! Hint: (from standard writer's viewpoint) - A standard that can be changed by anyone, at any time, without notice and consultation is not a standard, especially if it is a contentious standard that has some people seriously upset (e.g., Russian and CJK users). You seem to understand less and less. If the text is changed, it is no longer the standard. (A standard cannot be changed by changing the text, as the standard is not a local file, but the unmodified text.) What the licence of a standard file may reasonably demand is that no changed text pretends to be the unmodified standard. The text of every standard that I know of is modifiable. However, it normally takes the consent of the standards body and is issued under its aegis. Again, Jim Penny's unicode standard has no value, and even debian unicode has very limited appeal. You are again talking of the standard. Not the text of the standard. A standards body can issue a new standard. And trademark laws and other things can force any new XYZ standard for UVW to be issued by some special entity. On the other hand, if you wish to create a competitor to the unicode standard, say the debicode standard, I see no moral right that you have to incorporate, without permission, the unicode standard. You should expect to start from scratch! Now, IANAL, but I suspect that any unicode editor that reproduced enough information from the unicode standard to be useful would be considered a derived work.
More importantly, I think that it is arguable that this table is, in the terms of the Debian Social Contract, necessary for the execution of a full unicode editor. (The language of the Debian Social Contract is even more general and vague than copyright law! It talks about "and to freely use the information supplied in the creation of products supporting the Unicode™ Standard". If this does not include making modifications, then jurisdiction is more broken than I ever thought. (In my eyes the information should not even be copyrightable at all, but this point may be discussed.) In either case, the social contract would place the unicode table into non-free; and any editor that depended on the table, or information derived from the table (in a copyright sense) in either non-free or contrib. The table itself may be non-free. I doubt any editor will use the file itself, but rather a modification suitable for the program. I have no problem with this result. But saying that the unicode character table cannot be distributed by debian, in spite of specific language permitting us to do so, seems a bit extreme. If it is not suitable for main, then it cannot be distributed as part of debian. (By definition.) And the consequences of this decision will probably seem extreme to many people. This example just happens to be particularly cogent; there is no doubt it is non-free, there is no doubt it is copyrightable, there is little doubt that it is necessary for the execution of a substantial corpus of programs which are otherwise DFSG free. These programs would certainly include unicode editors, and would probably include python, perl and ruby. These "no doubt"s are all wrong in my eyes. Regards, Bernhard R. Link -- gEistiO: let's just say... I have all the sources in /lost+found/waimea me: gEistiO: [...] Why lost+found? gEistiO: where else was I supposed to put them?
Re: location of UnicodeData.txt
Giacomo Catenazzi [EMAIL PROTECTED] writes: I never read ISO C In this case it's a bad idea to write C programs. You should use a programming language where the standardization committee fought with ISO to publish the text of the standard under a free (even libre) license. ;-) Are database (tables) copyrightable? In Germany, databases are subject to one of our equivalents to copyright.
Re: location of UnicodeData.txt
On Fri, Nov 29, 2002 at 11:37:41AM +0100, Bernhard R. Link wrote: * Jim Penny [EMAIL PROTECTED] [021128 03:35]: So, according to Branden, international standards are supposed to allow debian the right to modify them and to distribute the modified versions. Absent said permission, which is hardly ever going to be given, they must be considered non-free. (This is, of course, logically forthright.) Moreover, according to the non-free removal proponents, we should not even distribute the un-modified copies of these files. Yet, unicode is supposed to be the canonical character encoding scheme for debian. Does this mean every unicode text editor belongs in contrib (depends on something non-free)? I think you are missing the points here. First of all, DFSG applied to the standard does not want to change the standard, but wants all to be able to change the text of the standard. Huh? If I change the text of the standard, I have changed the standard! For example, if I have : 0332;COMBINING LOW LINE;Mn;220;NSM;N;NON-SPACING UNDERSCORE and change this to 0332;NON-COMBINING LOW LINE;Mn;220;NSM;N;SPACING UNDERSCORE Then the standard has been changed! That is, this file is line after line of character number assignment, followed by character name, (and other information). There is no possible change that does not change the standard! Hint: (from standard writer's viewpoint) - A standard that can be changed by anyone, at any time, without notice and consultation is not a standard, especially if it is a contentious standard that has some people seriously upset (e.g., Russian and CJK users). This is a good thing, the text of standards should be modifiable. How else shall someone write the following standard without having written the first or having to write all from scratch? The text of every standard that I know of is modifiable. However, it normally takes the consent of the standards body and is issued under its aegis.
Again, Jim Penny's unicode standard has no value, and even debian unicode has very limited appeal. On the other hand, if you wish to create a competitor to the unicode standard, say the debicode standard, I see no moral right that you have to incorporate, without permission, the unicode standard. You should expect to start from scratch! Secondly: What does a unicode editor have to do with the unicode standard? It should only implement it. If it would contain parts of the standard-text (tables or whatever) that were protected by copyright law and the standard would allow no modifications, then no one would be allowed to copy the editor. (No special problem with debian) A unicode editor must know certain properties of the character set (note, I am not talking about font properties here, unicode does not deal with fonts.) Examples might be language, combining marks, bidirectionality, input methods, surrogates, Hangul syllables. These are things that an editor must know, and that, pretty much, must be looked up in the unicode table. Now, the unicode license happens to be fairly clear, and fairly permissive. See: http://www.unicode.org/Public/UNIDATA/UnicodeCharacterDatabase.html It specifically gives permission for redistribution, without fee, providing a statement of copyright and a disclaimer are preserved. It also specifically allows incorporation into programs under the same terms. But those terms happen to be non-DFSG free. They fail 3 and 8. Now, IANAL, but I suspect that any unicode editor that reproduced enough information from the unicode standard to be useful would be considered a derived work. More importantly, I think that it is arguable that this table is, in the terms of the Debian Social Contract, necessary for the execution of a full unicode editor. (The language of the Debian Social Contract is even more general and vague than copyright law!)
In either case, the social contract would place the unicode table into non-free; and any editor that depended on the table, or information derived from the table (in a copyright sense) in either non-free or contrib. I have no problem with this result. But saying that the unicode character table cannot be distributed by debian, in spite of specific language permitting us to do so, seems a bit extreme. And the consequences of this decision will probably seem extreme to many people. This example just happens to be particularly cogent; there is no doubt it is non-free, there is no doubt it is copyrightable, there is little doubt that it is necessary for the execution of a substantial corpus of programs which are otherwise DFSG free. These programs would certainly include unicode editors, and would probably include python, perl and ruby. Jim Penny Respectfully, Bernhard R. Link
Re: location of UnicodeData.txt
On Sat, Nov 30, 2002 at 12:35:25PM -0500, Jim Penny wrote: I think you are missing the points here. First of all, DFSG applied to the standard does not want to change the standard, but wants all to be able to change the text of the standard. Huh? If I change the text of the standard, I have changed the standard! No you haven't, only the standards body in question can do that. There are all sorts of reasons why you might wish to create derivative works based on the standard -- a new standard for a different purpose, for example. Or helpful documentation of the standard for people who are intimidated by the 'dry' nature of the original... On the other hand, if you wish to create a competitor to the unicode standard, say the debicode standard, I see no moral right that you have to incorporate, without permission, the unicode standard. You should expect to start from scratch! Engage brain. Do you think that if I want to create a competitor to, say, GNU Emacs, that I should expect to have to start from scratch? Or fetchmail? Or any one of the thousands of DFSG-free packages that are in main? Cheers, Nick -- Nick Phillips -- [EMAIL PROTECTED] Tomorrow will be cancelled due to lack of interest.
Re: location of UnicodeData.txt
* Jim Penny [EMAIL PROTECTED] [021128 03:35]: So, according to Branden, international standards are supposed to allow debian the right to modify them and to distribute the modified versions. Absent said permission, which is hardly ever going to be given, they must be considered non-free. (This is, of course, logically forthright.) Moreover, according to the non-free removal proponents, we should not even distribute the un-modified copies of these files. Yet, unicode is supposed to be the canonical character encoding scheme for debian. Does this mean every unicode text editor belongs in contrib (depends on something non-free)? I think you are missing the points here. First of all, DFSG applied to the standard does not want to change the standard, but wants all to be able to change the text of the standard. This is a good thing, the text of standards should be modifiable. How else shall someone write the following standard without having written the first or having to write all from scratch? Secondly: What does a unicode editor have to do with the unicode standard? It should only implement it. If it would contain parts of the standard-text (tables or whatever) that were protected by copyright law and the standard would allow no modifications, then no one would be allowed to copy the editor. (No special problem with debian) Respectfully, Bernhard R. Link
Re: location of UnicodeData.txt
Richard Braakman [EMAIL PROTECTED] writes: On Thu, Nov 28, 2002 at 02:58:38PM +0100, Tim Dijkstra wrote: UnicodeData is different, because we need the data in our program, not only the ideas. And in this case we see that as software! Maybe you're right that we don't really need the RFCs in main. They actually are there now and it would be a shame if we dropped them. But we need files like this unicode file in main, which is part of a specification (I think), so can't be altered. But do you think it's _okay_ for such a file not to be free? (Whether it actually is or not is a topic for debian-legal). Whether it's OK is not for debian-devel. Whether it is or is not is for debian-legal, and I'll comment on the thread if and when it shows up there.
Re: location of UnicodeData.txt
On Wed, 27 Nov 2002 16:53:00 -0500 Branden Robinson [EMAIL PROTECTED] wrote: On Wed, Nov 27, 2002 at 04:23:51PM -0500, Jim Penny wrote: I see no problem with this license as far as it goes, but it doesn't go far enough. There is no permission granted to make modifications (and distribute modified versions). (DFSG 3) So, according to Branden, international standards are supposed to allow debian the right to modify them and to distribute the modified versions. No, international standards can say whatever they want, and bear whatever license the standards organization wants, within the law. Debian has its Free Software Guidelines and we do not, in theory, apply them differently based on who the licensor is. So doesn't this mean it's time to change the social contract or the DFSG (are standards software?) to make an exception for 'documents and files describing standards'? It's clear that we can't live without them (hence they should be in main), and it is also clear there is no use in changing standards on your own. And no, I can't make an amendment, I'm not a DD. Tim
Re: location of UnicodeData.txt
Tim Dijkstra wrote: So doesn't this mean it's time to change the social contract or the DFSG (are standards software?) to make an exception for 'documents and files describing standards'? It's clear that we can't live without them (hence they should be in main), and it is also clear there is no use in changing standards on your own. No, we can live without standards in main. I never read the ISO C and POSIX standards (because these were non-free (as in free beer)). But I program GNU/Linux in C. Also the RFCs are not free enough, but I see no problem reading them in non-free. It would be better to have those standards in 'main' and to be able to modify them (translate, correct, collect, simplify, ...), but yet... UnicodeData is different, because we need the data in our program, not only the ideas. And in this case we see that as software! Are databases (tables) copyrightable? (IIRC there was some discussion in the US, but anyway, in the 'free world', IMO data are free.) ciao giacomo
Re: location of UnicodeData.txt
On Thu, 28 Nov 2002 13:55:31 +0100 Giacomo Catenazzi [EMAIL PROTECTED] wrote: No, we can live without standards in main. I never read the ISO C and POSIX standards (because these were non-free (as in free beer)). But I program GNU/Linux in C. Also the RFCs are not free enough, but I see no problem reading them in non-free. It would be better to have those standards in 'main' and to be able to modify them (translate, correct, collect, simplify, ...), but yet... There is no use in changing standards without agreeing on it in some forum. UnicodeData is different, because we need the data in our program, not only the ideas. And in this case we see that as software! Maybe you're right that we don't really need the RFCs in main. They actually are there now and it would be a shame if we dropped them. But we need files like this unicode file in main, which is part of a specification (I think), so can't be altered. grts Tim
Re: location of UnicodeData.txt
On Thu, Nov 28, 2002 at 02:58:38PM +0100, Tim Dijkstra wrote: UnicodeData is different, because we need the data in our program, not only the ideas. And in this case we see that as software! Maybe you're right that we don't really need the RFCs in main. They actually are there now and it would be a shame if we dropped them. But we need files like this unicode file in main, which is part of a specification (I think), so can't be altered. But do you think it's _okay_ for such a file not to be free? (Whether it actually is or not is a topic for debian-legal). We could be stuck with programs that we can't modify to support new languages, for example. Richard Braakman
Re: location of UnicodeData.txt
Richard Braakman writes: But do you think it's _okay_ for such a file not to be free? /usr/share/perl/5.8.0/unicore/UnicodeData.txt, which I assume is the file you are talking about, contains just a table of data. Unless its creation involved creativity rather than just sweat of the brow it is not protected by copyright. -- John Hasler [EMAIL PROTECTED] (John Hasler) Dancing Horse Hill Elmwood, WI
Re: location of UnicodeData.txt
Hi, On Thu, Nov 28, 2002 at 10:45:48AM -0600, John Hasler wrote: Richard Braakman writes: But do you think it's _okay_ for such a file not to be free? /usr/share/perl/5.8.0/unicore/UnicodeData.txt, which I assume is the file you are talking about, contains just a table of data. Unless its creation involved creativity rather than just sweat of the brow it is not protected by copyright. I'd say that the definition of Unicode, heck even ASCII, involves a fair amount of creativity. The question is, is the definition of Unicode, as a set of named glyphs and encodings, protected by copyright? If not, then a table in a particular format representing that definition and nothing but that definition is not likely to be copyrightable. However, in these perverse times, where companies patent hyperlinks, I honestly have no idea whether Unicode itself is owned but licensed royalty-free, or as free as say, ASCII or English. Newspeak is free for non-commercial and other non-infringing uses, and when not used to say bad things about Our President. Otherwise, please contact the worldwide patent bureau for a RAND-license. Bah. Cheers, Emile. -- E-Advies / Emile van Bergen | [EMAIL PROTECTED] tel. +31 (0)70 3906153| http://www.e-advies.info
Re: location of UnicodeData.txt
On Thu, Nov 28, 2002 at 06:07:57PM +0100, Emile van Bergen wrote: However, in these perverse times, where companies patent hyperlinks, I honestly have no idea whether Unicode itself is owned but licensed royalty-free, or as free as say, ASCII or English. These days I wouldn't be eager to rely on the limits of copyrightability. CNN.com - Composer pays for piece of silence - Sep. 23, 2002 http://www.cnn.com/2002/SHOWBIZ/Music/09/23/uk.silence/ Richard Braakman
Re: location of UnicodeData.txt
On Thu, Nov 28, 2002 at 07:33:43PM +0200, Richard Braakman wrote: These days I wouldn't be eager to rely on the limits of copyrightability. CNN.com - Composer pays for piece of silence - Sep. 23, 2002 http://www.cnn.com/2002/SHOWBIZ/Music/09/23/uk.silence/ It's worth pointing out that the major problem there was that the derived work actually credited the original which would make it much harder to defend a copyright action. -- You grabbed my hand and we fell into it, like a daydream - or a fever.
Re: location of UnicodeData.txt
Emile van Bergen writes: I'd say that the definition of Unicode, heck even ASCII, involves a fair amount of creativity. I don't doubt that the development of Unicode involved creativity: under current law it probably qualifies as a patentable invention. Inventions and ideas, however, cannot be copyrighted: only creative works reduced to tangible form can. I'm arguing that the _creation_ _of_ _that_ _table_ involved no creativity, not that the invention of Unicode didn't. Is it possible to create other Unicode tables that serve the same purpose as that one and differ from it non-trivially? -- John Hasler [EMAIL PROTECTED] (John Hasler) Dancing Horse Hill Elmwood, WI
Re: location of UnicodeData.txt
Hi, On Thu, Nov 28, 2002 at 11:47:52AM -0600, John Hasler wrote: Emile van Bergen writes: I'd say that the definition of Unicode, heck even ASCII, involves a fair amount of creativity. I don't doubt that the development of Unicode involved creativity: under current law it probably qualifies as a patentable invention. Inventions and ideas, however, cannot be copyrighted: only creative works reduced to tangible form can. I'm arguing that the _creation_ _of_ _that_ _table_ involved no creativity, not that the invention of Unicode didn't. Well, so you say that if I write a novel, all my creativity is in the abstract idea; putting the words down involved no extra creativity; thus the sequence of words cannot be copyrighted? I think your argument doesn't help here... Is it possible to create other Unicode tables that serve the same purpose as that one and differ from it non-trivially? Good question. Under your reasoning, merely writing the list down from the unicode spec, possibly using | as separators instead of :, should do the trick. Cheers, Emile.
Re: location of UnicodeData.txt
On Thu, 28 Nov 2002 17:57:35 +0200 Richard Braakman [EMAIL PROTECTED] wrote: On Thu, Nov 28, 2002 at 02:58:38PM +0100, Tim Dijkstra wrote: UnicodeData is different, because we need the data in our program, not only the ideas. And in this case we see that as software! Maybe you're right that we don't really need the RFCs in main. They actually are there now and it would be a shame if we dropped them. But we need files like this unicode file in main, which is part of a specification (I think), so can't be altered. But do you think it's _okay_ for such a file not to be free? (Whether it actually is or not is a topic for debian-legal). What I mean to say is that it's useless to demand that you can modify a 'standard', so yes, I think it's OK that it is DFSG non-free. What is the use of changing this unicode table, but not telling the rest of the world? grts Tim
Re: location of UnicodeData.txt
On Thu, Nov 28, 2002 at 07:02:07PM +0100, Emile van Bergen wrote: On Thu, Nov 28, 2002 at 11:47:52AM -0600, John Hasler wrote: I'm arguing that the _creation_ _of_ _that_ _table_ involved no creativity, not that the invention of Unicode didn't. Well, so you say that if I write a novel, all my creativity is in the abstract idea; putting the words down involved no extra creativity; thus the sequence of words cannot be copyrighted? I don't think I would follow you that far, but I do agree that saying it's just a table of data is no more meaningful than saying it's just a sequence of characters. The nature of the data is relevant. In this case, the data consists of two main components: - A mapping of character codes to character names - A list of attributes for each character Both of these components were carefully designed, with decisions that involve efficiency tradeoffs (use vs. compression vs. conversion) and that affect the usefulness of the result. Some of the decisions are still controversial. This data wasn't found engraved on some rock, and it's not a collection of pre-existing facts, it was created. Is it possible to create other Unicode tables that serve the same purpose as that one and differ from it non-trivially? Good question. Under your reasoning, merely writing the list down from the unicode spec, possibly using | as separators instead of :, should do the trick. Note that this file _is_ part of the unicode spec. As far as I know the character attributes are defined nowhere else. So writing the list down from the unicode spec means copying the file. Richard Braakman
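Editor's illustration: the two components named in the message above (a mapping of character codes to names, and a list of attributes per character) are what Python's standard unicodedata module exposes, since that module is generated from the Unicode Character Database. A minimal sketch follows; the function name describe and the dictionary keys are hypothetical choices for illustration.

```python
import unicodedata

def describe(ch):
    """Return the code, name and a few attributes of a single character."""
    return {
        "code": f"{ord(ch):04X}",                  # code point in hex
        "name": unicodedata.name(ch, ""),          # character name, "" if unnamed
        "category": unicodedata.category(ch),      # general category
        "combining": unicodedata.combining(ch),    # canonical combining class
        "bidi": unicodedata.bidirectional(ch),     # bidirectional class
    }

if __name__ == "__main__":
    # U+0332 is the character used as the example record in this thread.
    for key, value in describe("\u0332").items():
        print(key, value)
```

Every value returned here comes out of the character database rather than out of the program's own logic, which is why the thread treats the table as something programs need at run time and not merely as documentation.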
Re: location of UnicodeData.txt
Hi, On Thu, Nov 28, 2002 at 10:47:02PM +0200, Richard Braakman wrote: On Thu, Nov 28, 2002 at 07:02:07PM +0100, Emile van Bergen wrote: On Thu, Nov 28, 2002 at 11:47:52AM -0600, John Hasler wrote: Is it possible to create other Unicode tables that serve the same purpose as that one and differ from it non-trivially? Good question. Under your reasoning, merely writing the list down from the unicode spec, possibly using | as separators instead of :, should do the trick. Note that this file _is_ part of the unicode spec. As far as I know the character attributes are defined nowhere else. So writing the list down from the unicode spec means copying the file. So the spec is the implementation, and a copyrighted implementation at that? That's lousy, to say the least. So all programs that follow the full Unicode spec are derived works of the implementation of the standard found in that copyrighted table, and have to carry the copyright and disclaimer notice mandated by the Unicode consortium? If so, Unicode cannot be regarded an unencumbered standard, if you're strict about it. Cheers, Emile.
Re: location of UnicodeData.txt
On Thu, Nov 28, 2002 at 10:45:48AM -0600, John Hasler wrote: Richard Braakman writes: But do you think it's _okay_ for such a file not to be free? /usr/share/perl/5.8.0/unicore/UnicodeData.txt, which I assume is the file you are talking about, contains just a table of data. Unless its creation involved creativity rather than just sweat of the brow it is not protected by copyright. A similar discussion might apply to the CMap files in /usr/share/fonts/cmap provided by cmap-adobe-japan1, etc. These have an Adobe copyright and license preventing modification, but the actual contents are just: 33 begincidrange 00ff0 0100 01ff 256 0200 02ff 512 0300 03ff 768 0400 04ff 1024 0500 05ff 1280 0600 06ff 1536 0700 07ff 1792 0800 08ff 2048 0900 09ff 2304 etc Hamish -- Hamish Moffatt VK3SB [EMAIL PROTECTED] [EMAIL PROTECTED]
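Editor's illustration: to show how little such a table encodes, here is a sketch of a lookup over the ranges quoted in the message above. It assumes the usual reading of a cidrange entry as (first code, last code, first CID), with CIDs increasing by one across the range. The first entry in the quoted excerpt is garbled in the archive and is left out, and the names CID_RANGES and code_to_cid are hypothetical.

```python
from bisect import bisect_right

# (first_code, last_code, first_cid) triples as quoted in the message above.
# Assumption: each range maps codes to consecutive CIDs starting at first_cid.
CID_RANGES = [
    (0x0100, 0x01FF, 256),
    (0x0200, 0x02FF, 512),
    (0x0300, 0x03FF, 768),
    (0x0400, 0x04FF, 1024),
    (0x0500, 0x05FF, 1280),
    (0x0600, 0x06FF, 1536),
    (0x0700, 0x07FF, 1792),
    (0x0800, 0x08FF, 2048),
    (0x0900, 0x09FF, 2304),
]

def code_to_cid(code, ranges=CID_RANGES):
    """Map a character code to its CID, or None if no range covers it."""
    starts = [first for first, _, _ in ranges]
    index = bisect_right(starts, code) - 1   # last range starting at or before code
    if index < 0:
        return None
    first, last, first_cid = ranges[index]
    if code > last:
        return None
    return first_cid + (code - first)

if __name__ == "__main__":
    print(code_to_cid(0x0123))   # falls in the 0100..01ff range
    print(code_to_cid(0x0050))   # not covered by the quoted ranges
```

In the quoted excerpt every range maps a code to a CID of the same numeric value, so the data carries no more than the range boundaries themselves.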
Re: location of UnicodeData.txt
On Thu, Nov 28, 2002 at 05:46:04PM +, Mark Brown wrote:
> On Thu, Nov 28, 2002 at 07:33:43PM +0200, Richard Braakman wrote:
> > These days I wouldn't be eager to rely on the limits of copyrightability.
> >
> > CNN.com - Composer pays for piece of silence - Sep. 23, 2002
> > http://www.cnn.com/2002/SHOWBIZ/Music/09/23/uk.silence/
>
> It's worth pointing out that the major problem there was that the derived work actually credited the original which would make it much harder to defend a copyright action.

But they never stated which part of 4'33" had apparently had its copyright violated...

-- 
Paul TBBle Hampson, MCSE
5th year CompSci/Asian Studies student, ANU
The Boss, Bubblesworth Pty Ltd (ABN: 51 095 284 361)
[EMAIL PROTECTED]

Of course Pacman didn't influence us as kids. If it did, we'd be running around in darkened rooms, popping pills and listening to repetitive music.

This email is licensed to the recipient for non-commercial use, duplication and distribution.
Re: location of UnicodeData.txt
On Thu, Nov 28, 2002 at 07:02:07PM +0100, Emile van Bergen wrote:
> Hi,
>
> On Thu, Nov 28, 2002 at 11:47:52AM -0600, John Hasler wrote:
> > Emile van Bergen writes:
> > > I'd say that the definition of Unicode, heck even ASCII, involves a fair amount of creativity.
> >
> > I don't doubt that the development of Unicode involved creativity: under current law it probably qualifies as a patentable invention. Inventions and ideas, however, cannot be copyrighted: only creative works reduced to tangible form can. I'm arguing that the _creation_ _of_ _that_ _table_ involved no creativity, not that the invention of Unicode didn't.
>
> Well, so you say that if I write a novel, all my creativity is in the abstract idea; putting the words down involved no extra creativity; thus the sequence of words cannot be copyrighted?

Copyright _cannot_ be applied to ideas, only to the implementation or physical representation of those ideas. (I can't copyright "an mp3 player" but I can copyright "The mp3 player I wrote".)

A patent, however, applies to a new process or invention, which will usually encompass an idea more strongly. (I can patent "A method of turning mp3s into sound" as long as no-one else has done it that way before.)

Patents are civil actions, while copyright violation is criminal, so copyright _has_ to be more limited than patents since there's more punishment and less recourse.

-- 
Paul TBBle Hampson, MCSE
5th year CompSci/Asian Studies student, ANU
The Boss, Bubblesworth Pty Ltd (ABN: 51 095 284 361)
[EMAIL PROTECTED]

Of course Pacman didn't influence us as kids. If it did, we'd be running around in darkened rooms, popping pills and listening to repetitive music.

This email is licensed to the recipient for non-commercial use, duplication and distribution.
Re: location of UnicodeData.txt
Paul Hampson writes: Patents are civil actions, while copyright violation is criminal,... In the US copyright infringement is usually (not always anymore, but still usually) civil as well. -- John Hasler [EMAIL PROTECTED] Dancing Horse Hill Elmwood, Wisconsin
location of UnicodeData.txt
I am developing a (simple) application that needs to have the UnicodeData.txt file available. Of course there are more applications that need this file. So far I found these two:

perl-modules: /usr/share/perl/5.8.0/unicore/UnicodeData.txt
console-data: /usr/share/unidata/UnicodeData-2.1.8.txt (way obsolete)

I was thinking about putting the file into something like /usr/share/unidata (probably with more files, those from /usr/share/perl/5.8.0/unicore/). Moreover, my application can use the file even if it is gzipped, which is obviously desirable.

What are your opinions about this? I would rather not Depend: on perl-modules (it is a python application :-)), but duplicating /usr/share/perl/5.8.0/unicore/UnicodeData.txt seems a waste of disk space and bandwidth, and messing with symlinks in postinst I like even less.

-- 
 ---
| Radovan Garabík http://melkor.dnp.fmph.uniba.sk/~garabik/ |
| __..--^^^--..__ garabik @ melkor.dnp.fmph.uniba.sk |
 ---
Antivirus alert: file .signature infected by signature virus.
Hi! I'm a signature virus! Copy me into your signature file to help me spread!
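A minimal sketch of what such an application might do, assuming a shared location along the lines proposed above. The /usr/share/unidata/UnicodeData.txt(.gz) paths are hypothetical (they are the suggestion under discussion, not an existing Debian location), and the helper names open_unicode_data and parse_unicode_data are made up for illustration. The only format detail relied on is that UnicodeData.txt is a semicolon-separated table whose first three fields are the code point in hex, the character name and the general category.

```python
import gzip
import os

# Candidate locations. The first two are the hypothetical shared location
# proposed in this thread; the last is the perl-modules copy named above.
CANDIDATES = [
    "/usr/share/unidata/UnicodeData.txt.gz",
    "/usr/share/unidata/UnicodeData.txt",
    "/usr/share/perl/5.8.0/unicore/UnicodeData.txt",
]


def open_unicode_data(paths=CANDIDATES):
    """Open the first existing candidate, transparently handling gzip."""
    for path in paths:
        if os.path.exists(path):
            if path.endswith(".gz"):
                return gzip.open(path, "rt", encoding="utf-8")
            return open(path, "r", encoding="utf-8")
    raise FileNotFoundError("UnicodeData.txt not found in: " + ", ".join(paths))


def parse_unicode_data(lines):
    """Map each code point to its (name, general category) pair."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        # Field 0: code point (hex), field 1: name, field 2: general category.
        table[int(fields[0], 16)] = (fields[1], fields[2])
    return table
```

Trying the compressed name first means the application never needs its own copy of the table; whichever package ends up shipping the file only has to agree on the path.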
Re: location of UnicodeData.txt
Radovan Garabik [EMAIL PROTECTED] writes:
> I am developing a (simple) application that needs to have the UnicodeData.txt file available. Of course there are more applications that need this file. So far I found these two:
>
> perl-modules: /usr/share/perl/5.8.0/unicore/UnicodeData.txt
> console-data: /usr/share/unidata/UnicodeData-2.1.8.txt (way obsolete)

Heh. There's another:

miscfiles: /usr/share/misc/unicode.gz

The current version is Unicode 3.1.1.
Re: location of UnicodeData.txt
On Wed, Nov 27, 2002 at 08:50:10AM -0800, Thomas Bushnell, BSG wrote: Heh. There's another: miscfiles: /usr/share/misc/unicode.gz The current version is Unicode 3.1.1. According to http://www.unicode.org/Public/UNIDATA/UnicodeData.html there's a version 3.2. Hmm, is this file Free? There's a license on that same page: Limitations on Rights to Redistribute This Data Recipient is granted the right to make copies in any form for internal distribution and to freely use the information supplied in the creation of products supporting the Unicode^TM Standard. The files in the Unicode Character Database can be redistributed to third parties or other organizations (whether for profit or not) as long as this notice and the disclaimer notice are retained. Information can be extracted from these files and used in documentation or programs, as long as there is an accompanying notice indicating the source. Richard Braakman
Re: location of UnicodeData.txt
On Wed, Nov 27, 2002 at 07:59:00PM +0200, Richard Braakman wrote:
> On Wed, Nov 27, 2002 at 08:50:10AM -0800, Thomas Bushnell, BSG wrote:
> > Heh. There's another:
> >
> > miscfiles: /usr/share/misc/unicode.gz
> >
> > The current version is Unicode 3.1.1.
>
> According to http://www.unicode.org/Public/UNIDATA/UnicodeData.html there's a version 3.2.
>
> Hmm, is this file Free? There's a license on that same page:

This is a question for -legal, FYI.

> Limitations on Rights to Redistribute This Data
>
> Recipient is granted the right to make copies in any form for internal distribution and to freely use the information supplied in the creation of products supporting the Unicode^TM Standard. The files in the Unicode Character Database can be redistributed to third parties or other organizations (whether for profit or not) as long as this notice and the disclaimer notice are retained. Information can be extracted from these files and used in documentation or programs, as long as there is an accompanying notice indicating the source.

I see no problem with this license as far as it goes, but it doesn't go far enough. There is no permission granted to make modifications (and distribute modified versions). (DFSG 3)

-- 
G. Branden Robinson                | Convictions are more dangerous
Debian GNU/Linux                   | enemies of truth than lies.
[EMAIL PROTECTED]                  | -- Friedrich Nietzsche
http://people.debian.org/~branden/ |
Re: location of UnicodeData.txt
On Wed, Nov 27, 2002 at 03:54:35PM -0500, Branden Robinson wrote: On Wed, Nov 27, 2002 at 07:59:00PM +0200, Richard Braakman wrote: On Wed, Nov 27, 2002 at 08:50:10AM -0800, Thomas Bushnell, BSG wrote: Heh. There's another: miscfiles: /usr/share/misc/unicode.gz The current version is Unicode 3.1.1. According to http://www.unicode.org/Public/UNIDATA/UnicodeData.html there's a version 3.2. Hmm, is this file Free? There's a license on that same page: This is a question for -legal, FYI. Limitations on Rights to Redistribute This Data Recipient is granted the right to make copies in any form for internal distribution and to freely use the information supplied in the creation of products supporting the Unicode^TM Standard. The files in the Unicode Character Database can be redistributed to third parties or other organizations (whether for profit or not) as long as this notice and the disclaimer notice are retained. Information can be extracted from these files and used in documentation or programs, as long as there is an accompanying notice indicating the source. I see no problem with this license as far as it goes, but it doesn't go far enough. There is no permission granted to make modifications (and distribute modified versions). (DFSG 3) So, according to Branden, international standards are supposed to allow debian the right to modify them and to distribute the modified versions. Absent said permission, which is hardly ever going to be given, they must be considered non-free. (This is, of course, logically forthright.) Moreover, according to the non-free removal proponents, we should not even distribute the un-modified copies of these files. Yet, unicode is supposed to be the canonical character encoding scheme for debian. Does this mean every unicode text editor belongs in contrib (depends on something non-free)? What an interesting anecdote! Jim Penny
Re: location of UnicodeData.txt
On Wed, Nov 27, 2002 at 04:23:51PM -0500, Jim Penny wrote: I see no problem with this license as far as it goes, but it doesn't go far enough. There is no permission granted to make modifications (and distribute modified versions). (DFSG 3) So, according to Branden, international standards are supposed to allow debian the right to modify them and to distribute the modified versions. No, international standards can say whatever they want, and bear whatever license the standards organization wants, within the law. Debian has its Free Software Guidelines and we do not, in theory, apply them differently based on who the licensor is. Incidentally, allowing Debian the right to modify them and to distribute the modified versions would also be insufficient; perhaps you haven't read the DFSG lately. 8. License Must Not Be Specific to Debian The rights attached to the program must not depend on the program's being part of a Debian system. If the program is extracted from Debian and used or distributed without Debian but otherwise within the terms of the program's license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the Debian system. Absent said permission, which is hardly ever going to be given, they must be considered non-free. (This is, of course, logically forthright.) Well, yes. That is what the words of the DFSG mean. Moreover, according to the non-free removal proponents, we should not even distribute the un-modified copies of these files. I cannot speak for all proponents of the proposed GR, but yes, that's my understanding. Yet, unicode is supposed to be the canonical character encoding scheme for debian. I don't see that in the current version of the Policy manual, but it wouldn't surprise me if we were to standardize on Unicode, since it seems to be the best-of-breed in the character set department. 
> Does this mean every unicode text editor belongs in contrib (depends on something non-free)?

Many (perhaps all) RFCs are non-free as well; does that mean that compliant implementations must go into contrib or non-free?

> What an interesting anecdote!

I do not grasp what place emotionalism has in a simple, coolheaded discussion of licensing. If you are upset with the ramifications of the DFSG, you can always propose a General Resolution to amend its terms, or repeal it entirely, perhaps in favor of something more pragmatic.

Incidentally, is there a reason you did not respect the Mail-Followup-To header?

-- 
G. Branden Robinson                | I have a truly elegant proof of the
Debian GNU/Linux                   | above, but it is too long to fit
[EMAIL PROTECTED]                  | into this .signature file.
http://people.debian.org/~branden/ |
Re: location of UnicodeData.txt
On Wed, Nov 27, 2002 at 04:23:51PM -0500, Jim Penny wrote:
> So, according to Branden, international standards are supposed to allow debian the right to modify them and to distribute the modified versions. Absent said permission, which is hardly ever going to be given, they must be considered non-free. (This is, of course, logically forthright.) Moreover, according to the non-free removal proponents, we should not even distribute the un-modified copies of these files.

I raised this issue not long ago, on this mailing list. See the thread starting from Message-ID [EMAIL PROTECTED]. (Sorry, I don't know how to convert this to a URL on lists.debian.org.)

Just out of curiosity, are documents like the DFSG distributed with Debian? If so, are you allowed to modify them? (I assume documents like the GPL, being licenses, are exempt from this requirement?)

Also, I note that /usr/doc/debian-policy/copyright (woody) has a copyright for FSSTND, and it says "No portion of this document may be redistributed in any modified or abridged form without the prior approval of the FSSTND coordinator." Does this mean that the FSSTND should never have been distributed with Debian? Obviously it must have been at one stage, or it wouldn't be in the copyright file.

-- 
Brian May [EMAIL PROTECTED]
Re: location of UnicodeData.txt
On Wed, Nov 27, 2002 at 04:53:00PM -0500, Branden Robinson wrote: On Wed, Nov 27, 2002 at 04:23:51PM -0500, Jim Penny wrote: I see no problem with this license as far as it goes, but it doesn't go far enough. There is no permission granted to make modifications (and distribute modified versions). (DFSG 3) So, according to Branden, international standards are supposed to allow debian the right to modify them and to distribute the modified versions. Moreover, according to the non-free removal proponents, we should not even distribute the un-modified copies of these files. I cannot speak for all proponents of the proposed GR, but yes, that's my understanding. Yet, unicode is supposed to be the canonical character encoding scheme for debian. I don't see that in the current version of the Policy manual, but it wouldn't surprise me if we were to standardize on Unicode, since it seems to be the best-of-breed in the character set department. Does this mean every unicode text editor belongs in contrib (depends on something non-free)? Many (perhaps all) RFCs are non-free as well; does that mean that compliant implementations must go into contrib or non-free? I notice that you did not answer. As far as I can tell, given the current definition, the logically coherent answer is yes. There is some wiggle room in this. See below. What an interesting anecdote! I do not grasp what place emotionalism has in a simple, coolheaded discussion of licensing. If you are upset with the ramifications of the DFSG, you can always propose a General Resolution to amend its terms, or repeal it entirely, perhaps in favor of something more pragmatic. Anecdote. A particular fact of an interesting nature. Incidentally, is there a reason you did not respect the Mail-Followup-To header? Yup, the anecdote had nothing to do with legal. Had a lot to do with the ramifications of the more radical interpretations of the DFSG and the consequences of these interpretations. 
It was interesting to see you argue that a license was non-free. To be consistent with the GR, you should have been observing that it could not be a part of debian.

If there is a point in this, it is that the status quo ante allows some wiggle room. In particular, section 5 of the social contract grants this. If you remove section 5, and reduce debian to only things that have a DFSG-free license, the resulting axiomatic system can be used in interesting ways. In particular, recursive application of the axioms is very interesting.

Can an artifact that claims to be compliant with a non-DFSG-free standard itself be considered to be free? That is, does it depend on the standard for its execution? Compare and contrast this with an installer of a non-free package.

Note: the typical installer can, in fact, install an infinite number of items -- after all, most installers are not strongly version dependent!

Jim Penny

Note: there is an intentional ambiguity of the word debian above, which will drive some fundamentalists crazy. My definition of debian: the totality of software, documents, and other artifacts produced by debian developers and contributed to the debian archives.
Re: location of UnicodeData.txt
Does this mean every unicode text editor belongs in contrib (depends on something non-free)? Many (perhaps all) RFCs are non-free as well; does that mean that compliant implementations must go into contrib or non-free? The problem is, every character in Unicode, all 70,000 of them, has a distinct set of properties. UnicodeData.txt is basically a listing of those properties. If it is a copyrightable work, I see no way for a text processing program to conform to Unicode without using a derivative of that copyrighted work. Likewise, I'd bet that file or some derivative of it is embedded in both Perl and Python - you can't reasonably handle Unicode characters without it. We could always pony up the $12,000 (or $1200 for an associate membership) and become a member of Unicode and complain about this from the inside.
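To make the point about embedded property tables concrete: Python's standard unicodedata module is documented as providing access to the Unicode Character Database, so the per-character properties listed in UnicodeData.txt can be queried without the text file being installed at all. The sketch below only shows what such a lookup looks like; the helper name describe is made up for illustration, and nothing here bears on whether the built-in tables count as a derivative work.

```python
import unicodedata


def describe(ch):
    """Return a few per-character properties for a single character."""
    return {
        # Character name; the second argument is the fallback for unnamed
        # code points such as control characters.
        "name": unicodedata.name(ch, "<unnamed>"),
        # Two-letter general category, e.g. "Lu" for an uppercase letter.
        "category": unicodedata.category(ch),
        # Canonical combining class (0 for non-combining characters).
        "combining": unicodedata.combining(ch),
        # Bidirectional class, e.g. "L" for left-to-right.
        "bidirectional": unicodedata.bidirectional(ch),
    }


if __name__ == "__main__":
    print(describe("A"))
```

Each value corresponds to one column of the table, which is why a conforming text-processing program can hardly avoid carrying the same data in some form.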
Re: location of UnicodeData.txt
[Jim trimmed from CC; I'm not sure why his address was in your M-F-T.]

On Thu, Nov 28, 2002 at 09:43:33AM +1100, Brian May wrote:
> Just out of curiosity, are documents like the DFSG distributed with Debian?

Well, certainly some documents like the DFSG might be distributed as part of the Debian system. As regards the DFSG specifically:

doc-debian: /usr/share/doc/debian/social-contract.txt

Package: doc-debian
Status: install ok installed
Priority: standard
Section: doc
Installed-Size: 844
Maintainer: Josip Rodin [EMAIL PROTECTED]
Version: 3.0.1
Suggests: www-browser, postscript-viewer
Description: Debian Project documentation, Debian FAQ and other documents
 The Debian Project is an association of individuals who have made common cause to create a free operating system.
 .
 In this package, you will find:
  * Debian Linux Manifesto,
  * Constitution for the Debian Project,
  * Debian GNU/Linux Social Contract,
  * Debian Free Software Guidelines.
 .
 Additionally provided are:
  * Debian GNU/Linux Frequently Asked Questions (FAQ),
  * Debian Bug Tracking System documentation, and
  * Introduction to the Debian mailing lists.
 .
 All of these files are available at ftp://ftp.debian.org/debian/doc/ and mirrors thereof.

> If so, are you allowed to modify them?

The answer is either unknown or yes.

http://www.debian.org/social_contract and scroll to the bottom. See the copyright notice?

http://www.debian.org/license

> (I assume documents like the GPL, being licenses, are exempt from this requirement?)

The DFSG doesn't make any exceptions for license texts used as such (i.e., when they are applied to a work that is being distributed under its terms), but earlier this year I proposed making such an exception explicit in my proposed interpretive guidelines. However, my proposal was never formally adopted by anyone, so as far as I know it only reflects my thinking on issues. So the official answer to your question is probably unknown.
> Also, I note that /usr/doc/debian-policy/copyright (woody) has a copyright for FSSTND, and it says "No portion of this document may be redistributed in any modified or abridged form without the prior approval of the FSSTND coordinator." Does this mean that the FSSTND should never have been distributed with Debian? Obviously it must have been at one stage, or it wouldn't be in the copyright file.

Under our current standards, probably not. We used to be considerably less careful about licensing than we are now, and furthermore when we adopted the DFSG we never did an audit of main to identify everything within it that might not meet its terms. The Debian Policy Manual predates the Social Contract and DFSG, if I'm not mistaken, so it's possible this language was simply never reviewed in the context of the DFSG.

-- 
G. Branden Robinson                | Never attribute to malice that
Debian GNU/Linux                   | which can be adequately explained
[EMAIL PROTECTED]                  | by stupidity.
http://people.debian.org/~branden/ |
Re: location of UnicodeData.txt
On Wed, Nov 27, 2002 at 05:00:55PM -0600, [EMAIL PROTECTED] wrote:
> The problem is, every character in Unicode, all 70,000 of them, has a distinct set of properties. UnicodeData.txt is basically a listing of those properties. If it is a copyrightable work,

That's a big if, and the answer may be different in Europe and the United States, as we found out on debian-legal recently when discussing an unrelated work (the aspell-en package).

I'm not really sure what we're supposed to do about disparities in intellectual property, or if those disparities impact the Unicode property list.

-- 
G. Branden Robinson                | You don't just decide to break
Debian GNU/Linux                   | Kubrick's code of silence and then
[EMAIL PROTECTED]                  | get drawn away from it to a
http://people.debian.org/~branden/ | discussion about cough medicine.