RE: Questions about Unicode history
Thank you all for the valuable answers that I am receiving, publicly and privately. I am collecting enough material to write a book about the history of encoding, rather than just a short article about Unicode!

I think that much of this material is of general interest, so I will post a summary of all the answers as soon as I see that the thread has expired.

*** I assume that I CAN RE-POST the PRIVATE ANSWERS that I received. If any of the authors wishes me not to republish their messages or parts of them, or wishes to remain anonymous, please let me know separately. ***

Most of the answers, of course, are contained in Magda Danish's as yet unpublished summary of Unicode history. Where that is the case, I will simply refer to the Unicode history on the Unicode web site; everybody will be able to read it as soon as it is completed and published.

Marco
Re: Questions about Unicode history
For when particular characters were added to Unicode, you can also consult the new DerivedAge.txt, currently in beta at:

http://www.unicode.org/Public/BETA/Unicode3.2/DerivedAge-3.2.0d2.txt

Mark

Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο πάντα — Ὁμήρου Μαργίτῃ
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]
http://www.macchiato.com

----- Original Message -----
From: Kenneth Whistler [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Wednesday, January 30, 2002 12:18
Subject: Re: Questions about Unicode history

[Ken's answers snipped]
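
Mark's pointer to DerivedAge.txt lends itself to a small lookup script. The following is a minimal sketch, assuming the file uses the usual Unicode data-file layout of `XXXX..YYYY ; version # comment` (the 3.2 beta file may differ in detail). The function names `parse_derived_age` and `age_of` are made up for illustration, and the sample lines are hand-written in the assumed layout rather than copied from the file.

```python
import re
from bisect import bisect_right

# Assumed line layout: "1680..169C ; 3.0 # comment" or "00AA ; 1.1 # comment".
LINE_RE = re.compile(
    r"^([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*([0-9.]+)"
)

def parse_derived_age(lines):
    """Return a sorted list of (first, last, version) tuples.

    Comment lines, blank lines, and anything else that does not match
    the assumed layout are skipped.
    """
    ranges = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue
        first = int(m.group(1), 16)
        last = int(m.group(2), 16) if m.group(2) else first
        ranges.append((first, last, m.group(3)))
    ranges.sort()
    return ranges

def age_of(ranges, code_point):
    """Return the version string for code_point, or None if not listed."""
    starts = [r[0] for r in ranges]
    i = bisect_right(starts, code_point) - 1
    if i >= 0 and ranges[i][0] <= code_point <= ranges[i][1]:
        return ranges[i][2]
    return None

# Hand-written sample in the assumed layout (illustrative, not the real file).
SAMPLE = [
    "# DerivedAge sample",
    "",
    "0041..005A    ; 1.1 # Latin capital letters",
    "1680..169C    ; 3.0 # Ogham",
    "10330..1034A  ; 3.1 # Gothic",
]

if __name__ == "__main__":
    table = parse_derived_age(SAMPLE)
    for cp in (0x0041, 0x1681, 0x10330, 0x0378):
        print(f"U+{cp:04X}: {age_of(table, cp)}")
```

To run it against the real file, pass an open file object instead of `SAMPLE`; the parser only iterates over lines.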
RE: Questions about Unicode history
Hi Marco,

I am currently working on a few web pages about the history of Unicode. They are not publicly accessible yet, but I'm sure they hold the answers to most of your questions. I will email you the temporary URL in a separate email.

Regards,
Magda.

-----Original Message-----
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 30, 2002 9:29 AM
To: [EMAIL PROTECTED]
Subject: Questions about Unicode history

Hello.

I am writing a short article about Unicode, and I realized that I don't know, or am not sure of, many Unicode-related facts and dates that I would like to mention. I apologize for this huge list of questions (and I hope that they are not all in the FAQ). Anyway, if anybody is in the mood for trivia, I thank you in advance:

- When did the Unicode project start, and who started it?
- Is it true Han Unification was the core of Unicode, and the idea of a universal encoding came afterwards?
- Who and when invented the name Unicode?
- When did the ISO 10646 project start?
- When did Unicode and ISO 10646 merge?
- What is the name of the GB and JIS standards that have the same repertoire as Unicode?
- When did Unicode stop being 16 bits? (I.e., when were surrogates added?)
- I can't remember the version when some scripts were added: Syriac, Thaana, Sinhala, Tibetan, Myanmar, Ethiopic, Cherokee, Canadian Syllabics, Ogham, Runes, Khmer, Mongolian, Yi, Etruscan, Gothic, Deseret, CJK ext. A, CJK ext. B.
- Roughly, how many ideographs are in modern use in extensions A and B?
- Roughly, when will version 3.2 become official?
- Roughly, when will the version 4 book be published?

I also have a few non-Unicode questions:

- When was ASCII first published and by whom?
- What standard was current before ASCII? (BAUDOT, is it?) How many bits did it use?
- Did the ASCII standard expire, and when?
- When was ISO 646 published?
- I think that ISO 646 expired. When?
- When was ISO 8859 published?
- When did the first double-byte encoding appear?
- Are OpenType fonts currently implemented in any platform other than Windows?

Thanks again, in advance.

Marco
Re: Questions about Unicode history
On Wednesday, January 30, 2002, at 12:29 PM, Marco Cimarosti wrote:

- Are OpenType fonts currently implemented in any platform other than Windows?

OpenType fonts work without modification on Mac OS X, in that the glyphs can be displayed. Any Mac application can access the OT data in the font, parse it, and process it appropriately using public functions. The one piece still missing is automatic support for OT layout data in the system.

John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/
Re: Questions about Unicode history
Marco Cimarosti wrote:

- Are OpenType fonts currently implemented in any platform other than Windows?

FreeType implements OpenType, including layout. By construction, FreeType only requires an ANSI C implementation, and was written with embedded systems in mind. Thus, the answer to your question could be "all".

Eric.
Re: Questions about Unicode history
At 09:29 1/30/2002, Marco Cimarosti wrote:

- Are OpenType fonts currently implemented in any platform other than Windows?

'OpenType support' means a number of different things. Support for the font file format and rasterisation of the TT or CFF outlines is widespread, including Windows, OS X (native), earlier Mac systems (CFF only, using ATM), and implementations of FreeType. Support for individual OpenType Layout typographic features varies from application to application. Script shaping features and character-level pre-formatting, e.g. for Indic scripts, are supported in Windows apps that use Uniscribe for text processing, and I believe the FreeType developers have also been working on Indic shaping, although I am not sure if this has been released yet.

John Hudson
Tiro Typeworks www.tiro.com
Vancouver, BC [EMAIL PROTECTED]

... es ist ein unwiederbringliches Bild der Vergangenheit, das mit jeder Gegenwart zu verschwinden droht, die sich nicht in ihm gemeint erkannte.

... every image of the past that is not recognized by the present as one of its own concerns threatens to disappear irretrievably.

Walter Benjamin
Re: Questions about Unicode history
Marco,

I'll answer as many of your questions as I can, and will cc this to the unicode list (in part to forestall a gazillion "Well, I think maybe X" responses).

--Ken

- When did the Unicode project start, and who started it?

The detailed history for this will soon be available on the Unicode website. The short answer is that Joe Becker (Xerox) and Lee Collins (Apple) were highly instrumental in getting the ball rolling on this, and the preliminary work they did, primarily on Han unification, dated from 1987. However, the Unicode project had many beginnings -- many points where you could mark a milestone in its early development. And the Unicode Consortium celebrated a number of 10-year anniversaries, starting from 1998 and continuing through last year.

- Is it true Han Unification was the core of Unicode, and the idea of a universal encoding came afterwards?

The effort by Xerox and Apple to do a Han unification was key to the motivation that eventually led to a serious effort to actually *do* Unicode and then to establish the Unicode Consortium to standardize and promote it.

However, the idea of a universal encoding predated that considerably. In some respects the Xerox Character Code Standard (XCCS) was a serious attempt at providing a universal character encoding (although it did not include a unified Han encoding, but only Japanese kanji). XCCS 2.0 (1980) contained, in addition to Japanese kanji: Latin (with IPA), Hiragana, Bopomofo, Katakana, Greek, Cyrillic, Runic, Gothic, Arabic, Hebrew, Georgian, Armenian, Devanagari, Hangul jamo, and a wide variety of symbols. The early Unicoders mined XCCS 2.0 heavily for the early drafts of Unicode 1.0, and always regarded it as the prototype for a universal encoding.

Additionally, you have to consider that the beginning of the ISO project for a Multi-octet Universal Character Set (10646) predated the formal establishment of Unicode. Part of the impetus for the serious work to standardize Unicode was, of course, discontent with the then-current architecture of the early drafts of 10646.

- Who and when invented the name Unicode?

This one has a definitive answer: Joe Becker coined the term, for unique, universal, and uniform character encoding, in 1987. First documented use is in December, 1987.

- When did the ISO 10646 project start?

Unfortunately, the document register for early WG2 documents doesn't have dates for all the early documents, and I don't have all the early documents to check. But...

The 4th meeting of WG2 was held in London in February, 1986. The first three meetings were in Geneva, Turin, and London, respectively. That puts the likely timeframe for the Geneva meeting, and the establishment of WG2 by SC2, at about 1984. The *only* project for WG2 was 10646.

Some of the older oldtimers on the list may have more exact information about the early WG2 work.

- When did Unicode and ISO 10646 merge?

It wasn't a single date that can be pointed to, like the signing of an armistice. In some respects, Unicode and ISO 10646 are *still* merging, as modifications and amendments to deal with niggling little architectural edge cases are worked out. However, the key dates were:

January 3, 1991. Incorporation of the Unicode Consortium, which signalled to SC2 that the Unicoders were serious in their intentions.

May, 1991. Meeting #19 of WG2 in San Francisco. An ad hoc meeting took place between WG2 members and some Unicoders, which paved the way for the later merger of the standards.

June, 1991. The 10646 DIS 1 was defeated in its balloting. This left an architectural compromise with the Unicode Standard, which at that point was in copy edit and about to go to press, as the only reasonable way forward.

June 3, 1991. The date of the 10646M proposal draft to merge Unicode and 10646, by Ed Hart. This was a key document in the resulting merger of features.

August, 1991. The Geneva WG2 meeting accepted Han unification and combining marks, dropped byte-by-byte restrictions on code values for UCS-2, and accepted Unicode repertoire additions. From that point forward, the overall aspect of what became ISO/IEC 10646-1:1993 was clear.

- What is the name of the GB and JIS standards that have the same repertoire as Unicode?

GB 13000 has the same repertoire as ISO/IEC 10646-1:1993.

JIS X 0221 has the same repertoire as ISO/IEC 10646-1:1993.

Those two were effectively national publications of 10646. You can work out the correlations with Unicode from that.

GB 18030:2000 in principle has the same repertoire (but different encoding) as ISO/IEC 10646-1:2000, i.e. the same as Unicode 3.0. (But there were small problems in it.) However, the 4-byte form of GB 18030 maps all Unicode code points, assigned or not, so it will (in theory, at least) always have the same repertoire as Unicode.

- When did Unicode stop being 16 bits? (I.e., when were surrogates added?)

In terms of publication, with Unicode 2.0 in 1996. However,
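
Two of the encoding points in Ken's message, surrogates and the four-byte form of GB 18030, can be illustrated with a little arithmetic. What follows is a sketch and not taken from the thread: the function names are invented for illustration, the surrogate formula is the standard UTF-16 one, and the GB 18030 part simply relies on Python's built-in `gb18030` codec.

```python
def to_surrogates(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into a
    UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000        # fits in 20 bits
    high = 0xD800 + (offset >> 10)       # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # lower 10 bits
    return high, low

def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

def gb18030_bytes(code_point):
    """Encode one code point with Python's built-in gb18030 codec."""
    return chr(code_point).encode("gb18030")

if __name__ == "__main__":
    # U+20000 is the first code point of CJK Extension B.
    high, low = to_surrogates(0x20000)
    print(f"U+20000 -> {high:04X} {low:04X}")   # D840 DC00
    print(gb18030_bytes(0x20000).hex())
```

The pair computed by hand agrees with what `chr(0x20000).encode("utf-16-be")` produces, which is a convenient cross-check.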
Re: Questions about Unicode history
Marco,

some of your questions are probably answered in Roman Czyborra's WWW pages, particularly in
- http://czyborra.com/unicode/standard.html,
- http://czyborra.com/charsets/iso646.html,
- http://czyborra.com/charsets/iso8859.html,
- http://czyborra.com/charsets/cjk.html,
- http://czyborra.com/charsets/codepages.html.

- When did Unicode and ISO 10646 merge?

The merger was initiated by an informal meeting of Unicode and WG2 members during the JTC1/SC2/WG2 meeting in San Francisco, California, USA, in May 1991. At that time, ISO DIS 10646 (the first one) was still in ballot, so no formal discussion, let alone an agreement, was allowed by JTC1's rules.

By mid-July, DIS 10646 was formally voted down (P-members: 8 YES, 11 NO, 2 abstained; O-members: 1 YES, 3 NO, 0 abstained). 9 out of 14 NO votes mentioned the merger (only one universal code) in their national comments.

The merger and the basic architecture were agreed on at the ISO/IEC JTC1/SC2/WG2 meeting in Geneva, Switzerland, August 19th through 23rd, 1991.

In October 1991, the ISO SC2 plenary (in Rennes, France) unanimously authorized WG2 to issue a new DIS 10646 in January 1992 for a 4-month (i.e. shortened) vote.

Best wishes,
Otto Stolz
RE: Questions about Unicode history
Otto Stolz wrote:

some of your questions are probably answered in Roman Czyborra's WWW pages, particularly in [czyborra.com addresses snipped]

I just found:

http://www.cwi.nl/~dik/english/codes/stand.html

whose author (Dik Winter) notes that he 'stop[s] approximately where Roman Czyborra starts'. It covers Thai EBCDIC, JISCII, 6-bit ISO codes, ASCII-1963, etc. Looks very thorough to me, but I wasn't there...

Al.