RE: Questions about Unicode history

2002-01-31 Thread Marco Cimarosti

Thank you all for all the precious answers that I am receiving publicly and
privately.

I am collecting enough material to write a book about the history of
encoding, rather than just a short article about Unicode!

I think that much of this material is of general interest, so I will post a
SUMMARY of all the answers as soon as I see that the thread has run its course.

*** I assume that I CAN RE-POST the PRIVATE ANSWERS that I received. If any
of the authors wishes me not to republish their messages or parts of them, or
wishes to remain anonymous, please let me know separately. ***

Most of the answers, of course, are contained in Magda Danish's as yet
unpublished summary of Unicode history. Where that is the case, I will simply
refer to the Unicode history on the Unicode web site; everybody will be able to
read it as soon as it is completed and published.

_ Marco




Re: Questions about Unicode history

2002-01-31 Thread Mark Davis

To find out when particular characters were added to Unicode, you can also
consult the new DerivedAge.txt, currently in BETA at:

http://www.unicode.org/Public/BETA/Unicode3.2/DerivedAge-3.2.0d2.txt
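As a rough illustration of how such a file could be consulted, here is a minimal Python sketch. It assumes the layout used by published versions of DerivedAge.txt (a code point or `first..last` range, a semicolon, a version number, then a `#` comment); the beta file linked above may differ in detail. The `SAMPLE` text and the helper names `parse_age_ranges` and `age_of` are hypothetical, made up for this example.

```python
import re

# Illustrative lines in the assumed DerivedAge.txt layout (not the real file).
SAMPLE = """\
# code point(s) ; version in which they were first assigned
0041..005A    ; 1.1 # Latin capital letters
20AC          ; 2.1 # Euro sign
0700..070D    ; 3.0 # first Syriac characters
"""

# One code point or a first..last range, then ';', then a version number.
LINE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*([0-9.]+)")


def parse_age_ranges(text):
    """Return a list of (first, last, version) tuples parsed from text."""
    ranges = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = LINE.match(line)
        if match is None:
            continue  # skip anything that is not in the assumed layout
        first = int(match.group(1), 16)
        last = int(match.group(2) or match.group(1), 16)
        ranges.append((first, last, match.group(3)))
    return ranges


def age_of(ch, ranges):
    """Return the version string for character ch, or None if not listed."""
    cp = ord(ch)
    for first, last, version in ranges:
        if first <= cp <= last:
            return version
    return None


if __name__ == "__main__":
    ranges = parse_age_ranges(SAMPLE)
    for ch in ("A", "\u20ac", "\u0700"):
        print(f"U+{ord(ch):04X} first appeared in Unicode {age_of(ch, ranges)}")
```

With the real file, the text read from disk would be passed to `parse_age_ranges` in place of `SAMPLE`.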

Mark
—

Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο 
πάντα — Ὁμήρου Μαργίτῃ
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]

http://www.macchiato.com


RE: Questions about Unicode history

2002-01-30 Thread Magda Danish (Unicode)

Hi Marco,

I am currently working on a few web pages that talk about the Unicode
history. They are not publicly accessible yet but I'm sure they hold the
answers to most of your questions. I will email you the temporary url in
a separate email.

Regards,
Magda.

-Original Message-
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]] 
Sent: Wednesday, January 30, 2002 9:29 AM
To: [EMAIL PROTECTED]
Subject: Questions about Unicode history


Hallo.

I am writing a short article about Unicode, and I realized that I don't
know or I am not sure of many Unicode-related facts and dates that I
would like to mention.

I apologize, as this is a huge list of questions (and I hope that they
are not all in the FAQ). Anyway, if anybody is in the mood for trivia, I
thank you in advance:


- When did the Unicode project start, and who started it?

- Is it true that Han Unification was the core of Unicode, and that the idea
of a universal encoding came afterwards?

- Who invented the name Unicode, and when?

- When did the ISO 10646 project start?

- When did Unicode and ISO 10646 merge?

- What is the name of the GB and JIS standards that have the same
repertoire as Unicode?

- When did Unicode stop being 16 bits? (I.e., when were surrogates
added?)

- I can't remember the versions in which some scripts were added: Syriac,
Thaana, Sinhala, Tibetan, Myanmar, Ethiopic, Cherokee, Canadian
Syllabics, Ogham, Runes, Khmer, Mongolian, Yi, Etruscan, Gothic,
Deseret, CJK ext. A, CJK ext. B.

- Roughly, how many ideographs are in modern use in extensions A and B?

- Roughly, when will version 3.2 become official?

- Roughly, when will the version 4 book be published?


I also have a few non-Unicode questions:


- When was ASCII first published and by whom?

- What standard was current before ASCII? (BAUDOT, is it?) How many bits
did it use?

- Did the ASCII standard expire, and when?

- When was ISO 646 published?

- I think that ISO 646 expired. When?

- When was ISO 8859 published?

- When did the first double-byte encoding appear?

- Are OpenType fonts currently implemented in any platform other than
Windows?


Thanks again, in advance.

_ Marco





Re: Questions about Unicode history

2002-01-30 Thread John H. Jenkins


On Wednesday, January 30, 2002, at 12:29 PM, Marco Cimarosti wrote:


 - Are OpenType fonts currently implemented in any platform other than
 Windows?



OpenType fonts work without modification on Mac OS X, in that the glyphs
can be displayed.  Any Mac application can access the OT data in the font,
parse it, and process it appropriately using public functions.  The one
piece still missing is automatic support for OT layout data in the system.

==
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/





Re: Questions about Unicode history

2002-01-30 Thread Eric Muller

Marco Cimarosti wrote:

 - Are OpenType fonts currently implemented in any platform other than
 Windows?

FreeType implements OpenType, including layout. By construction, FreeType only
requires an ANSI C implementation, and was written with embedded systems in
mind. Thus, the answer to your question could be "all".

Eric.






Re: Questions about Unicode history

2002-01-30 Thread John Hudson

At 09:29 1/30/2002, Marco Cimarosti wrote:

- Are OpenType fonts currently implemented in any platform other than
Windows?

'OpenType support' means a number of different things.

Support for the font file format and rasterisation of the TT or CFF 
outlines is widespread, including Windows, OSX (native), earlier Mac 
systems (CFF only, using ATM), and implementations of FreeType.

Support for individual OpenType Layout typographic features varies from 
application to application.

Script shaping features and character-level pre-formatting, e.g. for Indic
scripts, are supported in Windows apps that use Uniscribe for text
processing, and I believe the FreeType developers have also been working on
Indic shaping, although I am not sure whether this has been released yet.

John Hudson

Tiro Typeworks  www.tiro.com
Vancouver, BC   [EMAIL PROTECTED]

... es ist ein unwiederbringliches Bild der Vergangenheit,
das mit jeder Gegenwart zu verschwinden droht, die sich
nicht in ihm gemeint erkannte.

... every image of the past that is not recognized by the
present as one of its own concerns threatens to disappear
irretrievably.
   Walter Benjamin





Re: Questions about Unicode history

2002-01-30 Thread Kenneth Whistler

Marco,

I'll answer as many of your questions as I can, and will
cc this to the unicode list (in part to forestall a gazillion
"Well, I think maybe X" responses).

--Ken

 - When did the Unicode project start, and who started it?

The detailed history for this will soon be available on the
Unicode website. The short answer is that Joe Becker (Xerox) and
Lee Collins (Apple) were highly instrumental in getting the
ball rolling on this, and the preliminary work they did,
primarily on Han unification, dated from 1987.

However, the Unicode project had many beginnings -- many points
where you could mark a milestone in its early development. And
the Unicode Consortium celebrated a number of 10-year
anniversaries, starting from 1998 and continuing through last year.

 
 - Is it true that Han Unification was the core of Unicode, and that the idea
 of a universal encoding came afterwards?

The effort by Xerox and Apple to do a Han unification was key to
the motivation that eventually led to a serious effort to actually
*do* Unicode and then to establish the Unicode Consortium to
standardize and promote it. However, the idea of a universal encoding
predated that considerably. In some respects the Xerox Character Code
Standard (XCCS) was a serious attempt at providing a universal
character encoding (although it did not include a unified Han
encoding, but only Japanese kanji). XCCS 2.0 (1980) contained, in
addition to Japanese kanji: Latin (with IPA), Hiragana, Bopomofo, Katakana,
Greek, Cyrillic, Runic, Gothic, Arabic, Hebrew, Georgian, Armenian,
Devanagari, Hangul jamo, and a wide variety of symbols. The early
Unicoders mined XCCS 2.0 heavily for the early drafts of Unicode 1.0,
and always regarded it as the prototype for a universal encoding.

Additionally, you have to consider that the beginning of the ISO project 
for a Multi-octet Universal Character Set (10646) predated the
formal establishment of Unicode. Part of the impetus for the serious
work to standardize Unicode was, of course, discontent with the
then-current architecture of the early drafts of 10646.

 
 - Who invented the name Unicode, and when?

This one has a definitive answer: Joe Becker coined the term,
for "unique, universal, and uniform character encoding", in 1987.
The first documented use is in December 1987.

 
 - When did the ISO 10646 project start?

Unfortunately, the document register for early WG2 documents doesn't
have dates for all the early documents, and I don't have all the
early documents to check. But...

The 4th meeting of WG2 was held in London in February, 1986. The
first three meetings were in Geneva, Turin, and London, respectively.
That puts the likely timeframe for the Geneva meeting, and the
establishment of WG2 by SC2 at about 1984. The *only* project for WG2
was 10646.

Some of the older oldtimers on the list may have more exact information
about the early WG2 work.

 
 - When did Unicode and ISO 10646 merge?

There isn't a single date that can be pointed to, like the signing
of an armistice. In some respects, Unicode and ISO 10646 are *still*
merging, as modifications and amendments to deal with niggling little
architectural edge cases are worked out.

However the key dates were:

January 3, 1991. Incorporation of the Unicode Consortium, which
   signalled to SC2 that the Unicoders were serious in their
   intentions.

May, 1991. Meeting #19 of WG2 in San Francisco. An ad hoc meeting
   took place between WG2 members and some Unicoders, which paved
   the way for the later merger of the standards.

June, 1991. The 10646 DIS 1 was defeated in its balloting. This left
   an architectural compromise with the Unicode Standard, which at
   that point was in copy edit and about to go to press, as the only
   reasonable way forward.

June 3, 1991. The date of the 10646M proposal draft to merge Unicode and
   10646, by Ed Hart. This was a key document in the resulting
   merger of features.

August, 1991. The Geneva WG2 meeting accepted Han unification and combining
   marks, dropped byte-by-byte restrictions on code values for UCS-2,
   and accepted Unicode repertoire additions. From that point forward,
   the overall aspect of what became ISO/IEC 10646-1:1993 was clear.

 
 - What is the name of the GB and JIS standards that have the same repertoire
 as Unicode?

GB 13000 has the same repertoire as ISO/IEC 10646-1:1993.
JIS X 0221 has the same repertoire as ISO/IEC 10646-1:1993.

Those two were effectively national publications of 10646. You can
work out the correlations with Unicode from that.

GB 18030:2000 in principle has the same repertoire (but different
encoding) as ISO/IEC 10646-1:2000, i.e. the same as Unicode 3.0.
(But there were small problems in it.) However, the 4-byte form
of GB 18030 maps all Unicode code points, assigned or not, so
it will (in theory, at least) always have the same repertoire
as Unicode.
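The point about the 4-byte form can be checked with a short Python sketch. It assumes Python's built-in `gb18030` codec, whose mapping tables may differ in detail from the 2000 edition discussed here; the sample characters and the helper names `gb18030_length` and `round_trips` are made up for illustration.

```python
# Sketch: GB 18030 represents a Unicode code point in 1, 2, or 4 bytes.
# The 4-byte form is what lets it reach code points outside the older
# GB repertoires, including the supplementary planes.

SAMPLES = [
    "A",           # ASCII, single byte
    "\u4e2d",      # a common Han character, two bytes
    "\u0080",      # a C1 control, covered only by the four-byte form
    "\U00010000",  # first supplementary-plane code point
    "\U0010ffff",  # last Unicode code point
]


def gb18030_length(ch: str) -> int:
    """Number of bytes the gb18030 codec uses for one character."""
    return len(ch.encode("gb18030"))


def round_trips(ch: str) -> bool:
    """True if encoding and then decoding gives back the same character."""
    return ch.encode("gb18030").decode("gb18030") == ch


if __name__ == "__main__":
    for ch in SAMPLES:
        print(
            f"U+{ord(ch):04X}: {gb18030_length(ch)} byte(s), "
            f"round trip ok: {round_trips(ch)}"
        )
```

Surrogate code points (U+D800 to U+DFFF) are not characters, and the codec is not expected to encode them on their own.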

 
 - When did Unicode stop being 16 bits? (I.e., when were surrogates added?)

In terms of publication, with Unicode 2.0 in 1996. However, 

Re: Questions about Unicode history

2002-01-30 Thread Otto Stolz

Marco,

some of your questions are probably answered in Roman Czyborra's
WWW pages, particularly in
- http://czyborra.com/unicode/standard.html,
- http://czyborra.com/charsets/iso646.html,
- http://czyborra.com/charsets/iso8859.html,
- http://czyborra.com/charsets/cjk.html,
- http://czyborra.com/charsets/codepages.html.

 - When did Unicode and ISO 10646 merge?


The merger was initiated by an informal meeting of Unicode and WG2
members during the JTC1/SC2/WG2 meeting in San Francisco,
California, USA, in May 1991. At that time, ISO DIS 10646 (the 1st one)
was still in ballot, so no formal discussion, let alone an agreement,
was allowed by JTC1's rules.

By mid-July, DIS 10646 was formally voted down (P-members: 8 YES,
11 NO, 2 abstained; O-members: 1 YES, 3 NO, 0 abstained). 9 out
of 14 NO votes mentioned the merger (only one universal code)
in their national comments.

The merger and the basic architecture were agreed on at the
ISO/IEC JTC1/SC2/WG2 meeting in Geneva, Switzerland, August 19th
through 23rd, 1991.

In October 1991, the ISO SC2 plenary (in Rennes, France) unanimously
authorized WG2 to issue a new DIS 10646 in January 1992 for a
4-month (i.e. shortened) vote.

Best wishes,
   Otto Stolz





RE: Questions about Unicode history

2002-01-30 Thread Alistair Vining

Otto Stolz wrote:

 some of your questions are probably answered in Roman Czyborra's
 WWW pages, particularly in

 [czyborra.com addresses snipped]

I just found:
http://www.cwi.nl/~dik/english/codes/stand.html
whose author (Dik Winter) notes that he 'stop[s] approximately where Roman Czyborra
starts'.  Thai EBCDIC, JISCII, 6-bit ISO codes, ASCII-1963 etc.  Looks very thorough
to me, but I wasn't there...

Al.