In a message dated 2002-01-02 5:05:23 Pacific Standard Time, 
[EMAIL PROTECTED] writes:

> There are worse things than this: what if someone discovers a script with
> more than 1,114,111 characters? Back to the drawing board to redesign all
> the UTF's!

Not all of them.  UTF-8 and UTF-32, at least, already have the architecture 
to represent 2^31 and 2^32 code points, respectively.  The definitions would 
simply have to be changed to make the additional code points legal.
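
To make that headroom concrete, here is a minimal Python sketch of the
original multi-byte UTF-8 layout (sequences of up to six bytes, lead bytes
up to 0xFC), next to a fixed-width 32-bit encoding.  The names
encode_utf8_extended and encode_utf32_be are made up for illustration; they
are not a standard API, and the sketch does no legality checking (it will
happily encode surrogates, for instance).

```python
def encode_utf8_extended(cp: int) -> bytes:
    """Encode cp with the original UTF-8 bit layout (1 to 6 bytes, 31 bits)."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("value does not fit in 31 bits")
    if cp < 0x80:
        return bytes([cp])  # single-byte (ASCII) form
    # (sequence length, exclusive upper limit, lead-byte marker)
    for nbytes, limit, lead in (
        (2, 0x800, 0xC0),
        (3, 0x10000, 0xE0),
        (4, 0x200000, 0xF0),
        (5, 0x4000000, 0xF8),
        (6, 0x80000000, 0xFC),
    ):
        if cp < limit:
            break
    # Peel off six payload bits per continuation byte, lowest bits first.
    tail = []
    for _ in range(nbytes - 1):
        tail.append(0x80 | (cp & 0x3F))
        cp >>= 6
    # Whatever is left goes into the lead byte.
    return bytes([lead | cp] + tail[::-1])


def encode_utf32_be(cp: int) -> bytes:
    """Store cp as one big-endian 32-bit unit (hypothetical helper)."""
    return cp.to_bytes(4, "big")


# The current ceiling and the first value past it both fit in four bytes.
print(encode_utf8_extended(0x10FFFF).hex())
print(encode_utf8_extended(0x110000).hex())
# The 31-bit maximum needs the full six-byte form.
print(encode_utf8_extended(0x7FFFFFFF).hex())
```

Nothing in the bit layout stops at U+10FFFF; the limit comes from the
definitions, not from the encoding machinery.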

Only UTF-16 would truly need to be redesigned, and that has already been 
proposed.  For example, Masahiko Maedera once proposed a "UTF-16x" in which 
code points in the U+EExxx block were designated as "super surrogates."  
Three of these "super surrogates," or six 16-bit words, would be combined to 
represent code points beyond the 17th plane.  (This was back in the days when some 
people felt that a great and crippling schism existed between Unicode and ISO 
10646 because the former disallowed such code points and the latter allowed 
them.)
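
The reason UTF-16 is the odd one out is plain arithmetic: 1,024 high
surrogates times 1,024 low surrogates gives exactly 16 supplementary
planes on top of the BMP, and there is nothing left over.  A small Python
sketch of the standard surrogate-pair calculation shows where the ceiling
comes from.  The function names are illustrative only, and this says
nothing about how the proposed "UTF-16x" would have laid out its bits.

```python
def to_surrogates(cp: int) -> tuple[int, int]:
    """Split a supplementary code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not representable as a surrogate pair")
    offset = cp - 0x10000              # 20 bits remain
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Combine a high and a low surrogate back into a code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The largest possible pair yields the largest code point UTF-16 can reach.
max_cp = from_surrogates(0xDBFF, 0xDFFF)
print(hex(max_cp), max_cp)             # 0x10ffff 1114111
print((max_cp + 1) // 0x10000)         # 17 planes in total
```

Anything past that value has no surrogate pair to map to, which is why
UTF-16 would need new machinery rather than a relaxed definition.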

-Doug Ewell
 Fullerton, California
