At 12:32 PM 2/16/2001 -0800, Hong Zhang wrote:
> > > What do you mean? Have you seen people using multi-byte encoding
> > > in Japan/China/Korea?
> >
> > You're talking to the wrong person. Japanese data handling is my graduate
> > dissertation. :)
> >
> > The Unified Hangul/Kanji/Ha'nzi' Characters in Unicode (so-called
> > "Unihan") occupy one and only one codepoint each. Legacy data sets
> > (EUC and the like) can be processed internally by being converted to
> > Unicode on entry to the core.
>
>Did it buy you much? I don't believe so. Can you give some examples why
>random character access is so important? Most people are processing text
>linearly.
Most, but not all. And as this is the internals list, we have to deal with
all. We can't choose a convenient subset and ignore the rest. (No matter
how much I might like to...)
>I have been working with Java for many years. I found that Unicode is the
>best excuse people are using for i18n and l10n. English speaking developers,
>including me, want to keep their simple mind of english text process, so we
>don't have to the real hard work.
Okay, this paragraph made no sense to me, but it feels like it's saying
something that's important. Could you try again?

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk