> > What do you mean? Have you seen people using multi-byte encoding
> > in Japan/China/Korea?
>
> You're talking to the wrong person. Japanese data handling is my graduate
> dissertation. :)
>
> The Unified Hangul/Kanji/Ha'nzi' Characters in Unicode (so-called "Unihan")
> occupy one and only one codepoint each. Legacy data sets (EUC and the like)
> can be processed internally by being converted to Unicode on entry to the
> core.

Did it buy you much? I don't believe so. Can you give some examples of why
random character access is so important? Most people process text
linearly.
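
To make the two positions concrete, here is a minimal Java sketch. It
decodes legacy bytes to Unicode on entry, as described in the quoted text,
and then walks the result in a single linear pass without indexing to the
N-th character. The class and method names (LinearScan, decodeOnEntry,
countCodePoints) are made up for illustration, and it assumes the JDK in
use ships the EUC-JP charset.

```java
import java.nio.charset.Charset;

public class LinearScan {
    // Convert legacy-encoded bytes to a Java String at the boundary.
    // Java Strings are UTF-16 internally.
    static String decodeOnEntry(byte[] raw, Charset legacy) {
        return new String(raw, legacy);
    }

    // Count characters (Unicode code points) in one forward pass.
    // A code point outside the BMP takes two UTF-16 units, so we
    // advance by Character.charCount instead of always by one.
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            count++;
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumption: EUC-JP is available; it is an extended charset
        // and may be missing from a stripped-down runtime.
        if (!Charset.isSupported("EUC-JP")) {
            System.out.println("EUC-JP not available in this runtime");
            return;
        }
        Charset euc = Charset.forName("EUC-JP");
        // Simulate legacy input by encoding two kanji as EUC-JP bytes.
        byte[] legacy = "\u65e5\u672c".getBytes(euc);
        String text = decodeOnEntry(legacy, euc);
        System.out.println("legacy bytes: " + legacy.length);
        System.out.println("code points:  " + countCodePoints(text));
    }
}
```

Note that even after conversion, String.charAt(i) indexes UTF-16 units
rather than characters, so the linear walk is the safe way to count or
visit characters once text can contain code points outside the BMP.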

I have been working with Java for many years. I have found that Unicode is
the best excuse people use for i18n and l10n. English-speaking developers,
including me, want to keep our simple-minded view of English text processing,
so we don't have to do the real hard work.

Hong
