Re: string encoding

Dan Sugalski Fri, 16 Feb 2001 12:35:20 -0800
At 12:32 PM 2/16/2001 -0800, Hong Zhang wrote:
> > > What do you mean? Have you seen people using multi-byte encoding
> > > in Japan/China/Korea?
> >
> > You're talking to the wrong person. Japanese data handling is my graduate
> > dissertation. :)
> >
> > The Unified Hangul/Kanji/Ha'nzi' Characters in Unicode (so-called "Unihan")
> > occupy one and only one codepoint each. Legacy data sets (EUC and the like)
> > can be processed internally by being converted to Unicode on entry to the
> > core.
>
>Did it buy you much? I don't believe so. Can you give some examples why
>random character access is so important? Most people are processing text
>linearly.

Most, but not all. And as this is the internals list, we have to deal with 
all. We can't choose a convenient subset and ignore the rest. (No matter 
how much I might like to...)

>I have been working with Java for many years. I found that Unicode is the
>best excuse people are using for i18n and l10n. English speaking developers,
>including me, want to keep their simple mind of english text process, so we
>don't have to the real hard work.

Okay, this paragraph made no sense to me, but it feels like it's saying 
something that's important. Could you try again?

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk