At 6:21 PM -0500 9/7/99, Tom Tromey wrote:
>I wasn't planning to do it at all.  When using Utf-8, you can simply
>use the ordinary strcmp, strncmp, etc.  unicode_strlen is special as
>it returns the number of characters (not bytes) in the string.
..
>Henry Spencer's latest regexp package will deal with Utf-8.  This is
>what Tcl uses.

In other words, you're suggesting that the internal representation be
UTF-8, with everything else converted to it? Interesting.
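
To make the quoted point concrete, here is a minimal sketch in C of why the
ordinary byte-oriented functions keep working on UTF-8 while a character
count needs special handling. The names utf8_count_chars and utf8_compare
are hypothetical, made up for this illustration; this is not the
unicode_strlen Tom mentions, and it assumes valid, NUL-terminated UTF-8
input.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration, not the library's
   unicode_strlen): counts code points in a valid, NUL-terminated UTF-8
   string by skipping continuation bytes, which all look like 10xxxxxx. */
size_t utf8_count_chars(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;            /* lead byte or ASCII byte: a new character */
    }
    return n;
}

/* Hypothetical wrapper: plain strcmp, reduced to -1, 0, or 1.
   strcmp compares bytes as unsigned char, and UTF-8 is laid out so that
   byte order matches code point order, so no decoding is needed. */
int utf8_compare(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

For the string "caf\xC3\xA9" (the word "café" in UTF-8), strlen reports 5
bytes while utf8_count_chars reports 4 characters, and utf8_compare orders
"z" before "\xC3\xA9" just as the code points U+007A and U+00E9 are ordered.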

I haven't looked at any code, but I got the idea that Java stores
everything in Unicode (UCS-2?) internally. Is there any benefit to one
approach over the other?


-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/


