At 10:25 PM -0500 10/27/98, Jeff Breidenbach wrote:
>>process single double character is not enough, we MUST implement a
>>mechanics to detect Chinese Words.
>Are you saying Chinese support would be specific to Chinese and
>not automatically support Korean, Japanese, and other two byte (or
>Unicode) character sets?
I'm not familiar enough with Chinese, or with how Chinese documents are
stored, to know. The original message seems to imply a requirement to
"detect Chinese words." Granted, the changes I mentioned earlier to the
String class and to double-byte character handling would make supporting
other languages much easier.
If supporting Unicode would help with (or make unnecessary) the changes
needed for any specific character set, I think that's the best bet. We
currently have problems with some accented high-byte characters too.
Does anyone know of programs that parse double-byte text files? They need
not handle HTML or SGML, but it would be useful to look at example code.
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/