Re[2]: [htdig] A language issue.. Could you give me a favor?

Geoff Hutchison Wed, 22 Mar 2000 19:06:06 -0800

At 9:38 AM +0900 3/23/00, Oskar Bartenstein wrote:
>Boils down to 2 questions (sorry I never looked at the source code):
>       - is htdig 8-bit clean?
>       - is htdig words and dictionaries sequences of bytes?
>If both is yes, then I would guess the core is ok,
>and we only have to look at how to use it properly.
>Hope I did not overlook a parsing issue.

It is 8-bit clean, but it treats a character as synonymous with 8 
bits. Many parts of the code (the String class in particular) assume 
that a character is exactly 1 byte and keep going. In many encodings, 
this is *not* the case, and so you're stuck.
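
To make the byte-versus-character mismatch concrete, here is a small 
Python sketch. It is an illustration only, not ht://Dig code (ht://Dig 
is C++), and the sample strings are my own choice.

```python
# Illustration only: shows how byte counts and character counts diverge.
text = "\u65e5\u672c\u8a9e"  # the three characters of "Nihongo" in kanji

utf8 = text.encode("utf-8")      # multi-byte: 3 bytes per character here
eucjp = text.encode("euc_jp")    # multi-byte: 2 bytes per character here
latin1 = "caf\u00e9".encode("latin-1")  # single-byte: 1 byte per character

print(len(text), len(utf8), len(eucjp))  # characters vs. bytes
print(len("caf\u00e9"), len(latin1))     # Latin-1 keeps the counts equal

# Byte-oriented code that cuts at an arbitrary offset can split a
# character in half, leaving a sequence that no longer decodes.
broken = utf8[:4]
try:
    broken.decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False
print(split_ok)
```

Code that treats each byte as one character will miscount, and can 
mis-split, anything that is not a single-byte encoding.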

>A correct HTML page includes info about its encoding, therefore
>htdig on the receiving end can convert it to any code it likes.

Yes, provided that it has code to convert from one encoding into 
another. :-) This is the crux of the problem. Currently ht://Dig 
assumes the host system has working locale support and is getting the 
pages in the default encoding of the system. If they're not, it 
assumes they are anyway. :-) It makes no attempt to convert character 
encodings.
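
For illustration, the missing conversion step might look roughly like 
the Python sketch below. Everything in it is an assumption on my part: 
the function name to_utf8, the choice of UTF-8 as the internal 
encoding, and the Latin-1 fallback are hypothetical, and nothing like 
this exists in ht://Dig today.

```python
import codecs
from typing import Optional


def to_utf8(raw: bytes, declared: Optional[str], fallback: str = "latin-1") -> bytes:
    """Re-encode a page body as UTF-8 using the charset the page declared.

    Hypothetical helper: falls back to `fallback` when the charset is
    missing or unknown to Python's codec registry.
    """
    try:
        codec = codecs.lookup(declared).name if declared else fallback
    except LookupError:
        codec = fallback
    # Undecodable bytes become U+FFFD instead of aborting the whole page.
    text = raw.decode(codec, errors="replace")
    return text.encode("utf-8")


# Usage: a page that declared EUC-JP is normalized before indexing.
page = "\u65e5\u672c\u8a9e".encode("euc_jp")
normalized = to_utf8(page, "EUC-JP")
```

The point is only that the declared charset is useful once there is 
code that acts on it; the indexer would then see one known encoding.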

Basically, if you have a Latin-1 encoding for your character set, 
you're OK. That's the limit of the current i18n.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

