Rick Measham <[EMAIL PROTECTED]> writes:
>G'day Unicode Gurus and other assorted members of the perl Unicode
>community.
>
>I have a script that attempts to collect translations from Babelfish.
>I've posted it below.
>
>It uses LWP::UserAgent to turn an English phrase into Japanese (or any
>other language supported by BabelFish)*
>
>However, once I get the translation out of the page it appears to be
>full of null bytes. I've tried various things like Unicode::String or
>Encode, but to no avail. 

LWP, I believe, just ships octets about.

But it should have a mechanism to tell you the metadata that
HTTP marked those octets with - in this case there should
be something like a charset parameter on the Content-Type header
that tells _you_ what name to feed to Encode to turn the octets
into Unicode characters.
You then have to decide how you are going to present the resulting
characters in the HTML you are generating. You probably want
to re-encode as UTF-8 if presenting mixed languages.
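
A rough sketch of that round trip, in case it helps. It uses only
Encode, so it runs without a network connection. The helper names
(charset_from_content_type, octets_to_utf8) and the ISO-8859-1
fallback are my own assumptions for illustration, not part of LWP or
of your script. Stray null bytes often mean the octets are really
UTF-16 or similar, so the charset name matters.

```perl
use strict;
use warnings;
use Encode qw(decode encode find_encoding);

# Hypothetical helper: pull the charset parameter out of a
# Content-Type header value such as "text/html; charset=Shift_JIS".
sub charset_from_content_type {
    my ($content_type) = @_;
    return unless defined $content_type;
    return $1 if $content_type =~ /charset\s*=\s*"?([^\s;"]+)"?/i;
    return;
}

# Hypothetical helper: take the header value and the raw octets,
# return the same text as UTF-8 octets.
sub octets_to_utf8 {
    my ($content_type, $octets, $fallback) = @_;

    # Assumption: fall back to ISO-8859-1 when no charset is given.
    my $charset = charset_from_content_type($content_type)
        || $fallback
        || 'ISO-8859-1';

    die "Encode does not know charset '$charset'\n"
        unless find_encoding($charset);

    my $chars = decode($charset, $octets);   # octets -> Perl characters
    return encode('UTF-8', $chars);          # characters -> UTF-8 octets
}
```

With LWP you would feed it something like
octets_to_utf8($response->header('Content-Type'), $response->content).
Newer versions of LWP also offer $response->decoded_content, which
does the decoding step for you; check whether your installed version
has it.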

