Rick Measham <[EMAIL PROTECTED]> writes:
>G'day Unicode Gurus and other assorted members of the perl Unicode
>community.
>
>I have a script that attempts to collect translations from Babelfish.
>I've posted it below.
>
>It uses LWP::Useragent to turn an English phrase into Japanese (or any
>other language supported by BabelFish)*
>
>However, once I get the translation out of the page it appears to be
>full of null bytes. I've tried various things like Unicode::String or
>Encode, but to no avail.

LWP, I believe, just ships octets about. But it should have a mechanism to tell you the metadata that HTTP attached to those octets. In this case there should be something like a Content-Type header whose charset parameter tells _you_ what name to feed to Encode to turn the octets into Unicode characters.

You then have to decide how you are going to present the resulting characters in the HTML you are generating. You probably want to re-encode as UTF-8 if presenting mixed languages.
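
A minimal sketch of that decode-then-re-encode step, using only the core Encode module. In HTTP the charset label normally travels as the charset parameter of the Content-Type header, so the sketch reads it from there. The helper names charset_from_content_type and octets_to_utf8, the ISO-8859-1 fallback, and the sample header value are all assumptions made for illustration. The octets are built locally to stand in for whatever LWP hands back, and nothing here is taken from the script in question.

```perl
use strict;
use warnings;
use Encode qw(decode encode find_encoding);

# Hypothetical helper: pull the charset parameter out of a Content-Type value.
sub charset_from_content_type {
    my ($content_type) = @_;
    return undef unless defined $content_type;
    return $content_type =~ /;\s*charset\s*=\s*"?([^\s;"]+)"?/i ? $1 : undef;
}

# Hypothetical helper: decode the octets using the advertised charset,
# then re-encode the resulting characters as UTF-8 octets.
sub octets_to_utf8 {
    my ($content_type, $octets, $fallback) = @_;
    # Assumption: fall back to ISO-8859-1 when no charset is advertised.
    my $charset = charset_from_content_type($content_type)
        || $fallback
        || 'ISO-8859-1';
    find_encoding($charset)
        or die "Encode does not know the charset '$charset'\n";
    my $chars = decode($charset, $octets);    # octets -> Perl characters
    return encode('UTF-8', $chars);           # characters -> UTF-8 octets
}

# Stand-ins for the response header and body (no network access here):
# three Japanese characters, encoded as Shift_JIS octets.
my $content_type = 'text/html; charset=Shift_JIS';
my $octets       = encode('shiftjis', "\x{65E5}\x{672C}\x{8A9E}");

my $utf8 = octets_to_utf8($content_type, $octets);
print "$utf8\n";
```

If you emit the result in an HTML page, the page should also declare itself as UTF-8, otherwise the browser is back to guessing. With a real response, the header value would come from something like $response->header('Content-Type'); newer versions of libwww-perl also offer $response->decoded_content, which does the charset lookup and decoding for you.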