>>>>> On Mon, 11 Nov 2002 23:37:12 -0800, Daisuke Maki <[EMAIL PROTECTED]> said:

 > utf82euc( $xml->findvalue( 'foobar' ) );

 > where utf82euc() is a convenience function that I wrote which does:

 >    my $octets = decode( 'utf8', $text );

Decode doesn't return octets.

 >    return encode( 'euc-jp', $octets );

Encode doesn't take octets as second parameter.

 > The problem is that when I call decode(), I get the error
 >    "Cannot decode string with wide characters"

This is a typical beginner problem. I had to look up again and again
what decode and encode do. Think from perl's point of view: decode
transforms into perl's internal format, encode transforms from perl's
internal format. Treat the internal format as a black box; forget that
you know it has something to do with UTF-8.

Decode only works if the octets you feed to it are really in the
encoding you specify. Encode only works if the string is really in
perl's internal format.

Once that is clear, you know that you need to find out exactly which
format you get from the modules you use. If the manpage doesn't help
you, use Devel::Peek to determine what you have:

  % perl -le '
  use XML::LibXML;
  use Devel::Peek;
  my $parser = XML::LibXML->new();
  my $doc = $parser->parse_string(<<EOT);
  <foobar>foo\x{100}bar</foobar>
  EOT
  my $v = $doc->findvalue("foobar");
  Dump $v;
  '
  SV = PVMG(0x823dae8) at 0x81ab1f8
    REFCNT = 1
    FLAGS = (PADBUSY,PADMY,POK,pPOK,UTF8)
    IV = 0
    NV = 0
    PV = 0x81e4848 "foo\304\200bar"\0 [UTF8 "foo\x{100}bar"]
    CUR = 8
    LEN = 9

You see (note the UTF8 flag), XML::LibXML returns strings that are
already in perl's internal format. So you should not call decode() on
this string at all; it is ready for use.

And rename your variables! The Encode manpage uses the variable names
$string and $octets for a reason :-)

Hope this helps,
--
andreas
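
A minimal sketch of the advice above, assuming the helper only ever receives a Perl character string (as the Dump output suggests for findvalue()). The name utf82euc comes from the question; the body, the sample character, and the hex check are illustrative assumptions, not code from the thread.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Name taken from the question; this body is an assumed fix.
# Input: a Perl character string. Output: EUC-JP octets.
sub utf82euc {
    my ($string) = @_;
    return encode('euc-jp', $string);   # no decode() step needed
}

# Illustrative input: HIRAGANA LETTER A (U+3042) as a character string,
# standing in for what findvalue() would return.
my $string = "\x{3042}";
my $octets = utf82euc($string);

print unpack("H*", $octets), "\n";      # prints a4a2

# decode() is for the opposite direction: octets in, character string out.
my $roundtrip = decode('euc-jp', $octets);
print $roundtrip eq $string ? "roundtrip ok\n" : "roundtrip failed\n";
```

The variable names follow the $string / $octets convention from the Encode manpage, so each call site shows which side of the black box a value is on.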