Hello,

I have run into a problem, described below. Related thread:

    XML::Parser Encoding (UTF-8 -> ISO-8859-1)
    http://www.netwise.it/xml/perlmonks/?node_id=197119

I'm porting some scripts written for perl 5.005 to perl 5.8. These
scripts use:

    XML::Grove            0.46 alpha
    XML::Parser           2.34
    XML::Parser::PerlSAX  0.07

I upgraded perl to 5.8, but the modules above are unchanged.

The original author of these scripts was aware that multibyte strings
were not handled well, so he decided to "URL-encode" the multibyte
strings, like

    '%82%B5%82%E8%81%5B%82%B8'

let XML::Parser parse them, and then "URL-decode" them into

    x"82B582E8815B82B8"    (8 bytes long)

with this code:

    $str =~ s/%([0-9A-Fa-f][0-9A-Fa-f])/pack('H2', $1)/eg;

But now, after this "URL-decode", unpack('H*', $str) gives

    x"C282C2B5C282C3A8C2815BC282C2B8"

which appears to be utf8. The length function returns 8, but 15 under
"use bytes;".

I tried the following without success:

(1) Inserting "no encoding;" into the main script, after all the
    "use xxxx" lines.

(2) Calling the following sub after the "URL-decode":

    sub de_utf8 {
        use bytes;
        return "$_[0]";
    }

(3) $parser->parse( Source => { SystemId => $file_name,
                                ProtocolEncoding => "iso-8859-1" });

(4) $parser->parse( Source => { SystemId => $file_name,
                                Encoding => "iso-8859-1" });

(3) and (4) were tested with '<?xml version="1.0" encoding="iso-8859-1"?>'
as the XML declaration.

Finally, the following did work:

    use utf8;
    my $latin = pack("C*", unpack('U*', $utf));

My tests may not be sufficient, and I may be missing something.

I can now strip this (presumably) utf8 flag after the "URL-decode", but
the data becomes utf8 again just before it reaches the DB2 insert
modules. The data is a set of complicated hash references. The
"URL-decode" code lives in a single module, so I can fix it in one
place, but there is not just one DB2 insert module; there are three or
four.

It has taken me a week to get this far, and I can't figure out the best
solution. I would like to know how the encoding mechanism of
XML::Parser works, how to find out the current encoding status of a
string, and how to suppress utf8 effectively.

Regards,
Hirosi Taguti

_______________________________________________
Perl-Win32-Users mailing list
Perl-Win32-Users@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
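
For reference, the symptom described above can be reproduced with a minimal sketch. It assumes that the text coming back from XML::Parser carries perl's internal UTF-8 flag, which is simulated here with utf8::upgrade rather than a real parse. The helper names url_decode, internal_bytes and to_bytes are hypothetical and exist only for illustration. utf8::downgrade is used in place of the pack("C*", unpack('U*', ...)) idiom from the post; it should have the same effect for characters up to 0xFF and dies on anything higher. The sketch does not address where the data is upgraded again before the DB2 insert modules.

```perl
use strict;
use warnings;

# URL-decode %XX escapes. Each escape becomes one character in 0x00..0xFF.
# If the input string is UTF-8 flagged, the result stays flagged.
sub url_decode {
    my ($str) = @_;
    $str =~ s/%([0-9A-Fa-f]{2})/pack('H2', $1)/eg;
    return $str;
}

# Length in bytes of perl's internal representation of the string.
sub internal_bytes {
    my ($str) = @_;
    use bytes;
    return length $str;
}

# Return a copy with the UTF-8 flag cleared (one byte per character).
# utf8::downgrade dies if any character is above 0xFF.
sub to_bytes {
    my ($str) = @_;
    utf8::downgrade($str);
    return $str;
}

my $encoded = '%82%B5%82%E8%81%5B%82%B8';
utf8::upgrade($encoded);    # assumption: stands in for flagged parser output

my $decoded = url_decode($encoded);
printf "decoded:    chars=%d internal_bytes=%d flagged=%s\n",
    length($decoded), internal_bytes($decoded),
    utf8::is_utf8($decoded) ? 'yes' : 'no';

my $bytes = to_bytes($decoded);
printf "downgraded: chars=%d internal_bytes=%d flagged=%s hex=%s\n",
    length($bytes), internal_bytes($bytes),
    utf8::is_utf8($bytes) ? 'yes' : 'no', unpack('H*', $bytes);
```

utf8::is_utf8 answers the "current encoding status" question for a single string, so calling it on the values just before they are handed to a DB2 insert module may help locate where they get flagged again.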