
Here is a program that shows something going wrong when
Text::Iconv is used together with
IO::Scalar or IO::String.

It reads a sample XML document containing an accented character
and, after XML parsing (which translates to UTF-8), translates
the text back to ISO-8859-1.

It also prepares a simple UTF-8 string with Text::Iconv.

Problem:
when the output is printed through an IO::Scalar or IO::String
redirection, the conversion fails.


Here the XML is read with XML::Simple or XML::LibXML,
but the result is the same with XML::XPath.

Note that with Text::Iconv->raise_error(1) set, no error is raised.

source program test:
use warnings;
use XML::Simple;
use Text::Iconv;
use XML::LibXML ;

my $string=q{<?xml version="1.0" encoding="iso-8859-1" ?

my $string2=Text::Iconv->new("iso-8859-1","utf-8")->convert

my $doc =XML::LibXML->new->parse_string($string) ;

my $ref=XML::Simple->new->XMLin($string);

my $C= Text::Iconv->new('UTF-8','ISO-8859-1');

print "\n------------------------------\n";
print $doc->toString, "\n raw:", $ref,"\tconv:", $C->convert
print "\n str2:", $string2,"\tconv:", $C->convert($string2);


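source program calltest:
use IO::Scalar;
use IO::String;

    # 1. run 'test' with its output captured through IO::Scalar
    my $page;
    my $CACHE = IO::Scalar->new(\$page);
    select $CACHE;
    do 'test';
    select STDOUT;
    print $page;

    # 2. run 'test' with its output captured through IO::String
    my $io = IO::String->new(my $tocache);
    select $io;
    do 'test';
    select STDOUT;
    print $tocache;

    # 3. run 'test' printing directly to STDOUT, for comparison
    do 'test';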
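
The select-based redirection used in calltest can also be tried
without the two CPAN modules. The sketch below is not what the
report uses: it swaps in a core in-memory filehandle (assuming
Perl 5.8 or later), and the names $buf, $mem and $old are
illustrative only. It may help to check whether the behaviour is
specific to IO::Scalar and IO::String.

```perl
use strict;
use warnings;

my $buf = '';
# Open an in-memory filehandle that writes into $buf (core Perl 5.8+).
open(my $mem, '>', \$buf) or die "cannot open in-memory handle: $!";

my $old = select($mem);   # print without a handle now goes to $mem
print "captured line\n";
select($old);             # restore the previous default handle
close($mem);

# Back on the original handle: show what was captured.
print "buffer holds: $buf";
```

Replacing the print statement with do 'test' would give a third
capture path to compare against the two in calltest.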

output of perl test:

<?xml version="1.0" encoding="iso-8859-1"?>

 raw:étude     conv:étude
 str2:étude    conv:étude

The raw print is UTF-8 and the conv print is ISO.
The XML output is in ISO directly.
The Iconv conversion is successful.
raw and str2 are the same.
All correct.

Now look at the other output:

output of perl calltest:

IO::Scalar ------------------------------

<?xml version="1.0" encoding="iso-8859-1"?>

 raw:étude     conv:étude
 str2:étude  conv:étude


<?xml version="1.0" encoding="iso-8859-1"?>

 raw:étude     conv:étude
 str2:étude  conv:étude


<?xml version="1.0" encoding="iso-8859-1"?>

 raw:étude     conv:étude
 str2:étude    conv:étude

Iconv fails to convert when the output goes through IO
redirection. It looks like the characters are not the same
between raw and string2, although both should be the UTF-8
translation of "étude".
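
One way to check whether two strings that print alike really hold
the same data is to dump their internal UTF-8 flag and their
character values. The sketch below uses only core Perl;
describe() is a hypothetical helper, and the two sample values
are assumptions standing in for a byte string (as a byte-oriented
converter would return) and a decoded character string (as an XML
parser may return). It does not reproduce the report itself.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: report the UTF-8 flag, the length and the
# hexadecimal value of every character in a string.
sub describe {
    my ($label, $s) = @_;
    return sprintf "%s: utf8_flag=%d length=%d chars=%s",
        $label, (utf8::is_utf8($s) ? 1 : 0), length($s),
        join(' ', map { sprintf '%02X', ord } split //, $s);
}

# "étude" as raw UTF-8 bytes (assumed stand-in for str2)
my $bytes = "\xC3\xA9tude";
# the same text as a decoded character string (assumed stand-in for raw)
my $chars = decode('UTF-8', $bytes);

print describe('bytes', $bytes), "\n";
print describe('chars', $chars), "\n";
```

Calling describe() on $ref and $string2 inside test, with and
without the redirection, would show whether the two values
actually differ.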

The toString method of XML::LibXML seems to output in
iso-8859-1 again, without explicit translation, but this fails
with IO redirection too.
