At 10:33 am +1100 18/11/04, Rick Measham wrote:

> That being the case, I grab the charset and use Encode's decode function
> to turn it into 'perl's internal format' .. which in 5.8.5 is utf8
> right? I then store that in the db.

What happens if you do something like this?


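use strict;
use warnings;

my $uri        = 'http://www.lemonde.fr';
my $fin        = '/tmp/latin1.html';
my $fout       = '/tmp/utf8.html';
my $charsetin  = "text/html; charset=iso-8859-1";
my $charsetout = "text/html; charset=UTF-8";

# Fetch the page as raw bytes.
`curl -o $fin $uri`;

# Decode Latin-1 on input, encode UTF-8 on output.
open(FIN,  "<:encoding(iso-8859-1)", $fin)  or die "Can't read $fin: $!";
open(FOUT, ">:encoding(utf8)",       $fout) or die "Can't write $fout: $!";

while (<FIN>) {
    chomp;
    $_ .= $/;
    # Update the charset declared in the page itself.
    s~\Q$charsetin\E~$charsetout~ig;
    print FOUT;
    print;
}
close(FIN);
close(FOUT);    # flush the output before viewing it

# Mac OS X: open the result in the default browser.
`open $fout`;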
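
The same conversion can probably be done in memory with Encode, which is closer to the decode step described in the question. This is only a rough sketch: it assumes the charset has already been pulled out of the Content-Type header, and the helper name to_utf8_bytes and the sample string are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: turn raw bytes in a known charset into UTF-8 bytes.
sub to_utf8_bytes {
    my ($raw, $charset) = @_;
    # decode() turns octets into a Perl character string.
    my $chars = decode($charset, $raw);
    # encode() turns the character string back into UTF-8 octets.
    return encode('UTF-8', $chars);
}

# "cafe" with an e-acute, as ISO-8859-1: the accent is the single byte 0xE9.
my $latin1 = "caf\xE9";
my $utf8   = to_utf8_bytes($latin1, 'iso-8859-1');

# Prints "4 byte(s) in, 5 byte(s) out": the accent becomes 0xC3 0xA9.
printf "%d byte(s) in, %d byte(s) out\n", length($latin1), length($utf8);
```

Whether the db wants the decoded character string or the encoded UTF-8 octets depends on the driver, so the final encode() step may or may not be needed there.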

JD


