I'm having some problems with XML/UTF-8 and CGI
variables in Perl 5.6.1.
I have attached an example of the problem, although you
will need to have XML::Simple installed to run it. An
example input string is Descripción.
The example takes an
input string and then prints it twice: once concatenated with an XML string, and
once just displaying the input string on its own. The mangling occurs when you
concatenate an XML string with a CGI string.
I'm not sure why
this happens, but here is a first attempt at a possible theory. All XML parsing
is done in UTF-8, but Perl has no idea of the encoding of incoming CGI streams and
assumes them to be ISO-8859-1 (Latin-1). I read this somewhere and don't know if
it's correct. String operations upgrade non-UTF-8 strings to UTF-8, so Perl tries
to convert the CGI string from ISO-8859-1 to UTF-8, thus mangling it, as it's
already UTF-8.
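To make the theory concrete, here is a minimal sketch of what I think is
happening. It is not the attached script. It assumes the Encode module, which
ships with Perl 5.8 and later and is not part of 5.6.1, and the variable names
are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);   # assumption: Perl 5.8+, not available in 5.6.1

# Raw UTF-8 bytes, as I believe they arrive from a CGI request (a byte string).
my $cgi_bytes = "Descripci\xc3\xb3n";

# A character string, standing in for what an XML parser hands back.
my $xml_chars = decode('UTF-8', "Descripci\xc3\xb3n");

# Concatenating the two upgrades $cgi_bytes as if it were Latin-1,
# so each of the two bytes of the accented letter becomes its own character.
my $mangled_out = encode('UTF-8', $xml_chars . " " . $cgi_bytes);

# Decoding the CGI bytes as UTF-8 before concatenating avoids the upgrade.
my $fixed_out = encode('UTF-8', $xml_chars . " " . decode('UTF-8', $cgi_bytes));

print $mangled_out, "\n";   # second word comes out double-encoded
print $fixed_out, "\n";     # both words come out the same
```

If the theory is right, the first line printed shows the second copy of the
word with its accented letter mangled, and the second line shows both copies
intact.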
Can anyone point me in the right direction, explain where
I'm going wrong, and maybe provide some useful links? There seems to
be very little information on building internationalised web pages with UTF-8 and
Perl 5.6.1.
Thanks
Mark
testUTF8.pl