Hi everyone,
I finally found some free time to take another look at the problem I described in http://www.mail-archive.com/[EMAIL PROTECTED]/msg02385.html .
In short, I have an iso-8859-2-encoded XSP page that I want to transform with AxKit. It turns out that after the first processing step (XSP), the intermediate XML document declares "UTF-8" in its header but is in fact double-UTF-8-encoded.
Looking more closely at how AxKit::XSP::SAXParser behaves, I found that before the XSP handler() is called the data is still iso-8859-2, but the $doc structure returned from $parser->parse_fh() holds UTF-8. Nevertheless, $doc->getEncoding() still returns 'iso-8859-2'. Because of that, process_node() assumes the data in $doc is iso-8859-2-encoded and applies encodeToUTF8(), which converts the already-UTF-8 data to UTF-8 a second time.
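The double conversion can be reproduced at the byte level. What follows is only a minimal sketch in Python rather than Perl, so it does not touch AxKit or XML::LibXML. The sample string and all variable names are assumptions chosen for illustration.

```python
# Illustration only: Python stands in for the Perl code path discussed here.
# The sample text and variable names are hypothetical.
original = "Łódź"  # every character exists in iso-8859-2

# What the XSP page holds on disk.
latin2_bytes = original.encode("iso-8859-2")

# What the parsed document holds: the same text, now as UTF-8 bytes.
utf8_bytes = latin2_bytes.decode("iso-8859-2").encode("utf-8")

# The problem: the reported encoding name is still iso-8859-2, so the
# UTF-8 bytes are converted "from iso-8859-2 to UTF-8" a second time.
double_encoded = utf8_bytes.decode("iso-8859-2").encode("utf-8")

# The assumption made by the patch: the parsed data is already UTF-8,
# so no further conversion is applied.
fixed = utf8_bytes

print(len(latin2_bytes), len(utf8_bytes), len(double_encoded))
print(fixed.decode("utf-8") == original)
print(double_encoded.decode("utf-8") == original)
```

Each non-ASCII character grows with every pass, so the double-encoded output is longer than correct UTF-8 and no longer decodes to the original text.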
My patch assumes that $doc is always UTF-8, which fixes my problem. Still, I'm not that familiar with XML::LibXML internals, so I'm not 100% sure that it is correct for $doc to keep reporting 'iso-8859-2' as its encoding name. If it is, then the SAXParser in XSP.pm IMHO makes the wrong assumption about the content of $doc.
I'll have to wait for Jörg to weigh in on this one - he's the one who put in all that decode stuff into XSP.pm - I wasn't sure it was required but he convinced me at the time ;-)
Matt.
