Finally, I found some free time to take another look at my problem, described in http://www.mail-archive.com/[EMAIL PROTECTED]/msg02385.html .
In short, I have an iso-8859-2-encoded XSP page that I want to transform with AxKit. It turns out that after the first processing step, which is XSP, the resulting intermediate XML document is marked as "UTF-8" in its header but in fact it is "double-utf-8-encoded".
Taking a closer look at how AxKit::XSP::SAXParser works, I realized that before the XSP handler() is called the data is still iso-8859-2, but the $doc structure returned from $parser->parse_fh() is UTF-8. Nevertheless, $doc->getEncoding() returns 'iso-8859-2'. Because of that, process_node() assumes the data in $doc is iso-8859-2-encoded and applies encodeToUTF8() to it. This results in a "double-UTF-8" conversion.
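To make the "double-UTF-8" effect concrete, here is a minimal sketch in Python (not Perl, and not the AxKit code itself). It only illustrates what happens when bytes that are already UTF-8 are interpreted as iso-8859-2 and converted to UTF-8 a second time. The sample string and all variable names are made up for the illustration.

```python
# Sample text with Polish letters, all of which exist in iso-8859-2.
original = "zażółć gęślą jaźń"

# The page as stored on disk: iso-8859-2 bytes.
source_bytes = original.encode("iso-8859-2")

# Step 1: the parser reads the input and holds the text as UTF-8 internally.
parsed_utf8 = source_bytes.decode("iso-8859-2").encode("utf-8")

# Step 2 (the problem): the UTF-8 bytes are treated as if they were still
# iso-8859-2 and are converted "to UTF-8" once more.
double_encoded = parsed_utf8.decode("iso-8859-2").encode("utf-8")

# The result is still valid UTF-8, but it no longer decodes to the original text.
garbled = double_encoded.decode("utf-8")

# Undoing the extra conversion step recovers the original text.
repaired = garbled.encode("iso-8859-2").decode("utf-8")
```

The second conversion does not fail, which is why the output is still labelled and accepted as UTF-8 even though every non-ASCII character has turned into two wrong ones.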
My patch assumes that $doc is always UTF-8, which fixes my problem. Still, I'm not that familiar with XML::LibXML internals, so I'm not 100% sure whether it is correct for $doc to keep reporting 'iso-8859-2' as its encoding name. If it is, then the SAXParser in XSP.pm makes, IMHO, the wrong assumption about the content of $doc.
In my original post there was the additional factor of a Debian box behaving differently from my FreeBSD development machine. It turns out that encodeToUTF8() on Debian performed no conversion at all; I didn't investigate that further.
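The difference between the two assumptions can be sketched as follows, again in Python and purely as an illustration. The functions to_utf8_original() and to_utf8_patched() are hypothetical stand-ins, not AxKit or XML::LibXML functions. The first trusts the declared encoding name, the second treats the in-memory data as UTF-8 regardless of that name.

```python
def to_utf8_original(data: bytes, declared_encoding: str) -> bytes:
    """Hypothetical stand-in for the unpatched logic: trust the declared
    encoding name and convert from it to UTF-8."""
    return data.decode(declared_encoding).encode("utf-8")


def to_utf8_patched(data: bytes, declared_encoding: str) -> bytes:
    """Hypothetical stand-in for the patched logic: assume the data is
    already UTF-8 and ignore the declared encoding name."""
    return data.decode("utf-8").encode("utf-8")


# Data as the parser holds it in memory: already UTF-8 ...
in_memory = "żółw".encode("utf-8")
# ... while the document still reports the encoding of the source file.
declared = "iso-8859-2"

unchanged = to_utf8_patched(in_memory, declared)   # bytes pass through as-is
mangled = to_utf8_original(in_memory, declared)    # bytes are converted again
```

With the patched assumption the conversion is effectively a no-op, which is correct only if the parsed document really is UTF-8 in every case.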
Cheers,
-- Andrzej
*** XSP.pm_ORG Wed Mar 19 10:22:21 2003
--- XSP.pm Wed Mar 19 14:25:50 2003
***************
*** 945,952 ****
}
AxKit::Debug(10, 'XSP: Parser returned doc');
$doc->process_xinclude;
!
! my $encoding = $doc->getEncoding() || 'UTF-8';
my $document = { Parent => undef };
$self->{Handler}->start_document($document);
--- 945,952 ----
}
AxKit::Debug(10, 'XSP: Parser returned doc');
$doc->process_xinclude;
!
! my $encoding = 'UTF-8'; # XML::LibXML::Document returned from parse_* is always UTF-8
my $document = { Parent => undef };
$self->{Handler}->start_document($document);
