On Fri, Jul 04, 2008 at 12:57:40PM +0200, Michael Ludwig wrote:
> I stumbled upon an oddity in LibXSLT: Element and attribute names end up
> containing character references in the output when the characters are
> not available in the selected output encoding.
>
> http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/200807/msg00057.html
>
> This oddity is actually a bug, so I reported it here:
>
> http://bugzilla.gnome.org/show_bug.cgi?id=541529

You ask for something impossible. You get a non-XML document instead of
an immediate failure. It's a trade-off, and it's unrelated to libxslt;
it's actually in libxml2.

The transcoding is done on a preserialized UTF-8 document (or document
fragment). Detecting the error would mean that each time a character is
not serializable in the target encoding, at the point where the escaped
sequence is issued, the code does a rewind lookup and tries to guess
whether you're within markup or within content. It is guessing because
at that point you're manipulating strings and there is no notion of
document structure. Basically it makes everybody pay a rather high cost
for the few who asked for something impossible.

The current behaviour has been there since the beginning of libxml2
(nearly a decade), so the bug is extremely uncommon. This makes me even
less comfortable with spreading the cost to everyone.

Again, it's a trade-off, a conscious one. For more information see
libxml2 encoding.c around line 2057; that's where the escaping is done.

If you see another way to handle this that doesn't heavily penalize the
normal process, I'm all for fixing this. But right now I don't see a
solution.

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
[EMAIL PROTECTED]    | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
_______________________________________________
xslt mailing list, project page http://xmlsoft.org/XSLT/
[email protected]
http://mail.gnome.org/mailman/listinfo/xslt
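
The trade-off described in the reply can be illustrated outside libxml2.
The following is only a sketch in Python, not the code in encoding.c: it
assumes that Python's "xmlcharrefreplace" error handler is a fair
stand-in for escaping on an already serialized string, and the function
names transcode and probably_inside_tag are hypothetical. The first
function shows why element names end up containing character
references; the second shows the kind of backward scan a check would
need, and why it is only a guess.

```python
def transcode(serialized: str, encoding: str) -> bytes:
    """Encode an already serialized document into the target encoding.

    The encoder only sees a string, so characters it cannot represent are
    replaced by character references whether they sit in a name or in text.
    """
    return serialized.encode(encoding, errors="xmlcharrefreplace")


def probably_inside_tag(serialized: str, pos: int) -> bool:
    """Guess whether index pos lies inside a tag by scanning backwards.

    Heuristic only (an assumption of this sketch): comments, CDATA sections
    and attribute values containing '>' would fool it. The backward scan is
    the extra work each unencodable character would have to pay for.
    """
    last_open = serialized.rfind("<", 0, pos)
    last_close = serialized.rfind(">", 0, pos)
    return last_open > last_close


doc = "<caf\u00e9>cr\u00e8me</caf\u00e9>"
out = transcode(doc, "ascii")
print(out.decode("ascii"))  # prints <caf&#233;>cr&#232;me</caf&#233;>
```

The printed result is not well-formed XML, since "&" and "#" cannot
appear in an element name, yet nothing fails during encoding. Calling
probably_inside_tag(doc, 4) returns True for the first unencodable
character and probably_inside_tag(doc, 8) returns False for the one in
the text, but only because this sample is simple enough for the
heuristic.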
