I'm working through some rather thorny experiments with new XML
support within the browser and I ran into this snippet:
static void switchToUTF16(xmlParserCtxtPtr ctxt)
{
// Hack around libxml2's lack of encoding overide support by manually
// resetting the encoding to UTF-16 before every chunk. Otherwise libxml
// will detect <?xml version="1.0" encoding="<encoding name>"?> blocks
// and switch encodings, causing the parse to fail.
const UChar BOM = 0xFEFF;
const unsigned char BOMHighByte = *reinterpret_cast<const unsigned
char*>(&BOM);
xmlSwitchEncoding(ctxt, BOMHighByte == 0xFF ?
XML_CHAR_ENCODING_UTF16LE : XML_CHAR_ENCODING_UTF16BE);
}
Looking at the libxml2 API, I've been baffled myself about how to
control the character encoding from the outside. This looks like a
serious lack of an essential feature. Anyone know about this above
"hack" and can provide more detail?
--
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."
Bertrand Russell in a footnote of Principles of Mathematics
_______________________________________________
webkit-dev mailing list
[email protected]
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev