Hello all.
I have a SAX2 parser reading from a MemBufInputSource that is initialized
with a buffer of UTF-8 encoded XML.
When it reads the byte sequence C3 AA (the UTF-8 encoding of 'e' with a
circumflex), the characters() method receives 00 C3 00 AA, which is wrong!
The correct UTF-16 value for 'e' with a circumflex is 00EA.
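
For reference, the arithmetic behind that claim can be sketched as follows. This is only a minimal illustration: both function names are made up for this example and are not Xerces-C++ APIs. The second function shows what widening each byte on its own would produce, which matches the output I see, though whether the parser is actually treating the input as a single-byte encoding is just a guess on my part.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Xerces-C++ function): decode one 2-byte
// UTF-8 sequence of the form 110xxxxx 10xxxxxx into a code point.
std::uint32_t decodeTwoByteUtf8(unsigned char lead, unsigned char trail) {
    assert((lead & 0xE0) == 0xC0);   // lead byte must be 110xxxxx
    assert((trail & 0xC0) == 0x80);  // trail byte must be 10xxxxxx
    // Keep 5 payload bits from the lead byte and 6 from the trail byte.
    return ((lead & 0x1Fu) << 6) | (trail & 0x3Fu);
}

// Hypothetical helper: what a single-byte encoding such as ISO-8859-1
// would do, i.e. map each input byte to one 16-bit code unit unchanged.
char16_t widenLatin1(unsigned char byte) {
    return static_cast<char16_t>(byte);
}
```

With these, decodeTwoByteUtf8(0xC3, 0xAA) gives 0x00EA, while widenLatin1 applied to each byte gives 0x00C3 and 0x00AA.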
Does anyone know what is going on?
Am I correct in assuming that the parser passes UTF-16 to the handler methods
regardless of the input encoding?
Thanks to anyone who can enlighten me on this one.
Alex.
P.S. The code I am using looks like this:
void someFunction(char *xml, int32_t xml_len) {
    SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
    // ExtractHandler is derived from DefaultHandler
    ExtractHandler extHandler(main_handle, result_container, xml, absLoc, result_name,
                              result_name_len);
    parser->setContentHandler(&extHandler);
    parser->setErrorHandler(&extHandler);
    // adoptBuffer is false, so the caller keeps ownership of xml
    MemBufInputSource mbis((XMLByte*)xml, xml_len, "", false);
    mbis.setCopyBufToStream(false); // not necessary to duplicate the buffer
    parser->parse(mbis);
    delete parser; // the reader from createXMLReader() is owned by the caller
}
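
To check exactly what characters() receives, the code units can be dumped in hex. The helper below is a rough sketch, not code from my handler: the name hexCodeUnits is made up, and it assumes XMLCh is a 16-bit type compatible with char16_t, which depends on the Xerces-C++ version and build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical debugging helper: format 16-bit code units as
// space-separated 4-digit uppercase hex values.
// char16_t stands in for XMLCh here (an assumption about the build).
std::string hexCodeUnits(const char16_t* units, std::size_t count) {
    assert(units != nullptr || count == 0);
    std::string out;
    char buf[8];  // 4 hex digits plus the terminating NUL fit easily
    for (std::size_t i = 0; i < count; ++i) {
        std::snprintf(buf, sizeof buf, "%04X", static_cast<unsigned>(units[i]));
        if (i != 0) {
            out += ' ';
        }
        out += buf;
    }
    return out;
}
```

Calling it on the chars and length arguments of characters() should make it easy to tell a correctly decoded 00EA from the 00C3 00AA pair described above.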