UTF8 parse failure when there's a bom in the utf8 header
--------------------------------------------------------
Key: XERCESC-1385
URL: http://issues.apache.org/jira/browse/XERCESC-1385
Project: Xerces-C++
Type: Bug
Components: SAX/SAX2
Versions: 2.6.0
Environment: OSX, CodeWarrior 9.4
Reporter: Miklos Fazekas
Attachments: test.cpp
This issue is probably related to 1284 (or a duplicate of it).
The attached sample code fails with Xerces 2.6.
The problem seems to be that the UTF-8 BOM is checked twice. Below is
a patch to XMLReader.cpp that resolves this issue. [ The bug is that we have
already detected the UTF-8 BOM and modified fRawBufIndex, but the second check
doesn't take that into account. ]
src/xercesc/internal/XMLReader.cpp
@@ -544,7 +544,7 @@
}
// If there's a utf-8 BOM (0xEF 0xBB 0xBF), skip past it.
else {
- const char* asChars = (const char*)fRawByteBuf;
+ const char* asChars = (const char*)(fRawByteBuf + fRawBufIndex);
if ((fRawBytesAvail > XMLRecognizer::fgUTF8BOMLen )&&
(XMLString::compareNString( asChars
, XMLRecognizer::fgUTF8BOM
It's also possible that we should check whether we have already detected a
UTF-8 BOM, as the following code would probably allow a double UTF-8 BOM.
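
To illustrate both points (comparing at the current read offset, and skipping
a BOM at most once), here is a minimal standalone sketch. It is not the
Xerces-C++ code: RawBuffer, skipUtf8Bom, and the bomSkipped flag are
hypothetical names that only stand in for fRawByteBuf, fRawBytesAvail, and
fRawBufIndex, and the real reader may track this state differently.

```cpp
#include <cstddef>
#include <cstring>

// UTF-8 byte order mark: 0xEF 0xBB 0xBF.
static const unsigned char kUtf8Bom[] = { 0xEF, 0xBB, 0xBF };
static const std::size_t kUtf8BomLen = sizeof(kUtf8Bom);

// Hypothetical stand-in for the reader's raw-buffer state (assumption,
// not the actual XMLReader layout).
struct RawBuffer
{
    const unsigned char* bytes;  // start of the raw byte buffer
    std::size_t avail;           // number of valid bytes in the buffer
    std::size_t index;           // offset of the next unread byte
    bool bomSkipped;             // true once a BOM has been consumed
};

// Skips a UTF-8 BOM located at the current read offset, at most once.
// Returns true if this call consumed a BOM.
bool skipUtf8Bom(RawBuffer& buf)
{
    // Guard against a second BOM being silently accepted.
    if (buf.bomSkipped || buf.index > buf.avail)
        return false;

    // Only the unread part of the buffer is relevant.
    const std::size_t remaining = buf.avail - buf.index;
    if (remaining < kUtf8BomLen)
        return false;

    // Compare at bytes + index, not at the start of the buffer.
    if (std::memcmp(buf.bytes + buf.index, kUtf8Bom, kUtf8BomLen) != 0)
        return false;

    buf.index += kUtf8BomLen;
    buf.bomSkipped = true;
    return true;
}
```

With a buffer that starts with a single BOM, the first call advances index by
three bytes and any later call leaves it unchanged. Comparing at the start of
the buffer instead would match the same BOM again and skip three bytes of
real content.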