Message: A new issue has been created in JIRA.
---------------------------------------------------------------------
View the issue:
  http://issues.apache.org/jira/browse/XERCESC-1226

Here is an overview of the issue:
---------------------------------------------------------------------
        Key: XERCESC-1226
    Summary: Parser reports bogus content when parsing
       Type: Bug

     Status: Unassigned
   Priority: Major

    Project: Xerces-C++
 Components: SAX/SAX2
   Versions: Nightly build (please specify the date)

   Assignee:
   Reporter: David Bertoni

    Created: Thu, 10 Jun 2004 9:42 AM
    Updated: Thu, 10 Jun 2004 9:42 AM
Environment: All platforms

Description:
When parsing the following document, the parser reports garbage characters.

    <?xml version="1.0"?>
    <subject>Research [𝔸]rticle</subject>

I traced this down to this function in XMLReader, starting on line 612:

    inline bool XMLReader::isPlainContentChar(const XMLCh toCheck)
    {
        return ((fgCharCharsTable[toCheck] & gPlainContentCharMask) != 0);
    }

Apparently, for the character "]" (U+005D RIGHT SQUARE BRACKET), the flags in fgCharCharsTable indicate it's not plain content. This causes the parser to misbehave badly, and deliver broken character data, including unpaired low surrogates.

When I used the debugger, and returned "true" from this function, rather than false, the parser delivered the correct character data.
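To make the mechanism easier to follow, here is a minimal, self-contained sketch of a flag-table lookup of the kind described above. It is not the Xerces-C++ source: the table contents, the mask value, and the names kPlainContentMask, buildTable, plainRunLength and splitSurrogates are all hypothetical, and XMLCh is assumed to be a 16-bit UTF-16 code unit. The sketch assumes that "<", "&" and "]" are the characters flagged as not plain content, which would be consistent with a scanner needing to stop at "]" to look for "]]>". It also shows that the non-BMP character in the sample document occupies two code units (a high and a low surrogate), so stopping a run in the wrong place can leave a low surrogate on its own.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

using XMLCh = char16_t;  // assumption: one UTF-16 code unit

// Hypothetical flag bit; the real mask value in Xerces-C++ may differ.
constexpr std::uint8_t kPlainContentMask = 0x01;

// Build a simplified 64K flag table: every code unit is "plain content"
// except the few that a content scanner has to inspect individually.
static std::array<std::uint8_t, 0x10000> buildTable()
{
    std::array<std::uint8_t, 0x10000> table{};
    table.fill(kPlainContentMask);
    table[u'<'] = 0;  // start of markup
    table[u'&'] = 0;  // start of a reference
    table[u']'] = 0;  // possible start of "]]>"
    return table;
}

// Same shape as the function quoted in the report: index, mask, compare.
inline bool isPlainContentChar(const XMLCh toCheck)
{
    static const std::array<std::uint8_t, 0x10000> table = buildTable();
    return (table[toCheck] & kPlainContentMask) != 0;
}

// Count consecutive plain-content code units starting at 'start'.
inline std::size_t plainRunLength(const std::u16string& text, std::size_t start)
{
    std::size_t pos = start;
    while (pos < text.size() && isPlainContentChar(text[pos]))
        ++pos;
    return pos - start;
}

// Split a supplementary code point into its UTF-16 surrogate pair.
inline std::pair<XMLCh, XMLCh> splitSurrogates(char32_t codePoint)
{
    assert(codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    const char32_t offset = codePoint - 0x10000;
    return { static_cast<XMLCh>(0xD800 + (offset >> 10)),
             static_cast<XMLCh>(0xDC00 + (offset & 0x3FF)) };
}
```

With this table, plainRunLength on the text of the sample element stops at the "]" that immediately follows the surrogate pair, so whatever code handles the non-plain character runs right after a two-unit character has been consumed. Whether that is where the actual defect lies is not established by the report.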
