Message:
The following issue has been resolved as FIXED.
Resolver: Alberto Massari
Date: Tue, 6 Jul 2004 8:55 AM
A fix is in CVS. Please verify.
Alberto
---------------------------------------------------------------------
View the issue:
http://issues.apache.org/jira/browse/XERCESC-1226
Here is an overview of the issue:
---------------------------------------------------------------------
Key: XERCESC-1226
Summary: Parser reports bogus content when parsing
Type: Bug
Status: Resolved
Priority: Major
Resolution: FIXED
Project: Xerces-C++
Components: SAX/SAX2
Versions: Nightly build (please specify the date)
Assignee:
Reporter: David Bertoni
Created: Thu, 10 Jun 2004 9:42 AM
Updated: Tue, 6 Jul 2004 8:55 AM
Environment: All platforms
Description:
When parsing the following document, the parser reports garbage characters.
<?xml version="1.0"?>
<subject>Research [𝔸]rticle</subject>
I traced this down to this function in XMLReader, starting on line 612:
inline bool XMLReader::isPlainContentChar(const XMLCh toCheck)
{
    return ((fgCharCharsTable[toCheck] & gPlainContentCharMask) != 0);
}
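For readers unfamiliar with this kind of lookup, here is a minimal standalone sketch of a per-code-unit flag table. It is not the Xerces-C++ implementation: the names charFlags and kPlainContentMask, the mask value, and the set of characters excluded from "plain content" are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical flag bit; the real mask values in Xerces-C++ are not shown in this report.
constexpr std::uint8_t kPlainContentMask = 0x01;

// Hypothetical table holding one flag byte per UTF-16 code unit.
inline const std::array<std::uint8_t, 0x10000>& charFlags()
{
    static const std::array<std::uint8_t, 0x10000> table = [] {
        std::array<std::uint8_t, 0x10000> t{};  // all flags start cleared
        for (unsigned c = 0x20; c <= 0xFFFD; ++c)
            t[c] = kPlainContentMask;
        // Code units a scanner would have to inspect individually (assumed set).
        t[u'<'] = 0;
        t[u'&'] = 0;
        t[u']'] = 0;
        for (unsigned c = 0xD800; c <= 0xDFFF; ++c)
            t[c] = 0;  // surrogate code units
        return t;
    }();
    return table;
}

// Same shape as the function quoted above: one table lookup and one mask test.
inline bool isPlainContentChar(char16_t toCheck)
{
    return (charFlags()[toCheck] & kPlainContentMask) != 0;
}

// Small self-check of the assumed table contents.
inline void sanityCheck()
{
    assert(isPlainContentChar(u'R'));
    assert(!isPlainContentChar(u']'));
}
```

In a design like this, a "false" result only tells the scanner to leave its fast path. What happens to the character afterwards is decided by the calling code.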
Apparently, for the character "]" (U+005D RIGHT SQUARE BRACKET), the flags in
fgCharCharsTable indicate that it is not plain content. This causes the parser to
misbehave badly and deliver broken character data, including unpaired low surrogates.
When I used the debugger to force this function to return "true" rather than "false",
the parser delivered the correct character data.
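One way to verify a fix is to check the delivered character data for unpaired surrogates. The following is a minimal sketch that assumes the text has already been collected into a std::u16string. The helper names (isHighSurrogate, isLowSurrogate, findUnpairedSurrogate, checkReportedSample) are hypothetical and not part of Xerces-C++.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

inline bool isHighSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
inline bool isLowSurrogate(char16_t c)  { return c >= 0xDC00 && c <= 0xDFFF; }

// Returns the index of the first unpaired surrogate code unit,
// or std::u16string::npos if every surrogate is correctly paired.
inline std::size_t findUnpairedSurrogate(const std::u16string& text)
{
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (isHighSurrogate(text[i])) {
            if (i + 1 < text.size() && isLowSurrogate(text[i + 1])) {
                ++i;  // skip the low half of a valid pair
                continue;
            }
            return i;  // high surrogate with no low surrogate after it
        }
        if (isLowSurrogate(text[i]))
            return i;  // low surrogate with no high surrogate before it
    }
    return std::u16string::npos;
}

// The element content from the sample document; U+1D538 is stored as a surrogate pair.
inline void checkReportedSample()
{
    const std::u16string expected = u"Research [\U0001D538]rticle";
    assert(findUnpairedSurrogate(expected) == std::u16string::npos);
}
```

A SAX handler may receive the text in several chunks, and a chunk boundary could fall between the two halves of a pair, so the check is meant for the accumulated content of the element rather than for each callback separately.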