At 05:27 PM 7/20/2001, Chris Hill wrote:
>I was running my modified version of Xerces 1.4 against a test suite and I 
>noticed that the character 0x05bc isn't marked in the fgCharCharsTable as 
>a NameChar.  Am I correct that the table is in error for this 
>character?  I would have thought this table would have been checked 
>against the spec.  Perhaps there are more characters which are not 
>correctly flagged?  I'm not exactly sure what all the flags in 
>XMLReader.hpp mean.

I wrote a program to compare the fgCharCharsTable to NameChar [4] in the 
XML specification.  I came up with the following differences:

{"#x05BC", "[#x064C-#x0651]", "#x06DE"}

These characters are valid name characters according to the spec but 
not according to Xerces' table.
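
For illustration, the kind of comparison I mean can be sketched roughly as below. This is not the program I ran and it does not touch Xerces' real table: the names CharRange, kCombiningCharExcerpt, inSpecExcerpt and findMissing are made up for this sketch, the parser side is passed in as a plain predicate, and only the three CombiningChar ranges of XML 1.0 that cover the reported characters are listed (CombiningChar is one of the alternatives of NameChar [4]).

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Inclusive range of code points (hypothetical helper type).
struct CharRange {
    std::uint32_t first;
    std::uint32_t last;
};

// Excerpt only: the CombiningChar ranges of XML 1.0 that contain the
// reported characters. The full production is much longer.
static const CharRange kCombiningCharExcerpt[] = {
    {0x05BB, 0x05BD},  // contains #x05BC
    {0x064B, 0x0652},  // contains #x064C-#x0651
    {0x06DD, 0x06DF},  // contains #x06DE
};

// True if ch falls in one of the excerpted spec ranges.
bool inSpecExcerpt(std::uint32_t ch) {
    for (const CharRange& r : kCombiningCharExcerpt) {
        if (ch >= r.first && ch <= r.last) {
            return true;
        }
    }
    return false;
}

// Collects every code point in [first, last] that the spec excerpt accepts
// but the supplied parser predicate rejects. The predicate stands in for
// whatever lookup the parser's table provides (an assumption of this sketch).
// Assumes last is below the maximum 32-bit value, so the loop terminates.
std::vector<std::uint32_t> findMissing(
        std::uint32_t first, std::uint32_t last,
        const std::function<bool(std::uint32_t)>& parserAccepts) {
    std::vector<std::uint32_t> missing;
    for (std::uint32_t ch = first; ch <= last; ++ch) {
        if (inSpecExcerpt(ch) && !parserAccepts(ch)) {
            missing.push_back(ch);
        }
    }
    return missing;
}
```

Calling findMissing over the whole 16-bit range with a predicate that wraps the parser's NameChar test would list the disagreements, limited of course to the ranges included in the excerpt.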

As I mentioned before, I don't know what all Xerces' masks are for.  Should 
I open a bug about this?

>One question I had is about the parser's support for surrogate pairs.  It 
>appears that in some places in the code there is support for surrogate 
>pairs mixed with calls to isXMLChar (which surrogate characters aren't) 
>such as XMLScanner::basicAttrValueScan.  I noticed that comments and PI's 
>can't contain characters > 0xFFFF.  As far as I can tell surrogates are 
>fine between start and end tags.  Is this a work in progress?
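
To make the surrogate question concrete, here is a minimal sketch of why a per-code-unit check rejects text that is legal once the pair is combined. The function names are hypothetical and are not Xerces' API; the ranges in isXmlCharCodePoint follow the Char production [2] of XML 1.0.

```cpp
#include <cstdint>

// UTF-16 code unit classification.
bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
bool isLowSurrogate(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Combines a valid high/low surrogate pair into one code point.
// The caller is expected to have checked both halves first.
std::uint32_t combineSurrogates(char16_t high, char16_t low) {
    return 0x10000u
         + ((static_cast<std::uint32_t>(high) - 0xD800u) << 10)
         + (static_cast<std::uint32_t>(low) - 0xDC00u);
}

// Char production of XML 1.0, applied to a full code point rather than
// to a single UTF-16 code unit.
bool isXmlCharCodePoint(std::uint32_t cp) {
    return cp == 0x9 || cp == 0xA || cp == 0xD
        || (cp >= 0x20 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
}
```

A lone code unit such as 0xD800 fails isXmlCharCodePoint, while the combined pair 0xD800 0xDC00 yields 0x10000, which passes. So a scanner that tests each 16-bit unit on its own would reject characters above 0xFFFF unless it combines the pair before testing.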

Comments?

Chris

