whitespace normalization for whitespace facet removes character references
--------------------------------------------------------------------------

                 Key: XERCESJ-1475
                 URL: https://issues.apache.org/jira/browse/XERCESJ-1475
             Project: Xerces2-J
          Issue Type: Bug
          Components: DOM (Level 3 Load & Save), JAXP (javax.xml.parsers), XML Schema 1.0 Datatypes, XML Schema 1.1 Datatypes
            Reporter: Martin Thomson


Parsing an element whose simple type has the whitespace facet set to collapse 
or replace causes whitespace written as character references to be normalized 
along with literal whitespace.  Whitespace normalization must not replace or 
collapse characters that were supplied as character references.

For example, <x>&#x20;a &#xA; b</x> should produce a value of " a \n b" in the 
PSVI.  Instead, it produces "a b" when the whitespace facet is collapse (for 
example, when the type of <x> is xs:token).  The character references are 
expanded before normalization, so the whitespace they represent is not 
preserved.
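
The observed result can be reproduced without a parser.  The following is a 
minimal sketch, not Xerces code: the class and method names are hypothetical, 
and it assumes the character references have already been expanded into the 
string before the replace and collapse rules of the whitespace facet run.

```java
// Hypothetical sketch of the whitespace facet rules; not the Xerces implementation.
public class WhitespaceFacetSketch {

    // "replace": every tab, line feed and carriage return becomes a space.
    static String replace(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            sb.append(c == '\t' || c == '\n' || c == '\r' ? ' ' : c);
        }
        return sb.toString();
    }

    // "collapse": apply replace, merge runs of spaces into one space,
    // and drop leading and trailing spaces.
    static String collapse(String value) {
        String replaced = replace(value);
        StringBuilder sb = new StringBuilder(replaced.length());
        boolean pendingSpace = false;
        for (int i = 0; i < replaced.length(); i++) {
            char c = replaced.charAt(i);
            if (c == ' ') {
                // Only remember a space if some non-space text precedes it.
                pendingSpace = sb.length() > 0;
            } else {
                if (pendingSpace) {
                    sb.append(' ');
                }
                pendingSpace = false;
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Content of <x>&#x20;a &#xA; b</x> after the references are expanded.
        String expanded = " a \n b";
        System.out.println("[" + replace(expanded) + "]");
        System.out.println("[" + collapse(expanded) + "]");
    }
}
```

Because the input string no longer records which characters came from 
character references, collapse returns "a b", which matches the value 
reported above.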

The description of the whitespace facet [1] does not make this immediately 
apparent.  The XML specification [2] is relatively explicit, although its text 
and its example seem to conflict on the use of a character reference for the 
space character (&#x20;).

[1] http://www.w3.org/TR/xmlschema11-2/#rf-whiteSpace
[2] http://www.w3.org/TR/xml11/#AVNormalize


