[ 
https://issues.apache.org/jira/browse/XERCESC-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230033#comment-15230033
 ] 

Ian Young commented on XERCESC-2065:
------------------------------------

There are two parts to the original report: the first is the apparent removal 
of the "text", which I will leave Scott to come back on.

The more important part is that in the output, the #xD is not re-expressed as 
the character reference "&#13;" but is left as a "bare" CR character. This is 
problematic because if that output is then read in *again*, the result will not 
be identical to the original document: XML end-of-line handling normalises a 
literal CR to a line feed (#xA) before the parser sees it.
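
To illustrate the round-trip problem, here is a minimal sketch in Python. It 
does not use Xerces-C++ at all; the helper names (normalize_eol, escape_text, 
unescape_text) are hypothetical, and the parse step only models end-of-line 
normalisation plus the handful of references used in the example.

```python
def normalize_eol(raw: str) -> str:
    """Model XML end-of-line handling: CRLF and lone CR both become LF."""
    return raw.replace("\r\n", "\n").replace("\r", "\n")


def escape_text(text: str, escape_cr: bool = True) -> str:
    """Serialise character data. With escape_cr=False a CR is written bare."""
    out = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    if escape_cr:
        # Writing CR as a character reference protects it from normalisation.
        out = out.replace("\r", "&#13;")
    return out


def unescape_text(raw: str) -> str:
    """Model the parse side: normalise literal line ends, then expand refs."""
    s = normalize_eol(raw)
    # &amp; is expanded last so that "&amp;lt;" yields the text "&lt;".
    return (s.replace("&#13;", "\r")
             .replace("&lt;", "<")
             .replace("&gt;", ">")
             .replace("&amp;", "&"))


original = "text\rmore<&"  # DOM text content after parsing text&#13;more&lt;&amp;

# CR escaped on output: the second parse reproduces the original text.
safe = unescape_text(escape_text(original))

# CR written bare: the second parse sees LF where the CR used to be.
lossy = unescape_text(escape_text(original, escape_cr=False))

print(repr(safe))
print(repr(lossy))
```

The only difference between the two paths is whether the serialiser writes the 
CR as a character reference, which is the behaviour being asked for here.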

> Carriage return entities are not handled properly
> -------------------------------------------------
>
>                 Key: XERCESC-2065
>                 URL: https://issues.apache.org/jira/browse/XERCESC-2065
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: DOM, Non-Validating Parser, SAX/SAX2
>    Affects Versions: 3.1.3
>            Reporter: Scott Cantor
>            Priority: Critical
>
> Documents with CR entities don't seem to round trip properly in the parser if 
> you parse them and then serialize them. It's possible the bug is in the 
> serializer because signed documents don't end up with corrupt signatures, but 
> that may be due to insufficient testing as of yet.
> A simple example:
> {code}
> <?xml version="1.0" encoding="UTF-8"?>
> <foo>
>    text&#13;more&lt;&amp;
> </foo>
> {code}
> Running that through DOMPrint or SAX2Print:
> {code}
> <foo>
> more&lt;&amp;
> </foo>
> {code}
> Notice the CR entity is removed, but also all of the characters immediately 
> in front of it.


