[ https://issues.apache.org/jira/browse/XERCESC-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230033#comment-15230033 ]
Ian Young commented on XERCESC-2065:
------------------------------------

There are two parts to the original report: the first is the apparent removal of the "text", which I will leave Scott to come back on.

The more important part is that in the output, the #xD is not re-expressed as "&#13;" but is left as a "bare" CR character. This is problematic because if that output is then read in *again*, the result will not be identical to the original document as the CR will be normalised as an end-of-line.

> Carriage return entities are not handled properly
> -------------------------------------------------
>
>                 Key: XERCESC-2065
>                 URL: https://issues.apache.org/jira/browse/XERCESC-2065
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: DOM, Non-Validating Parser, SAX/SAX2
>    Affects Versions: 3.1.3
>            Reporter: Scott Cantor
>            Priority: Critical
>
> Documents with CR entities don't seem to round trip properly in the parser if
> you parse them and then serialize them. It's possible the bug is in the
> serializer because signed documents don't end up with corrupt signatures, but
> that may be due to insufficient testing as of yet.
> A simple example:
> {code}
> <?xml version="1.0" encoding="UTF-8"?>
> <foo>
> text&#13;more&lt;&amp;
> </foo>
> {code}
> Running that through DOMPrint or SAX2Print:
> {code}
> <foo>
> more&lt;&amp;
> </foo>
> {code}
> Notice the CR entity is removed, but also all of the characters immediately
> in front of it.
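
The round-trip concern in the comment above can be sketched without Xerces-C++ at all. The following is a minimal, standalone model, not the library's serializer or parser: the helper names (`normalizeLineEnds`, `escapeText`, `expandRefs`, `roundTrip`) are hypothetical, and only the few references needed for the example are handled. It assumes the XML 1.0 end-of-line rule, under which a parser turns a literal CR or CRLF in the input into LF before character references are expanded.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: models XML 1.0 end-of-line handling on raw input.
// CRLF and a lone CR both become a single LF.
std::string normalizeLineEnds(const std::string& raw) {
    std::string out;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\r') {
            out += '\n';
            if (i + 1 < raw.size() && raw[i + 1] == '\n') ++i;  // swallow the LF of CRLF
        } else {
            out += raw[i];
        }
    }
    return out;
}

// Hypothetical helper: escapes text content for output. With escapeCR set,
// a CR is written as the character reference "&#13;"; otherwise it is
// written bare, which is the behaviour the comment describes as a problem.
std::string escapeText(const std::string& text, bool escapeCR) {
    std::string out;
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '\r':
                if (escapeCR) out += "&#13;"; else out += c;
                break;
            default: out += c;
        }
    }
    return out;
}

// Hypothetical helper: expands only the references escapeText can produce.
std::string expandRefs(const std::string& s) {
    static const struct { const char* ref; char ch; } table[] = {
        {"&amp;", '&'}, {"&lt;", '<'}, {"&#13;", '\r'}};
    std::string out;
    std::size_t i = 0;
    while (i < s.size()) {
        bool matched = false;
        for (const auto& e : table) {
            const std::string ref(e.ref);
            if (s.compare(i, ref.size(), ref) == 0) {
                out += e.ch;
                i += ref.size();
                matched = true;
                break;
            }
        }
        if (!matched) out += s[i++];
    }
    return out;
}

// Serialize, then re-read: line ends are normalized before references expand.
std::string roundTrip(const std::string& text, bool escapeCR) {
    return expandRefs(normalizeLineEnds(escapeText(text, escapeCR)));
}

// Checks both behaviours on the text content from the report's example.
void checkRoundTrip() {
    const std::string original = "text\rmore<&";
    assert(roundTrip(original, true) == original);         // CR survives as &#13;
    assert(roundTrip(original, false) == "text\nmore<&");  // bare CR comes back as LF
}
```

In this model the escaped form survives a second parse unchanged, while the bare CR comes back as LF, so the re-read document differs from the original.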