Joe Kesselman created XALANJ-2725:
-------------------------------------

             Summary: Possible buffer-boundary issue when serializing surrogate 
pairs
                 Key: XALANJ-2725
                 URL: https://issues.apache.org/jira/browse/XALANJ-2725
             Project: XalanJ2
          Issue Type: Improvement
      Security Level: No security risk; visible to anyone (Ordinary problems in 
Xalan projects.  Anybody can view the issue.)
          Components: Serialization
            Reporter: Joe Kesselman
            Assignee: Joe Kesselman


XALANJ-2419 addressed a case where "astral" Unicode characters (those outside 
the Basic Multilingual Plane, which require a surrogate pair of two UTF-16 
code units) were not being serialized correctly. We have 
a proposed fix for that.
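
For illustration only, here is a minimal sketch of what "serialized correctly" 
means for one astral character when it is written as a numeric character 
reference. The class and method names are hypothetical and are not taken from 
the Xalan serializer; only the java.lang.Character calls are real API.

```java
// Hypothetical demo class, not part of Xalan.
final class SurrogatePairDemo {
    // Combine a high/low surrogate pair into ONE numeric character reference.
    // Emitting each half separately would not be well-formed XML output.
    static String toCharRef(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            throw new IllegalArgumentException("not a valid surrogate pair");
        }
        int codePoint = Character.toCodePoint(high, low);
        return "&#" + codePoint + ";";
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP, so Java stores it as two chars.
        String s = new String(Character.toChars(0x1F600));
        System.out.println(s.length());                          // 2
        System.out.println(toCharRef(s.charAt(0), s.charAt(1))); // &#128512;
    }
}
```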

An edge case is reported to remain: a surrogate pair that straddles a buffer 
boundary might not be handled correctly. [~maxfortun] offered what 
looks like a reasonable proposed fix 
(https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607),
 but in my testing it did not serialize surrogate pairs correctly, 
causing regressions in the tests XALANJ-2419 introduced. I don't know whether 
that's because we're taking multiple paths through

But the edge case does appear to be real, and if so we will need a fix along 
these lines.
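
To make the edge case concrete, here is a rough sketch of one way a writer 
could carry a trailing high surrogate over to the next chunk. This is not the 
ToStream code and not the proposed patch; the class name, the characters() and 
finish() methods, and the choice to emit U+FFFD for an unpaired surrogate are 
all assumptions made for the example.

```java
// Hypothetical sketch, not Xalan's ToStream.
final class BoundarySafeEscaper {
    private final StringBuilder out = new StringBuilder();
    private char pendingHigh;    // high surrogate held over from the previous chunk
    private boolean hasPending;  // true while pendingHigh awaits its low half

    // Accepts one chunk of characters, as a SAX-style characters() call would.
    void characters(char[] buf, int start, int length) {
        int end = start + length;
        int i = start;
        // First, try to complete a pair that was split by the previous chunk.
        if (hasPending && i < end) {
            hasPending = false;
            if (Character.isLowSurrogate(buf[i])) {
                appendCharRef(Character.toCodePoint(pendingHigh, buf[i]));
                i++;
            } else {
                appendUnpaired();
            }
        }
        while (i < end) {
            char c = buf[i];
            if (Character.isHighSurrogate(c)) {
                if (i + 1 == end) {
                    // The pair crosses the chunk boundary: hold the high half.
                    pendingHigh = c;
                    hasPending = true;
                    i++;
                } else if (Character.isLowSurrogate(buf[i + 1])) {
                    appendCharRef(Character.toCodePoint(c, buf[i + 1]));
                    i += 2;
                } else {
                    appendUnpaired();
                    i++;
                }
            } else if (Character.isLowSurrogate(c)) {
                appendUnpaired(); // low surrogate with no preceding high half
                i++;
            } else {
                out.append(c);
                i++;
            }
        }
    }

    // Flushes any dangling high surrogate and returns everything written.
    String finish() {
        if (hasPending) {
            hasPending = false;
            appendUnpaired();
        }
        return out.toString();
    }

    private void appendCharRef(int codePoint) {
        out.append("&#").append(codePoint).append(';');
    }

    // Assumption: substitute U+FFFD; a real serializer might report an error.
    private void appendUnpaired() {
        out.append('\uFFFD');
    }

    public static void main(String[] args) {
        BoundarySafeEscaper e = new BoundarySafeEscaper();
        e.characters(new char[] {'a', '\uD83D'}, 0, 2); // chunk ends mid-pair
        e.characters(new char[] {'\uDE00', 'b'}, 0, 2); // chunk starts with low half
        System.out.println(e.finish()); // a&#128512;b
    }
}
```

The point of the sketch is only that the split and unsplit inputs should 
produce identical output; where that state would live in the real serializer 
is a separate question.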



