Joe Kesselman created XALANJ-2725:
-------------------------------------
Summary: Possible buffer-boundary issue when serializing surrogate
pairs
Key: XALANJ-2725
URL: https://issues.apache.org/jira/browse/XALANJ-2725
Project: XalanJ2
Issue Type: Improvement
Security Level: No security risk; visible to anyone (Ordinary problems in
Xalan projects. Anybody can view the issue.)
Components: Serialization
Reporter: Joe Kesselman
Assignee: Joe Kesselman
XALANJ-2419 addressed a case where "astral" Unicode characters, requiring a
surrogate pair (two UTF-16 units), were not being serialized correctly. We have
a proposed fix for that.
An edge case reportedly remains: a surrogate pair that crosses a buffer
boundary might not be handled correctly. [~maxfortun] offered what looks like
a reasonable proposed fix
(https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607),
but in my testing it did not serialize the surrogate pairs correctly, causing
regressions in the tests that XALANJ-2419 introduced. I don't know whether
that's because we're taking multiple paths through
But the edge case does appear to be real, and if so we will need some such
solution.
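
To illustrate the edge case, here is a minimal sketch of one way to carry a high
surrogate across buffer boundaries. It is not the ToStream code and not the
proposed fix; the class SurrogateCarrySketch and its write/finish methods are
hypothetical names made up for this example, and escaping every non-ASCII
character as a numeric character reference is an assumption made to keep the
output easy to check.

```java
class SurrogateCarrySketch {
    // High surrogate left over from the end of the previous chunk, or 0 if none.
    private char pendingHigh = 0;
    private final StringBuilder out = new StringBuilder();

    /** Serialize one chunk of UTF-16 units; a pair may be split across calls. */
    void write(char[] buf, int start, int length) {
        int end = start + length;
        for (int i = start; i < end; i++) {
            char c = buf[i];
            if (pendingHigh != 0) {
                char high = pendingHigh;
                pendingHigh = 0;
                if (Character.isLowSurrogate(c)) {
                    // Pair completed, possibly across a chunk boundary.
                    appendCodePoint(Character.toCodePoint(high, c));
                    continue;
                }
                // Unpaired high surrogate: this sketch substitutes U+FFFD,
                // then falls through to handle c normally.
                appendCodePoint(0xFFFD);
            }
            if (Character.isHighSurrogate(c)) {
                pendingHigh = c; // wait for the low half, here or in the next chunk
            } else if (Character.isLowSurrogate(c)) {
                appendCodePoint(0xFFFD); // low surrogate with no preceding high
            } else {
                appendCodePoint(c);
            }
        }
    }

    /** Flush any dangling high surrogate and return the serialized text. */
    String finish() {
        if (pendingHigh != 0) {
            appendCodePoint(0xFFFD);
            pendingHigh = 0;
        }
        return out.toString();
    }

    // ASCII passes through; everything else becomes a decimal character reference.
    private void appendCodePoint(int cp) {
        if (cp < 0x80) {
            out.append((char) cp);
        } else {
            out.append("&#").append(cp).append(';');
        }
    }

    public static void main(String[] args) {
        SurrogateCarrySketch s = new SurrogateCarrySketch();
        char[] first = {'a', '\uD83D'};   // chunk ends on the high surrogate of U+1F600
        char[] second = {'\uDE00', 'b'};  // next chunk starts with the low surrogate
        s.write(first, 0, first.length);
        s.write(second, 0, second.length);
        System.out.println(s.finish());
    }
}
```

The point of the sketch is only that the pending half of the pair has to live in
state that survives between write calls; if it is held in a local variable, the
pair is lost whenever the boundary falls between the two units.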