Damien Guillaume created XALANJ-2560:
----------------------------------------
Summary: ToXMLStream does not support unicode supplementary
characters
Key: XALANJ-2560
URL: https://issues.apache.org/jira/browse/XALANJ-2560
Project: XalanJ2
Issue Type: Bug
Security Level: No security risk; visible to anyone (Ordinary problems in
Xalan projects. Anybody can view the issue.)
Components: Serialization
Affects Versions: 2.7.1
Environment: Xalan 2.7.1 serializer.
Tested on Ubuntu 12.04 with Oracle JDK 1.7.0_05.
Reporter: Damien Guillaume
Assignee: Steven J. Hathaway
org.apache.xml.serializer.ToXMLStream (which extends ToStream) does not support
serialization of Unicode supplementary characters such as U+1D49C. It writes each
UTF-16 surrogate as a separate, invalid character reference, producing
"&#55349;&#56476;" instead of "&#x1D49C;" (or the bytes F0 9D 92 9C with UTF-8).
ToXMLStream is used by LSSerializer when Xalan's serializer is on the classpath.
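
For illustration only, here is a minimal sketch of how a supplementary character can be turned into a single character reference. It is not Xalan's code and not the proposed fix; the class SupplementaryEscape and the method escapeNonAscii are hypothetical names, and the method deliberately ignores the escaping of "&" and "<" that a real serializer also needs. It relies only on java.lang.String.codePointAt and java.lang.Character.charCount.

```java
public final class SupplementaryEscape {

    // Hypothetical helper (not part of Xalan): writes every non-ASCII code
    // point as one hexadecimal character reference. Markup characters such
    // as '&' and '<' are NOT escaped here; this only shows surrogate handling.
    public static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            // codePointAt joins a valid high/low surrogate pair into one code point.
            int cp = text.codePointAt(i);
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            }
            // charCount is 2 for supplementary code points, 1 otherwise.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+1D49C occupies two UTF-16 code units: 0xD835 (55349) and 0xDC9C (56476).
        String scriptA = new String(Character.toChars(0x1D49C));
        System.out.println(scriptA.length());          // prints 2
        System.out.println(escapeNonAscii(scriptA));   // prints &#x1D49C;
    }
}
```

Escaping each char separately, without joining the pair first, yields the two surrogate references described in this report.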
org.apache.xml.serialize.DOMSerializerImpl (included in Xerces) does not have
this problem, but it has been deprecated since Xerces 2.9.0, so this is a
regression for code that moves to the Xalan serializer.
See
http://stackoverflow.com/questions/11952289/serializing-supplementary-unicode-characters-into-xml-documents-with-java
for more details.
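
A possible way to check the behaviour through the standard DOM Level 3 Load and Save API is sketched below. The class name SupplementaryRepro and the method serialize are made up for this example. Which LSSerializer implementation is actually loaded depends on the classpath and the JDK, so the output is not shown here; the report concerns the case where Xalan 2.7.1's serializer is picked up.

```java
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

public final class SupplementaryRepro {

    // Hypothetical helper: builds <root>text</root> and serializes it with
    // whatever LSSerializer the DOM implementation registry provides.
    public static String serialize(String text, String encoding) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            root.appendChild(doc.createTextNode(text));
            doc.appendChild(root);

            DOMImplementationLS ls = (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
            LSSerializer serializer = ls.createLSSerializer();
            LSOutput output = ls.createLSOutput();
            output.setEncoding(encoding);
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            output.setByteStream(bytes);
            serializer.write(doc, output);
            return bytes.toString(encoding);
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        String scriptA = new String(Character.toChars(0x1D49C));
        // With an encoding that cannot represent U+1D49C, the serializer has
        // to fall back to a character reference, which is where the reported
        // surrogate-by-surrogate output would show up.
        System.out.println(serialize(scriptA, "US-ASCII"));
        System.out.println(serialize(scriptA, "UTF-8"));
    }
}
```

A well-formed result carries a single reference to the code point (or the four UTF-8 bytes), never a pair of references in the surrogate range.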