http://nagoya.apache.org/bugzilla/show_bug.cgi?id=12105

UTF Encoding is not preserved

           Summary: UTF Encoding is not preserved
           Product: XalanJ2
           Version: 2.4Dx
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: org.apache.xalan.serialize
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]

I recently began using Xalan for transforming data that includes a few UTF
encoded characters. My transforms go from XML to XML, and I would like to
preserve the UTF encoded chars rather than escape them, which seems to be the
behavior in Xalan (for example, the UTF char of int value 146 gets written to
the output as the ASCII character reference &#146;). When transforming with
XML Spy, however, the UTF encoding is preserved (but I want to use Xalan
instead!).

I wonder whether this should be considered a bug or just an implementation
decision. It seems, however, that if the output is meant to be encoded as UTF,
there is no reason to escape UTF chars coming from the input.

I was able to "correct" this problem by making the following change to the
code in org/apache/xalan/serialize/SerializerToXML.java.

In method public boolean canConvert(char ch), I changed the line:

    return bool.booleanValue() ? !Character.isISOControl(ch) : false;

To:

    return bool.booleanValue()
        ? !Character.isUnicodeIdentifierStart(ch)
            || !Character.isUnicodeIdentifierPart(ch)
            || !Character.isISOControl(ch)
        : false;
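
For illustration, here is a minimal, self-contained sketch of why the original check causes the char with int value 146 to be escaped. This is not the Xalan source: the class EscapeSketch, the method canConvertOriginal, the helper serialize, and the encoderCanEncode parameter (standing in for bool.booleanValue()) are all hypothetical names, and the decimal character reference format is assumed from the behavior described above. The only library call used is Character.isISOControl, which returns true for the ranges 0x00-0x1F and 0x7F-0x9F, and 146 (0x92) falls inside the second range.

```java
public class EscapeSketch {

    // Hypothetical stand-in for the original canConvert(char) logic:
    // the char is written as-is only if the encoder can represent it
    // AND it is not an ISO control character.
    static boolean canConvertOriginal(char ch, boolean encoderCanEncode) {
        return encoderCanEncode ? !Character.isISOControl(ch) : false;
    }

    // Hypothetical helper: emit the char itself, or a decimal numeric
    // character reference when the check above rejects it.
    static String serialize(char ch, boolean encoderCanEncode) {
        return canConvertOriginal(ch, encoderCanEncode)
                ? String.valueOf(ch)
                : "&#" + (int) ch + ";";
    }

    public static void main(String[] args) {
        char c146 = (char) 146;   // 0x92, inside the 0x7F-0x9F control range
        char eAcute = (char) 0xE9; // a non-control char above ASCII

        System.out.println(Character.isISOControl(c146));   // control char
        System.out.println(serialize(c146, true));           // escaped form
        System.out.println(serialize(eAcute, true).length()); // single char kept
    }
}
```

Under these assumptions, the char 146 is escaped because it is classified as an ISO control character, not because the output encoding is unable to represent it, while a non-control character such as 0xE9 passes through unchanged.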
