http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18551

EncodingMap not consistent

           Summary: EncodingMap not consistent
           Product: Xerces2-J
           Version: 2.2.1
          Platform: All
        OS/Version: Other
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: Other
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]

I have checked that this is not fixed in the latest CVS. I suspect it may be
visible in the Serialization subsystem. This is *not* a duplicate of bug 4456.

I use xerces.util.EncodingMap within my system in order to generate the
correct XML declaration when outputting an XML file. The API I am coding to,
probably mistakenly, allows the user to specify a Writer or an OutputStream.
In the former case I try to find the Java encoding, and then use the
facilities of EncodingMap to convert it to an IANA encoding.

However, the two mappings, Java2IANA and IANA2Java, are not consistent. My
understanding is that IANA2Java is many-to-one, whereas Java2IANA is a
one-to-one subrelation of the inverse of IANA2Java. I suspect that IANA2Java
is better tested than Java2IANA.

To fix this for me I added the following code to my class:

    import java.util.Iterator;
    import java.util.Map;
    import org.apache.xerces.util.EncodingMap;

    abstract public class BaseXMLWriter {
        // Subclassing EncodingMap gives access to its static tables.
        static private class Fake extends EncodingMap {
            static {
                // Pass 1: report Java => IANA entries that do not map back
                // to the same Java name through fIANA2JavaMap.
                Iterator it = EncodingMap.fJava2IANAMap.entrySet().iterator();
                while (it.hasNext()) {
                    Map.Entry me = (Map.Entry) it.next();
                    if (!me.getKey().equals(
                            EncodingMap.fIANA2JavaMap.get(me.getValue()))) {
                        System.err.println(
                            "?1? " + me.getKey() + " => " + me.getValue());
                    }
                }
                // Pass 2: report Java names that fIANA2JavaMap produces but
                // fJava2IANAMap does not know, and add the reverse entry.
                it = EncodingMap.fIANA2JavaMap.entrySet().iterator();
                while (it.hasNext()) {
                    Map.Entry me = (Map.Entry) it.next();
                    if (null == EncodingMap.fJava2IANAMap.get(me.getValue())) {
                        System.err.println(
                            "?2? " + me.getKey() + " => " + me.getValue());
                        EncodingMap.fJava2IANAMap.put(me.getValue(), me.getKey());
                    }
                }
            }

            // Empty on purpose: calling it forces the static initializer to run.
            static void foo() {
            }
        }

        static {
            Fake.foo();
        }

        // ... rest of BaseXMLWriter ...
    }

This causes a static initializer to run over the two tables within
EncodingMap. It reports the inconsistencies and, for each Java encoding name
that appears as a value in fIANA2JavaMap but has no entry in fJava2IANAMap,
copies the inverted entry into fJava2IANAMap.

The output it produces, indicating problems, is:

    ?1? Cp01149 => IBM01149
    ?1? Cp01148 => IBM01148
    ?1? Cp01147 => IBM01147
    ?1? Cp01146 => IBM01146
    ?1? Cp01145 => IBM01145
    ?1? Cp01144 => IBM01144
    ?1? Cp01143 => IBM01143
    ?1? Cp01142 => IBM01142
    ?1? Cp01141 => IBM01141
    ?1? Cp01140 => IBM01140
    ?1? CP1047 => IBM1047
    ?2? IBM-367 => ASCII
    ?2? IBM-1149 => Cp1149
    ?2? IBM-1148 => Cp1148
    ?2? IBM-1147 => Cp1147
    ?2? IBM-1146 => Cp1146
    ?2? IBM-1145 => Cp1145
    ?2? IBM-1144 => Cp1144
    ?2? IBM-1143 => Cp1143
    ?2? IBM-1142 => Cp1142
    ?2? IBM-1141 => Cp1141
    ?2? IBM-1140 => Cp1140
    ?2? IBM-1047 => Cp1047
    ?2? WINDOWS-1258 => Cp1258
    ?2? WINDOWS-1257 => Cp1257
    ?2? WINDOWS-1256 => Cp1256
    ?2? WINDOWS-1255 => Cp1255
    ?2? WINDOWS-1254 => Cp1254
    ?2? WINDOWS-1253 => Cp1253
    ?2? WINDOWS-1252 => Cp1252
    ?2? WINDOWS-1251 => Cp1251
    ?2? WINDOWS-1250 => Cp1250
    ?2? TIS-620 => TIS620

I suspect that every line of this output is a small error in the tables.
(The presenting problem was Cp1252.)

I suggest that code based on my code be added to the test suite for Xerces.
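
As a rough sketch of what such a test-suite check might look like, the two passes can be written against plain java.util.Map arguments so that they run without touching EncodingMap internals. Everything here is an assumption made for illustration: the class name EncodingTableCheck, the method findInconsistencies, and the small sample tables in main are hypothetical and are not the real Xerces tables. Unlike the workaround in the report, this version only reports problems and does not modify either table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Xerces: checks that a Java => IANA table
// and an IANA => Java table agree with each other.
class EncodingTableCheck {

    // Returns one message per inconsistency, using the same "?1?" / "?2?"
    // prefixes as the report. Neither map is modified.
    static List<String> findInconsistencies(Map<String, String> java2iana,
                                            Map<String, String> iana2java) {
        List<String> problems = new ArrayList<>();

        // Pass 1: every Java => IANA entry should map back to the same
        // Java name through the IANA => Java table.
        for (Map.Entry<String, String> e : java2iana.entrySet()) {
            String back = iana2java.get(e.getValue());
            if (!e.getKey().equals(back)) {
                problems.add("?1? " + e.getKey() + " => " + e.getValue());
            }
        }

        // Pass 2: every Java name produced by the IANA => Java table should
        // have some IANA name in the Java => IANA table.
        for (Map.Entry<String, String> e : iana2java.entrySet()) {
            if (!java2iana.containsKey(e.getValue())) {
                problems.add("?2? " + e.getKey() + " => " + e.getValue());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Illustrative sample data only; TreeMap keeps the output order stable.
        Map<String, String> java2iana = new TreeMap<>();
        java2iana.put("UTF8", "UTF-8");
        java2iana.put("Cp01140", "IBM01140");

        Map<String, String> iana2java = new TreeMap<>();
        iana2java.put("UTF-8", "UTF8");
        iana2java.put("IBM-1140", "Cp1140");
        iana2java.put("WINDOWS-1252", "Cp1252");

        for (String problem : findInconsistencies(java2iana, iana2java)) {
            System.err.println(problem);
        }
    }
}
```

With the sample tables above, main prints "?1? Cp01140 => IBM01140", "?2? IBM-1140 => Cp1140" and "?2? WINDOWS-1252 => Cp1252". A real test would presumably pass in the actual EncodingMap tables (for example from a subclass, as in the workaround) and fail if the returned list is not empty.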
