DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18551>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18551

EncodingMap not consistent

           Summary: EncodingMap not consistent
           Product: Xerces2-J
           Version: 2.2.1
          Platform: All
        OS/Version: Other
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: Other
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


I have checked that this is not fixed in the latest CVS.   
I suspect it may be visible in the serialization subsystem.  
This is *not* a duplicate of bug 4456. 
  
I use xerces.util.EncodingMap within my system in order   
to generate the correct XML declaration when outputting an XML file.  
The API I am coding to, probably mistakenly, allows the user to  
specify a Writer or an OutputStream. In the former case I try to  
find the Java encoding, and then use the facilities of EncodingMap  
to convert it to an IANA encoding.  
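
For illustration, the Writer case can be sketched with the JDK alone. This is a minimal sketch, not the code from my system: it swaps the EncodingMap lookup for java.nio.charset.Charset, whose canonical names generally follow the IANA registry. The class name XmlDeclSketch and its two methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Xerces or of the reporter's code.
class XmlDeclSketch {

    // Maps a Java encoding name (e.g. "UTF8") to the charset's canonical name.
    // Assumption: the JDK canonical name stands in for the IANA name here.
    // Throws an unchecked exception if the name is illegal or unsupported.
    static String canonicalName(String javaEncoding) {
        return Charset.forName(javaEncoding).name();
    }

    // Builds an XML declaration for the encoding the writer actually uses.
    static String xmlDeclaration(OutputStreamWriter writer) {
        // getEncoding() returns the historical Java name, or null if the writer is closed.
        String javaName = writer.getEncoding();
        if (javaName == null) {
            return "<?xml version=\"1.0\"?>";
        }
        return "<?xml version=\"1.0\" encoding=\"" + canonicalName(javaName) + "\"?>";
    }

    public static void main(String[] args) {
        OutputStreamWriter w =
                new OutputStreamWriter(new ByteArrayOutputStream(), StandardCharsets.UTF_8);
        System.out.println(xmlDeclaration(w));
    }
}
```

An arbitrary Writer does not expose its encoding, so this only works when the Writer is an OutputStreamWriter (or a subclass such as FileWriter).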
  
However, the two mappings, Java2IANA and IANA2Java, are not consistent.  
  
My understanding is that IANA2Java is a many-to-one mapping, whereas   
Java2IANA is a one-to-one subrelation of the inverse of IANA2Java.  
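
The property can be stated as two checks over plain maps. The following is a minimal sketch under my own assumptions: the class MappingConsistencySketch and its method names are hypothetical, and the toy tables in main reuse a few names from this report but are not the actual Xerces tables.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical checker working on plain maps rather than on EncodingMap itself.
class MappingConsistencySketch {

    // Java2IANA entries (java => iana) for which IANA2Java does not map iana
    // back to the same java name, i.e. entries outside the inverse of IANA2Java.
    static List<String> notInInverse(Map<String, String> java2iana,
                                     Map<String, String> iana2java) {
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, String> e : java2iana.entrySet()) {
            if (!e.getKey().equals(iana2java.get(e.getValue()))) {
                bad.add(e.getKey() + " => " + e.getValue());
            }
        }
        return bad;
    }

    // IANA2Java entries (iana => java) whose java name has no Java2IANA entry at all.
    static List<String> missingReverse(Map<String, String> java2iana,
                                       Map<String, String> iana2java) {
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, String> e : iana2java.entrySet()) {
            if (!java2iana.containsKey(e.getValue())) {
                bad.add(e.getKey() + " => " + e.getValue());
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Toy tables for illustration only.
        Map<String, String> java2iana = new LinkedHashMap<>();
        java2iana.put("UTF8", "UTF-8");
        java2iana.put("CP1047", "IBM1047");
        Map<String, String> iana2java = new LinkedHashMap<>();
        iana2java.put("UTF-8", "UTF8");
        iana2java.put("WINDOWS-1252", "Cp1252");

        System.out.println(notInInverse(java2iana, iana2java));
        System.out.println(missingReverse(java2iana, iana2java));
    }
}
```

If the first check finds nothing, Java2IANA is automatically one-to-one, because IANA2Java maps each IANA name to a single Java name. The second check is the extra requirement that every Java name produced by IANA2Java can be converted back.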
  
I suspect that IANA2Java is better tested than Java2IANA.  
  
To work around this, I added the following code to my class:  
  
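import java.util.Iterator;
import java.util.Map;
import org.apache.xerces.util.EncodingMap;

abstract public class BaseXMLWriter {

        // Subclass only to reach the protected tables in EncodingMap.
        static private class Fake extends EncodingMap {
                static {
                        // ?1?: Java2IANA entries whose IANA name does not map
                        // back to the same Java name in IANA2Java.
                        Iterator it = EncodingMap.fJava2IANAMap.entrySet().iterator();
                        while (it.hasNext()) {
                                Map.Entry me = (Map.Entry) it.next();
                                if (!me.getKey().equals(EncodingMap.fIANA2JavaMap.get(me.getValue()))) {
                                        System.err.println("?1? " + me.getKey() + " => " + me.getValue());
                                }
                        }
                        // ?2?: IANA2Java entries whose Java name has no Java2IANA
                        // entry; report them and add the missing reverse mapping.
                        it = EncodingMap.fIANA2JavaMap.entrySet().iterator();
                        while (it.hasNext()) {
                                Map.Entry me = (Map.Entry) it.next();
                                if (null == EncodingMap.fJava2IANAMap.get(me.getValue())) {
                                        System.err.println("?2? " + me.getKey() + " => " + me.getValue());
                                        EncodingMap.fJava2IANAMap.put(me.getValue(), me.getKey());
                                }
                        }
                }
                // Empty method; calling it forces the static initializer above to run.
                static void foo() {
                }
        }
        static {
                Fake.foo();
        }

        // ... rest of BaseXMLWriter ...
}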
   
   
It causes a static initializer to run over the two tables within EncodingMap,
report the inconsistencies, and copy the missing reverse entries from
fIANA2JavaMap into fJava2IANAMap.
 
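The output it produces, indicating the problems, is shown below.
A ?1? line is a Java2IANA entry whose IANA name does not map back to
the same Java name in IANA2Java; a ?2? line is an IANA2Java entry
whose Java name has no entry in Java2IANA.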
    
    
?1? Cp01149 => IBM01149    
?1? Cp01148 => IBM01148    
?1? Cp01147 => IBM01147    
?1? Cp01146 => IBM01146    
?1? Cp01145 => IBM01145    
?1? Cp01144 => IBM01144    
?1? Cp01143 => IBM01143    
?1? Cp01142 => IBM01142    
?1? Cp01141 => IBM01141    
?1? Cp01140 => IBM01140    
?1? CP1047 => IBM1047    
?2? IBM-367 => ASCII    
?2? IBM-1149 => Cp1149    
?2? IBM-1148 => Cp1148    
?2? IBM-1147 => Cp1147    
?2? IBM-1146 => Cp1146    
?2? IBM-1145 => Cp1145    
?2? IBM-1144 => Cp1144    
?2? IBM-1143 => Cp1143    
?2? IBM-1142 => Cp1142    
?2? IBM-1141 => Cp1141    
?2? IBM-1140 => Cp1140    
?2? IBM-1047 => Cp1047    
?2? WINDOWS-1258 => Cp1258    
?2? WINDOWS-1257 => Cp1257    
?2? WINDOWS-1256 => Cp1256    
?2? WINDOWS-1255 => Cp1255    
?2? WINDOWS-1254 => Cp1254    
?2? WINDOWS-1253 => Cp1253    
?2? WINDOWS-1252 => Cp1252    
?2? WINDOWS-1251 => Cp1251    
?2? WINDOWS-1250 => Cp1250    
?2? TIS-620 => TIS620    
    
 
I suspect that every line of this output is a small error in the 
tables. (The presenting problem was Cp1252.) 
 
I suggest that code based on my code be added to the test suite 
for Xerces.
