Encodings.properties

Joe Kesselman (Jira) Sat, 27 Jan 2024 07:43:03 -0800


     [ 
https://issues.apache.org/jira/browse/XALANJ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Joe Kesselman resolved XALANJ-2618.
-----------------------------------
    Fix Version/s: The Latest Development Code
       Resolution: Fixed

Merged my fix. 

> Error in org/apache/xml/serializer/Encodings.properties
> -------------------------------------------------------
>
>                 Key: XALANJ-2618
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2618
>             Project: XalanJ2
>          Issue Type: Bug
>      Security Level: No security risk; visible to anyone(Ordinary problems in 
> Xalan projects.  Anybody can view the issue.) 
>          Components: Serialization, transformation
>    Affects Versions: 2.7.2
>         Environment: Java 11
>            Reporter: Simon Schaarschmidt
>            Assignee: Joe Kesselman
>            Priority: Major
>              Labels: Java11
>             Fix For: The Latest Development Code
>
>
> We transform and serialize using encoding ISO-8859-1. With JDK 1.8 all is 
> fine, but with OpenJDK 11 the result is written (by class ToTextStream) as 
> character references, e.g. "*&#105;&#100;&#61;&#49;*" instead of "*id=1*".
> Various encodings are defined in 
> org/apache/xml/serializer/Encodings.properties (serializer.jar), e.g.
> {{ISO8859-1  ISO-8859-1  0x00FF}}
> {{ISO8859_1  ISO-8859-1  0x00FF}}
> {{8859-1     ISO-8859-1  0x00FF}}
> {{8859_1     ISO-8859-1  0x00FF}}
> First value: Java encoding name.
> Second value: comma-separated preferred MIME names.
> The class org.apache.xml.serializer.Encodings reads this file into a 
> Properties object, processes the definitions to create EncodingInfo objects, 
> and puts them (see method loadEncodingInfo()) into the member fields 
> _encodingTableKeyJava and _encodingTableKeyMime (both Hashtable). Putting 
> elements into _encodingTableKeyMime is the critical part: the mapping is 
> not 1:1, so each later element returned by Properties.keys() replaces the 
> EncodingInfo object stored earlier for the same MIME name.
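
The overwrite described above can be reproduced without Xalan. The following
is a minimal sketch, not the code of Encodings.loadEncodingInfo(): the class
MimeTableDemo and its method are hypothetical, and the table stores plain Java
encoding names instead of EncodingInfo objects. It only illustrates that a
table keyed by MIME name keeps whichever entry was inserted last, so the
lookup result depends on the order in which the keys are enumerated.

```java
import java.util.Hashtable;
import java.util.Map;

// Hypothetical demo class; it does not reproduce Xalan's actual loader.
final class MimeTableDemo {

    /**
     * Inserts {javaName, mimeName} pairs, in the given order, into a table
     * keyed by MIME name and returns the Java name finally stored for mime.
     */
    static String lookupAfterLoading(String mime, String[][] entriesInOrder) {
        Map<String, String> byMime = new Hashtable<>();
        for (String[] entry : entriesInOrder) {
            // put() silently replaces any earlier value stored under the same key.
            byMime.put(entry[1], entry[0]);
        }
        return byMime.get(mime);
    }

    public static void main(String[] args) {
        String[][] orderA = {{"8859-1", "ISO-8859-1"}, {"ISO8859-1", "ISO-8859-1"}};
        String[][] orderB = {{"ISO8859-1", "ISO-8859-1"}, {"8859-1", "ISO-8859-1"}};
        System.out.println(lookupAfterLoading("ISO-8859-1", orderA));
        System.out.println(lookupAfterLoading("ISO-8859-1", orderB));
    }
}
```

With the first ordering the lookup yields ISO8859-1, with the second it yields
8859-1, although both runs load exactly the same definitions.
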
> Up to Java 1.8 the first line above is the last entry in the Enumeration, 
> therefore _encodingTableKeyMime returns the EncodingInfo object with Java 
> encoding "ISO8859-1" for encoding "ISO-8859-1". With Java 11 the elements 
> of the Enumeration returned by Properties.keys() have a different order: 
> the third line above is the last entry! Therefore _encodingTableKeyMime 
> returns the EncodingInfo object with Java encoding "*8859-1*" when asked 
> for encoding "ISO-8859-1". But "8859-1" is not a valid Java encoding name! 
> Method EncodingInfo.inEncoding(char,String) fails internally with an 
> *UnsupportedEncodingException* and returns false.
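
To illustrate why an unknown encoding name leads to every character being
escaped, here is a hedged sketch of a check that swallows the exception and
answers "false". It is not the implementation of EncodingInfo.inEncoding();
the class EncodingProbe, the method canEncode() and the round-trip test are
assumptions made for this example. Whether a given JDK accepts "8859-1" can be
tried by running main(), which prints the result without assuming it.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper; the name and the round-trip check are assumptions.
final class EncodingProbe {

    /** Returns true if ch survives an encode/decode round trip in javaName. */
    static boolean canEncode(char ch, String javaName) {
        String s = String.valueOf(ch);
        try {
            byte[] bytes = s.getBytes(javaName);
            // Unmappable characters are replaced on encoding, so the
            // decoded text differs from the input in that case.
            return s.equals(new String(bytes, javaName));
        } catch (UnsupportedEncodingException e) {
            // Unknown encoding name: report "not encodable" instead of failing.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = args.length > 0 ? args : new String[] {"ISO-8859-1", "8859-1"};
        for (String name : names) {
            System.out.println(name + " -> " + canEncode('i', name));
        }
    }
}
```

A caller that writes a character reference whenever such a check returns false
would escape even plain ASCII once the encoding name is not recognized, which
matches the output reported above.
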
> The methods in class Encodings first search for the EncodingInfo object in 
> _encodingTableKeyJava and use elements from _encodingTableKeyMime as a 
> fallback.
> I suggest extending the definitions in Encodings.properties with 
> additional lines, e.g.
> {{*ISO-8859-1* ISO-8859-1  0x00FF}}
> and likewise for encodings ISO-8859-2..9. Alternatively, all entries with 
> Java encoding name "8859*" should be removed. (They are not valid Java 
> encoding names: UnsupportedEncodingException!)
> Finally, I think the current mechanism of collecting the EncodingInfo 
> objects using two Hashtables is fragile.
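
One way to make the MIME lookup independent of enumeration order is sketched
below. This is only an illustration under stated assumptions and is not
claimed to be the fix that was merged: the class DeterministicMimeTable and
its methods are hypothetical. The idea is to visit the Java names in sorted
order, skip names the running JVM does not support, and let the first
remaining name win.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical alternative loader, shown for illustration only.
final class DeterministicMimeTable {

    /** Builds MIME name -> Java name from {javaName, mimeName} pairs. */
    static Map<String, String> build(String[][] entries) {
        // TreeMap iterates in sorted key order, whatever the input order was.
        Map<String, String> javaToMime = new TreeMap<>();
        for (String[] entry : entries) {
            javaToMime.put(entry[0], entry[1]);
        }
        Map<String, String> byMime = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : javaToMime.entrySet()) {
            if (isSupported(e.getKey())) {
                // The first supported name wins; later names never replace it.
                byMime.putIfAbsent(e.getValue(), e.getKey());
            }
        }
        return byMime;
    }

    /** True if the running JVM knows this charset name or alias. */
    static boolean isSupported(String javaName) {
        try {
            return Charset.isSupported(javaName);
        } catch (IllegalArgumentException ex) {
            // Syntactically illegal names are treated as unsupported.
            return false;
        }
    }

    /** Convenience lookup; returns null if no supported name maps to mime. */
    static String javaNameFor(String mime, String[][] entries) {
        return build(entries).get(mime);
    }
}
```

Because unsupported names are filtered out before insertion, an entry whose
Java name the JVM rejects can no longer shadow a usable one, regardless of the
order in which Properties.keys() returns the definitions.
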



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

