Serializer's Encodings class methods slow: getLastPrintable(), convertJava2MimeEncoding()
------------------------------------------------------------------------------------------

         Key: XALANJ-2077
         URL: http://issues.apache.org/jira/browse/XALANJ-2077
     Project: XalanJ2
        Type: Improvement
  Components: Serialization  
    Reporter: Brian Minchau
    Priority: Minor


The serializer's methods:
    org.apache.xml.serializer.Encodings.getLastPrintable(String encoding)
    org.apache.xml.serializer.Encodings.convertJava2MimeEncoding(String encoding)

are both slower than they need to be.
For some reason the java.lang.String.toUpperCase() call used in each of
them seems to be quite slow.

Code like this seems to run twice as fast as calling String.toUpperCase():

    private static String toUpperCaseFast(final String s) {

        boolean different = false;
        final int mx = s.length();
        final char[] chars = new char[mx];
        for (int i = 0; i < mx; i++) {
            char ch = s.charAt(i);
            if (Character.isLowerCase(ch)) {
                ch = Character.toUpperCase(ch);
                different = true; // the uppercased String is different
            }
            chars[i] = ch;
        }

        // A little optimization: don't build a new String if
        // the uppercased string is the same as the input string.
        // Otherwise build the result from the contents of the char array.
        return different ? new String(chars) : s;
    }

--------------------------------
Perhaps the official toUpperCase() method is slow because it is Locale
sensitive (the no-argument form uses the default locale). It doesn't need
to be here, because the encodings supported are those named in IANA-CHARSETS:
(Internet Assigned Numbers Authority) Official Names for Character Sets, ed.
Keld Simonsen et al. (See http://www.iana.org/assignments/character-sets.)
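
To illustrate the Locale sensitivity mentioned above, here is a small sketch. The class and method names (LocaleUpperCaseDemo, upperIn) are made up for this example and are not part of the serializer. It only shows that the result of String.toUpperCase(Locale) depends on the locale passed in. It says nothing about speed.

```java
import java.util.Locale;

public final class LocaleUpperCaseDemo {
    private LocaleUpperCaseDemo() {}

    // Hypothetical helper: upper-cases s using the rules of the given locale.
    static String upperIn(final String s, final Locale locale) {
        return s.toUpperCase(locale);
    }

    public static void main(String[] args) {
        final String name = "iso-8859-1";
        final Locale turkish = Locale.forLanguageTag("tr-TR");

        final String en = upperIn(name, Locale.ENGLISH);
        final String tr = upperIn(name, turkish);

        // English rules give the expected encoding name.
        System.out.println(en.equals("ISO-8859-1")); // true
        // Turkish rules map 'i' to U+0130 (capital I with dot above),
        // so the result is no longer the plain ASCII name.
        System.out.println(tr.equals("ISO-8859-1")); // false
        System.out.println(tr.charAt(0) == '\u0130'); // true
    }
}
```

So a lookup keyed on upper-cased encoding names could in principle behave differently under a Turkish default locale, which is another reason a locale-independent conversion may be preferable here.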

These names are drawn from the printable US-ASCII characters, so
perhaps even the calls made in the method above, Character.isLowerCase(ch)
and Character.toUpperCase(ch), can be replaced with faster bit manipulation.
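
A minimal sketch of what such bit manipulation might look like, assuming the input contains only US-ASCII encoding names. The class and method names (AsciiUpperCase, toUpperCaseAscii) are hypothetical, and this has not been measured against String.toUpperCase(). Any character outside 'a'..'z' is left untouched.

```java
public final class AsciiUpperCase {
    private AsciiUpperCase() {}

    /**
     * Upper-cases only the ASCII letters 'a'..'z'.
     * Returns the input String itself when nothing needs to change.
     */
    public static String toUpperCaseAscii(final String s) {
        final int len = s.length();
        char[] chars = null; // allocated lazily, on the first lowercase letter
        for (int i = 0; i < len; i++) {
            final char ch = s.charAt(i);
            if (ch >= 'a' && ch <= 'z') {
                if (chars == null) {
                    chars = s.toCharArray();
                }
                // 'a' (0x61) and 'A' (0x41) differ only in bit 0x20,
                // so clearing that bit upper-cases an ASCII letter.
                chars[i] = (char) (ch & ~0x20);
            }
        }
        return (chars == null) ? s : new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(toUpperCaseAscii("iso-8859-1")); // ISO-8859-1
    }
}
```

Allocating the array only when a lowercase letter is found keeps the common case, an already upper-cased name, free of any allocation.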



