Re: RFR: JDK-8032012, , String.toLowerCase/toUpperCase performance improvement

Ulf Zibis Wed, 22 Jan 2014 14:12:25 -0800

On 22.01.2014 16:20, Paul Sandoz wrote:

On Jan 21, 2014, at 11:05 PM, Xueming Shen <[email protected]> wrote:

On 01/20/2014 09:24 AM, Paul Sandoz wrote:

- it would be nice to get rid of the pseudo goto using the "scan" labelled 
block.

webrev has been updated to remove the pseudo goto by checking the "first"
against "len" after the loop break.

Much more readable :-)


I think you should compare the performance of both versions on both modern
and 32-bit CPUs.

- you might be able to optimize by doing (could depend on the answer to the 
next point):

  int c = (int)value[i];
  int lc = Character.toLowerCase(c);
  if (.....) {
    result[i] = (char)lc;
  } else {
    return toLowerCaseEx(result, i, locale, localeDependent);
  }

- Do you need to check ERROR for the result of toLowerCase?

2586             if (c == Character.ERROR ||

Yes, Character.toLowerCase() should never return ERROR (while the
package-private Character.toUpperCaseEx() will). In theory there is no need to
check whether the return value of Character.toUpperCase(int) is >
min_supplementary_code_point in our loop, because no BMP character has a
supplementary code point as its lower case. But since it's a data-driven
mapping table, there is no guarantee that the Unicode data table is not going
to change in the "future", so I still keep the check.


In my opinion this check should be the subject of a JDK build-time test, not
of runtime code.
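
To make that concrete, here is a minimal sketch of what such a build-time check
might look like. The class name CaseMappingInvariantCheck and the constant
ERROR_SENTINEL are hypothetical; Character.ERROR is package-private, so the
sketch assumes its value is 0xFFFFFFFF and compares against that literal. It
only uses the public Character.toLowerCase(int) and is not taken from the
webrev.

```java
// Hypothetical build-time check, not part of the webrev under review.
final class CaseMappingInvariantCheck {

    // Assumption: mirrors the package-private Character.ERROR (0xFFFFFFFF).
    private static final int ERROR_SENTINEL = 0xFFFFFFFF;

    // Counts BMP code units whose lower-case mapping would break the
    // assumption made by a char-by-char fast path.
    static int countViolations() {
        int violations = 0;
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            int lc = Character.toLowerCase(c);
            if (lc == ERROR_SENTINEL
                    || lc >= Character.MIN_SUPPLEMENTARY_CODE_POINT) {
                violations++;
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        int violations = countViolations();
        if (violations != 0) {
            throw new AssertionError(
                violations + " BMP chars violate the lower-case invariant");
        }
        System.out.println("lower-case mapping invariant holds");
    }
}
```

If a future Unicode data update introduced such a mapping, a check of this kind
would fail at build or test time rather than costing a comparison per
character at runtime.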

or:

   int c = (int)value[i];
   int lc = Character.toLowerCase(c); // is that safe?
   if (c < '\u03A3' || (c < Character.MIN_HIGH_SURROGATE && c != '\u03A3' &&
       lc < Character.MIN_SUPPLEMENTARY_CODE_POINT)) {
     result[i] = (char)lc;
   } else {
     return toLowerCaseEx(result, i, locale, localeDependent);
   }

FWIW I personally find those solutions easier to read, if they are safe w.r.t.
Character.toLowerCase and that annoying Greek character.


I would prefer the 3rd version.
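
For experimenting with that variant outside the JDK, a self-contained sketch
could look like the following. It is an approximation under stated
assumptions, not the webrev code: the class LowerCaseFastPath is hypothetical,
the private toLowerCaseEx() is replaced by a fallback to the public
String.toLowerCase(Locale), and the locale-dependent languages are assumed to
be "tr", "az" and "lt". The sketch also sends U+0130 to the slow path, because,
as far as I can tell, String.toLowerCase maps it to two chars in other locales
while Character.toLowerCase returns a single 'i'.

```java
import java.util.Locale;

// Hypothetical stand-alone approximation of the fast path discussed above.
final class LowerCaseFastPath {

    static String toLowerCase(String s, Locale locale) {
        String lang = locale.getLanguage();
        // Assumption: these languages need locale-specific case rules.
        if (lang.equals("tr") || lang.equals("az") || lang.equals("lt")) {
            return s.toLowerCase(locale);
        }
        int len = s.length();
        char[] result = new char[len];
        for (int i = 0; i < len; i++) {
            int c = s.charAt(i);
            int lc = Character.toLowerCase(c);
            // Fast path: no surrogate, not Greek capital sigma (U+03A3),
            // not Latin capital I with dot above (U+0130, added here),
            // and the mapping stays inside the BMP.
            if (c != '\u0130'
                    && (c < '\u03A3'
                        || (c < Character.MIN_HIGH_SURROGATE
                            && c != '\u03A3'
                            && lc < Character.MIN_SUPPLEMENTARY_CODE_POINT))) {
                result[i] = (char) lc;
            } else {
                // Slow path: stands in for toLowerCaseEx() in this sketch.
                return s.toLowerCase(locale);
            }
        }
        return new String(result);
    }
}
```

Comparing its output with String.toLowerCase(Locale) over a range of inputs is
one way to probe whether the condition is safe before benchmarking it.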

-Ulf
