Re: RFR: 8248655: Support supplementary characters in String case insensitive operations

Jim Laskey Wed, 15 Jul 2020 09:27:08 -0700

I think I'm good with this. +1

Asides:


 325             int cp1 = (int)getChar(value, k1);
 326             int cp2 = (int)getChar(other, k2);

I would be tempted to short-circuit by exiting when not equal, but I think we 
agreed we need to allow for upper/lowers on different planes.
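
For readers following along, here is a minimal sketch of what a code-point-based case-insensitive comparison can look like. It is not the code from the webrev: the class and method names are hypothetical, and the up-front length check is an assumption. It reads one reading of the point above into code: raw inequality of cp1 and cp2 is not enough to return false, and only a mismatch after case mapping both sides is.

```java
public final class CodePointCaseCompare {

    // Hypothetical helper for illustration only; not the JDK implementation.
    public static boolean equalsIgnoreCase(String a, String b) {
        // Assumption: strings of different char length are treated as unequal.
        if (a.length() != b.length()) {
            return false;
        }
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            // Read whole code points so surrogate pairs are handled as one unit.
            int cp1 = a.codePointAt(i);
            int cp2 = b.codePointAt(j);
            i += Character.charCount(cp1);
            j += Character.charCount(cp2);

            if (cp1 == cp2) {
                continue;
            }
            // Not equal as-is: compare after case mapping before giving up.
            int u1 = Character.toUpperCase(cp1);
            int u2 = Character.toUpperCase(cp2);
            if (u1 == u2) {
                continue;
            }
            if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
                continue;
            }
            return false;
        }
        // Both strings must be fully consumed.
        return i == a.length() && j == b.length();
    }

    public static void main(String[] args) {
        // DESERET CAPITAL LETTER LONG I (U+10400) and its lowercase (U+10428).
        String upper = new String(Character.toChars(0x10400));
        String lower = new String(Character.toChars(0x10428));
        System.out.println(equalsIgnoreCase(upper, lower)); // true
        System.out.println(equalsIgnoreCase("abc", "ABD")); // false
    }
}
```

A comparison that works char by char sees the two Deseret strings as different low surrogates and has no way to relate them through case mapping, which is the behavior the bug report describes.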
 
In the UTF-16 code I was trying to think of how you could exhaust the first 
string and not the second, and still have their lengths the same. I think I 
have convinced myself that it's not possible as long as surrogates always map 
upper/lowers to surrogates (two chars each).
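
The assumption in that last sentence can be checked empirically against whatever Unicode data the running JDK ships. The sketch below is only an illustration with hypothetical names: it counts code points whose simple upper or lower case mapping has a different UTF-16 width than the code point itself. It says nothing about future Unicode versions.

```java
public final class CaseMappingWidthCheck {

    // True if the simple case mappings of cp occupy the same number of
    // UTF-16 chars as cp itself (1 for BMP, 2 for supplementary).
    public static boolean keepsWidth(int cp) {
        int width = Character.charCount(cp);
        return Character.charCount(Character.toUpperCase(cp)) == width
            && Character.charCount(Character.toLowerCase(cp)) == width;
    }

    // Counts code points whose case mapping crosses the BMP boundary.
    public static int countWidthChanges() {
        int count = 0;
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            if (!keepsWidth(cp)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("code points changing width: " + countWidthChanges());
    }
}
```

If the count is zero, two strings of equal char length cannot run out at different times when walked code point by code point, provided each step only treats code points of the same width as a match.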

Cheers,

-- Jim





> On Jul 15, 2020, at 1:00 PM, naoto.s...@oracle.com wrote:
> 
> Hello,
> 
> Please review the fix to the following issues:
> 
> https://bugs.openjdk.java.net/browse/JDK-8248655
> https://bugs.openjdk.java.net/browse/JDK-8248434
> 
> The proposed changeset and its CSR are located at:
> 
> https://cr.openjdk.java.net/~naoto/8248655.8248434/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8248664
> 
> A bug was filed against SimpleDateFormat (8248434) where case-insensitive 
> date format/parse failed in some of the new locales in JDK 15. The root cause 
> was that the case-insensitive String.regionMatches() method did not work with 
> supplementary characters. The problem is that the method's spec does not 
> expect case mappings of supplementary characters, possibly because this was 
> overlooked in the first place in JSR 204, "Unicode Supplementary Character 
> Support". Similar behavior is observed in the other two case-insensitive 
> methods, i.e., compareToIgnoreCase() and equalsIgnoreCase().
> 
> The fix is straightforward: compare strings on a code point basis instead of 
> a code unit (16-bit "char") basis. Technically this change will introduce a 
> backward incompatibility, but I believe it is incompatible only with wrong 
> behavior that was never true to the intent of those methods.
> 
> Naoto
