So you'll see that I reverted the new String impl until we fix up the possible cases in the boot sequence.
I was reviewing the callers of toLowerCase(), and see that File#hashCode() uses that method (not in boot, mind you). The Java SE 6 spec for that method [1] has been clarified to state that, when lowercasing the pathname on Windows, the "Locale is not taken into account on lowercasing the pathname string."

Thinking out loud -- so how does that work? How can we lowercase a Unicode string without consideration of a locale?

[1] http://download.oracle.com/javase/6/docs/api/java/io/File.html#hashCode%28%29

Regards,
Tim

On 23/Sep/2010 12:21, Robert Muir wrote:
> On Wed, Sep 22, 2010 at 10:33 PM, Tim Ellison <t.p.elli...@gmail.com> wrote:
>
>> On 23/Sep/2010 01:10, Robert Muir (JIRA) wrote:
>>> I thought about this too,
>>>
>>> one concern (not knowing if there are more cases involved) would be
>>> if the input "should" be ascii, but "could" be something else. if
>>> String.toLowerCase had the ascii special-case with a fallback to ICU,
>>> it could fail less gracefully in such a situation if it encountered
>>> non-ascii rather than simply not matching, especially since unit
>>> tests tend to have more coverage for the ascii case...
>>>
>>> ...but this might be theoretical
>> Fail less gracefully than what? Today, by using String#toLowerCase(),
>> invalid ascii gets past into ICU so will get converted as though it were
>> a valid char encoding, so I don't think it would make anything worse
>> than it is today.
>>
>
> well, what I meant to say is that the auto-detect idea seems a bit shaky. if
> something wants to do an ascii-only uppercase/lowercase before ICU is
> available, and we know we cannot load ICU yet, then I think the
> toASCIILowerCase is much better than calling String.toLowerCase and saying
> "yeah we know the input is all ascii, it won't load ICU".
>
> The toASCIILowerCase will never load ICU, doesn't depend on an
> implementation detail of String, and then it's explicit in the code what is
> going on.
>
>
>> I think the debate is whether to find and fix places in the class library
>> code where we know the input is ascii and change uses of
>> String#toLowerCase to use
>> org.apache.harmony.luni.util.Util#toASCIILowerCase() [1]
>>
>
> +1, I think this is the best solution.
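
To make the two questions in this thread concrete, here is a minimal sketch in Java. The class AsciiLowerCaseSketch and its method body are hypothetical: the thread names org.apache.harmony.luni.util.Util#toASCIILowerCase() but does not show its implementation, so this is only an assumption about what an ASCII-only lowercasing helper might look like. It never consults a locale or any case-mapping tables. The main method then shows why locale matters for the general case, using the well-known Turkish mapping of 'I' to dotless 'ı'.

```java
import java.util.Locale;

public class AsciiLowerCaseSketch {

    // Hypothetical stand-in for Util#toASCIILowerCase(); not Harmony's actual code.
    // Only 'A'..'Z' are mapped; every other char is copied through unchanged,
    // so no locale data or ICU tables are ever needed.
    public static String toASCIILowerCase(String s) {
        char[] out = s.toCharArray();
        for (int i = 0; i < out.length; i++) {
            char c = out[i];
            if (c >= 'A' && c <= 'Z') {
                out[i] = (char) (c + ('a' - 'A'));
            }
        }
        return new String(out);
    }

    public static void main(String[] args) {
        String s = "TITLE";

        // ASCII-only mapping: always "title", whatever the default locale is.
        System.out.println(toASCIILowerCase(s));

        // Locale-neutral mapping via the root locale (available since Java 6).
        System.out.println(s.toLowerCase(Locale.ROOT));

        // Turkish rules lowercase 'I' to dotless 'ı' (U+0131), giving "tıtle".
        System.out.println(s.toLowerCase(new Locale("tr")));
    }
}
```

The explicit helper trades generality for predictability: non-ASCII input simply passes through unchanged, which matches the "simply not matching" behaviour discussed above, while String#toLowerCase(Locale) remains the right call wherever real Unicode case mapping is wanted.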