Greetings,

I remembered another subtle issue that I would like to make people aware of.

3) Case-Insensitive String Comparison

When developers need to compare strings without regard to case, or want to
implement a map with case-insensitive string keys, they often employ
String.toLowerCase() or String.toUpperCase() to create a "normalized" string
before doing a simple String.equals(). Now, the to*Case() methods are
overloaded: one takes no arguments and one takes a Locale object.

The gotcha with the arg-less methods is that their output depends on the
default locale of the JVM, and the default locale is outside the developer's
control (see also [0], [1]). That means the string expected by the
developer (who runs/tests the code in a JVM using locale xy) does not
necessarily match the string seen by another user (who runs a JVM with
locale ab). For example, the comparison

 "info".equals(debugLevel.toLowerCase())

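is likely to fail on systems whose default locale is Turkish: if debugLevel
is "INFO", the upper-case "I" is lower-cased to the dotless "ı" (U+0131)
rather than to "i".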

Since developers usually want to compare strings from the English language,
they must use String.to*Case(Locale.ENGLISH) to get reliable results
regardless of the end-user's default locale.
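
To illustrate the point, here is a minimal sketch. The class name
LocaleCaseDemo and the helper normalize() are made up for this example, and
the Turkish locale is passed explicitly to imitate what the arg-less method
would do on a JVM whose default locale is Turkish:

```java
import java.util.Locale;

class LocaleCaseDemo {

    // Hypothetical helper: lower-cases with a fixed locale so the result
    // does not depend on the JVM's default locale.
    static String normalize(String s) {
        return s.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        // Stand-in for a JVM default locale of Turkish.
        Locale turkish = Locale.forLanguageTag("tr-TR");
        String debugLevel = "INFO";

        // Turkish rules map 'I' to the dotless i (U+0131), so this is false.
        System.out.println("info".equals(debugLevel.toLowerCase(turkish)));

        // With an explicit English locale the comparison is true.
        System.out.println("info".equals(normalize(debugLevel)));
    }
}
```

The same normalize() call could be applied to keys before they are put into
or looked up from a map.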

Just to make the picture complete: String.to*Case() is locale-sensitive and
context-aware. In contrast, Character.to*Case() and
String.equalsIgnoreCase() (which relies on Character.to*Case()) are neither
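locale-sensitive nor context-aware [2]. For instance,

 "ΣΣ".toLowerCase(Locale.ENGLISH).equals("σσ")

is false, because the context-aware lower-case form of "ΣΣ" is "σς" (the
trailing sigma becomes a final sigma), while

 "ΣΣ".equalsIgnoreCase("σσ")

is true, because that method compares the strings character by character.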
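
The difference between the two approaches can be reproduced with a small
sketch like the following. The class and method names are invented for this
example, and the Greek letters are written as Unicode escapes so that the
source file encoding does not matter:

```java
import java.util.Locale;

class SigmaCaseDemo {

    static final String UPPER = "\u03A3\u03A3"; // two capital sigmas
    static final String LOWER = "\u03C3\u03C3"; // two small (non-final) sigmas

    // Hypothetical helper: compares after context-aware lower-casing.
    static boolean viaLowerCase(String a, String b) {
        return a.toLowerCase(Locale.ENGLISH).equals(b.toLowerCase(Locale.ENGLISH));
    }

    // Hypothetical helper: compares character by character, ignoring context.
    static boolean viaEqualsIgnoreCase(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        // The trailing capital sigma becomes a final sigma (U+03C2): true.
        System.out.println(UPPER.toLowerCase(Locale.ENGLISH).equals("\u03C3\u03C2"));

        System.out.println(viaLowerCase(UPPER, LOWER));        // false
        System.out.println(viaEqualsIgnoreCase(UPPER, LOWER)); // true
    }
}
```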

Regards,


Benjamin Bentmann


[0] http://java.sun.com/javase/6/docs/api/java/lang/String.html#toLowerCase()
[1] http://cafe.elharo.com/blogroll/turkish/
[2] http://java.sun.com/javase/6/docs/api/java/lang/Character.html#toLowerCase(char)

