On 09/15/11 09:06, MRAB wrote:
It's somewhat unlikely that Unicode will become locale-dependent in
Python because it would cause problems; you don't want:

      "i".upper() == "I"

to be maybe true, maybe false.

An option would be to specify whether it should be locale-dependent.

There have been several times when I've wished that unicode strings would grow something like

  .locale_aware_insensitive_cmp(other[,
     locale=locale.getdefaultlocale()]
     )

to return -1/0/1 like cmp(), or in case sort-order is nonsensical/arbitrary (such as for things like Mandarin glyphs)

  .insensitive_locale_equal(other[,
     locale=locale.getdefaultlocale()]
     )

so you could do something like

 if "i".locale_aware_insensitive_cmp("I"):
   not_equal()
 else:
   equal()

or

 if "i".insensitive_locale_equal("I"):
   equal()
 else:
   not_equal()

because while I know that .upper() or .lower() doesn't work in a lot of cases, I don't really care about the upper/lower'ness of the result; I want to do an insensitive compare (and most of the time it's just for equality, not for sorting). It's my understanding[1] that the same goes for German, where "ß".upper() is traditionally written as "SS" but "SS".lower() is traditionally just "ss" instead of "ß".

So if these language-dependent comparisons were relegated to a well-tested core method of a unicode string, it may simplify the work/issue for you.
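
For what it's worth, here is a rough sketch of the equality half of that idea as a plain function. It assumes Python 3.3 or later, where str.casefold() and unicodedata.normalize() exist. The names insensitive_equal, _fold and the "lang" parameter are made up for illustration, not an existing API, and the only language-specific rule handled is the dotted/dotless "i" of Turkish and Azerbaijani (the "T" entries in Unicode's CaseFolding.txt). It does no locale-aware sorting at all.

```python
import unicodedata

# Hypothetical helper names; str has no such methods.
# Languages that distinguish dotted and dotless "i" when folding case.
_TURKIC = {"tr", "az"}


def _fold(text, lang=None):
    """Return a key suitable for caseless equality tests (not for sorting)."""
    # Compose first so "I" + COMBINING DOT ABOVE becomes U+0130.
    text = unicodedata.normalize("NFC", text)
    if lang in _TURKIC:
        # Turkic tailoring: "I" folds to dotless "ı" (U+0131),
        # and dotted capital "İ" (U+0130) folds to plain "i".
        text = text.replace("I", "\u0131").replace("\u0130", "i")
    # casefold() applies the full default folding, e.g. "ß" -> "ss".
    return unicodedata.normalize("NFC", text.casefold())


def insensitive_equal(a, b, lang=None):
    """Caseless equality, optionally tailored for a Turkic language."""
    return _fold(a, lang) == _fold(b, lang)


if __name__ == "__main__":
    print(insensitive_equal("i", "I"))             # default folding
    print(insensitive_equal("ß", "SS"))            # German sharp s
    print(insensitive_equal("i", "I", lang="tr"))  # Turkish tailoring
```

Passing the language explicitly, rather than reading the process-wide locale, keeps the result of a given call predictable, which is the concern raised in the quoted text above.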

-tkc


[1]
http://en.wikipedia.org/wiki/Letter_case#Special_cases





--
http://mail.python.org/mailman/listinfo/python-list
