Re: Turkic I and re

Tim Chase Thu, 15 Sep 2011 08:03:31 -0700

On 09/15/11 09:06, MRAB wrote:

It's somewhat unlikely that Unicode will become locale-dependent in
Python because it would cause problems; you don't want:


      "i".upper() == "I"

to be maybe true, maybe false.

An option would be to specify whether it should be locale-dependent.

There have been several times when I've wished that unicode strings would grow something like


  .locale_aware_insensitive_cmp(other[,
     locale=locale.getdefaultlocale()]
     )

to return -1/0/1 like cmp(), or, in case the sort order is nonsensical/arbitrary (such as for things like Mandarin glyphs),


  .insensitive_locale_equal(other[,
     locale=locale.getdefaultlocale()]
     )

so you could do something like

 if "i".locale_aware_insensitive_cmp("I"):
   not_equal()
 else:
   equal()

or

 if "i".insensitive_locale_equal("I"):
   equal()
 else:
   not_equal()

because while I know that .upper() or .lower() doesn't work in a lot of cases, I don't really care about the upper/lower'ness of the result; I want to do an insensitive compare (and most of the time it's just for equality, not for sorting). It's my understanding[1] that the same goes for German, where "ß".upper() is traditionally written as "SS" but "SS".lower() is traditionally just "ss" instead of "ß".
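
For illustration, here is a small check of that German round trip. It assumes Python 3.3 or later, where str.casefold() exists; that method is not part of the proposal above. The variable names are only for the example.

```python
# Upper-casing the sharp s expands it to two letters...
upper = "ß".upper()
# ...and lower-casing the result does not bring the original back.
round_trip = upper.lower()

# A plain .lower() comparison therefore misses this pair,
lower_equal = "Straße".lower() == "STRASSE".lower()
# while full Unicode case folding treats them as equal.
folded_equal = "Straße".casefold() == "STRASSE".casefold()

print(upper, round_trip, lower_equal, folded_equal)
```

The point is that equality via case folding sidesteps the question of what the "right" upper or lower form is.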

So if these language-dependent comparisons were relegated to a well-tested core method of a unicode string, it may simplify the work/issue for you.
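
A minimal sketch of the two wished-for methods, written as free functions since str has no such methods. Everything here is an assumption for illustration: the function names are borrowed from the pseudocode above, the lang parameter stands in for the locale argument, the Turkic handling covers only the dotted/dotless I pairs for "tr" and "az", and ordering is by folded code points rather than real locale collation. It assumes Python 3.3+ for str.casefold().

```python
import unicodedata

# Hypothetical set of language codes that use the dotted/dotless I pairs.
_TURKIC = {"tr", "az"}

def _fold(s, lang=None):
    """Return a case-folded key for s (illustrative helper, not a real API)."""
    # Compose first so "I" + combining dot above becomes the single "İ".
    s = unicodedata.normalize("NFC", s)
    if lang in _TURKIC:
        # In Turkish/Azeri, "I" pairs with dotless "ı" and "İ" pairs with "i".
        s = s.replace("I", "ı").replace("İ", "i")
    return s.casefold()

def insensitive_locale_equal(a, b, lang=None):
    """True if a and b are equal ignoring case under the given language."""
    return _fold(a, lang) == _fold(b, lang)

def locale_aware_insensitive_cmp(a, b, lang=None):
    """Return -1/0/1 like the old cmp(), comparing folded code points."""
    fa, fb = _fold(a, lang), _fold(b, lang)
    return (fa > fb) - (fa < fb)

if __name__ == "__main__":
    print(insensitive_locale_equal("i", "I"))             # default rules
    print(insensitive_locale_equal("i", "I", lang="tr"))  # Turkic rules
    print(locale_aware_insensitive_cmp("abc", "ABD"))
```

Passing the language explicitly, rather than reading a process-wide default, keeps "i".upper() == "I" and similar plain calls deterministic while still letting the caller opt in to locale-dependent behaviour.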


-tkc


[1]
http://en.wikipedia.org/wiki/Letter_case#Special_cases





--
http://mail.python.org/mailman/listinfo/python-list
