On 09/15/11 09:06, MRAB wrote:
It's somewhat unlikely that Unicode will become locale-dependent in
Python because it would cause problems; you don't want:
    "i".upper() == "I"
to be maybe true, maybe false.
An option would be to specify whether it should be locale-dependent.
There have been several times when I've wished that unicode
strings would grow something like
    .locale_aware_insensitive_cmp(other[, locale=locale.getdefaultlocale()])
to return -1/0/1 like cmp(), or, in cases where sort order is
nonsensical/arbitrary (such as for things like Mandarin glyphs),
    .insensitive_locale_equal(other[, locale=locale.getdefaultlocale()])
so you could do something like
    if "i".locale_aware_insensitive_cmp("I"):
        not_equal()
    else:
        equal()
or
    if "i".insensitive_locale_equal("I"):
        equal()
    else:
        not_equal()
because while I know that .upper() or .lower() doesn't work in a
lot of cases, I don't really care about the upper/lower'ness of
the result; I want to do an insensitive compare (and most of the
time it's just for equality, not for sorting). It's my
understanding[1] that German has the same problem:
"ß".upper() is traditionally written as "SS", but "SS".lower() is
traditionally just "ss" instead of "ß", so the round trip is lossy.
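
To make the German case concrete, here is a small sketch assuming Python 3.3 or later, where str.casefold() exists. The helper name caseless_equal is made up for illustration; it is locale-independent and only covers the equality case, not sorting.

```python
# Round-tripping through upper()/lower() loses the sharp s.
upper_form = "ß".upper()      # full case mapping expands the single letter
lower_again = upper_form.lower()


def caseless_equal(a, b):
    """Hypothetical helper: compare two strings by their case-folded forms."""
    # casefold() applies Unicode full case folding, which is meant for
    # caseless matching rather than for display.
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    print(upper_form, lower_again)
    print(caseless_equal("ß", "SS"))
    print("ß".lower() == "SS".lower())
```

Comparing the folded forms sidesteps the question of which case is the "right" one to convert to, which is all an equality test needs.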
So if these language-dependent comparisons were relegated to a
well-tested core method of a unicode string, it may simplify the
work/issue for you.
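
A rough sketch of how the two wished-for methods might behave if written as plain functions, assuming Python 3.3+ for str.casefold(). Everything here is hypothetical: the function names are borrowed from the wish above, locale_name is assumed to be a string such as "tr_TR" (not the tuple that locale.getdefaultlocale() returns), and the only locale tailoring is the Turkic dotted/dotless i rule from Unicode's case folding data. The cmp-style function orders by code point of the folded text, which is not real locale collation.

```python
import unicodedata

# Assumption: only Turkish and Azerbaijani get special treatment here.
_TURKIC_LANGS = ("tr", "az")


def _fold(text, locale_name=None):
    """Build a key for caseless comparison (illustrative, not a real API)."""
    if locale_name and locale_name.lower()[:2] in _TURKIC_LANGS:
        # Turkic tailoring: I folds to dotless i, and dotted capital I folds to i.
        text = text.replace("I", "\u0131").replace("\u0130", "i")
    # Fold case, then normalize so composed and decomposed forms match.
    return unicodedata.normalize("NFD", text.casefold())


def insensitive_locale_equal(a, b, locale_name=None):
    """Hypothetical: True if a and b match ignoring case."""
    return _fold(a, locale_name) == _fold(b, locale_name)


def locale_aware_insensitive_cmp(a, b, locale_name=None):
    """Hypothetical: return -1/0/1 like cmp(), by code point of folded keys."""
    ka, kb = _fold(a, locale_name), _fold(b, locale_name)
    return (ka > kb) - (ka < kb)


if __name__ == "__main__":
    print(insensitive_locale_equal("i", "I"))
    print(insensitive_locale_equal("i", "I", "tr_TR"))
    print(locale_aware_insensitive_cmp("a", "B"))
```

Keeping the locale an explicit, optional argument means the default comparison stays deterministic, while a caller who cares about Turkish can opt in.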
-tkc
[1]
http://en.wikipedia.org/wiki/Letter_case#Special_cases