James Y Knight wrote:
> On Jul 18, 2006, at 1:54 PM, Martin v. Löwis wrote:
>
>> Mihai Ibanescu wrote:
>>> To follow up on my own email: it looks like, even though in some locale
>>> "INFO".lower() != "info"
>>>
>>> u"INFO".lower() == "info" (at least in the Turkish locale).
>>>
>>> Is that guaranteed, at least for now (for the current versions of
>>> python)?
>> It's guaranteed for now; unicode.lower is not locale-aware.
>
> That seems backwards of how it should be ideally: the byte-string
> upper and lower should always do ascii uppering-and-lowering, and the
> unicode ones should do it according to locale. Perhaps that can be
> cleaned up in py3k?

Actually, you've got that backwards ;-) ... There are no .lower()/.upper()
methods for bytes.

The reason these methods are locale-aware for 8-bit strings is that
we're using the C lib functions, which depend on the locale setting,
with all the drawbacks that go with it.

The Unicode database, OTOH, *defines* the upper/lower case mapping in a
locale-independent way, so the mappings are guaranteed to always produce
the same results on all platforms.

-- 
Marc-Andre Lemburg
eGenix.com
(#1, Jul 18 2006)
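
For illustration, here is a minimal sketch of the locale-independent behaviour described above. It assumes Python 3, where `str` is the Unicode string type (the `u"..."` strings of this thread). The helper name `lower_under_locale` and the locale name `tr_TR.UTF-8` are made up for the example; the Turkish locale may not be installed on a given system, so the helper reports whether the switch succeeded instead of failing.

```python
import locale


def lower_under_locale(text, locale_name):
    """Lowercase `text` after trying to switch LC_CTYPE to `locale_name`.

    Returns a (lowered_text, locale_was_applied) tuple. The previous
    LC_CTYPE setting is restored before returning.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query only, no change
    try:
        try:
            locale.setlocale(locale.LC_CTYPE, locale_name)
            applied = True
        except locale.Error:
            # Locale not available on this system; carry on regardless.
            applied = False
        # str.lower() uses the Unicode case mappings, not the C library.
        return text.lower(), applied
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


lowered, applied = lower_under_locale("INFO", "tr_TR.UTF-8")
print(lowered)   # "info", whether or not the Turkish locale was applied
print(applied)   # depends on which locales are installed
```

The point of the sketch is only that the result of lowercasing a Unicode string does not change with the active locale; it says nothing about how 8-bit strings behave.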