On Tue, Mar 24, 2020 at 6:31 AM Andrew Barnert <abarn...@yahoo.com> wrote:
>
> On Mar 23, 2020, at 04:51, Chris Angelico <ros...@gmail.com> wrote:
> >
> > Right, which is why for a proposal like this, it's best to start with
> > the simple and straightforward option of case sensitivity and precise
> > matching. Removing a prefix of "a\u0301" will not remove a leading
> > "\xe1" and vice versa (just as those two strings don't compare equal).
>
> Agreed, but I think it’s not just “to start with”, but forever, or at least
> as long as Python strings are sequences of Unicode code points. If
> "Café".startswith("Cafe\u0301") is false, "Café".stripprefix("Cafe\u0301")
> had better not strip anything. And as long as "é" in "Cafe\u0301" and
> any(ch=="é" for ch in "Cafe\u0301") are false, startswith is correct.
>
> By comparison, in Swift, "Café".hasPrefix("Cafe\u{0301}") is true, because
> "Cafe\u{0301}" is a sequence of four Unicode scalars, the fourth of which is
> 'é', as opposed to Python where it’s a sequence of five Unicode code points.
> And of course Swift also has a slew of methods to do things like localized
> vs. default case-insensitive equality, substring, etc. testing, none of which
> Python has, or should have, as long as its strings are made of code points
> rather than scalars (or EGCs or whatever).
>
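
To make the code-point behaviour concrete, here is a rough sketch. The first half just checks the claims quoted above. The second half is a hypothetical helper, strip_prefix_normalized (my name, not part of the proposal or of any stdlib module), which assumes that normalizing both strings to NFC with unicodedata.normalize before an exact prefix test is an acceptable stand-in for "canonically equivalent prefix". It returns the normalized text rather than a slice of the original, and it won't cover sequences that have no precomposed form.

```python
import unicodedata

composed = "Caf\u00e9"      # ends in U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "Cafe\u0301"   # ends in 'e' + U+0301 COMBINING ACUTE ACCENT

# Code-point semantics: the two spellings are simply different strings.
assert composed != decomposed
assert not composed.startswith(decomposed)
assert "\u00e9" not in decomposed


def strip_prefix_normalized(text, prefix, form="NFC"):
    """Hypothetical helper: normalize both arguments, then strip an exact prefix.

    Assumption: callers are happy to get the *normalized* text back.
    """
    text = unicodedata.normalize(form, text)
    prefix = unicodedata.normalize(form, prefix)
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    return text


# After normalization, either spelling of the prefix matches either spelling
# of the text.
assert strip_prefix_normalized("Cafe\u0301 au lait", "Caf\u00e9") == " au lait"
```

The plain startswith and any exact-match prefix method stay untouched in this sketch; the normalization step is opt-in and lives outside str.
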
Maybe this would be something for the locale or unicodedata module?

ChrisA

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/YOIKAM6MHYVC66BCETPOANGPKVTL6NU3/