On Tue, Mar 24, 2020 at 6:31 AM Andrew Barnert <abarn...@yahoo.com> wrote:
>
> On Mar 23, 2020, at 04:51, Chris Angelico <ros...@gmail.com> wrote:
> >
> > Right, which is why for a proposal like this, it's best to start with
> > the simple and straightforward option of case sensitivity and precise
> > matching. Removing a prefix of "a\u0301" will not remove a leading
> > "\xe1" and vice versa (just as those two strings don't compare equal).
>
> Agreed, but I think it’s not just “to start with”, but forever, or at least 
> as long as Python strings are sequences of Unicode code points. If 
> "Café".startswith("Cafe\u0301") is false, "Café".stripprefix("Cafe\u0301") 
> had better not strip anything. And as long as "é" in "Cafe\u0301" and 
> any(ch=="é" for ch in "Cafe\u0301") are false, startswith is correct.
>
> By comparison, in Swift, "Café".hasPrefix("Cafe\u{0301}") is true, because
> "Cafe\u{0301}" is a sequence of four Characters (extended grapheme clusters),
> the fourth of which is 'é', as opposed to Python where it’s a sequence of
> five Unicode code points. And of course Swift also has a slew of methods to
> do things like localized vs. default case-insensitive equality, substring,
> etc. testing, none of which Python has, or should have, as long as its
> strings are made of code points rather than EGCs (or whatever).
>

Maybe this would be something for the locale or unicodedata module?
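
For illustration only, here is a rough sketch of what a normalization-aware
variant built on unicodedata might look like. The helper name
strip_prefix_normalized and its signature are hypothetical, not an existing or
proposed API; the only real library call used is unicodedata.normalize. Note
that it returns the normalized text, so the result may differ from the input
even when nothing is stripped.

```python
import unicodedata


def strip_prefix_normalized(text, prefix, form="NFC"):
    """Hypothetical helper: strip `prefix` from `text` after normalizing both.

    Both strings are converted to the same normalization form first, so
    canonically equivalent spellings (e.g. "\xe9" vs "e\u0301") match.
    """
    text_n = unicodedata.normalize(form, text)
    prefix_n = unicodedata.normalize(form, prefix)
    if text_n.startswith(prefix_n):
        return text_n[len(prefix_n):]
    return text_n


composed = "Caf\xe9 au lait"   # precomposed U+00E9
decomposed = "Cafe\u0301"      # "e" followed by U+0301 COMBINING ACUTE ACCENT

# Plain str methods compare code points exactly, so this does not match.
exact_match = composed.startswith(decomposed)

# After normalizing both sides, the prefix matches and is removed.
remainder = strip_prefix_normalized(composed, decomposed)
```

The default of NFC is a deliberate choice in this sketch: under NFD, a prefix
of "Cafe" would match "Cafe\u0301" and leave a dangling combining accent
behind, whereas under NFC the two do not match.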

ChrisA
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/YOIKAM6MHYVC66BCETPOANGPKVTL6NU3/
Code of Conduct: http://python.org/psf/codeofconduct/
