This is just bar talk at this point. I think we've shown that this is
easy enough to do that programmers can roll their own.
But as idle chat goes, note that in your code:
set(unicodedata.category(ch) for ch in s)
If `s` is a billion characters long, then we make a billion calls to
`unicodedata.category()`. Python function calls are comparatively
expensive, even when iterating over a well optimized data structure
like a string.
In my version:
bool(set(s) & set(unicode_categories['Sc']))
The billion characters are first reduced to a smallish set of hundreds
or thousands of distinct characters without needing method calls. Then
that is intersected with a smallish set of characters in the category.
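
As a rough sketch of that approach: `unicode_categories` is not something
the standard library ships, so the helper below (`chars_in_category`, a name
made up for illustration) builds the 'Sc' set by scanning every code point
once. The scan is a one-time cost; after that each test is just a `set(s)`
plus an intersection.

```python
import sys
import unicodedata


def chars_in_category(cat):
    """Return the set of all characters whose general category is `cat`.

    Hypothetical helper: scans every code point once, so call it once
    and keep the result around.
    """
    return {
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == cat
    }


# Built once, reused for every string we test.
CURRENCY_SYMBOLS = chars_in_category("Sc")


def has_currency_symbol(s):
    # set(s) collapses the string to its distinct characters without any
    # per-character Python-level function call; the intersection is then
    # between two comparatively small sets.
    return bool(set(s) & CURRENCY_SYMBOLS)
```

For example, `has_currency_symbol("price: $5")` is true and
`has_currency_symbol("hello")` is false.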
You could optimize your version, however, simply by using:
set(unicodedata.category(ch) for ch in set(s))
Yours provides more information, since it lists all the categories.
But if you REALLY only care about one category, then you still have to
ask `'Sc' in set(unicodedata.category(ch) for ch in set(s))`. Which
is fine, that's not a hard question to ask.
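
A minimal sketch of the deduplicate-first variant, with function names
(`categories_in`, `has_category`) chosen only for illustration. The point is
that `unicodedata.category()` gets called once per distinct character rather
than once per character of the string.

```python
import unicodedata


def categories_in(s):
    """Return the set of general categories present in `s`."""
    # Deduplicate first: one category() call per distinct character.
    return {unicodedata.category(ch) for ch in set(s)}


def has_category(s, cat):
    """True if any character of `s` is in general category `cat`."""
    return cat in categories_in(s)
```

So `categories_in("a$ 1")` gives the four categories 'Ll', 'Sc', 'Zs' and
'Nd', and `has_category("a$ 1", "Sc")` answers the single-category question.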
On Fri, Jun 2, 2023 at 5:36 PM Chris Angelico <[email protected]> wrote:
>
> On Sat, 3 Jun 2023 at 07:28, David Mertz, Ph.D. <[email protected]> wrote:
> >
> > Sure. That's fine. With sufficiently long strings my code is faster, but
> > for "typical" strings yours will be.
>
> Really? How? Your code has to build a set of every character in the
> string; mine builds a set of every category in the string. Set
> intersection won't be slower for a smaller set.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/[email protected]/message/5C7WSPFDJ4A6LRHL67N7UFPXGU4KI56O/
> Code of Conduct: http://python.org/psf/codeofconduct/
--
The dead increasingly dominate and strangle both the living and the
not-yet born. Vampiric capital and undead corporate persons abuse
the lives and control the thoughts of homo faber. Ideas, once born,
become abortifacients against new conceptions.