BTW, this thread seems a good place to mention the under-appreciated
SortedContainers package from Grant Jenks:

    http://www.grantjenks.com/docs/sortedcontainers/

This supplies sorted containers (lists, dicts, sets) coded in pure
Python, which generally run at least as fast as C extensions
implementing fancy tree structures.  The basic implementation "trick"
is to maintain, under the covers, a list _of_ Python lists.  Not
deeper than that.  It you're familiar with the lingo, it's most like a
memory-resident 2-level B-tree.  It scales extremely well, to billions
of elements - to sizes the fancy packages can't get near, because
their space overheads are so much higher they run out of RAM.

There's extensive explanation of that here, which should be read to
get a feel for just how much of modern processor behavior hasn't even
been mentioned in _this_ thread ;-)

    http://www.grantjenks.com/docs/sortedcontainers/performance-scale.html

Here's a tease:

"""
For comparison, consider traversing an AVL-binary tree with one
billion elements. A highly optimized implementation will require at
least 24 gigabytes of memory. The binary tree will likely traverse
thirty levels, each of which is a data-dependent lookup. Some lookups
will have good locality but most will not. Each lookup could be
hundreds to thousands of times slower than sequential accesses. These
slow lookups are why Sorted Containers can afford to shift a thousand
sequential elements in memory and have most additions take less time
than binary tree implementations.
"""

It was very much designed with cache behavior in mind, where a hit in
L1 cache can easily be 100 times faster than needing to go to RAM.

That said, the package's SortedSet still uses a Python set under the
covers, because he didn't find anything that could beat that.  The
sorted order is maintained, under the covers, in a parallel
SortedList.  For that reason, adding an element to (or removing one
from) a SortedSet takes O(log(N)) time rather than O(1).
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/UGMP2IEMFE44JCZBHMN2T5UFKPFM5JEF/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to