Antoine Pitrou <[email protected]> added the comment:
So, here is the modified benchmark. It first creates a cache of the random
wordlist, because that list takes quite a while to compute (for N up to 10000000). The
cache is reused in subsequent runs. This takes some memory, though (it probably won't
run if you have less than 2 GB of RAM). The cache itself is quite small on disk
(around 80 MB).
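For reference, here is a minimal sketch of that caching step, not the attached
dcbench-py3k.py itself; the file name and word-length parameters are illustrative
assumptions:

    import os
    import pickle
    import random
    import string

    CACHE_FILE = "wordlist-cache.pickle"   # illustrative name, not the real one
    MAX_WORDS = 10000000                   # the upper limit used in the runs below

    def load_wordlist():
        # Reuse the on-disk cache when it exists; otherwise generate it once.
        if os.path.exists(CACHE_FILE):
            with open(CACHE_FILE, "rb") as f:
                return pickle.load(f)
        words = ["".join(random.choice(string.ascii_lowercase)
                         for _ in range(random.randint(3, 10)))
                 for _ in range(MAX_WORDS)]
        with open(CACHE_FILE, "wb") as f:
            pickle.dump(words, f, pickle.HIGHEST_PROTOCOL)
        return words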
I'm outputting the total dict size and the space occupied per individual key,
and I've also timed lookups separately. Here are results on my machine:
    1000 words (     961 keys), 3269137 words/s, 16980987 lookups/s, 51 bytes/key (0.0MB)
   10000 words (    9042 keys), 2697648 words/s, 13680052 lookups/s, 87 bytes/key (0.8MB)
  100000 words (   83168 keys), 2462269 words/s,  6956074 lookups/s, 37 bytes/key (3.0MB)
  500000 words (  389442 keys), 1802431 words/s,  4733774 lookups/s, 64 bytes/key (24.0MB)
 1000000 words (  755372 keys), 1702130 words/s,  4485229 lookups/s, 66 bytes/key (48.0MB)
 2000000 words ( 1463359 keys), 1616658 words/s,  4251021 lookups/s, 68 bytes/key (96.0MB)
 5000000 words ( 3501140 keys), 1608889 words/s,  3909212 lookups/s, 57 bytes/key (192.0MB)
10000000 words ( 6764089 keys), 1531136 words/s,  3600395 lookups/s, 59 bytes/key (384.0MB)
As you can see, the O(1) behaviour is roughly maintained up to the 10000000-word
limit (given the way the data is computed, I'm afraid I can't go further than
that). Of course, very small dicts are faster because everything fits in the CPU
caches.
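To make the reported numbers concrete, here is a minimal sketch of the kind of
measurement loop described above (again, not the attached script; the helper name
is illustrative, and bytes/key here only counts the dict structure itself via
sys.getsizeof, not the keys' own storage):

    import sys
    import time

    def bench(words):
        t0 = time.time()
        d = dict.fromkeys(words)        # insertion; duplicate words collapse into one key
        t1 = time.time()
        for w in words:                 # separate pass timing lookups only
            d[w]
        t2 = time.time()
        nwords, nkeys = len(words), len(d)
        print("%8d words (%8d keys), %d words/s, %d lookups/s, %d bytes/key"
              % (nwords, nkeys,
                 nwords / (t1 - t0), nwords / (t2 - t1),
                 sys.getsizeof(d) // nkeys))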
----------
Added file: http://bugs.python.org/file18520/dcbench-py3k.py
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue9520>
_______________________________________