Chris Foote wrote:
> Hi all.
>
> I have the need to store a large (10M) number of keys in a hash table,
> based on a tuple of (long_integer, integer).  The standard python
> dictionary works well for small numbers of keys, but starts to
> perform badly for me inserting roughly 5M keys:
>
> # keys   dictionary   metakit   (both using psyco)
> ------   ----------   -------
> 1M             8.8s     22.2s
> 2M            24.0s     43.7s
> 5M           115.3s    105.4s
>
> Has anyone written a fast hash module which is more optimal for
> large datasets ?
>
> p.s. Disk-based DBs are out of the question because most
>      key lookups will result in a miss, and lookup time is
>      critical for this application.
>
> Cheers,
> Chris

Python Bindings (\Python24\Lib\bsddb vers. 4.3.0) and the DLL for BerkeleyDB (\Python24\DLLs\_bsddb.pyd vers. 4.2.52) are included in the standard Python 2.4 distribution.
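
The bsddb bindings store keys and values as strings, so a (long_integer, integer) tuple would have to be packed into a string before it can be used as a key. Below is a minimal sketch of one way that could be done with the standard struct module. The names pack_key, unpack_key and time_inserts are made up for illustration, the 64-bit/32-bit field widths are an assumption about the key ranges, and a plain dict stands in for the database so that the snippet runs without Berkeley DB installed. It is written for a current Python rather than 2.4.

```python
import struct
import time

# Assumed layout: big-endian signed 64-bit "long" part followed by a
# signed 32-bit "int" part, 12 bytes per key in total.
KEY_FORMAT = struct.Struct(">qi")


def pack_key(long_part, int_part):
    """Pack a (long_integer, integer) tuple into a fixed-width byte string."""
    return KEY_FORMAT.pack(long_part, int_part)


def unpack_key(raw):
    """Recover the (long_integer, integer) tuple from a packed key."""
    return KEY_FORMAT.unpack(raw)


def time_inserts(n):
    """Insert n distinct packed keys into a dict; return (table, seconds)."""
    table = {}  # stand-in for the database object
    start = time.perf_counter()
    for i in range(n):
        table[pack_key(i * 1000003, i % 7)] = None
    return table, time.perf_counter() - start


if __name__ == "__main__":
    table, elapsed = time_inserts(100000)
    print("%d keys inserted in %.2fs" % (len(table), elapsed))
    # Most lookups in the described application are misses:
    print(pack_key(1, 0) in table)
```

With a real Berkeley DB hash table the same packed strings would serve as the keys. Whether that beats the dictionary for 5M to 10M inserts would have to be measured on the actual data.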
"Berkeley DB was 20 times faster than other databases. It has the operational speed of a main memory database, the startup and shut down speed of a disk-resident database, and does not have the overhead of a client-server inter-process communication." Ray Van Tassle, Senior Staff Engineer, Motorola Please let me/us know if it is what you are looking for. Claudio -- http://mail.python.org/mailman/listinfo/python-list