John Machin wrote: > On Aug 26, 7:36 pm, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: >> Martin Marcher wrote: >> > On 2008-08-26 00:32:20, cnb wrote: >> >> Are dictionaries the same as hashtables? >> >> > Yes, but there is nothing in there that does sane collision handling >> > like making a list instead of simply overwriting. >> >> The term "collision" is rather well defined when talking about >> associative arrays using hashing (which python's dicts are): it means >> that two distinct keys produce the same hash-value, leading to a bucket >> collision. And of course python deals with that sanely, and creates a >> list of objects inside the bucket which is traversed and comparison is >> used to determine which actual value to retrieve. > > I see neither buckets nor lists in the CPython svn trunk version of > dictobject.c and IIRC it's been using open addressing for a very long > time.
I don't know the exact names of the involved structures - I named them liberally from my understanding of how associative arrays based on hashing are implemented. But the below code shows that hash-collisions can occur without corrupting data, while OTOH of course degenerating the lookup from usually O(1) to O(n). Try playing around with it, commenting out the proper hashing of the key will lead to great performance enhancements. Diez import pprint class StupidThing(object): def __init__(self, key): self.key = key def __hash__(self): #return hash(self.key) return 1 def __cmp__(self, other): return cmp(self.key, other.key) def __repr__(self): return "<%s>" % self.key d = {} for a in xrange(1000): d[StupidThing(str(a))] = a pprint.pprint(d) -- http://mail.python.org/mailman/listinfo/python-list