Quoting Brian van den Broek <[EMAIL PROTECTED]>:
I had thought lookup was by hash value, and thus expected the access to some_dict to cause troubles. Yet it worked. Is it that lookup is by hash value, and then equality if need be so as to settle ambiguity, or have I completely misunderstood the mechanism of dictionary lookup?
Hash functions are rarely perfect (in terms of mapping every input to a different output) [1], so they have to deal with collisions.
AFAIK, the hash table lookup algorithm basically works like this:
def lookup(self, key): hits = self.internalDataStructure[key] for key_, value in hits: if key_ == key: return value raise KeyError, 'Key not found'.
(except that it will be written in C and there may be advanced trickery involved to improve efficiency in other ways)
HTH!
Thanks. The code snip puts into code what I meant by asking if "lookup is by hash value, and then equality if need be so as to settle ambiguity", but the code is clearer than the prose ;-)
[1] The pigeonhole principle [2] tells us that it is rarely possible to make perfect hash functions. For example, there are more possible strings than there are integers [3,4], so, in principle, you could find arbitrarily many strings all with the same hash value. In that case, a dictionary with them as keys would be O(n), not O(1). But a good hash function will make that very unlikely. [2] http://en.wikipedia.org/wiki/Pigeonhole_principle [3] Because every integer n has a corresponding string "n", but there are strings which do not (directly) correspond to an integer (eg, "foo"). [4] Unless we allow integers to go to infinity... But computers generally do not have infinite memory :-)
I just about spat a beverage all over my keyboard when I read:
there are more possible strings than there are integers
until I read the footnote and realized that you meant ints not integers (or, integer in the python sense, rather than in the mathematical sense).
I got tempted to start droning on about why I wanted to spit, but there is a limit to how off topic one can go ;-) (For the curious or the insomniacs, due to some cool results in set-theory, if we had infinitely many integers in our computers, we could get a perfect hash function, not just for the strings or the integers, but for all recursively specifiable Python objects, in the same single hash function.)
Thanks for the reply,
Brian vdB
_______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor