[EMAIL PROTECTED] said unto the world upon 2005-03-29 03:14:
Quoting Brian van den Broek <[EMAIL PROTECTED]>:


I had thought lookup was by hash value, and thus expected the access to some_dict to cause troubles. Yet it worked. Is it that lookup is by hash value, and then equality if need be so as to settle ambiguity, or have I completely misunderstood the mechanism of dictionary lookup?


Hash functions are rarely perfect (in terms of mapping every input to a
different output) [1], so they have to deal with collisions.

AFAIK, the hash table lookup algorithm basically works like this:

    def lookup(self, key):
        hits = self.internalDataStructure[key]
        for key_, value in hits:
            if key_ == key:
                return value
        raise KeyError, 'Key not found'.

(except that it will be written in C and there may be advanced trickery involved
to improve efficiency in other ways)

HTH!


Thanks. The code snip puts into code what I meant by asking if "lookup is by hash value, and then equality if need be so as to settle ambiguity", but the code is clearer than the prose ;-)


[1] The pigeonhole principle [2] tells us that it is rarely possible to make
perfect hash functions.  For example, there are more possible strings than there
are integers [3,4], so, in principle, you could find arbitrarily many strings
all with the same hash value.  In that case, a dictionary with them as keys
would be O(n), not O(1).  But a good hash function will make that very unlikely.
[2] http://en.wikipedia.org/wiki/Pigeonhole_principle
[3] Because every integer n has a corresponding string "n", but there are
strings which do not (directly) correspond to an integer (eg, "foo").
[4] Unless we allow integers to go to infinity... But computers generally do not
have infinite memory :-)


I just about spat a beverage all over my keyboard when I read:

there are more possible strings than there are integers

until I read the footnote and realized that you meant ints not integers (or, integer in the python sense, rather than in the mathematical sense).


I got tempted to start droning on about why I wanted to spit, but there is a limit to how off topic one can go ;-) (For the curious or the insomniacs, due to some cool results in set-theory, if we had infinitely many integers in our computers, we could get a perfect hash function, not just for the strings or the integers, but for all recursively specifiable Python objects, in the same single hash function.)

Thanks for the reply,

Brian vdB

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Reply via email to