Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
>Pointers are worse. They can contain all sorts of periodicity. An
>it's different for each and every one of them). A prime modulo is very
>effective to put all such concerns to rest. Maybe there are other
Is Pike doing a lot of pointer hashing?
Incidentally, using a prime module to hash a pointer sounds fine to me,
it just means that the modulo is part of the hash function. The only
thing I find "bad design" is if the hash-table lookup forces a prime
modulo, even if a good hash function has already been used. We should
focus on improving the hash functions in general.
>One more concern is also how it behaves for the ultimately bad hash
>function, i.e. one that returns the same value all the time. In that
>case the hash table is of course O(n) on lookup and store, but it
>shouldn't try to grow the table all the time due to a reprobe limit. I
>don't know if he found a clever way to avoid that.
I don't think that was discussed during the talk.
However, pondering the problem for a brief moment, I can imagine something
like the following:
Keep a relative live count per thread (to avoid thread contention).
Increment it when inserting a new item, decrement it when deleting an item
(Tombstone).
Then whenever it seems necessary to grow the hashtable, quickly sum up all
the live counters and you'll have a fairly accurate population count
of the total hashtable. If the ratio is below a certain threshold, do not
grow the table.
--
Sincerely,
Stephen R. van den Berg.
"Make it as simple as possible. And no simpler."