On 10.08.2010 16:59, Paul C. Anagnostopoulos wrote:
At 8/10/2010 08:00 AM, [email protected] wrote:
Date: Tue, 10 Aug 2010 01:01:42 +0300
From: luben karavelov <[email protected]>
When we look for the bucket to put/get a key we use something like this:
bucket_index = (hash_fn(key)) & hash->mask
An alternative approach that supports arbitrary values for M would be:
bucket_index = (hash_fn(key)) % M
The claimed reason for the current design (masking and a power-of-2 number
of buckets) is that it makes hash table expansion cheaper. I am not so sure
that this is true, because I do not have a solid understanding of the
current expand/resize algorithm.
I'm not sure why it would help with expansion, but it does help with the
bucket index calculation, since it avoids a division. I don't know if
that's worth worrying about. If we did go to an arbitrary-size bucket
vector, we could easily detect whether the size is a power of two (at
expansion time) and then AND the hash value; otherwise we would do a
modulus.
~~ Paul
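A minimal sketch of the mixed scheme Paul describes, for illustration only. The bucket_vec struct and the function names are hypothetical and are not taken from Parrot's hash code; the assumption is that the power-of-two check runs once when the bucket vector is created or resized, so each lookup only tests a cached flag.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor, not Parrot's actual Hash struct. */
typedef struct {
    size_t num_buckets;  /* M, any value >= 1 */
    int    is_pow2;      /* cached when the size is set */
} bucket_vec;

/* A nonzero size is a power of two iff exactly one bit is set. */
static int is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Call whenever the bucket vector is created or expanded. */
static void bucket_vec_set_size(bucket_vec *v, size_t m)
{
    v->num_buckets = m;
    v->is_pow2     = is_power_of_two(m);
}

/* AND with (M - 1) when M is a power of two, modulus otherwise.
 * For a power-of-two M both expressions give the same index. */
static size_t bucket_index(const bucket_vec *v, uint32_t hashval)
{
    if (v->is_pow2)
        return hashval & (v->num_buckets - 1);
    return hashval % v->num_buckets;
}
```

Whether the extra branch costs less than the division it avoids would have to be measured; the sketch only shows that the check itself is cheap and can be kept out of the lookup path.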
It helps with hash expansion because when you double the size of the bucket
store you have to move around only half of the keys, and their new
locations are guaranteed to be free.
Luben
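The property Luben describes can be sketched as follows. This is an illustration under stated assumptions, not Parrot's resize code: the function names are hypothetical, and the table size is assumed to be a power of two that is exactly doubled. After doubling, the mask gains one bit, so an entry either keeps its old index or moves to old index + old size, depending on that one bit of its hash value. With evenly distributed hash values that is about half of the entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Index of a hash value in a table of `size` buckets.
 * `size` must be a power of two. */
static size_t index_for(uint32_t hashval, size_t size)
{
    return hashval & (size - 1);
}

/* Nonzero if the entry has to move when the table grows from
 * old_size to 2 * old_size: only the newly exposed bit matters. */
static int must_move(uint32_t hashval, size_t old_size)
{
    return (hashval & old_size) != 0;
}

/* Where the entry lives after doubling. A moved entry lands in the
 * upper half of the new table, at old index + old_size. */
static size_t index_after_doubling(uint32_t hashval, size_t old_size)
{
    size_t old_index = index_for(hashval, old_size);
    return must_move(hashval, old_size) ? old_index + old_size : old_index;
}
```

Because every moved entry goes to the newly added upper half, and only entries from bucket i can land in bucket i + old_size, the destination never holds entries from any other old bucket. With an arbitrary M and a modulus, no such relation between old and new indices holds in general.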
_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev