At 8/11/2010 08:00 AM, [email protected] wrote:
>> It helps with hash expansion because when you double the size of the
>> bucket store, you only have to move around half of the keys, and their
>> new locations are guaranteed to be free.
>
>I'm not sure I believe that's the case with our implementation. Besides that,
>I was under the impression (see The Practice of Programming) that picking a
>prime number for the array size produces a better distribution of keys in
>buckets. The argument is that the lack of a common factor between the array
>size, the hash seed, and likely hash values produces a better distribution.
>
>Maybe we need some sort of intrusive benchmark with likely hash values that
>shows the statistical goodness or badness of our implementation.
I believe the current implementation, when enlarging a hash table, does make
use of the fact that the table sizes are powers of 2 and that the new table is
double in size. That does make for faster resizing. If we go to a scheme using
a hash table size vector, we could detect the cases where we are doubling a
power-of-2-sized table and use the faster algorithm.
There are so many design variants with hash tables. We certainly do need a
benchmark against which to evaluate design decisions.
~~ Paul
----------------------------------------------------------------
Windfall Software          Paul C. Anagnostopoulos
www.windfall.com           978 371-2316
----------------------------------------------------------------
Metaphorical invocations ... often suffer from the
weakness of giving such satisfaction to the human
mind that they tend to be mistaken for incisive and
illuminating observations. ---Torkel Franzen

_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev