[EMAIL PROTECTED] wrote:
>
> Just looking at R5 I knew it wasn't going to do well in this application
> because it's similar to a number of hash functions I tried with the same
> idea in mind: to place similar names together in the same leaf block.
> That turned out to be not very important compared to achieving a
> relatively high fullness of leaf blocks. The problem with R5 when used
> with my htree is, it doesn't give very uniform dispersal.
>
> The bottom line: dx_hack_hash is still the reigning champion.
>
> Now that you provide source for r5 and dx_hack_hash, let me feed my
> collections to them.
> r5: catastrophic
> dx_hack_hash: not bad, but the linear hash is better.
I never expected dx_hack_hash to be particularly good at anything, but
we might as well test the version without the mistake in it - I was
previously using < 0 to test the sign bit - on an unsigned variable :-/
unsigned dx_hack_hash (const char *name, int len)
{
u32 hash0 = 0x12a3fe2d, hash1 = 0x37abe8f9;
while (len--)
{
u32 hash = hash1 + (hash0 ^ (*name++ * 71523));
if (hash & 0x80000000) hash -= 0x7fffffff;
hash1 = hash0;
hash0 = hash;
}
return hash0;
}
The correction gained me another 1% in the leaf block fullness measure.
I will try your hash with the htree index code tomorrow.
--
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/