Re: fun project - improving calcHash

Martin Nowak Sun, 23 Jun 2013 18:21:18 -0700

On 06/23/2013 11:22 PM, Walter Bright wrote:

Profiling shows the calcHash function is a significant contributor to
compilation time (3.25% of total time). So making it faster is a win.
Even making dmd 1% faster would be a nice win - all those little drops
add up.


You'll find a benchmark at the end of the post.
http://forum.dlang.org/thread/op.v2yky9pdsqu...@dawg-freebsd.lan


Obviously the switch in a for loop is a branch prediction killer.
This should work much better.

for (size_t i = 0; i < len / 4; ++i)
{
    hash *= 37;
    hash += *(const uint32_t *)str;
    str += 4;
}
switch (len & 3)
{
case 0:
    break;
case 1:
    hash *= 37;
    hash += *(const uint8_t *)str;
    break;
case 2:
    hash *= 37;
    hash += *(const uint16_t *)str;
    break;
case 3:
    hash *= 37;
    hash += (*(const uint16_t *)str << 8) +
             ((const uint8_t *)str)[2];
    break;
default:
    assert(0);
}
return hash;

Finding a faster hash algorithm isn't simple because primemultiplication is so simple. At least we're not diving the hash resultby 37 any longer.I think using the Hsieh hash is a good choice because it has quite an OKdistribution so it leads to fewer collisions. It's also pretty fast forshort strings and we already use it in druntime.

Re: fun project - improving calcHash

Reply via email to