On 06/23/2013 11:22 PM, Walter Bright wrote:
Profiling shows the calcHash function is a significant contributor to
compilation time (3.25% of total time). So making it faster is a win.
Even making dmd 1% faster would be a nice win - all those little drops
add up.
You'll find a benchmark at the end of the post.
http://forum.dlang.org/thread/op.v2yky9pdsqu...@dawg-freebsd.lan
Obviously the switch in a for loop is a branch prediction killer.
This should work much better.
for (size_t i = 0; i < len / 4; ++i)
{
hash *= 37;
hash += *(const uint32_t *)str;
str += 4;
}
switch (len & 3)
{
case 0:
break;
case 1:
hash *= 37;
hash += *(const uint8_t *)str;
break;
case 2:
hash *= 37;
hash += *(const uint16_t *)str;
break;
case 3:
hash *= 37;
hash += (*(const uint16_t *)str << 8) +
((const uint8_t *)str)[2];
break;
default:
assert(0);
}
return hash;
Finding a faster hash algorithm isn't simple because prime
multiplication is so simple. At least we're not diving the hash result
by 37 any longer.
I think using the Hsieh hash is a good choice because it has quite an OK
distribution so it leads to fewer collisions. It's also pretty fast for
short strings and we already use it in druntime.