On 06/23/2013 09:20 PM, Juan Manuel Cabo wrote: > On 06/23/2013 06:22 PM, Walter Bright wrote: >> https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c#L21 >> >> Profiling shows the calcHash function is a significant contributor to >> compilation time (3.25% of total time). So making >> it faster is a win. Even making dmd 1% faster would be a nice win - all >> those little drops add up. >> >> There are many, many string hash functions findable through google. Anyone >> want to spend the effort to make a faster >> one? Remember, ya gotta prove it's faster! >> >> A nice timing test would be the time expending compiling Phobos. I would >> suggest that the 64 bit build of dmd be used >> for timing tests. >> >> Also, be careful, many of those hash functions on the intarnets have a >> license that makes it unusable for dmd. > > > I performed a quick test, and I don't think that the original function > can be improved for speed (though it might be improved for less > collisions): > > https://gist.github.com/jmcabo/5847036 > > I used words and lines from the complete works of Shakespeare. > I tested separating the loop from the switch, as was suggested > in Timon Gehr post. It didn't improve the speed when compiled > with "g++ -O3", though it improved it a bit without -O3. > > I also tested removing the multiplication by 37. It didn't > improve the speed. With "g++ -O3" they are all the same. > > So, unless I'm measuring it wrong, the algorithm is as fast > as can be (as fast as just adding chars). > > --jm > >
Oh, it might be improved by loading 128bits at a time instead of 32bits... but that would benefit strings of more than 16 bytes. Google's CitiHash seems tuned for large strings. --jm