Re: fun project - improving calcHash

Anders Halager Mon, 24 Jun 2013 13:11:12 -0700

On Monday, 24 June 2013 at 11:11:42 UTC, qznc wrote:

On Sunday, 23 June 2013 at 21:22:40 UTC, Walter Bright wrote:
https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c#L21
Profiling shows the calcHash function is a significantcontributor to compilation time (3.25% of total time). Somaking it faster is a win. Even making dmd 1% faster would bea nice win - all those little drops add up.
There are many, many string hash functions findable throughgoogle. Anyone want to spend the effort to make a faster one?Remember, ya gotta prove it's faster!
A nice timing test would be the time expending compilingPhobos. I would suggest that the 64 bit build of dmd be usedfor timing tests.
Also, be careful, many of those hash functions on theintarnets have a license that makes it unusable for dmd.
My first idea was to look at Python, since it requires a lot ofhash calculation dynamically and probably is one of the mosttuned implementations. Interesting how naive it is:
http://hg.python.org/cpython/file/7aab60b70f90/Objects/object.c#l759

Just a simple for loop over chars. No switch.
Wikipedia: "Python Software Foundation License (PSFL) is aBSD-style permissive free software license"

Python is one of the slower interpreted languages. It would bemore interesting to look at luajit which actually does somethingclever.If the string is at least 4 chars long it only hashes the first 4bytes, the last 4, the 4 starting at floor(len/2)-2 and the 4starting at floor(len/4)-1. Any of these may overlap of coursebut that isn't a problem.


The code is MIT licensed and can be found here:
http://repo.or.cz/w/luajit-2.0.git/blob/053041a9f47e3d341f98682ea1e4907a578e4920:/src/lj_str.c#l104

As others have mentioned it will be hard to improve the speedenough to get dmd 1% faster - but if anything can it's dirtytricks.

Re: fun project - improving calcHash

Reply via email to