On Monday, 24 June 2013 at 11:11:42 UTC, qznc wrote:
On Sunday, 23 June 2013 at 21:22:40 UTC, Walter Bright wrote:
https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c#L21

Profiling shows the calcHash function is a significant contributor to compilation time (3.25% of total time). So making it faster is a win. Even making dmd 1% faster would be a nice win - all those little drops add up.

There are many, many string hash functions findable through google. Anyone want to spend the effort to make a faster one? Remember, ya gotta prove it's faster!

A nice timing test would be the time expending compiling Phobos. I would suggest that the 64 bit build of dmd be used for timing tests.

Also, be careful, many of those hash functions on the intarnets have a license that makes it unusable for dmd.

My first idea was to look at Python, since it requires a lot of hash calculation dynamically and probably is one of the most tuned implementations. Interesting how naive it is:

http://hg.python.org/cpython/file/7aab60b70f90/Objects/object.c#l759

Just a simple for loop over chars. No switch.

Wikipedia: "Python Software Foundation License (PSFL) is a BSD-style permissive free software license"

Python is one of the slower interpreted languages. It would be more interesting to look at luajit which actually does something clever. If the string is at least 4 chars long it only hashes the first 4 bytes, the last 4, the 4 starting at floor(len/2)-2 and the 4 starting at floor(len/4)-1. Any of these may overlap of course but that isn't a problem.

The code is MIT licensed and can be found here:
http://repo.or.cz/w/luajit-2.0.git/blob/053041a9f47e3d341f98682ea1e4907a578e4920:/src/lj_str.c#l104

As others have mentioned it will be hard to improve the speed enough to get dmd 1% faster - but if anything can it's dirty tricks.

Reply via email to