Re: Reserving/Preallocating associative array?

Benjamin Thaut Fri, 27 Dec 2013 01:41:41 -0800

Am 25.12.2013 17:12, schrieb Gordon:

Good question, I did test each one incrementally:


1. Switching from "split+to!size_t" to "parse" (and commenting out the
"union") reduced running time from ~26s to ~20s (on average).

2. Switching from "byLine" to "byLineFast" (and commenting out the
"union") reduced running time from ~20s to ~14s (on average).

3. With "parse" and "byLineFast", and with GC still on, and populating
the "union" the program took about 2m45s .

4. Adding "GC.disable" brought it down to 25s.

HTH,
  -gordon

So I was interrested how far you can take this. The version bearophileposted runs in 14 seconds (48420527 ticks) on my machine.

I created a new version using my own hashmap implementation and manualmemory management (no GC present at all).

This version runs in 12 seconds (41614375 ticks)

Preallocating all the entries for the hashmap brings quite a boost. Itbrings it down to 9 seconds (32926550 ticks)

If you replace the default hash function D uses with the MurMur hashfunction it brings it down even more to 8 seconds (29425713 ticks)

I also tried not using any hash function at all, but that made the timeskyrocket.

I stopped at that point, because now the most time is spend looking fora free entry in the hashmap, which pretty much comes down to cache-misses.


At this point the profiler looked something like this:
50%  find free entry in hashmap
21%  parse uint
9%   find end of line + tab
3.5% read data from disk


The ticks in my measurements have been obtained via QueryPerformanceCounter.

Kind Regards
Benjamin Thaut

Re: Reserving/Preallocating associative array?

Reply via email to