On 04/07/2015 07:24 PM, Walter Bright wrote: > https://github.com/D-Programming-Language/dmd/blob/master/src/root/stringtable.c > > uses quadratic probing instead of linked lists, but this is not cache > friendly, either.
Quite to the contrary, it's very cache friendly. The triangular numbers are 1 3 6 10 15 21 With 8 byte per entry as in stringtable you need 2 collisions on the same "bucket" to land 48 bytes away, which might still be the same cacheline. The problem with linear probing is primary clustering, especially when your hash isn't too good, which can lead to long streaks of neighbouring buckets that are filled. Any insert that hashes into such a lump (chances are increasing the bigger it becomes) will make it grow even further. For higher load factor such as 0.75 this can lead to a huge variance of the probe sequence length and extreme lengths on the upper 95/99 percentiles. Here is the ticket for open addressing, btw. https://issues.dlang.org/show_bug.cgi?id=14385