On Fri, 29 Apr 2016, Linus Torvalds wrote:
> Picking a new value almost at random (I say "almost", because I just
> started with that 32-bit multiplicand value that mostly works and
> shifted it up by 32 bits and then randomly added a few more bits to
> avoid long ranges of ones and zeroes), I picked
>
> #define GOLDEN_RATIO_PRIME_64 0x9e3700310c100d01UL
>
> and it is *much* better in my test harness.
>
> Of course, things like that depend on what patterns you test, But I
> did have a "range of strides and hash sizes" I tried. So just for fun:
> try changing GOLDEN_RATIO_PRIME_64 to that value, and see if the
> absolutely _horrid_ page-aligned case goes away for you?

It solves that horrid case:

  https://tglx.de/~tglx/f-ops-h64-t.png

It's faster than the shifts based version but the degradation with
hyperthreading is slightly worse. Here for comparison the 64bit -> 32
shift version:

  https://tglx.de/~tglx/f-ops-wang32-t.png

FYI, that works way better than the existing shift machinery in hash_64
and the modulo prime one:

  https://tglx.de/~tglx/f-ops-mod-t.png

> It really looks like those multiplication numbers were very very badly picked.

Indeed.

> Still, that number doesn't do very well if the hash is small (say, 8
> bits).

I'm still waiting for the other test to complete. Will send numbers
later today.

Thanks,

	tglx
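
For readers who want to poke at the page-aligned case themselves, here is a minimal user-space sketch of a multiplicative hash using the constant quoted above. It is not the harness used for the plots in this thread. The helper names hash_64_mul() and buckets_used() are made up for illustration, the constant is spelled with ULL rather than UL so that it is 64 bits wide on any user-space target, and keeping the top `bits` bits of the product is an assumption about how such a constant would be used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value from the quoted mail; ULL (not UL) so it is 64-bit in user space too. */
#define GOLDEN_RATIO_PRIME_64 0x9e3700310c100d01ULL

/*
 * Hypothetical helper: multiply by the constant and keep the top `bits`
 * bits of the 64-bit product. Valid for 1 <= bits <= 32.
 */
static inline uint32_t hash_64_mul(uint64_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 32);
	return (uint32_t)((val * GOLDEN_RATIO_PRIME_64) >> (64 - bits));
}

/*
 * Hypothetical helper: hash n keys spaced `stride` apart, starting at
 * `start`, into 2^bits buckets and return how many distinct buckets were
 * hit. Limited to bits <= 16 to keep the table small; returns 0 for an
 * unsupported bits value.
 */
static size_t buckets_used(uint64_t start, uint64_t stride, size_t n,
			   unsigned int bits)
{
	static unsigned char seen[1u << 16];
	size_t used = 0;

	if (bits < 1 || bits > 16)
		return 0;

	memset(seen, 0, sizeof(seen));
	for (size_t i = 0; i < n; i++) {
		uint32_t h = hash_64_mul(start + (uint64_t)i * stride, bits);

		if (!seen[h]) {
			seen[h] = 1;
			used++;
		}
	}
	return used;
}
```

Calling buckets_used(0, 4096, 1024, 8) gives the number of distinct 8-bit buckets hit by 1024 page-aligned keys (assuming 4096-byte pages). Comparing that count across strides and hash sizes, or across candidate constants, is one rough way to spot badly clustered cases; it says nothing about speed or hyperthreading behaviour.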