Martin, Joakim,

thank you for your cttest runs. They both provide new data points,
coming from a router instead of a server, and they confirm that we
should strongly avoid even hash bucket counts: in all examples we have
now, both the original hash and Don's abcd hash deteriorate badly.

The interesting new theme, when comparing my "transproxy server"
statistics with your router statistics, is that for the odd
bucket sizes, the original hash does a lot better in the router
case than it does in the server case.

Today, I wrote a new comparison, using rdtsc() to time the CPU usage
of the hash calculation within cttest. I'll make a cttest-0.2 soon,
so everybody can reproduce it. However, here are the preliminary
results: on a 1 GHz P-III machine, compiled with -march=i686, the
original hash function takes 55ns per call. The abcd hash is better,
at 43ns per call. After inlining the crc32() function, the crc32 hash
takes 185ns per call.
My experience is that scanning a hash list that is one entry longer
costs one more memory roundtrip, at about 200ns latency. Maybe it's
double that with the current ip_conntrack layout, but I'll whine about
that at another time. So even the roughly 130ns that crc32 costs over
the original hash per call is negligible if it saves us one entry of
average list length.
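
For what it's worth, here is roughly how the measurement loop looks.
This is only a minimal sketch: it assumes x86 and GCC inline assembly,
and hash_under_test() is a toy stand-in of my own, not the actual
conntrack hash that cttest exercises.

#include <stdio.h>

/* read the CPU timestamp counter (x86, GCC inline asm) */
static inline unsigned long long rdtsc(void)
{
        unsigned int lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((unsigned long long)hi << 32) | lo;
}

/* toy stand-in for the hash under test */
static unsigned int hash_under_test(unsigned int key)
{
        return (key * 2654435761u) % 32749;
}

int main(void)
{
        enum { LOOPS = 1000000 };
        volatile unsigned int sink = 0;
        unsigned long long start, end;
        unsigned int i;

        start = rdtsc();
        for (i = 0; i < LOOPS; i++)
                sink ^= hash_under_test(i);
        end = rdtsc();

        /* on a 1 GHz CPU one cycle is one nanosecond */
        printf("%llu cycles per call\n", (end - start) / LOOPS);
        return 0;
}

On slower or faster CPUs the cycle count has to be divided by the
clock rate to convert to nanoseconds.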

Overall, these results seem to call for this set of actions:

- make the hash bucket count at least indivisible by 2. This should go
  as a strong recommendation into the documentation, and should be
  enforced in the default initialization code (a sketch follows below
  this list). Anybody volunteering for one or the other? I'll do the
  code part, but I won't do the docs.
- given the clear advantage of the abcd hash over the original hash in
  the server case, make the abcd hash at least a compile-time option,
  or better yet, a module load-time option (see the second sketch
  below). I can do that, too.
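
For the first point, I imagine something along these lines in the
default initialization. ip_conntrack_htable_size is the variable I
recall from ip_conntrack_core.c, but please treat the helper below as
an illustrative sketch only, not a finished patch:

/* force an odd bucket count, so that hash values whose low bits
 * are biased towards even numbers still spread over all buckets */
static unsigned int make_bucket_count_odd(unsigned int size)
{
        if ((size & 1) == 0)
                size++;
        return size;
}

        /* in the init path, after the size has been computed */
        ip_conntrack_htable_size =
                make_bucket_count_odd(ip_conntrack_htable_size);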
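
For the second point, a load-time option in 2.4 could be as simple as
a MODULE_PARM plus a branch in the hash wrapper. The parameter name
hash_abcd and the functions hash_conntrack_orig() and
hash_conntrack_abcd() are placeholders of my own, not existing symbols:

static int hash_abcd = 0;  /* 0 = original hash, 1 = Don's abcd hash */
MODULE_PARM(hash_abcd, "i");
MODULE_PARM_DESC(hash_abcd, "select Don's abcd conntrack hash at load time");

static u_int32_t
hash_conntrack(const struct ip_conntrack_tuple *tuple)
{
        return hash_abcd ? hash_conntrack_abcd(tuple)
                         : hash_conntrack_orig(tuple);
}

A compile-time option would just replace the runtime branch with an
#ifdef, but the load-time switch makes comparisons like the cttest
runs much easier.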

For the code, would one or two patch-o-matic type patches against 2.4.18-rc1
be the desired deliverable?

best regards
  Patrick
