Hello Don,

> I had mentioned that I was hoping to remember/find the appropriate
> formula for the probability of a hash bucket reaching its limit.

[SNIP calculation - thanks for sharing your research!]

> I assume that the hash function maps a connection to each bucket
> with equal probability, P = 1/B.

Can you find an analysis that permits some non-uniformity, given as
a parameter?  I am pretty sure that the hash function we use for
conntracking is non-uniform in lots of real-world cases. Thus, if
you want to propose using that probability-derived length limit,
then you must prove uniformity of our hash function.

I have the vague feeling (being a programmer, not a mathematician :)
that even small non-uniformities will have a large impact on your
probability distribution.
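
To put a rough number on that feeling, here is a minimal sketch that
compares the chance of one bucket reaching a limit under the uniform
assumption P = 1/B with the chance for a bucket that the hash favours
slightly. It uses a plain binomial model; the connection count, bucket
count, limit and the 1.5x skew are made-up illustration values, not
figures from this thread or from the conntrack code.

```python
import math

def tail_prob(n, p, limit):
    """P(X >= limit) for X ~ Binomial(n, p).

    Models the chance that one bucket, hit with probability p per
    connection, holds at least `limit` of n connections.
    """
    total = 0.0
    for k in range(limit, n + 1):
        # Work in log space so large binomial coefficients do not overflow.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log1p(-p))
        total += math.exp(log_pmf)
    return total

# Hypothetical parameters, chosen only for illustration.
N_CONNS = 8192      # tracked connections
N_BUCKETS = 1024    # hash buckets (B)
LIMIT = 24          # per-bucket length limit
SKEW = 1.5          # one bucket receives 1.5x its fair share

uniform = tail_prob(N_CONNS, 1.0 / N_BUCKETS, LIMIT)
skewed = tail_prob(N_CONNS, SKEW / N_BUCKETS, LIMIT)

print(f"uniform bucket: {uniform:.3e}")
print(f"skewed bucket:  {skewed:.3e}")
print(f"ratio:          {skewed / uniform:.1f}")
```

The model looks at a single bucket in isolation, so it only shows how
sensitive the tail is to the per-bucket probability; it says nothing
about how skewed the real hash is.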

May I suggest a different, experimental approach: extract the hash function
from the code and build a small user-level tool that reads the saved
contents of some /proc/net/ip_conntrack. With the number of hash buckets and
the limit specified on the command line, it would count the distribution for
that specific set of connections, and maybe compare it with the theoretical
prediction, giving a "how well does this real-world conntrack table fit the
theory" metric. Once done, such a tool could permit command-line selection
of different hash functions, for comparison.
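
A rough sketch of what such a tool could look like follows. Everything
in it is an assumption for illustration: the two hash functions are
stand-ins (a CRC32 and a deliberately weak byte sum), NOT the function
used by the conntrack code, and the parser assumes each saved line
carries src=, dst=, sport= and dport= fields with the original
direction listed first and the protocol number in the second column.

```python
import re
import sys
import zlib
from collections import Counter

FIELD = re.compile(r"\b(src|dst|sport|dport)=(\S+)")

def parse_tuple(line):
    """Return (proto, src, dst, sport, dport) for one saved line, or None."""
    parts = line.split()
    if len(parts) < 2:
        return None
    fields = {}
    for key, value in FIELD.findall(line):
        fields.setdefault(key, value)  # first occurrence = original direction
    if "src" not in fields or "dst" not in fields:
        return None
    return (parts[1], fields["src"], fields["dst"],
            fields.get("sport", "0"), fields.get("dport", "0"))

# Stand-in hash functions; replace with the one extracted from the code.
def hash_crc32(tup, buckets):
    return zlib.crc32("|".join(tup).encode()) % buckets

def hash_sum(tup, buckets):
    return sum("|".join(tup).encode()) % buckets

HASHES = {"crc32": hash_crc32, "sum": hash_sum}

def bucket_stats(lines, buckets, limit, hash_fn):
    """Return (histogram of chain length -> bucket count, buckets >= limit)."""
    counts = Counter()
    for line in lines:
        tup = parse_tuple(line)
        if tup is not None:
            counts[hash_fn(tup, buckets)] += 1
    histogram = Counter(counts.values())
    histogram[0] = buckets - len(counts)  # buckets that stayed empty
    over = sum(1 for c in counts.values() if c >= limit)
    return histogram, over

def main(argv):
    path, buckets, limit = argv[1], int(argv[2]), int(argv[3])
    name = argv[4] if len(argv) > 4 else "crc32"
    with open(path) as saved:
        histogram, over = bucket_stats(saved, buckets, limit, HASHES[name])
    for length in sorted(histogram):
        print(f"{length:4d} entries: {histogram[length]} buckets")
    print(f"buckets at or above limit {limit}: {over}")

if __name__ == "__main__" and len(sys.argv) >= 4:
    main(sys.argv)
```

Invocation would be something like "tool.py saved_conntrack 1024 24 sum"
(file and script names hypothetical). The printed histogram could then
be compared by eye or by script against the theoretical prediction.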

best regards
  Patrick
