On Thu, May 28, 2015 at 10:43:37AM -0600, Shawn Heisey wrote:
> On 4/30/2015 11:50 PM, Willy Tarreau wrote:
> > If you're working on preparing the OS, please *do* verify that
> > conntrack is properly tuned (large hash table with at least 1/4 of the
> > total number of sessions). Otherwise under load it will become
> > extremely slow.
> 
> When I asked about recommendations earlier, I was not using the
> firewall, but now circumstances (FTP load balancing) will force me into
> turning the firewall on.  At that point, I expect I will need to pay
> attention to netfilter tuning.
> 
> I found another message on the Internet where you advised someone to
> look at nf_conntrack_max and nf_conntrack_htable_size.  On a recent
> (3.13) kernel, the htable size parameter doesn't seem to exist.  I found
> netfilter/nf_conntrack_max which is set to 65536.  There is also
> /netfilter/nf_conntrack_expect_max which is set to 256.
> 
> Is a value of 65536 for nf_conntrack_max high enough?  I'm definitely no
> expert, but that certainly seems like a pretty high number, although I
> did see one recommendation of 262144, and another where they used 10485760.

No, 64k is pretty low. It is only enough for one connection per source
port to a single destination port between a single source and a single
destination. Add to that the fact that a proxy has connections on both
sides, and you'll saturate at about 32k forwarded connections. And since
a TIME_WAIT entry stays for 60s, those 32k forwarded connections turn
into an absolute maximum of about 500 new connections per second.
That can be fine for most uses. My home DSL proxy doesn't need more, for
example. But a server facing the internet definitely needs a bit more. To
give you an idea, it would only take a bug in a script running on one of
your servers that retries in a loop to fetch an object via haproxy, and
possibly within a few seconds all your entries would be in use, leaving
you to wait for them to expire. At the very least, ensure that a single
client can sustain communication on *all* 64k source ports to a server,
which means 64k entries per side, thus 128k total. That will protect you
against this type of accident.
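For what it's worth, a minimal tuning sketch along these lines might look
as follows (the values are illustrative, not a recommendation; adjust them
to your own expected peak, and note that on recent kernels the hash table
size is a module parameter rather than a sysctl):

```shell
# Raise the maximum number of tracked connections (illustrative value).
sysctl -w net.netfilter.nf_conntrack_max=131072

# Hash table of at least 1/4 of nf_conntrack_max, as advised above.
# On recent kernels this lives under the module's parameters:
echo 32768 > /sys/module/nf_conntrack/parameters/hashsize

# To make it persistent, nf_conntrack_max goes in /etc/sysctl.conf and
# hashsize in a modprobe option, e.g. /etc/modprobe.d/conntrack.conf:
#   options nf_conntrack hashsize=32768
```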

> I would expect normal peak traffic to be below a few hundred requests
> per second.  If we ever saw 1000-2000 per second, I'm not sure our
> current backend hardware could keep up.

The calculation is easy: take the highest rate you'd dream of, say
2000/s. Multiply it by the session's lifetime, which is the sum of all
states from SYN_RECV to TIME_WAIT, and even CLOSE for a firewall.
Typically, with an ideal server responding immediately, you'd have 60s of
TIME_WAIT plus 10s of CLOSE, which is 70s. That gives 140000 concurrent
connections per side, so 280k total. That's about the worst-case scenario
for your input. In practice, haproxy will ensure that no TIME_WAITs are
kept on the connections towards the servers.
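The sizing formula above can be written out as a one-liner (the 2000/s
peak and 70s lifetime are the example figures from this discussion, not
universal constants):

```shell
peak_rate=2000   # highest connection rate you'd dream of, per second
lifetime=70      # seconds per session: 60s TIME_WAIT + 10s CLOSE
sides=2          # a proxy tracks both the client and the server side

# Worst-case number of concurrent conntrack entries:
needed=$((peak_rate * lifetime * sides))
echo "nf_conntrack_max should be at least $needed"   # 280000
```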

It's very important to target the highest possible peak. Some admins
consider it acceptable to break past a certain rate, but that's the worst
thing that can happen: your site failing in front of the largest possible
audience. Very bad publicity. So take some margin, account for your
site's growth, and remember that you will have forgotten about this
tunable in 6 months; you don't want a bad surprise next year.

Regards,
Willy
