Hi Ricardo,

On Thu, Mar 28, 2024 at 06:21:16PM -0300, Ricardo Nabinger Sanchez wrote:
> Hi Willy,
> 
> On Thu, 28 Mar 2024 04:37:11 +0100
> Willy Tarreau <w...@1wt.eu> wrote:
> 
> > Thanks guys! So there seems to be an annoying bug. However I'm not sure
> > how this is related to your "connection refused", except if you try to
> > connect at the moment the process crashes and restarts, of course. I'm
> > seeing that the bug here is stktable_requeue_exp() calling task_queue()
> > with an invalid task expiration. I'm having a look now. I'll respond in
> > the issue with what I can find, thanks for your report.
> 
> These "connection refused" is from our watchdog; but the effects are as
> perceptible from the outside.  When our watchdog hits this situation,
> it will forcefully restart HAProxy (we have 2 instances) because there
> will be a considerable service degradation.  If you remember, there's
> https://github.com/haproxy/haproxy/issues/1895 and we talked briefly
> about this in person, at HAProxyConf.

Thanks for the context. I remembered we discussed this for a while, but
I had obviously forgotten about the issue in question, given the number
of issues I'm dealing with :-/

In the issue above I'm seeing a comment from Felipe saying that telnet
to port 80 can take up to 3 seconds to be accepted. That really makes me
think about either the SYN queue being full, causing drops and retransmits,
or a lack of socket memory to accept packets. The latter could possibly be
caused by tcp_mem not being large enough, due to some transfers to fast
clients over high-latency paths taking a lot of RAM, but it should not
affect the local UNIX socket. Also, killing the process means killing all
the associated connections and will definitely free a huge amount of
network buffers, which would be consistent with that theory. If you have
two instances, did you notice whether the two start to behave badly at the
same time? If so, that would definitely point to a resource-based cause
like socket memory etc.
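
If you want to dig in that direction the next time it happens, something
like this (purely illustrative; these are the standard Linux procfs
entries and netstat counters) would show whether you are anywhere near
tcp_mem or dropping SYNs:

    # current TCP memory usage (in pages) vs the min/pressure/max thresholds
    grep TCP /proc/net/sockstat
    cat /proc/sys/net/ipv4/tcp_mem

    # listen queue overflows and SYNs dropped since boot
    netstat -s | grep -iE 'listen|SYN'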

> But this is incredibly elusive to reproduce; it comes and goes.  It
> might happen every few minutes, or not happen at all for months.  Not
> tied to a specific setup: different versions, kernels, machines.  In
> fact, we do not have better ways to detect the situation, at least not
> as fast, reactive, and resilient.

It might be useful to take periodic snapshots of /proc/slabinfo and
see if something jumps during such incidents (grep for TCP, net, skbuff
there). I guess you have not noticed any "out of socket memory" or
similar indications in your kernel logs, of course :-/
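
For example (just a sketch, adjust the interval and the log path to your
taste), a small loop like this gives you something to diff once an
incident happens:

    # append a timestamped snapshot of the TCP/net/skbuff slabs every 10s
    while sleep 10; do
        date
        grep -E 'TCP|net|skbuff' /proc/slabinfo
    done >> /var/tmp/slabinfo.log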

Another one that could make sense to monitor is "PoolFailed" in
"show info". It should always remain zero.

> > Since you were speaking about FD count and maxconn at 900k, please let
> > me take this opportunity for a few extra sanity checks. By default we
> > assign up to about 50% of the FD to pipes (i.e. up to 25% pipes compared
> > to connections), so if maxconn is 900k you can reach 1800 + 900 = 2700k
> > FD. One thing to keep in mind is that /proc/sys/fs/nr_open sets a
> > per-process hard limit and usually is set to 1M, and that
> > /proc/sys/fs/file-max sets a system-wide limit and depends on the amount
> > of RAM, so both may interact with such a large setting. We could for
> > example imagine that at ~256k connections with as many pipes you're
> > reaching around 1M FDs and that the connection from socat to the CLI
> > socket cannot be accepted and is rejected. Since you recently updated
> > your kernel, it might be worth checking if the default values are still
> > in line with your usage.
> 
> We set our defaults pretty high in anticipation:
> 
>       /proc/sys/fs/file-max = 5M;
>       /proc/sys/fs/nr_open = 3M;

OK!
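
For reference, a quick way to re-check the effective values after a
kernel or distro update (nothing HAProxy-specific, just the standard
procfs entries; the pidof call assumes a single process name):

    # system-wide and per-process FD limits
    cat /proc/sys/fs/file-max /proc/sys/fs/nr_open
    # what the running haproxy process was actually granted
    grep 'open files' /proc/$(pidof -s haproxy)/limits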

> Even with our software stack, we do not reach the limits.  A long time
> ago we did hit them (limits were lower back then) and the effects were
> devastating.

Yes, that's always the problem with such limits: they hit you at the
worst possible moments, when the most users are counting on you to work
fine and when it's hardest to spot anomalies.

Willy
