Re: HAProxy 2.3.14 sporadic resets

2022-02-20 Thread Tim Düsterhus

Joerg,

On 2/16/22 15:44, Lenhard, Joerg wrote:

I am running HAProxy 2.3.14 in a Kubernetes cluster managed by the 
haproxy-ingress ingress controller: 
https://github.com/jcmoraisjr/haproxy-ingress



Without looking into your issue in too much detail, I'd like to note that 
2.3.14 is outdated: it is almost 6 months old and affected by 79 known bugs: 
https://www.haproxy.org/bugs/bugs-2.3.14.html


There was at least one issue that caused sporadic TCP RSTs to be sent 
when using SNI for backend servers: 
https://github.com/haproxy/haproxy/issues/1495. I'm not entirely sure 
whether this also affects 2.3.x, but the fix was backported, so it probably does.


In any case I recommend upgrading your HAProxy and then reporting back.

Best regards
Tim Düsterhus



HAProxy 2.3.14 sporadic resets

2022-02-16 Thread Lenhard, Joerg
Hi all,

I am running HAProxy 2.3.14 in a Kubernetes cluster managed by the 
haproxy-ingress ingress controller: 
https://github.com/jcmoraisjr/haproxy-ingress

There are sporadic connection resets, and by capturing traffic I could identify 
the proxy as their origin. The author of haproxy-ingress suspected a crash of 
the proxy and suggested reaching out here.
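
For anyone who wants to reproduce this kind of capture, a filter like the 
following shows which side emits the FIN and RST segments (a generic sketch; 
the interface name "any" and port 80 are placeholders for the real values):

```shell
# Capture only TCP segments with the FIN or RST flag set, so the
# direction and origin of the teardown is visible in the trace.
# Replace "any" and "port 80" with the actual interface and backend port.
tcpdump -i any -nn 'tcp[tcpflags] & (tcp-fin|tcp-rst) != 0 and port 80'
```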

What I am seeing is regular traffic until the proxy suddenly sends a FIN/ACK to 
the server. The server is surprised by this and replies with an RST/ACK, which 
the proxy forwards to the client. From the logs of the ingress controller, I 
could see that a reload was in progress at that time and that the connection was 
20 seconds old. In my setup, there are about 4800 backends, with frequent 
changes that require reloads. At the same time, many long-lived TCP connections 
come in, so old processes do not terminate quickly. At times there are more than 
400 haproxy processes running, and there are usually 200-300 processes running 
at any given time.
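
One knob that may bound this accumulation of old processes is the standard 
HAProxy global directive hard-stop-after. This is only a hedged suggestion, 
not something from the thread: it forcibly terminates soft-stopping processes 
after the given delay, which itself kills whatever connections are still open, 
so the value must exceed acceptable connection lifetimes.

```
global
    # Forcibly terminate old (soft-stopping) worker processes after 1h.
    # Connections still open at that point are closed abruptly, so pick
    # a delay longer than the connections you intend to preserve.
    hard-stop-after 1h
```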

I am wondering whether there are known issues that could be the root cause, or 
what I could do to prevent such resets. Any help would be appreciated.

More context: https://github.com/jcmoraisjr/haproxy-ingress/issues/899

Reload happens via:
haproxy -f "$PARAM_CFG" -p "$HAPROXY_PID" -D -sf $OLD_PID -x "$HAPROXY_SOCKET"
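
For reference, an annotated version of the same command (the flags are standard 
HAProxy options; the comments are explanatory only):

```shell
#   -f  : configuration file to load
#   -p  : pid file of the new process
#   -D  : run as a daemon
#   -sf : tell the listed old PIDs to finish serving their connections
#         and exit gracefully ("soft stop")
#   -x  : connect to the old process's stats socket and take over its
#         listening file descriptors (seamless reload; this requires
#         "expose-fd listeners" on the stats socket, which is set below)
haproxy -f "$PARAM_CFG" -p "$HAPROXY_PID" -D -sf $OLD_PID -x "$HAPROXY_SOCKET"
```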

Global and default settings:
global
    daemon
    unix-bind mode 0600
    nbthread 63
    cpu-map auto:1/1-63 0-62
    stats socket /var/run/haproxy/admin.sock level admin expose-fd listeners mode 600
    maxconn 30100
    tune.ssl.default-dh-param 2048
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
    ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets
    ssl-default-server-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-server-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
    pp2-never-send-local

defaults
    log global
    maxconn 30100
    option redispatch
    option http-server-close
    option http-keep-alive
    timeout client          50s
    timeout client-fin      50s
    timeout connect         5s
    timeout http-keep-alive 1m
    timeout http-request    5s
    timeout queue           5s
    timeout server          50s
    timeout server-fin      50s
    timeout tunnel          24h

Thanks and best regards
Joerg