Hello,
I am trying to understand a possible issue we have regarding haproxy (seamless)
reloads.
I am using haproxy v1.8.9 with the following config (using nbthread):
global
log 127.0.0.1 local0 info
maxconn 262144
user haproxy
group haproxy
nbproc 1
daemon
stats socket /var/lib/haproxy/stats level admin mode 644 expose-fd listeners
stats timeout 2m
tune.bufsize 33792
ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets
ssl-default-bind-ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
ssl-default-server-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets
ssl-default-server-ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
ssl-server-verify none
crt-base /etc/haproxy/tls/
nbthread 4
cpu-map auto:1/1-4 0-3
defaults
log global
mode http
retries 3
timeout connect 10s
timeout client 180s
timeout server 180s
timeout http-keep-alive 10s
timeout http-request 10s
timeout queue 1s
timeout check 5s
option httplog
option dontlognull
option redispatch
option prefer-last-server
option dontlog-normal
option http-keep-alive
option forwardfor except 127.0.0.0/8
balance roundrobin
maxconn 262134
http-reuse safe
default-server inter 5s fastinter 1s fall 3 slowstart 20s observe layer7 error-limit 1 on-error fail-check
http-check send-state
http-check expect status 200 string passing
listen stats
bind *:8080
stats enable
stats uri /haproxy_stats
frontend fe_main
# arbitrary split in two for http/https traffic
bind *:80 name http_1 process 1/1
bind *:80 name http_2 process 1/2
bind *:443 name https_3 ssl crt /etc/haproxy/tls/fe_main process 1/3
bind *:443 name https_4 ssl crt /etc/haproxy/tls/fe_main process 1/4
[...]
The rest of the config contains lots of ACLs and backends (> 1000).
We do frequent reloads (approximately every 10s).
After a while, some processes remain alive and seem to never exit (waited >24
hours). When I strace them, some of them are still handling traffic and
running health checks. Others exit normally after the reload.
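To make the lingering processes easier to spot, something like the sketch below could list haproxy processes by age. It is only an illustration, under these assumptions: a Linux host whose `ps` (procps) supports the `etimes` field, processes whose command name is exactly `haproxy`, and an arbitrary 60-second threshold for "old". The function names `parse_ps`, `lingering` and `list_lingering` are made up for this example.

```python
import subprocess


def parse_ps(output, name="haproxy"):
    """Return (pid, elapsed_seconds) for each process whose command name is `name`.

    Expects lines in the form produced by `ps -eo pid=,etimes=,comm=`.
    """
    procs = []
    for line in output.splitlines():
        fields = line.split(None, 2)
        # Skip blank or malformed lines.
        if len(fields) < 3 or not fields[0].isdigit() or not fields[1].isdigit():
            continue
        pid, etimes, comm = fields
        if comm.strip() == name:
            procs.append((int(pid), int(etimes)))
    return procs


def lingering(procs, min_age=60):
    """Keep processes at least `min_age` seconds old, oldest first.

    The 60s default is an assumption; with reloads every ~10s, anything
    much older than the longest expected connection is worth a look.
    """
    old = [p for p in procs if p[1] >= min_age]
    return sorted(old, key=lambda p: p[1], reverse=True)


def list_lingering(min_age=60):
    """Run ps and return the lingering haproxy processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid=,etimes=,comm="],
        capture_output=True, text=True, check=True,
    )
    return lingering(parse_ps(result.stdout), min_age)


if __name__ == "__main__":
    for pid, age in list_lingering():
        print(f"pid {pid} has been running for {age}s")
```

The PIDs it reports are the ones worth attaching strace to, or checking for established connections.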
I was wondering how I can help to debug this issue. I assume I won't
have any other info through the stats socket, since it concerns older
processes but maybe I missed something.
Do you have any hint to help me understand what is going on?
Best regards,
--
William