On Wed, May 23, 2018 at 08:45:04PM +0200, William Dauchy wrote:
> More details which could help understand what is going on:
>
> ps output:
>
> root 15928 0.3 0.0 255216 185268 ? Ss May21 10:11
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf
> 16988 16912 6340 28271 30590 30334 -x /var/lib/haproxy/stats
> haproxy 6340 2.0 0.0 526172 225476 ? Ssl May22 35:03 \_
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf
> 6328 6315 -x /var/lib/haproxy/stats
> haproxy 28271 1.8 0.0 528720 229508 ? Ssl May22 27:13 \_
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf
> 28258 28207 28232 6340 -x /var/lib/haproxy/stats
> haproxy 30590 265 0.0 527268 225032 ? Rsl 04:35 2188:55 \_
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf
> 30578 28271 6340 -x /var/lib/haproxy/stats
> haproxy 30334 197 0.0 526704 224544 ? Rsl 09:17 1065:59 \_
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf
> 30322 30295 27095 6340 28271 30590 -x /var/lib/haproxy/stats
> haproxy 16912 1.7 0.0 527544 216552 ? Ssl 18:14 0:03 \_
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf
> 16899 28271 30590 30334 6340 -x /var/lib/haproxy/stats
> haproxy 17001 2.2 0.0 528392 214656 ? Ssl 18:17 0:00 \_
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf
> 16988 16912 6340 28271 30590 30334 -x /var/lib/haproxy/stats
>
>
> lsof output:
>
> haproxy 6340 haproxy 5u unix 0xffff883feec97000 0t0
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy 6340 6341 haproxy 5u unix 0xffff883feec97000 0t0
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy 6340 6342 haproxy 5u unix 0xffff883feec97000 0t0
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy 6340 6343 haproxy 5u unix 0xffff883feec97000 0t0
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy 17020 haproxy 5u unix 0xffff883feec97000 0t0
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy 17020 17021 haproxy 5u unix 0xffff883feec97000 0t0
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy 17020 17022 haproxy 5u unix 0xffff883feec97000 0t0
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy 17020 17023 haproxy 5u unix 0xffff883feec97000 0t0
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy 28271 haproxy 5u unix 0xffff883feec97000 0t0
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy 28271 28272 haproxy 5u unix 0xffff883feec97000 0t0
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy 28271 28273 haproxy 5u unix 0xffff883feec97000 0t0
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy 28271 28274 haproxy 5u unix 0xffff883feec97000 0t0
> 679289634 /var/lib/haproxy/stats.15928.tmp
>
> (So on unhealthy nodes, I find old processes which are still linked to
> the socket.)
>
> The provisioning part is also seeing data which are supposed to be
> already updated through the runtime API. I suspect I am getting old
> data when connecting to the unix socket. The later being still attached
> to an old process?
> Indeed, if I try
> for i in {1..500}; do sudo echo "show info" | sudo socat stdio
> /var/lib/haproxy/stats | grep Pid; done
>
> I get "Pid: 17001" most of the time, which is the last process
> but I sometimes get: "Pid: 28271"(!) which is a > 24 hours old
> process.
>
> Is there something we are doing wrongly?
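
To put a number on how often an old process answers on the socket, the
shell loop quoted above could be turned into a small tally script. This
is only a sketch and has not been run on the affected nodes. It assumes
the stats socket answers one command per connection and then closes it,
and that the "show info" reply contains a "Pid: N" line as in the output
quoted above. The names query_stats, extract_pid and tally_pids are
hypothetical helpers made up for this illustration.

```python
import collections
import re
import socket

# Socket path taken from the "-x /var/lib/haproxy/stats" argument quoted above.
STATS_SOCKET = "/var/lib/haproxy/stats"


def query_stats(path, command="show info"):
    """Send one command to the stats socket and return the whole reply.

    Assumption: the socket is in its default one-command-per-connection
    mode, so the peer closes the connection once the reply is sent.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(command.encode() + b"\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: peer closed, reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode(errors="replace")


def extract_pid(info_text):
    """Return the integer after 'Pid:' in a 'show info' reply, or None."""
    match = re.search(r"^Pid:\s*(\d+)\s*$", info_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def tally_pids(replies):
    """Count how many replies came from each PID."""
    return collections.Counter(extract_pid(reply) for reply in replies)


if __name__ == "__main__":
    # Same 500 iterations as the shell loop; needs read access to the socket.
    counts = tally_pids(query_stats(STATS_SOCKET) for _ in range(500))
    for pid, hits in counts.most_common():
        print(f"Pid {pid}: {hits} replies")
```

If more than one PID shows up in the tally, several processes are still
accepting connections on the same listening socket, which is what the
lsof output suggests.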

After some more testing, I don't have this issue using haproxy v1.8.8
(rolled back for more than 12 hours). I hope I am not speaking too soon.
--
William