Hi Janusz,

On Thu, May 24, 2018 at 01:49:52PM +0200, Janusz Dziemidowicz wrote:
> Recently I've moved several servers from haproxy 1.7.x to 1.8.x I have
> a setup with nghttpx handling h2 (haproxy connects to nghttpx via unix
> socket which handles h2 and connects back to haproxy with plain
> http/1.1 also through unix socket).
>
> After the upgrade I wanted to switch to native h2 supported by
> haproxy. Unfortunately, it seems that over time haproxy is
> accumulating sockets in CLOSE_WAIT state. Currently, after 12h I have
> 5k connections in this state. All of them have non-zero Recv-Q and
> zero Send-Q. netstat -ntpa shows something like this:
>
> tcp        1      0 IP:443    IP:28032    CLOSE_WAIT  115495/haproxy
> tcp       35      0 IP:443    IP:49531    CLOSE_WAIT  115495/haproxy
> tcp      507      0 IP:443    IP:31938    CLOSE_WAIT  115495/haproxy
> tcp      134      0 IP:443    IP:49672    CLOSE_WAIT  115495/haproxy
> tcp      732      0 IP:443    IP:3180     CLOSE_WAIT  115494/haproxy
> tcp      746      0 IP:443    IP:39731    CLOSE_WAIT  115494/haproxy
> tcp       35      0 IP:443    IP:62986    CLOSE_WAIT  115495/haproxy
> tcp      585      0 IP:443    IP:51318    CLOSE_WAIT  115493/haproxy
> tcp      100      0 IP:443    IP:60449    CLOSE_WAIT  115493/haproxy
> tcp       35      0 IP:443    IP:1274     CLOSE_WAIT  115494/haproxy
> ..

I never managed to see this happen yet. Even haproxy.org uses H2 and I've
just checked on the server, zero CLOSE_WAIT. What is strange is that they
all have pending data, it means they sent some data and closed. It could
correspond to a timeout where the client finally closed not receiving a
response.

> Those are all frontend connections. Reloading haproxy removes those
> connections, but only after hard-stop-after kicks in and old processes
> are killed. Disabling native h2 support and switching back to nghttpx
> makes the problem disappear.

OK.

> This kinda seems like the socket was closed on the writing side, but
> the client has already sent something and everything is stuck. I was
> not able to reproduce the problem by myself. Any ideas how to debug
> this further?

For now not much comes to my mind. I'd be interested in seeing the output
of "show fd" issued on the stats socket of such a process (it can be large,
be careful).

> haproxy -vv (Debian package rebuilt on stretch with USE_TFO):

Interesting, and I'm seeing "tfo" on your bind line. We don't have it on
haproxy.org. Could you please re-test without it, just in case? Maybe
you're receiving SYN+data+FIN that are not properly handled.

> HA-Proxy version 1.8.9-1~tsg9+1 2018/05/21

Is 1.8.9 the first version you tested or is it the first one you saw the
issue on, or did you notice the issue on another 1.8 version? If it turned
out to be a regression it could be easier to spot in fact.

Your config is very clean and shows nothing suspicious at all. Thus at
first knowing if tfo changes anything would be a good start.

Thanks!
Willy
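
PS: to compare runs with and without tfo, it may help to track how many of
these sockets each process holds over time. Below is only a rough sketch,
assuming the 7-column "netstat -ntpa" layout quoted above (proto, Recv-Q,
Send-Q, local, foreign, state, PID/program). The function name
summarize_close_wait and its "program" parameter are made up for this
illustration and are not part of haproxy or netstat.

```python
from collections import defaultdict


def summarize_close_wait(netstat_output, program="haproxy"):
    """Return {pid: (socket_count, total_recv_q)} for CLOSE_WAIT sockets
    that still have unread data (non-zero Recv-Q) and belong to `program`.

    Assumes lines shaped like the "netstat -ntpa" output quoted above:
    proto recv-q send-q local foreign state pid/program
    """
    totals = defaultdict(lambda: [0, 0])
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines and anything that is not a TCP socket.
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        recv_q, state, owner = fields[1], fields[5], fields[6]
        if state != "CLOSE_WAIT" or not recv_q.isdigit() or int(recv_q) == 0:
            continue
        pid, _, name = owner.partition("/")
        if name != program:
            continue
        totals[pid][0] += 1            # number of stuck sockets
        totals[pid][1] += int(recv_q)  # bytes still waiting to be read
    return {pid: tuple(values) for pid, values in totals.items()}


# Small demo on two lines taken from the report (addresses anonymized as "IP").
SAMPLE = """\
tcp 1 0 IP:443 IP:28032 CLOSE_WAIT 115495/haproxy
tcp 732 0 IP:443 IP:3180 CLOSE_WAIT 115494/haproxy
"""

for pid, (count, pending) in sorted(summarize_close_wait(SAMPLE).items()):
    print(f"pid {pid}: {count} CLOSE_WAIT socket(s), {pending} bytes pending")
```

Feeding it the real netstat output every few minutes should show whether
the per-process count keeps growing once tfo is removed from the bind line.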