On Wed, 6 Jun 2018 at 11:20, Willy Tarreau <w...@1wt.eu> wrote: > Hi Milan, > > On Wed, Jun 06, 2018 at 11:09:19AM +0200, Milan Petruzelka wrote: > > Hi Willy, > > > > I've tracked one of connections hanging in CLOSE_WAIT state with tcpdump > > over last night. It started at 17:19 like this: > > > > "Packet No.","Time in > seconds","Source","Destination","Protocol","Length","Info" > > "1","0.000000","ip_client","ip_haproxy_server","TCP","62","64311 > > 443 [SYN] Seq=0 Win=8192 Len=0 MSS=1460 SACK_PERM=1" > > "2","0.001049","ip_haproxy_server","ip_client","TCP","62","443 > > 64311 [SYN, ACK] Seq=0 Ack=1 Win=29200 Len=0 MSS=1460 SACK_PERM=1" > > Sure, it's extremely useful. It looks like a shutdown is ignored when > waiting for the H2 preface. The FD is not being polled for reading, so > I tend to conclude that either it was disabled for a reason I still do > not know, or it used to be enabled, the shutdown was received then the > polling was disabled. But that doesn't appear in the connection flags. > So it seems that the transition between the end of handshake and the > start of parsing could be at fault. Maybe we refrain from entering the > H2 state machine because of the early shutdown. I'm going to have a look > in that direction.
Hi Willy, Here is some additional info about the hanging CLOSE_WAIT connection described in my last message - screenshot from our log management - https://pasteboard.co/HoVCR0b.jpg and text variant: No. Time hap.ts.accept hap.client.port hap.term.state hap.rq.method hap.rsp.status hap.rsp.size 1 June 5th 2018, 17:19:22.913 June 5th 2018, 17:19:22.435 64311 ---- GET 200 17,733 2 June 5th 2018, 17:19:23.814 June 5th 2018, 17:19:23.806 64311 ---- GET 200 4,671 3 June 5th 2018, 17:19:23.814 June 5th 2018, 17:19:23.806 64311 ---- GET 200 4,481 4 June 5th 2018, 17:19:23.820 June 5th 2018, 17:19:23.808 64311 ---- GET 200 4,613 5 June 5th 2018, 17:19:23.821 June 5th 2018, 17:19:23.808 64311 ---- GET 200 4,584 6 June 5th 2018, 17:19:24.967 June 5th 2018, 17:19:24.961 64311 ---- GET 200 2,786 7 June 5th 2018, 17:19:28.945 June 5th 2018, 17:19:25.049 64311 ---- GET 200 78,325 8 June 5th 2018, 17:19:28.945 June 5th 2018, 17:19:25.049 64311 ---- GET 200 46,830 9 June 5th 2018, 17:19:31.617 June 5th 2018, 17:19:24.707 64311 CL-- GET 200 13,086 10 June 5th 2018, 17:19:31.618 June 5th 2018, 17:19:25.049 64311 CD-- GET 200 49,162 11 June 5th 2018, 17:19:31.618 June 5th 2018, 17:19:25.049 64311 CD-- GET 200 40,328 12 June 5th 2018, 17:19:31.618 June 5th 2018, 17:19:25.066 64311 CL-- POST 200 3,198 As you can see, client first made 8 successful requests using http/2 connection through Haproxy to our backend server. Remaining 4 requests finished with termination states CL-- or CD--, probably because of FIN packet received form client. C : the TCP session was unexpectedly aborted by the client D : the session was in the DATA phase. L : the proxy was still transmitting LAST data to the client while the server had already finished. This one is very rare as it can only happen when the client dies while receiving the last packets. It looks like Haproxy knows about client disconnection in muxed requests inside frontend http/2 connection. It just seems not to propagate this knowledge to the frontend connection, leaving it hanging in CLOSE_WAIT state. Sorry for not sending this piece of log last time, i didn't expect to see any of those CL-- / CD-- requests. I'll switch http/2 off on the site because I'll be AFK for 2 weeks. After that I'll be ready to provide any additional info to hunt this one down. Or to test the patched version if available. Thank you, best regards Milan