Hi Milan, hi Janusz,

thanks to your respective traces, I may have come up with a possible
scenario explaining the CLOSE_WAIT you're facing. Could you please try
the attached patch?

Thanks,
Willy
From fc527303f1391849ae5880b304486e328010b5ff Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w...@1wt.eu>
Date: Wed, 13 Jun 2018 14:24:56 +0200
Subject: BUG/MEDIUM: h2: make sure the last stream closes the connection after a timeout

If a timeout strikes on the connection side with some active streams,
there is a corner case which can sometimes cause the following sequence
to happen :

  - There are active streams but there are data in the mux buffer
    (eg: a client suddenly disconnected during a download with pending
    requests). The timeout is active.

  - The timeout strikes, h2_timeout_task() is called, kills the task and
    doesn't close the connection since there are streams left ; The
    connection is marked in H2_CS_ERROR ;

  - the streams are woken up and closed ;

  - when the last stream closes, calling h2_detach(), it sees the tree
    list is empty, but there is no condition allowing the connection to
    be closed (mbuf->o > 0), thus it does nothing ;

  - since the task is dead, there's no more hope to clear this situation
    later

For now we can take care of this by adding a test for the presence of
H2_CS_ERROR and !task, implying the timeout task triggered already and
will not be able to handle this again.

Over the long term it seems like a more reliable test on should be made,
so that it is possible to know whether or not someone is still able to
close this connection.

A big thanks to Janusz Dziemidowicz and Milan Petruzelka for providing
many details helping in figuring this bug.
---
 src/mux_h2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mux_h2.c b/src/mux_h2.c
index 57b1722..bc33ce2 100644
--- a/src/mux_h2.c
+++ b/src/mux_h2.c
@@ -2469,6 +2469,7 @@ static void h2_detach(struct conn_stream *cs)
 	 */
 	if (eb_is_empty(&h2c->streams_by_id) &&     /* don't close if streams exist */
 	    ((h2c->conn->flags & CO_FL_ERROR) ||    /* errors close immediately */
+	     (h2c->st0 >= H2_CS_ERROR && !h2c->task) || /* a timeout stroke earlier */
 	     (h2c->flags & H2_CF_GOAWAY_FAILED) ||
 	     (!h2c->mbuf->o &&  /* mux buffer empty, also process clean events below */
 	      (conn_xprt_read0_pending(h2c->conn) ||
-- 
1.7.12.1
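
For those who want to reason about the corner case without reading the
whole mux, here is a minimal standalone sketch of the close decision
taken in h2_detach(), before and after the patch. It is only a model,
not HAProxy code: struct model_h2c, its fields and both functions are
hypothetical stand-ins, and the "clean_event" flag is an assumption
replacing the remaining tests that the diff context cuts off.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the connection state (h2c->st0). */
enum model_state { MODEL_CS_ACTIVE = 0, MODEL_CS_ERROR = 1 };

/* Hypothetical, heavily simplified stand-in for struct h2c. */
struct model_h2c {
    int nb_streams;        /* models the streams_by_id tree */
    bool conn_error;       /* models CO_FL_ERROR on the connection */
    enum model_state st0;  /* models h2c->st0 */
    bool has_task;         /* models h2c->task != NULL */
    bool goaway_failed;    /* models H2_CF_GOAWAY_FAILED */
    size_t mbuf_out;       /* models h2c->mbuf->o (pending output) */
    bool clean_event;      /* assumption: truncated "clean events" tests */
};

/* Close decision as it was before the patch. */
static bool may_close_before(const struct model_h2c *c)
{
    return c->nb_streams == 0 &&            /* don't close if streams exist */
           (c->conn_error ||                /* errors close immediately */
            c->goaway_failed ||
            (c->mbuf_out == 0 && c->clean_event));
}

/* Close decision with the extra test added by the patch. */
static bool may_close_after(const struct model_h2c *c)
{
    return c->nb_streams == 0 &&
           (c->conn_error ||
            (c->st0 >= MODEL_CS_ERROR && !c->has_task) || /* timeout fired */
            c->goaway_failed ||
            (c->mbuf_out == 0 && c->clean_event));
}
```

With no stream left, st0 in error, no task and data still pending in
the mux buffer, may_close_before() returns false (the connection stays
around, hence the CLOSE_WAIT) while may_close_after() returns true. As
long as the task still exists, both versions leave the decision to the
timeout task.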