On Wed, Jun 13, 2018 at 07:06:58PM +0200, Janusz Dziemidowicz wrote:
> 2018-06-13 14:42 GMT+02:00 Willy Tarreau <w...@1wt.eu>:
> > Hi Milan, hi Janusz,
> >
> > thanks to your respective traces, I may have come up with a possible
> > scenario explaining the CLOSE_WAIT you're facing. Could you please
> > try the attached patch ?
> 
> Unfortunately there is no change for me. CLOSE_WAIT sockets still
> accumulate if I switch native h2 on. Milan should probably double
> check this though.
> https://pasteboard.co/HpJj72H.png

:-(

The line is still perfectly straight, which really makes me think of either
some periodic activity that I can neither guess nor model, or something
related to our timeouts.

> I'll try move some low traffic site to a separate instance tomorrow,
> maybe I'll be able to capture some traffic too.

Unfortunately with H2 that will not help much: the TLS layer under it
makes captures a real pain. TLS is designed to prevent observability
and it does it well :-/

I've suspected a received shutdown at the TLS layer, which I was not
able to model at all. Tools are missing at this point. I even tried
to pass the traffic through haproxy in TCP mode to help but I couldn't
reproduce the problem.

It could help if you looked up the affected clients' IP:port in your
logs to see whether they look perfectly normal or have something in
common (e.g. always the exact same requests, or no request was ever
made over the affected connections).
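
A rough sketch of how that lookup could be scripted, in case it saves
time. It assumes the CLOSE_WAIT sockets are listed with something like
`ss -tan state close-wait` (local endpoint first, peer last on each
line) and that the client's IP:port appears as a whitespace-separated
token in each log line. The function names are only illustrative, and
IPv4-mapped IPv6 addresses may be printed differently by `ss` and in
the logs, so those would need extra normalisation.

```python
import re

# Matches IPv4 "a.b.c.d:port" or bracketed IPv6 "[addr]:port" tokens.
ENDPOINT = re.compile(r"^(?:\d{1,3}(?:\.\d{1,3}){3}|\[[0-9a-fA-F:.]+\]):\d+$")


def peers_from_ss(ss_output):
    """Return the set of peer ip:port endpoints found in ss-style output."""
    peers = set()
    for line in ss_output.splitlines():
        endpoints = [tok for tok in line.split() if ENDPOINT.match(tok)]
        if len(endpoints) >= 2:
            # Assumption: local endpoint comes first, peer endpoint last.
            peers.add(endpoints[-1])
    return peers


def group_log_lines(log_lines, peers):
    """Map every peer to the log lines mentioning it (possibly none)."""
    hits = {peer: [] for peer in peers}
    for line in log_lines:
        for tok in line.split():
            if tok in hits:
                hits[tok].append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): script.py close_wait.txt haproxy.log
    with open(sys.argv[1]) as f:
        peers = peers_from_ss(f.read())
    with open(sys.argv[2]) as f:
        hits = group_log_lines(f, peers)
    for peer, lines in sorted(hits.items()):
        print(f"{peer}: {len(lines)} log line(s)")
        for entry in lines:
            print("    " + entry)
```

Peers reported with zero log lines would be the connections that never
produced a logged request, which is one of the patterns worth checking.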

I won't merge the current patch for now. At a minimum it's incomplete,
and there is always a risk that it breaks something else in an equally
hard-to-detect way.

Thanks,
Willy
