Hi Willy, Thanks for continuing to look into this.
> I've place an nginx instance after my local haproxy dev config, and
> found something which might explain what you're observing : the process
> apparently leaks FDs and fails once in a while, causing 500 to be returned :

That's fascinating. I would have expected nginx to take better care of things like that...

Oddly enough, I cannot find any log entries that approximate this. However, since we use nginx primarily (99+%) as a reverse proxy, it's possible the fd issues wouldn't appear for us.

My next thought is to use tcpdump to determine what's on the wire when the CD-- and SD-- pairs appear, but since our stack is SSL e2e, that might prove difficult. Any suggestions?

One more interesting piece of data: if we use htx without h2 on the backends, we consistently see only CD-- entries (with a very, very few SD-- entries). Thus, it would seem that whatever is causing the issue is directly related to h2 backends. I further think we can safely say it is directly related to h2 streams breaking (due to client-side request cancellations), resulting in the whole connection breaking in HAProxy or nginx (though determining which will be the trick).

There's also a strong possibility that we will replace nginx with HAProxy entirely for our SSL + H2 setup as we overhaul the backends, so this problem will probably be resolved by removing the problematic interaction.

I'm still working on running h2load against our nginx servers to see if that turns anything up.

> And at this point the connection is closed and reopened for new requests.
> There's never any GOAWAY sent.

If I'm understanding this correctly, does that imply that, as long as nginx sends GOAWAY properly, HAProxy will not attempt to reuse the connection?

> I managed to work around the problem by limiting the number of total
> requests per connection. I find this extremely dirty but if it helps...
> I just need to figure how to best do it, so that we can use it as well
> for H2 as for H1.

We're pretty satisfied with our h2 fe <-> be h1.1 setup right now, so we will probably stick with that for now, since we don't want to have any more operational issues from bleeding-edge bugs. (Not a comment on HAProxy, per se, just a business reality. :-) )

I'm more than happy to try out anything you turn up on our staging setup!

Best,
Luke

—
Luke Seelenbinder
Stadia Maps | Founder
stadiamaps.com

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, January 23, 2019 8:28 AM, Willy Tarreau <w...@1wt.eu> wrote:

> Hi Luke,
>
> I've place an nginx instance after my local haproxy dev config, and
> found something which might explain what you're observing : the process
> apparently leaks FDs and fails once in a while, causing 500 to be returned :
>
> 2019/01/23 08:22:13 [crit] 25508#0: *36705 open()
> "/usr/local/nginx/html/index.html" failed (24: Too many open files), client:
> 1>
> 2019/01/23 08:22:13 [crit] 25508#0: accept4() failed (24: Too many open files)
>
> 127.0.0.1 - - [23/Jan/2019:08:22:13 +0100] "GET / HTTP/2.0" 500 579 "-"
> "Mozilla/4.0 (compatible; MSIE 7.01; Windows)"
>
> The ones are seen by haproxy :
>
> 127.0.0.1:47098 [23/Jan/2019:08:22:13.589] decrypt trace/ngx 0/0/0/0/0 500
> 701 - - ---- 1/1/0/0/0 0/0 "GET / HTTP/1.1"
>
> And at this point the connection is closed and reopened for new requests.
> There's never any GOAWAY sent.
>
> I managed to work around the problem by limiting the number of total
> requests per connection. I find this extremely dirty but if it helps...
> I just need to figure how to best do it, so that we can use it as well
> for H2 as for H1.
>
> Best regards,
> Willy
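
For anyone wanting to quantify the CD-- versus SD-- split discussed above, here is a minimal Python sketch that tallies the termination-state field from HAProxy HTTP log lines. It assumes the field layout of the haproxy log line quoted above (timers, status, bytes, two cookie fields, then the four-character state). The regex and the function name `tally_termination_states` are illustrative assumptions, not part of HAProxy, and a custom log-format would need a different pattern.

```python
import re
from collections import Counter

# Assumption: lines follow the HTTP log layout of the sample quoted above:
#   ... <timers> <status> <bytes> <req_cookie> <resp_cookie> <state> <conns> ...
# The termination state is four characters, e.g. "----", "CD--", "SD--".
TERM_STATE_RE = re.compile(
    r"\s[-+]?\d+(?:/[-+]?\d+){4}"      # five timers, e.g. 0/0/0/0/0
    r"\s-?\d+"                         # HTTP status code
    r"\s\+?\d+"                        # bytes sent
    r"\s\S+\s\S+"                      # captured request/response cookies
    r"\s(?P<state>[A-Za-z-]{4})\s"     # termination state
)


def tally_termination_states(lines):
    """Count occurrences of each termination state in an iterable of log lines.

    Lines that do not match the assumed layout are skipped silently.
    """
    counts = Counter()
    for line in lines:
        match = TERM_STATE_RE.search(line)
        if match:
            counts[match.group("state")] += 1
    return counts
```

Passing an open log file (or any iterable of lines) to `tally_termination_states` returns a `Counter`, so `.most_common()` gives the states ordered by frequency. That makes it easy to compare runs with and without h2 on the backends.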