Hi Willy,

> In fact it's a race between the GOAWAY frame caused by the invalid request,
> and the HEADERS frame being sent in response to the stream being closed

> I agree that it's quite confusing, but we're talking about responses to
> conditions that are explicitly forbidden in the spec, so I'd rather not spend
> too much energy on this for now.

As much as I agree that specs should be followed, I have realized that even if some people want to follow the spec 100%, there will always be implementations used at large scale that don’t. There can be several reasons for this. One I can imagine is that browsers or servers start implementing a new protocol (h2 is a good example) before the spec is actually finalized. Once it is finalized, the vendor might end up with an implementation that slightly violates the actual spec, but won’t fix it, either because the violations are minor or because they are not in any way “breaking” when compared with the other implementations out there.

In this case, if I understand you correctly, the errors are related to the fact that certain clients didn’t implement the spec correctly in the first place.

I was very curious why e.g. the Connection header (even if it isn’t actually sent by Firefox or Safari/WebKit, even though their web dev tools say it is) would work in nginx, and Apache for that matter, so I asked on the nginx mailing list why they were violating the spec.

Valentin gave a rather interesting answer about why they decided to sometimes violate specific parts of the spec in their software. It all boiled down to client support: they also realized that many browsers (which might be EOL and never get updated) might have implementations that would otherwise not work with http2.

http://mailman.nginx.org/pipermail/nginx/2017-December/055356.html

I know that it’s different software, and that how others decide to design their software is completely up to them. Violating specs on purpose is generally bad, no doubt about that. But if it’s a requirement in order to get good coverage of clients (both new and old browsers that are actually in use), then I understand why one would go to such lengths as having to “hack” a bit to make sure commonly used browsers can use the protocol.

> So at least my analysis for now is that for a reason still to be determined,
> this version of firefox didn't correctly interoperate with haproxy in a given
> environment

Downgrading Firefox to earlier versions (such as 55, which is “pre”-quantum) reveals the same issue with bad requests.

Hopefully you won’t have to violate the http2 spec in any way, but I do see a valid point in what Valentin explained: you cannot guarantee that all clients are 100% compliant with the spec, and there might be a bunch of EOL devices still in use.

I used to work at a place where haproxy was used extensively, so seeing http2 support getting better and better is really awesome, because it would mean that http2 could actually be deployed in that specific environment. I do hope that in a few releases http2 in haproxy gets to a point where we could rate it as “production ready”, with no real visible bugs from a customer perspective. At that point I think it would be good to enable it in a large-scale environment (for a percentage of the requests) to see, from a real-world workload, how many clients actually violate the spec and how much traffic would get dropped if the spec is followed.

For now, I’ll personally leave http2 support disabled, since it’s breaking my applications for a big percentage of my users, and I’ll have to find an intermediate solution at least until the bug with Firefox losing connections is fixed (this thing):

Dec 28 21:22:35 localhost haproxy[1534]: 80.61.160.xxx:64921 [28/Dec/2017:21:22:12.309] https_frontend~ https_frontend/<NOSRV> -1/-1/-1/-1/22978 400 0 - - CR-- 1/1/0/0/0 0/0 "<BADREQ>"
Dec 28 21:22:40 localhost haproxy[1534]: 80.61.160.xxx:64972 [28/Dec/2017:21:22:35.329] https_frontend~ cdn-backend/mycdn 0/0/1/0/5001 200 995 - - ---- 1/1/0/1/0 0/0 "GET /js/app.js?v=1 HTTP/1.1"

I never expect software to be bug free, but at this point this specific issue causes too much visible “trouble” for end users for me to be able to keep it enabled.

I’ll figure out if I can replicate the same issue in more browsers (without the connection: keep-alive header); maybe that would give us more insight.

Best Regards,
Lucas Rolff

On 29/12/2017, 00.08, "Willy Tarreau" <w...@1wt.eu> wrote:

Hi Lukas,

On Thu, Dec 28, 2017 at 09:19:24PM +0100, Lukas Tribus wrote:
> On Thu, Dec 28, 2017 at 12:29 PM, Lukas Tribus <lu...@ltri.eu> wrote:
> > Hello,
> >
> >
> >> But in this example, you're using HTTP/1.1, The "Connection" header is
> >> perfectly valid for 1.1. It's HTTP/2 which forbids it. There is no
> >> inconsistency here.
> >
> > For me a request like this:
> > $ curl -kv --http2 https://localhost/111 -H "Connection: keep-alive"
> > -d "bla=bla"
> >
> > Fired multiple times from the shell, leads to a "400 Bad Request"
> > response in about 20 ~ 30 % of the cases and is forwarded to the
> > backend in other cases.

In fact it's a race between the GOAWAY frame caused by the invalid request,
and the HEADERS frame being sent in response to the stream being closed.
It pretty much depends which one makes its way through the mux first, and given that both depend on the scheduling of all pending events, I hardly see what we can do to achieve a better consistency, except cheating (eg: killing the stream in a way to make it silent). In both cases the GOAWAY should be sent, and only sometimes there is enough time to get the 400 sent in the middle, which gets reported.

I agree that it's quite confusing, but we're talking about responses to conditions that are explicitly forbidden in the spec, so I'd rather not spend too much energy on this for now.

> However I am unable to reproduce the issue with Firefox: none of the
> quantum releases (57.0, 57.0.1, 57.0.2, 57.0.3) emit a connection
> header in my testing:

That's pretty much interesting, so in fact probably that in the end it's not really sent. I can't test, I installed 57.0.3 on my machine and it's totally broken, tabs spin forever and even google.com does not load, so I had to revert to the last working Firefox ESR version :-(

> - https://http2.golang.org/reqinfo never shows a connection header
> (not even with POST)

You never know whether this one could be stripped on the server side however.

> - sniffing with wireshark (using SSLKEYLOGFILE) also shows that
> Firefox never emits a connection header in H2

OK this one sounds better.

> - the developer tools *always* show a connection header in the
> request, although there really isn't one - clearly there is a
> discrepancy between what is transmitted on the wire and what is shown
> on in dev tools

Great, so that makes more sense regarding the observations so far. It's never fun when dev tools report false elements but it possibly depends where the information is extracted and we could even imagine that the header is internally emitted and stripped just before the request is converted to H2, so let's not completely blame the dev tool yet either :-)

> What am I missing?
> Can you guys provide a decrypted trace showing this
> behavior, the output of the http2 golang test and can you please both
> clarify which OS you reproduce this on?

So at least my analysis for now is that for a reason still to be determined, this version of firefox didn't correctly interoperate with haproxy in a given environment, that the dev tools reported a connection header, which once forced to be sent via curl or nghttp proved that haproxy rejected the request as mandated by the spec. This then led us to conclude that firefox was hit by the same problem, which in fact isn't the case as you just found. Thus we're indeed back to first round trying to figure why firefox+haproxy over there do not cope well with h2 (given that it doesn't even work with H1 on my machine and that "ps auxw" clearly shows some buffer overflows affecting the argument strings, so I have zero trust at all in this version for now).

Cheers,
Willy
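
For reference, the rule this thread keeps coming back to (HTTP/2 forbids connection-specific header fields, RFC 7540 section 8.1.2.2, with TE: trailers as the only exception) can be sketched roughly as below. This is only an illustration of the check, not haproxy's or nginx's actual code; the function name `forbidden_h2_headers` and the header-list representation are assumptions made up for the example.

```python
# Minimal sketch of the RFC 7540 section 8.1.2.2 check discussed in this thread.
# Hypothetical helper for illustration only, not taken from haproxy or nginx.

# Connection-specific header fields named by the RFC.
CONNECTION_SPECIFIC = {
    "connection",
    "keep-alive",
    "proxy-connection",
    "transfer-encoding",
    "upgrade",
}


def forbidden_h2_headers(headers):
    """Return the lowercased names of headers that make an HTTP/2 message malformed.

    `headers` is a list of (name, value) string pairs.
    """
    bad = []
    for name, value in headers:
        lname = name.lower()
        if lname in CONNECTION_SPECIFIC:
            bad.append(lname)
        elif lname == "te" and value.strip().lower() != "trailers":
            # TE is allowed in HTTP/2 requests, but only with the value "trailers".
            bad.append(lname)
    return bad


# Roughly the curl request from the thread: -H "Connection: keep-alive" -d "bla=bla"
request = [
    (":method", "POST"),
    (":path", "/111"),
    ("connection", "keep-alive"),
    ("content-type", "application/x-www-form-urlencoded"),
]
print(forbidden_h2_headers(request))
```

A strict server would treat any request for which this list is non-empty as malformed, while a lenient one could choose to strip the listed headers and forward the request anyway, which is the trade-off discussed above.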