Hi Willy,

> In fact it's a race between the GOAWAY frame caused by the invalid request, 
> and the HEADERS frame being sent in response to the stream being closed
> I agree that it's quite confusing, but we're talking about responses to 
> conditions that are explicitly forbidden in the spec, so I'd rather not spend 
> too much energy on this for now.

As much as I agree that specs should be followed, I've realized that even if 
some implementers want to follow a spec 100%, there will always be widely 
deployed implementations that don't. The reasons can vary; one I can imagine 
is that browsers or servers start implementing a new protocol (h2 is a good 
example) before the spec is actually finalized. Once the spec is finalized, 
the vendor may be left with an implementation that slightly violates it, but 
never fixes it, either because the violations are minor or because they don't 
actually break anything compared to the other implementations out there.

In this case, if I understand you correctly, the errors are related to certain 
clients not having implemented the spec correctly in the first place.

I was curious why e.g. the Connection header (even though it isn't actually 
sent by Firefox or Safari/WebKit, despite their webdev tools saying it is) is 
accepted by nginx, and by Apache for that matter, so I asked on the nginx 
mailing list why they were violating the spec.
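
For reference, the kind of request that triggers the rejection can be forced 
with curl (this is the same command quoted further down in the thread; the 
URL is just an example):

$ curl -kv --http2 https://localhost/ -H "Connection: keep-alive"

nginx and Apache answer such a request normally, while haproxy rejects it 
with a 400, as the spec mandates.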

Valentin gave a rather interesting answer about why they decided to 
deliberately violate specific parts of the spec in their software: it all 
boiled down to client support, because they too realized that many browsers 
(some of them EOL and never to be updated) have implementations that would 
otherwise not work with http2.

http://mailman.nginx.org/pipermail/nginx/2017-December/055356.html

I know that it's different software, and that how others decide to design 
their software is entirely up to them.
Violating specs on purpose is generally bad, no doubt about that. But if it's 
a requirement for getting good coverage of clients (both new and old browsers 
that are actually in use), then I understand why one would go to the length 
of "hacking" a bit to make sure commonly used browsers can use the protocol.

> So at least my analysis for now is that for a reason still to be determined, 
> this version of firefox didn't correctly interoperate with haproxy in a given 
> environment

Downgrading Firefox to earlier versions (such as 55, which is pre-Quantum) 
reveals the same issue with bad requests.

Hopefully you won't have to violate the http2 spec in any way. But I do see 
the valid point Valentin explained: you cannot guarantee that all clients are 
100% compliant with the spec, and there may be a bunch of EOL devices still 
in use.

I used to work at a place where haproxy was used extensively, so seeing http2 
support get better and better is really awesome, because it means http2 could 
eventually be deployed in that environment. I hope that within a few releases, 
http2 in haproxy reaches a point where we could rate it "production ready", 
with no bugs visible from a customer perspective. At that point I think it 
would be worth enabling it in a large-scale environment (for a percentage of 
the requests) to measure, on a real-world workload, how much traffic actually 
gets dropped when the spec is strictly followed, i.e. how many clients 
actually violate the spec.
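
A sketch of what enabling it could look like on the frontend side (the 
frontend/backend names are taken from my log excerpt below; the certificate 
path is made up):

frontend https_frontend
    # advertise h2 first; clients that don't speak it fall back to
    # http/1.1 via ALPN
    bind :443 ssl crt /etc/ssl/example.pem alpn h2,http/1.1
    default_backend cdn-backend

Splitting off only a percentage of the requests would then have to happen a 
level above haproxy, e.g. at DNS or on whatever balances traffic across the 
frontends.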

For now I'll personally leave http2 support disabled, since it breaks my 
applications for a big percentage of my users, and I'll have to find an 
intermediate solution at least until the bug causing Firefox to lose 
connections is fixed (this thing):
Dec 28 21:22:35 localhost haproxy[1534]: 80.61.160.xxx:64921 
[28/Dec/2017:21:22:12.309] https_frontend~ https_frontend/<NOSRV> 
-1/-1/-1/-1/22978 400 0 - - CR-- 1/1/0/0/0 0/0 "<BADREQ>"
Dec 28 21:22:40 localhost haproxy[1534]: 80.61.160.xxx:64972 
[28/Dec/2017:21:22:35.329] https_frontend~ cdn-backend/mycdn 0/0/1/0/5001 200 
995 - - ---- 1/1/0/1/0 0/0 "GET /js/app.js?v=1 HTTP/1.1"

I never expect software to be bug free, but at this point this specific issue 
causes too much visible trouble for end-users for me to keep it enabled.
I'll figure out whether I can replicate the same issue in more browsers 
(without the connection: keep-alive header); maybe that would give us more 
insight.
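
Besides curl, nghttp (mentioned further down in the thread) can also force 
the header onto an h2 request, along these lines (the URL is just the one 
from the curl reproduction):

$ nghttp -v -H "connection: keep-alive" https://localhost/111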

Best Regards,
Lucas Rolff

On 29/12/2017, 00.08, "Willy Tarreau" <w...@1wt.eu> wrote:

    Hi Lukas,
    
    On Thu, Dec 28, 2017 at 09:19:24PM +0100, Lukas Tribus wrote:
    > On Thu, Dec 28, 2017 at 12:29 PM, Lukas Tribus <lu...@ltri.eu> wrote:
    > > Hello,
    > >
    > >
    > >> But in this example, you're using HTTP/1.1, The "Connection" header is
    > >> perfectly valid for 1.1. It's HTTP/2 which forbids it. There is no
    > >> inconsistency here.
    > >
    > > For me a request like this:
    > > $ curl -kv --http2 https://localhost/111 -H "Connection: keep-alive"
    > > -d "bla=bla"
    > >
    > > Fired multiple times from the shell, leads to a "400 Bad Request"
    > > response in about 20 ~ 30 % of the cases and is forwarded to the
    > > backend in other cases.
    
    In fact it's a race between the GOAWAY frame caused by the invalid
    request, and the HEADERS frame being sent in response to the stream
    being closed. It pretty much depends which one makes its way through
    the mux first, and given that both depend on the scheduling of all
    pending events, I hardly see what we can do to achieve a better
    consistency, except cheating (eg: killing the stream in a way to
    make it silent). In both cases the GOAWAY should be sent, and only
    sometimes there is enough time to get the 400 sent in the middle,
    which gets reported. I agree that it's quite confusing, but we're
    talking about responses to conditions that are explicitly forbidden
    in the spec, so I'd rather not spend too much energy on this for now.
    
    > However I am unable to reproduce the issue with Firefox: none of the
    > quantum releases (57.0, 57.0.1, 57.0.2, 57.0.3) emit a connection
    > header in my testing:
    
    That's quite interesting; so in fact it's probably not really sent
    in the end. I can't test, I installed 57.0.3 on my machine and it's
    totally broken, tabs spin forever and even google.com does not load,
    so I had to revert to the last working Firefox ESR version :-(
    
    > - https://http2.golang.org/reqinfo never shows a connection header
    > (not even with POST)
    
    You never know whether this one could be stripped on the server side
    however.
    
    > - sniffing with wireshark (using SSLKEYLOGFILE) also shows that
    > Firefox never emits a connection header in H2
    
    OK this one sounds better.
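    
    For the record, the usual recipe for getting such a decrypted trace
    is roughly this (the key log path is arbitrary):
    
        $ export SSLKEYLOGFILE=/tmp/tls-keys.log
        $ firefox &
        # then point Wireshark at that same file under
        # Preferences -> Protocols -> SSL ("TLS" in newer Wireshark)
        #   -> (Pre)-Master-Secret log filename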
    
    > - the developer tools *always* show a connection header in the
    > request, although there really isn't one - clearly there is a
    > discrepancy between what is transmitted on the wire and what is shown
    > in dev tools
    
    Great, so that makes more sense regarding the observations so far.
    It's never fun when dev tools report false elements but it possibly
    depends where the information is extracted and we could even imagine
    that the header is internally emitted and stripped just before the
    request is converted to H2, so let's not completely blame the dev
    tool yet either :-)
    
    > What am I missing? Can you guys provide a decrypted trace showing this
    > behavior, the output of the http2 golang test and can you please both
    > clarify which OS you reproduce this on?
    
    So at least my analysis for now is that for a reason still to be
    determined, this version of firefox didn't correctly interoperate with
    haproxy in a given environment, that the dev tools reported a connection
    header, which once forced to be sent via curl or nghttp proved that
    haproxy rejected the request as mandated by the spec. This then led
    us to conclude that firefox was hit by the same problem, which in fact
    isn't the case as you just found.
    
    Thus we're indeed back to square one, trying to figure out why
    firefox+haproxy over there do not cope well with h2 (given that it
    doesn't even work with
    H1 on my machine and that "ps auxw" clearly shows some buffer overflows
    affecting the argument strings, so I have zero trust at all in this
    version for now).
    
    Cheers,
    Willy
    
