Hi Nathan,

On Tue, Mar 30, 2021 at 09:21:30AM -0700, Nathan Konopinski wrote:
> Sometimes clients (clients are only http 1.1 and use connection: close) are
> reporting a body length of ~4000 is less than the content length of ~14000.
> The issue does not appear when using nginx as an LB and I've verified
> complete responses are being sent from the backends for the requests
> clients report errors on.
> 
> It's not clear why a portion of the clients aren't receiving the entire
> response. I'm unable to replicate the issue with curl. I have a vanilla
> config using https, prometheus metrics, and a h1-case-adjust-bogus-client
> option to adjust a couple headers.
> 
> Has anyone come across similar issues? I see an option for request
> buffering but nothing for response buffering. Are there options I can
> adjust that could be related to this type of issue?

No, this is not expected at all and should really never happen. The one
option that could cause it is "option nolinger", but you don't have it
and your config is really clean and straightforward.

Could you take a capture of the communications between the clients and
haproxy? The fact that you're using close opens the way for a subtle
issue that affects certain old clients with POST requests. Some of them
send a POST request with a body and, for no particular reason, emit an
extra CRLF half a second to a second later. That CRLF is not part of the
body, so it cannot be consumed along with it, and it may even arrive
after the response has been sent.

If haproxy has already sent the response back (and 14kB perfectly fits
in a single buffer, so that sounds plausible), closed (since there's the
connection: close), and the CRLF from the client arrives *after* the
close, then the TCP stack will reset the connection and send a TCP RST
back. First, this causes any pending outgoing data to be dropped. Second,
when the client receives the RST, it can also drop some of its previously
received but unread data.

You don't necessarily need to decrypt HTTPS to detect this. Simply taking
a network capture, looking for RSTs and checking if some non-empty TCP
segments flow from the client to haproxy just before the RST would
already be an indication. What's nasty if you have to deal with this
is that it's totally timing-dependent, and that possible workarounds
are just that, workarounds.
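
To make that check a bit more concrete, here is a rough sketch in Python.
It assumes you have already exported the capture into simple per-segment
records; the Segment shape, the find_suspect_resets name and the 0.5s
window are all made up for illustration and are not part of haproxy or
of any capture tool. It only flags RSTs emitted by haproxy shortly after
a non-empty segment from the same client, which is an indication and not
a proof.

```python
from collections import namedtuple

# Hypothetical record shape: one entry per captured TCP segment.
# ts is in seconds, rst is True when the RST flag is set,
# payload_len is the TCP payload size in bytes.
Segment = namedtuple("Segment", "ts src sport dst dport rst payload_len")


def find_suspect_resets(segments, haproxy_ip, window=0.5):
    """Return (client_ip, client_port, rst_ts) for each RST sent by
    haproxy at most `window` seconds after a non-empty segment from
    that same client. The window value is an arbitrary assumption."""
    last_client_data = {}  # (client_ip, client_port) -> ts of last payload
    suspects = []
    for seg in sorted(segments, key=lambda s: s.ts):
        if seg.dst == haproxy_ip and seg.payload_len > 0:
            # Non-empty segment flowing from a client to haproxy.
            last_client_data[(seg.src, seg.sport)] = seg.ts
        elif seg.src == haproxy_ip and seg.rst:
            # RST flowing from haproxy back to a client.
            key = (seg.dst, seg.dport)
            seen = last_client_data.get(key)
            if seen is not None and seg.ts - seen <= window:
                suspects.append((key[0], key[1], seg.ts))
    return suspects
```

Since the request itself is also a non-empty client segment, a short
window will still flag fast exchanges that end in an RST for another
reason, so each hit needs to be checked by hand in the capture.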

Regards,
Willy
