On Fri, Apr 02, 2021 at 06:38:35PM +0200, Willy Tarreau wrote:
> > This has come out of cases where we upgraded HAProxy 1.8 -> 2.2, and
> > $work customers started reporting requests that previously worked fine
> > now return 400 Invalid Request errors.
> That's never good. Often it indicates that they've long been doing
> something wrong without ever noticing and that for various reasons
> it's not possible anymore.
Yes, see below for details of what a tool used by customers has been
doing wrong.

> > Two things stand out:
> > - way to reliably capture that output, not being limited to the last
> >   error
> Ideally I'd like to have a global ring buffer to replace all per-proxy
> ones (which also consume a lot of RAM). We could imagine keeping the
> ability to have a per-proxy copy based on a config option.
I'd keep per-proxy support, because I can see cases where different
retention might be wanted.

> > - messaging about WHY a given position is an error
> >   - partial list of reasons I've seen so far included below
> In a capture, the indicated position is *always* an invalid character.
> It may require to have a look at the standards to know why, but it
> seems particularly difficult to me to start to emit the list of all
> permitted characters whenever a forbidden one is met. I remember that
> we emit the parser's state, maybe this is what should be turned to a
> more human-readable form to indicate "in header field name" or "in
> header field value" or stuff like this which can help the reader
> figure why the char is not welcome (maybe because they expect that
> a previous one had switched the state).
I wouldn't emit an entire list of permitted characters, but certainly
via the parser's state we can point to it by reference. E.g. in the
target URI, non-encoded spaces or unicode.

> > Partial list of low-level invalid request reasons
> > - path/queryparams has character that was supposed to be encoded
> > - header key invalid character for given position
> > - header line malformed (no colon!)
> > - header value invalid relative to prior pieces of request**
> For the last one we will not have a position because the request is
> *semantically* invalid, it's not a parsing issue.
Hmm, I think it does have a position value, but one that didn't seem to
make sense.

> > ** This one is bugging me: user requests with an absolute URI as the
> > urlpath, but the hostname in that URI does not match the hostname in the
> > Host header.
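
To make the mismatch case concrete, here is a minimal Python sketch of
the comparison between the authority of an absolute-form request target
and the Host header. It is an illustration only, not HAProxy's actual
check: the function name and the hostnames are hypothetical, the
case-insensitive comparison is an assumption of the sketch, and default
ports are not normalised (so "host:80" and "host" count as different).

```python
from urllib.parse import urlsplit


def host_matches_authority(request_target: str, host_header: str) -> bool:
    """Hypothetical helper: compare the target's authority with Host."""
    parts = urlsplit(request_target)
    if not parts.netloc:
        # Origin-form target such as "/path": no authority to compare.
        return True
    # Drop any userinfo ("user@") before comparing.
    authority = parts.netloc.rpartition("@")[2]
    # Assumption: hostnames are compared case-insensitively.
    return authority.lower() == host_header.strip().lower()


# Tool talks to a local stunnel but still sends the service's Host header.
print(host_matches_authority("http://localhost:8080/obj", "svc.example.com"))
# Consistent request: authority and Host agree.
print(host_matches_authority("http://svc.example.com/obj", "svc.example.com"))
```

The first call is the shape of request that fails for us; the second is
what a conforming client would send.
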
> This is mandated by the standard:
>
>   https://tools.ietf.org/html/rfc7230#section-5.4
>
>    If the target URI includes an authority component, then a
>    client MUST send a field-value for Host that is identical to that
>    authority component, excluding any userinfo subcomponent and its "@"
>    delimiter (Section 2.7.1).
>    ...
>    A server MUST respond with a 400 (Bad Request) status code to any
>    HTTP/1.1 request message that lacks a Host header field and to any
>    request message that contains more than one Host header field or a
>    Host header field with an invalid field-value.
>
> Do you regularly encounter this case ? If so maybe we could have an
> option to relax the rule in certain cases. The standard allows proxies
> to ignore the provided Host field and rewrite it from the authority.
> Note that we're not a proxy but a gateway and it's often the case that
> a number of other gateways (and possibly proxies) have been met from
> the client, so we wouldn't do that by default but with a compelling
> case I wouldn't find this problematic.
If this changed in 1.8->2.2, then yes, it's absolutely the case.

I have already asserted to $work customers that the third-party tool
that is being used is doing the wrong thing, but I haven't had much
traction in getting them to change what they are doing, since it needs.

I can't disclose the name of the tool in question on the mailing list,
but it chooses to implement HTTPS by having users of the tool run a
local stunnel, pointing to $work service, and point the tool to stunnel
as an http proxy.

As for 'regularly', what is the metric to compare that to?
It's a tiny fraction of overall requests, but has been loud at the
support level.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136