On Fri, Apr 02, 2021 at 06:38:35PM +0200, Willy Tarreau wrote:
> > This has come out of cases where we upgraded HAProxy 1.8 -> 2.2, and
> > $work customers started reporting requests that previously worked fine
> > now return 400 Invalid Request errors.
> That's never good. Often it indicates that they've long been doing
> something wrong without ever noticing and that for various reasons
> it's not possible anymore.
Yes, see below for details of what a tool used by customers has been
doing wrong.

> > Two things stand out:
> > - way to reliably capture that output, not being limited to the last
> >   error
> Ideally I'd like to have a global ring buffer to replace all per-proxy
> ones (which also consume a lot of RAM). We could imagine keeping the
> ability to have a per-proxy copy based on a config option.
I'd keep per-proxy support, because I can see cases where different
retention might be wanted.

> > - messaging about WHY a given position is an error
> >   - partial list of reasons I've seen so far included below
> In a capture, the indicated position is *always* an invalid character.
> It may require to have a look at the standards to know why, but it
> seems particularly difficult to me to start to emit the list of all
> permitted characters whenever a forbidden one is met. I remember that
> we emit the parser's state, maybe this is what should be turned to a
> more human-readable form to indicate "in header field name" or "in
> header field value" or stuff like this which can help the reader
> figure why the char is not welcome (maybe because they expect that
> a previous one had switched the state).
I wouldn't emit an entire list of permitted characters, but certainly
via the parser's state we can point to it by reference. E.g. in the
target URI, non-encoded spaces or unicode.

> > Partial list of low-level invalid request reasons
> > - path/queryparams has character that was supposed to be encoded
> > - header key invalid character for given position
> > - header line malformed (no colon!)
> > - header value invalid relative to prior pieces of request**
> For the last one we will not have a position because the request is
> *semantically* invalid, it's not a parsing issue.
Hmm, I think it does have a position value, but one that didn't seem to
make sense.

> > ** This one is bugging me: user requests with an absolute URI as the
> > urlpath, but the hostname in that URI does not match the hostname in the
> > Host header.
> 
> This is mandated by the standard:
> 
>    https://tools.ietf.org/html/rfc7230#section-5.4
> 
>    If the target URI includes an authority component, then a
>    client MUST send a field-value for Host that is identical to that
>    authority component, excluding any userinfo subcomponent and its "@"
>    delimiter (Section 2.7.1).
>    ...
>    A server MUST respond with a 400 (Bad Request) status code to any
>    HTTP/1.1 request message that lacks a Host header field and to any
>    request message that contains more than one Host header field or a
>    Host header field with an invalid field-value.
> 
> Do you regularly encounter this case ? If so maybe we could have an
> option to relax the rule in certain cases. The standard allows proxies
> to ignore the provided Host field and rewrite it from the authority.
> Note that we're not a proxy by a gateway and it's often the case that
> a number of other gateways (and possibly proxies) have been met from
> the client, so we wouldn't do that by default but with a compelling
> case I woudln't find this problematic.
If this changed in 1.8->2.2, then yes, it's absolutely the case. I have
already asserted to $work customers, that the third-party tool that is
being used is doing the wrong thing, but I haven't had much traction in
getting them to change what they are doing, since it needs.

I can't disclose the name of tool in question on the mailing list, but
it choses to implement HTTPS by having users of the tool run a local
stunnel, pointing to $work service, and point the tool to stunnel as an
http proxy

As for 'regularly', what is the metric to compare that to? It's a tiny
fraction of overall requests, but has been loud at the support level.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

Attachment: signature.asc
Description: PGP signature

Reply via email to