On Fri, May 02, 2025 at 04:54:47PM -0400, Demi Marie Obenour wrote:
> On 5/1/25 1:40 AM, Willy Tarreau wrote:
> > Hi,
> > 
> > On Thu, May 01, 2025 at 01:08:22AM -0400, Demi Marie Obenour wrote:
> >> I noticed that if HAProxy receives a message containing \x01 in a
> >> field value, it will happily forward that message.  This appears
> >> to be permitted by RFC9113 and RFC9114.  However, it might cause
> >> a problem if the message is a response, the response is sent to
> >> the client over HTTP/3, and the client uses nghttp3.  nghttp3 will
> >> silently discard the field, causing HAProxy and the client to
> >> disagree on whether the field exists.
> >>
> >> Is this a serious concern, or is it considered too theoretical to
> >> matter?  From my reading of 
> >> https://github.com/ngtcp2/nghttp3/discussions/346,
> >> it seems that the nghttp3 maintainer considers this to be working
> >> as intended (a valid decision).  Is there anything that should be
> >> be done on the HAProxy side, or is HAProxy thinking that a header
> >> exists when a client thinks it does not expected behavior?
> > 
> > When H2 begun, some people saw a great opportunity in its ability
> > to be fully binary-transparent, while others (like us) dealing with
> > intermediaries saw a big danger in it. From what I remember (but I
> > could be wrong, I'd need to recheck), the spec only requires that
> > chars are filtered when converting from H2 to H1. And IIRC in the
> > end only CR, LF and NUL are forbidden in header values in H2, as
> > a protective measure for H2 to H1 translation, while HPACK takes
> > care of permitting absolutely everything.
> > 
> > The thing is that full H2 applications are explicitly permitted
> > to rely on this. For example you could imagine sending payload's
> > SHA256 digests directly in a header, escaping NUL with \0, LF
> > with \n, CR with \r and '\' with '\\' an keep it compact like
> > this. So while we have to perform the cleanup for H2,H3 -> H1,
> > we cannot afford to be excessive and go beyond the spec where
> > applications expect it to work as permitted by the spec.
> 
> Which specification are you referring to?  RFC9110 limits the allowed
> characters in an HTTP field value to 0x9, 0x20 ... 0x7E, and 0x80 ... 0xFF,
> with the additional requirement that a field cannot start or end with
> 0x9 or 0x20.  RFC9112 (HTTP/1.1), RFC9113 (HTTP/2), and RFC9114 (HTTP/3)
> all refer to RFC9110 for permitted field values.  As far as I can tell,
> the list of permitted characters has not changed since RFC2068
> (January 1997).

Note, HTTP/2 was first specified in 7540, whose text was less clear
than it currently is since by then there was no HTTP spec but an HTTP/1
spec. For example that spec explicitly allowed "values in the same
format as in 7230", except that 7230 permitted the obsolete header
folding, thus LF or CRLF followed by an LWS, causing ambiguity
depending how you read the spec or what existing pieces of code you
would reuse for this purpose.

In 9113 it's much clearer: #8.2.1 Field Validity says:

 * A field value MUST NOT contain the zero value (ASCII NUL, 0x00),
   line feed (ASCII LF, 0x0a), or carriage return (ASCII CR, 0x0d)
   at any position.

 * A field value MUST NOT start or end with an ASCII whitespace
   character (ASCII SP or HTAB, 0x20 or 0x09).

This part was essentially clarified as part of this work:

    https://github.com/httpwg/http2-spec/issues/815
    https://github.com/httpwg/http2-spec/pull/846

The discussion about CTL existing in the wild is here:

    https://github.com/httpwg/http-core/issues/683

(and I've personally seen quite a number of them). We've even had
complaints for not being binary-transparent on non-standard headers,
from users who used to place raw binary (e.g. gzip, SHA256 or encrypted
blocks) directly in a their own header fields between their client and
their servers, and complaining that placing haproxy in the middle would
break. At least with the clarifications above there's less room for the
doubt. NUL, CR, LF are forbidden in values and there's no way around
this. There have been some provisions for allowing to replace CR/LF/NUL
with spaces for some time but I'm not seeing that in the final spec. I
think it came from what the WHATWG does (the group that deals with how
browsers should behave, because they're *way* more tolerant than any
other component, due to having to work around errors everywhere). In
any case, replacing with a space at the beginning or end of the value
still is forbidden.

I tried, before 9113 was released, to introduce some extra validation
of the pseudo-headers before reassembly, here:

    https://github.com/httpwg/http2-spec/pull/936
    https://github.com/httpwg/http2-spec/issues/1015

It was generally welcome but a bit too late and the spec was emitted
without it, the only thing we could save was a non-normative warning
for implementers:

    https://github.com/httpwg/http2-spec/pull/1017

But that doesn't mean we cannot improve, like your suggestions about
unwanted characters such as '/#?\\[]@\0' in :authority that I'm still
on!

Cheers,
Willy


Reply via email to