Hi Ryan,

On Mon, Mar 04, 2024 at 12:13:48PM -0600, Ryan O'Hara wrote:
> I am looking at CVE-2023-45539 as it affects older versions of haproxy (ie.
> haproxy-1.8). At this point I have verified that 1.8 is affected by this
> issue, which is in agreement with the original bug/commit which states
> versions prior to 2.8 need a backport.

So apparently this is about this one:

  2eab6d3543 ("BUG/MINOR: h1: do not accept '#' as part of the URI component")

> I am wondering if anyone has
> attempted or completed this backport. I am happy to provide one with the
> understanding that this will not be merged as 1.8 is EOL.

Not attempted, but I had a quick look, and it seems to me that the part
for 2.0 that's related to the legacy code should apply well and work,
since there are very few differences between 1.8 and 2.0 regarding this.
The part that applies to HTX just has to be dropped.

Well, finally I tried and it was not hard, I could even validate that
it works using the regtest.

I'm attaching the following backported patches, only the first 3 are
needed, the remaining ones are for h2, but since in 1.8 h2 is turned
into h1, you probably don't need them anyway.

Hoping this helps,
Willy
>From 4e98c0c1d36104ed426d3b198a176e1a5df814fa Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w...@1wt.eu>
Date: Tue, 8 Aug 2023 16:17:22 +0200
Subject: BUG/MINOR: h1: do not accept '#' as part of the URI component

Seth Manesse and Paul Plasil reported that the "path" sample fetch
function incorrectly accepts '#' as part of the path component. This
can in some cases lead to misrouted requests for rules that would apply
on the suffix:

    use_backend static if { path_end .png .jpg .gif .css .js }

Note that this behavior can be selectively configured using
"normalize-uri fragment-encode" and "normalize-uri fragment-strip".

The problem is that while the RFC says that this '#' must never be
emitted, as often it doesn't suggest how servers should handle it. A
diminishing number of servers still do accept it and trim it silently,
while others are rejecting it, as indicated in the conversation below
with other implementers:

   https://lists.w3.org/Archives/Public/ietf-http-wg/2023JulSep/0070.html

Looking at logs from publicly exposed servers, such requests appear at
a rate of roughly 1 per million and only come from attacks or poorly
written web crawlers incorrectly following links found on various pages.

Thus it looks like the best solution to this problem is to simply reject
such ambiguous requests by default, and include this in the list of
controls that can be disabled using "option accept-invalid-http-request".

We're already rejecting URIs containing any control char anyway, so we
should also reject '#'.

In the H1 parser for the H1_MSG_RQURI state, there is an accelerated
parser for bytes 0x21..0x7e that has been tightened to 0x24..0x7e (it
should not impact perf since 0x21..0x23 are not supposed to appear in
a URI anyway). This way '#' falls through the fine-grained filter and
we can add the special case for it also conditionned by a check on the
proxy's option "accept-invalid-http-request", with no overhead for the
vast majority of valid URIs. Here this information is available through
h1m->err_pos that's set to -2 when the option is here (so we don't need
to change the API to expose the proxy). Example with a trivial GET
through netcat:

  [08/Aug/2023:16:16:52.651] frontend layer1 (#2): invalid request
    backend <NONE> (#-1), server <NONE> (#-1), event #0, src 127.0.0.1:50812
    buffer starts at 0 (including 0 out), 16361 free,
    len 23, wraps at 16336, error at position 7
    H1 connection flags 0x00000000, H1 stream flags 0x00000810
    H1 msg state MSG_RQURI(4), H1 msg flags 0x00001400
    H1 chunk len 0 bytes, H1 body len 0 bytes :

    00000  GET /aa#bb HTTP/1.0\r\n
    00021  \r\n

This should be progressively backported to all stable versions along with
the following patch:

    REGTESTS: http-rules: add accept-invalid-http-request for normalize-uri 
tests

Similar fixes for h2 and h3 will come in followup patches.

Thanks to Seth Manesse and Paul Plasil for reporting this problem with
detailed explanations.

(cherry picked from commit 2eab6d354322932cfec2ed54de261e4347eca9a6)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit 9bf75c8e22a8f2537f27c557854a8803087046d0)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit 9facd01c9ac85fe9bcb331594b80fa08e7406552)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit 832b672eee54866c7a42a1d46078cc9ae0d544d9)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit e5a741f94977840c58775b38f8ed830207f7e4d0)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit 178cea76b1c9d9413afa6961b6a4576fcb5b26fa)
[wt: applied the same to http_parse_reqline() in http_msg.c]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit 4ad6fd9eeb3078685fffdc58f1c6d4eb97e05d98)
[wt: dropped the HTX part, adapted the legacy one in http_msg.c]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
 src/h1.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/h1.c b/src/h1.c
index d3a20c2ed..57be42f31 100644
--- a/src/h1.c
+++ b/src/h1.c
@@ -341,11 +341,11 @@ const char *http_parse_reqline(struct http_msg *msg,
     defined(__ARM_ARCH_7A__)
                /* speedup: skip bytes not between 0x21 and 0x7e inclusive */
                while (ptr <= end - sizeof(int)) {
-                       int x = *(int *)ptr - 0x21212121;
+                       int x = *(int *)ptr - 0x24242424;
                        if (x & 0x80808080)
                                break;
 
-                       x -= 0x5e5e5e5e;
+                       x -= 0x5b5b5b5b;
                        if (!(x & 0x80808080))
                                break;
 
@@ -357,8 +357,15 @@ const char *http_parse_reqline(struct http_msg *msg,
                        goto http_msg_ood;
                }
        http_msg_rquri2:
-               if (likely((unsigned char)(*ptr - 33) <= 93)) /* 33 to 126 
included */
+               if (likely((unsigned char)(*ptr - 33) <= 93)) { /* 33 to 126 
included */
+                       if (*ptr == '#') {
+                               if (msg->err_pos < -1) /* PR_O2_REQBUG_OK not 
set */
+                                       goto invalid_char;
+                               if (msg->err_pos == -1) /* PR_O2_REQBUG_OK set: 
just log */
+                                       msg->err_pos = ptr - msg_start;
+                       }
                        EAT_AND_JUMP_OR_RETURN(ptr, end, http_msg_rquri2, 
http_msg_ood, state, HTTP_MSG_RQURI);
+               }
 
                if (likely(HTTP_IS_SPHT(*ptr))) {
                        msg->sl.rq.u_l = ptr - msg_start - msg->sl.rq.u;
-- 
2.35.3

>From 2d848a09fb7a1fb661a418cc07c59496d7eb6b3e Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w...@1wt.eu>
Date: Tue, 8 Aug 2023 19:53:51 +0200
Subject: REGTESTS: http-rules: verify that we block '#' by default for
 normalize-uri

Since we now block fragments by default, let's add an extra test there
to confirm that it's blocked even when stripping it.

(cherry picked from commit 4d0175b54b2b4eeb01aa6e31282b0a5b0d7d8ace)
 [ad: backported to test conformance of BUG/MINOR: h1: do not accept '#'
  as part of the URI component]
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit b3f26043df74c661155566a0abd56103e8116078)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit 41d161ccbbfa846b4b17ed0166ff08f6bf0c3ea1)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit b6b330eb117d520a890e5b3cd623eaa73479db1b)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit 73b9b13ac2654ef5384789685e3d65ca5f2f880a)
[wt: rewrote the test for 2.2 without normalize-uri and called it
 fragments-in-uri]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit dbf47600f63ffe161ce08d2f0faef7e0deb32b6e)
[wt: removed tune.idle-pool.shared from global section]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit f04fec9f3efe7f8b70fbe72d6a4473f01699728c)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
 reg-tests/http-rules/fragment_in_uri.vtc | 35 ++++++++++++++++++++++++
 1 file changed, 35 insertions(+)
 create mode 100644 reg-tests/http-rules/fragment_in_uri.vtc

diff --git a/reg-tests/http-rules/fragment_in_uri.vtc 
b/reg-tests/http-rules/fragment_in_uri.vtc
new file mode 100644
index 000000000..621751356
--- /dev/null
+++ b/reg-tests/http-rules/fragment_in_uri.vtc
@@ -0,0 +1,35 @@
+varnishtest "check for fragments in URL"
+#REQUIRE_VERSION=2.0
+
+# This reg-test checks that '#' is properly blocked in requests
+
+feature ignore_unknown_macro
+
+server s1 {
+    rxreq
+    txresp -hdr "connection: close"
+} -start
+
+haproxy h1 -conf {
+    global
+
+    defaults
+        mode http
+        timeout connect 1s
+        timeout client  1s
+        timeout server  1s
+
+    frontend fe_fragment_block
+        bind "fd@${fe_fragment_block}"
+        default_backend be
+
+    backend be
+        server s1 ${s1_addr}:${s1_port}
+
+} -start
+
+client c11 -connect ${h1_fe_fragment_block_sock} {
+    txreq -url "/#foo"
+    rxresp
+    expect resp.status == 400
+} -run
-- 
2.35.3

>From 379a330ad8a56f6cf1031ff2cd3a093ead7e8585 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w...@1wt.eu>
Date: Tue, 8 Aug 2023 19:35:25 +0200
Subject: DOC: clarify the handling of URL fragments in requests

We indicate in path/pathq/url that they may contain '#' if the frontend
is configured with "option accept-invalid-http-request", and that option
mentions the fragment as well.

(cherry picked from commit 7ab4949ef107a7088777f954de800fe8cf727796)
 [ad: backported as a companion to BUG/MINOR: h1: do not accept '#' as
  part of the URI component]
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit 965fb74eb180ab4f275ef907e018128e7eee0e69)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit e9903d6073ce9ff0ed8b304700e9d2b435ed8050)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit c47814a58ec153a526e8e9e822cda6e66cef5cc2)
[wt: minor ctx adj]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit 3706e1754b925e56951b604cce63f3bb290ed838)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit b5062da485e78f4448a617a0f8b67dc5b23065d5)
[wt: dropped pathq]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit 1ee98d04314d35b694206195b8399c501776afc5)
[wt: allow to run with version 1.8]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
 doc/configuration.txt                    | 15 ++++++++++++---
 reg-tests/http-rules/fragment_in_uri.vtc |  2 +-
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index b30aaa9fb..c0607519a 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -5433,7 +5433,8 @@ no option accept-invalid-http-request
   remaining ones are blocked by default unless this option is enabled. This
   option also relaxes the test on the HTTP version, it allows HTTP/0.9 requests
   to pass through (no version specified) and multiple digits for both the major
-  and the minor version.
+  and the minor version. Finally, this option also allows incoming URLs to
+  contain fragment references ('#' after the path).
 
   This option should never be enabled by default as it hides application bugs
   and open security breaches. It should only be deployed after a problem has
@@ -15328,7 +15329,11 @@ path : string
   information from databases and keep them in caches. Note that with outgoing
   caches, it would be wiser to use "url" instead. With ACLs, it's typically
   used to match exact file names (e.g. "/login.php"), or directory parts using
-  the derivative forms. See also the "url" and "base" fetch methods.
+  the derivative forms. See also the "url" and "base" fetch methods. Please
+  note that any fragment reference in the URI ('#' after the path) is strictly
+  forbidden by the HTTP standard and will be rejected. However, if the frontend
+  receiving the request has "option accept-invalid-http-request", then this
+  fragment part will be accepted and will also appear in the path.
 
   ACL derivatives :
     path     : exact string match
@@ -15502,7 +15507,11 @@ url : string
   "path" is preferred over using "url", because clients may send a full URL as
   is normally done with proxies. The only real use is to match "*" which does
   not match in "path", and for which there is already a predefined ACL. See
-  also "path" and "base".
+  also "path" and "base". Please note that any fragment reference in the URI
+  ('#' after the path) is strictly forbidden by the HTTP standard and will be
+  rejected. However, if the frontend receiving the request has "option
+  accept-invalid-http-request", then this fragment part will be accepted and
+  will also appear in the url.
 
   ACL derivatives :
     url     : exact string match
diff --git a/reg-tests/http-rules/fragment_in_uri.vtc 
b/reg-tests/http-rules/fragment_in_uri.vtc
index 621751356..8de0adeb2 100644
--- a/reg-tests/http-rules/fragment_in_uri.vtc
+++ b/reg-tests/http-rules/fragment_in_uri.vtc
@@ -1,5 +1,5 @@
 varnishtest "check for fragments in URL"
-#REQUIRE_VERSION=2.0
+#REQUIRE_VERSION=1.8
 
 # This reg-test checks that '#' is properly blocked in requests
 
-- 
2.35.3

>From e55c2ade33b74ccf636e18feae0d158683bc1b34 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w...@1wt.eu>
Date: Tue, 8 Aug 2023 15:23:19 +0200
Subject: MINOR: ist: add new function ist_find_range() to find a character
 range

This looks up the character range <min>..<max> in the input string and
returns a pointer to the first one found. It's essentially the equivalent
of ist_find_ctl() in that it searches by 32 or 64 bits at once, but deals
with a range.

(cherry picked from commit 197668de975e495f0c0f0e4ff51b96203fa9842d)
 [ad: backported for following fix : BUG/MINOR: h2: reject more chars
 from the :path pseudo header]
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit 451ac6628acc4b9eed3260501a49c60d4e4d4e55)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit 3468f7f8e04c9c5ca5c985c7511e05e78fe1eded)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit b375df60341c7f7a4904c2d8041a09c66115c754)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit edcff741698c9519dc44f3aa13de421baad7ff43)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit cbac8632582d82a1452ccb3fe3c38196e8ad9f45)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit 77c014ea018b80095329402264ae8887398ef4e8)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
 include/common/ist.h | 47 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/include/common/ist.h b/include/common/ist.h
index 986e1df9f..5eb8bf23b 100644
--- a/include/common/ist.h
+++ b/include/common/ist.h
@@ -407,6 +407,53 @@ static inline const char *ist_find_ctl(const struct ist 
ist)
        return NULL;
 }
 
+/* Returns a pointer to the first character found <ist> that belongs to the
+ * range [min:max] inclusive, or NULL if none is present. The function is
+ * optimized for strings having no such chars by processing up to sizeof(long)
+ * bytes at once on architectures supporting efficient unaligned accesses.
+ * Despite this it is not very fast (~0.43 byte/cycle) and should mostly be
+ * used on low match probability when it can save a call to a much slower
+ * function. Will not work for characters 0x80 and above. It's optimized for
+ * min and max to be known at build time.
+ */
+static inline const char *ist_find_range(const struct ist ist, unsigned char 
min, unsigned char max)
+{
+       const union { unsigned long v; } __attribute__((packed)) *u;
+       const char *curr = (void *)ist.ptr - sizeof(long);
+       const char *last = curr + ist.len;
+       unsigned long l1, l2;
+
+       /* easier with an exclusive boundary */
+       max++;
+
+       do {
+               curr += sizeof(long);
+               if (curr > last)
+                       break;
+               u = (void *)curr;
+               /* add 0x<min><min><min><min>..<min> then subtract
+                * 0x<max><max><max><max>..<max> to the value to generate a
+                * carry in the lower byte if the byte contains a lower value.
+                * If we generate a bit 7 that was not there, it means the byte
+                * was min..max.
+                */
+               l2  = u->v;
+               l1  = ~l2 & ((~0UL / 255) * 0x80); /* 0x808080...80 */
+               l2 += (~0UL / 255) * min;          /* 0x<min><min>..<min> */
+               l2 -= (~0UL / 255) * max;          /* 0x<max><max>..<max> */
+       } while ((l1 & l2) == 0);
+
+       last += sizeof(long);
+       if (__builtin_expect(curr < last, 0)) {
+               do {
+                       if ((unsigned char)(*curr - min) < (unsigned char)(max 
- min))
+                               return curr;
+                       curr++;
+               } while (curr < last);
+       }
+       return NULL;
+}
+
 /* looks for first occurrence of character <chr> in string <ist> and returns
  * the tail of the string starting with this character, or (ist.end,0) if not
  * found.
-- 
2.35.3

>From 7a18c6a2887b542896a2a0242189e7035155f0d5 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Thu, 22 Oct 2020 14:37:12 +0200
Subject: MINOR: ist: Add istend() function to return a pointer to the end of
 the string

istend() is a shortcut to istptr() + istlen().

(cherry picked from commit cf26623780bdd66f4fff4154d0e5081082aff89b)
[wt: needed for next fix]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit b12ab9c04a896a90383dbaf5c808a6d9a26cde98)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit 7a62a17abd2cc6f14a3cca47043db0061e2f6664)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
 include/common/ist.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/common/ist.h b/include/common/ist.h
index 5eb8bf23b..fbbfcbef7 100644
--- a/include/common/ist.h
+++ b/include/common/ist.h
@@ -119,6 +119,12 @@ static inline size_t istlen(const struct ist ist)
        return ist.len;
 }
 
+/* returns the pointer to the end the string */
+static inline char *istend(const struct ist ist)
+{
+       return (ist.ptr + ist.len);
+}
+
 /* skips to next character in the string, always stops at the end */
 static inline struct ist istnext(const struct ist ist)
 {
-- 
2.35.3

>From 1d5e49737cf815f3a65d677c26bbf7ce56112458 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w...@1wt.eu>
Date: Tue, 8 Aug 2023 15:24:54 +0200
Subject: MINOR: http: add new function http_path_has_forbidden_char()

As its name implies, this function checks if a path component has any
forbidden headers starting at the designated location. The goal is to
seek from the result of a successful ist_find_range() for more precise
chars. Here we're focusing on 0x00-0x1F, 0x20 and 0x23 to make sure
we're not too strict at this point.

(cherry picked from commit 30f58f4217d585efeac3d85cb1b695ba53b7760b)
 [ad: backported for following fix : BUG/MINOR: h2: reject more chars
  from the :path pseudo header]
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit b491940181a88bb6c69ab2afc24b93a50adfa67c)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit f7666e5e43ce63e804ebffdf224d92cfd3367282)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit c699bb17b7e334c9d56e829422e29e5a204615ec)
[wt: adj minor ctx in http.h]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit 0f57ac20b046b70275192651d7b6c978032e6a36)
[wt: adj minor ctx in http.h]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit 921f79588c6180c406e88236228a5be1c5c67c55)
[wt: applied to h2.c like has_forbidden_char since it will be used there]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit cedfa791d1a5fd03ec6b77bfa495341af37a26c3)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
 src/h2.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/src/h2.c b/src/h2.c
index e5351d72e..014e40212 100644
--- a/src/h2.c
+++ b/src/h2.c
@@ -49,6 +49,26 @@ static int has_forbidden_char(const struct ist ist, const 
char *start)
        return 0;
 }
 
+/* Looks into <ist> for forbidden characters for :path values (0x00..0x1F,
+ * 0x20, 0x23), starting at pointer <start> which must be within <ist>.
+ * Returns non-zero if such a character is found, 0 otherwise. When run on
+ * unlikely header match, it's recommended to first check for the presence
+ * of control chars using ist_find_ctl().
+ */
+static inline int http_path_has_forbidden_char(const struct ist ist, const 
char *start)
+{
+       do {
+               if ((uint8_t)*start <= 0x23) {
+                       if ((uint8_t)*start < 0x20)
+                               return 1;
+                       if ((1U << ((uint8_t)*start & 0x1F)) & ((1<<3) | 
(1<<0)))
+                               return 1;
+               }
+               start++;
+       } while (start < istend(ist));
+       return 0;
+}
+
 /* Prepare the request line into <*ptr> (stopping at <end>) from pseudo headers
  * stored in <phdr[]>. <fields> indicates what was found so far. This should be
  * called once at the detection of the first general header field or at the end
-- 
2.35.3

>From 5f9b9c909399b51498ddabb39341416381fc06a2 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w...@1wt.eu>
Date: Tue, 8 Aug 2023 15:38:28 +0200
Subject: MINOR: h2: pass accept-invalid-http-request down the request parser

We're adding a new argument "relaxed" to h2_make_htx_request() so that
we can control its level of acceptance of certain invalid requests at
the proxy level with "option accept-invalid-http-request". The goal
will be to add deactivable checks that are still desirable to have by
default. For now no test is subject to it.

(cherry picked from commit d93a00861d714313faa0395ff9e2acb14b0a2fca)
 [ad: backported for following fix : BUG/MINOR: h2: reject more chars
  from the :path pseudo header]
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit b6be1a4f858eb6602490c192235114c1a163fef9)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit 26fa3a285df0748fc79e73e552161268b66fb527)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit 014945a1508f43e88ac4e89950fa9037e4fb0679)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit f86e994f5fb5851cd6e4f7f6b366e37765014b9f)
[wt: adjusted ctx in h2.h]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit d87aeb80c45cc504274188f0e5048148f3c4f2ff)
[wt: extended to h2_make_h1_request() as well for legacy mode]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit f2436eab7d21bab3d85cb750023a1770411f716e)
[wt: only kept the legacy mode part (h2-to-h1)]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
 include/common/h2.h | 2 +-
 src/h2.c            | 6 +++++-
 src/mux_h2.c        | 3 ++-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/common/h2.h b/include/common/h2.h
index 0cecc2d4e..ef15f3cda 100644
--- a/include/common/h2.h
+++ b/include/common/h2.h
@@ -180,7 +180,7 @@ enum h2_err {
 
 /* various protocol processing functions */
 
-int h2_make_h1_request(struct http_hdr *list, char *out, int osize, unsigned 
int *msgf);
+int h2_make_h1_request(struct http_hdr *list, char *out, int osize, unsigned 
int *msgf, int relaxed);
 
 /*
  * Some helpful debugging functions.
diff --git a/src/h2.c b/src/h2.c
index 014e40212..cb40b2e1b 100644
--- a/src/h2.c
+++ b/src/h2.c
@@ -166,8 +166,12 @@ static int h2_prepare_h1_reqline(uint32_t fields, struct 
ist *phdr, char **ptr,
  *
  * The Cookie header will be reassembled at the end, and for this, the <list>
  * will be used to create a linked list, so its contents may be destroyed.
+ *
+ * When <relaxed> is non-nul, some non-dangerous checks will be ignored. This
+ * is in order to satisfy "option accept-invalid-http-request" for
+ * interoperability purposes.
  */
-int h2_make_h1_request(struct http_hdr *list, char *out, int osize, unsigned 
int *msgf)
+int h2_make_h1_request(struct http_hdr *list, char *out, int osize, unsigned 
int *msgf, int relaxed)
 {
        struct ist phdr_val[H2_PHDR_NUM_ENTRIES];
        char *out_end = out + osize;
diff --git a/src/mux_h2.c b/src/mux_h2.c
index 79e70f60b..ecd9c59f8 100644
--- a/src/mux_h2.c
+++ b/src/mux_h2.c
@@ -2844,7 +2844,8 @@ static int h2_frt_decode_headers(struct h2s *h2s, struct 
buffer *buf, int count)
 
        /* OK now we have our header list in <list> */
        msgf = (h2c->dff & H2_F_DATA_END_STREAM) ? 0 : H2_MSGF_BODY;
-       outlen = h2_make_h1_request(list, bi_end(buf), try, &msgf);
+       outlen = h2_make_h1_request(list, bi_end(buf), try, &msgf,
+                                   !!(((const struct session 
*)h2c->conn->owner)->fe->options2 & PR_O2_REQBUG_OK));
 
        if (outlen < 0) {
                h2c_error(h2c, H2_ERR_COMPRESSION_ERROR);
-- 
2.35.3

>From d81b4c952dae3468e73f4df701c62ac3a8644ba0 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <w...@1wt.eu>
Date: Tue, 8 Aug 2023 15:40:49 +0200
Subject: BUG/MINOR: h2: reject more chars from the :path pseudo header

This is the h2 version of this previous fix:

    BUG/MINOR: h1: do not accept '#' as part of the URI component

In addition to the current NUL/CR/LF, this will also reject all other
control chars, the space and '#' from the :path pseudo-header, to avoid
taking the '#' for a part of the path. It's still possible to fall back
to the previous behavior using "option accept-invalid-http-request".

This patch modifies the request parser to change the ":path" pseudo header
validation function with a new one that rejects 0x00-0x1F (control chars),
space and '#'. This way such chars will be dropped early in the chain, and
the search for '#' doesn't incur a second pass over the header's value.

This should be progressively backported to stable versions, along with the
following commits it relies on:

     REGTESTS: http-rules: add accept-invalid-http-request for normalize-uri 
tests
     REORG: http: move has_forbidden_char() from h2.c to http.h
     MINOR: ist: add new function ist_find_range() to find a character range
     MINOR: http: add new function http_path_has_forbidden_char()
     MINOR: h2: pass accept-invalid-http-request down the request parser

(cherry picked from commit b3119d4fb4588087e2483a80b01d322683719e29)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit 462a8600ce9e478573a957e046b446a7dcffd286)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit 648e59e30723b8fd4e71aab02cb679f6ea7446e7)
Signed-off-by: Amaury Denoyelle <adenoye...@haproxy.com>
(cherry picked from commit c8e07f2fd8b5462527f102f7145d6027c0d041da)
[wt: minor ctx adjustments]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit af232e47e6264122bed3681210b054ff38ec8de8)
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit e0c9008874b89621449f7ff3e9bc6db4e94fac6d)
[wt: note: added as well for legacy mode, though since h2 is turned
     to h1 in this mode, this will be rejected anyway]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
(cherry picked from commit ad05bf865cdc77e1c48d2e608ef8c39bd6c08c31)
[wt: dropped the htx part]
Signed-off-by: Willy Tarreau <w...@1wt.eu>
---
 src/h2.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/h2.c b/src/h2.c
index cb40b2e1b..ff8ae4572 100644
--- a/src/h2.c
+++ b/src/h2.c
@@ -208,9 +208,15 @@ int h2_make_h1_request(struct http_hdr *list, char *out, 
int osize, unsigned int
                /* RFC7540#10.3: intermediaries forwarding to HTTP/1 must take 
care of
                 * rejecting NUL, CR and LF characters.
                 */
-               ctl = ist_find_ctl(list[idx].v);
-               if (unlikely(ctl) && has_forbidden_char(list[idx].v, ctl))
-                       goto fail;
+               if (phdr == H2_PHDR_IDX_PATH && !relaxed) {
+                       ctl = ist_find_range(list[idx].v, 0, '#');
+                       if (unlikely(ctl) && 
http_path_has_forbidden_char(list[idx].v, ctl))
+                               goto fail;
+               } else {
+                       ctl = ist_find_ctl(list[idx].v);
+                       if (unlikely(ctl) && has_forbidden_char(list[idx].v, 
ctl))
+                               goto fail;
+               }
 
                if (phdr > 0 && phdr < H2_PHDR_NUM_ENTRIES) {
                        /* insert a pseudo header by its index (in phdr) and 
value (in value) */
-- 
2.35.3

Reply via email to