Remi,

please find a small fix for the 'Vary' support of the cache to correctly handle the difference between a missing 'accept-encoding' and an empty 'accept-encoding'.

Best regards
Tim Düsterhus
Developer WoltLab GmbH

--

WoltLab GmbH
Nedlitzer Str. 27B
14469 Potsdam

Tel.: +49 331 96784338

duester...@woltlab.com
www.woltlab.com

Managing director:
Marcel Werk

AG Potsdam HRB 26795 P
>From e33b3332a138319476d690acbf0aa20395df5598 Mon Sep 17 00:00:00 2001
From: Tim Duesterhus <duester...@woltlab.com>
Date: Fri, 18 Jun 2021 15:09:28 +0200
Subject: [PATCH] BUG/MINOR: cache: Correctly handle existing-but-empty
 'accept-encoding' header
To: haproxy@formilux.org
Cc: rlebre...@haproxy.com,
    w...@1wt.eu

RFC 7231#5.3.4 makes a difference between a completely missing
'accept-encoding' header and an 'accept-encoding' header without any values.

This case was already correctly handled by accident, because an empty accept
encoding does not match any known encoding. However this resulted in the
'other' encoding being added to the bitmap. Usually this also succeeds in
serving cached responses, because the cached response likely has no
'content-encoding', thus matching the identity case instead of not serving the
response, due to the 'other' encoding. But it's technically not 100% correct.

Fix this by special-casing 'accept-encoding' values with a length of zero and
extend the test to check that an empty accept-encoding is correctly handled.
Due to the reasons given above the test also passes without the change in
cache.c.

Vary support was added in HAProxy 2.4. This fix should be backported to 2.4+.
---
 reg-tests/cache/vary.vtc | 47 ++++++++++++++++++++++++++++++++++++++++
 src/cache.c              | 15 +++++++++++--
 2 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/reg-tests/cache/vary.vtc b/reg-tests/cache/vary.vtc
index 61d7af671..197f822b3 100644
--- a/reg-tests/cache/vary.vtc
+++ b/reg-tests/cache/vary.vtc
@@ -108,7 +108,18 @@ server s1 {
                -hdr "Content-Encoding: gzip" \
               -bodylen 188
 
+       rxreq
+       expect req.url == "/empty-vs-missing"
+       txresp -hdr "Content-Encoding: gzip" \
+               -hdr "Vary: accept-encoding" \
+               -hdr "Cache-Control: max-age=5" \
+               -bodylen 234
 
+       rxreq
+       expect req.url == "/empty-vs-missing"
+       txresp -hdr "Vary: accept-encoding" \
+               -hdr "Cache-Control: max-age=5" \
+               -bodylen 256
 } -start
 
 server s2 {
@@ -344,6 +355,42 @@ client c1 -connect ${h1_fe_sock} {
        expect resp.bodylen == 188
        expect resp.http.X-Cache-Hit == 0
 
+       # A missing 'Accept-Encoding' implies that anything is acceptable,
+       # while an empty 'Accept-Encoding' implies nothing is acceptable.
+
+       # Start by caching a gzip response.
+       txreq -url "/empty-vs-missing" -hdr "Accept-Encoding: gzip"
+       rxresp
+       expect resp.status == 200
+       expect resp.bodylen == 234
+       expect resp.http.content-encoding == "gzip"
+       expect resp.http.X-Cache-Hit == 0
+
+       # Check that it is cached.
+       txreq -url "/empty-vs-missing" -hdr "Accept-Encoding: gzip"
+       rxresp
+       expect resp.status == 200
+       expect resp.bodylen == 234
+       expect resp.http.content-encoding == "gzip"
+       expect resp.http.X-Cache-Hit == 1
+
+       # Check that the cached response is returned when no accept-encoding is
+       # specified.
+       txreq -url "/empty-vs-missing"
+       rxresp
+       expect resp.status == 200
+       expect resp.bodylen == 234
+       expect resp.http.content-encoding == "gzip"
+       expect resp.http.X-Cache-Hit == 1
+
+       # Check that the cached response is not returned when an empty
+       # accept-encoding is specified.
+       txreq -url "/empty-vs-missing" -hdr "Accept-Encoding:"
+       rxresp
+       expect resp.status == 200
+       expect resp.bodylen == 256
+       expect resp.http.content-encoding == "<undef>"
+       expect resp.http.X-Cache-Hit == 0
 
        # The following requests are treated by a backend that does not cache
        # responses containing a Vary header
diff --git a/src/cache.c b/src/cache.c
index bdb82583f..feab63f07 100644
--- a/src/cache.c
+++ b/src/cache.c
@@ -2280,6 +2280,19 @@ static int accept_encoding_normalizer(struct htx *htx, struct ist hdr_name,
 	/* Iterate over all the ACCEPT_ENCODING_MAX_ENTRIES first accept-encoding
 	 * values that might span acrosse multiple accept-encoding headers. */
 	while (http_find_header(htx, hdr_name, &ctx, 0) && count < ACCEPT_ENCODING_MAX_ENTRIES) {
+		count++;
+
+		/* As per RFC7231#5.3.4, "An Accept-Encoding header field with a
+		 * combined field-value that is empty implies that the user agent
+		 * does not want any content-coding in response."
+		 *
+		 * We must (and did) count the existence of this empty header to not
+		 * hit the `count == 0` case below, but must ignore the value to not
+		 * include VARY_ENCODING_OTHER into the final bitmap.
+		 */
+		if (istlen(ctx.value) == 0)
+			continue;
+
 		/* Turn accept-encoding value to lower case */
 		ist2bin_lc(istptr(ctx.value), ctx.value);
 
@@ -2294,8 +2307,6 @@ static int accept_encoding_normalizer(struct htx *htx, struct ist hdr_name,
 			/* Unknown encoding */
 			encoding_bitmap |= VARY_ENCODING_OTHER;
 		}
-
-		count++;
 	}
 
 	/* If a "*" was found in the accepted encodings (without a null weight),
-- 
2.32.0

Reply via email to