Hi Tim,

On Mon, Dec 28, 2020 at 12:47:52PM +0100, Tim Duesterhus wrote:
> This patch fixes GitHub Issue #988. Commit 
> ce9e7b25217c46db1ac636b2c885a05bf91ae57e
> was not sufficient, because it fell back to a hash comparison if the bitmap
> of known encodings was not acceptable instead of directly returning the the
> cached response is not compatible.
> 
> This patch also extends the reg-test to test the hash collision that was
> mentioned in #988.
> 
> Vary handling is 2.4, no backport needed.

Thanks for this, but while your patch looks valid, the regtest fails
for me:

**** c1    http[ 0] |HTTP/1.1
**** c1    http[ 1] |200
**** c1    http[ 2] |OK
**** c1    http[ 3] |content-type: text/plain
**** c1    http[ 4] |vary: accept-encoding
**** c1    http[ 5] |cache-control: max-age=5
**** c1    http[ 6] |content-length: 48
**** c1    http[ 7] |age: 0
**** c1    http[ 8] |x-cache-hit: 1
**** c1    c-l|!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNO
**** c1    bodylen = 48
**   c1    === expect resp.status == 200
**** c1    EXPECT resp.status (200) == "200" match
**   c1    === expect resp.bodylen == 45
---- c1    EXPECT resp.bodylen (48) == "45" failed

I don't know if it's the code or the test, but we're having an issue here
it seems. Note that this response has no content-encoding so there might
be something caused by this.

On a related note, I think we're still having an issue that was revealed
by your discovery of the bug and the fix. Indeed, initially there was
only the hash for the list of values. The goal was to simply have an
easy to compare value for a set of known encodings. Now for the known
ones we have a bitmap. Thus we shouldn't let the known values contribute
to the hash anymore.

I remain convinced that now that we have the bitmap, we're probably
deploying too many efforts to allow caching of unknown encodings with
all the consequences it implies, and that just like in the past we would
simply not cache when Vary was presented, not caching when an unknown
encoding is presented would seem perfectly acceptable to me, especially
given the already wide list of known encodings. And when I see the test
using "Accept-Encoding: br,gzip,xxx,jdcqiab", this comforts me in this
idea, because my feeling here is that the client supports two known
encodings and other ones, if we find an object in the cache using no
more than these encodings, it can be used, otherwise the object must
be fetched from the server, and the response should only be cached if
it uses known encodings. And if the server uses "jdcqiab" as one of
the encodings I don't care if it results in the object not being cached.

Just my two cents,
Willy

Reply via email to