Hi Tim, On Mon, Dec 28, 2020 at 12:47:52PM +0100, Tim Duesterhus wrote: > This patch fixes GitHub Issue #988. Commit > ce9e7b25217c46db1ac636b2c885a05bf91ae57e > was not sufficient, because it fell back to a hash comparison if the bitmap > of known encodings was not acceptable instead of directly returning the the > cached response is not compatible. > > This patch also extends the reg-test to test the hash collision that was > mentioned in #988. > > Vary handling is 2.4, no backport needed.
Thanks for this, but while your patch looks valid, the regtest fails for me: **** c1 http[ 0] |HTTP/1.1 **** c1 http[ 1] |200 **** c1 http[ 2] |OK **** c1 http[ 3] |content-type: text/plain **** c1 http[ 4] |vary: accept-encoding **** c1 http[ 5] |cache-control: max-age=5 **** c1 http[ 6] |content-length: 48 **** c1 http[ 7] |age: 0 **** c1 http[ 8] |x-cache-hit: 1 **** c1 c-l|!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNO **** c1 bodylen = 48 ** c1 === expect resp.status == 200 **** c1 EXPECT resp.status (200) == "200" match ** c1 === expect resp.bodylen == 45 ---- c1 EXPECT resp.bodylen (48) == "45" failed I don't know if it's the code or the test, but we're having an issue here it seems. Note that this response has no content-encoding so there might be something caused by this. On a related note, I think we're still having an issue that was revealed by your discovery of the bug and the fix. Indeed, initially there was only the hash for the list of values. The goal was to simply have an easy to compare value for a set of known encodings. Now for the known ones we have a bitmap. Thus we shouldn't let the known values contribute to the hash anymore. I remain convinced that now that we have the bitmap, we're probably deploying too many efforts to allow caching of unknown encodings with all the consequences it implies, and that just like in the past we would simply not cache when Vary was presented, not caching when an unknown encoding is presented would seem perfectly acceptable to me, especially given the already wide list of known encodings. And when I see the test using "Accept-Encoding: br,gzip,xxx,jdcqiab", this comforts me in this idea, because my feeling here is that the client supports two known encodings and other ones, if we find an object in the cache using no more than these encodings, it can be used, otherwise the object must be fetched from the server, and the response should only be cached if it uses known encodings. And if the server uses "jdcqiab" as one of the encodings I don't care if it results in the object not being cached. Just my two cents, Willy