Will do tomorrow.
On 18/09/2008, at 10:39 PM, Amos Jeffries wrote:
Mark Nottingham wrote:
I've got a user who's running a pair of peered accelerators, using
both stale-while-revalidate and max_stale.
Occasionally, they see extremely old content being served; e.g., if
CC: max-age is 60s, they might see something go by which is
1000-3000 seconds old (but still within the max_stale window).
The pattern that appears to trigger this is when a resource with an
in-cache 200 response starts returning 404s; when this happens,
Squid will start returning TCP_NEGATIVE_HIT/200's. E.g. (traffic
driven by squidclient),
1221713703.815 0 127.0.0.1 TCP_STALE_HIT/200 5234 GET http://server1//5012904 - NONE/- application/json
1221713703.979 164 0.0.0.0 TCP_ASYNC_MISS/404 193 GET http://server1/5012904 - FIRST_UP_PARENT/back-end-server1 text/plain
1221713711.431 0 127.0.0.1 TCP_NEGATIVE_HIT/200 5234 GET http://server1/5012904 - NONE/- application/json
1221713720.978 0 127.0.0.1 TCP_NEGATIVE_HIT/200 5234 GET http://server1/5012904 - NONE/- application/json
1221713723.483 0 127.0.0.1 TCP_NEGATIVE_HIT/200 5234 GET http://server1/5012904 - NONE/- application/json
As you can see, stale-while-revalidate kicks in, and the async
refresh brings back a 404, but that doesn't get stored properly.
Looking at the code, I *think* the culprit is storeNegativeCache(),
which will, assuming that max_stale is set (either in squid.conf or
response headers), block the new response from updating the cache
-- no matter what its status code is.
It makes sense to do this for 5xx status codes, because they're
often transient, and reflect server-side problems. It doesn't make
as much sense to do this for 4xx status codes, which reflect client-
side issues. In those cases, you always want to update the cache
with the most recent response (and potentially negative cache it,
if the server is silly enough to not put a freshness lifetime on it).
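As a rough sketch of the policy I have in mind (purely illustrative helper, not something that exists in the tree today):

    /* Illustrative only: treat 5xx as "transient" server-side errors
     * that justify continuing to serve the stale entry; anything else
     * (notably 4xx) should be allowed to replace it. */
    static int
    statusIsTransientError(int status)
    {
        return status >= 500 && status < 600;
    }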
The interesting thing, BTW, is that this only happens when
collapsed forwarding is on, because this check in httpReplyProcessHeader:
    if (neighbors_do_private_keys && !Config.onoff.collapsed_forwarding)
        httpMaybeRemovePublic(entry, reply);
masks this behaviour.
Thoughts? I'm not 100% sure of this diagnosis, as the use of peering and
stale-while-revalidate makes things considerably more complex, but
I've had pretty good luck reproducing it... I'm happy to attempt a
fix, but wanted input on what approach people preferred. Left to my
own devices, I'd add another condition to this in
storeNegativeCache():
    if (oe && !EBIT_TEST(oe->flags, KEY_PRIVATE) && !EBIT_TEST(oe->flags, ENTRY_REVALIDATE))
to limit it to 5xx responses.
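i.e. roughly (sketch only; I haven't checked exactly how the new reply's
status is reachable at that point, so the e->mem_obj->reply->sline.status
part is a guess):

    if (oe && !EBIT_TEST(oe->flags, KEY_PRIVATE) &&
        !EBIT_TEST(oe->flags, ENTRY_REVALIDATE) &&
        /* new: only keep serving the old entry when the fresh response
         * is a transient 5xx; the status lookup here is approximate */
        statusIsTransientError(e->mem_obj->reply->sline.status))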
I'd agree with you based on that analysis. Can you add a bugzilla
entry with a patch that does it?
Amos
--
Please use Squid 2.7.STABLE4 or 3.0.STABLE9
--
Mark Nottingham [EMAIL PROTECTED]