Bug 2468:
  http://www.squid-cache.org/bugs/show_bug.cgi?id=2468
Patch attached.

I marked it as blocking 2255, as that's the bug for porting stale-if-error to 3.x.



On 18/09/2008, at 10:39 PM, Amos Jeffries wrote:

Mark Nottingham wrote:
I've got a user who's running a pair of peered accelerators, using both stale-while-revalidate and max_stale. Occasionally, they see extremely old content being served; e.g., if CC: max-age is 60s, they might see something go by which is 1000-3000 seconds old (but still within the max_stale window).

The pattern that appears to trigger this is when a resource with an in-cache 200 response starts returning 404s; when this happens, Squid will start returning TCP_NEGATIVE_HIT/200s. E.g. (traffic driven by squidclient):

   1221713703.815      0 127.0.0.1 TCP_STALE_HIT/200 5234 GET http://server1/5012904 - NONE/- application/json
   1221713703.979    164 0.0.0.0 TCP_ASYNC_MISS/404 193 GET http://server1/5012904 - FIRST_UP_PARENT/back-end-server1 text/plain
   1221713711.431      0 127.0.0.1 TCP_NEGATIVE_HIT/200 5234 GET http://server1/5012904 - NONE/- application/json
   1221713720.978      0 127.0.0.1 TCP_NEGATIVE_HIT/200 5234 GET http://server1/5012904 - NONE/- application/json
   1221713723.483      0 127.0.0.1 TCP_NEGATIVE_HIT/200 5234 GET http://server1/5012904 - NONE/- application/json

As you can see, stale-while-revalidate kicks in, and the async refresh brings back a 404, but that doesn't get stored properly.

Looking at the code, I *think* the culprit is storeNegativeCache(), which will, assuming that max_stale is set (either in squid.conf or response headers), block the new response from updating the cache -- no matter what its status code is. It makes sense to do this for 5xx status codes, because they're often transient and reflect server-side problems. It doesn't make as much sense to do this for 4xx status codes, which reflect client-side issues. In those cases, you always want to update the cache with the most recent response (and potentially negative cache it, if the server is silly enough to not put a freshness lifetime on it).

The interesting thing, BTW, is that this only happens when collapsed forwarding is on, because this in httpReplyProcessHeader:

   if (neighbors_do_private_keys && !Config.onoff.collapsed_forwarding)
       httpMaybeRemovePublic(entry, reply);

masks this behaviour.
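For reference, the setup involved looks roughly like this (a hypothetical squid.conf fragment, not the user's actual config; hostnames invented, directives per Squid 2.7 -- stale-while-revalidate itself comes from the response's Cache-Control header rather than squid.conf):

   # accelerator fronting an origin server
   http_port 80 accel defaultsite=www.example.com
   cache_peer back-end-server1 parent 80 0 no-query originserver
   # allow serving stale objects up to an hour past expiry
   max_stale 1 hour
   # the behaviour described above only appears with this on
   collapsed_forwarding on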
Thoughts? I'm not 100% on this diagnosis, as the use of peering and stale-while-revalidate makes things considerably more complex, but I've had pretty good luck reproducing it... I'm happy to attempt a fix, but wanted input on what approach people preferred. Left to my own devices, I'd add another condition to this in storeNegativeCache():

   if (oe && !EBIT_TEST(oe->flags, KEY_PRIVATE) && !EBIT_TEST(oe->flags, ENTRY_REVALIDATE))

to limit it to 5xx responses.
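Roughly, the kind of guard I have in mind (an untested sketch; it assumes the new reply's status is reachable at that point as reply->sline.status, per the Squid 2 HttpReply layout):

   /* Sketch only: keep the stale object around just for transient
    * server-side (5xx) errors; let 4xx replies, like the 404s above,
    * replace the cached entry as usual. */
   if (oe && !EBIT_TEST(oe->flags, KEY_PRIVATE) &&
       !EBIT_TEST(oe->flags, ENTRY_REVALIDATE) &&
       reply->sline.status >= HTTP_INTERNAL_SERVER_ERROR)

That way only 5xx responses would leave the old object in place; anything else would update the cache with the most recent response.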

I'd agree with you based on that analysis. Can you add a bugzilla entry with a patch that does it?

Amos
--
Please use Squid 2.7.STABLE4 or 3.0.STABLE9

--
Mark Nottingham       [EMAIL PROTECTED]

