Re: Thundering herd patch for mod_cache

Graham Leggett Sun, 10 Jan 2010 02:28:52 -0800

On 2010/01/10 1:43 AM, Nick Kew wrote:

> - In ap_cache_try_lock - do we really need the hashed directories
> hardwired?  I thought most modern filesystems had abandoned
> linear search, so that kind of thing is redundant!
> At least make it optional - perhaps an additional arg to
> the Cachelock directive.


I was trying to avoid yet-another-directive. The locks files are very
short lived, so there should be few of them at any given time. Is it
reasonable to assume that no modern platforms suffer linear search?
(*cough* Windows *cough* :) ).

> - Why is ap_cache_remove_lock called from so many places
> rather than just use a pool cleanup, and maybe in
> cache_save_filter as the earliest possible point in the
> normal processing path?

Because it's a lock - we want to unlock it at the earliest possible
point, as soon as we've determined the lock isn't necessary any more. We
definitely don't want to keep an URL locked waiting for a slow client to
eventually terminate the request and destroy the pool hundreds or
thousands of milliseconds later (or until the lock reaches it's maxage,
whichever is sooner). I could have created a lock in a subpool and then
created the lock from the subpool, but that now adds the overhead of the
creation of a subpool, instead of just creation of the lock itself.

> - Do we need to use lock files like this?  Not I think in
> every case: with mod_disk_cache we already have files we
> could lock (and create if they don't already exist).

This is how I started to look at the problem, but I discovered there is
ultimately no overlap between the thundering herd lock and
mod_disk_cache. The thundering herd lock locks URLs right at the start
of the request, while mod_disk_cache decides to cache and create files
at the very end of the request. If we tried to lock cache files in
mod_disk_cache, we would still leave a wide gap of time between the
start of the request and the time the backend server sent a response,
which in many cases can be many hundreds of milliseconds later, and
enough to allow hundreds or thousands of requests through depending on load.

> - Your ELSE clause (cache_util, line 623) for requests
> for a stale object that is being refreshed by another
> request serves the stale object and adds a warning.
> Is this the best thing to do?  What about waiting for
> the file to be fetched?  Should still be quicker than
> going to the backend in parallel with the other request.

If the lock kept you waiting for 1000 milliseconds, and 1000 requests
arrived in that time[1], you could very quickly tie up all the httpd
children and start rejecting requests. As we have a stale cached copy of
the request at hand, we might as well serve the stale copy in the mean
time and shed those 1000 requests, rather than leaving them backed up
behind us.

[1] These are the kinds of numbers we have in the environment we have
that needed the thundering herd lock.

Regards,
Graham
--

Re: Thundering herd patch for mod_cache

Reply via email to