It seems to me that the thundering herd / performance degradation is inherent to Apache's design: all threads/processes are exact clones.
I think a more suitable design for this task would give each process a special purpose: cache maintenance (purging expired entries, purging entries to make room for new ones, creating new entries, and so on), request processing (network/disk I/O, content filtering, and so on), or whatever else is needed. This way, the performance degradation caused by the cache mutex can be minimized. Request processors would only get queued/locked when querying the cache, which can be made a single operation if the cache is smart enough to figure out the right response from the original request, right?

Regards,
--
Gonzalo A. Arana
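
P.S. A rough sketch of the split described above, in Python only for brevity. Nothing here is Apache code: Cache, handle_request, maintenance_loop, the pending queue and the 60-second TTL are all made-up names and values, and threads stand in for processes.

```python
import queue
import threading
import time


class Cache:
    """Hypothetical cache: request processors take the lock once per query."""

    def __init__(self, max_entries=100):
        self._lock = threading.Lock()
        self._entries = {}            # key -> (expires_at, body)
        self._max_entries = max_entries
        self.pending = queue.Queue()  # new entries handed to the maintainer

    def lookup(self, key, now=None):
        """The single locked operation used by request processors."""
        now = time.monotonic() if now is None else now
        with self._lock:
            entry = self._entries.get(key)
        if entry is None or entry[0] <= now:
            return None
        return entry[1]

    def maintain_once(self, now=None):
        """Run only by the maintenance worker: insert, expire, evict."""
        now = time.monotonic() if now is None else now
        new = []
        while True:
            try:
                new.append(self.pending.get_nowait())
            except queue.Empty:
                break
        with self._lock:
            for key, ttl, body in new:
                self._entries[key] = (now + ttl, body)
            expired = [k for k, (exp, _) in self._entries.items() if exp <= now]
            for k in expired:
                del self._entries[k]
            # Make room by evicting the entries closest to expiry.
            while len(self._entries) > self._max_entries:
                victim = min(self._entries, key=lambda k: self._entries[k][0])
                del self._entries[victim]


def handle_request(cache, key, fetch):
    """Request processor: one cache query, then I/O with no lock held."""
    body = cache.lookup(key)
    if body is None:
        body = fetch(key)                     # slow origin fetch
        cache.pending.put((key, 60.0, body))  # maintainer stores it later
    return body


def maintenance_loop(cache, stop, interval=0.1):
    """Dedicated maintenance worker; runs until the stop event is set."""
    while not stop.is_set():
        cache.maintain_once()
        stop.wait(interval)
```

Request processors hold the mutex only for one dictionary lookup; a miss is fetched outside the lock and handed to the maintainer through the queue. The sketch does not stop several processors from fetching the same missing key at once, so that part of the herd problem would still need separate handling.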