Grace,
Thanks for the thorough writeup.
Your suspicion about a race condition is likely correct, and the overlay
filesystem is almost certainly a contributing factor. OverlayFS has
well-documented timing issues where a file created in the upper layer may not
be immediately visible to concurrent reads -- especially under heavy I/O. This
would match your symptoms exactly: the .data file exists when you check after
the fact, but wasn't visible at the instant the concurrent request tried to
open it.
A couple of things to try:
Move the cache off the overlay filesystem. Mount a tmpfs or a bind-mounted
volume for your CacheRoot instead:
# docker run: bind-mount a host directory over the cache root
docker run ... -v /host/cache:/tmp/apache ...

# or in docker-compose, a tmpfs mount:
volumes:
  - type: tmpfs
    target: /tmp/apache
If the error disappears, that strongly implicates OverlayFS as the culprit.
CacheLock -- you're right that it only serializes cache updates (it's designed
to prevent the "thundering herd" problem where multiple requests try to refresh
the same expired entry simultaneously). It doesn't lock reads against
concurrent writes, so it won't help here.
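For reference, the lock-related knobs live alongside the rest of the disk-cache
config. A sketch of the relevant directives -- the paths and values here are
placeholders, not taken from your actual cache.conf:

```apache
# Sketch only; paths/values are illustrative, not your real config.
CacheRoot        /tmp/apache
CacheEnable      disk /
CacheLock        on
CacheLockPath    /tmp/mod_cache-lock
CacheLockMaxAge  5
```

Note that CacheLockMaxAge (in seconds) only bounds how long the refresh lock
is honored; none of these directives serialize readers against writers.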
If the issue persists even on a non-overlay filesystem, it may be worth filing
a bug at https://bz.apache.org/bugzilla/ -- there could be a genuine race
window between the .header and .data file creation in mod_cache_disk.
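To make the suspected window concrete, here is a toy model of the race in
Python -- emphatically not Apache's actual code, just two threads standing in
for a cache writer and a concurrent request. The writer creates the .header
before the .data, and the reader treats a visible .header as "entry cached":

```python
import os
import tempfile
import threading
import time

def writer(root, delay):
    # Simulate the cache writer creating the .header entry first...
    open(os.path.join(root, "AA.header"), "w").close()
    time.sleep(delay)  # ...leaving a window before the .data file exists
    open(os.path.join(root, "AA.data"), "w").close()

def reader(root, result):
    # Spin until the entry "looks cached" (the .header is visible)...
    while not os.path.exists(os.path.join(root, "AA.header")):
        time.sleep(0.001)
    try:
        # ...then try to serve the body, as the concurrent request would.
        open(os.path.join(root, "AA.data")).close()
        result.append("ok")
    except FileNotFoundError:  # analogous to AH00708 "Cannot open data file"
        result.append("ENOENT")

root = tempfile.mkdtemp()
result = []
w = threading.Thread(target=writer, args=(root, 0.05))
r = threading.Thread(target=reader, args=(root, result))
r.start()
w.start()
w.join()
r.join()
print(result[0])  # → ENOENT
```

Afterwards both files exist, which matches what you see when you inspect the
cache after the fact.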
Hope that helps
--Rich
On 2026/04/27 17:18:09 wrote:
> Hi all, this is the error we have been seeing in our apache HTTP logs:
>
> [cache_disk:error] [pid 421298:tid 421362] (2)No such file or directory:
> [client [REDACTED_IP]] AH00708: Cannot open data file
> /tmp/apache/[HASH_PATH]/AA.data, referer: https://[REFERER_URL]
>
> When I check the cache, that file already exists and was created about the
> same time the error message was triggered. I believe the problem could be a
> race condition by .header being created before .data can be read from a
> slightly earlier request, but I can’t prove this based on the file creation
> times which are rewritten after this error message is triggered and the cache
> is rewritten. I have been unable to recreate this problem even by triggering
> a request with an extremely large .data file that takes longer than
> [mod_cache: CacheLockMaxAge] to write
>
> I have ruled out the problem being that our cache is full and files can’t be
> written or htcacheclean being the problem by starting with a clean cache that
> begins to get these errors before it reaches its max size. I verified that I
> am getting multiple consecutive requests when I see this error
>
> Some information about our configuration:
>
> * cache.conf has CacheLock on
>
> * This only happens on our production environment which gets millions of
> daily requests
>
> * I do not see any other relevant errors when I turn debug mode on
>
> * This is an infrequent error message, but it is reoccurring
>
> * We are using an overlay file system in a Docker container
>
> Has anyone encountered this issue before? I have not found an apache HTTP
> setting to lock reading the cache, only writing the cache
>
>
> Thank you,
>
> Grace
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]