I'm sorry that this mail is long, but I wanted to include the necessary
details.
This is my scenario:
Software:
Apache httpd 2.4.6 VC10 Windows 32 bits binary built with OpenSL 0.9.8y
from www.apachelounge.com.
Frontend server:
lots of bandwidth
caching reverse proxy setup using httpd 2.4.6 + mod_cache +
mod_cache_disk + mod_proxy + mod_proxy_httpd on Windows 2003 Server
Backend server:
very limited bandwidth (so it is very important for me to minimize the
data transferred between the frontend and backend servers)
Frontend server virtual host configuration:
<VirtualHost 192.168.10.8:443>
ServerName front.example.com
CacheEnable disk /dir/
CacheIgnoreHeaders Set-Cookie
CacheLock On
CacheLockPath C:/ApacheCache/Locks
CacheRoot C:/ApacheCache/Cache
CacheDirLevels 1
CacheDirLength 1
CacheMaxFileSize 10000000
<Directory C:/ApacheRoot/dir>
RewriteEngine on
RewriteRule ^update/$ http://back.example.com/update.cfm [P,L]
Header unset Set-Cookie
</Directory>
</VirtualHost>
I have studied the documentation of mod_cache as well as different
documents regarding HTTP caching but I am encountering some behaviour I
cannot explain with this setup and hope that someone is able to help me.
Problem 1: the second and subsequent frontend requests during the
refreshment of a stale entry don't return stale data but cause multiple
backend requests (the "Thundering Herd" is not kept at bay although the
cache lock is enabled)
The requests are completely unconditional and look like this:
GET /dir/update/ HTTP/1.1
User-Agent: xyz
Host: front.example.com
This is the mod_cache logging for the first (tid 2628) and second (tid
11048) request:
[Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628]
cache_storage.c(664): [client 192.168.2.234:49770] AH00698: cache: Key
for entity /dir/update/?(null) is https://front.example.com:443/dir/update/?
[Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628]
cache_util.c(732): [client 192.168.2.234:49770] AH00782: Cache lock
obtained for stale cached URL, revalidating entry: /dir/update/
[Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628]
cache_storage.c(358): [client 192.168.2.234:49770] AH00695: Cached
response for /dir/update/ isn't fresh. Adding conditional request headers.
[Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628]
mod_cache.c(192): [client 192.168.2.234:49770] AH00750: Adding
CACHE_SAVE filter for /dir/update/
[Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628]
mod_cache.c(202): [client 192.168.2.234:49770] AH00751: Adding
CACHE_REMOVE_URL filter for /dir/update/
[Fri Aug 16 13:06:30.922875 2013] [cache:debug] [pid 2284:tid 11048]
cache_storage.c(664): [client 192.168.2.234:49771] AH00698: cache: Key
for entity /dir/update/?(null) is https://front.example.com:443/dir/update/?
[Fri Aug 16 13:06:30.922875 2013] [cache:debug] [pid 2284:tid 11048]
cache_util.c(757): (OS 80)Die Datei ist vorhanden. : [client
192.168.2.234:49771] AH00784: Attempt to obtain a cache lock for stale
cached URL failed, revalidating entry anyway: /dir/update/
[Fri Aug 16 13:06:30.922875 2013] [cache:debug] [pid 2284:tid 11048]
cache_storage.c(358): [client 192.168.2.234:49771] AH00695: Cached
response for /dir/update/ isn't fresh. Adding conditional request headers.
[Fri Aug 16 13:06:30.922875 2013] [cache:debug] [pid 2284:tid 11048]
mod_cache.c(218): (OS 80)Die Datei ist vorhanden. : [client
192.168.2.234:49771] AH00752: Cache locked for url, not caching
response: /dir/update/
[Fri Aug 16 13:06:31.438500 2013] [cache:debug] [pid 2284:tid 2628]
mod_cache.c(1325): [client 192.168.2.234:49770] AH00769: cache: Caching
url: /dir/update/
[Fri Aug 16 13:06:31.438500 2013] [cache:debug] [pid 2284:tid 2628]
mod_cache.c(1331): [client 192.168.2.234:49770] AH00770: cache: Removing
CACHE_REMOVE_URL filter.
I took a quick look at the source of mod_cache and I guess there may be
something wrong with the locking code, probably only on Windows systems.
The first request correctly obtains a cache lock (line 2).
The second request cannot obtain a cache lock (line 7) because the lock
file already exists (the germain Windows error message translates to
"The file already exists.") but in that case the log message should
actually read "Cache already locked for stale cached URL, pretend it is
fresh: /dir/update/" and stale content should be returned. Obviously the
call to cache_try_lock(conf, cache, r) in cache_util.c does not return
APR_EEXIST, like is expected in this case, but something else, which is
not correctly detected as an existing lock...
Additionally to the multiple backend requests there is also the critical
problem that the second request receives a 304 Not Modified response
(passed through from the backend), although it was unconditional and the
first, lock-obtaining, request received an expected 200 OK response with
the cached data.
Do you think my assumption is correct that this is a problem with
mod_cache on Windows? Is the 304 response to the unconditional request
also expected or is it maybe some kind of aftereffect of the cache lock
problems?
Problem 2: conditional frontend requests where the If-Modified-Since
request header field is earlier than the cached Last-Modified response
header field always cause backend requests
I am starting with an empty cache and issue the same request twice
(waiting until the first request finishes before executing the second
one). The requests contain the same If-Modified-Since header field
value, which is one second before the Last-Modified header field
returned by the backend.
The first and second requests look like this:
GET /dir/update/ HTTP/1.1
User-Agent: xyz
Host: front.example.com
If-Modified-Since: Fri, 16 Aug 2013 09:40:00 GMT
After that I issue a third request where I set the If-Modified-Since
request header field to the exact value of the Last-Modified header
field returned by the backend.
The third request looks like this:
GET /dir/update/ HTTP/1.1
User-Agent: xyz
Host: front.example.com
If-Modified-Since: Fri, 16 Aug 2013 09:40:01 GMT
As I understand it the first request should result in a request to the
backend server. The returned data should be cached and returned to the
second request without causing a second request to the backend. The
third request should receive 304 Not Modifed from the cache, also
without a backend request. But that is not what happens.
This is the mod_cache logging for the three requests:
[Fri Aug 16 16:49:09.610375 2013] [cache:debug] [pid 1860:tid 10344]
cache_storage.c(664): [client 192.168.2.234:50220] AH00698: cache: Key
for entity /dir/update/?(null) is https://front.example.com:443/dir/update/?
[Fri Aug 16 16:49:09.610375 2013] [cache:debug] [pid 1860:tid 10344]
mod_cache.c(192): [client 192.168.2.234:50220] AH00750: Adding
CACHE_SAVE filter for /dir/update/
[Fri Aug 16 16:49:09.610375 2013] [cache:debug] [pid 1860:tid 10344]
mod_cache.c(202): [client 192.168.2.234:50220] AH00751: Adding
CACHE_REMOVE_URL filter for /dir/update/
[Fri Aug 16 16:49:10.579125 2013] [cache:debug] [pid 1860:tid 10344]
mod_cache.c(1325): [client 192.168.2.234:50220] AH00769: cache: Caching
url: /dir/update/
[Fri Aug 16 16:49:10.579125 2013] [cache:debug] [pid 1860:tid 10344]
mod_cache.c(1331): [client 192.168.2.234:50220] AH00770: cache: Removing
CACHE_REMOVE_URL filter.
[Fri Aug 16 16:49:23.813500 2013] [cache:debug] [pid 1860:tid 10344]
cache_storage.c(664): [client 192.168.2.234:50220] AH00698: cache: Key
for entity /dir/update/?(null) is https://front.example.com:443/dir/update/?
[Fri Aug 16 16:49:23.813500 2013] [cache:debug] [pid 1860:tid 10344]
mod_cache.c(192): [client 192.168.2.234:50220] AH00750: Adding
CACHE_SAVE filter for /dir/update/
[Fri Aug 16 16:49:23.813500 2013] [cache:debug] [pid 1860:tid 10344]
mod_cache.c(202): [client 192.168.2.234:50220] AH00751: Adding
CACHE_REMOVE_URL filter for /dir/update/
[Fri Aug 16 16:49:25.610375 2013] [cache:debug] [pid 1860:tid 10344]
mod_cache.c(1325): [client 192.168.2.234:50220] AH00769: cache: Caching
url: /dir/update/
[Fri Aug 16 16:49:25.610375 2013] [cache:debug] [pid 1860:tid 10344]
mod_cache.c(1331): [client 192.168.2.234:50220] AH00770: cache: Removing
CACHE_REMOVE_URL filter.
[Fri Aug 16 16:49:35.344750 2013] [cache:debug] [pid 1860:tid 10344]
cache_storage.c(664): [client 192.168.2.234:50220] AH00698: cache: Key
for entity /dir/update/?(null) is https://front.example.com:443/dir/update/?
This is the backend server's access log:
192.168.10.8 - - [16/Aug/2013:16:49:10 +0200] "GET /update.cfm HTTP/1.1"
200 359261 "-" "xyz"
192.168.10.8 - - [16/Aug/2013:16:49:25 +0200] "GET /update.cfm HTTP/1.1"
200 359261 "-" "xyz"
As you can see the second request also issues the same backend request
as the first request although the data was already in the cache at the
time. The third request is handled correctly (the If-Modified-Since
header field is matched with the cached Last-Modified header field and
304 Not Modified is returned without a backend request). Is there a
reason why the second request behaves like this?
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org
For additional commands, e-mail: users-h...@httpd.apache.org