I'm sorry that this mail is long, but I wanted to include the necessary details.

This is my scenario:

Software:
Apache httpd 2.4.6 VC10 Windows 32 bits binary built with OpenSL 0.9.8y from www.apachelounge.com.

Frontend server:
lots of bandwidth
caching reverse proxy setup using httpd 2.4.6 + mod_cache + mod_cache_disk + mod_proxy + mod_proxy_httpd on Windows 2003 Server

Backend server:
very limited bandwidth (so it is very important for me to minimize the data transferred between the frontend and backend servers)

Frontend server virtual host configuration:
<VirtualHost 192.168.10.8:443>
  ServerName front.example.com

  CacheEnable disk /dir/
  CacheIgnoreHeaders Set-Cookie
  CacheLock On
  CacheLockPath C:/ApacheCache/Locks

  CacheRoot C:/ApacheCache/Cache
  CacheDirLevels 1
  CacheDirLength 1
  CacheMaxFileSize 10000000

  <Directory C:/ApacheRoot/dir>
    RewriteEngine on
    RewriteRule ^update/$ http://back.example.com/update.cfm [P,L]
    Header unset Set-Cookie
  </Directory>
</VirtualHost>

I have studied the documentation of mod_cache as well as different documents regarding HTTP caching but I am encountering some behaviour I cannot explain with this setup and hope that someone is able to help me.

Problem 1: the second and subsequent frontend requests during the refreshment of a stale entry don't return stale data but cause multiple backend requests (the "Thundering Herd" is not kept at bay although the cache lock is enabled)

The requests are completely unconditional and look like this:
GET /dir/update/ HTTP/1.1
User-Agent: xyz
Host: front.example.com

This is the mod_cache logging for the first (tid 2628) and second (tid 11048) request: [Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628] cache_storage.c(664): [client 192.168.2.234:49770] AH00698: cache: Key for entity /dir/update/?(null) is https://front.example.com:443/dir/update/? [Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628] cache_util.c(732): [client 192.168.2.234:49770] AH00782: Cache lock obtained for stale cached URL, revalidating entry: /dir/update/ [Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628] cache_storage.c(358): [client 192.168.2.234:49770] AH00695: Cached response for /dir/update/ isn't fresh. Adding conditional request headers. [Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628] mod_cache.c(192): [client 192.168.2.234:49770] AH00750: Adding CACHE_SAVE filter for /dir/update/ [Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628] mod_cache.c(202): [client 192.168.2.234:49770] AH00751: Adding CACHE_REMOVE_URL filter for /dir/update/ [Fri Aug 16 13:06:30.922875 2013] [cache:debug] [pid 2284:tid 11048] cache_storage.c(664): [client 192.168.2.234:49771] AH00698: cache: Key for entity /dir/update/?(null) is https://front.example.com:443/dir/update/? [Fri Aug 16 13:06:30.922875 2013] [cache:debug] [pid 2284:tid 11048] cache_util.c(757): (OS 80)Die Datei ist vorhanden. : [client 192.168.2.234:49771] AH00784: Attempt to obtain a cache lock for stale cached URL failed, revalidating entry anyway: /dir/update/ [Fri Aug 16 13:06:30.922875 2013] [cache:debug] [pid 2284:tid 11048] cache_storage.c(358): [client 192.168.2.234:49771] AH00695: Cached response for /dir/update/ isn't fresh. Adding conditional request headers. [Fri Aug 16 13:06:30.922875 2013] [cache:debug] [pid 2284:tid 11048] mod_cache.c(218): (OS 80)Die Datei ist vorhanden. : [client 192.168.2.234:49771] AH00752: Cache locked for url, not caching response: /dir/update/ [Fri Aug 16 13:06:31.438500 2013] [cache:debug] [pid 2284:tid 2628] mod_cache.c(1325): [client 192.168.2.234:49770] AH00769: cache: Caching url: /dir/update/ [Fri Aug 16 13:06:31.438500 2013] [cache:debug] [pid 2284:tid 2628] mod_cache.c(1331): [client 192.168.2.234:49770] AH00770: cache: Removing CACHE_REMOVE_URL filter.

I took a quick look at the source of mod_cache and I guess there may be something wrong with the locking code, probably only on Windows systems.
The first request correctly obtains a cache lock (line 2).
The second request cannot obtain a cache lock (line 7) because the lock file already exists (the germain Windows error message translates to "The file already exists.") but in that case the log message should actually read "Cache already locked for stale cached URL, pretend it is fresh: /dir/update/" and stale content should be returned. Obviously the call to cache_try_lock(conf, cache, r) in cache_util.c does not return APR_EEXIST, like is expected in this case, but something else, which is not correctly detected as an existing lock...

Additionally to the multiple backend requests there is also the critical problem that the second request receives a 304 Not Modified response (passed through from the backend), although it was unconditional and the first, lock-obtaining, request received an expected 200 OK response with the cached data.

Do you think my assumption is correct that this is a problem with mod_cache on Windows? Is the 304 response to the unconditional request also expected or is it maybe some kind of aftereffect of the cache lock problems?

Problem 2: conditional frontend requests where the If-Modified-Since request header field is earlier than the cached Last-Modified response header field always cause backend requests

I am starting with an empty cache and issue the same request twice (waiting until the first request finishes before executing the second one). The requests contain the same If-Modified-Since header field value, which is one second before the Last-Modified header field returned by the backend.

The first and second requests look like this:
GET /dir/update/ HTTP/1.1
User-Agent: xyz
Host: front.example.com
If-Modified-Since: Fri, 16 Aug 2013 09:40:00 GMT

After that I issue a third request where I set the If-Modified-Since request header field to the exact value of the Last-Modified header field returned by the backend.

The third request looks like this:
GET /dir/update/ HTTP/1.1
User-Agent: xyz
Host: front.example.com
If-Modified-Since: Fri, 16 Aug 2013 09:40:01 GMT

As I understand it the first request should result in a request to the backend server. The returned data should be cached and returned to the second request without causing a second request to the backend. The third request should receive 304 Not Modifed from the cache, also without a backend request. But that is not what happens.

This is the mod_cache logging for the three requests:
[Fri Aug 16 16:49:09.610375 2013] [cache:debug] [pid 1860:tid 10344] cache_storage.c(664): [client 192.168.2.234:50220] AH00698: cache: Key for entity /dir/update/?(null) is https://front.example.com:443/dir/update/? [Fri Aug 16 16:49:09.610375 2013] [cache:debug] [pid 1860:tid 10344] mod_cache.c(192): [client 192.168.2.234:50220] AH00750: Adding CACHE_SAVE filter for /dir/update/ [Fri Aug 16 16:49:09.610375 2013] [cache:debug] [pid 1860:tid 10344] mod_cache.c(202): [client 192.168.2.234:50220] AH00751: Adding CACHE_REMOVE_URL filter for /dir/update/ [Fri Aug 16 16:49:10.579125 2013] [cache:debug] [pid 1860:tid 10344] mod_cache.c(1325): [client 192.168.2.234:50220] AH00769: cache: Caching url: /dir/update/ [Fri Aug 16 16:49:10.579125 2013] [cache:debug] [pid 1860:tid 10344] mod_cache.c(1331): [client 192.168.2.234:50220] AH00770: cache: Removing CACHE_REMOVE_URL filter. [Fri Aug 16 16:49:23.813500 2013] [cache:debug] [pid 1860:tid 10344] cache_storage.c(664): [client 192.168.2.234:50220] AH00698: cache: Key for entity /dir/update/?(null) is https://front.example.com:443/dir/update/? [Fri Aug 16 16:49:23.813500 2013] [cache:debug] [pid 1860:tid 10344] mod_cache.c(192): [client 192.168.2.234:50220] AH00750: Adding CACHE_SAVE filter for /dir/update/ [Fri Aug 16 16:49:23.813500 2013] [cache:debug] [pid 1860:tid 10344] mod_cache.c(202): [client 192.168.2.234:50220] AH00751: Adding CACHE_REMOVE_URL filter for /dir/update/ [Fri Aug 16 16:49:25.610375 2013] [cache:debug] [pid 1860:tid 10344] mod_cache.c(1325): [client 192.168.2.234:50220] AH00769: cache: Caching url: /dir/update/ [Fri Aug 16 16:49:25.610375 2013] [cache:debug] [pid 1860:tid 10344] mod_cache.c(1331): [client 192.168.2.234:50220] AH00770: cache: Removing CACHE_REMOVE_URL filter. [Fri Aug 16 16:49:35.344750 2013] [cache:debug] [pid 1860:tid 10344] cache_storage.c(664): [client 192.168.2.234:50220] AH00698: cache: Key for entity /dir/update/?(null) is https://front.example.com:443/dir/update/?

This is the backend server's access log:
192.168.10.8 - - [16/Aug/2013:16:49:10 +0200] "GET /update.cfm HTTP/1.1" 200 359261 "-" "xyz" 192.168.10.8 - - [16/Aug/2013:16:49:25 +0200] "GET /update.cfm HTTP/1.1" 200 359261 "-" "xyz"

As you can see the second request also issues the same backend request as the first request although the data was already in the cache at the time. The third request is handled correctly (the If-Modified-Since header field is matched with the cached Last-Modified header field and 304 Not Modified is returned without a backend request). Is there a reason why the second request behaves like this?


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org
For additional commands, e-mail: users-h...@httpd.apache.org

Reply via email to