On 2014-08-23 12:36, Graham Leggett wrote:
On 23 Aug 2014, at 3:40 PM, Mark Montague wrote:
AH00526: Syntax error on line 148 of
/etc/httpd/conf/dev.catseye.org.conf: CacheEnable cannot occur within
<If> section
The solution here is to lift the restriction above. Having a generic mechanism
to handle conditional behaviour, and then having a special case to handle the
same behaviour in a different way is the wrong way to go.
I assumed this would be OK because the Header directive has a similar
expr=expression clause.
But, I'll look into whether the restriction on If could be removed. If I
rewrite things to use the If directive, do you see bypass functionality
as something worth including? I ask because from your points below I
get the impression that the answer is "no".
The proposed enhancement is about the server deciding when to serve items from
the cache. Although the client can specify a Cache-Control request header in
order to bypass the server's cache, there is no good way for a web application
to signal to a client when it should do this (for example, when a login cookie
is set). The behavior of other caches is controlled using the Cache-Control
response header.
There is - use “Cache-Control: private”. This will tell all public caches,
including mod_cache and ISP caches, not to cache content with cookies attached,
while at the same time telling browser caches that they should.
The problem is not whether the content should be cached: it should.
The problem is, to which clients should the cached content be served?
If the client's request does not contain a login cookie, that client
should get the cached copy. If the client's request does contain a
login cookie, the cache should be bypassed and the client should get a
copy of the resource generated specifically for it.
"Cache-Control: private" cannot be used in a request, only in a
response, where it works as you said. The problem is that the first
request for a given resource where the client includes a login cookie
gets intercepted by mod_cache and served from the cache (if you assume
that other clients without login cookies have already requested it).
There must therefore be some way to tell mod_cache that this client
needs something different. One way to do this would be by having
different URL paths for logged in versus non-logged in users, but this
is awkward, user-visible, and may not be feasible with all web applications.
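To make the requested behaviour concrete, here is a minimal Python sketch of the per-request decision described above. It is not mod_cache code and does not reflect any existing directive; the cookie name "login" and the function names are assumptions chosen for illustration.

```python
LOGIN_COOKIE = "login"  # hypothetical cookie name; real applications differ


def cookie_names(cookie_header):
    """Return the set of cookie names found in a Cookie request header."""
    names = set()
    for pair in cookie_header.split(";"):
        name, sep, _value = pair.strip().partition("=")
        if sep and name:
            names.add(name)
    return names


def should_bypass_cache(cookie_header, login_cookie=LOGIN_COOKIE):
    """True when the request carries the login cookie.

    Other cookies (tracking, JavaScript state) are ignored, so they
    do not prevent the cached copy from being served.
    """
    return login_cookie in cookie_names(cookie_header)
```

With this rule, a request carrying only tracking cookies is still served from the cache, while a request that also carries the login cookie goes to the back-end.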
- Back-end sets response header "Cache-Control: max-age=0, s-maxage=14400" so that mod_cache
caches the response, but ISP caches and browser caches do not. (mod_cache removes s-maxage
and does not pass it upstream).
mod_cache shouldn’t remove any Cache-Control headers.
It apparently does, although I haven't found where in the code yet. I
would be interested to see if anyone can reproduce my experience. As far
as I know, I don't have any configuration that would result in this.
httpd 2.4.10 with mod_proxy_fcgi (Fedora 19 build)
PHP 5.5.5 with PHP-FPM
Relevant configuration:
CacheEnable disk /
CacheDefaultExpire 86400
CacheIgnoreHeaders Set-Cookie
CacheHeader on
CacheDetailHeader on
# We'll be paying attention to "Cache-Control: s-maxage=xxx" for all
# of our caching decisions. The browser will use max-age=yyy for its
# decisions. So we drop the Expires header. See the following page
# from Google which says, "It is redundant to specify both Expires and
# Cache-Control: max-age"
# https://developers.google.com/speed/docs/best-practices/caching?hl=sv
Header unset Expires
RewriteRule ^(.*\.php)$ fcgi://127.0.0.1:9001/www/dev.catseye.org/content/$1 [P,L]
File test.php, containing:
<?php
header( "Cache-Control: max-age=0, s-maxage=14400" );
header( "Content-type: text/html" );
?>
<html><body>Hello!</body></html>
Browser transaction for https://dev.catseye.org/test.php:
GET /test.php HTTP/1.1
Host: dev.catseye.org
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:31.0) Gecko/20100101 Firefox/31.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Connection: keep-alive
HTTP/1.1 200 OK
Date: Sat, 23 Aug 2014 20:11:00 GMT
Server: Apache/2.4
Cache-Control: max-age=0
X-Cache: MISS from dev.catseye.org
X-Cache-Detail: "cache miss: attempting entity save" from dev.catseye.org
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html;charset=UTF-8
And mod_cache definitely receives s-maxage from the backend:
[root@sky cache]# cat ./J/k/WPiKG0bwW@R_H4YvSOdw.header
(binary data omitted)https://dev.catseye.org:443/test.php?Cache-Control:
max-age=0
Cache-Control: max-age=0, s-maxage=14400
Content-Security-Policy: default-src 'self'; script-src 'self'
'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; img-src
'self' data: ; font-src 'self' data: ; report-uri /csp-report.php
Content-Type: text/html;charset=UTF-8
Host: dev.catseye.org
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:31.0) Gecko/20100101 Firefox/31.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
[root@sky cache]# cat ./J/k/WPiKG0bwW@R_H4YvSOdw.data
<html><body>Hello!</body></html>
[root@sky cache]#
- When back-end content changes (e.g., an author makes an update), the back-end invokes
"htcacheclean /path/to/resource" to invalidate the cached page so that it is
regenerated the next time a client requests it.
Set your max-age correctly and this becomes unnecessary. If you have long lived
resources that you want caching for a very long time, and you want to change
that resource, place the version number of the resource in the URL and refer to
the new URL after the change.
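As a rough illustration of the versioned-URL approach, the Python sketch below derives the version from a hash of the content rather than from a hand-maintained version number. That substitution, the query parameter name "v", and the example path are all assumptions for illustration, not something proposed in this thread.

```python
import hashlib


def versioned_url(path, content):
    """Append a short content hash so the URL changes whenever the content does."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    return f"{path}?v={digest}"


# Hypothetical resource: the URL stays stable until the bytes change.
old_url = versioned_url("/static/site.css", b"body { color: black; }")
new_url = versioned_url("/static/site.css", b"body { color: navy; }")
```

Because the old URL is never reused for new content, it can be cached for a very long time without any explicit invalidation step.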
This is fine for JavaScript, CSS files, and images, but I'd rather have
users see nice, human-friendly URLs in their browser's location bar, like
https://example.com/latest-news
rather than
https://example.com/latest-news?20140823T164300
...and I certainly don't want them bookmarking the latter one.
- Clients have multiple cookies set. Tracking cookies and cookies used by
JavaScript should not cause a mod_cache miss.
- Dynamic pages that are generated when a login cookie is set should not be cached.
This is accomplished by the back-end setting the response header
"Cache-Control: max-age=0".
This is incorrect, max-age=0 means that a cache is welcome to cache the
content, but the content must be declared stale immediately and revalidated.
I checked the code and what is actually getting set for all pages
dynamically generated for logged-in users is:
Cache-Control: no-cache, must-revalidate, max-age=0
I apologize for being sloppy and not verifying this before sending my
previous reply.
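For reference, the difference between the header values discussed in this thread can be sketched in Python as follows. This is a simplified reading of the Cache-Control response directives, not mod_cache's implementation: it ignores heuristic freshness, the Expires header, and the qualified forms of no-cache and private, and the function names are made up for the example.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        name, _sep, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') if arg else None
    return directives


def freshness_lifetime(value, shared):
    """Seconds a cache may reuse the response without revalidating it.

    Returns 0 when the response must not be reused without contacting
    the origin, and None when the header gives no explicit lifetime.
    """
    d = parse_cache_control(value)
    if "no-store" in d or "no-cache" in d:
        return 0
    if shared and "private" in d:
        return 0
    # s-maxage applies only to shared caches and overrides max-age there.
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return None
```

Under this reading, "max-age=0, s-maxage=14400" gives a shared cache such as mod_cache four hours of freshness and a browser cache none, while "no-cache, must-revalidate, max-age=0" gives neither kind of cache any.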
- However, when a login cookie is set, dynamic pages that are currently cached
should not be served to the client with the login cookie, while they should
still be served to all other clients.
All of the above is handled by HTTP already, just follow the protocol.
Make sure you separate your cacheable content from your uncacheable content.
Ensure that you use HTTP conditional requests so that expensive calls can be
made cheap. Properly declare the request headers you vary on using the Vary
header, but keep in mind that headers with many variations will DoS a cache.
Cache long-lived content and change the URL if the content is updated. Use
max-age (and s-maxage) on short lived content to make the generation of it
cheap.
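The warning about Vary can be illustrated with a small Python sketch. It assumes a simplified cache that keys each stored variant on the URL plus the values of the request headers named in Vary; this is not a description of how mod_cache stores variants internally, and the function name is hypothetical.

```python
def variant_key(url, vary, request_headers):
    """Build a cache key from the URL and the request headers listed in Vary."""
    headers = {k.lower(): v for k, v in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    return (url,) + tuple((n, headers.get(n, "")) for n in names)


# Two clients that differ only in a tracking cookie value.
req_a = {"Accept-Encoding": "gzip, deflate", "Cookie": "_ga=GA1.2.111"}
req_b = {"Accept-Encoding": "gzip, deflate", "Cookie": "_ga=GA1.2.222"}

# Vary: Cookie produces one cache entry per distinct Cookie header value.
key_a = variant_key("/latest-news", "Cookie", req_a)
key_b = variant_key("/latest-news", "Cookie", req_b)

# Vary: Accept-Encoding lets both clients share a single entry.
shared_a = variant_key("/latest-news", "Accept-Encoding", req_a)
shared_b = variant_key("/latest-news", "Accept-Encoding", req_b)
```

With "Vary: Cookie", every distinct combination of cookie values becomes its own variant, which is how a header with many variations can fill a cache with near-duplicate entries.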
The only thing I see above that will actually help is having separate
URL paths for cacheable and non-cacheable content, but I'd have to hack
that in using mod_rewrite (since I'm limited to the scope of changes I
can make to the code of 3rd party web applications). I'd prefer to
avoid having logged in and non-logged in users seeing different URLs in
their browser location bars.
Thanks for all of your replies!
--
Mark Montague