Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
On Thu, 6 Dec 2001, Paul Lindner wrote:

> > BTW -- I think where the docs are cached should be configurable. I don't
> > like the idea of the document root writable by the web process.
>
> That's the price you pay for this functionality. Because we use Apache's
> native file serving code we need a url-directory mapping somewhere.

uh, why couldn't Apache::CacheContent just set
$r->filename(/where/we/put/the/cache/$file) ?

If you add Bill's suggestion about caching on args, headers and whatnot
you would (on some filesystems) need something like that anyway to make
a hashed directory tree.

 - ask

-- 
ask bjoern hansen, http://ask.netcetera.dk/ !try; do();
more than a billion impressions per week, http://valueclick.com
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
On Tue, Dec 11, 2001 at 01:50:52AM -0800, Ask Bjoern Hansen wrote:
> On Thu, 6 Dec 2001, Paul Lindner wrote:
>
> > > BTW -- I think where the docs are cached should be configurable. I don't
> > > like the idea of the document root writable by the web process.
> >
> > That's the price you pay for this functionality. Because we use Apache's
> > native file serving code we need a url-directory mapping somewhere.
>
> uh, why couldn't Apache::CacheContent just set
> $r->filename(/where/we/put/the/cache/$file) ?

Simplicity really. This was an example in our upcoming book so I didn't
want to add a filename generator to the code, instead we use Apache's
url-file mapping mechanism. Also this code was derived from a 404 error
handler that I wrote ages ago :)

I assume (since you suggested it) that you can set $r->filename to any
file in any directory without adding a <Directory> config? I'll have to
see how this interacts with the built-in access control logic.

> If you add Bill's suggestion about caching on args, headers and whatnot
> you would (on some filesystems) need something like that anyway to make
> a hashed directory tree.

Right. A more elaborate Apache::CacheContent would have a filename hash
function, and a separate cache directory structure along the lines of
Cache::FileCache.

I suppose that one could put the whole uri-cachefile mapping into a
custom PerlTransHandler and leave Apache::CacheContent as-is..

-- 
Paul Lindner [EMAIL PROTECTED]
mod_perl Developer's Cookbook http://www.modperlcookbook.org
Human Rights Declaration http://www.unhchr.ch/udhr/index.htm
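
The "filename hash function" and "hashed directory tree" mentioned in this exchange could look roughly like the sketch below. It is not part of Apache::CacheContent: the helper name cache_path_for(), the cache root and the two-level layout are assumptions made only for illustration. Under mod_perl the resulting path would presumably be handed to $r->filename, as Ask suggests.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Spec;

# Hypothetical helper (not in Apache::CacheContent): map a cache key such as
# a URI to a file under $cache_root. The leading hex digits of the key's MD5
# digest become directory levels, so no single directory grows too large.
sub cache_path_for {
    my ($cache_root, $key, $depth) = @_;
    $depth = 2 unless defined $depth;       # assumed default: two levels

    my $digest = md5_hex($key);             # 32 lowercase hex characters
    my @levels = map { substr($digest, $_ * 2, 2) } 0 .. $depth - 1;

    return File::Spec->catfile($cache_root, @levels, $digest);
}

# Example: the cache root here is an assumption, not a module default.
print cache_path_for('/var/cache/www', '/cached/foo.html'), "\n";
```

Hashing the key rather than reusing the URI path gives up the "Apache's native translation does all the work" shortcut discussed in the thread, which is why a separate URI-to-file mapping step (for example in a PerlTransHandler) would be needed.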
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
On Tue, Dec 11, 2001 at 02:31:36AM -0800, Paul Lindner wrote:

> Right. A more elaborate Apache::CacheContent would have a filename hash
> function, and a separate cache directory structure along the lines of
> Cache::FileCache.

Just curious -- any reason not to use Cache::Cache as the persistence
mechanism? It was designed for exactly this scenario and could provide a
nice abstraction for the filesystem or shared memory, as well as handle
things like filename hashing and branching directories (and namespaces,
size limits, OS independence, taint checking, and more).

-DeWitt
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
Paul Lindner wrote:

[snip]

> I suppose that one could put the whole uri-cachefile mapping into a
> custom PerlTransHandler and leave Apache::CacheContent as-is..

yeah, I think that we're starting to talk about two different approaches
now.

the cool thing about the current logic is that no filename mapping has to
take place, making it rather fast - basically, one simple call to some
cached stat() properties and you're done, because Apache's native
translation mechanism has done all the work. the price you pay for that
quick simplicity is stuff written to your document root.

adding a URI-filename translation step adds overhead, though it may be
preferable to some. it shouldn't be a requirement, however.

one of the neat things about this module is that it makes (pretty
creative) use of method handlers. the base class comes with disk_cache(),
but memory_cache(), uri_cache(),
i_dont_want_the_file_in_my_docroot_cache(), or whatever can be added
either to the module proper or a subclass.

so, maybe disk_cache() needs a better and less generic name, like
docroot_cache().

--Geoff
[RFC] Apache::CacheContent - Caching PerlFixupHandler
Hi,

I would like to propose a new Apache module before I send it off to CPAN.
The name chosen is Apache::CacheContent.

It's pretty generic code, and is intended to be subclassed. It handles the
gory details of caching a page to disk and serving it up until it expires.

It's derived from work done on the mod_perl Developer's Cookbook, so it's
already been reviewed by a number of people.

I've attached a README below. To download it go to
http://www.modperlcookbook.org/code.html

NAME
    Apache::CacheContent - PerlFixupHandler class that caches dynamic
    content

SYNOPSIS
    * Make your method handler a subclass of Apache::CacheContent

    * Allow your web server process to write into portions of your
      document root.

    * Add a ttl() subroutine (optional)

    * Add directives to your httpd.conf that are similar to these:

        PerlModule MyHandler

        # dynamic url
        <Location /dynamic>
          SetHandler perl-script
          PerlHandler MyHandler->handler
        </Location>

        # cached URL
        <Location /cached>
          SetHandler perl-script
          PerlFixupHandler MyHandler->disk_cache
          # CacheTTL is in minutes
          PerlSetVar CacheTTL 120
        </Location>

DESCRIPTION
    Note: This code is derived from the *Cookbook::CacheContent* module,
    available as part of The mod_perl Developer's Cookbook.

    The Apache::CacheContent module implements a PerlFixupHandler that
    helps you to write handlers that can automatically cache generated web
    pages to disk. This is a definite performance win for sites that end
    up generating the exact same content for many users.

    The module is written to use Apache's built-in file handling routines
    to efficiently serve data to clients. This means that your code will
    not need to worry about HTTP/1.X, byte ranges, if-modified-since, HEAD
    requests, etc. It works by writing files into your DocumentRoot, so be
    sure that your web server process can write there.

    To use this you MUST use mod_perl method handlers. This means that
    your version of mod_perl must support method handlers (the argument
    EVERYTHING=1 to the mod_perl build will do this).
    Next you'll need to have a content-generating mod_perl handler. If it
    isn't a method handler, modify the *handler* subroutine to read:

        sub handler ($$) {
            my ($class, $r) = @_;

    Next, make your handler a subclass of *Apache::CacheContent* by adding
    an ISA entry:

        @MyHandler::ISA = qw(Apache::CacheContent);

    You may need to modify your handler code to only look at the *uri* of
    the request. Remember, the cached content is independent of any query
    string or form elements.

    After this is done, you can activate your handler. To use your handler
    in a fully dynamic way, configure it as a PerlHandler in your
    httpd.conf, like this:

        PerlModule MyHandler

        <Location /dynamic>
          SetHandler perl-script
          PerlHandler MyHandler->handler
        </Location>

    So requests to *http://localhost/dynamic/foo.html* will call your
    handler method directly. This is great for debugging and testing the
    module. To activate the caching mechanism configure httpd.conf as
    follows:

        PerlModule MyHandler

        <Location /cached>
          SetHandler perl-script
          PerlFixupHandler MyHandler->disk_cache
          # CacheTTL is in minutes
          PerlSetVar CacheTTL 120
        </Location>

    Now when you access URLs like *http://localhost/cached/foo.html* the
    content will be generated and stored in the file
    *DocumentRoot*/cached/foo.html. Subsequent requests for the same URL
    will return the cached content until it expires, as determined by the
    *CacheTTL* setting.

    For further customization you can write your own *ttl* function that
    can dynamically change the caching time based on the current request.

AUTHOR
    Paul Lindner [EMAIL PROTECTED], Geoffrey Young, Randy Kobes

SEE ALSO
    The example mod_perl method handler, the CacheWeather manpage.

    The mod_perl Developer's Cookbook

-- 
Paul Lindner [EMAIL PROTECTED]
mod_perl Developer's Cookbook http://www.modperlcookbook.org
Human Rights Declaration http://www.unhchr.ch/udhr/index.htm
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
> I would like to propose a new Apache module before I send it off to CPAN.
> The name chosen is Apache::CacheContent.

This is very cool. I was planning to write one of these, and now I don't
have to. Your implementation is short and interesting. I was planning to
do it with a PerlFixupHandler and an Apache::Filter module to capture the
output. While that approach wouldn't require the use of method handlers, I
think yours may be easier for newbies because it doesn't require them to
understand as many modules. The only real advantage of using
Apache::Filter is that it would work well with existing Registry scripts.

A couple of other C's for your R:

A cache defines parameters that constitute a unique request. Your cache
currently only handles the filename from the request as a parameter. It
would be nice to also handle query args, POST data, and arbitrary headers
like cookies or language choices. You could even support an optional
request_keys method for handlers which would let people generate their own
unique key based on their analysis of the request. Doing this would mean
you would need to generate filenames based on the unique keys (probably by
hashing, as in Cache::FileCache) and do an internal redirect to that file
if available when someone sends a request that matches.

Another thing that might be nice would be to store the TTL with the file
rather than making the handler give it to you again each time. This is
done in mod_proxy by putting an Expires header in the file and reading it
before sending the file, but you could also store them in a dbm or
something. Support for sending Expires headers automatically would also be
useful.

When I first thought about this problem, I wanted to do it the way
Vignette StoryServer does: by having people link to the cached files
directly and making the content generating code be the 404 handler for
those files. That gives the best possible performance for cached files,
since no PerlFixupHandler needs to run.
The downside is that then you need an external process to go through and
clean up expired files. It's also hard to handle complex cache criteria
like query args. StoryServer does it by having really crazy generated file
names and processing all the links to files on the way out so that they
use the cached file names. Pretty ugly.

I know you guys are pushing to get the book done, so don't feel pressured
to address this stuff now. I think the current module looks more than good
enough for an initial CPAN release.

- Perrin
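
Perrin's idea of a unique key built from the URI, query args and selected headers can be sketched in plain Perl as follows. This is only an illustration under stated assumptions: request_key() is a hypothetical name (the message proposes a request_keys method but does not define one), and a plain hash stands in for the Apache request object so the snippet runs outside mod_perl.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical request_key(): combine the parts of a request that influence
# the generated page into one hashed key. A real handler would read these
# values from the Apache request object; a plain hash is assumed here.
sub request_key {
    my (%req) = @_;

    my @parts = (
        'uri='  . (defined $req{uri}  ? $req{uri}  : ''),
        'args=' . (defined $req{args} ? $req{args} : ''),
    );

    # Header names are case-insensitive, so normalise them, then sort so
    # that the same request always produces the same key.
    my %headers;
    my $given = $req{headers} || {};
    for my $name (keys %$given) {
        $headers{ lc $name } = $given->{$name};
    }
    push @parts, map { "$_=$headers{$_}" } sort keys %headers;

    return md5_hex(join "\n", @parts);
}

# Example: two language variants of the same URL get different keys.
print request_key(uri => '/cached/foo.html', args => 'page=2',
                  headers => { 'Accept-Language' => 'en' }), "\n";
print request_key(uri => '/cached/foo.html', args => 'page=2',
                  headers => { 'Accept-Language' => 'ja' }), "\n";
```

Which headers belong in the key is a per-site decision; including too many (for example a per-user cookie) would make nearly every request a cache miss.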
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
At 08:19 AM 12/06/01 -0800, Paul Lindner wrote:

Ok, hit me over the head. Why wouldn't you want to use a caching proxy?

BTW -- I think where the docs are cached should be configurable. I don't
like the idea of the document root writable by the web process.

Bill Moseley
mailto:[EMAIL PROTECTED]
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
On Thu, 06 Dec 2001 10:04:26 -0800
Bill Moseley [EMAIL PROTECTED] wrote:

> BTW -- I think where the docs are cached should be configurable. I don't
> like the idea of the document root writable by the web process.

Maybe:

    Alias /cached /tmp/cache

-- 
Tatsuhiko Miyagawa [EMAIL PROTECTED]
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
On Thu, 6 Dec 2001 08:19:09 -0800
Paul Lindner [EMAIL PROTECTED] wrote:

> I've attached a README below. To download it go to
> http://www.modperlcookbook.org/code.html

Nice one. here's a patch to make the sample code work :)

--- CacheContent.pm~    Thu Dec  6 22:11:35 2001
+++ CacheContent.pm     Fri Dec  7 03:23:39 2001
@@ -6,6 +6,7 @@
 @Apache::CacheContent::ISA = qw(Apache);
 use Apache;
+use Apache::Log;
 use Apache::Constants qw(OK SERVER_ERROR DECLINED);
 use Apache::File ();
--- eg/CacheWeather.pm~ Thu Dec  6 08:10:09 2001
+++ eg/CacheWeather.pm  Fri Dec  7 03:24:14 2001
@@ -8,7 +8,7 @@
 use strict;
-@CacheWeather::ISA = qw(Cookbook::CacheContent);
+@CacheWeather::ISA = qw(Apache::CacheContent);
 sub ttl {
   my($self, $r) = @_;

-- 
Tatsuhiko Miyagawa [EMAIL PROTECTED]
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
On Thu, 6 Dec 2001, Paul Lindner wrote:

> On Thu, Dec 06, 2001 at 10:04:26AM -0800, Bill Moseley wrote:
> > At 08:19 AM 12/06/01 -0800, Paul Lindner wrote:
> >
> > Ok, hit me over the head. Why wouldn't you want to use a caching proxy?
>
> Apache::CacheContent gives you more control over the caching process
> and keeps the expiration headers from leaking to the browser.
>
> Or maybe you want to dynamically control the TTL?
>
>   sub ttl {
>     ...
>     if ($load_avg > 5) {
>       return 60 * 5;
>     } else {
>       return 60;
>     }
>   }

While a ttl might be useful to some projects, others I'm sure would prefer
per-hit checking, so you can say "Yes, this thing has changed now".

Just a thought.

-- 
Matt, Founder and CTO, AxKit.com Ltd
http://axkit.com/ - XML Application Serving
http://axkit.org - XSLT, XPathScript, XSP
mod_perl news and resources: http://take23.org
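
The per-hit check Matt describes, as opposed to a fixed TTL, might be sketched like this. Nothing here comes from Apache::CacheContent: cache_is_fresh() is a hypothetical helper, and it assumes the generated page depends on a single source file whose modification time signals a change.

```perl
use strict;
use warnings;

# Hypothetical per-hit freshness test: the cached copy is usable only if it
# exists and is at least as new as the source it was generated from.
# Assumes one source file per page; real content may have several inputs.
sub cache_is_fresh {
    my ($cache_file, $source_file) = @_;

    my $cache_mtime  = (stat $cache_file)[9];    # undef if file is missing
    my $source_mtime = (stat $source_file)[9];

    return 0 unless defined $cache_mtime && defined $source_mtime;
    return $cache_mtime >= $source_mtime ? 1 : 0;
}

# Example with made-up paths: regenerate when the check fails.
my $fresh = cache_is_fresh('/var/cache/www/foo.html', '/data/foo.xml');
print $fresh ? "serve cached copy\n" : "regenerate\n";
```

The trade-off against a TTL is one extra stat() of the source on every hit in exchange for never serving a page that is known to be stale.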
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
At 10:33 AM 12/06/01 -0800, Paul Lindner wrote:
> On Thu, Dec 06, 2001 at 10:04:26AM -0800, Bill Moseley wrote:
> > At 08:19 AM 12/06/01 -0800, Paul Lindner wrote:
> >
> > Ok, hit me over the head. Why wouldn't you want to use a caching proxy?
>
> Apache::CacheContent gives you more control over the caching process
> and keeps the expiration headers from leaking to the browser.

Ok, I see.

> Or maybe you want to dynamically control the TTL?

Would you still use it with a front-end lightweight server? Even with
caching, a mod_perl server is still used to send the cached file (possibly
over 56K modem), right?

Bill Moseley
mailto:[EMAIL PROTECTED]
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
On Thu, Dec 06, 2001 at 10:47:35AM -0800, Bill Moseley wrote:
> > > Ok, hit me over the head. Why wouldn't you want to use a caching proxy?
> >
> > Apache::CacheContent gives you more control over the caching process
> > and keeps the expiration headers from leaking to the browser.
>
> Ok, I see.
>
> > Or maybe you want to dynamically control the TTL?
>
> Would you still use it with a front-end lightweight server? Even with
> caching, a mod_perl server is still used to send the cached file (possibly
> over 56K modem), right?

You definitely want a proxy-cache in front of your mod_perl server. One
thing that I like about this module is that you can control the
server-side caching of content separately from the client/browser cache.

So, on to the RFC. Is the name acceptable for Apache::*? I will endeavor
to add any functionality that makes it worthy :)

For example, I think adding a virtual method that generates the filename
might be useful.

-- 
Paul Lindner [EMAIL PROTECTED]
mod_perl Developer's Cookbook http://www.modperlcookbook.org
Human Rights Declaration http://www.unhchr.ch/udhr/index.htm
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
On Thu, 6 Dec 2001, Paul Lindner wrote:

> On Thu, Dec 06, 2001 at 10:04:26AM -0800, Bill Moseley wrote:
> > At 08:19 AM 12/06/01 -0800, Paul Lindner wrote:
> >
> > Ok, hit me over the head. Why wouldn't you want to use a caching proxy?
>
> Apache::CacheContent gives you more control over the caching process
> and keeps the expiration headers from leaking to the browser.
>
> Or maybe you want to dynamically control the TTL?

You can use mod_accel to cache a backend flexibly:
ftp://ftp.lexa.ru/pub/apache-rus/contrib/mod_accel-1.0.7.tar.gz

mod_accel understands the standard Expires and Cache-Control headers and
a special X-Accel-Expires header (which is not sent to the client). It can
also ignore the Expires and Cache-Control headers from the backend and set
the expiration with the AccelDefaultExpire directive.

Compared to mod_proxy, mod_accel reads the backend response and closes the
connection to the backend as soon as possible. There is no 2-second
lingering-close timeout on the backend with big answers and slow clients.
A big answer means one bigger than the frontend kernel's TCP send buffer -
16K in FreeBSD and 64K in Linux by default.

In addition, mod_accel reads the whole POST body before connecting to the
backend.

mod_accel can ignore the client's Pragma: no-cache, Cache-Control:
no-cache and even Authorization headers.
mod_accel can be told not to pass some URLs to the backend.
mod_accel allows tuning of various buffer sizes and timeouts.
mod_accel can cache responses with cookie-dependent content.
mod_accel can use busy locks and can limit the number of connections to
the backend.
mod_accel allows simple fault tolerance with DNS-balanced backends.
mod_accel logs various information about request processing.
mod_accel can invalidate the cache on a per-URL basis.

mod_accel has only two drawbacks - too much memory per connection (an
inherited Apache drawback) and Russian-only documentation.

Igor Sysoev
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
Hello,

PL>That's the price you pay for this functionality. Because we use
PL>Apache's native file serving code we need a url-directory mapping
PL>somewhere.
PL>
PL>Of course you don't need to make the entire docroot writable, just the
PL>directory corresponding to your script.

Apologies if this is obvious--I haven't downloaded and tried this module
yet. But would it not be possible to specify a separate directory
altogether and make it serveable (<Directory ...> ... Allow from all ...)?
If so perhaps it'd be easy to add this as a configurable parameter.

In general it is a fine idea to not make the DocumentRoot writeable by the
web user. In fact, I believe it is a good policy that the web user should
be able to write only to a small subset of controlled locations.

Humbly,

Andrew

--
Andrew Ho               http://www.tellme.com/       [EMAIL PROTECTED]
Engineer                [EMAIL PROTECTED]            Voice 650-930-9062
Tellme Networks, Inc.   1-800-555-TELL               Fax 650-930-9101
--
Re: [RFC] Apache::CacheContent - Caching PerlFixupHandler
On Thu, Dec 06, 2001 at 12:55:25PM -0800, Andrew Ho wrote:
> Hello,
>
> PL>That's the price you pay for this functionality. Because we use
> PL>Apache's native file serving code we need a url-directory mapping
> PL>somewhere.
> PL>
> PL>Of course you don't need to make the entire docroot writable, just the
> PL>directory corresponding to your script.
>
> Apologies if this is obvious--I haven't downloaded and tried this module
> yet. But would it not be possible to specify a separate directory
> altogether and make it serveable (<Directory ...> ... Allow from all ...)?
> If so perhaps it'd be easy to add this as a configurable parameter.

Yes, you can do this using the regular Apache directives:

    # mkdir /var/cache/www/mydir
    # chown apache /var/cache/www/mydir
    # vi /etc/httpd/conf/httpd.conf

    <Directory /var/cache/www/mydir>
      AllowOverride None
      Order allow,deny
      Allow from all
    </Directory>
    Alias /mydir/ /var/cache/www/mydir/

> In general it is a fine idea to not make the DocumentRoot writeable by the
> web user. In fact, I believe it is a good policy that the web user should
> be able to write only to a small subset of controlled locations.

Yes, I agree totally! I'll add some warning to the docs to make sure that
people do not inadvertently misconfigure their servers..

-- 
Paul Lindner [EMAIL PROTECTED]
mod_perl Developer's Cookbook http://www.modperlcookbook.org
Human Rights Declaration http://www.unhchr.ch/udhr/index.htm