Neil Gunton wrote:
The cache and front-end proxy help to serve images without bogging down the heavy mod_perl processes, while also obviously caching the mod_perl content. The site gets around 100,000 page requests or more per day. The cache is set to 1000MB, with htcacheclean running in daemon mode, interval 60 minutes (but looking at the performance charts, it seems to be running constantly).

I am finding that the cache directories that mod_cache builds are very large and take a long time to traverse under ext2. There is currently about 10 GB under the cache according to du, roughly ten times the configured limit, and it took 162 minutes just to tell me that. Basically, htcacheclean is not keeping up. I'm using three levels of directory. When I tried running htcacheclean nightly from cron, it took upward of 3 hours to complete and caused a huge spike in iowait on the server. If I run htcacheclean in daemon mode with the -n (nice) option, it can't keep up and the cache just creeps up in size. If I take off the nice option, it uses a lot more resources, to the point where I'm concerned it will hurt server performance by monopolising the disks.
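To put numbers on how large the tree is, a single walk can count directories and files along with the byte total. The following is only a minimal Python sketch: the function name summarize_tree is hypothetical, and it sums apparent file sizes (st_size) rather than the disk blocks that du reports, so the totals will differ somewhat from du.

```python
import os

def summarize_tree(root):
    """Walk root once; return (directory count, file count, total file bytes)."""
    dirs = files = size = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        for name in filenames:
            files += 1
            try:
                # lstat so that symlinks are not followed
                size += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # The file vanished mid-walk (for example, removed by a cleaner).
                pass
    return dirs, files, size
```

Calling summarize_tree on the cache root (whatever path CacheRoot points at) gives a directories-to-files ratio. A ratio close to or above 1 would suggest that most of the traversal time goes into directories rather than cached content.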

So what I'm observing is that at least part of the problem appears to be that the directory structure is just very, very big and wide and takes a long time to traverse, even for basic system functions like du.
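The width of the tree can be estimated with simple arithmetic. The sketch below assumes each character of a cache directory name is drawn from a 64-character alphabet and that each directory name is 2 characters long. Neither value is stated in this post, so treat both as assumptions, and max_directories as a hypothetical helper.

```python
ALPHABET_SIZE = 64  # assumption: 64 possible characters per name position

def max_directories(levels, length, alphabet=ALPHABET_SIZE):
    """Return (names per level, leaf directories, directories at all levels).

    These are upper bounds on what the hashing scheme allows, not a
    measurement of any real cache.
    """
    per_level = alphabet ** length
    leaf = per_level ** levels
    total = sum(per_level ** depth for depth in range(1, levels + 1))
    return per_level, leaf, total

per_level, leaf, total = max_directories(levels=3, length=2)
print(per_level, leaf, total)  # 4096 68719476736 68736258048
```

Under these assumptions the name space is far larger than the number of cached objects, so nearly every object would end up in its own chain of three directories. That would be consistent with a tree that is slow to traverse even for du.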

Someone replied to me off-list suggesting Squid instead of httpd for the front-end caching reverse proxy. That is a fair question: I use Apache for proxying mainly because I know Apache quite well and like being able to use mod_rewrite and the other neat features that httpd provides. I've never used Squid. Does anyone have opinions there? Is Squid better at managing its cache files in a sane and efficient fashion, i.e. without 100% iowait?

Does anyone run a 3-layer combination of Squid for the cache, then an Apache front-end proxy (no mod_cache) for its mod_rewrite capabilities, and then the back-end mod_perl server?

I need mod_rewrite at some point for things like stopping image hotlinking from other websites (people stealing my bandwidth by making my server act as an image server for their forums, auctions etc.) and for other access control. I'll have to look into whether Squid can do all that.

I'm open to alternatives if it turns out that Apache's mod_cache simply isn't mature enough yet. I notice that some of mod_cache's features have not even been implemented, so maybe this module isn't really ready for prime time? Opinions? Surely most people using mod_perl in a production environment must be using some form of reverse proxy, since it makes so much sense from a server utilization point of view.

Thanks again,

Neil
