Neil Gunton wrote:
The cache and front-end proxy help to serve images without bogging down
the heavy mod_perl processes, while also obviously caching the mod_perl
content. The site gets around 100,000 page requests or more per day. The
cache is set to 1000MB, with htcacheclean running in daemon mode at a
60-minute interval (but looking at the performance charts, it seems to
be running constantly).
I am finding that the cache directories that mod_cache builds are very
large, and take a long time to traverse under ext2. du currently reports
about 10 GB under the cache, ten times the 1000MB limit, and it took
162 minutes just to tell me that. Basically, htcacheclean is not keeping
up. I'm using three levels of directory. htcacheclean also takes a long
time to process this: when I ran it nightly from cron, I saw a huge
spike in iowait on the server, and each run took upward of 3 hours to
complete. If I run htcacheclean in daemon mode with the -n (nice)
option, it can't keep up and the cache just creeps up in size. If I take
off the nice option, it uses a lot more resources, to the point where
I'm concerned it will hurt server performance by monopolising the disks.
So what I'm observing is that at least part of the problem appears to be
that the directory structure is just very, very big and wide and takes a
long time to traverse, even for basic system functions like du.
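
To put numbers on how big and wide the tree is, a small script can count
directories and files in one pass. The following is a minimal Python
sketch, not part of htcacheclean or mod_cache; the function name
cache_tree_stats is made up for illustration, and it sums apparent file
sizes rather than the allocated blocks that du reports. It still has to
visit every directory, so it will not be fast on a tree like this one.

```python
import os


def cache_tree_stats(root):
    """Walk `root` and return (directories, files, total_bytes).

    Hypothetical helper for illustration only. Sizes are apparent file
    sizes (st_size), so the total can differ from what du prints.
    """
    dirs = files = total = 0
    stack = [root]  # explicit stack instead of recursion
    while stack:
        path = stack.pop()
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        dirs += 1
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        files += 1
                        total += entry.stat(follow_symlinks=False).st_size
        except OSError:
            # The directory was removed while we were walking (a cleaner
            # may be running at the same time) or is unreadable: skip it.
            continue
    return dirs, files, total
```

Calling cache_tree_stats on the cache root (whatever CacheRoot points
to) returns a (directories, files, bytes) tuple. A very high ratio of
directories to files would suggest that much of the traversal time goes
into nearly empty directories.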
Someone replied to me off-list suggesting using Squid instead of httpd
for the front-end caching reverse proxy. I guess that is a good question
- I use Apache for proxying mainly because I know Apache quite well, and
like being able to use mod_rewrite and the other neat features that
httpd provides. I've never used Squid. Does anyone have opinions there? Is Squid
better at managing its cache files in a sane (and efficient, i.e. no
100% iowait) fashion?
Does anyone run a 3-layer combination of Squid for cache, then an
Apache front-end proxy (no mod_cache) for its mod_rewrite capabilities,
and then the back-end mod_perl server?
I need mod_rewrite at some point for stuff like stopping image
hotlinking from other websites (people stealing my bandwidth by making
my server act as an image server for their forums, auctions etc), and
other access control stuff. I'll have to look into whether Squid can do
all that.
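
For anyone comparing front ends, the hotlinking check itself is simple
enough to pin down independently of the server. The following is a
Python sketch of the logic only; it is not mod_rewrite or Squid syntax,
and the host names, the extension list and the function name is_hotlink
are all assumptions made up for illustration. Requests with no Referer
are allowed through here, since many browsers and proxies omit it.

```python
from urllib.parse import urlsplit

# Hypothetical values: the site's own host names and the file types
# that should be protected from hotlinking.
ALLOWED_HOSTS = {"example.com", "www.example.com"}
IMAGE_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")


def is_hotlink(path, referer):
    """Return True if an image request appears to come from another site."""
    if not path.lower().endswith(IMAGE_EXTENSIONS):
        return False  # only image requests are checked
    if not referer:
        return False  # no Referer header: let the request through
    host = urlsplit(referer).hostname  # lowercased, or None if unparsable
    return host is None or host not in ALLOWED_HOSTS
```

With these assumed values, a request for /img/a.jpg referred from
http://forum.example.net/ would be refused, while the same request
referred from http://www.example.com/ would be served. Whatever sits at
the front needs some way to express this test.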
I'm open to alternatives, if it turns out that Apache's mod_cache simply
isn't mature enough yet. I notice that some of the features of mod_cache
have not even been implemented, so maybe this module isn't really
ready for prime time? Opinions? Surely most people using mod_perl in
a production environment must be using some form of reverse proxy, since
it just makes so much sense from a server utilization point of view.
Thanks again,
Neil