Ryan Thompson wrote:

> Mark Maunder wrote to Ryan Thompson:
>
> > Ryan Thompson wrote:
> >
> > > There must be a faster way. I have thought about pre-compiling each
> > > HTML file into a Perl module, but there would have to be an automated
> > > (and secure) way to suck these in if the original file changes.
> > >
> > > Either that, or maybe someone has written a better parser. My code
> > > looks something like this, to give you an idea of what I need:
> >
> > Sure, there are tons of good template systems out there. I think
> > someone made a comment about writing a template system being a
> > rite of passage as a Perl developer. But it's also more fun to do
> > it yourself.
>
> :-)
>
> > I guess you've tried compiling your regex with the o modifier?
>
> Yep, problem is there are several of them. I've done some work
> recently to simplify things, which might have a positive effect.
>
> > Also, have you tried caching your HTML in global package variables
> > instead of shared memory?  I think it may be a bit faster than
> > shared memory segments like Apache::Cache uses. (The first request
> > for each child will be slower, but after they've each served once,
> > they'll all be fast). Does your engine stat (access) the html file
> > on disk for each request? You mentioned you're caching, but
> > perhaps you're checking for changes to the file. Try to stat as
>
> My caching algorithm uses 2 levels:
>
> When an HTML file is requested, the instance of my template class
> checks in its memory cache. If it finds it there, great... everything
> is done within that server process.
>
> If it's not in the memory cache, it checks in a central MySQL cache
> database on the local machine. These requests are on the order of a
> few ms, thanks to an optimized query and Apache::DBI. NOT a big deal.
>
> If it's not in either cache, it takes its lumps and goes to disk.
>

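Filling in the blanks, I picture that lookup going roughly like this (just a
sketch - the method and field names are guesses on my part):

    sub fetch {
        my ($self, $file) = @_;

        # level 1: this child's own in-memory cache
        my $hit = $self->{cache}{$file};
        return $hit->{html} if $hit && $hit->{expires} > time();

        # level 2: the central MySQL cache on the local box (via Apache::DBI)
        my $html = $self->fetch_from_mysql($file);

        # miss both: take the lumps and read the file from disk
        $html = $self->read_template_file($file) unless defined $html;

        $self->{cache}{$file} = { html => $html, expires => time() + $self->{ttl} };
        return $html;
    }
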
If you're using a disk-based table, in most cases MySQL would access the
disk itself anyway. So whether you're getting the cached data from MySQL or a
file, it's still coming from disk. (Yes, MySQL caches - especially if you're
using InnoDB tables - but you're not guaranteed to save a disk access.) Not
sure how much HTML/content you have, but any chance you can stick it all in
shared memory, or even better, give each child its own copy in a package
global variable (like a hashref)? If it's under a meg (maybe even 2) you
might be able to get away with that.
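
Something along these lines is all it takes (a sketch only - the package name
and parse_file() are made up):

    package My::TemplateCache;

    # one copy per Apache child, filled lazily on that child's first request
    our %HTML;

    sub get {
        my ($file) = @_;
        return $HTML{$file} if exists $HTML{$file};

        # first hit in this child: parse once, keep it in the child's memory
        $HTML{$file} = My::Template->parse_file($file);
        return $HTML{$file};
    }

After each child has served a page once, every later request comes straight
out of that child's own memory - no stat, no MySQL round trip, no
shared-memory locking.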

>
> In each cache, I use a TTL. (time() + $TTL), which is configurable,
> and usually set to something like 5 minutes in production, or 60
> seconds during development/bug fixes. (And, for this kind of data, 5
> minutes is pretty granular, as templates don't change very often, but
> setting it any higher would, on average, have only a negligible
> improvement in performance at the risk of annoying developers :-).
>
> And, with debugging in my template module turned on, it has been
> observed that cache misses are VERY infrequent (< 0.1% of all
> requests).
>
> In fact, if I use this cache system and disable all parsing (i.e.,
> just use it to include straight HTML into mod_perl apps), I can serve
> 150-200 requests/second on the same system.
>
> With my parsing regexps enabled, it drops to 50-60 requests/second.
>
> So, to me, it is clear where performance needs to be improved. :-)

How about, instead of having a cache expiry/TTL, you parse the HTML on the
first request only and then always serve from the cache? To refresh the
cache, you set a flag in shared memory. Then, whenever a child is about to
serve from cache, it just checks the flag in shared memory to see if it needs
to refresh its cache. That way you can 'push' out new code by setting the
flag, and unset it once all running children have re-read from the cache.
You'll also have every request served from the cache except the first. And if
you keep the flag set, each child serves live code for every request, so your
developers can work in real time without a 60-second latency before new HTML
takes effect when you want them to.
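
One way to wire up that flag (a sketch only, using IPC::Shareable for the
shared segment; here the flag is a push timestamp rather than a set/unset
bit, so you don't have to track when every child has re-read before clearing
it, and My::Template->parse_file is the same made-up stand-in as above):

    use IPC::Shareable;

    our %HTML;       # this child's copy of the parsed templates
    our %LOADED_AT;  # when this child last parsed each file

    # every child attaches to the same shared-memory segment
    tie our %FLAG, 'IPC::Shareable', 'tmpl', { create => 1 };

    sub get {
        my ($file) = @_;

        # re-parse if someone has pushed since this child last loaded the file
        if (($FLAG{pushed} || 0) > ($LOADED_AT{$file} || 0)) {
            delete $HTML{$file};
        }

        unless (exists $HTML{$file}) {
            $HTML{$file}      = My::Template->parse_file($file);
            $LOADED_AT{$file} = time();
        }
        return $HTML{$file};
    }

    # pushing new templates is then a one-liner from an admin script:
    #   $FLAG{pushed} = time();

And for the developers' always-live mode, set the timestamp to some far-future
value (or keep bumping it) and every request re-parses from disk, just as if
the flag were left set.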
