Probably not what you want to hear, but any chance you could put the
templates (which you say are far smaller than your terabyte of
content) local on all the relevant boxes?  We had a similar problem a
few years ago and decided to put our template stuff local (which, yes,
was a little bit of work) and not touch nfs for them at all.  Local
stats are pretty much free, or at least much closer to free than nfs
stats.  And yeah, we had several terabytes of storage, but our
templates were much, much smaller.
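
If you land on TT, the template side of that move is just pointing
INCLUDE_PATH at the local replica (kept in sync by whatever you
already use to deploy, e.g. a periodic rsync job).  A minimal sketch;
the path is made up:

  use Template;

  # INCLUDE_PATH on local disk rather than the nfs mount, so every
  # stat() is a local-filesystem call and effectively free.
  my $tt = Template->new({
      INCLUDE_PATH => '/var/www/templates',   # hypothetical local copy
  });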

Hope this helps,
Earl

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Tim Tompkins
Sent: Tuesday, July 05, 2005 2:38 PM
To: Andy Wardley
Cc: Andy Lester; [email protected]
Subject: Re: [Templates] Template Caching & premature optimization

Thanks, Andy, for this analysis, but unfortunately it really doesn't
come close to the scale I'm dealing with.  I have a couple of thousand
apache and thttpd processes constantly hitting nfs shares for file
stats on over a terabyte of content (only a very small fraction of
this is web templates).  And we already have projects slated to
migrate other sites that will double the traffic.  This is definite
and we need to be ready for tripling the traffic within the next 2-3
years.  We are, however, due for some current benchmarking which will
have to be done anyway as development ensues on our rewrite.  Previous
benchmarking was performed a few years ago by the CTO at the time, and
is no longer available.

Certainly I can see where this thread is being read as premature
optimization.  That's my fault for not giving the full scope of the
issue and just leaving it as "too much nfs activity."  But I don't see
it as premature optimization.  I see it as an unnecessary call beyond
the initial page load, much like repeatedly checking that my chair
still exists after I've sat down in it.  Once it's there and it's in
use, it does not require re-validation.  If the chair were to break
while I'm sitting in it, the entire process of sitting down must be
restarted--I get up, find a replacement chair and then sit down again.
It's the same thing with templates: if an error is found in a
template, then a revision is made which must be approved; it then
replaces the template and the servers are restarted.  Think of the
templates less in the light of traditional web pages and more in the
light of perl modules.  Perl doesn't care if a module has changed or
even if it has been deleted from disk after it's been loaded.  If you
want a changed library to take effect, you (typically) must bounce the
process.

This may sound a bit over the edge, but it helps to ensure the
integrity of any code that could be used for processing credit cards.
Only a few people can approve these types of changes, while many
people may have their hands in the development of templates.  As a
further complication, those who can approve changes cannot be involved
in them beyond reviewing the revision.  I've really been trying to
avoid getting into much detail here; it's time consuming and borders
on disclosing company policy.  I was hoping that simply stating that
this is my need and asking "what is the accepted approach with TT"
would suffice, but it seems that there's no "accepted" approach.

For whatever reason, and whether it's accepted by the community or
not, I have a few goals in mind for our redesign that I'm hoping TT
can come close to meeting.  Here are a couple that are relevant to
this topic:

  * Mark certain templates as "protected" so they cannot be modified
after being loaded, and reinstate the ability to modify non-sensitive
pages (which mostly eliminates this whole stat issue from my
perspective: except for the protected components, statting a file
would once again be needed).

  * Preload selective (primarily the protected) templates in the
parent apache (1.3) process to ensure that changes can't sneak through
as new apache children are spawned (see the sketch just below).
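
To make both goals concrete, here is the rough, untested shape I have
in mind, using only documented TT options (LOAD_TEMPLATES to chain
providers, STAT_TTL per provider); the paths and template names below
are made up:

  use Template;
  use Template::Provider;

  # Protected templates: statted once at load time, then effectively
  # never re-checked (STAT_TTL pushed out to roughly a year).
  my $protected = Template::Provider->new({
      INCLUDE_PATH => '/web/tt/protected',     # hypothetical path
      STAT_TTL     => 60 * 60 * 24 * 365,
  });

  # Everything else keeps the default 1-second stat behaviour.
  my $normal = Template::Provider->new({
      INCLUDE_PATH => '/web/tt/pages',         # hypothetical path
  });

  my $tt = Template->new({
      LOAD_TEMPLATES => [ $protected, $normal ],
  });

  # Preload the protected set in the apache parent (e.g. from a
  # mod_perl startup file) so each child inherits the compiled
  # templates and never goes back to disk for them.
  $tt->context->template($_)
      for qw( checkout.html payment.html );    # hypothetical names

Whether that's anywhere near the accepted approach, I don't know; it's
just where my reading of the docs has taken me so far.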

If these two goals in particular can be done with TT, then this issue
is resolved for me as soon as I find out how.  Otherwise, I'm left
with locking down *all* template revisions until I come up with an
alternative.  For as much as I know about TT at this point, it might
mean sub-classing from Template::Provider, but as I mentioned, I'm new
to TT and I'd really prefer to keep my hands out of there until I
become more familiar with it.

Locking down template revisions (in part or in whole) is a tiny detail
in the big picture, and it's not being done because I *want* to do it
or because I think it's the best approach (it's certainly not the
easiest); it's being done because I *must* do it to demonstrate strict
auditing policy over any piece of code involved in a point of sale.
We've not yet solidified our final templating solution; I'm still
working through discovery, and so far TT is the front-runner.  This
entire issue may come down to my head being stuck in previous
solutions that I really need to rethink.  But I was just looking for a
response on how this has been dealt with previously by experienced TT
users (as I think was the point of the original post on this thread).
My joining this thread was simply because it sounded similar to what I
will be dealing with.

--
Tim


Andy Wardley wrote:

>Andy Lester wrote:
>
>Actually, you'll only have half a million stat calls, which according
>to my test below is less than a second of machine overhead per day.
>
>  perl -MBenchmark -e 'timethis(432_000, sub { stat $0 })'
>
>  timethis 432000: -1 wallclock secs ( 0.18 usr +  0.28 sys =  0.46 CPU)
>                    @ 939130.43/s (n=432000)
>
>Why?  $Template::Provider::STAT_TTL is set to 1 (second) by default.
>That means that each file is checked once a second, at most,
>regardless of how many page impressions you're getting.  That's 86k
>stat() calls per day (60*60*24), per template used (which I assumed
>to be 5 in the calculation above) = 432,000.
>
>And even if you were hitting stat() for every template, for every
>page, 20 million stat() calls is still only approx. 20 seconds of
>processor overhead per day.  That's pretty cheap.
>
>You mention that you're mounted across NFS, which will certainly make
>things a little slower.  But if you're looking to speed things up,
>then replicating the templates to a local filesystem is going to have
>a much greater impact than trying to optimise away stat() calls.
>
>So I think Andy's advice is sound: measure what you're doing, and be
>sure that you're optimising the right thing.
>
>I personally suspect that tuning out the stat() calls isn't going to
>save you a great deal of time, but I could be wrong.  So if you want
>to reduce the number of stat calls, simply set STAT_TTL to a higher
>value.
>
>  $Template::Provider::STAT_TTL = 60;
>
>HTH
>A


_______________________________________________
templates mailing list
[email protected]
http://lists.template-toolkit.org/mailman/listinfo/templates
