On Monday, April 14, 2003, at 07:02 PM, Matt Sergeant wrote: <snip>
OK, at the moment we cache at the end, prior to delivery. The cache has to
try very hard to figure out if it should be applied or not (by checking
mtime on every single resource involved in the transaction). This works
well for direct pipelines of XSLT -> XSLT -> XSLT, but sucks for anything
else.
For anything involving XSP, it means you have no control. Caching is off.
With incremental caching you get the magic of the current cache
implementation, but happening at all stages in the pipeline. That can only
be slower.
What I'm trying to say is that I think I designed the caching system
wrong, or at least "too smart". Instead I'd prefer the user to decide when
the cache gets used (witness also the confusion about the cache
being used despite changes in querystring). The sensible place for this to
occur is another "stage" in the pipeline. So when you design your pipeline
you choose where the caching occurs (hence no need for the incremental
caching stuff as the cache becomes a manual thing). You also choose what
things (other than files) affect the cache - like TTL, querystring, POST
params, etc.
This is not totally coherent yet, but hopefully it's getting closer.
I think it's a nice thought, but there are going to be a lot of caveats.
- If the cache "stage" is towards the end of the pipeline, it's still going to have to check all the 'dependencies' before it, so the overhead is the same. If it's towards the start, then the cache isn't going to be very effective since the later stages will always have to be run.
- Unless there are multiple cache "stages", there's going to be the same issue of forcing a re-run of the entire pipeline if anything changes anywhere. If there are multiple stages to avoid this, then the overhead is similar to that of incremental caching.
- Users are going to have to have detailed knowledge of how elements in the pipeline work in order to place cache points appropriately, and to be able to specify what affects the caching.
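To make that last point concrete, here is a rough sketch (in Python rather than Perl, purely for illustration) of what a manually placed cache "stage" might look like. Every name in it (CacheStage, vary_on, ttl, render) is made up for this example and is not an AxKit API. The point is that the user has to declare everything that affects the output:

```python
import time


class CacheStage:
    """Hypothetical user-placed cache point wrapping the stages before it."""

    def __init__(self, upstream, vary_on=(), ttl=None):
        self.upstream = upstream       # callable(request) -> output of the earlier stages
        self.vary_on = tuple(vary_on)  # request fields the user says affect the output
        self.ttl = ttl                 # lifetime in seconds, or None for no expiry
        self._store = {}               # cache key -> (timestamp, output)

    def _key(self, request):
        # Only the fields the user listed take part in the cache key.
        return tuple((name, request.get(name)) for name in self.vary_on)

    def __call__(self, request, now=None):
        now = time.time() if now is None else now
        key = self._key(request)
        hit = self._store.get(key)
        if hit is not None and (self.ttl is None or now - hit[0] < self.ttl):
            return hit[1]
        output = self.upstream(request)
        self._store[key] = (now, output)
        return output


calls = []


def render(request):
    # Stand-in for the XSP/XSLT stages upstream of the cache point.
    calls.append(request["querystring"])
    return "page for " + request["querystring"]


# The user remembered that the querystring matters.
good = CacheStage(render, vary_on=["querystring"], ttl=60)
# The user declared nothing, so every request shares one cache entry.
bad = CacheStage(render)
```

With `bad`, a request for `a=2` gets whatever was cached for `a=1`, which is exactly the querystring confusion mentioned above, except now it is the user's job to avoid it.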
I'll be blunt - implementing it in this way would cause a lot of confusion for users and would most likely result in a lot of people doing it the wrong way. You'd be forcing the users to deal with something that can be handled internally for the most part. The only benefit would be the ability to optimize for specific scenarios (which I would consider rather premature optimization).
Ideally the caching should be done the same way Cocoon does it - as part of each "stage" element. Each stage is responsible for its own dependency checking and caching. Each "stage" can then take options from the user to manipulate the way it performs the caching. In this situation it is still possible to optimize the caching by manipulating each "stage" and by also coding the pipeline to be able to calculate those caching points that are redundant (e.g. "stages" that are followed by another "stage" that depends only upon its input).
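A minimal sketch of what I mean, again in Python for illustration only. This is not how Cocoon or AxKit actually implement it, and Stage, Pipeline and cache_points are hypothetical names. Each stage reports the versions of its own dependencies (think file mtimes), and the pipeline skips caching after any stage whose successor has no dependencies of its own:

```python
class Stage:
    """Hypothetical pipeline stage that knows its own dependencies."""

    def __init__(self, name, transform, deps=()):
        self.name = name
        self.transform = transform  # callable(input_text) -> output_text
        self.deps = tuple(deps)     # zero-arg callables returning a version, e.g. an mtime
        self.runs = 0               # how many times the transform actually ran

    def versions(self):
        return tuple(dep() for dep in self.deps)


class Pipeline:
    def __init__(self, stages):
        self.stages = list(stages)
        last = len(self.stages) - 1
        # Cache after stage i only if the next stage can go stale on its own.
        # If the next stage depends only on its input, a cache here is redundant.
        self.cache_points = {
            i for i in range(len(self.stages))
            if i == last or self.stages[i + 1].deps
        }
        self._source = None
        self._seen = {}    # stage index -> dependency versions at its last run
        self._output = {}  # stage index -> cached output (cache points only)

    def run(self, source):
        current = [s.versions() for s in self.stages]
        if source != self._source:
            stale = 0
        else:
            # First stage whose dependencies changed since it last ran.
            stale = next((i for i, v in enumerate(current)
                          if self._seen.get(i) != v), len(self.stages))
        # Resume from the nearest cached output before the stale stage.
        start = stale
        while start > 0 and (start - 1) not in self._output:
            start -= 1
        data = source if start == 0 else self._output[start - 1]
        for i in range(start, len(self.stages)):
            stage = self.stages[i]
            data = stage.transform(data)
            stage.runs += 1
            self._seen[i] = current[i]
            if i in self.cache_points:
                self._output[i] = data
        self._source = source
        return data


# Stand-ins for file mtimes.
versions = {"page.xsp": 1, "style.xsl": 1}
xsp = Stage("xsp", lambda d: d + ">xsp", deps=[lambda: versions["page.xsp"]])
strip = Stage("strip", lambda d: d + ">strip")  # depends only on its input
style = Stage("style", lambda d: d + ">style", deps=[lambda: versions["style.xsl"]])
pipe = Pipeline([xsp, strip, style])
```

Here the pipeline works out for itself that caching after "xsp" is pointless, because "strip" only ever needs re-running when "xsp" does. Changing only the stylesheet re-runs just the "style" stage, and the user never had to place a cache point.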
Regards, Chris