On Wednesday, April 9, 2003, at 03:24 PM, Kip Hampton wrote:
<snip>
Some things to note and random ideas:

* There's a small increase in requests per second with caching turned off, but a modest loss with caching turned on.

This agrees with my assessment. Of course you should also test with dynamic objects later in the pipeline (eg. XSLT -> XSLT -> XSP). In those situations you should see very marked increases in performance, as the first two stages are entirely cacheable.
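To make the point concrete, here is a toy sketch (all names and the cache-key scheme are my own assumptions for illustration, not AxKit's internals) of why per-stage caching helps when only the last stage is dynamic: on a repeat request, the cached XSLT stages are skipped and only the XSP stage re-runs.

```perl
use strict;
use warnings;

# Hypothetical per-stage cache keyed on stage name plus input document.
my %stage_cache;
my @pipeline = (
    { name => 'xslt1', dynamic => 0, run => sub { "$_[0]+xslt1" } },
    { name => 'xslt2', dynamic => 0, run => sub { "$_[0]+xslt2" } },
    { name => 'xsp',   dynamic => 1, run => sub { "$_[0]+xsp"   } },
);

sub process {
    my ($input) = @_;
    my ($doc, $ran) = ($input, 0);
    for my $stage (@pipeline) {
        my $key = "$stage->{name}|$doc";
        if (!$stage->{dynamic} && exists $stage_cache{$key}) {
            $doc = $stage_cache{$key};   # cache hit: skip this stage's work
            next;
        }
        my $out = $stage->{run}->($doc);
        $stage_cache{$key} = $out unless $stage->{dynamic};
        $doc = $out;
        $ran++;                          # count stages actually executed
    }
    return ($doc, $ran);
}

my ($first,  $ran1) = process('doc');   # cold: all three stages run
my ($second, $ran2) = process('doc');   # warm: only the dynamic XSP stage runs
print "$ran1 then $ran2 stages executed\n";   # 3 then 1
```

With the all-or-nothing scheme, any dynamic stage anywhere in the chain forces every stage to re-run.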

* The content-length returned from requests to the CVS version and the patched version differs.

Really? I didn't notice that... AFAIK a Content-Length header is only actually returned when delivering from a cache file (Apache calculates the content-length, etc., after $r->filename is set and the handler declines). When delivering normally I noticed that there is no Content-Length header (although HTTP/1.1 connections use chunked encoding, which is preferable to Content-Length anyway).

The changes to the way data is returned from the LibXSLT Language module in the new Provider/Cache/Language interactions further expose differences in the document returned from XML::LibXSLT when the result is selected as a DOM vs. as a string. That is, given:
<snip>
hence, if only the DOM is returned, there may be unexpected results.

AFAIK my patches shouldn't change this situation at all. At the moment a language module can return a DOM (via pnotes), or just a string, and the next module has the opportunity to look for the DOM or the string in pnotes. So are you saying that there are actually differences in the output under the new patches? Because that seems strange...
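The handoff I'm describing can be sketched like this (a plain hash stands in for the Apache request's pnotes(), and the key names 'dom_tree' and 'xml_string' are my assumptions for illustration, not necessarily AxKit's exact keys):

```perl
use strict;
use warnings;

# Stand-in for $r->pnotes() -- the real thing lives on the request object.
my %pnotes;

# A language module finishing its stage may leave a DOM, a string, or both.
sub stage_output {
    my ($dom, $string) = @_;
    $pnotes{dom_tree}   = $dom    if defined $dom;
    $pnotes{xml_string} = $string if defined $string;
}

# The next module prefers the prebuilt DOM and falls back to the string.
sub next_stage_input {
    return $pnotes{dom_tree} if exists $pnotes{dom_tree};
    return $pnotes{xml_string};
}

stage_output(undef, '<doc>hello</doc>');
print next_stage_input(), "\n";   # no DOM was stored, so the string wins
```

Either way the consuming module decides what it wants; the patches don't force one representation on it.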

This is not directly related to Chris' proposed patches, but it does point to the fact that we may have to rethink the very idea of passing around a DOM tree between Provider and Language modules and between different Language modules in the processing chain. The sad fact is that if we are to adhere to the notion of least surprise for AxKit users, we may need to fully serialize and re-parse at every processing stage in order to allow each processor the chance to perform whatever serialization magic it needs to deliver the proper result.

I don't see any reason why AxKit should serialise and re-parse at each stage. If a language processor is known to be broken wrt using a prebuilt DOM, it can query the provider for get_strref and then parse its own DOM internally. It would be silly to impose that overhead when the modules being used don't need it, and it would also encourage people to fix the bugs that stop prebuilt DOMs working properly.
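That opt-out looks roughly like this (a stub stands in for a real provider; get_strref is modelled on AxKit's provider convention, and the XML::LibXML call is shown commented out since the module may not be installed everywhere):

```perl
use strict;
use warnings;

# Minimal stub provider exposing the string-ref interface.
package StubProvider;
sub new        { my ($class, $xml) = @_; bless { xml => $xml }, $class }
sub get_strref { my $self = shift; return \$self->{xml} }

package main;

my $provider = StubProvider->new('<doc>raw bytes</doc>');

# A processor that distrusts prebuilt DOMs asks for the raw bytes and
# builds its own tree, paying the re-parse cost only for itself:
my $strref = $provider->get_strref;
# my $dom = XML::LibXML->new->parse_string($$strref);   # real code would do this
print "got ", length($$strref), " bytes to parse privately\n";
```

So a broken processor can protect itself without the framework penalising every well-behaved module in the chain.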

* It wasn't clear to me whether or not the new Language module design allows direct output to the client if the current processor is the last in the chain. If not, this presumption puts limits on what can be integrated as last-in-chain Language modules that may need access to the Apache object to return content appropriately.

The last-in-chain module can't output directly to the client, since the response should be returned to AxKit for it to deliver. But I don't think the previous code did either. I noticed some modules (eg. LibXSLT) went to the effort of serialising the DOM and printing it if last-in-chain was in effect, but the AxKit::Apache module simply redirects the print and puts the data in pnotes('xml_string') anyway - so it still isn't direct output (and is thus kind of pointless... maybe this is historic?).
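The redirected print behaves roughly like this mock (my assumption about the mechanism for illustration, not AxKit::Apache's actual code): the module believes it is printing to the client, but the wrapper just accumulates the bytes into pnotes.

```perl
use strict;
use warnings;

# Mock of the AxKit::Apache wrapper: print() is intercepted and the
# output is stashed in pnotes('xml_string') instead of going anywhere.
package MockAxKitApache;
sub new    { bless { pnotes => {} }, shift }
sub print  { my $self = shift; $self->{pnotes}{xml_string} .= join '', @_ }
sub pnotes { my ($self, $key) = @_; return $self->{pnotes}{$key} }

package main;

my $r = MockAxKitApache->new;
$r->print('<result/>');                  # module "prints to the client"...
print $r->pnotes('xml_string'), "\n";    # ...but the bytes landed in pnotes
```

Which is why the last-in-chain serialise-and-print dance in LibXSLT buys nothing over just returning the result.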

* Bolting the result of the transformation onto the return code in the new Language modules (e.g. return (Apache::Constants::OK, \%results); ) seems a smidge hacky to me. They should return one or the other; returning both is an ugly mix of coding strategies.

It is a smidge hacky I agree. I did it to be backwards compatible. I guess the same could be achieved by returning a hashref with a 'status' member, and then detecting whether a scalar or a hash ref was returned by the processor back in AxKit.
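A sketch of what that detection might look like (the 'status' and 'output' key names are my assumptions, and a constant stands in for Apache::Constants::OK so the snippet is self-contained):

```perl
use strict;
use warnings;

use constant OK => 0;   # stand-in for Apache::Constants::OK

# Old style: status and results returned as a two-element list.
sub old_style { return (OK, { output => '<out/>' }) }

# Mooted new style: one hashref carrying its own 'status' member.
sub new_style { return { status => OK, output => '<out/>' } }

# Back in AxKit, the caller detects which convention the processor used.
sub handle_result {
    my @ret = @_;
    if (@ret == 1 && ref $ret[0] eq 'HASH') {
        return ($ret[0]{status}, $ret[0]);   # new style: unpack the status
    }
    return @ret;                             # old style: already a pair
}

my ($status, $result) = handle_result(new_style());
print "status=$status output=$result->{output}\n";
```

That would keep old processors working while letting new ones return a single, self-describing value.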

So, I guess the sum of my evaluation is "I'm not sure". There are definitely some good things in the proposed patches, but I think they go too far in some places.

I'm not sure you've really said where in particular they 'go too far'...

The questions that need to be answered are:

In general, does switching to incremental caching give us something that we don't already have, or is it arguably a better generic solution than the current all-or-nothing implementation (especially in light of the fact that the typical use-case seems to put the dynamic parts at the front of the processing chain)?

Well, it buys you advantages in the cases where the dynamic parts are later in the chain. And I still haven't heard a particularly convincing argument for why that shouldn't be a far more appropriate design in many cases. My view is that XSP is by nature a form of 'styling', and should be added in rather than being in the original XML document.

[ all: please change the subject to create a new thread if you want to discuss this point with me further. The initial discussion on this was on -users with the topic 'Adding XSP to an XSLT pipeline'. ]

Do the proposed patches make it easier for users to write custom Cache modules? How or how not?

My thoughts (for what they're worth) are that it shouldn't make too much difference compared to the previous way. The basic way the caching works internally is pretty much the same - it's just the interface that's changed to make it fit better with the whole 'Cache is a provider too' idea.

Regards,
Chris

