Hello,

On Thu, Jan 7, 2010 at 5:17 PM, Reinier van den Born <[email protected]> wrote:
> Hi Ard,
>
> Thanks for your reply.
>
> Your suggestion looks quite like what we need, but I am not sure whether
> it will really work for us.
>
> You are talking about an external resource where the (async) regeneration
> is simply triggered when the cache has expired.
It also works for internal resources.

> In our case we are dealing with a repository resource, where cache
> invalidations are triggered by resource changes.
> So normally update events are invalidating all caches up to our final page.
> If this continues to happen, the cache will be invalidated in between
> expires, so the problem would remain.

I think I addressed this, so that the old response is still being served.

> Or will the cache-expires option protect the "slowpart" cache from those
> cache invalidations?

I think so, but it was a long time ago that I built it.

> If not, maybe it is possible to set up an intermediate pipeline to provide
> the necessary isolation?

I would just test whether it does what you expect from it... I know the
Cocoon caching is quite complex, and this part was much more complex than
standard Cocoon caching.

> Some more general questions:
>
> I've been looking around a bit further and ran into the following page:
> http://wiki.onehippo.com/display/CMS/Using+asynchronic+get+for+cached+content
> If I am not mistaken this is using the async option on Cocoon's
> CachingSourceFactory.
> Is this related to what you mention or are these entirely different things?

Hey, I wrote that wiki page, great! This is exactly what I meant. I had only
forgotten that I added a separate async source factory for it... wow, I forgot
I did all that back then :-)) You should follow that page!

> Btw, if I understand the async option correctly, the regeneration is still
> initiated by an incoming request.

Yes... (you could have some cron script calling it if you want).

> Does that mean that if a request arrives first after an expire, it still
> gets the old content served?

Exactly...

> In other words, when page traffic is sufficiently low, the site will always
> produce expired content?

Well, that's what you get with an async fetch... IIRC, I did not add a cron
job that checks the cache for expired async entries and refetches them.
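The behaviour discussed above — a request arriving after the expire still gets the old content served, while a background thread refreshes the entry — can be sketched roughly as follows. This is an illustrative Python sketch of the stale-while-revalidate idea, not actual Cocoon code; all names are made up:

```python
import threading
import time

class AsyncCache:
    """Serve stale entries while a background thread refreshes them."""

    def __init__(self, expires, fetch):
        self.expires = expires    # seconds before an entry goes stale
        self.fetch = fetch        # function that (slowly) builds the content
        self.lock = threading.Lock()
        self.store = {}           # key -> (value, timestamp)
        self.refreshing = set()   # keys with a refresh currently in flight

    def get(self, key):
        with self.lock:
            entry = self.store.get(key)
            stale = entry is None or time.time() - entry[1] > self.expires
            if entry is not None and stale and key not in self.refreshing:
                # Kick off one background refresh; keep serving the old value.
                self.refreshing.add(key)
                threading.Thread(target=self._refresh, args=(key,)).start()
        if entry is None:
            # Very first request: no stale copy to serve, fetch synchronously.
            value = self.fetch(key)
            with self.lock:
                self.store[key] = (value, time.time())
            return value
        return entry[0]           # possibly stale, but served immediately

    def _refresh(self, key):
        value = self.fetch(key)
        with self.lock:
            self.store[key] = (value, time.time())
            self.refreshing.discard(key)
```

Note that this reproduces the limitation mentioned in the thread: a refresh is only ever triggered by an incoming request, so under low traffic an entry can stay stale indefinitely unless something like a cron job calls it periodically.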
> This shouldn't be a problem for the case in which we're experiencing our
> problems, but it would be something to take into consideration for
> different cases.

Well... yes, I see your point, but it was quite hard already :-)) Just trying
it is, I think, the best approach!

Regards,
Ard

> Regards,
>
> Reinier
>
> Reinier van den Born
> HintTech B.V.
>
> T: +31(0)88 268 25 00
> F: +31(0)88 268 25 01
> M: +31(0)6 494 171 36
>
> Delftechpark 37i | 2628 XJ Delft | The Netherlands
> www.hinttech.com
>
> HintTech is a specialist in eBusiness Technology (.Net, Java platform,
> Tridion) and IT projects.
> Chamber of Commerce The Hague nr. 27242282 | Sales Tax nr. NL8062.16.396.B01
>
>
> On Thu, Jan 7, 2010 at 11:37 AM, Ard Schrijvers
> <[email protected]> wrote:
>
>> Hello,
>>
>> On Thu, Jan 7, 2010 at 10:51 AM, Reinier van den Born
>> <[email protected]> wrote:
>> > Hi all,
>> >
>> > We are using Hippo 6 with a Cocoon frontend. A few days ago our frontend
>> > went down because it ran out of memory.
>> > Analysis pointed in the direction of the following:
>> >
>> > 1. There is a page based on quite a big document (thus memory intensive)
>> > that is updated every couple of minutes or so.
>> > 2. The page gets a timeout of 30 seconds, so browsers request an update
>> > with that frequency.
>> >
>> > The problem occurred when we had a lot of visitors. At some point the
>> > source document was updated and flushed from the cache.
>> > Many requests were coming in, and as long as the page wasn't cached,
>> > each request resulted in an attempt to regenerate the page.
>> > The result was a load of threads, each of them trying to build the page
>> > from the big document and thus together eating up all memory.
>> >
>> > In a way it surprised us that neither Apache httpd's mod_cache nor
>> > Cocoon's caching seems to have some mechanism to prevent these
>> > redundant parallel activities.
>>
>> I am not too surprised, as it is really hard to have one general solution.
>>
>> > Question is whether anyone can point us to a solution?
>> >
>> > Our own thoughts:
>> >
>> > First of all we need to ensure that at most one thread renders the page.
>> > For the other threads, two possibilities seem reasonable:
>> > 1. Serve the "outdated" page as long as the update is not ready.
>> > 2. Wait until the update is available.
>> > The first solution seems preferable, since it prevents the possibility
>> > of having a load of "dangling" requests.
>> >
>> > However, one of the solutions we are thinking of is to create our own
>> > Cocoon cache class that only allows the first thread to build a
>> > specific page and, using semaphores or so, has the others waiting for
>> > the first to finish.
>> > Obviously it is a solution of type 2, but it is attractive because it
>> > seems relatively easy to implement.
>> > We are basically hesitant to mess more than necessary with Cocoon's
>> > caching mechanism.
>> >
>> > Any advice/ideas on this?
>>
>> I wouldn't try to solve it in Cocoon's cache. Are you talking about
>> one single part that is the same for every request and takes a lot
>> of time to build?
>>
>> I did add something (quite some time ago) like asynchronous fetching
>> of sources, where you could configure an expires after which an
>> asynchronous fetch is done, serving from cache until the asynchronous
>> background thread has finished and replaced the existing cached part.
>> Sounds like something you need, isn't it? It was, though, designed for
>> something like fetching an external RSS feed, where you don't want to
>> wait for the feed while the external site is down, for example.
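Reinier's proposed custom cache class — let only the first thread rebuild a given page while the others wait on it ("single flight") — could be sketched like this. A rough Python illustration of the semaphore idea under discussion, not Cocoon code; every name here is invented:

```python
import threading

class SingleFlightCache:
    """Let at most one thread rebuild a given key; others wait for its result."""

    def __init__(self, build):
        self.build = build            # the expensive page-building function
        self.lock = threading.Lock()
        self.results = {}             # key -> built value
        self.in_flight = {}           # key -> Event set when the build finishes

    def get(self, key):
        with self.lock:
            if key in self.results:
                return self.results[key]
            event = self.in_flight.get(key)
            if event is None:
                # We are the first thread for this key: claim the build.
                event = threading.Event()
                self.in_flight[key] = event
                builder = True
            else:
                builder = False
        if builder:
            value = self.build(key)   # only one thread runs this per key
            with self.lock:
                self.results[key] = value
                del self.in_flight[key]
            event.set()
            return value
        event.wait()                  # all other threads block here (type 2)
        with self.lock:
            return self.results[key]

    def invalidate(self, key):
        with self.lock:
            self.results.pop(key, None)
```

This is the "type 2" behavior from the list above: waiting threads block until the build finishes, rather than being served the outdated page, so a flood of requests during a rebuild still means a flood of parked threads, just not a flood of parallel builds.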
>>
>> I am not totally sure it works for non-repository sources, but you
>> might want to try this:
>>
>> Suppose your slow pipeline is:
>>
>>   <map:match pattern="myslowpart">
>>
>> The place that calls this pipeline should be changed into:
>>
>>   <map:generate
>>     src="cached:cocoon://myslowpart?cocoon:cache-expires=60&amp;async=true"/>
>>
>> Off the top of my head: it has to be a call to the root sitemap, hence
>> this cocoon:// protocol.
>>
>> Another way, of course, to achieve something similar is to create your
>> own generator, which stores the result on the filesystem. You use this
>> for all requests and, every now and then, recompute the result and
>> store it again.
>>
>> Regards,
>> Ard
>>
>> > Thanks,
>> >
>> > Reinier
>> > ********************************************
>> > Hippocms-dev: Hippo CMS development public mailinglist
>> >
>> > Searchable archives can be found at:
>> > MarkMail: http://hippocms-dev.markmail.org
>> > Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
