Hello Reinier, thanks for your feedback and for writing down all your findings and analysis. Great that it fits your use case so well! Note that this feature is also really valuable for external sources, like an external GET to an RSS feed.
Regards Ard

On Fri, Jan 15, 2010 at 12:21 PM, Reinier van den Born <[email protected]> wrote:
> Hi Ard,
>
> Wanted to see what is really happening so had to fight with the logging
> system. Classloading problems.
> But I am happy to confirm: yes, it is working and doing pretty much what I
> need.
>
> For others to decide whether async: can be useful for them:
> - See manual page:
> http://wiki.onehippo.com/display/CMS/Using+asynchronic+get+for+cached+content
> - A cocoon: source is internally converted to a http request, so cannot be
> referring to a match in an internal-only="true" pipeline.
>
> Properties:
> - It will immediately return the page from the cache (even if the page
> expired long ago and cache is completely outdated).
> If this is a problem one can periodically generate requests for the page
> to refresh the cache.
>
> - Only if the cached page has expired it will request a background refresh
> of the cache.
> Refresh requests for distinct URLs are first collected in a queue.
>
> - Once per second it will try to refresh as many URLs from the queue as
> possible (limited by number of threads the refresher is allowed to use,
> threads running, etc).
> Note that this cycle creates a delay for having the refreshed page
> available of 1/2 second on average.
>
> - If a page is being refreshed, another refresh request for the same page
> may be queued again.
> So multiple refresh attempts for the same page may end up running in
> parallel (started 1 second apart), but this will be limited by:
> 1. The number of threads the refresher may use.
> 2. The time it takes to refresh the page (as soon as the first refresh is
> finished, the page is no longer expired so no more requests will be queued).
>
> One minor thought: the 1 second refresh-period (and the delay it causes)
> seems not to be a problem, but I (or my users) may change my mind about
> that.
> It might be useful to make this configurable (as max-threads is).
>
> Thanks for the excellent support!
>
> regards,
>
> Reinier
>
>
>
> On Tue, Jan 12, 2010 at 10:33 PM, Ard Schrijvers <[email protected]> wrote:
>
>> On Tue, Jan 12, 2010 at 5:59 PM, Reinier van den Born
>> <[email protected]> wrote:
>> > Hi Ard,
>> >
>> > Shoot, I am blind. The refresher is already dropping requests for keys that
>> > are already in the queue.
>> > So then I don't think there is a serious problem left :-).
>>
>> Great! It just works then? I like it when people actually use the
>> async stuff I wrote back then, it wasn't easy :-))
>>
>> Thanks for letting me know it works,
>>
>> Cheers
>>
>> >
>> >
>> > Reinier
>> >
>> >
>> > On Tue, Jan 12, 2010 at 5:50 PM, Reinier van den Born <[email protected]> wrote:
>> >
>> >> Hi Ard,
>> >>
>> >> This is becoming interesting :-).
>> >>
>> >> On Tue, Jan 12, 2010 at 1:16 PM, Ard Schrijvers <[email protected]> wrote:
>> >>
>> >>> On Tue, Jan 12, 2010 at 11:40 AM, Reinier van den Born
>> >>> <[email protected]> wrote:
>> >>> > Hello Ard,
>> >>> >
>> >>> > I followed your proposal to take the outside (http:) route.
>> >>> > So I have two pipeline matchers now: "lazysource" and "lazysource_direct",
>> >>> > where "lazysource" generates "async:http://host/lazysource_direct" and the
>> >>> > latter does the original work.
>> >>> >
>> >>> > This seems to work, ie. I get cached results in return until it expires,
>> >>> > while the _direct returns an updated result instantly.
>> >>> > The only odd thing is that when I log the matchers, "lazysource_direct"
>> >>> > appears to be invoked every time lazysource is.
>> >>>
>> >>> this is to compute the cachekey. The cached result should be returned
>> >>> still. So, that the call is done is correct
>> >>>
>> >>
>> >> Still a bit confusing (to me) that the logging action is also called when
>> >> only the key is generated.
>> >> I would expect it to be only called when entering the pipeline to
>> >> generate(),
>> >> Maybe I should configure the actions differently (I am not very experienced
>> >> with Cocoon).
>> >>
>> >>
>> >> > So it seems the cached result is used, but still gets the result to be
>> >> > generated. Sounds odd, because that would defy the purpose of the whole
>> >>
>> >> No, the cachekey is generated, see
>> >>> http://cocoon.apache.org/2.2/core-modules/core/2.2/690_1_1.html
>> >>
>> >>
>> >> Interesting, but goes a bit too deep to fully understand with my limited
>> >> knowledge :-)
>> >> The page they claim to make more intelligible is actually easier to
>> >> follow...but is probably outdated.
>> >> But does this one describe really what is happening for an async-ed Source?
>> >>
>> >> Still not really sure why cocoon would need to generate a cachekey for
>> >> "lazysource_direct" as long as "lazysource" is in cache and valid.
>> >> I can see it being necessary for normal caching go down a full cache tree.
>> >> But for an expiring, time-triggered cache that doesn't seem necessary, or?
>> >> Unless it is checking for existence. According to the page you refer to
>> >> that might be it...
>> >>
>> >> > exercise.
>> >>> >
>> >>> > I tried to put lazysource in a "caching", as opposed to "ecaching", pipeline
>> >>> > but that doesn't seem to make a difference.
>> >>> >
>> >>> > Made me wonder further, whether you have foreseen a mechanism that keeps
>> >>> > simultaneous requests from being able to kick off parallel requests for
>> >>> > "_direct" once it is outdated.
>> >>>
>> >>> you can try to create your own generator (configured in the
>> >>> cocoon.xconf to have a pool limit size of 1) and have in here a
>> >>> synchronized method, which blocks other requests
>> >>>
>> >>
>> >> Isn't having pool-size=1 and using synchronized, somehow doing things
>> >> double?
>> >>
>> >> Also it seems to me this would serialize all generate/refresh requests
>> >> (yes, this would prevent us from running out of memory)
>> >> but not suppress them. See below.
>> >>
>> >>
>> >> > Because that is what my original problem was.
>> >> >
>> >> > I made an attempt to locate the source code to take a peek myself. Only
>> >> > found something doing with async in the repository block of cocoon itself
>> >> > (CachingSource.java).
>> >> > Is that where I should be looking?
>> >>
>> >> you should take a look at the hippo cachingsource block, but be
>> >>> warned, Cocoon's caching is a very complex thing.
>> >>
>> >>
>> >> Just looking around the code I ran into refresher, the one that controls
>> >> the actual generating work on asynced stuff.
>> >> What if it were to be modified to drop refresh() requests that are already
>> >> in the queue (match on cacheKey)?
>> >> Since the refresher starts processing no more than one request per second
>> >> (which in cases may be limiting, might want to make a parameter of that?)
>> >> and is threadCount (which is a parameter) limited, unnecessary generation
>> >> will not be completely avoided, but they will be kept under tight control.
>> >> Quick upper limit estimate: number of threads+1 or so??
>> >>
>> >> Doesn't seem like a risky or complicated modification, but my view of the
>> >> world may be too simplified :-)
>> >> What do you think?
>> >>
>> >> Greetings,
>> >>
>> >> Reinier
>> >>
>> >>
>> >>
>> >>>
>> >>> Regards Ard
>> >>>
>> >>> >
>> >>> > Thanks,
>> >>> >
>> >>> > Reinier
>> >>> >
>> >>> > Reinier van den Born
>> >>> > HintTech B.V.
>> >>> > The Netherlands
>> >>> >
>> >>> > T: +31(0)88 268 25 00
>> >>> > F: +31(0)88 268 25 01
>> >>> > M: +31(0)6 494 171 36
>> >>> > HintTech is a specialist in eBusiness Technology ( .Net, Java platform,
>> >>> > Tridion ) and IT-Projects.
>> >>> > Chamber of Commerce The Hague nr. 27242282 | Sales Tax nr.
>> >>> > NL8062.16.396.B01
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> > On Fri, Jan 8, 2010 at 2:45 PM, Ard Schrijvers <[email protected]> wrote:
>> >>> >
>> >>> >> Hello Reinier,
>> >>> >>
>> >>> >> On Fri, Jan 8, 2010 at 2:37 PM, Reinier van den Born
>> >>> >> <[email protected]> wrote:
>> >>> >> > Hi Ard,
>> >>> >> >
>> >>> >> > I have followed the instructions, but it doesn't seem to work. At least not
>> >>> >> > for a cocoon: resource.
>> >>> >> >
>> >>> >> > I created a separate pipeline (original was calling a resource, so that was
>> >>> >> > easier).
>> >>> >> >
>> >>> >> > Without async: it works like before, but with async the cache seems not to
>> >>> >> > expire at all.
>> >>> >> > I tried both with and without a cache-expires argument.
>> >>> >> > This is what I am using:
>> >>> >> > <map:generate src="async:cocoon://lazysource"/>
>> >>> >> >
>> >>> >> > I also tried
>> >>> >> > <map:generate src="async:cocoon://lazysource?cocoon:cache-expires=10"/>
>> >>> >> >
>> >>> >> > When I switch back and forth between with async: and without async: I get
>> >>> >> > the first cached version and the latest version respectively.
>> >>> >> >
>> >>> >> > >>> just found out this is not entirely correct: after a real long while - at
>> >>> >> > least 20 minutes but more like an hour - the cache is being updated <<
>> >>> >> >
>> >>> >> > However when I replace the cocoon: url with a http: one things start to work
>> >>> >> > like you described.
>> >>> >> > So for instance
>> >>> >> > <map:generate src="async:http://www.anwb.nl/verkeer/verkeersinformatie_files_nl"/>
>> >>> >> > is updated properly (with the delay).
>> >>> >> > (except that the first request will show the old cached version, like we
>> >>> >> > discussed below, showing it is actually active :-)
>> >>> >> >
>> >>> >> > I hope this is just me overlooking something, because it looks very
>> >>> >> > promising.
>> >>> >> > Any ideas?
>> >>> >>
>> >>> >> It might have been broken for the cocoon:// protocol. I vaguely
>> >>> >> remember that it was extremely hard to accomplish for the cocoon://
>> >>> >> protocol.
>> >>> >>
>> >>> >> But can't you just do an http call to the cocoon instance? Thus, instead of:
>> >>> >>
>> >>> >> <map:generate src="async:cocoon://lazysource?cocoon:cache-expires=10"/>
>> >>> >>
>> >>> >> use
>> >>> >>
>> >>> >> <map:generate src="async:http://www.mydomain.com/lazysource?cocoon:cache-expires=10"/>
>> >>> >>
>> >>> >> Regards Ard
>> >>> >>
>> >>> >> >
>> >>> >> > Reinier
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> >> > On Thu, Jan 7, 2010 at 5:36 PM, Ard Schrijvers <[email protected]> wrote:
>> >>> >> >
>> >>> >> >> hello,
>> >>> >> >>
>> >>> >> >> On Thu, Jan 7, 2010 at 5:17 PM, Reinier van den Born
>> >>> >> >> <[email protected]> wrote:
>> >>> >> >> > Hi Ard,
>> >>> >> >> >
>> >>> >> >> > Thanks for your reply.
>> >>> >> >> >
>> >>> >> >> > Your suggestion looks quite like what we need, but I am not sure whether it
>> >>> >> >> > will really work for us.
>> >>> >> >> >
>> >>> >> >> > You are talking about an external resource where the (async) regeneration is
>> >>> >> >> > simply triggered when the cache has expired.
>> >>> >> >>
>> >>> >> >> it works also for internal
>> >>> >> >>
>> >>> >> >> > In our case we are dealing with a repository resource, where cache
>> >>> >> >> > invalidations are triggered by resource changes.
>> >>> >> >> > So normally update events are invalidating all caches up to our final
>> >>> >> >> > page.
>> >>> >> >> > If this continues to happen the cache
>> >>> >> >> > will be invalidated in between expires, so the problem would remain.
>> >>> >> >>
>> >>> >> >> I think I addressed this, where the old response is still being served.
>> >>> >> >>
>> >>> >> >> > Or will the cache-expires option protect the "slowpart" cache from those
>> >>> >> >> > cache invalidations?
>> >>> >> >>
>> >>> >> >> think so, but it was a long time ago I built it
>> >>> >> >>
>> >>> >> >> >
>> >>> >> >> > If not, maybe it is possible to set up an intermediate pipeline to provide
>> >>> >> >> > the necessary isolation?
>> >>> >> >>
>> >>> >> >> I would just test whether it does what you expect from it...I know the
>> >>> >> >> Cocoon caching is quite complex, and this part was much more complex than
>> >>> >> >> standard Cocoon caching
>> >>> >> >>
>> >>> >> >> >
>> >>> >> >> > Some more general questions:
>> >>> >> >> >
>> >>> >> >> > I've been looking around a bit further and ran into the following page:
>> >>> >> >> >
>> >>> >> >> > http://wiki.onehippo.com/display/CMS/Using+asynchronic+get+for+cached+content
>> >>> >> >> > If I am not mistaken this is using the async option on Cocoon's
>> >>> >> >> > CachingSourceFactory.
>> >>> >> >> > Is this related to what you mention or are these entirely different things?
>> >>> >> >>
>> >>> >> >> Heey, I wrote that wiki page, great! This is exactly what I meant. I
>> >>> >> >> only forgot I added a separate source-factory async for it...whow, I
>> >>> >> >> forgot I did all that back then :-))
>> >>> >> >>
>> >>> >> >> You should follow that page!
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> > Btw, if I understand the async option correctly, the regeneration is still
>> >>> >> >> > initiated by an incoming request.
>> >>> >> >>
>> >>> >> >> yes... (you can have some cron script calling it if you want?)
>> >>> >> >>
>> >>> >> >> > Does that mean that if a request arrives first after an expire it still gets
>> >>> >> >> > the old content served?
>> >>> >> >>
>> >>> >> >> Exactly....
>> >>> >> >>
>> >>> >> >> > In other words, when page traffic is sufficiently low, the site will always
>> >>> >> >> > produce expired content?
>> >>> >> >>
>> >>> >> >> Well, that's what you get with an async fetch...iirc, I did not add a
>> >>> >> >> cron job checking the cache for async expired entries and refetch
>> >>> >> >> them...
>> >>> >> >>
>> >>> >> >> >
>> >>> >> >> > This shouldn't be a problem for the case we're experiencing our problems,
>> >>> >> >> > but it would be something to take into consideration for different cases.
>> >>> >> >>
>> >>> >> >> Well....yes, I see your point, but it was quite hard already :-))
>> >>> >> >>
>> >>> >> >> Just try it I think is the best!
>> >>> >> >>
>> >>> >> >> Regards Ard
>> >>> >> >>
>> >>> >> >> >
>> >>> >> >> > Regards,
>> >>> >> >> >
>> >>> >> >> > Reinier
>> >>> >> >> >
>> >>> >> >> > Reinier van den Born
>> >>> >> >> > HintTech B.V.
>> >>> >> >> >
>> >>> >> >> > T: +31(0)88 268 25 00
>> >>> >> >> > F: +31(0)88 268 25 01
>> >>> >> >> > M: +31(0)6 494 171 36
>> >>> >> >> >
>> >>> >> >> > Delftechpark 37i | 2628 XJ Delft | The Netherlands
>> >>> >> >> > www.hinttech.com
>> >>> >> >> >
>> >>> >> >> > HintTech is a specialist in eBusiness Technology ( .Net, Java platform,
>> >>> >> >> > Tridion ) and IT-Projects.
>> >>> >> >> > Chamber of Commerce The Hague nr. 27242282 | Sales Tax nr. NL8062.16.396.B01
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >>
>> >>> >> Deleted tail of the conversation here.
********************************************
Hippocms-dev: Hippo CMS development public mailinglist

Searchable archives can be found at:
MarkMail: http://hippocms-dev.markmail.org
Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
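For readers who want to see the behaviour discussed in this thread in code form, here is a minimal Java sketch of the idea: serve whatever is in the cache right away, queue a refresh only when the entry has expired, drop keys that are already queued, and let a periodic cycle refresh a limited number of keys. This is not the Hippo cachingsource code. The class name AsyncRefreshSketch, its methods, the millisecond TTL, and the choice to fetch synchronously when nothing is cached yet are all assumptions made for illustration. The real refresher runs on background threads once per second; here the cycle is called by hand with an explicit clock so the behaviour is easy to follow.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration only; NOT the Hippo cachingsource implementation.
class AsyncRefreshSketch {

    // A cached value plus the time (in ms) at which it expires.
    private static final class Entry {
        final String value;
        final long expiresAt;
        Entry(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    // A set silently drops keys that are already queued; insertion order is kept.
    private final Set<String> refreshQueue = new LinkedHashSet<>();
    private final Function<String, String> fetcher; // stands in for the slow source
    private final long ttlMillis;                   // assumed analogue of cache-expires
    private final int maxPerCycle;                  // assumed analogue of max-threads

    AsyncRefreshSketch(Function<String, String> fetcher, long ttlMillis, int maxPerCycle) {
        this.fetcher = fetcher;
        this.ttlMillis = ttlMillis;
        this.maxPerCycle = maxPerCycle;
    }

    // Returns the cached value immediately, even when it is stale.
    // An expired entry only causes a refresh request to be queued.
    String get(String url, long now) {
        Entry entry = cache.get(url);
        if (entry == null) {
            // Assumption: with nothing cached yet, fetch synchronously once.
            entry = new Entry(fetcher.apply(url), now + ttlMillis);
            cache.put(url, entry);
            return entry.value;
        }
        if (now >= entry.expiresAt) {
            synchronized (refreshQueue) {
                refreshQueue.add(url); // no effect if the key is already queued
            }
        }
        return entry.value;
    }

    // One refresher cycle: refresh at most maxPerCycle queued keys.
    // Returns the number of keys that were refreshed.
    int runCycle(long now) {
        List<String> batch = new ArrayList<>();
        synchronized (refreshQueue) {
            Iterator<String> it = refreshQueue.iterator();
            while (it.hasNext() && batch.size() < maxPerCycle) {
                batch.add(it.next());
                it.remove();
            }
        }
        for (String url : batch) {
            cache.put(url, new Entry(fetcher.apply(url), now + ttlMillis));
        }
        return batch.size();
    }

    // Number of refresh requests currently waiting.
    int queued() {
        synchronized (refreshQueue) {
            return refreshQueue.size();
        }
    }

    public static void main(String[] args) {
        String url = "http://host/lazysource_direct";
        AsyncRefreshSketch cache = new AsyncRefreshSketch(u -> "content of " + u, 10_000L, 2);
        System.out.println(cache.get(url, 0L)); // first request fills the cache
        cache.get(url, 15_000L);                // expired: stale content, refresh queued
        cache.get(url, 15_100L);                // same key is not queued a second time
        System.out.println("queued: " + cache.queued());
        System.out.println("refreshed: " + cache.runCycle(16_000L));
    }
}
```

Because the refresh in this sketch runs inside runCycle itself, it cannot show the case Reinier describes where a page that is still being refreshed gets queued again; covering that would need real worker threads and is left out to keep the example short.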
