Re: [Fwd: Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h]
On Mon, October 30, 2006 4:54 am, William A. Rowe, Jr. wrote:

> You mean sandboxes (at least that's what we normally refer to them as,
> trunk/ IS the dev branch :) ...
>
> I strongly disagree because MOST of the flaws in the HTTP/1.1
> implementation, mod_proxy and even mod_cache exist because the development
> happened with insufficient oversight.
>
> Only code that's actively reviewed on trunk/ is going to get the level of
> scrutiny required; let's all row in the same direction, shall we?

+1.

Regards,
Graham
--
Re: [Fwd: Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h]
On 10/29/06, William A. Rowe, Jr. <[EMAIL PROTECTED]> wrote:

> I strongly disagree because MOST of the flaws in the HTTP/1.1
> implementation, mod_proxy and even mod_cache exist because the development
> happened with insufficient oversight.
>
> Only code that's actively reviewed on trunk/ is going to get the level of
> scrutiny required; let's all row in the same direction, shall we?

As long as the ideas are discussed on list first and we come to a consensus,
I'm fine with changes going into trunk instead of a branch. But when large
changes get dropped into trunk without any warning, it's extremely annoying.

-- justin
Re: [Fwd: Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h]
Ruediger Pluem wrote:

> Apart from this, Paul created a branch a while ago for mod_cache
> refactoring. As it has turned out, the whole thing has created some bigger
> discussion, and patches go in and out. So I think it would be a good idea
> to do this on a dev branch instead of the trunk. So I propose the
> following:

You mean sandboxes (at least that's what we normally refer to them as,
trunk/ IS the dev branch :) ...

I strongly disagree because MOST of the flaws in the HTTP/1.1
implementation, mod_proxy and even mod_cache exist because the development
happened with insufficient oversight.

Only code that's actively reviewed on trunk/ is going to get the level of
scrutiny required; let's all row in the same direction, shall we?

Bill
Re: [Fwd: Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h]
On 10/29/2006 01:56 PM, Graham Leggett wrote:

> Ruediger Pluem wrote:
>
>> Apart from this, Paul created a branch a while ago for mod_cache
>> refactoring. As it has turned out, the whole thing has created some
>> bigger discussion, and patches go in and out. So I think it would be a
>> good idea to do this on a dev branch instead of the trunk. So I propose
>> the following:
>>
>> 1. Create a dev branch again for mod_cache based on the current trunk.
>> 2. Roll back the patches on trunk.
>> 3. Work out the whole thing on the dev branch until there is consensus
>>    about the solution and only minor issues need to be addressed.
>> 4. Merge the dev branch back into trunk.
>> 5. Address the minor issues on trunk and tweak it there.
>>
>> This gives people who cannot follow the whole history the chance to
>> review the whole thing at step 4, as some sort of review of a complete
>> new module :-)
>
> A trunk by any other name, will still smell as sweet.
>
> If the branch was created beforehand, then this would have made sense,

It was there, but sadly it was not used. But I admit that I should have
pointed this out much, much earlier, so this is also my fault.

> but to have created the branch so late in the process, we are just
> creating work for ourselves that will ultimately end up with the same
> result.
>
> Currently history reflects the reality of what was tried along the road,
> what was objected to, backed out, and tried again.
>
> If we redo this without the objected-to bits, we are simply rewriting
> history (literally), thus removing the history's real value.

We do not actually remove the history. It will still live in the branch,
but yes, on the trunk, once we merge the branch back in, it would look like
some kind of magical jump. To get all the gory details you would need to go
through the logs of the (then deleted) branch for the history of the back
and forth and the rationales.

Regards

Rüdiger
Re: [Fwd: Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h]
Ruediger Pluem wrote:

> Apart from this, Paul created a branch a while ago for mod_cache
> refactoring. As it has turned out, the whole thing has created some bigger
> discussion, and patches go in and out. So I think it would be a good idea
> to do this on a dev branch instead of the trunk. So I propose the
> following:
>
> 1. Create a dev branch again for mod_cache based on the current trunk.
> 2. Roll back the patches on trunk.
> 3. Work out the whole thing on the dev branch until there is consensus
>    about the solution and only minor issues need to be addressed.
> 4. Merge the dev branch back into trunk.
> 5. Address the minor issues on trunk and tweak it there.
>
> This gives people who cannot follow the whole history the chance to
> review the whole thing at step 4, as some sort of review of a complete
> new module :-)

A trunk by any other name, will still smell as sweet.

If the branch was created beforehand, then this would have made sense, but
to have created the branch so late in the process, we are just creating work
for ourselves that will ultimately end up with the same result.

Currently history reflects the reality of what was tried along the road,
what was objected to, backed out, and tried again.

If we redo this without the objected-to bits, we are simply rewriting
history (literally), thus removing the history's real value.

Regards,
Graham
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Graham Leggett wrote:

> We have significant contributions from two people - Davi Arnaut and Niklas
> Edmundsson, and I've been integrating the issues fixed by both these
> contributions into a coherent workable whole, so that the effort spent by
> these people isn't wasted. Both of their efforts have focused on different
> aspects of the cache, making this workable. Some parts are not RFC
> compliant, other parts are not implemented elegantly enough, but these are
> details that need to be raised, addressed and fixed, not used as a feeble
> excuse to abandon the effort and return to some cache code that nobody
> wants to use.

I'm afraid that your count is wrong - the significant contributions came
from THREE people, not TWO: Issac, Davi and Niklas.

--
Eli Marmor
[EMAIL PROTECTED]
Netmask (El-Mar) Internet Technologies Ltd.
Tel.:   +972-9-766-1020          8 Yad-Harutzim St.
Fax.:   +972-9-766-1314          P.O.B. 7004
Mobile: +972-50-5237338          Kfar-Saba 44641, Israel
[Fwd: Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h]
On 10/27/2006 09:41 PM, William A. Rowe, Jr. wrote:

> Graham Leggett wrote:
>
>> I see lots of comments on the code, but the comments are summarised as
>> "the cache is fine as it is". It isn't. If it was fine, key users of
>> network caching wouldn't be standing up saying they're using something
>> else.
>
> I concur, but the history becomes a nightmare. Let's back out the vetoed
> efforts and work up -clean- patches to apply that solve one issue each,
> and don't raise the objections again?

Apart from this, Paul created a branch a while ago for mod_cache
refactoring. As it has turned out, the whole thing has created some bigger
discussion, and patches go in and out. So I think it would be a good idea to
do this on a dev branch instead of the trunk. So I propose the following:

1. Create a dev branch again for mod_cache based on the current trunk.
2. Roll back the patches on trunk.
3. Work out the whole thing on the dev branch until there is consensus about
   the solution and only minor issues need to be addressed.
4. Merge the dev branch back into trunk.
5. Address the minor issues on trunk and tweak it there.

This gives people who cannot follow the whole history the chance to review
the whole thing at step 4, as some sort of review of a complete new
module :-)

Furthermore it might be helpful to start the discussion Paul suggested
before. The patches from Niklas have been around for a while, and there had
been some discussion about them before, but not much happened until Graham
started to commit them. I must admit for myself that I did not have a very
close look at these patches before due to time constraints, but Graham's
commits "forced" me to have a closer look and to shift some priorities. So I
think his commits have been very helpful to get momentum on this topic, and
I would expect that this momentum can be kept even if we go for a dev branch
and have some "non commit mail" discussion in advance, as Paul suggested.

As I see some frustration in some of the recent mails from Niklas and
Graham: although I do not agree with every technical aspect of the patches
and agree with some of the critical remarks, a big thanks to both (and to
Davi of course) for being persistent on this topic and moving things
forward.

Regards

Rüdiger
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
William A. Rowe, Jr. wrote:

>> I see lots of comments on the code, but the comments are summarised as
>> "the cache is fine as it is". It isn't. If it was fine, key users of
>> network caching wouldn't be standing up saying they're using something
>> else.
>
> I concur, but the history becomes a nightmare. Let's back out the vetoed
> efforts and work up -clean- patches to apply that solve one issue each,
> and don't raise the objections again?

Currently I am aware of only one outstanding -1, and it came without any
technical justification, and so meant "I disagree". This most recent
objection was based on the assumption that r450105 still stood, when this is
no longer the case. Unfortunately r450105 was not in itself a clean patch,
and could not be backed out cleanly like the split buckets patch. If you
look at r467655, a large chunk of the patch is simply the deletion of the
contentious copy_body() function.

Joe also -1'd the split buckets patch from yesterday, as it duplicated the
native behaviour of apr_bucket_read(). This was backed out cleanly late
yesterday in preparation for today's fix, based on Joe's suggestion on how
to solve the problem.

> I understand the desire to make incremental progress, but patches of
> patches of patches make the overall history hard to follow, and the net
> results harder to review.

A significant amount of work has been done by Niklas to break up a
monolithic change that was never designed to be a series of patches into a
series of patches, and to be honest I think we have the best we could have
at this point. If we start backing out and reapplying again, the history
will be even harder to follow.

Regards,
Graham
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Graham Leggett wrote:

> I see lots of comments on the code, but the comments are summarised as
> "the cache is fine as it is". It isn't. If it was fine, key users of
> network caching wouldn't be standing up saying they're using something
> else.

I concur, but the history becomes a nightmare. Let's back out the vetoed
efforts and work up -clean- patches to apply that solve one issue each, and
don't raise the objections again?

I understand the desire to make incremental progress, but patches of patches
of patches make the overall history hard to follow, and the net results
harder to review.
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Graham Leggett wrote:

> On Fri, October 27, 2006 6:25 pm, Niklas Edmundsson wrote:
>
>> I would have been most happy if this had been fixed ages ago so I
>> hadn't been forced to spend lots and lots of hours kludging stuff
>> together. At least, my kludges seem to have sparked some development
>> in this area, so they have served some purpose other than enabling a
>> non-profit computer club building an FTP/HTTP server that actually
>> works.
>
> When Brian Atkins stood up at ApacheCon and presented a talk pretty much
> summarised as "we don't use Apache's cache because it sucked, and here's
> why", that was a wake-up call to realise that the disk cache in its
> current form sucks.
>
> We have significant contributions from two people - Davi Arnaut and Niklas
> Edmundsson, and I've been integrating the issues fixed by both these
> contributions into a coherent workable whole, so that the effort spent by
> these people isn't wasted. Both of their efforts have focused on different
> aspects of the cache, making this workable. Some parts are not RFC
> compliant, other parts are not implemented elegantly enough, but these are
> details that need to be raised, addressed and fixed, not used as a feeble
> excuse to abandon the effort and return to some cache code that nobody
> wants to use.

I particularly don't care if my patches are not used if a better solution
comes up. It takes time for ideas to mature.

> I see lots of comments on the code, but the comments are summarised as
> "the cache is fine as it is". It isn't. If it was fine, key users of
> network caching wouldn't be standing up saying they're using something
> else.

No one said that, but we must think before acting.

--
Davi Arnaut
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Fri, October 27, 2006 6:25 pm, Niklas Edmundsson wrote:

> I would have been most happy if this had been fixed ages ago so I
> hadn't been forced to spend lots and lots of hours kludging stuff
> together. At least, my kludges seem to have sparked some development
> in this area, so they have served some purpose other than enabling a
> non-profit computer club building an FTP/HTTP server that actually
> works.

When Brian Atkins stood up at ApacheCon and presented a talk pretty much
summarised as "we don't use Apache's cache because it sucked, and here's
why", that was a wake-up call to realise that the disk cache in its current
form sucks.

We have significant contributions from two people - Davi Arnaut and Niklas
Edmundsson, and I've been integrating the issues fixed by both these
contributions into a coherent workable whole, so that the effort spent by
these people isn't wasted. Both of their efforts have focused on different
aspects of the cache, making this workable. Some parts are not RFC
compliant, other parts are not implemented elegantly enough, but these are
details that need to be raised, addressed and fixed, not used as a feeble
excuse to abandon the effort and return to some cache code that nobody wants
to use.

I see lots of comments on the code, but the comments are summarised as "the
cache is fine as it is". It isn't. If it was fine, key users of network
caching wouldn't be standing up saying they're using something else.

Regards,
Graham
--
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Wed, 25 Oct 2006, Graham Leggett wrote:

> I managed to solve this problem last night.

This is what this code needed: someone with a clue on the Apache internals
so stuff can be solved properly. I have said it before and say it again: I'm
not that guy, but I know what functionality is needed for our use case.

People have complained about the kludges present in my patches, and yes,
they were kludgy. However, they miss the big point: despite the kludges they
get the job done, with the end result being something usable for our use
case. With good performance, no less. If I can improve stuff from the state
"unusable" to "actually pretty good" with kludges, then this should be a
rather obvious hint that things suck and should be fixed. Just repeating
"this is no good" probably won't achieve this.

If the goal is to never accept code that isn't perfect, mod_*cache never
should have been committed to the httpd tree, and probably most modules
(including mod_example) too. Once in a while you have to acknowledge that
committed code is crap, and accept patches, albeit kludges, if they improve
the situation. Otherwise you might end up with code that keeps on rotting
away (mod_example is a good example, again).

I would have been most happy if this had been fixed ages ago so I hadn't
been forced to spend lots and lots of hours kludging stuff together. At
least, my kludges seem to have sparked some development in this area, so
they have served some purpose other than enabling a non-profit computer club
building an FTP/HTTP server that actually works.

/Nikke
--
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | [EMAIL PROTECTED]
 My favorite color? Red. No, BluAHHH!
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Thu, Oct 26, 2006 at 05:20:10PM +0200, Plüm, Rüdiger, VF EITO wrote:

> > Index: modules/cache/mod_disk_cache.c
> > ===================================================================
> > --- modules/cache/mod_disk_cache.c    (revision 450104)
> > +++ modules/cache/mod_disk_cache.c    (working copy)
> > @@ -998,12 +998,16 @@
> >          dobj->file_size = 0;
> >      }
> >
> > -    for (e = APR_BRIGADE_FIRST(bb);
> > -         e != APR_BRIGADE_SENTINEL(bb);
> > -         e = APR_BUCKET_NEXT(e))
> > -    {
> > +    e = APR_BRIGADE_FIRST(bb);
> > +    while (e != APR_BRIGADE_SENTINEL(bb) && !APR_BUCKET_IS_EOS(e)) {
> >          const char *str;
> >          apr_size_t length, written;
> > +
> > +        if (APR_BUCKET_IS_METADATA(e)) {
> > +            e = APR_BUCKET_NEXT(e);
> > +            continue;
> > +        }
> > +
>
> Why ignore the metadata buckets? This means that flush buckets do not get
> passed up the chain via the temporary brigade tmpbb. This is bad IMHO.
> I guess the best thing we can do here is to add them to tmpbb before doing
> the continue. Of course this means that all additions need to be done to
> the tail of tmpbb.

Yeah, I noticed this too. For FLUSH buckets, batching them up in tmpbb is no
good either; they need to be sent up the filter chain immediately. I changed
it like this (incremental patch with diff -wu to hide the re-indenting):

--- mod_disk_cache.c.prev	2006-10-26 16:39:17.000000000 +0100
+++ mod_disk_cache.c	2006-10-26 16:32:16.000000000 +0100
@@ -1003,11 +1003,8 @@
         const char *str;
         apr_size_t length, written;
 
-        if (APR_BUCKET_IS_METADATA(e)) {
-            e = APR_BUCKET_NEXT(e);
-            continue;
-        }
-
+        /* Store contents of non-metadata buckets to disk. */
+        if (!APR_BUCKET_IS_METADATA(e)) {
         rv = apr_bucket_read(e, &str, &length, APR_BLOCK_READ);
         if (rv != APR_SUCCESS) {
             ap_log_error(APLOG_MARK, APLOG_ERR, 0, r->server,
@@ -1036,21 +1033,23 @@
             file_cache_errorcleanup(dobj, r);
             return APR_EGENERAL;
         }
+        }
 
         next = APR_BUCKET_NEXT(e);
 
-        /* Move the bucket to the temp brigade and flush it up the
+        /* Move the bucket to the temp brigade and pass it down the
          * filter chain.
          */
         APR_BUCKET_REMOVE(e);
         APR_BRIGADE_INSERT_HEAD(tmpbb, e);
+
+        if (!r->connection->aborted) {
         rv = ap_pass_brigade(f_next, tmpbb);
         if (rv) {
+            r->connection->aborted = 1;
             ap_log_error(APLOG_MARK, APLOG_DEBUG, rv, r->server,
                          "disk_cache: failed to flush brigade for URL %s",
                          h->cache_obj->key);
-            /* Remove the intermediate cache file and return non-APR_SUCCESS */
-            file_cache_errorcleanup(dobj, r);
-            return rv;
+        }
         }
 
         /* Discard the bucket and move on. */
@@ -1059,19 +1058,29 @@
         e = next;
     }
 
-    /* Was this the final bucket? If yes, close the temp file and perform
-     * sanity checks.
-     */
-    if (e != APR_BRIGADE_SENTINEL(bb) && APR_BUCKET_IS_EOS(e)) {
-        if (r->connection->aborted || r->no_cache) {
+    if (r->no_cache) {
+        ap_log_error(APLOG_MARK, APLOG_INFO, 0, r->server,
+                     "disk_cache: Discarding body for URL %s, "
+                     "no-cache flag set", h->cache_obj->key);
+        /* Remove the intermediate cache file and return non-APR_SUCCESS */
+        file_cache_errorcleanup(dobj, r);
+        return APR_SUCCESS;
+    }
+
+    if (r->connection->aborted) {
         ap_log_error(APLOG_MARK, APLOG_INFO, 0, r->server,
                      "disk_cache: Discarding body for URL %s "
                      "because connection has been aborted.",
                      h->cache_obj->key);
         /* Remove the intermediate cache file and return non-APR_SUCCESS */
         file_cache_errorcleanup(dobj, r);
-        return APR_EGENERAL;
+        return APR_ECONNABORTED;
     }
+
+    /* Was this the final bucket? If yes, close the temp file and perform
+     * sanity checks.
+     */
+    if (e != APR_BRIGADE_SENTINEL(bb) && APR_BUCKET_IS_EOS(e)) {
         if (dobj->file_size < conf->minfs) {
             ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
                          "disk_cache: URL %s failed the size check "
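For readers trying to follow the control flow being debated here, this is a
self-contained toy model (plain C, no APR) of the loop shape in the
incremental patch: data buckets have their contents "written to the cache
file", while every bucket, metadata included, is still passed down the chain
one at a time via a temporary brigade. All names are illustrative and not
the real httpd/APR API.

```c
/* Toy model of the store_body() loop: a brigade is a singly linked list of
 * buckets; metadata buckets skip the disk write but are still passed
 * downstream. Illustrative only, not the real mod_disk_cache code. */
#include <assert.h>
#include <stddef.h>

enum bucket_type { DATA, METADATA };

struct bucket {
    enum bucket_type type;
    size_t len;               /* payload length for DATA buckets */
    struct bucket *next;
};

/* Walk the brigade: add data bucket lengths to *file_size (the "disk
 * write"), but pass every bucket - metadata included - down the chain.
 * Returns the number of buckets passed downstream. */
static int store_and_pass(struct bucket *head, size_t *file_size)
{
    int passed = 0;
    for (struct bucket *e = head; e != NULL; e = e->next) {
        if (e->type != METADATA) {
            *file_size += e->len;   /* "write bucket contents to cache file" */
        }
        passed++;                   /* "ap_pass_brigade(f_next, tmpbb)" */
    }
    return passed;
}
```

The point of the patch above is exactly this split: the disk write is
conditional on the bucket type, but the downstream pass is not, so FLUSH and
other metadata buckets are never silently swallowed by the cache.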
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
> -----Original Message-----
> From: Joe Orton
> Sent: Wednesday, 25 October 2006 17:59
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES
> docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c
> modules/cache/mod_cache.h
>
> Index: modules/cache/mod_disk_cache.c
> ===================================================================
> --- modules/cache/mod_disk_cache.c    (revision 450104)
> +++ modules/cache/mod_disk_cache.c    (working copy)
> @@ -998,12 +998,16 @@
>          dobj->file_size = 0;
>      }
>
> -    for (e = APR_BRIGADE_FIRST(bb);
> -         e != APR_BRIGADE_SENTINEL(bb);
> -         e = APR_BUCKET_NEXT(e))
> -    {
> +    e = APR_BRIGADE_FIRST(bb);
> +    while (e != APR_BRIGADE_SENTINEL(bb) && !APR_BUCKET_IS_EOS(e)) {
>          const char *str;
>          apr_size_t length, written;
> +
> +        if (APR_BUCKET_IS_METADATA(e)) {
> +            e = APR_BUCKET_NEXT(e);
> +            continue;
> +        }
> +

Why ignore the metadata buckets? This means that flush buckets do not get
passed up the chain via the temporary brigade tmpbb. This is bad IMHO. I
guess the best thing we can do here is to add them to tmpbb before doing the
continue. Of course this means that all additions need to be done to the
tail of tmpbb.

Regards

Rüdiger
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Thu, October 26, 2006 4:29 pm, Graham Leggett wrote:

> In that case the solution is to expose the logic inside
> ap_core_output_filter() that decides whether the socket will block, and
> make that generally available.

To be more specific, there would be a need to be able, given a
connection_rec, to ask the ap_core_output_filter the question: "are there
still bytes that were written but not yet confirmed, or that remain to be
written in your temporary set-aside brigade?"

Regards,
Graham
--
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Thu, October 26, 2006 2:44 pm, Joe Orton wrote:

> This is a non-starter. The cache cannot rely on any particular
> implementation detail of the trunk core output filter (which sounds like
> a bug), nor can it assume that filters between it and the core output
> filter will pass through FILE buckets intact - think about e.g. SSL.

In that case the solution is to expose the logic inside
ap_core_output_filter() that decides whether the socket will block, and make
that generally available.

I have a better solution for the split-large-brigades concept inspired by
your patch from yesterday - there is a placeholder for this in there.

Regards,
Graham
--
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Thu, Oct 26, 2006 at 11:02:40AM +0200, Graham Leggett wrote:

> On Thu, October 26, 2006 10:50 am, Joe Orton wrote:
>
>> I'm not sure how that is relevant. The core output filter writes to the
>> socket directly - it can use non-blocking writes or whatever it likes to
>> do that. The cache must write to the output filter chain. How do you
>> propose to do non-blocking writes up the output filter chain?
>
> The cache by design sits as close to the output filter as humanly
> possible. This means that the cache can pass file buckets to the output
> filter, and it does. If ap_core_output_filter() is asked to write a file
> bucket to the network, and this would block, it instead sets aside the
> file bucket in a temporary brigade for sending later, and returns
> immediately.

This is a non-starter. The cache cannot rely on any particular
implementation detail of the trunk core output filter (which sounds like a
bug), nor can it assume that filters between it and the core output filter
will pass through FILE buckets intact - think about e.g. SSL.

joe
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
> -----Original Message-----
> From: Davi Arnaut
> Sent: Thursday, 26 October 2006 13:47
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES
> docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c
> modules/cache/mod_cache.h
>
> Graham Leggett wrote:
>
>> On Wed, October 25, 2006 11:48 pm, Davi Arnaut wrote:
>>
>>> What Joe's patch does is remove this first implicitly created bucket
>>> from the brigade, placing it on a temporary brigade for sending it to
>>> the client.
>>
>> Ok, this makes more sense, but in its current form the temporary brigade
>> should not be necessary.
>
> But it would be better. We don't need to keep creating brigades with
> apr_brigade_split(), we could only move the buckets to a temporary

+1 on this. Creating brigades over and over again consumes memory that is
only freed once the request pool gets cleared. So if we create a lot of
brigades we have a temporary memory leak here. AFAIK the (data only?) memory
allocated by buckets is freed immediately when they get destroyed.

Regards

Rüdiger
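The pool-memory point Rüdiger makes can be illustrated with a tiny model
(plain C, no APR): allocations from a pool are only returned when the whole
pool is destroyed, so creating a fresh brigade on every loop iteration grows
the pool for the lifetime of the request, while reusing one temporary
brigade and emptying it keeps usage flat. The structs and sizes here are
made up for illustration; this is not the real APR pool allocator.

```c
/* Toy model of APR pool behaviour: pool_alloc() never frees until the
 * pool itself is destroyed, mirroring why per-iteration brigade creation
 * is a "temporary memory leak". Illustrative only. */
#include <assert.h>
#include <stddef.h>

struct pool { size_t allocated; };

static void pool_alloc(struct pool *p, size_t n)
{
    p->allocated += n;          /* only returned when the pool is destroyed */
}

/* One brigade created per iteration: pool usage grows with the loop. */
static size_t split_per_iteration(struct pool *p, int iters)
{
    for (int i = 0; i < iters; i++) {
        pool_alloc(p, 64);      /* "apr_brigade_create()" each time */
    }
    return p->allocated;
}

/* One brigade reused: a single pool allocation, emptied each pass. */
static size_t reuse_tmpbb(struct pool *p, int iters)
{
    pool_alloc(p, 64);          /* create tmpbb once */
    for (int i = 0; i < iters; i++) {
        /* "apr_brigade_cleanup(tmpbb)": drops buckets, no new pool alloc */
    }
    return p->allocated;
}
```

This is why the thread converges on a single reusable tmpbb plus a cleanup
call, rather than apr_brigade_split() in a loop.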
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Graham Leggett wrote:

> On Wed, October 25, 2006 11:48 pm, Davi Arnaut wrote:
>
>> No, we start with a file bucket on the brigade. I will try to explain
>> what happens to see if we are talking about the same thing.
>>
>> Suppose we have a brigade containing a file bucket, and the file size
>> is 4.7GB. We want to read it fully.
>>
>> When we call apr_bucket_read() on this bucket, we end up calling the
>> bucket read function (file_bucket_read). What does the file bucket read
>> do?
>>
>> If mmap is supported, it mmaps APR_MMAP_LIMIT bytes (4MB) by creating a
>> new mmap bucket and splits the bucket. So, after calling read on a file
>> bucket you have two buckets on the brigade. The first one is the mmap
>> bucket and the last is a file bucket with an updated offset.
>>
>> The same thing happens if mmap is not supported, but the bucket type
>> will be a heap bucket. If we don't delete or flush those implicitly
>> created buckets they will keep the whole file in memory, but one single
>> read will not put the entire file in memory.
>>
>> What Joe's patch does is remove this first implicitly created bucket
>> from the brigade, placing it on a temporary brigade for sending it to
>> the client.
>
> Ok, this makes more sense, but in its current form the temporary brigade
> should not be necessary.

But it would be better. We don't need to keep creating brigades with
apr_brigade_split(), we could only move the buckets to a temporary brigade.
Take a look at apr_brigade_split:

APU_DECLARE(apr_bucket_brigade *) apr_brigade_split(apr_bucket_brigade *b,
                                                    apr_bucket *e)
{
    apr_bucket_brigade *a;
    apr_bucket *f;

    a = apr_brigade_create(b->p, b->bucket_alloc);

    /* Return an empty brigade if there is nothing left in
     * the first brigade to split off
     */
    if (e != APR_BRIGADE_SENTINEL(b)) {
        f = APR_RING_LAST(&b->list);
        APR_RING_UNSPLICE(e, f, link);
        APR_RING_SPLICE_HEAD(&a->list, e, f, apr_bucket, link);
    }

    APR_BRIGADE_CHECK_CONSISTENCY(a);
    APR_BRIGADE_CHECK_CONSISTENCY(b);

    return a;
}

If we used a tmpbb we could replace the apr_brigade_split with something
similar to:

    if (e != APR_BRIGADE_SENTINEL(b)) {
        f = APR_RING_LAST(&b->list);
        APR_RING_UNSPLICE(e, f, link);
        APR_RING_SPLICE_HEAD(&a->list, e, f, apr_bucket, link);
    }

And instead of destroying the brigade, we could only do a call to:

    apr_brigade_cleanup(tmpbb);

> While disk cache goes through the brigade, it replaces buckets on the
> incoming brigade with a bucket referring to the cached file.
>
> I think a number of the thorny memory problems in mod_*cache have come
> about because the brigade is read twice - once by store_body, and once by
> ap_core_output_filter.

Maybe. Buckets are reference counted and get reused pretty quickly.

> The replacing of the buckets with file buckets in the brigade by disk
> cache means the brigade will only be read once - by save_body().

Right, I liked your approach. I just wanted to be sure everyone is on the
same page on this issue.

>> That's why splitting the brigade with magical values (16MB) is not such
>> a good idea, because the bucket type knows better and will split the
>> bucket anyway.
>
> Right, the picture is getting clearer.
>
> Look closer at the cache code in mod_cache, right at the end of the
> CACHE_SAVE filter.
>
> Notice how store_body() is called, followed by ap_pass_brigade(). Notice
> how store_body() is expected to handle the complete brigade, all 4.7GB of
> it, before ap_pass_brigade() has a look see.
> Now - create a 4.7GB file on your drive. Copy this file to a second
> filename. Time this and tell me how long it takes. Do you see the further
> problem?

Yup.

> The split-before-save_body patch wasn't described well enough by me - it
> is designed to prevent save_body() from having to handle bucket sizes that
> are impractically large to handle by a cache, both in terms of memory, and
> in terms of time.

That is clear.

--
Davi Arnaut
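Davi's description of file-bucket reads can be captured in a small
self-contained model (plain C, no APR): each "read" materialises at most one
chunk of the file in memory and leaves a file bucket for the remainder, so
memory usage depends entirely on whether the consumer deletes each chunk
after use. The names and the CHUNK constant are illustrative stand-ins for
apr_bucket_read() and APR_MMAP_LIMIT, not the real API.

```c
/* Toy model of chunked file-bucket consumption: with delete_after_use,
 * at most one chunk is resident at a time; without it, the chunks
 * accumulate until the whole file sits in memory. Illustrative only. */
#include <assert.h>
#include <stddef.h>

#define CHUNK ((size_t)4 * 1024 * 1024)   /* APR_MMAP_LIMIT-style read size */

/* Returns the peak number of bytes held in memory while consuming a
 * file of file_size bytes chunk by chunk. */
static size_t peak_resident(size_t file_size, int delete_after_use)
{
    size_t resident = 0, peak = 0, remaining = file_size;
    while (remaining > 0) {
        size_t n = remaining < CHUNK ? remaining : CHUNK;
        resident += n;            /* the read materialises n bytes */
        if (resident > peak) {
            peak = resident;
        }
        if (delete_after_use) {
            resident -= n;        /* delete the mmap/heap chunk bucket */
        }
        remaining -= n;
    }
    return peak;
}
```

This is the crux of the 4.7GB discussion: one read never pulls the whole
file in, but failing to delete the implicitly created buckets does.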
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Thu, October 26, 2006 10:50 am, Joe Orton wrote:

> I'm not sure how that is relevant. The core output filter writes to the
> socket directly - it can use non-blocking writes or whatever it likes to
> do that. The cache must write to the output filter chain. How do you
> propose to do non-blocking writes up the output filter chain?

The cache by design sits as close to the output filter as humanly possible.
This means that the cache can pass file buckets to the output filter, and it
does. If ap_core_output_filter() is asked to write a file bucket to the
network, and this would block, it instead sets aside the file bucket in a
temporary brigade for sending later, and returns immediately.

If this behaviour cannot be relied upon - in other words, if it is found
that there are cases where ap_core_output_filter() does block despite being
handed file buckets - then the solution is to expose a function that, given
a request_rec, will tell you whether the socket will block or not.

I have a simpler solution for the bucket split in mod_cache that I am trying
out as well, that will give us more control over the simultaneous
write-to-file and write-to-network problem.

Regards,
Graham
--
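The set-aside behaviour Graham describes can be sketched as a toy model
(plain C, no APR): a non-blocking writer sends as many bytes as the socket
will accept, buffers the remainder for later write completion, and returns
immediately instead of blocking. The struct and function names are
hypothetical; this is not the actual ap_core_output_filter() code, and the
pending-bytes query is the kind of API proposed elsewhere in the thread, not
an existing one.

```c
/* Toy model of non-blocking write with set-aside: bytes beyond the
 * socket's current capacity go to a buffered count ("ctx->buffered_bb")
 * instead of blocking the caller. Illustrative only. */
#include <assert.h>
#include <stddef.h>

struct conn {
    size_t writable;   /* bytes the socket will take without blocking */
    size_t sent;       /* bytes actually written so far */
    size_t buffered;   /* bytes set aside for later write completion */
};

/* Non-blocking pass: send what fits, set the rest aside, never block. */
static void pass_nonblocking(struct conn *c, size_t nbytes)
{
    size_t n = nbytes < c->writable ? nbytes : c->writable;
    c->sent += n;
    c->writable -= n;
    c->buffered += nbytes - n;   /* saved for an async completion pass */
}

/* Hypothetical query: are there still bytes set aside, not yet on the
 * wire? (The sort of function this thread proposes exposing.) */
static int has_pending(const struct conn *c)
{
    return c->buffered > 0;
}
```

A caller that checks has_pending() before handing over more data is, in
miniature, the interaction the cache would like to have with the core output
filter.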
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Wed, Oct 25, 2006 at 10:21:26PM +0200, Graham Leggett wrote:

> Joe Orton wrote:
>
>> There is no other acceptable solution AFAICS. Buffering the entire
>> brigade (either to disk, or into RAM as the current code does) before
>> writing to the client is not OK, polling on buckets is not possible,
>> using threads is not OK, using non-blocking writes up the output filter
>> chain is not possible. Any other ideas?
>
> I managed to solve this problem last night.
>
> Took a while and a lot of digging to figure it out, but in the end it is
> relatively simple.
>
> The ap_core_output_filter helps us out:

I'm not sure how that is relevant. The core output filter writes to the
socket directly - it can use non-blocking writes or whatever it likes to do
that. The cache must write to the output filter chain. How do you propose to
do non-blocking writes up the output filter chain?

joe
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Thu, October 26, 2006 10:28 am, Plüm, Rüdiger, VF EITO wrote:
> AFAIK this is only true on trunk due to Brian's async write patches there.
> They have not been backported to 2.2.x and I think they are unlikely to
> get backported. Thus this solution for mod_disk_cache only solves
> the problem on trunk. This is not necessarily bad, but I guess people
> should be aware of it.

I am aware of it, thus only bugfixes are being proposed for backport to v2.2.

Regards, Graham --
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
> -----Original Message-----
> From: Graham Leggett
> Sent: Wednesday, 25 October 2006 22:21
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r467655 - in /httpd/httpd/trunk:
> CHANGES docs/manual/mod/mod_cache.xml
> modules/cache/mod_cache.c modules/cache/mod_cache.h
>
> I managed to solve this problem last night.
>
> Took a while and a lot of digging to figure it out, but in the end it is
> relatively simple.
>
> The ap_core_output_filter helps us out:
>
> /* Scan through the brigade and decide whether to attempt a write,
>  * based on the following rules:
>  *
>  * 1) The new_bb is null: do a nonblocking write of as much data
>  *    as possible, then save the rest in ctx->buffered_bb. (If
>  *    new_bb == NULL, it probably means that the MPM is doing
>  *    asynchronous write completion and has just determined that
>  *    this connection is writable.)
>  *
> [snip]

AFAIK this is only true on trunk due to Brian's async write patches there. They have not been backported to 2.2.x and I think they are unlikely to get backported. Thus this solution for mod_disk_cache only solves the problem on trunk. This is not necessarily bad, but I guess people should be aware of it.

Regards
Rüdiger
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Wed, October 25, 2006 11:48 pm, Davi Arnaut wrote:
> No, we start with a file bucket on the brigade, I will try to explain
> what happens to see if we are talking about the same thing.
>
> Suppose we have a brigade containing a file bucket, and the file size
> is 4.7GB. We want to read it fully.
>
> When we call apr_bucket_read() on this bucket, we end up calling the
> bucket read function (file_bucket_read). What does the bucket file read
> do?
>
> If mmap is supported, it mmaps APR_MMAP_LIMIT bytes (4MB) by creating a
> new mmap bucket and splits the bucket. So, after calling read on a file
> bucket you have two buckets on the brigade. The first one is the mmap
> bucket and the last is a file bucket with an updated offset.
>
> The same thing happens if mmap is not supported, but the bucket type
> will be a heap bucket. If we don't delete or flush those implicitly
> created buckets they will keep the whole file in memory, but one single
> read will not put the entire file in memory.
>
> What Joe's patch does is remove this first implicitly created bucket
> from the brigade, placing it on a temporary brigade for sending it to
> the client.

Ok, this makes more sense, but in its current form the temporary brigade should not be necessary. While disk cache goes through the brigade, it replaces buckets on the incoming brigade with a bucket referring to the cached file.

I think a number of the thorny memory problems in mod_*cache have come about because the brigade is read twice - once by store_body, and once by ap_core_output_filter. The replacing of the buckets with file buckets in the brigade by disk cache means the brigade will only be read once - by store_body().

> That's why splitting the brigade with magical values (16MB) is not such
> a good idea, because the bucket type knows better and will split the
> bucket anyway.

Right, the picture is getting clearer.

Look closer at the cache code in mod_cache, right at the end of the CACHE_SAVE filter.
Notice how store_body() is called, followed by ap_pass_brigade(). Notice how store_body() is expected to handle the complete brigade, all 4.7GB of it, before ap_pass_brigade() has a look see.

Now - create a 4.7GB file on your drive. Copy this file to a second filename. Time this and tell me how long it takes. Do you see the further problem?

The split-before-store_body patch wasn't described well enough by me - it is designed to prevent store_body() from having to handle bucket sizes that are impractically large for a cache to handle, both in terms of memory, and in terms of time.

Regards, Graham --
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Davi Arnaut wrote: > Graham Leggett wrote: >> Davi Arnaut wrote: >> >>> Yes, first because size_t is 32 bits :). When you do a read like this on >>> a file bucket, the whole bucket is not read into memory. The read >>> function splits the bucket and changes the current bucket to refer to >>> data that was read. >> 32 bits is 4GB. A large number of webservers don't have memory that >> size, thus the problem. >> >>> The problem lies that those new buckets keep accumulating in the >>> brigade! See my patch again. >> Where? > > I was referring to "it will attempt to read all 4.7GB". Such a thing > does not exist! > >> We start with one 4.7GB bucket in a brigade. > > No, we start with a file bucket on the brigade, I will try to explain > what happens to see if we are talking about the same thing. > > Suppose we have a brigade containing a a file bucket, and the file size > is 4.7GB. We want to read it fully. > > When we call apr_bucket_read() on this bucket, we end-up calling the > bucket read function (file_bucket_read). What does the bucket file read do ? > > If mmap is supported, it mmaps APR_MMAP_LIMIT bytes (4MB) by creating a > new mmap bucket and splits the bucket. So, after calling read on a file > bucket you have two buckets on the brigade. The first one is the mmap > bucket and last is file bucket with a updated offset. s/last/last one is the old/ > The same thing happens if mmap is not supported, but the bucket type > will be a heap bucket. If we don't delete or flush those implicitly > created buckets they will keep the whole file in memory, but one single > read will not put the entire file on memory. s/keep the whole file in memory/will accumulate on the brigade, implying that the whole file is in memory/ > What Joe's patch does is remove this first implicitly created bucket > from the brigade, placing it on the brigade on a temporary brigade for > sending it to the client. 
s/on the brigade// > That's why splitting the brigade with magical values (16MB) is not such > a good idea, because the bucket type knows betters and will split the > bucket anyway. -- Davi Arnaut
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Graham Leggett wrote:
> Davi Arnaut wrote:
>
>> Yes, first because size_t is 32 bits :). When you do a read like this on
>> a file bucket, the whole bucket is not read into memory. The read
>> function splits the bucket and changes the current bucket to refer to
>> data that was read.
>
> 32 bits is 4GB. A large number of webservers don't have memory that
> size, thus the problem.
>
>> The problem lies that those new buckets keep accumulating in the
>> brigade! See my patch again.
>
> Where?

I was referring to "it will attempt to read all 4.7GB". Such a thing does not exist!

> We start with one 4.7GB bucket in a brigade.

No, we start with a file bucket on the brigade, I will try to explain what happens to see if we are talking about the same thing.

Suppose we have a brigade containing a file bucket, and the file size is 4.7GB. We want to read it fully.

When we call apr_bucket_read() on this bucket, we end up calling the bucket read function (file_bucket_read). What does the bucket file read do?

If mmap is supported, it mmaps APR_MMAP_LIMIT bytes (4MB) by creating a new mmap bucket and splits the bucket. So, after calling read on a file bucket you have two buckets on the brigade. The first one is the mmap bucket and last is file bucket with an updated offset.

The same thing happens if mmap is not supported, but the bucket type will be a heap bucket. If we don't delete or flush those implicitly created buckets they will keep the whole file in memory, but one single read will not put the entire file in memory.

What Joe's patch does is remove this first implicitly created bucket from the brigade, placing it on the brigade on a temporary brigade for sending it to the client.

That's why splitting the brigade with magical values (16MB) is not such a good idea, because the bucket type knows better and will split the bucket anyway.

-- Davi Arnaut
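The read-morph-consume pattern Davi describes can be sketched roughly as follows. This is an APR-style pseudocode sketch of the loop in Joe's patch as described in the thread, not code from any actual httpd revision; `dobj`, `f_next` and `tmpbb` are placeholder names, and error handling is omitted:

```
/* apr_bucket_read() morphs the file bucket: a small mmap/heap bucket
 * holding the data just read appears at the head of the brigade, and
 * the rest of the file stays behind in a new file bucket. Moving the
 * consumed bucket to a temporary brigade, passing it up the chain and
 * cleaning up means buckets never accumulate. */
while (!APR_BRIGADE_EMPTY(bb)) {
    apr_bucket *e = APR_BRIGADE_FIRST(bb);
    const char *str;
    apr_size_t length;

    /* Morphs e; e now refers only to the chunk that was read. */
    apr_bucket_read(e, &str, &length, APR_BLOCK_READ);

    /* Append the chunk to the cache file. */
    apr_file_write_full(dobj->fd, str, length, NULL);

    /* Hand the consumed bucket to the client, then destroy it via
     * the temporary brigade's cleanup. */
    APR_BUCKET_REMOVE(e);
    APR_BRIGADE_INSERT_TAIL(tmpbb, e);
    ap_pass_brigade(f_next, tmpbb);
    apr_brigade_cleanup(tmpbb);
}
```

Whether the morphed bucket is mmap or heap (and its size, APR_MMAP_LIMIT or the heap read size) is decided inside the bucket type, which is Davi's point about the 16MB split being redundant for plain file buckets.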
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
William A. Rowe, Jr. wrote:
> Paul's right. The next expectation we break is that a protocol spends its
> life on a single thread. That is really a 3.0 magnitude change because it
> will break modules in a major way.
>
> Then, the final expectation to break is that a -request- spends its life
> on a given thread. That's earth shattering to developers, and really is
> beyond the 3.0 release (if we want to see 3.0 adopted in the next few
> years).

Please test out trunk for me. I believe I have found a way to solve this problem without the need for threads or forking.

Problems at this stage I believe are bugs for which practical solutions can be found, but I need more eyes on it.

Regards, Graham --
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Davi Arnaut wrote:
> Yes, first because size_t is 32 bits :). When you do a read like this on
> a file bucket, the whole bucket is not read into memory. The read
> function splits the bucket and changes the current bucket to refer to
> data that was read.

32 bits is 4GB. A large number of webservers don't have memory that size, thus the problem.

> The problem lies that those new buckets keep accumulating in the
> brigade! See my patch again.

Where?

We start with one 4.7GB bucket in a brigade. The bucket is split into 16MB + rest of bucket, then the brigade is split across the split bucket. We now have two brigades, one with a 16MB bucket, the other with a 4.68GB bucket.

The resulting left hand brigade, consisting of one 16MB file bucket, is written to the cache, then written to the network, then deleted, so we're back to one brigade, containing one 4.68GB bucket. Rinse, repeat.

The only place buckets can accumulate is in ap_core_output_filter(), but that's fine - 293 buckets all pointing at the same file backend is manageable.

Regards, Graham --
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Joe Orton wrote:
> There is no other acceptable solution AFAICS. Buffering the entire
> brigade (either to disk, or into RAM as the current code does) before
> writing to the client is not OK, polling on buckets is not possible,
> using threads is not OK, using non-blocking writes up the output filter
> chain is not possible. Any other ideas?

I managed to solve this problem last night.

Took a while and a lot of digging to figure it out, but in the end it is relatively simple.

The ap_core_output_filter helps us out:

/* Scan through the brigade and decide whether to attempt a write,
 * based on the following rules:
 *
 * 1) The new_bb is null: do a nonblocking write of as much data
 *    as possible, then save the rest in ctx->buffered_bb. (If
 *    new_bb == NULL, it probably means that the MPM is doing
 *    asynchronous write completion and has just determined that
 *    this connection is writable.)
 *
[snip]

Brigades handed to the output filter are written to the network with a non-blocking write. Any parts of the brigades which cannot be written without blocking are set aside to be sent the next time the filter is invoked with more data.

There is a catch - the output filter will only set aside a certain number of non-file buckets before it enforces a blocking write to clear the backlog and keep memory usage down. The solution to this catch is to ensure that you always write file buckets to the network. This way, the output filter will never block [1].

Enter mod_disk_cache. One of the last things mod_disk_cache does after saving the body is to replace whatever buckets were just written, regardless of bucket type [2], in the brigade with a file bucket pointing at the cached file and containing the exact same data.
This behaviour has two side effects. First, responses no longer hang around in RAM waiting to be sent to a slow client; these responses can now sit on disk [3], and this potentially improves performance on "expensive" processes like CGI, which can go away immediately and not hang around waiting for slow clients. The second side effect is that the bucket handed to the output filter is a file bucket - and therefore can be set aside and handled with non-blocking writes by the output filter.

Now, enter mod_cache. None of the above would mean anything if the file buckets being sent consisted of a single 4.7GB bucket. In this case, store_body() would only finish after 4.7GB was written to disk, and the network write would only start after the first complete invocation of store_body(), and by that point the browser has got bored and is long gone. Oops. What will we do?

But mod_cache no longer passes 4.7GB file buckets to the providers; it now splits them up into buckets of a maximum size defaulting to 16MB. So 16MB at a time gets written to the cache, then written to the network without blocking, then written to the cache, and so on. Suddenly the write-to-cache, then write-to-network problem is gone, and without threads, and without fork.

Run a wget on a 250MB file. Watch it being downloaded and cached at the same time; the size of the file in the cache tracks the size of the file downloaded reported by wget. Run a second wget on the same file moments later. Watch that wget quickly read the file from the cache up to where the first wget is running, and then watch it track the first wget's progress from that point on. Run cmp on the original file, the downloaded files, and the cache body: all the same. Works like a charm.

The work is not finished. There are alternate use cases that need to be checked. Some alternate use cases are not practical to handle, and we must make decisions on these.
This code however is based on code running in production right now, so bugs should be reasonably clear and straightforward. I need some help on the behaviour of the brigades, especially with the cleanup of the brigades so they don't hang around for the entire request unnecessarily. I also need help solving some of the less savory solutions that people are not happy with, like fstat/sleep. Please don't mail me any more about copy_body(). This function is no longer necessary and will be removed next. Hopefully the above will explain why copy_body() was attempted in the first place, as flawed as it was. It was committed as is because it was a prerequisite of Niklas' second patch, which is a critical component of the above. It was better to commit then change, rather than never commit the first or second patches, and never get anywhere. [1] testing shows this is the case. More testing is needed to make sure this is true in all cases. [2] I need to teach mod_disk_cache to handle metadata buckets more intelligently. [3] Again, I need some help making sure brigades are cleared when they should be, and there are no leaks in mod_disk_cache. Regards, G
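The bucket-replacement step Graham describes - swap whatever was just written for a file bucket pointing at the cached copy - looks roughly like this. An APR-style pseudocode sketch, not taken from the actual mod_disk_cache commit; the `dobj` field names, `offset` and `length` are illustrative:

```
/* After length bytes of the response have been appended to the cache
 * file, drop the buckets that carried that data and put a file bucket
 * for the same byte range in their place. The core output filter then
 * only ever sees file buckets, which it can set aside and write
 * non-blocking. */
apr_brigade_cleanup(bb);  /* discard the heap/mmap buckets just saved */
e = apr_bucket_file_create(dobj->fd, offset, length,
                           r->pool, bb->bucket_alloc);
APR_BRIGADE_INSERT_TAIL(bb, e);
/* Metadata buckets (EOS, FLUSH) would need to be preserved rather
 * than dropped - point [2] above. */
```

The design choice here is that the cached file doubles as the send buffer: the data lands on disk exactly once, and the network side reads it back through the file bucket at its own pace.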
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
William A. Rowe, Jr. wrote:
> Paul Querna wrote:
>> That's not going to be possible until 3.0. Good luck :)
>
> Paul's right. The next expectation we break is that a protocol spends its
> life on a single thread. That is really a 3.0 magnitude change because it
> will break modules in a major way.
>
> Then, the final expectation to break is that a -request- spends its life
> on a given thread. That's earth shattering to developers, and really is
> beyond the 3.0 release (if we want to see 3.0 adopted in the next few
> years).

Ok. I will come back in a few years :) j/k

How about moving the proxy/cache parts to use their own core filters/architecture? I obviously lack experience, but in my limited understanding the proxy/cache code is being restrained because we can't improve core features that may affect the HTTP serving parts. In a proxy/cache situation no one cares about php/python, cgi, or whatever filters.

Can't we change the rules for apache (core and modules) operating as a proxy/cache more easily? I don't know all possible implications of such a move, but it may work.

-- Davi Arnaut
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Paul Querna wrote:
> That's not going to be possible until 3.0. Good luck :)

Paul's right. The next expectation we break is that a protocol spends its life on a single thread. That is really a 3.0 magnitude change because it will break modules in a major way.

Then, the final expectation to break is that a -request- spends its life on a given thread. That's earth shattering to developers, and really is beyond the 3.0 release (if we want to see 3.0 adopted in the next few years).
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Davi Arnaut wrote:
> Joe Orton wrote:
>>> and we must not slow the caching because of a slow client (that's why
>>> I didn't pass the brigade).
>>
>> There is no other acceptable solution AFAICS. Buffering the entire
>> brigade (either to disk, or into RAM as the current code does) before
>> writing to the client is not OK, polling on buckets is not possible,
>> using threads is not OK, using non-blocking writes up the output filter
>> chain is not possible. Any other ideas?
>
> I think we should build the necessary infrastructure to solve this
> problem once and for all. Be it either non-blocking writes, AIO or add
> support to the core output filter to write data to different
> destinations.

That's not going to be possible until 3.0. Good luck :)

-Paul
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Joe Orton wrote: > On Wed, Oct 25, 2006 at 01:54:04PM -0300, Davi Arnaut wrote: >> Joe Orton wrote: >>> Another couple of hundred lines of code and even a new config directive, >>> and this still doesn't get close to actually fixing the problem! -1 >>> already, this code is just not getting better. mod_disk_cache is still >>> liable to eat all your RAM in that apr_bucket_read() loop, the >>> apr_bucket_split() is not guaranteed to work for a morphing bucket type. >>> >>> It is simple enough to fix this problem without adding all this code and >>> without all the stuff in r450105 too, something like the below. >>> >> And you almost got it right. We don't want to stop caching if the >> client's connection fail > > ...a simple enough change. Yes. >> and we must not slow the caching because of a slow client (that's why >> I didn't pass the brigade). > > There is no other acceptable solution AFAICS. Buffering the entire > brigade (either to disk, or into RAM as the current code does) before > writing to the client is not OK, polling on buckets is not possible, > using threads is not OK, using non-blocking writes up the output filter > chain is not possible. Any other ideas? I think we should build the necessary infrastructure to solve this problem once and for all. Be it either non-blocking writes, AIO or add support to the core output filter to write data to different destinations. -- Davi Arnaut
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Graham Leggett wrote: > On Wed, October 25, 2006 5:58 pm, Joe Orton wrote: > >> Another couple of hundred lines of code and even a new config directive, >> and this still doesn't get close to actually fixing the problem! -1 >> already, this code is just not getting better. > > As has been explained a few times before, there isn't "a problem" or "the > problem", but rather many problems. > > These problems, are only practically solveable with many patches, each one > addressing a specific problem or behavior. Well said. > The patch just committed removes a burden from the cache providers, in > that from a single source, there is control over the size of buckets that > cache providers are expected to handle. The alternative was one directive > per cache provider, which is not ideal. > > The patch just committed is part of an ongoing work to improve the > behaviour of the cache in the real world, to solve real problems at real > sites. > > Progress to date: > > - The cache can now cache a file, and serve from the cache at the same time. > > - While caching a file, data is sent both to the cache, and to the end > client, at the same time. The cache no longer caches the entire file > before sending it to the client. > > - It is possible to cache a file at full disk speed, even when the > downstream client is slower than the disk, without buffering data in RAM. > > Next step is to remove the copy_body() hack inside mod_disk_cache, as it > is unnecessary. I would (I know I can't :) vote for reverting that last patches (up to r450089) so that we can start all over. > To say "it's not getting better" without actually running the code and > seeing the progress involved is very hollow criticism. > > I appreciate the effort involved in pointing out areas where issues need > to be addressed, but having to contend with the constant barrage of > negativity, and the ridiculous notion that "the problem cannot be solved", > is a really tiring exercise. 
+1

>> mod_disk_cache is still
>> liable to eat all your RAM in that apr_bucket_read() loop,
>
> Correct, and this will be fixed.
>
> We ideally want file writes to happen using a file write output filter,
> which can then encapsulate any bucket specific weirdness exactly like
> ap_core_output_filter does.
>
>> the
>> apr_bucket_split() is not guaranteed to work for a morphing bucket type.
>
> Then morphing buckets are broken.
>
> The split method must either succeed, or return a non-success APR value.
> Both of these cases are handled by the split loop. If morphing buckets
> crash, then they must be fixed.
>
>> It is simple enough to fix this problem without adding all this code and
>> without all the stuff in r450105 too, something like the below.
>
> I still see that this call is intact:
>
> apr_bucket_read(e, &str, &length, APR_BLOCK_READ)
>
> Given a 4.7GB bucket, it will attempt to read all 4.7GB of the bucket
> into RAM, and abort.
>
> Is there something I am missing?

Yes, first because size_t is 32 bits :). When you do a read like this on a file bucket, the whole bucket is not read into memory. The read function splits the bucket and changes the current bucket to refer to data that was read.

The problem is that those new buckets keep accumulating in the brigade! See my patch again.

So Joe's patch removes this newly implicitly created bucket from the brigade, passes it to the client, and deletes it.

-- Davi Arnaut
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Wed, Oct 25, 2006 at 01:54:04PM -0300, Davi Arnaut wrote: > Joe Orton wrote: > > Another couple of hundred lines of code and even a new config directive, > > and this still doesn't get close to actually fixing the problem! -1 > > already, this code is just not getting better. mod_disk_cache is still > > liable to eat all your RAM in that apr_bucket_read() loop, the > > apr_bucket_split() is not guaranteed to work for a morphing bucket type. > > > > It is simple enough to fix this problem without adding all this code and > > without all the stuff in r450105 too, something like the below. > > > > And you almost got it right. We don't want to stop caching if the > client's connection fail ...a simple enough change. > and we must not slow the caching because of a slow client (that's why > I didn't pass the brigade). There is no other acceptable solution AFAICS. Buffering the entire brigade (either to disk, or into RAM as the current code does) before writing to the client is not OK, polling on buckets is not possible, using threads is not OK, using non-blocking writes up the output filter chain is not possible. Any other ideas? Regards, joe
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Wed, October 25, 2006 5:58 pm, Joe Orton wrote:
> Another couple of hundred lines of code and even a new config directive,
> and this still doesn't get close to actually fixing the problem! -1
> already, this code is just not getting better.

As has been explained a few times before, there isn't "a problem" or "the problem", but rather many problems.

These problems are only practically solvable with many patches, each one addressing a specific problem or behavior.

The patch just committed removes a burden from the cache providers, in that from a single source, there is control over the size of buckets that cache providers are expected to handle. The alternative was one directive per cache provider, which is not ideal.

The patch just committed is part of an ongoing work to improve the behaviour of the cache in the real world, to solve real problems at real sites.

Progress to date:

- The cache can now cache a file, and serve from the cache at the same time.

- While caching a file, data is sent both to the cache, and to the end client, at the same time. The cache no longer caches the entire file before sending it to the client.

- It is possible to cache a file at full disk speed, even when the downstream client is slower than the disk, without buffering data in RAM.

Next step is to remove the copy_body() hack inside mod_disk_cache, as it is unnecessary.

To say "it's not getting better" without actually running the code and seeing the progress involved is very hollow criticism.

I appreciate the effort involved in pointing out areas where issues need to be addressed, but having to contend with the constant barrage of negativity, and the ridiculous notion that "the problem cannot be solved", is a really tiring exercise.

> mod_disk_cache is still
> liable to eat all your RAM in that apr_bucket_read() loop,

Correct, and this will be fixed.
We ideally want file writes to happen using a file write output filter, which can then encapsulate any bucket specific weirdness exactly like ap_core_output_filter does.

> the
> apr_bucket_split() is not guaranteed to work for a morphing bucket type.

Then morphing buckets are broken.

The split method must either succeed, or return a non-success APR value. Both of these cases are handled by the split loop. If morphing buckets crash, then they must be fixed.

> It is simple enough to fix this problem without adding all this code and
> without all the stuff in r450105 too, something like the below.

I still see that this call is intact:

apr_bucket_read(e, &str, &length, APR_BLOCK_READ)

Given a 4.7GB bucket, it will attempt to read all 4.7GB of the bucket into RAM, and abort.

Is there something I am missing?

Regards, Graham --
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Joe Orton wrote:
> On Wed, Oct 25, 2006 at 01:44:48PM -, Graham Leggett wrote:
>> Author: minfrin
>> Date: Wed Oct 25 06:44:47 2006
>> New Revision: 467655
>>
>> URL: http://svn.apache.org/viewvc?view=rev&rev=467655
>> Log:
>> mod_cache: Fix an out of memory condition that occurs when the
>> cache tries to save huge files (greater than RAM). Buckets bigger
>> than a tuneable threshold are split into smaller buckets before
>> being passed to mod_disk_cache, etc. PR 39380
>
> Another couple of hundred lines of code and even a new config directive,
> and this still doesn't get close to actually fixing the problem! -1
> already, this code is just not getting better. mod_disk_cache is still
> liable to eat all your RAM in that apr_bucket_read() loop, the
> apr_bucket_split() is not guaranteed to work for a morphing bucket type.
>
> It is simple enough to fix this problem without adding all this code and
> without all the stuff in r450105 too, something like the below.

And you almost got it right. We don't want to stop caching if the client's connection fails, and we must not slow the caching because of a slow client (that's why I didn't pass the brigade).

In the end, yours does almost the same as mine [1] except that I didn't pass the new buckets up the chain before deleting them (and it was file bucket exclusive). Copying to disk tends to be faster than sending to a client.

-- Davi Arnaut

1 http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=115971884005419&w=2
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Wed, Oct 25, 2006 at 01:44:48PM -, Graham Leggett wrote:
> Author: minfrin
> Date: Wed Oct 25 06:44:47 2006
> New Revision: 467655
>
> URL: http://svn.apache.org/viewvc?view=rev&rev=467655
> Log:
> mod_cache: Fix an out of memory condition that occurs when the
> cache tries to save huge files (greater than RAM). Buckets bigger
> than a tuneable threshold are split into smaller buckets before
> being passed to mod_disk_cache, etc. PR 39380

Another couple of hundred lines of code and even a new config directive, and this still doesn't get close to actually fixing the problem! -1 already, this code is just not getting better. mod_disk_cache is still liable to eat all your RAM in that apr_bucket_read() loop, the apr_bucket_split() is not guaranteed to work for a morphing bucket type.

It is simple enough to fix this problem without adding all this code and without all the stuff in r450105 too, something like the below.

Index: modules/cache/mod_cache.c
===================================================================
--- modules/cache/mod_cache.c	(revision 450104)
+++ modules/cache/mod_cache.c	(working copy)
@@ -342,6 +342,13 @@
         ap_set_module_config(r->request_config, &cache_module, cache);
     }
 
+    if (!cache->tmpbb) {
+        cache->tmpbb = apr_brigade_create(r->pool, r->connection->bucket_alloc);
+    }
+    else {
+        apr_brigade_cleanup(cache->tmpbb);
+    }
+
     reason = NULL;
     p = r->pool;
     /*
@@ -364,7 +371,8 @@
         /* pass the brigades into the cache, then pass them
          * up the filter stack
          */
-        rv = cache->provider->store_body(cache->handle, r, in);
+        rv = cache->provider->store_body(cache->handle, r, in,
+                                         f->next, cache->tmpbb);
         if (rv != APR_SUCCESS) {
             ap_log_error(APLOG_MARK, APLOG_DEBUG, rv, r->server,
                          "cache: Cache provider's store_body failed!");
@@ -827,7 +835,7 @@
         return ap_pass_brigade(f->next, in);
     }
 
-    rv = cache->provider->store_body(cache->handle, r, in);
+    rv = cache->provider->store_body(cache->handle, r, in, f->next, cache->tmpbb);
     if (rv != APR_SUCCESS) {
         ap_log_error(APLOG_MARK, APLOG_DEBUG, rv, r->server,
                      "cache: store_body failed");

Index: modules/cache/mod_cache.h
===================================================================
--- modules/cache/mod_cache.h	(revision 450104)
+++ modules/cache/mod_cache.h	(working copy)
@@ -210,7 +210,13 @@
 typedef struct {
     int (*remove_entity) (cache_handle_t *h);
     apr_status_t (*store_headers)(cache_handle_t *h, request_rec *r, cache_info *i);
-    apr_status_t (*store_body)(cache_handle_t *h, request_rec *r, apr_bucket_brigade *b);
+    /* The store_body callback writes the contents of the bucket
+     * brigade to the cache; if necessary any buckets may be flushed
+     * up the filter chain by moving them to tmpbb and passing that
+     * brigade to the f_next filter. */
+    apr_status_t (*store_body)(cache_handle_t *h, request_rec *r,
+                               apr_bucket_brigade *b,
+                               ap_filter_t *f_next, apr_bucket_brigade *tmpbb);
     apr_status_t (*recall_headers) (cache_handle_t *h, request_rec *r);
     apr_status_t (*recall_body) (cache_handle_t *h, apr_pool_t *p, apr_bucket_brigade *bb);
     int (*create_entity) (cache_handle_t *h, request_rec *r,
@@ -246,6 +252,7 @@
     apr_time_t lastmod;             /* last-modified time */
     cache_info *info;               /* current cache info */
     ap_filter_t *remove_url_filter; /* Enable us to remove the filter */
+    apr_bucket_brigade *tmpbb;      /* Temporary bucket brigade. */
 } cache_request_rec;

Index: modules/cache/mod_disk_cache.c
===================================================================
--- modules/cache/mod_disk_cache.c	(revision 450104)
+++ modules/cache/mod_disk_cache.c	(working copy)
@@ -56,7 +56,6 @@
 /* Forward declarations */
 static int remove_entity(cache_handle_t *h);
 static apr_status_t store_headers(cache_handle_t *h, request_rec *r, cache_info *i);
-static apr_status_t store_body(cache_handle_t *h, request_rec *r, apr_bucket_brigade *b);
 static apr_status_t recall_headers(cache_handle_t *h, request_rec *r);
 static apr_status_t recall_body(cache_handle_t *h, apr_pool_t *p, apr_bucket_brigade *bb);
 static apr_status_t read_array(request_rec *r, apr_array_header_t* arr,
@@ -977,9 +976,10 @@
 }
 
 static apr_status_t store_body(cache_handle_t *h, request_rec *r,
-                               apr_bucket_brigade *bb)
+                               apr_bucket_brigade *bb,
+                               ap_filter_t *f_next, apr_bucket_brigade *tmpbb)
 {
-    apr_bucket *e;
+    apr_bucket *e, *next;
     apr_status_t rv;
     disk_cache_object_t *dobj = (disk_cache_object_t *) h->cache_obj->vobj;
     disk_cache_conf *conf = ap_ge
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Wed, October 25, 2006 5:00 pm, Davi Arnaut wrote:
> Huh ? The loop will break and the whole brigade will be sent at once a
> few lines below. Why would we want to keep splitting the brigade ?
>
> There is no reason to keep breaking the buckets if we won't store them,
> or am I missing something ?

This is true. I've just made it break out if store_body returns anything
other than APR_SUCCESS.

Regards,
Graham

--
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
Graham Leggett wrote:
> On Wed, October 25, 2006 4:14 pm, Davi Arnaut wrote:
>
>> && (rv == APR_SUCCESS && rv2 == APR_SUCCESS) {
>>
>> Otherwise we may keep walking through the brigade unnecessarily.
>
> This is intentional - we don't want a failure on store_body() to cause a
> premature abort on the network write.

Huh? The loop will break and the whole brigade will be sent at once a
few lines below. Why would we want to keep splitting the brigade?

There is no reason to keep breaking the buckets if we won't store them,
or am I missing something?

--
Davi Arnaut
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
On Wed, October 25, 2006 4:14 pm, Davi Arnaut wrote:
> && (rv == APR_SUCCESS && rv2 == APR_SUCCESS) {
>
> Otherwise we may keep walking through the brigade unnecessarily.

This is intentional - we don't want a failure on store_body() to cause a
premature abort on the network write.

The inverse - not wanting a failure on the network write to affect
store_body() - also needs to be fleshed out, and needs to be
configurable, but that's a job for another patch.

Regards,
Graham

--
Re: svn commit: r467655 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_cache.xml modules/cache/mod_cache.c modules/cache/mod_cache.h
..
> +static int do_store_body(cache_request_rec *cache,
> +                         ap_filter_t *f,
> +                         apr_bucket_brigade *in) {
> +    apr_bucket *e;
> +    apr_bucket_brigade *bb;
> +    apr_status_t rv, rv2;
> +    cache_server_conf *conf;
> +
> +    conf = (cache_server_conf *) ap_get_module_config(f->r->server->module_config,
> +                                                      &cache_module);
> +
> +    /* try split any buckets larger than threshold */
> +    rv = APR_SUCCESS; /* successful unless found otherwise */
> +    rv2 = APR_SUCCESS;
> +    if (conf->maxbucketsize > 0) {
> +        e = APR_BRIGADE_FIRST(in);
> +        while (e != APR_BRIGADE_SENTINEL(in)) {

&& (rv == APR_SUCCESS && rv2 == APR_SUCCESS) {

Otherwise we may keep walking through the brigade unnecessarily.

> +
> +            /* if necessary, split the brigade and send what we have so far */
> +            if (APR_SUCCESS == apr_bucket_split(e, conf->maxbucketsize)) {
> +                e = APR_BUCKET_NEXT(e);
> +                bb = in;
> +                in = apr_brigade_split(bb, e);
> +
> +                /* if store body fails, don't try store body again */
> +                if (APR_SUCCESS == rv) {

Drop if.

> +                    rv = cache->provider->store_body(cache->handle, f->r, bb);
> +                }
> +
> +                /* try write split brigade to the filter stack and network */
> +                if (APR_SUCCESS == rv2) {

Drop if.

> +                    rv2 = ap_pass_brigade(f->next, bb);
> +                }
> +                apr_brigade_destroy(bb);
> +            }
> +            else {
> +                e = APR_BUCKET_NEXT(e);
> +            }
> +        }
> +    }
> +

Nice patch.

--
Davi Arnaut