On 06 Sep 2010, at 11:00 PM, Paul Querna wrote:

Isn't this problem an artifact of how all bucket brigades work, and
isn't it present in all output filter chains?

An output filter might be called multiple times, but a single bucket
can still contain a 4GB chunk easily.

It seems to me it would be better to think about this holistically
down the entire output filter chain, rather than building in special
case support for this inside mod_cache's internal methods?

In the cache case, thinking about it a bit, the in and out brigades are probably unavoidable, as the cache is a special case in that it wants to write the data twice: once to the cache, and a second time to the rest of the filter stack.

Right now, the cache is forced to read the complete brigade in order to cache it, with no option to give up early. And the cache has no choice but to keep the buckets in the brigade so that they can be passed a second time up the filter stack, so there's no deleting buckets as you go like you normally would. Read one 4GB file bucket in the cache, and in the process the file bucket gets morphed into half a million heap buckets. Oops.

With two brigades, one in and one out, buckets can be removed from the in brigade as they are consumed, as normal, and moved to the out brigade. The cache can quit at any time, and the code that follows knows what data to write to the network (out), and what data to loop round and resend to the cache (in). The cache provider could choose to quit and ask to be called again either because writing took too long, or because too much data was read (and in the process became heap buckets); either reason is fine.
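To make that concrete, the loop below is roughly what I mean. It's only a sketch, not the actual mod_cache code, and cache_write_block() and MAX_BYTES_PER_PASS are invented names:

#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

#define MAX_BYTES_PER_PASS (256 * 1024)   /* arbitrary limit, sketch only */

/* Invented stand-in for whatever actually writes a block to the cache. */
static apr_status_t cache_write_block(ap_filter_t *f,
                                      const char *data, apr_size_t len)
{
    (void)f; (void)data; (void)len;
    return APR_SUCCESS;
}

/* Rough sketch of a "store body" loop over two brigades: consume buckets
 * from "in", write them to the cache, then move them to "out" so the
 * caller can still send them on to the rest of the filter stack. */
static apr_status_t store_body_sketch(ap_filter_t *f,
                                      apr_bucket_brigade *in,
                                      apr_bucket_brigade *out)
{
    apr_size_t seen = 0;

    while (!APR_BRIGADE_EMPTY(in)) {
        apr_bucket *e = APR_BRIGADE_FIRST(in);
        const char *data;
        apr_size_t len;
        apr_status_t rv;

        if (APR_BUCKET_IS_METADATA(e)) {
            /* EOS/FLUSH etc: nothing to cache, just hand it on. */
            APR_BUCKET_REMOVE(e);
            APR_BRIGADE_INSERT_TAIL(out, e);
            continue;
        }

        /* Reading a big file bucket morphs it: we get one chunk back as a
         * heap bucket, and the remainder stays behind in "in". */
        rv = apr_bucket_read(e, &data, &len, APR_BLOCK_READ);
        if (rv != APR_SUCCESS) {
            return rv;
        }

        rv = cache_write_block(f, data, len);
        if (rv != APR_SUCCESS) {
            return rv;
        }

        /* Consumed: move the bucket from "in" to "out", as normal. */
        APR_BUCKET_REMOVE(e);
        APR_BRIGADE_INSERT_TAIL(out, e);

        seen += len;
        if (seen > MAX_BYTES_PER_PASS) {
            /* Enough for now: leftovers stay in "in", and we expect to be
             * called again with them (plus anything new). */
            break;
        }
    }
    return APR_SUCCESS;
}

The caller would then pass "out" on down the stack as usual, and if "in" still has buckets left (or more data arrives), call the provider again with both brigades.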

That said, following on from your suggestion of thinking about this in the general sense, it would be really nice if the filter stack had the option to say "I have bitten off as much of the brigade as I am prepared to chew on right now, the leftovers are still in the brigade, can you call me back with this data, maybe with more data added, and I'll try to swallow some more?".
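From the filter's point of view that might look something like this; AP_FILTER_EAGAIN is an invented status (mapped onto APR_EAGAIN purely so the sketch compiles), and the per-call limit is arbitrary:

#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

/* Invented status meaning "call me back with the leftovers"; no such
 * contract exists in httpd today. */
#define AP_FILTER_EAGAIN APR_EAGAIN

#define MAX_PER_CALL (256 * 1024)   /* arbitrary per-call limit */

static apr_status_t partial_filter_sketch(ap_filter_t *f,
                                          apr_bucket_brigade *bb)
{
    apr_size_t seen = 0;

    while (!APR_BRIGADE_EMPTY(bb)) {
        apr_bucket *e = APR_BRIGADE_FIRST(bb);
        const char *data;
        apr_size_t len;
        apr_status_t rv;

        if (APR_BUCKET_IS_EOS(e)) {
            /* End of stream: pass whatever is left on down the stack. */
            return ap_pass_brigade(f->next, bb);
        }

        rv = apr_bucket_read(e, &data, &len, APR_BLOCK_READ);
        if (rv != APR_SUCCESS) {
            return rv;
        }

        /* ... chew on data/len here: cache it, transform it, pass it on ... */

        apr_bucket_delete(e);
        seen += len;

        if (seen > MAX_PER_CALL) {
            /* Full for now: leftovers stay in bb, and the sender is asked
             * to call us again with them. */
            return AP_FILTER_EAGAIN;
        }
    }
    return APR_SUCCESS;
}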

In theory, that would mean all handlers (or entities that send data) could no longer make the blind assumption that the filter stack is willing to consume every possible set of buckets the handler wants to send; the stack would have the right to go "I'm full, give me a second to chew on this".

This wouldn't need separate brigades; probably just a return code that meant EAGAIN, and that handlers were expected to honour.
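On the handler side, honouring that could be as simple as looping around ap_pass_brigade() until the brigade is drained; again, AP_FILTER_EAGAIN is invented for illustration:

#include "httpd.h"
#include "util_filter.h"

#define AP_FILTER_EAGAIN APR_EAGAIN   /* same invented status as above */

/* Hypothetical handler-side loop: resend whatever the filter stack left
 * in the brigade.  A real handler would likely block, or go and generate
 * more data, before retrying rather than spinning like this. */
static apr_status_t send_all_sketch(request_rec *r, apr_bucket_brigade *bb)
{
    apr_status_t rv;

    do {
        rv = ap_pass_brigade(r->output_filters, bb);
    } while (rv == AP_FILTER_EAGAIN && !APR_BRIGADE_EMPTY(bb));

    return rv;
}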

Regards,
Graham
--
