On Thu, Sep 2, 2010 at 10:16 AM, Graham Leggett <minf...@sharp.fm> wrote:
> Hi all,
>
> An issue with mod_cache I would like to address this weekend is the
> definition of the store_body() function in the cache implementation
> provider:
>
>    apr_status_t (*store_body)(cache_handle_t *h, request_rec *r,
>                               apr_bucket_brigade *b);
>
> Right now, mod_cache expects a cache implementation to swallow the entire
> bucket brigade b before returning to mod_cache.
>
> This is fine until the bucket brigade b contains something really large,
> such as a single file bucket pointing at a 4GB DVD image (such a scenario
> occurs when files on a slow disk are cached on a fast SSD). Swallowing
> the entire brigade in one go can then take a significant amount of time,
> certainly enough for the client to get bored and time out should the file
> be large and the original disk slow.

Isn't this problem an artifact of how all bucket brigades work, and
isn't it present in all output filter chains?

An output filter might be called multiple times, but a single bucket
can still easily contain a 4GB chunk.

It seems to me it would be better to think about this holistically,
across the entire output filter chain, rather than building
special-case support for it into mod_cache's internal methods.

> What I propose is a change to the function that looks like this:
>
>    apr_status_t (*store_body)(cache_handle_t *h, request_rec *r,
>                               apr_bucket_brigade *in, apr_bucket_brigade *out);
>
> Instead of one brigade b being passed in, we pass two brigades in, one
> labelled "in", the other labelled "out".
>
> The brigade previously marked "b" becomes "in", and the cache implementation
> is free to consume as much of the "in" brigade as it sees fit, and as the
> "in" brigade is consumed, the consumed buckets are moved to the "out"
> brigade.
>
> If store_body() returns with an empty "in" brigade, mod_cache writes the
> "out" brigade to the output filter stack and we are done, as is the case now.
>
> Should however the cache implementation want to take a breath, it returns to
> mod_cache with unconsumed bucket(s) still remaining in the "in" brigade.
> mod_cache in turn sends the already-processed buckets in the "out" brigade
> down the filter stack to the client, and then loops round, calling the
> store_body() function again until the "in" brigade is empty.
>
> In this way, the cache implementation has the option to swallow data in as
> many smaller chunks as it sees fit, and in turn the client gets fed data
> often enough to not get bored and time out if the file is very large.
>
> Regards,
> Graham
> --
>
>
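
For reference, the calling loop proposed above can be sketched roughly as
follows. This is only a toy model under stated assumptions: a plain byte
counter stands in for apr_bucket_brigade, and every toy_* name and the
TOY_CHUNK limit are hypothetical, not part of mod_cache or APR. It shows
the control flow only (consume some of "in", flush "out", repeat until
"in" is empty), not real bucket handling.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for apr_bucket_brigade: just a count of pending bytes. */
typedef struct {
    size_t bytes;
} toy_brigade;

/* Hypothetical per-call limit that a cache provider might choose. */
#define TOY_CHUNK 4096

/* Hypothetical provider callback: moves at most TOY_CHUNK bytes from
 * "in" to "out" and returns 0 (standing in for APR_SUCCESS). A real
 * provider would also write those bytes to the cache at this point. */
static int toy_store_body(toy_brigade *in, toy_brigade *out)
{
    size_t n = in->bytes < TOY_CHUNK ? in->bytes : TOY_CHUNK;
    in->bytes -= n;
    out->bytes += n;
    return 0;
}

/* Stand-in for passing "out" down the filter stack: records how many
 * bytes were sent and leaves "out" empty for the next pass. */
static void toy_pass_downstream(toy_brigade *out, size_t *sent)
{
    *sent += out->bytes;
    out->bytes = 0;
}

/* Caller loop as described in the proposal: call store_body, send
 * whatever was processed to the client, and loop until "in" is empty.
 * Returns the number of store_body calls made, or -1 on error. */
static int toy_cache_loop(toy_brigade *in, size_t *sent)
{
    toy_brigade out = { 0 };
    int calls = 0;

    assert(in != NULL && sent != NULL);
    do {
        if (toy_store_body(in, &out) != 0) {
            return -1;
        }
        calls++;
        toy_pass_downstream(&out, sent);
    } while (in->bytes > 0);
    return calls;
}
```

With a 10000-byte "in" and the 4096-byte limit assumed here, the loop
calls the provider three times and the client sees data after each call,
instead of once at the very end.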
