On Mon, 12 Jul 2004, Ian Holsman wrote:

> ok, now before I start this let me say one thing, this is not for *ALL*
> requests, it will only work for ones which don't have content-length
> modifiable filters (like gzip) applied to the request, and it would be
> left to the webserver admin to figure out what they were, and if you
> could use this.
But that's not an issue if the byterange filter comes after any filters
that modify content (CONTENT_SET).

> ok..
> at the moment when a byterange request goes to a dynamic module, the
> dynamic module can not use any tricks to only serve the bytes requested,
> it *HAS* to serve the entire content up as buckets.

Indeed.  That only becomes a problem when a filter breaks pipelining.

> what I am proposing is something like:
>
> 1. the filter keeps an ordered list of the byte ranges the client
>    requests.
> 2. it keeps state on how far it has processed in the file, thanks to
>    knowing the length of the buckets processed so far.
> Q: when do the actual headers get put in.. I think they are added
>    after, no?

ITYM data, not "the file".  The case of a single file is trivial, and
can more efficiently be handled in a separate optimised execution path.
And some bucket types have to be read to get their length.

> 3. it then examines the bucket + bucket length to see which range
>    requests match this range; if some do, it grabs that range (possibly
>    splitting/copying if it meets multiple ranges) and puts it on the
>    right bits of each range request.
>
> 4. if the top range request is finished, it passes those buckets
>    through.
>
> 5. repeat until EOS/Sentinel, flushing the ordered list at the end.

This doesn't completely address the issue that this might cause
excessive memory usage, particularly if we have to serve ranges in a
perverse order.  I would propose two admin-configurable limits:

(1) Total data buffered in memory by the byterange filter.  This can be
    computed in advance from the request headers.  If this is exceeded,
    the filter should create a file bucket to store the data, and the
    ordered list then references offsets into the file.

(2) A limit above which byteranges won't be served at all: most of us
    have neither the memory nor the /tmp space for a gigabyte.

> now..
> this assumes that splitting a bucket (and copying) is a zero-cost
> operation which doesn't actually *read* the bucket; is this true for
> most bucket types?
>
> would this kind of thing work?

As I said, the trivial cases should (transparently) be treated
separately and more simply.  Otherwise ... well, as discussed on IRC.

-- 
Nick Kew