On 11/19/10 11:58 AM, Ryan Gies wrote:
>> One is to iterate over the filenames with subrequests (if this is even
>> possible/supported), so that each can be passed internally to a single request
> Although you could get them to work, I don't think sub-requests are your answer.
> They run through all of the handler phases and are expected to return full HTTP
> responses.

I think that's right. I gave up on that idea after an exchange with André.

>> If that doesn't work then I can imagine iterating over the files with calls to
>> "sendfile()" and using a modified filter to guess at file boundaries.
> Because your out-of-band signal may be split across buckets, the output-filter
> approach is probably not your answer either. Once again it can be done, however
> introduces [seemingly] unneeded complexity. I would say the same for tracking
> boundaries according to their offset.

Well, there are lots of cases where you have to worry about data being split across buckets (or even brigades) in an output filter, but there are known solutions that work by maintaining context between invocations. The reason I don't trust custom in-band signals is that the filter is handling binary data, so I can't reliably predict what will pass through. Tracking by offset is more promising, even though, as you say, it brings complexity and openings for error.
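
To illustrate the "maintaining context" idea, here is a minimal sketch in Python rather than mod_perl. It is not the actual filter: the class name `MarkerScanner` and its methods are hypothetical, and plain byte strings stand in for bucket data. It keeps the last few bytes of each chunk so that a marker split across two chunks is still found. It also shows the weakness mentioned above: nothing stops the same byte sequence from occurring inside binary payload.

```python
class MarkerScanner:
    """Find a byte marker in a stream that arrives in arbitrary chunks."""

    def __init__(self, marker):
        if not marker:
            raise ValueError("marker must be non-empty")
        self.marker = marker
        self.tail = b""     # carried-over context: last len(marker)-1 bytes
        self.consumed = 0   # total bytes fed so far

    def feed(self, chunk):
        """Return absolute stream offsets of markers completed by this chunk."""
        window = self.tail + chunk
        base = self.consumed - len(self.tail)  # stream offset of window[0]
        hits = []
        start = window.find(self.marker)
        while start != -1:
            hits.append(base + start)
            start = window.find(self.marker, start + 1)
        self.consumed += len(chunk)
        keep = len(self.marker) - 1
        # The tail is shorter than the marker, so no match is reported twice.
        self.tail = window[-keep:] if keep else b""
        return hits


scanner = MarkerScanner(b"--X--")
first = scanner.feed(b"abc--")    # marker only partly seen so far
second = scanner.feed(b"X--def")  # marker completed by this chunk
```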

> Unless there is some constraint, the most straight-forward approach may be to
> implement your routine to modify the file contents as they are read from disk:

Yeah, André came up with the same idea, but I'm hoping to avoid that for the same reason that I gave up on the idea of subrequests: efficiency (and re-use). I already have a filter that does what I want and operates on bucket brigades; it's designed for binary files and in most cases only needs to read the first few kilobytes of multi-megabyte files before deciding that it can pass on the rest untouched. For efficiency it's much better to skip the calls to read() and have the data read only when it's written to the client, rather than read multiple times and into memory by the response handler.

So the only way I can think of to maintain this efficiency for multiple files in a single stream would be to have their filehandles go in succession into bucket brigades and have the filter track the boundaries by offset. I know I can't rely on brigade boundaries or flush buckets, because single files can be spread over multiple brigades, and I'm not confident that I can control where flush buckets appear unless I insert a filter directly before mine to strip them out except at the boundaries (does anyone know whether flush buckets are predictable?). It's a bit messy and I'm still hoping someone here may offer a cleaner mechanism, but if not then I'll try that.
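
The offset bookkeeping could look roughly like the sketch below. It is again in Python purely for illustration and not tied to the Apache bucket API. The function name `split_by_offsets` is hypothetical, byte strings stand in for bucket contents, and it assumes the file sizes are known up front (for example from a stat on each file before it is queued).

```python
def split_by_offsets(chunks, file_sizes):
    """Yield (file_index, data) pieces, cutting chunks at file boundaries.

    chunks     -- iterable of bytes objects, standing in for bucket data
    file_sizes -- byte length of each file, in the order the files were sent
    """
    sizes = iter(enumerate(file_sizes))
    index, remaining = next(sizes, (None, 0))
    for chunk in chunks:
        pos = 0
        while pos < len(chunk):
            # Move on to the next file once the current one is used up
            # (this also skips over zero-length files).
            while index is not None and remaining == 0:
                index, remaining = next(sizes, (None, 0))
            if index is None:
                raise ValueError("more data than the declared file sizes")
            take = min(remaining, len(chunk) - pos)
            yield index, chunk[pos:pos + take]
            pos += take
            remaining -= take


# Three files of 3, 5 and 2 bytes arriving in chunks that ignore boundaries.
pieces = list(split_by_offsets([b"AAAB", b"BBBBC", b"C"], [3, 5, 2]))
```

Each piece carries the index of the file it belongs to, so a filter built this way would know when a new file starts regardless of how the data was split into chunks.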

Thanks,

Brian
