Definitely don't buffer the whole thing. Footers are the way to go for efficiency reasons, but if there is a problem supporting them, the alternative is to read over the section of the file and compute the MD5 hash, then read it again and stream it to the client. That sounds more expensive, but reading twice avoids caching the data twice, since the file system cache will keep it in memory most of the time anyway. And the FS cache is smarter about not evicting more important data if the section is large.
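A rough sketch of the two-pass idea in Python, for illustration only. The function names and the 64 KB read size are assumptions, not anything from CouchDB; the only fixed detail is that RFC 1864 encodes the 16-byte MD5 digest as base64.

```python
import base64
import hashlib
import io

CHUNK_SIZE = 64 * 1024  # assumed read size, not taken from the thread


def content_md5(f, start, length):
    """First pass: hash the byte range without holding it in memory."""
    digest = hashlib.md5()
    f.seek(start)
    remaining = length
    while remaining > 0:
        block = f.read(min(CHUNK_SIZE, remaining))
        if not block:
            break  # the range runs past the end of the file
        digest.update(block)
        remaining -= len(block)
    # RFC 1864: the header value is the base64 of the raw 16-byte digest
    return base64.b64encode(digest.digest()).decode("ascii")


def stream_range(f, start, length):
    """Second pass: yield the same bytes again for the response body."""
    f.seek(start)
    remaining = length
    while remaining > 0:
        block = f.read(min(CHUNK_SIZE, remaining))
        if not block:
            break
        yield block
        remaining -= len(block)


# Usage: compute the header value first, then stream the body.
f = io.BytesIO(b"hello world")
header_value = content_md5(f, 6, 5)
body = b"".join(stream_range(f, 6, 5))
```

Neither pass keeps more than one block in the process; whether the second read is actually served from the file system cache depends on the OS and on memory pressure.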

But footers are definitely the way to go for efficiency. Which brings up a good question: are there known problems with footer support?
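For reference, a footer (the HTTP spec calls it a trailer) rides at the end of a chunked response, so the hash can be computed while streaming. The sketch below is a hypothetical helper, not CouchDB code. It assumes the response headers already include "Transfer-Encoding: chunked" and "Trailer: Content-MD5", and it says nothing about whether a given client will read the trailer.

```python
import base64
import hashlib


def chunked_with_md5_trailer(blocks):
    """Frame blocks as HTTP/1.1 chunks and append Content-MD5 as a trailer."""
    digest = hashlib.md5()
    for block in blocks:
        if not block:
            continue  # a zero-length chunk would end the body early
        digest.update(block)
        # chunk = size in hex, CRLF, data, CRLF
        yield b"%x\r\n" % len(block) + block + b"\r\n"
    md5_b64 = base64.b64encode(digest.digest())
    # last chunk ("0"), then the trailer field, then the final blank line
    yield b"0\r\nContent-MD5: " + md5_b64 + b"\r\n\r\n"


# Usage: the MD5 is only known after the last data chunk has gone out.
frames = list(chunked_with_md5_trailer([b"hello ", b"", b"world"]))
```

Each block is hashed and sent once, which is the efficiency argument for footers over the double read.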

-Damien


On Jun 30, 2009, at 10:44 AM, Paul Davis wrote:

On Tue, Jun 30, 2009 at 7:12 AM, Damien Katz<dam...@apache.org> wrote:

On Jun 30, 2009, at 12:17 AM, Noah Slater wrote:

On Fri, Jun 26, 2009 at 07:08:32AM -0400, Damien Katz wrote:

MD5 here is for integrity purposes, not security, so manufactured
collisions aren't a problem we are worried about. And I don't think
there is a standard SHA1 header, not that I could find anyway.

I've been seeing some unrelated emails go past on the W3C HTTP WG mailing
list about the Content-MD5 header, which reminded me of this thread. It seems
that this value must be calculated from the MIME canonical response body,
which means a different value for content ranges. This presumably means that
CouchDB must refuse content range requests, send an MD5 value that does not
match the document revision, or break RFC 1864.

I'm not sure I understand why we can't just calculate and send the MD5 header
for the content range.


I reckon you'd have to buffer the response, no? It's hard to know the MD5
of an a priori unknown set of bytes until the end of the range, which
kinda conflicts with sending the MD5 as a header. Technically there
are HTTP Footers, but I've never actually seen them used.

-Damien


Best,

--
Noah Slater, http://tumbolia.org/nslater


