On 6/16/05, Zoran Vasiljevic <[EMAIL PROTECTED]> wrote:
> 
> Am 16.06.2005 um 22:13 schrieb Stephen Deasey:
> 
> >
> > It was at one point implemented with mmap:
> >
> > http://cvs.sourceforge.net/viewcvs.py/aolserver/aolserver/nsd/driver.c?rev=1.34&view=markup
> >
> 
> Aha! But it was taken out for some reason...
> It would be good to know why?
> 
> >
> > Spooling large requests to disk is clearly necessary.  Almost always,
> > a large request needs to be stored in a file server-side anyway, so it
> > might as well go there from the start.
> >
> > The only problem I see is that the calls to open a file and write
> > content to disk are blocking, and they happen in the context of the
> > driver thread which is busy multiplexing many non-blocking operations
> > on sockets.  Should one of the calls used to spool to disk block,
> > everything comes to a stop.
> 
> Oh yes. This is very right.
> 
> >
> > The version 3.x model was to read-ahead up to the end of headers in
> > the driver thread, then pass the conn to a conn thread.  As far as I
> > remember, the conn thread would then spool all content to disk, but
> > this could have been on-demand and it might have been only file-upload
> > data. A programmer could then access the spooled data using
> > Ns_ConnRead(), Ns_ConnReadLine() etc., or they would be called for you
> > by ns_getform etc.  The thing to note however is that all the blocking
> > calls happened in separate conn threads.
> 
> Yes. I remember this because the multipart data was parsed by a
> Tcl script. Not very speedy, but it was surprisingly stable.
> 
> >
> > In early version 4.0 the model was changed so that the driver thread
> > would read-ahead all data (up to Content-Length bytes) before the conn
> > was passed to a conn thread.  In a point release a limit was
> > introduced to avoid the obvious DOS attack.  This allowed an easier
> > interface to the data for programmers: Ns_ConnContent() which returns
> > a pointer to an array of bytes.  Ns_ConnReadLine() etc. are no longer
> > used and are currently broken.
> 
> But this brought memory bloat as a side-effect
> (you can't make everyone happy) :-)
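
Right -- and the pay-off for that bloat was the simpler interface.  From the
application programmer's side the 4.0 model is basically this (a sketch from
memory, so the exact Ns_OpProc/Ns_ConnReturnData signatures may be slightly
off):

    #include "ns.h"

    /*
     * Registered proc in the 4.0 model: by the time this runs, the driver
     * thread has already read the entire body, so the content is just a
     * contiguous, addressable buffer in memory.
     */
    static int
    EchoProc(void *arg, Ns_Conn *conn)
    {
        char *content = Ns_ConnContent(conn);   /* pointer to the bytes  */
        int   length  = conn->contentLength;    /* as sent by the client */

        (void) arg;
        if (content == NULL || length == 0) {
            return Ns_ConnReturnData(conn, 200, "no body", 7, "text/plain");
        }

        /* No read loop, no temp files: hand the whole body straight back. */
        return Ns_ConnReturnData(conn, 200, content, length,
                                 "application/octet-stream");
    }

The downside, as you say, is that all of it sits in memory for the life of
the request.
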
> 
> >
> > Version 4.1 work seems to be trying to tackle the problem of what
> > happens when large files are uploaded.  The version 4.0 model works
> > extremely well for HTTP POST data up to a few K in size, but as you've
> > noticed it really bloats memory when multi-megabyte files are
> > uploaded.  This version also introduces limits, which are a
> > URL-specific way of pre-declaring the maxupload size and some other
> > parameters.
> 
> But it still just spools the data into a temp file at the very sensitive
> point (in driver.c), as you already pointed out above.
> 
> >
> >
> > So anyway, here's my idea of how it should work:
> 
> I'm all ears :-)
> 
> >
> > There's a maxreadahead parameter which is <= maxinput.  When a request
> > arrives with Content-Length > 0 && < maxinput, maxreadahead bytes are
> > read into a buffer by the driver thread before the conn is passed to a
> > conn thread.
> >
> > The conn thread runs the registered proc for that URL.  If that
> > procedure does not try to access the content, then when the conn is
> > returned to the driver thread any content > maxreadahead is drained.
> >
> > If the registered proc does try to access the content via e.g.
> > Ns_ConnContent() (and I think this would be the main method, used
> > underneath by all others) and the content is <= maxreadahead, then a
> > pointer to the readahead buffer is returned.
> >
> > If the content is accessed and it is > maxreadahead, a temp file is
> > mmaped, the readahead buffer is dumped to it, and the remaining bytes
> > are read from the socket, possibly blocking many times, and dumped to
> > the mmaped file, again possibly blocking.  A call to Ns_ConnContent()
> > returns a pointer to the mmaped bytes of the open file.
> 
> But this is now happening outside the driver and in the connection
> thread, ok.
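
Exactly -- and because the conn thread can afford to block, the overflow
path can be plain sequential code, roughly like this (POSIX sketch; the
function name and parameters here are made up):

    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /*
     * Lazy overflow: dump the read-ahead buffer to a temp file, pull the
     * rest of the body off the socket (blocking is fine here), then mmap
     * the file so Ns_ConnContent() can still return one flat buffer.
     * Returns the mapped content or NULL on error; *fdPtr gets the open
     * spool fd, which the conn would keep until the connection is closed.
     */
    static char *
    SpoolContent(int sock, const char *readahead, size_t avail,
                 size_t total, int *fdPtr)
    {
        char    tmpl[] = "/tmp/nsspoolXXXXXX";
        char    buf[8192];
        char   *content;
        size_t  spooled;
        int     fd = mkstemp(tmpl);

        if (fd < 0) {
            return NULL;
        }
        unlink(tmpl);                   /* file disappears once the fd is closed */

        if (write(fd, readahead, avail) != (ssize_t) avail) {
            close(fd);
            return NULL;
        }

        spooled = avail;
        while (spooled < total) {
            size_t  want = total - spooled;
            ssize_t n;

            if (want > sizeof(buf)) {
                want = sizeof(buf);
            }
            n = read(sock, buf, want);  /* blocking is OK in a conn thread */
            if (n <= 0 || write(fd, buf, (size_t) n) != n) {
                close(fd);
                return NULL;
            }
            spooled += (size_t) n;
        }

        /* Map the file so the caller gets one flat buffer of bytes. */
        content = mmap(NULL, total, PROT_READ, MAP_SHARED, fd, 0);
        if (content == MAP_FAILED) {
            close(fd);
            return NULL;
        }
        *fdPtr = fd;                    /* kept open until the connection closes */
        return content;
    }

When the content fits within maxreadahead none of this runs:
Ns_ConnContent() just returns a pointer into the read-ahead buffer.
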
> 
> >
> > At any time before the registered proc asks for the content it can
> > check the content length header and decide whether it is too large to
> > accept.  You could imagine setting a low global maxinput, and a call
> > such as Ns_ConnSetMaxInput() which a registered proc could call to
> > increase the limit for that connection only.  The advantage over the
> > limits scheme in 4.1 is that the code which checks the size of the
> > content and processes it is kept together, rather than having to
> > pre-declare maxinput sizes for arbitrary URLs in the config file.
> 
> Hm... this is the only part I do not understand.
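
To make it concrete, something like the following -- keeping in mind that
Ns_ConnSetMaxInput() is only a proposal, the handler is invented for the
example, and the Ns_ConnReturn* signatures are from memory:

    #include "ns.h"

    /*
     * Hypothetical upload handler.  The global maxinput stays small to
     * protect every other URL; this one URL knowingly accepts big bodies
     * and raises the limit for its own connection only, right next to the
     * code that checks the size and consumes the data.
     */
    #define MAX_UPLOAD (100 * 1024 * 1024)

    static int
    UploadProc(void *arg, Ns_Conn *conn)
    {
        char *content;

        (void) arg;
        if (conn->contentLength > MAX_UPLOAD) {
            return Ns_ConnReturnBadRequest(conn, "upload too large");
        }

        Ns_ConnSetMaxInput(conn, MAX_UPLOAD);       /* proposed API */

        /* Only now touch the content, which triggers read-ahead/spooling. */
        content = Ns_ConnContent(conn);
        if (content == NULL) {
            return Ns_ConnReturnBadRequest(conn, "could not read body");
        }
        return Ns_ConnReturnData(conn, 200, "ok", 2, "text/plain");
    }

The point is that the size check and the code which consumes the content
live together in the registered proc, instead of having to pre-declare a
limit for that URL in the config file.
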
> 
> >
> > This is similar to 4.1 except the task of overflowing to disk is moved
> > to the conn thread and is lazy.  The mental model is also slightly
> > different: the assumption is that content goes to disk, but there's a
> > read-ahead cache which may be enough to hold everything.  In 4.1 it's
> > the reverse: everything is read into memory, unless it's large, in
> > which case it overflows to disk.
> >
> >
> > How does that sound?
> >
> 
> Summary (my understanding):
> 
> This is a kind of marriage between the 3.0 and 4.0 strategies, as I see it.
> Up to maxreadahead it behaves like 4.0 (loads everything into memory), and
> above maxreadahead it behaves like 3.0 (spools to disk).


We would be slightly better than 3.x in that we would use your new
mmap abstraction, which is nice because of the simple interface
(buffer of bytes) it allows us to give the application programmer.  In
3.x, I think your only options were the provided Ns_ConnReadLine()
etc. or a file descriptor.

As far as I remember, 3.x was a little bit more complicated than
described so far.  Once the headers were read by the driver thread and
the conn passed to a conn thread, the remaining request body would be
read lazily in chunks.  That is, if you called Ns_ConnReadLine(), it
would read a single buffer of bytes from the network and look for the
next newline character in that buffer.  If one was found, it would
hand you a line of data.  If not, it would block reading more data
from the network, and so on until the line is found or there's no more
data.
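
In code, that lazy read was shaped roughly like this (a generic sketch of
the idea, not the actual Ns_ConnReadLine() implementation):

    #include <string.h>
    #include <unistd.h>

    typedef struct {
        int    sock;
        char   buf[4096];
        size_t len;                 /* bytes currently buffered */
    } LazyReader;

    /*
     * Return a pointer to the next line inside r->buf (length, including
     * the '\n', in *lenPtr), blocking on the socket only when the buffered
     * data does not yet contain a newline.  Returns NULL on EOF/error or
     * if the line is longer than the buffer.  The caller consumes the line
     * and shifts the remaining bytes down, which is not shown here.
     */
    static char *
    NextLine(LazyReader *r, size_t *lenPtr)
    {
        for (;;) {
            char *nl = memchr(r->buf, '\n', r->len);

            if (nl != NULL) {
                *lenPtr = (size_t)(nl - r->buf) + 1;
                return r->buf;
            }
            if (r->len == sizeof(r->buf)) {
                return NULL;
            }
            /* No newline yet: block reading another chunk from the network. */
            ssize_t n = read(r->sock, r->buf + r->len,
                             sizeof(r->buf) - r->len);
            if (n <= 0) {
                return NULL;
            }
            r->len += (size_t) n;
        }
    }

All of that potential blocking happened in a conn thread, never in the
driver thread.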

Just to be clear, that's *not* what I suggested we build above.  I was
suggesting that on the first attempt to read any of the request body,
*all* remaining data > maxreadahead would be spooled to disk.  This is
a much simpler model, and I think code simplification was a big reason
for the 3.x -> 4.x transition.

However...  Vlad's question about ns_getform has me wondering about
the common case of HTTP file upload.  Consider a 10MB file upload --
just before the form parsing code does its thing, the situation is
that there is a file on disk containing the 10MB of data, plus the
MIME header, plus the content separator as a trailer.  What you want
is a file with just the 10MB of data.  You can truncate a file to trim
off the trailer, but you can't trim the MIME header at the beginning.
The 10MB of data will have to be copied to a new file, and the spool
file will be deleted when the connection is closed.
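
The copy itself would be nothing fancy, something like this (plain POSIX
sketch; the offsets would come from the multipart parsing, and error
handling is minimal):

    #include <fcntl.h>
    #include <sys/types.h>
    #include <unistd.h>

    /*
     * The spool file looks like:
     *   [MIME header][10MB of data][content separator]
     * and only the middle part should end up in the final file.
     */
    static int
    ExtractUpload(int spoolFd, off_t dataStart, off_t dataLen, const char *dest)
    {
        char  buf[8192];
        off_t left = dataLen;            /* stop before the trailing separator */
        int   out  = open(dest, O_WRONLY | O_CREAT | O_TRUNC, 0600);

        if (out < 0) {
            return -1;
        }
        if (lseek(spoolFd, dataStart, SEEK_SET) < 0) {   /* skip the MIME header */
            close(out);
            return -1;
        }
        while (left > 0) {
            size_t  want = (left > (off_t) sizeof(buf)) ? sizeof(buf) : (size_t) left;
            ssize_t n    = read(spoolFd, buf, want);

            if (n <= 0 || write(out, buf, (size_t) n) != n) {
                close(out);
                return -1;
            }
            left -= n;
        }
        close(out);
        return 0;                        /* the spool file itself is deleted
                                          * later, when the connection closes */
    }

So a 10MB upload ends up being written to disk twice.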

This is no worse than the situation we're currently in or that of the
4.1 code base. I'm not sure what 3.x did here.  But it does seem a bit
unfortunate.

How about this:  we introduce a new private interface something like
NsConnPeek(Ns_Conn *conn, int *lenPtr) which returns a pointer into
the read-ahead buffer just after the HTTP headers, *without* first
spooling the entire request body to disk.  The form processing code
would use this to parse the MIME header and, hopefully, any standard
www-form-urlencoded data that was sent at the same time.  It
would then reset the current offset into the buffer (which was
previously positioned at the end of the HTTP headers) so that when the
spooling code takes over it begins at the start of the file content. 
The spool file would then be truncated to remove the trailing content
separator.
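
In rough C, the form-processing side might then look like this --
NsConnPeek(), NsConnSetContentOffset() and NsConnSpoolFd() are all just
invented names for the sake of the sketch, and the multipart handling is
heavily simplified:

    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>
    #include "ns.h"

    static int
    SpoolOneUpload(Ns_Conn *conn, size_t trailerLen)
    {
        int    avail, i, fd;
        off_t  end;
        char  *sep  = NULL;
        char  *peek = NsConnPeek(conn, &avail);   /* read-ahead bytes just past
                                                   * the HTTP headers; nothing
                                                   * has been spooled yet */
        if (peek == NULL) {
            return NS_ERROR;
        }

        /*
         * Find the blank line that ends the part's MIME header, assuming it
         * is entirely inside the read-ahead buffer (the common case).
         */
        for (i = 0; i + 4 <= avail; ++i) {
            if (memcmp(peek + i, "\r\n\r\n", 4) == 0) {
                sep = peek + i + 4;
                break;
            }
        }
        if (sep == NULL) {
            return NS_ERROR;
        }

        /* Point the spooler at the start of the file content ... */
        NsConnSetContentOffset(conn, (int)(sep - peek));

        /* ... let it spool the rest to disk (the lazy path above), and then
         * chop the trailing content separator off the end of the file. */
        fd  = NsConnSpoolFd(conn);
        end = lseek(fd, 0, SEEK_END);
        return (ftruncate(fd, end - (off_t) trailerLen) == 0) ? NS_OK : NS_ERROR;
    }

That would get the upload onto disk with no extra copy for the single-file
case.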

For multiple simultaneous file uploads you'd have to revert to copying
the data to new files.  But for the common case, we'd have something
pretty efficient without too much complexity, while retaining the nice
interface that mmap allows.


How does this sound?



> The most significant fact is that potentially blocking calls while
> spooling to disk are taken out of the driver and done in the connection
> thread (where it does not matter that much), and that the process is lazy
> (done only when content is actually requested, not in advance).
> 
> Yes, this sounds fine, indeed. The only thing I have yet to
> understand is the twiddling with the maxinput value on a per-connection
> basis. I do not know *when* this would happen. Can you give a simple
> practical example?
> 
> 
> Zoran
> 
> 
> 