Yo,

I'm in the process of moving from mod_webkit to using the built-in  
HTTP server (because my hosting co wants to run my webapp behind  
Apache mod_rewrite/mod_proxy).

My service supports uploads of some very large files.  This worked  
fine under mod_webkit, which is what we had been using.  However,  
things are blowing up with the built-in HTTP server.  I did some  
investigating, and discovered that, for *any* request with a separate  
body (a POST, a multipart/form, etc), WebKit.HTTPServer.HTTPHandler  
is reading the entire body into memory, then wrapping it in a  
StringIO instance before handing it off to the Application class.

This is a disaster for large files (> 50 MB, say), and just seems like  
an odd design choice all around.

After looking carefully at HTTPHandler, I think I've found a way to  
avoid holding that file in memory.  I wanted to ask the list if there  
was a reason for the current design that I'm missing.  If not, I'd  
propose the patch below as an improvement to the built-in HTTP  
server.  Here's what I'm doing:

In WebKit/HTTPServer.py, l. 58, it does the read into memory from  
rfile (which is a file-like wrapper around the connection from the  
client, poised to read at the start of the body):

     input = self.headers.has_key('Content-Length') \
             and self.rfile.read(int(self.headers['Content-Length'])) \
             or ''

Which is then wrapped in StringIO and passed on to the app on l. 140:

     'input': StringIO(input),


I changed that to (again, l. 58):

     if self.headers.has_key('Content-Length'):
         input = self.rfile
         env['CONTENT_LENGTH'] = self.headers['Content-Length']
     else:
         input = StringIO('')


And then just passed that input in directly on l. 140:

     'input': input,


I had to set CONTENT_LENGTH in the environment dict so that  
cgi.py could parse the body correctly.  Without that, it just locked up.  
With it, it seems to be working fine.
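
To illustrate why the length matters: rfile is the live connection, so a
reader that does not know where the body ends can sit waiting for bytes
that never arrive, which is presumably the lock-up described above.  Here
is a rough sketch of a wrapper that caps reads at Content-Length.  It is
not part of WebKit or cgi.py; BoundedReader is a name made up for this
example, and io.BytesIO stands in for rfile so the snippet runs on its own
(written for a current Python rather than the version HTTPServer.py targets).

```python
import io


class BoundedReader(object):
    """Hypothetical wrapper: exposes at most `length` bytes of `fp`."""

    def __init__(self, fp, length):
        self.fp = fp
        self.remaining = length

    def _clamp(self, size):
        # Never ask the underlying stream for more than the body has left,
        # so a bare read() cannot block on a connection that is still open.
        if size is None or size < 0 or size > self.remaining:
            return self.remaining
        return size

    def read(self, size=-1):
        data = self.fp.read(self._clamp(size))
        self.remaining -= len(data)
        return data

    def readline(self, size=-1):
        line = self.fp.readline(self._clamp(size))
        self.remaining -= len(line)
        return line


# Stand-in for rfile: a 5-byte body followed by the start of a later request.
rfile = io.BytesIO(b"helloGET /next HTTP/1.1\r\n")
body = BoundedReader(rfile, 5)
first = body.read()    # stops at the end of the body
second = body.read()   # nothing left in this body
```

Setting CONTENT_LENGTH in the environment gets the same effect without a
wrapper, as long as whatever consumes the input honors it.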

Basically, I'm just removing an extra level of reading into memory  
and then wrapping in StringIO, which, as far as I can tell, is  
serving no purpose, and, for large request bodies, is a crippling  
problem.
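
If it turns out that something downstream does need a seekable object
(one possible reason for the StringIO, though that is only a guess), the
body could be spooled to a temp file in fixed-size chunks instead of being
held in memory.  A minimal sketch, assuming a hypothetical helper named
spool_body and using io.BytesIO as a stand-in for rfile:

```python
import io
import tempfile


def spool_body(rfile, length, chunk_size=64 * 1024):
    """Copy up to `length` bytes from rfile into a temp file, chunk by chunk."""
    out = tempfile.TemporaryFile()
    remaining = length
    while remaining > 0:
        # At most chunk_size bytes are held in memory at any one time.
        chunk = rfile.read(min(chunk_size, remaining))
        if not chunk:
            break  # the client closed the connection early
        out.write(chunk)
        remaining -= len(chunk)
    out.seek(0)  # rewind so the consumer reads from the start of the body
    return out


payload = b"x" * 200000
spooled = spool_body(io.BytesIO(payload), len(payload))
copied = spooled.read()
```

This trades memory for disk I/O, so it only makes sense if seekability is
actually required.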

Anyone know why it's set up the way it is now?
-Dan Milstein
