You are correct that the solution you proposed would work. My only concern is that some existing applications call URL(app, controller, function) instead of URL(r=request, f=function). That is valid. When they try to deploy behind WSGI in a subfolder, they will find that the links break. I agree with you that they could change the URL(...) arguments, but they should not have to. I like your fix, but it would be nice if we could come up with a way that does not break any existing app.
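For context, one generic way to make hard-coded URL(app, controller, function) links survive a subfolder mount is to have the URL helper consult the WSGI SCRIPT_NAME. A minimal sketch with a hypothetical url() helper (this is not web2py's actual URL() implementation, just an illustration of the idea):

```python
def url(application, controller, function, environ=None):
    # Hypothetical helper (not web2py's real URL()): prepend whatever
    # prefix the WSGI server mounted the app under, taken from
    # SCRIPT_NAME, so absolute links keep working in a subfolder deploy.
    prefix = (environ or {}).get('SCRIPT_NAME', '').rstrip('/')
    return '%s/%s/%s/%s' % (prefix, application, controller, function)

# Mounted at the root, links are unchanged; behind a subfolder the same
# call produces a prefixed path without touching application code.
print(url('app', 'default', 'index'))
print(url('app', 'default', 'index', {'SCRIPT_NAME': '/myapps'}))
```

Because the prefix comes from the WSGI environment rather than from the URL(...) call site, existing applications would not need to change their arguments.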
On Sep 4, 5:53 am, Graham Dumpleton <graham.dumple...@gmail.com> wrote:
> The WSGI specification defines an optional extension referred to as
> wsgi.file_wrapper.
>
> http://www.python.org/dev/peps/pep-0333/#optional-platform-specific-f...
>
> The intent of this is that WSGI hosting mechanisms can provide a
> better-performing way of responding with content from a file within a
> WSGI application. At the moment, web2py doesn't use this feature.
>
> The patches for it are reasonably simple:
>
> *** globals.py.dist 2009-09-04 13:47:34.000000000 +1000
> --- globals.py 2009-09-04 13:44:10.000000000 +1000
> ***************
> *** 168,174 ****
>               self.headers['Content-Length'] = os.stat(filename)[stat.ST_SIZE]
>           except OSError:
>               pass
> !         self.body = streamer(stream, chunk_size)
>           return self.body
>
>       def download(self, request, db, chunk_size = 10 ** 6):
> --- 168,177 ----
>               self.headers['Content-Length'] = os.stat(filename)[stat.ST_SIZE]
>           except OSError:
>               pass
> !         if request and request.env.wsgi_file_wrapper:
> !             self.body = request.env.wsgi_file_wrapper(stream, chunk_size)
> !         else:
> !             self.body = streamer(stream, chunk_size)
>           return self.body
>
>       def download(self, request, db, chunk_size = 10 ** 6):
>
> *** streamer.py.dist 2009-09-04 14:34:24.000000000 +1000
> --- streamer.py 2009-09-04 20:06:43.000000000 +1000
> ***************
> *** 80,87 ****
>           stream.seek(part[0])
>           headers['Content-Range'] = 'bytes %i-%i/%i' % part
>           headers['Content-Length'] = '%i' % bytes
> !         raise HTTP(206, streamer(stream, chunk_size=chunk_size,
> !                                  bytes=bytes), **headers)
>       else:
>           try:
>               stream = open(static_file, 'rb')
> --- 80,91 ----
>           stream.seek(part[0])
>           headers['Content-Range'] = 'bytes %i-%i/%i' % part
>           headers['Content-Length'] = '%i' % bytes
> !         if request and request.env.wsgi_file_wrapper:
> !             raise HTTP(200, request.env.wsgi_file_wrapper(stream, chunk_size),
> !                        **headers)
> !         else:
> !             raise HTTP(206, streamer(stream, chunk_size=chunk_size,
> !                                      bytes=bytes), **headers)
>       else:
>           try:
>               stream = open(static_file, 'rb')
> ***************
> *** 91,95 ****
>       else:
>           raise HTTP(404)
>       headers['Content-Length'] = fsize
> !     raise HTTP(200, streamer(stream, chunk_size=chunk_size),
> !                **headers)
> --- 95,103 ----
>       else:
>           raise HTTP(404)
>       headers['Content-Length'] = fsize
> !     if request and request.env.wsgi_file_wrapper:
> !         raise HTTP(200, request.env.wsgi_file_wrapper(stream, chunk_size),
> !                    **headers)
> !     else:
> !         raise HTTP(200, streamer(stream, chunk_size=chunk_size),
> !                    **headers)
>
> As an example of the performance gain one can expect on Apache/mod_wsgi,
> the streamer code from web2py can be used in a simple WSGI application
> outside of web2py. I.e.,
>
>     def streamer(stream, chunk_size=10 ** 6, bytes=None):
>         offset = 0
>         while bytes == None or offset < bytes:
>             if bytes != None and bytes - offset < chunk_size:
>                 chunk_size = bytes - offset
>             data = stream.read(chunk_size)
>             length = len(data)
>             if not length:
>                 break
>             else:
>                 yield data
>             if length < chunk_size:
>                 break
>             offset += length
>
>     def application(environ, start_response):
>         status = '200 OK'
>         output = 'Hello World!'
>
>         response_headers = [('Content-type', 'text/plain')]
>         start_response(status, response_headers)
>
>         file = open('/usr/share/dict/words', 'rb')
>         return streamer(file)
>
> With Apache/mod_wsgi using embedded mode, one can get on a recent
> MacBook Pro with a 2.4MB file a rate of about 175 requests/sec
> serialised.
>
> If one uses wsgi.file_wrapper from Apache/mod_wsgi instead, i.e.,
>
>     def application(environ, start_response):
>         status = '200 OK'
>         output = 'Hello World!'
>
>         response_headers = [('Content-type', 'text/plain')]
>         start_response(status, response_headers)
>
>         file = open('/usr/share/dict/words', 'rb')
>         return environ['wsgi.file_wrapper'](file)
>
> With Apache/mod_wsgi using embedded mode, and the same size file, one can
> get a rate of 250 requests/sec.
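For readers outside web2py, the pattern Graham's patches implement (prefer the server's wsgi.file_wrapper when present, otherwise fall back to a chunked generator) can be sketched as one standalone WSGI app. This is a sketch only; the dictionary-file path is just the example path used above:

```python
def streamer(stream, chunk_size=64 * 1024):
    # Portable fallback: yield the file in fixed-size chunks.
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield data

def application(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    stream = open('/usr/share/dict/words', 'rb')
    # wsgi.file_wrapper is optional in PEP 333, so probe for it; under
    # Apache/mod_wsgi it can use memory mapping or sendfile() internally.
    wrapper = environ.get('wsgi.file_wrapper')
    if wrapper is not None:
        return wrapper(stream, 64 * 1024)
    return streamer(stream)
```

The probe with environ.get() is what makes the app portable: on servers that omit the extension, the generator path is used and behavior is unchanged.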
>
> Note here that the web2py streamer code actually uses quite a large chunk
> size of 1MB, which means that having many concurrent requests for a large
> file can see memory usage grow by a noticeable amount.
>
> Under Apache/mod_wsgi, use of a large chunk size actually appears to
> result in worse performance than a smaller chunk size. For example,
> with a 64KB chunk size, one can get about 230 requests/sec.
>
> Unfortunately the sweet spot probably varies depending on the WSGI
> hosting mechanism being used. One suggestion may be to allow the
> default chunk size to be overridden, to allow tuning of this value if
> an application returns file contents a lot.
>
> Do note that the chunk size for wsgi.file_wrapper only comes into play
> if the streamed object isn't an actual file object, and so sending
> can't be optimised, or if Windows is used. Where optimisation is
> possible, Apache/mod_wsgi uses either memory mapping or the sendfile()
> system call to speed things up.
>
> As to performance, if using daemon mode of Apache/mod_wsgi, the
> benefits of using wsgi.file_wrapper are not as marked. This is because
> of the additional socket hop due to proxying data between the Apache
> server child process and the daemon process. You still have this hop
> if using fastcgi or mod_proxy to a backend web2py running with the
> internal WSGI server, so any improvement is still good though.
>
> For daemon mode, the rate for that file is 100 requests/sec. With the
> default web2py streamer and 1MB chunk size it is 80 requests/sec. The
> web2py streamer with a 64KB chunk size comes in better at about 90
> requests/sec.
>
> Now, my test results above don't involve web2py. You have to remember
> that when using that code from inside web2py, you take the additional
> hit from the overhead of web2py, its routing and dispatch, etc.
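The chunk-size trade-off described above can be probed in isolation. Below is a hedged micro-benchmark over an in-memory 2.4MB body; it exercises only the generator overhead, so it will not reproduce the Apache request rates quoted above, but it shows how one might compare candidate chunk sizes before making the default configurable:

```python
import io
import timeit

def streamer(stream, chunk_size):
    # Same shape as web2py's streamer, minus the byte-range handling.
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield data

def drain(size, chunk_size):
    # Consume a size-byte in-memory "file" with the given chunk size.
    for _ in streamer(io.BytesIO(b'\0' * size), chunk_size):
        pass

# Compare the current 1MB default against the suggested 64KB.
for chunk in (1024 * 1024, 64 * 1024):
    seconds = timeit.timeit(lambda: drain(2400 * 1024, chunk), number=20)
    print('chunk=%4dKB: %.3fs' % (chunk // 1024, seconds))
```

As Graham notes, the sweet spot depends on the hosting mechanism, so numbers from a synthetic loop like this are only a starting point for tuning on the actual deployment.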
> Thus, although performance for returning file content may be improved,
> that is only one part of the overall time taken, and may be a minimal
> amount given that database access etc. will likely still predominate.
>
> Anyway, I present all this simply so it can be evaluated whether, for
> the WSGI hosting mechanism, you want to allow wsgi.file_wrapper to be
> used if available. The feature would only get used if the WSGI hosting
> mechanism provided it. I would perhaps suggest, though, that if it is
> supported and made the default, you allow the user to disable use of
> it. This is because some WSGI hosting mechanisms may not implement
> wsgi.file_wrapper properly. The particular area of concern would be
> web2py's byte-range support. Apache/mod_wsgi will pay attention to
> Content-Length in the response and only send that amount of data from
> the file, but other WSGI hosting mechanisms may not, and will send all
> file content from the seek point to the end of the file. This could
> result in more data than intended being returned for a byte-range
> request.
>
> If overly concerned, you might make the ability to use
> wsgi.file_wrapper off by default, but allow the user to turn it on.
> This way, if a specific user finds it actually helps for their specific
> application because of heavy streaming of files, they can turn it on.
>
> Even if not interested in wsgi.file_wrapper, I would suggest you have
> another look at the 1MB chunk size you are currently using and see
> whether that is appropriate for all platforms. You might want to make
> the default globally configurable. Certainly on Mac OS X Snow Leopard
> under Apache/mod_wsgi, a block size of 64KB performs better than the
> current default of 1MB.
>
> Graham

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "web2py-users" group.
To post to this group, send email to web2py@googlegroups.com
To unsubscribe from this group, send email to web2py+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/web2py?hl=en
-~----------~----~----~----~------~----~------~--~---
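Picking up Graham's byte-range warning: a conservative policy is to hand the stream to wsgi.file_wrapper only for full-file 200 responses, keep a bounded generator for 206 range replies, and expose a switch to disable the wrapper entirely. A sketch under those assumptions (choose_body and use_file_wrapper are illustrative names, not web2py settings):

```python
def streamer(stream, chunk_size=64 * 1024, bytes=None):
    # Bounded fallback generator (modelled on web2py's streamer): stop
    # after `bytes` bytes so a 206 response never overruns its range.
    # (`bytes` shadows the builtin to mirror web2py's own signature.)
    offset = 0
    while bytes is None or offset < bytes:
        if bytes is not None and bytes - offset < chunk_size:
            chunk_size = bytes - offset
        data = stream.read(chunk_size)
        if not data:
            break
        yield data
        offset += len(data)

def choose_body(environ, stream, status, chunk_size=64 * 1024,
                use_file_wrapper=True, bytes=None):
    # Use wsgi.file_wrapper only when the server offers it, the user
    # has not disabled it, and the response is a full file (status 200):
    # a host that ignores Content-Length would otherwise send data past
    # the requested byte range.
    wrapper = environ.get('wsgi.file_wrapper')
    if use_file_wrapper and wrapper is not None and status == 200:
        return wrapper(stream, chunk_size)
    return streamer(stream, chunk_size, bytes)
```

With a policy like this, a misbehaving wsgi.file_wrapper implementation can never corrupt a range response, and the use_file_wrapper flag gives users the opt-out (or opt-in) Graham suggests.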