On Sat, Mar 28, 2009 at 2:53 AM, Graham Dumpleton <graham.dumple...@gmail.com> wrote:
> 2009/3/28 Mark Ramm <mark.mchristen...@gmail.com>:
>> My thought is that we should do a couple things to the wsgi standard,
>> and then if anything like the lifecycle methods gets addressed, it should
>> be pushed into a "container" standard or something.
>>
>> I think Robert Brewer's WSGI Service Bus proposal that he made a
>> couple years ago at PyCon needs a new name, but it does provide a good
>> start on the lifecycle stuff.
>
> From memory, my concern over that specification was that it sort of
> assumed that applications were all preloaded. I am not sure how well
> it would work where lazy loading is performed and where there are
> multiple WSGI applications running in an interpreter but where they
> weren't themselves mounted within a WSGI application, but through
> external mechanisms dictated by the WSGI hosting mechanism.
>
>> As for WSGI itself, we should make a couple of smaller changes which I
>> think will likely be a bit easier to quantify and agree on. I'm sure
>> lots more folks from yesterday's discussion will chip in here, but
>> this is my take on the things we discussed.
>>
>> 1) We should drop the start_response callable, and return a three
>> member tuple from the wsgi callable:
>>
>>     def wsgi2app(environ):
>>         ....
>>         return (status_code, headers, response_iterator)
>>
>> 2) We should turn wsgi.input into an iterator rather than a somewhat
>> file-like object. WSGI middleware that reads part of the wsgi.input
>> iterator should make sure to restore it using itertools.chain or
>> replace it with whatever. If there's a content length specified from
>> the server the middleware should be responsible for maintaining or
>> deleting that information as necessary. Content length of 0 is
>> allowed and means there's no data, whereas an unspecified content
>> length indicates that the value is unknown.
>> This will create a good
>> symmetry between the input and output methods, and seems like a good
>> compromise between flexibility for middleware creators, and ease of
>> use for consumers.
>
> The problem with an iterator/generator is how do you control the size
> of the chunks of data returned. An iterator also probably isn't going
> to make chunked request content any easier to handle.
>
> It may be easier to change how people use the wsgi.input that exists
> now. First off allow one to say:
>
>   wsgi.input.read()
>
> to get all input, rather than passing CONTENT_LENGTH as argument.
>
> To consume all data in chunks until exhausted, require a proper eof
> indicator in the form of an empty string read, then one can say:
>
>   s = wsgi.input.read(BLOCKSIZE)
>   while s:
>     # do something with 's'
>     s = wsgi.input.read(BLOCKSIZE)
>
> That way you don't have to muck around with checking how much you have read.
>
> This does require that an exception be raised if client closes
> connection before all data expected was read.
>
> The question thus is, what would be the actual benefits of changing to
> an iterator/generator.
>
>> 3) The server should encode the headers and include explicit
>> information about the encoding in the wsgi environ variable. So that
>> any assumptions about what the bytes in the headers represent are made
>> explicit.
>
> That could be fun. For Apache/mod_wsgi at least you are in control of
> the conversion. In Python 3.0 and CGI/WSGI the os.environ variables
> are already unicode strings because they were converted by Python. How
> this is done varies between UNIX and Windows platforms.
>
>> I think we're all very sold on item 1, and items 2 and 3 require more
>> thinking, but seemed reasonable to those present at the discussion
>> this afternoon. Hopefully we'll be meeting again on Saturday and
>> will be able to continue to think through this stuff and push this all
>> forward some more.
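
A rough sketch of how item 1 and the read-until-empty loop quoted above might fit together. This is only an illustration under stated assumptions, not what any server does today: the BLOCKSIZE value and the as_wsgi1 adapter are made up for the example, and the empty-string EOF from wsgi.input.read() plus an IOError on client disconnect are the behaviours being proposed in this thread, not something the current spec guarantees.

```python
import io

BLOCKSIZE = 8192  # assumed chunk size; the thread does not fix a value


def wsgi2app(environ):
    """Proposed style: no start_response, return a three member tuple."""
    stream = environ["wsgi.input"]
    total = 0
    try:
        # Read until an empty result signals end of input.
        chunk = stream.read(BLOCKSIZE)
        while chunk:
            total += len(chunk)
            chunk = stream.read(BLOCKSIZE)
    except IOError:
        # Assumption: the server raises IOError if the client went away
        # before all expected data was read.
        headers = [("Content-Type", "text/plain")]
        return ("400 Bad Request", headers, [b"client went away\n"])
    body = ("read %d bytes\n" % total).encode("ascii")
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    return ("200 OK", headers, [body])


def as_wsgi1(app):
    """Hypothetical adapter so a start_response style server can host the app."""
    def wrapper(environ, start_response):
        status, headers, body = app(environ)
        start_response(status, headers)
        return body
    return wrapper


if __name__ == "__main__":
    # io.BytesIO stands in for the request body stream a server would supply.
    environ = {"wsgi.input": io.BytesIO(b"x" * 20000)}
    status, headers, body = wsgi2app(environ)
    print(status, b"".join(body))
```

The adapter shows why dropping start_response costs little: the tuple carries everything the callable used to receive, so wrapping in either direction stays a few lines.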
>>
>> I'm sure there will also be several other minor tweaks to the spec like:
>
> Yeah, like defining how wsgi.file_wrapper should behave where response
> Content-Length is defined but wrapped file actually provides more
> content than that.
>
>> * Not decoding encoded slashes in path strings, so that
>> applications can tell the difference between path separators and
>> encoded slashes.
>
> When sitting on top of Apache, whether it be mod_wsgi, fastcgi, scgi,
> ajp or CGI, you don't really have much choice, you get what Apache
> gives you.
Which is fine, I guess, but it does make it impossible to tell the
difference between real slashes and encoded ones in WSGI application
code. I would love it if there were some way around that.

>> * adding a "ClientWentAway" exception that indicates that wsgi.input
>> has not been officially exhausted, but that the client went away before
>> wsgi.input was fully populated.
>
> The problem with an exception is what namespace do you put it in. You
> almost need to have the type as part of the WSGI environment. You may
> just be better standardising it by saying that an IOError must be
> raised and leave it at that. At the moment most stuff doesn't even pay
> attention to the fact that an exception could occur for some WSGI
> adapters.

That would be fine with me. The issue is definitely not the exception
name, but the fact that one can be raised/caught in a standard way.

_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com