I haven't read what you have done yet, but if you haven't done so already, make sure you read:
http://bitbucket.org/ianb/wsgi-peps/src/

This is Ian's and Armin's previous go at a new specification. It tried
to go further than what you are doing, though.

Also read:

http://blog.dscpl.com.au/2009/09/roadmap-for-python-wsgi-specification.html

I explain what I mean by native strings in that.

Graham

On 15 April 2010 22:54, Dirkjan Ochtman <dirk...@ochtman.nl> wrote:
> Mostly taking Graham's list of issues and incorporating it into PEP 333.
>
> Latest revision: http://hg.xavamedia.nl/peps/file/tip/wsgi-1.1.txt
>
> Let's have comments here (comments in the form of diffs are
> particularly welcome, of course). Remember, the idea is not to change
> or improve WSGI right now, but only to improve the spec, improving
> interoperability and enabling Python 3 support.
>
> Graham, I hope I did a good job with your suggestions. (Since so much
> of this is yours, I've just listed you as the second author.) I tried
> to clarify exactly what you meant by "native strings", can you check
> that out?
>
> Cheers,
>
> Dirkjan
>
> --- pep-0333.txt 2010-04-15 14:46:02.000000000 +0200
> +++ wsgi-1.1.txt 2010-04-15 14:51:39.000000000 +0200
> @@ -1,114 +1,124 @@
> -PEP: 333
> -Title: Python Web Server Gateway Interface v1.0
> +PEP: 0000
> +Title: Python Web Server Gateway Interface 1.1
>  Version: $Revision$
>  Last-Modified: $Date$
> -Author: Phillip J. Eby <p...@telecommunity.com>
> +Author: Dirkjan Ochtman <dirk...@ochtman.nl>,
> +        Graham Dumpleton <graham.dumple...@gmail.com>
>  Discussions-To: Python Web-SIG <web-sig@python.org>
>  Status: Draft
>  Type: Informational
>  Content-Type: text/x-rst
> -Created: 07-Dec-2003
> -Post-History: 07-Dec-2003, 08-Aug-2004, 20-Aug-2004, 27-Aug-2004
> +Created: 15-04-2010
> +Post-History: Not yet
>
>
>  Abstract
>  ========
>
> -This document specifies a proposed standard interface between web
> -servers and Python web applications or frameworks, to promote web
> -application portability across a variety of web servers.
> +This document specifies a revision of the proposed standard interface
> +between web servers and Python web applications or frameworks, to
> +promote web application portability across a variety of web servers.
>
>
>  Rationale and Goals
>  ===================
>
> -Python currently boasts a wide variety of web application frameworks,
> -such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web -- to
> -name just a few [1]_. This wide variety of choices can be a problem
> -for new Python users, because generally speaking, their choice of web
> -framework will limit their choice of usable web servers, and vice
> -versa.
> -
> -By contrast, although Java has just as many web application frameworks
> -available, Java's "servlet" API makes it possible for applications
> -written with any Java web application framework to run in any web
> -server that supports the servlet API.
> -
> -The availability and widespread use of such an API in web servers for
> -Python -- whether those servers are written in Python (e.g. Medusa),
> -embed Python (e.g. mod_python), or invoke Python via a gateway
> -protocol (e.g. CGI, FastCGI, etc.) -- would separate choice of
> -framework from choice of web server, freeing users to choose a pairing
> -that suits them, while freeing framework and server developers to
> -focus on their preferred area of specialization.
> -
> -This PEP, therefore, proposes a simple and universal interface between
> -web servers and web applications or frameworks: the Python Web Server
> -Gateway Interface (WSGI).
> -
> -But the mere existence of a WSGI spec does nothing to address the
> -existing state of servers and frameworks for Python web applications.
> -Server and framework authors and maintainers must actually implement
> -WSGI for there to be any effect.
> -
> -However, since no existing servers or frameworks support WSGI, there
> -is little immediate reward for an author who implements WSGI support.
> -Thus, WSGI **must** be easy to implement, so that an author's initial
> -investment in the interface can be reasonably low.
> -
> -Thus, simplicity of implementation on *both* the server and framework
> -sides of the interface is absolutely critical to the utility of the
> -WSGI interface, and is therefore the principal criterion for any
> -design decisions.
> -
> -Note, however, that simplicity of implementation for a framework
> -author is not the same thing as ease of use for a web application
> -author. WSGI presents an absolutely "no frills" interface to the
> -framework author, because bells and whistles like response objects and
> -cookie handling would just get in the way of existing frameworks'
> -handling of these issues. Again, the goal of WSGI is to facilitate
> -easy interconnection of existing servers and applications or
> -frameworks, not to create a new web framework.
> -
> -Note also that this goal precludes WSGI from requiring anything that
> -is not already available in deployed versions of Python. Therefore,
> -new standard library modules are not proposed or required by this
> -specification, and nothing in WSGI requires a Python version greater
> -than 2.2.2. (It would be a good idea, however, for future versions
> -of Python to include support for this interface in web servers
> -provided by the standard library.)
> -
> -In addition to ease of implementation for existing and future
> -frameworks and servers, it should also be easy to create request
> -preprocessors, response postprocessors, and other WSGI-based
> -"middleware" components that look like an application to their
> -containing server, while acting as a server for their contained
> -applications.
> -
> -If middleware can be both simple and robust, and WSGI is widely
> -available in servers and frameworks, it allows for the possibility
> -of an entirely new kind of Python web application framework: one
> -consisting of loosely-coupled WSGI middleware components. Indeed,
> -existing framework authors may even choose to refactor their
> -frameworks' existing services to be provided in this way, becoming
> -more like libraries used with WSGI, and less like monolithic
> -frameworks. This would then allow application developers to choose
> -"best-of-breed" components for specific functionality, rather than
> -having to commit to all the pros and cons of a single framework.
> -
> -Of course, as of this writing, that day is doubtless quite far off.
> -In the meantime, it is a sufficient short-term goal for WSGI to
> -enable the use of any framework with any server.
> -
> -Finally, it should be mentioned that the current version of WSGI
> -does not prescribe any particular mechanism for "deploying" an
> -application for use with a web server or server gateway. At the
> -present time, this is necessarily implementation-defined by the
> -server or gateway. After a sufficient number of servers and
> -frameworks have implemented WSGI to provide field experience with
> -varying deployment requirements, it may make sense to create
> -another PEP, describing a deployment standard for WSGI servers and
> -application frameworks.
> +WSGI 1.0, specified in PEP 333, did a great job in making it easier
> +for web applications and web servers to interface with each other.
> +It has become very much the standard it was meant to be and an
> +important part of the Python web development infrastructure.
> +
> +After several implementations were built by different developers,
> +it inevitably turned out that the specification wasn't perfect. It
> +left out some details that were implemented by all the web server
> +interfaces because they were critical for many applications (or
> +application frameworks). Additionally, the specification was written
> +before Python 3.x was specified, resulting in a lack of clear
> +specification on what to do with unicode strings.
> +
> +While there are some ideas around to improve WSGI further in less
> +compatible ways, we feel that there is value to be had in first
> +specifying a minor revision of the specification, which is largely
> +compatible with existing implementations. Further simplification
> +and experimentation are therefore deferred to a 2.0 version.
> +
> +
> +Differences with WSGI 1.0
> +=========================
> +
> +Descriptive changes
> +-------------------
> +
> +The following changes were made to realign the spec with
> +implementations 'in the wild'.
> +
> +1. The 'readline()' function of 'wsgi.input' must optionally take
> +   a size hint. This is required because many applications use
> +   cgi.FieldStorage, which uses this functionality.
> +
> +2. The 'wsgi.input' functions for reading input must return an empty
> +   string as end of input stream marker. This is required for support
> +   of HTTP 1.1 request pipelining. A correctly implemented WSGI
> +   middleware already has to cope with an empty string as end
> +   sentinel anyway to detect premature end of input.
> +
> +3. Any WSGI application or middleware should not itself return, or
> +   consume from a wrapped WSGI component, more data than specified by
> +   the Content-Length response header if defined. Middleware that
> +   does this is arguably broken and can generate incorrect data.
> +   This is just a clarification of obligations.
> +
> +4. The WSGI adapter must not pass on to the server any data above
> +   what the Content-Length response header defines, if supplied.
> +   Doing this is technically a violation of HTTP. This is another
> +   clarification of obligations.
> +
> +
> +String handling changes
> +-----------------------
> +
> +The following changes were made to make WSGI work on Python 3.x.
> +
> +1. The application is passed an instance of a Python dictionary
> +   containing what is referred to as the WSGI environment. All keys
> +   in this dictionary are native strings. For CGI variables, all names
> +   are going to be ISO-8859-1 and so where native strings are
> +   unicode strings, that encoding is used for the names of CGI
> +   variables.
> +
> +2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI
> +   environment, the value of the variable should be a native string.
> +
> +3. For the CGI variables contained in the WSGI environment, the values
> +   of the variables are native strings. Where native strings are
> +   unicode strings, ISO-8859-1 encoding would be used such that the
> +   original character data is preserved and as necessary the unicode
> +   string can be converted back to bytes and thence decoded to unicode
> +   again using a different encoding.
> +
> +4. The WSGI input stream 'wsgi.input' contained in the WSGI environment
> +   and from which request content is read, should yield byte strings.
> +
> +5. The status line specified by the WSGI application should be a byte
> +   string. Where native strings are unicode strings, the native string
> +   type can also be returned in which case it would be encoded as
> +   ISO-8859-1.
> +
> +6. The list of response headers specified by the WSGI application should
> +   contain tuples consisting of two values, where each value is a byte
> +   string. Where native strings are unicode strings, the native string
> +   type can also be returned in which case it would be encoded as
> +   ISO-8859-1.
> +
> +7. The iterable returned by the application and from which response
> +   content is derived, should yield byte strings. Where native strings
> +   are unicode strings, the native string type can also be returned in
> +   which case it would be encoded as ISO-8859-1.
> +
> +8. The value passed to the 'write()' callback returned by
> +   'start_response()' should be a byte string. Where native strings
> +   are unicode strings, a native string type can also be supplied, in
> +   which case it would be encoded as ISO-8859-1.
>
>
>  Specification Overview
> @@ -447,6 +457,13 @@
>  Streaming`_ section below for more on how application output must be
>  handled.)
>
> +Further on, several places specify constraints upon string types used
> +in the WSGI API. The term native string is used to mean the 'str' class
> +in both Python 2.x and 3.x. The spec tries to ensure optimal
> +compatibility and ease of use by allowing implementations running on
> +Python 3.x to encode strings (which are Unicode strings with no
> +specified encoding) as ISO-8859-1 where a 3.x string is passed in.
> +
>  The server or gateway should treat the yielded strings as binary byte
>  sequences: in particular, it should ensure that line endings are
>  not altered. The application is responsible for ensuring that the
> @@ -489,12 +506,22 @@
>  ``environ`` Variables
>  ---------------------
>
> +All keys in this dictionary are native strings. For CGI variables,
> +all names are going to be ISO-8859-1 and so where native strings are
> +unicode strings, that encoding is used for the names of CGI variables.
> +
>  The ``environ`` dictionary is required to contain these CGI
>  environment variables, as defined by the Common Gateway Interface
>  specification [2]_. The following variables **must** be present,
>  unless their value would be an empty string, in which case they
>  **may** be omitted, except as otherwise noted below.
>
> +The values for CGI variables are native strings. Where native strings
> +are unicode strings, ISO-8859-1 encoding would be used such that the
> +original character data is preserved and as necessary the unicode
> +string can be converted back to bytes and thence decoded to unicode
> +again using a different encoding.
> +
>  ``REQUEST_METHOD``
>    The HTTP request method, such as ``"GET"`` or ``"POST"``. This
>    cannot ever be an empty string, and so is always required.
> @@ -575,13 +602,14 @@
>  ===================== ===============================================
>  Variable              Value
>  ===================== ===============================================
> -``wsgi.version``      The tuple ``(1,0)``, representing WSGI
> +``wsgi.version``      The tuple ``(1, 0)``, representing WSGI
>                        version 1.0.
>
>  ``wsgi.url_scheme``   A string representing the "scheme" portion of
>                        the URL at which the application is being
>                        invoked. Normally, this will have the value
> -                      ``"http"`` or ``"https"``, as appropriate.
> +                      ``"http"`` or ``"https"``, as appropriate. The
> +                      value is a native string.
>
>  ``wsgi.input``        An input stream (file-like object) from which
>                        the HTTP request body can be read. (The server
> @@ -646,7 +674,7 @@
>  Method               Stream      Notes
>  ===================  ==========  ========
>  ``read(size)``       ``input``   1
> -``readline()``       ``input``   1,2
> +``readline(hint)``   ``input``   1,2
>  ``readlines(hint)``  ``input``   1,3
>  ``__iter__()``       ``input``
>  ``flush()``          ``errors``  4
> @@ -661,11 +689,12 @@
>     ``Content-Length``, and is allowed to simulate an end-of-file
>     condition if the application attempts to read past that point.
>     The application **should not** attempt to read more data than is
> -   specified by the ``CONTENT_LENGTH`` variable.
> +   specified by the ``CONTENT_LENGTH`` variable. All read functions
> +   are required to return an empty string as the end of input stream
> +   marker. They must yield byte strings.
>
> -2. The optional "size" argument to ``readline()`` is not supported,
> -   as it may be complex for server authors to implement, and is not
> -   often used in practice.
> +2. The optional "size" argument to ``readline()`` is required for
> +   the implementer, but optional for callers.
>
>  3. Note that the ``hint`` argument to ``readlines()`` is optional for
>     both caller and implementer. The application is free not to
> @@ -692,12 +721,15 @@
>  ---------------------------------
>
>  The second parameter passed to the application object is a callable
> -of the form ``start_response(status,response_headers,exc_info=None)``.
> +of the form ``start_response(status, response_headers, exc_info=None)``.
>  (As with all WSGI callables, the arguments must be supplied
>  positionally, not by keyword.) The ``start_response`` callable is
>  used to begin the HTTP response, and it must return a
>  ``write(body_data)`` callable (see the `Buffering and Streaming`_
> -section, below).
> +section, below). Values passed to the ``write(body_data)`` callable
> +should be byte strings. Where native strings are unicode strings, a
> +native strings type can also be supplied, in which case it would be
> +encoded as ISO-8859-1.
>
>  The ``status`` argument is an HTTP "status" string like ``"200 OK"``
>  or ``"404 Not Found"``. That is, it is a string consisting of a
> @@ -705,14 +737,20 @@
>  single space, with no surrounding whitespace or other characters.
>  (See RFC 2616, Section 6.1.1 for more information.) The string
>  **must not** contain control characters, and must not be terminated
> -with a carriage return, linefeed, or combination thereof.
> +with a carriage return, linefeed, or combination thereof. This
> +value should be a byte string. Where native strings are unicode
> +strings, the native string type can also be returned, in which
> +case it would be encoded as ISO-8859-1.
>
>  The ``response_headers`` argument is a list of ``(header_name,
>  header_value)`` tuples. It must be a Python list; i.e.
> -``type(response_headers) is ListType``, and the server **may** change
> +``type(response_headers) is list``, and the server **may** change
>  its contents in any way it desires. Each ``header_name`` must be a
>  valid HTTP header field-name (as defined by RFC 2616, Section 4.2),
> -without a trailing colon or other punctuation.
> +without a trailing colon or other punctuation. Both the header_name
> +and the header_value should be byte strings. Where native strings
> +are unicode strings, the native string type can also be returned,
> +in which case it would be encoded as ISO-8859-1.
>
>  Each ``header_value`` **must not** include *any* control characters,
>  including carriage returns or linefeeds, either embedded or at the end.
> @@ -809,6 +847,14 @@
>  Handling the ``Content-Length`` Header
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> +If an application or middleware layer chooses to return a
> +Content-Length header, it should not return more data than specified
> +by the header value. Any wrapping middleware layer should not
> +consume more data than specified in the header value from the
> +wrapped component (either middleware or application). Any WSGI
> +adapter must similarly not pass on data above what the
> +Content-Length response header value defines.
> +
>  If the application does not supply a ``Content-Length`` header, a
>  server or gateway may choose one of several approaches to handling
>  it. The simplest of these is to close the client connection when
> @@ -1569,55 +1615,13 @@
>  developers.
>
>
> -Proposed/Under Discussion
> -=========================
> -
> -These items are currently being discussed on the Web-SIG and elsewhere,
> -or are on the PEP author's "to-do" list:
> -
> -* Should ``wsgi.input`` be an iterator instead of a file? This would
> -  help for asynchronous applications and chunked-encoding input
> -  streams.
> -
> -* Optional extensions are being discussed for pausing iteration of an
> -  application's ouptut until input is available or until a callback
> -  occurs.
> -
> -* Add a section about synchronous vs. asynchronous apps and servers,
> -  the relevant threading models, and issues/design goals in these
> -  areas.
> -
> -
>  Acknowledgements
>  ================
>
> -Thanks go to the many folks on the Web-SIG mailing list whose
> -thoughtful feedback made this revised draft possible. Especially:
> +Thanks go to many folks on the Web-SIG mailing list for helping the work
> +on clarifying and improving this specification. In particular:
>
> -* Gregory "Grisha" Trubetskoy, author of ``mod_python``, who beat up
> -  on the first draft as not offering any advantages over "plain old
> -  CGI", thus encouraging me to look for a better approach.
> -
> -* Ian Bicking, who helped nag me into properly specifying the
> -  multithreading and multiprocess options, as well as badgering me to
> -  provide a mechanism for servers to supply custom extension data to
> -  an application.
> -
> -* Tony Lownds, who came up with the concept of a ``start_response``
> -  function that took the status and headers, returning a ``write``
> -  function. His input also guided the design of the exception handling
> -  facilities, especially in the area of allowing for middleware that
> -  overrides application error messages.
> -
> -* Alan Kennedy, whose courageous attempts to implement WSGI-on-Jython
> -  (well before the spec was finalized) helped to shape the "supporting
> -  older versions of Python" section, as well as the optional
> -  ``wsgi.file_wrapper`` facility.
> -
> -* Mark Nottingham, who reviewed the spec extensively for issues with
> -  HTTP RFC compliance, especially with regard to HTTP/1.1 features that
> -  I didn't even know existed until he pointed them out.
> -
> +* Phillip J. Eby, for writing/editing the 1.0 specification.
>
>  References
>  ==========
> @@ -1643,8 +1647,6 @@
>
>  This document has been placed in the public domain.
>
> -
> -
>  ..
>     Local Variables:
>     mode: indented-text
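
As a rough illustration of the "native string" rules in the quoted
"String handling changes" list, here is a minimal Python 3 sketch. The
helper names (to_native, recover_text, to_wire) are made up for this
example and appear neither in the draft nor in any WSGI server, and the
use of UTF-8 on the application side is only an assumption.

```python
# Sketch only: these helpers are hypothetical and not part of the draft.


def to_native(raw: bytes) -> str:
    """Decode raw CGI/header bytes as ISO-8859-1, as a Python 3 server might."""
    return raw.decode("iso-8859-1")


def recover_text(native: str, encoding: str = "utf-8") -> str:
    """Turn a native string back into its original bytes, then decode those
    bytes with the encoding the application actually expects (assumed UTF-8)."""
    return native.encode("iso-8859-1").decode(encoding)


def to_wire(value) -> bytes:
    """Accept either a byte string or a native string for status, headers
    and body; native strings are encoded as ISO-8859-1."""
    if isinstance(value, bytes):
        return value
    return value.encode("iso-8859-1")


# Bytes as a client using UTF-8 might send them in a request path.
raw_path = "/caf\u00e9".encode("utf-8")

# What the application would see in environ on Python 3.
environ_value = to_native(raw_path)

# The round trip through ISO-8859-1 gives back exactly the original bytes.
assert environ_value.encode("iso-8859-1") == raw_path

# The application can then re-decode with the encoding it really wants.
assert recover_text(environ_value) == "/caf\u00e9"

# Status line supplied as a native string ends up as bytes on the wire.
assert to_wire("200 OK") == b"200 OK"
assert to_wire(b"200 OK") == b"200 OK"
```

The round trip works because ISO-8859-1 maps every byte value to exactly
one code point, so decoding and re-encoding cannot lose data. A native
string containing characters above U+00FF cannot be encoded this way and
would raise UnicodeEncodeError in to_wire.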