PEP 444 has no champion currently. Both Armin and I have basically left it behind. It would be great if you wanted to be its champion.
- C

On Sun, 2010-11-21 at 03:12 -0800, Alice Bevan-McGregor wrote:
> (A version of this is available at http://web-core.org/2.0/pep-0444/;
> links are links, code may be easier to read.)
>
> PEP 444 is quite exciting to me. So much so that I’ve been spending a few
> days writing a high-performance (C10K, 10Krsec) Py2.6+/3.1+ HTTP/1.1 server
> which implements much of the proposed standard. The server is functional
> (less web3.input at the time of this writing), but differs from PEP 444 in
> several ways. It also adds several features I feel should be part of the
> spec.
>
> Source for the server is available on GitHub:
>
> https://github.com/pulp/marrow.server.http
>
> I have made several notes about the PEP 444 specification during
> implementation of the above, and have concerns over some implementation
> details:
>
> First, async is poorly defined:
>
> > If the origin server advertises that it has the web3.async capability, a
> > Web3 application callable used by the server is permitted to return a
> > callable that accepts no arguments. When it does so, this callable is to be
> > called periodically by the origin server until it returns a non-None
> > response, which must be a normal Web3 response tuple.
>
> Polling is not true async. I believe that it should be up to the server to
> define how async is utilized, and that the specification should be clarified
> on this point. (“Called periodically” is too vague.) “Callable” should
> likely be redefined as “generator” (a callable that yields), as most
> applications require holding on to state, and wrapping everything in
> functools.partial() is somewhat ugly. Utilizing generators would improve
> support for existing Python async frameworks, and allow four modes of
> operation: yield None (no response, keep waiting), yield response_tuple
> (standard response), return / raise StopIteration (close the async
> connection), and allow for data to be passed back to the async callable by
> the higher-level async framework.
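
For illustration, here is a minimal sketch of the generator-based protocol described in the quoted paragraph above. It is only one possible reading of the four modes: slow_app and drive are hypothetical names, the events list stands in for whatever data an async framework would pass back, and none of this is taken from PEP 444 or marrow.server.http. A real server would resume the generator from its event loop rather than from a plain for loop.

```python
def slow_app(environ):
    """Hypothetical async application written as a generator."""
    # Not ready yet: yield None to tell the server to keep waiting.
    # The server may pass data back in when it resumes the generator.
    received = yield None
    body = ("got: %s" % received).encode("ascii")
    # Ready: yield a normal (status, headers, body) response tuple.
    yield (b"200 OK", [(b"Content-Type", b"text/plain")], [body])
    # Falling off the end raises StopIteration: close the connection.


def drive(app, environ, events):
    """Hypothetical server-side driver (an assumption, not a spec).

    `events` stands in for the data the async framework sends back
    to the application each time it resumes it.
    """
    responses = []
    gen = app(environ)
    try:
        result = next(gen)            # run to the first yield
        for event in events:
            if result is not None:
                responses.append(result)
            result = gen.send(event)  # pass data back into the app
        if result is not None:
            responses.append(result)
    except StopIteration:
        pass                          # generator finished: connection closed
    return responses
```

Calling drive(slow_app, {}, ["ping"]) collects the single response tuple once the application has received its data. With no events the application never gets past its first yield None, so nothing is collected.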
>
> Second, WSGI middleware, while impressive in capability, are somewhat…
> heavy-weight. Heavily nesting function calls is wasteful of CPU and RAM,
> especially if the middleware decides it can’t operate, for example, GZip
> compression disabling itself for non-text/* mimetypes. The majority of WSGI
> middleware can, and probably should be, implemented as linear ingress or
> egress filters. For example, on-disk static file serving could be an ingress
> filter, and GZip compression an egress filter. m.s.http supports this
> filtering and demonstrates one API for such. Also, I am in the process of
> writing an example egress CompressionFilter.
>
> An example API and filter usage (paraphrased from marrow.server.http):
>
> > # With no filters registered, these loops add near-zero overhead.
> > for filter_ in ingress_filters:
> >     # Can mutate the environment.
> >     result = filter_(env)
> >
> >     # Allow the filter to return a response rather than continuing.
> >     if result:
> >         # result is a (status, headers, body_iter) tuple
> >         return result[0], result[1], result[2]
> >
> > status, headers, body = application(env)
> >
> > for filter_ in egress_filters:
> >     # Can mutate the environment, status, headers, body, or
> >     # return completely new status, headers, and body.
> >     status, headers, body = filter_(env, status, headers, body)
> >
> > return status, headers, body
>
> The environment has some minor issues. I’ll write up my changes in RFC-style:
>
> SERVER_NAME is REQUIRED and MUST contain the DNS name of the server OR
> virtual server name for the web server if available OR an empty bytestring if
> DNS resolution is unavailable. SERVER_ADDR is REQUIRED and MUST contain the
> web server’s bound IP address. URL reconstruction SHOULD use HTTP_HOST if
> available, SERVER_NAME if there is no HTTP_HOST, and fall back on SERVER_ADDR
> if SERVER_NAME is an empty bytestring.
>
> CONTENT_LENGTH is REQUIRED and MUST be None if not defined by the client.
> Testing explicitly for None is more efficient than armoring against missing
> values; also, explicit is better than implicit. (Paste’s WSGI1 server
> defines CONTENT_LENGTH as 0, but this implies the client explicitly declared
> it as zero, which is not the case.)
>
> FRAGMENT and PARAMETERS are REQUIRED and are parsed out of the URL in the
> same way as the QUERY_STRING. FRAGMENT is the text after a hash mark (a.k.a.
> “anchor” to browsers, e.g. /foo#bar). PARAMETERS come before QUERY_STRING,
> and after PATH_INFO separated by a semicolon, e.g. /foo;bar?baz. Both values
> MUST be empty bytestrings if not present in the URL. (Rarely used; I’ve only
> seen it in Java and ColdFusion applications, but still useful.)
>
> Points of contention:
>
> Changing the namespace seems needless. Using the wsgi.* namespace with a
> wsgi.version of (2, 0) will allow applications to easily armor themselves
> against incompatible use. That’s what wsgi.version is for! I’d add this as
> a strong “point of contention”. m.s.http keeps the wsgi namespace and uses a
> version of (2, 0).
>
> That’s it so far. I may occasionally write in with additional ideas as I
> continue with my HTTP server implementation.
>
> - Alice.
>
> _______________________________________________
> Web-SIG mailing list
> Web-SIG@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/chrism%40plope.com
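
To make the environment proposals in the quoted message concrete, here is a rough sketch of the suggested URL handling. The function names split_request_uri and pick_host are hypothetical, and the use of native-string keys with bytestring values is an assumption made for the example. Splitting PARAMETERS at the first semicolon is a simplification of the rule described above.

```python
def split_request_uri(uri):
    """Split a request URI (bytes) into the four proposed keys.

    Hypothetical helper: missing components come back as empty
    bytestrings, as the quoted proposal requires.
    """
    rest, _, fragment = uri.partition(b"#")   # FRAGMENT: text after '#'
    rest, _, query = rest.partition(b"?")     # QUERY_STRING: text after '?'
    path, _, params = rest.partition(b";")    # PARAMETERS: text after ';'
    return {
        "PATH_INFO": path,
        "PARAMETERS": params,
        "QUERY_STRING": query,
        "FRAGMENT": fragment,
    }


def pick_host(environ):
    """Choose the host for URL reconstruction.

    Order taken from the quoted proposal: HTTP_HOST if available, then
    SERVER_NAME, then SERVER_ADDR when SERVER_NAME is an empty bytestring.
    """
    return (
        environ.get("HTTP_HOST")
        or environ["SERVER_NAME"]
        or environ["SERVER_ADDR"]
    )
```

Because SERVER_NAME and SERVER_ADDR are REQUIRED in the proposal, the sketch indexes them directly and only treats HTTP_HOST as optional.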