Re: [Web-SIG] PEP 444

Chris McDonough Sun, 21 Nov 2010 08:56:18 -0800

PEP 444 has no champion currently.  Both Armin and I have basically left
it behind.  It would be great if you wanted to be its champion.


- C

On Sun, 2010-11-21 at 03:12 -0800, Alice Bevan-McGregor wrote:
> (A version of this is available at http://web-core.org/2.0/pep-0444/; 
> links are clickable there, and the code may be easier to read.)
> 
> PEP 444 is quite exciting to me.  So much so that I’ve been spending a few 
> days writing a high-performance (C10K, 10K requests/sec) Py2.6+/3.1+ HTTP/1.1 server 
> which implements much of the proposed standard.  The server is functional 
> (less web3.input at the time of this writing), but differs from PEP 444 in 
> several ways.  It also adds several features I feel should be part of the 
> spec.
> 
> Source for the server is available on GitHub:
> 
>       https://github.com/pulp/marrow.server.http
> 
> I made several notes about the PEP 444 specification while implementing the 
> above, and have concerns about some implementation details:
> 
> First, async is poorly defined:
> 
> > If the origin server advertises that it has the web3.async capability, a 
> > Web3 application callable used by the server is permitted to return a 
> > callable that accepts no arguments. When it does so, this callable is to be 
> > called periodically by the origin server until it returns a non-None 
> > response, which must be a normal Web3 response tuple.
> 
> Polling is not true async.  I believe that it should be up to the server to 
> define how async is utilized, and that the specification should be clarified 
> on this point.  (“Called periodically” is too vague.)  “Callable” should 
> likely be redefined as “generator” (a callable that yields) as most 
> applications require holding on to state and wrapping everything in 
> functools.partial() is somewhat ugly.  Utilizing generators would improve 
> support for existing Python async frameworks, and allow four modes of 
> operation: yield None (no response, keep waiting), yield response_tuple 
> (standard response), return / raise StopIteration (close the async 
> connection) and allow for data to be passed back to the async callable by the 
> higher-level async framework.
> 
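
A minimal sketch of the generator-based approach proposed above, not code from marrow.server.http. The names slow_app, closing_app, and drive are hypothetical, as is the idea that "events" are plain Python values; a real server would resume the generator from its own event loop rather than from a list.

```python
def slow_app(environ):
    """Hypothetical async application written as a generator."""
    pending = 2  # pretend two backend events must arrive first
    while pending:
        received = yield None  # no response yet, keep waiting
        if received is not None:
            pending -= 1  # data was passed back in by the driver
    yield (b"200 OK", [(b"Content-Type", b"text/plain")], [b"done\n"])


def closing_app(environ):
    """Generator that finishes without a response: close the connection."""
    return
    yield  # unreachable; only marks this function as a generator


def drive(gen, events=(), max_steps=1000):
    """Toy driver: resume gen until it yields a response tuple.

    Returns the response, or None if the generator finished first
    (return / StopIteration), which the proposal treats as "close".
    """
    events = iter(events)
    value = None  # a fresh generator must be resumed with None first
    for _ in range(max_steps):
        try:
            result = gen.send(value)
        except StopIteration:
            return None
        if result is not None:
            return result
        value = next(events, None)  # pass data back on the next resume
    raise RuntimeError("application never produced a response")
```

Here drive(slow_app({}), events=["a", "b"]) returns the response tuple after two resumptions, and drive(closing_app({})) returns None.
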
> Second, WSGI middleware, while impressive in capability, are somewhat… 
> heavy-weight.  Heavily nesting function calls is wasteful of CPU and RAM, 
> especially if the middleware decides it can’t operate, for example, GZip 
> compression disabling itself for non-text/* MIME types.  The majority of WSGI 
> middleware can, and probably should be, implemented as linear ingress or 
> egress filters.  For example, on-disk static file serving could be an ingress 
> filter, and GZip compression an egress filter.  m.s.http supports this 
> filtering and demonstrates one API for such.  Also, I am in the process of 
> writing an example egress CompressionFilter.
> 
> An example filter API and its use (paraphrased from marrow.server.http):
> 
> > # With no filters registered, both loops are no-ops (near-zero overhead).
> > for filter_ in ingress_filters:
> >     # Can mutate the environment.
> >     result = filter_(env)
> >     
> >     # Allow the filter to return a response rather than continuing.
> >     if result:
> >         # result is a status, headers, body_iter tuple
> >         return result[0], result[1], result[2]
> > 
> > status, headers, body = application(env)
> > 
> > for filter_ in egress_filters:
> >     # Can mutate the environment, status, headers, body, or
> >     # return completely new status, headers, and body.
> >     status, headers, body = filter_(env, status, headers, body)
> > 
> > return status, headers, body
> 
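
To make the filter pipeline above concrete, here is a runnable sketch with one ingress and one egress filter. It is an illustration only: handle, hello_app, maintenance_filter, and compression_filter are hypothetical names, not the m.s.http API and not the CompressionFilter mentioned above. It assumes str environ keys with bytes values, and it buffers the whole body before compressing, which a real filter would likely avoid.

```python
import gzip


def handle(env, application, ingress_filters=(), egress_filters=()):
    """Run ingress filters, the application, then egress filters."""
    for filter_ in ingress_filters:
        result = filter_(env)
        if result:  # a filter may short-circuit with a full response
            return result
    status, headers, body = application(env)
    for filter_ in egress_filters:
        status, headers, body = filter_(env, status, headers, body)
    return status, headers, body


def hello_app(env):
    """Trivial application used only for this demonstration."""
    return b"200 OK", [(b"Content-Type", b"text/plain")], [b"hello " * 50]


def maintenance_filter(env):
    """Ingress filter: answer /down directly, otherwise continue."""
    if env.get("PATH_INFO") == b"/down":
        headers = [(b"Content-Type", b"text/plain")]
        return b"503 Service Unavailable", headers, [b"down\n"]
    return None


def compression_filter(env, status, headers, body):
    """Egress filter: gzip text/* bodies when the client accepts gzip."""
    content_type = dict(headers).get(b"Content-Type", b"")
    accepts = env.get("HTTP_ACCEPT_ENCODING", b"")
    if not content_type.startswith(b"text/") or b"gzip" not in accepts:
        return status, headers, body  # pass through untouched
    payload = gzip.compress(b"".join(body))
    headers = [(k, v) for k, v in headers if k != b"Content-Length"]
    headers.append((b"Content-Encoding", b"gzip"))
    headers.append((b"Content-Length", str(len(payload)).encode("ascii")))
    return status, headers, [payload]
```

A filter that cannot operate simply returns its inputs, so the cost of a declined filter is one function call rather than a nested middleware frame.
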
> The environment has some minor issues.  I’ll write up my changes in RFC-style:
> 
> SERVER_NAME is REQUIRED and MUST contain the DNS name of the server OR 
> virtual server name for the web server if available OR an empty bytestring if 
> DNS resolution is unavailable.  SERVER_ADDR is REQUIRED and MUST contain the 
> web server’s bound IP address.  URL reconstruction SHOULD use HTTP_HOST if 
> available, SERVER_NAME if there is no HTTP_HOST, and fall back on SERVER_ADDR 
> if SERVER_NAME is an empty bytestring.
> 
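
The host fallback order described above can be sketched as follows. pick_host is a hypothetical helper, and the environ is assumed to use str keys with bytes values; port and scheme handling are deliberately left out.

```python
def pick_host(environ):
    """Choose the host for URL reconstruction.

    Order: HTTP_HOST if present, then SERVER_NAME, then SERVER_ADDR
    when SERVER_NAME is an empty bytestring.
    """
    host = environ.get("HTTP_HOST")
    if host:
        return host
    if environ["SERVER_NAME"]:  # REQUIRED, but may be b""
        return environ["SERVER_NAME"]
    return environ["SERVER_ADDR"]  # REQUIRED bound IP address
```
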
> CONTENT_LENGTH is REQUIRED and MUST be None if not defined by the client.  
> Testing explicitly for None is more efficient than armoring against missing 
> values; also, explicit is better than implicit.  (Paste’s WSGI1 server 
> defines CONTENT_LENGTH as 0, but this implies the client explicitly declared 
> it as zero, which is not the case.)
> 
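
A small sketch of what the explicit None buys an application, assuming the key is always present and holds either None or a bytes value. declared_length is a hypothetical helper name.

```python
def declared_length(environ):
    """Return the client-declared body length, or None if undeclared.

    Because the key is REQUIRED, no KeyError guard is needed, and
    None stays distinguishable from an explicit zero.
    """
    length = environ["CONTENT_LENGTH"]
    if length is None:
        return None  # the client sent no Content-Length header
    return int(length)  # int() accepts ASCII digits in bytes
```
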
> FRAGMENT and PARAMETERS are REQUIRED and are parsed out of the URL in the 
> same way as the QUERY_STRING. FRAGMENT is the text after a hash mark (a.k.a. 
> “anchor” to browsers, e.g. /foo#bar). PARAMETERS come before QUERY_STRING, 
> and after PATH_INFO separated by a semicolon, e.g. /foo;bar?baz.  Both values 
> MUST be empty bytestrings if not present in the URL. (Rarely used — I’ve only 
> seen it in Java and ColdFusion applications — but still useful.)
> 
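
One way a server might populate these keys is sketched below. split_request_uri is a hypothetical helper, not taken from any implementation. It splits at the first semicolon, which is a simplification, since parameters can in principle be attached to individual path segments.

```python
def split_request_uri(uri):
    """Split a request URI (bytes) into the four environ values.

    Missing parts come back as empty bytestrings, as proposed.
    """
    uri, _, fragment = uri.partition(b"#")  # text after the hash mark
    uri, _, query = uri.partition(b"?")     # text after the question mark
    path, _, params = uri.partition(b";")   # parameters follow the path
    return {
        "PATH_INFO": path,
        "PARAMETERS": params,
        "QUERY_STRING": query,
        "FRAGMENT": fragment,
    }
```

For b"/foo;bar?baz" this yields PATH_INFO b"/foo", PARAMETERS b"bar", QUERY_STRING b"baz", and an empty FRAGMENT.
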
> Points of contention:
> 
> Changing the namespace seems needless.  Using the wsgi.* namespace with a 
> wsgi.version of (2, 0) will allow applications to easily armor themselves 
> against incompatible use.  That’s what wsgi.version is for!  I’d add this as 
> a strong “point of contention”.  m.s.http keeps the wsgi namespace and uses a 
> version of (2, 0).
> 
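
The kind of armoring meant here could look like the sketch below, assuming the server sets wsgi.version to a tuple such as (2, 0). is_wsgi2 is a hypothetical helper name.

```python
def is_wsgi2(environ):
    """Return True only when the server declares a 2.x wsgi.version."""
    version = environ.get("wsgi.version")
    return version is not None and tuple(version)[:1] == (2,)
```

An application written for the new calling convention can check this once on entry and refuse to run under a WSGI 1 server.
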
> That’s it so far.  I may occasionally write in with additional ideas as I 
> continue with my HTTP server implementation.
> 
>       — Alice.
> 
> _______________________________________________
> Web-SIG mailing list
> Web-SIG@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/chrism%40plope.com


