P.J. Eby [mailto:p...@telecommunity.com]
> At 07:40 PM 9/21/2009 -0700, Robert Brewer wrote:
> > Yes; you have to transcode to the "correct" encoding. Once.
> > Then every other WSGI application interface "below" that one
> > doesn't have to care.
>
> You can only do that if you *break encapsulation*, which as I said
> earlier is voiding the entire point of having a modular interface.

Requiring one component to run before another to achieve a correct result does not void modularity. Unix pipes employ a modular interface, but "cat /etc/fstab | wc | head" produces a very different result than "cat /etc/fstab | head | wc". In such a system, encapsulation requires that the components not share state, but rather trust that they are composed correctly (yes, by some "invisible hand") and that the given input is the intended one, even if that means a previous component transformed it.

If, on the other hand, only utf-8-decoded strings can be passed as input to each WSGI component, then each WSGI component must be prepared to re-decode its inputs; in that case, each must be configured identically with the same logic to determine the correct decoding, since the correct decoding does not differ from one component to the next. That repeated configuration of the correct decoding is shared state, and breaks encapsulation; one-time transformation of inputs is not and does not.

> Having a configurable encoding just means that *every* WSGI
> application *must* verify the encoding in order to be safe.

No, each can trust its inputs and do its intended job instead, if your idempotency requirement is relaxed.

> I'm all
> in favor of making everyone suffer equally, but all else being equal,
> I'd prefer them to suffer idempotently rather than conditionally. ;-)

I know you do, but I don't see the community following your lead in that preference. Any middleware that alters the environ breaks idempotency. Any middleware that alters the output breaks idempotency. Most routing middleware breaks idempotency. There's a lot of all of those already in the wild.

CherryPy doesn't care, because we marginalized WSGI middleware into near obscurity. We did that in large part because of the idempotency requirements of WSGI 1.0. We may have the only routing middleware that you could mistakenly put in your stack twice and get the same result!
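
To make the ordering argument concrete, here is a minimal sketch of a "transcode once" component. It is only an illustration under stated assumptions: the names transcode_once, show_path and call are hypothetical, and it assumes the server hands over SCRIPT_NAME and PATH_INFO as raw bytes decoded as latin-1, which is one possible convention and not something settled in this thread.

```python
# Hypothetical sketch. The latin-1 "bytes as text" convention and all
# names below are assumptions made for illustration only.

def transcode_once(app, encoding="utf-8"):
    """Middleware that re-decodes SCRIPT_NAME and PATH_INFO a single time."""
    def middleware(environ, start_response):
        for key in ("SCRIPT_NAME", "PATH_INFO"):
            # Recover the original bytes, then decode them properly.
            raw = environ.get(key, "").encode("latin-1")
            environ[key] = raw.decode(encoding)
        return app(environ, start_response)
    return middleware


def show_path(environ, start_response):
    """Innermost app: trusts that PATH_INFO is already real text."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [environ["PATH_INFO"].encode("utf-8")]


def call(app, path_info):
    """Drive a WSGI callable without a server and return the body bytes."""
    environ = {"SCRIPT_NAME": "", "PATH_INFO": path_info}
    return b"".join(app(environ, lambda status, headers: None))


# "/caf\u00e9" sent as UTF-8 bytes, presented under the latin-1 assumption.
wire_path = "/caf\u00e9".encode("utf-8").decode("latin-1")

# Composed correctly (transcoder runs exactly once), the app just works.
once = call(transcode_once(show_path), wire_path)

# Composed incorrectly (transcoder stacked twice), the second pass is
# handed text that is no longer latin-1-wrapped UTF-8, and decoding fails.
try:
    call(transcode_once(transcode_once(show_path)), wire_path)
    stacked_twice_failed = False
except UnicodeError:
    stacked_twice_failed = True
```

The inner application never inspects or verifies the encoding; it relies on the graph being composed correctly, much as "head" relies on whatever precedes it in a pipe. The price is that the transcoder is not idempotent, so putting it in the stack twice is a composition error rather than a no-op.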

So I'm not fighting for myself/my framework on this; surrogateescape would work just fine for us since we ship very little middleware. But I don't think it would work fine for Paste, Pylons, Turbogears, Repoze, etcetera, who have lots of WSGI middleware to port and more they want to build, and have been chafing for years now against this requirement. I believe they want full unicode SCRIPT_NAME and PATH_INFO, and would rather insert a single, new, modular WSGI component into their component graphs than build that logic into every WSGI component. They already have to deal with correct ordering in their WSGI component graphs, because they've already abandoned strict idempotency.

Ben, Ian, Mark, Chris, et al, please confirm or deny that; I could be way off base.


Robert Brewer
fuman...@aminus.org
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig