At 11:17 AM 9/23/2010 -0500, Ian Bicking wrote:
> I don't see any reason why Location shouldn't be ASCII. Any header could have any character put in it, of course; there's just no valid case where Location shouldn't be a URL, and URLs are ASCII. Cookie can contain weirdness, yes. I would expect any library that abstracts cookies to handle this (it's certainly doable)... otherwise, this seems like one among many ways a person can do the wrong thing.
>
> This can also be detected with the validator, which doesn't avoid runtime errors, but bytes allow runtime errors too -- they will just happen somewhere else (e.g., when a value is converted to bytes in an application or library).

Right: somewhere much closer to the *actual* error, where the developer can know the problem is, "I have garbage data or have not selected an appropriate codec", rather than "this WSGI stuff is giving me errors some place".


> If servers print the invalid value on error (instead of just some generic error) I don't think it would be that hard to track down problems. This requires some explicit effort on the part of the server (most servers handle app_iter==None ungracefully, which is a similar problem).

The difference is that if a server rejects non-bytes, you'll know *right away* that your app isn't compliant, instead of having to wait until some non-latin1 data shows up.
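To make the "right away" part concrete, here is a minimal sketch of the kind of check a bytes-only server could run inside its start_response. The name check_bytes_headers is hypothetical and not part of any spec or existing server; the point is only that a type check fails on the very first request, even when the text happens to be plain ASCII.

```python
def check_bytes_headers(status, headers):
    """Hypothetical server-side check: reject any non-bytes output."""
    if not isinstance(status, bytes):
        raise TypeError("status must be bytes, got %r" % (status,))
    for name, value in headers:
        for part in (name, value):
            if not isinstance(part, bytes):
                # Include the offending value so the problem is easy to trace.
                raise TypeError(
                    "header %r: expected bytes, got %r" % (name, part)
                )


# A compliant call passes silently.
check_bytes_headers(b"200 OK", [(b"Content-Type", b"text/plain")])

# A text header is rejected immediately, even though it is pure ASCII.
try:
    check_bytes_headers(b"302 Found", [("Location", "/ok")])
except TypeError as exc:
    print("rejected:", exc)
```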

AFAICT, there are only two advantages to using text for output headers:

1. Text is easier to work with, and
2. It's symmetric with using text for input headers.

Both of which can still be had, by using the @encode_headers decorator.
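For illustration, a rough sketch of what such a decorator might look like. This is not a reference implementation: the latin-1 default, the choice to encode the status line as well, and the internal names are all assumptions made for the example.

```python
import functools


def encode_headers(app, encoding="latin-1"):
    """Let `app` call start_response with text; pass bytes to the server.

    Sketch only: the encoding default and the decision to encode the
    status line too are assumptions, not something any spec mandates.
    """
    def to_bytes(value):
        # Leave values that are already bytes untouched.
        return value.encode(encoding) if isinstance(value, str) else value

    @functools.wraps(app)
    def wrapper(environ, start_response):
        def text_start_response(status, headers, exc_info=None):
            byte_headers = [(to_bytes(n), to_bytes(v)) for n, v in headers]
            return start_response(to_bytes(status), byte_headers, exc_info)
        return app(environ, text_start_response)

    return wrapper


@encode_headers
def hello(environ, start_response):
    # The application keeps working with text headers.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]
```

With this arrangement, a header value that cannot be encoded raises UnicodeEncodeError inside the application's own call to start_response, which is where the bad data or the wrong codec choice actually lives.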

I'm a little bit on the fence on this one, because 1) it does seem a little pointless (if harmless) to shuffle headers around in bytes form, and 2) Location and Set-Cookie are very likely the only headers where any kind of damage could ever happen.

But, since it *can* happen, and because it is also really easy to fix the API issue with a decorator, I'm still leaning in favor of "output is bytes" over "headers are text, bodies are bytes", unless somebody can come up with either some actually-bad consequence of using bytes, or some extra-good consequence of using text (that isn't addressed by just using the decorator).

(Note, by the way, that WSGI design has always leaned in the direction of "any convenience that can be handled by a library should be", if it keeps the spec simpler and more verifiable. So, this seems like a good use of that principle.)
