Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

Chris McDonough Mon, 21 Sep 2009 00:10:43 -0700

OK, after some consideration, I think I'm sold.

Answering my own original question about why unicode seems to make sense as values in the WSGI environment even without consideration for Python 3 compatibility: *something* needs to do this translation. Currently I personally rely on WebOb to do a lot of this translation. I can't think of a good reason that implementations at the level of WebOb would each need to do this translation work; pushing the job into WSGI itself seems to make sense here. This is particularly true for PATH_INFO and QUERY_STRING; these days it's foolish to assume these values will be entirely composed of "low order" characters, and thus being able to access them as bytes natively isn't very useful.
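To make the translation step concrete, here is a minimal sketch of what a WebOb-level library has to do today. It assumes a hypothetical bytes-valued environ, a UTF-8 request encoding, and a helper name (decode_request_target) invented for illustration; none of these come from the WSGI spec or from WebOb itself.

```python
from urllib.parse import parse_qs


def decode_request_target(environ, encoding="utf-8"):
    """Return (path, query) as text from a bytes-valued environ.

    Assumption: PATH_INFO and QUERY_STRING arrive as bytes, and the
    client encoded non-ASCII characters as UTF-8.
    """
    raw_path = environ.get("PATH_INFO", b"")
    raw_query = environ.get("QUERY_STRING", b"")

    # PATH_INFO is already percent-decoded by the server, so it may
    # contain raw high-order bytes that need a real decode step.
    path = raw_path.decode(encoding, errors="replace")

    # QUERY_STRING is still percent-encoded, so it is ASCII on the wire;
    # parse_qs unquotes each field using the given encoding.
    query = parse_qs(
        raw_query.decode("ascii", errors="replace"),
        encoding=encoding,
        errors="replace",
    )
    return path, query


environ = {"PATH_INFO": b"/caf\xc3\xa9", "QUERY_STRING": b"q=caf%C3%A9"}
path, query = decode_request_target(environ)
```

Every framework that sits on top of a bytes-only environ ends up carrying some variant of this helper, which is the duplication described above.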

OTOH, I suspect the Python 3 stdlib is still broken if it requires native strings in various places (and prohibits the use of bytes).


James Bennett wrote:

On Sun, Sep 20, 2009 at 11:25 PM, Chris McDonough <chr...@plope.com> wrote:

WSGI is a fairly low-level protocol aimed at folks who need to interface a
server to the outside world.  The outside world (by its nature) talks bytes.
I fear that any implied conversion of environment values and iterable
return values to Unicode will actually eventually make things harder than
they are now.  I realize that it would make middleware implementors' lives
harder to need to deal in bytes.  However, at this point, I also believe
that middleware kinda should be hard.  We have way too much middleware that
shouldn't be middleware these days (some written by myself).


Well, ordinarily I'd be inclined to agree: HTTP deals in bytes, so an
interface to HTTP should deal in bytes as well.

The problem, really, is that despite being a very low-level interface,
WSGI has a tendency to leak up into much higher-level code, and (IMO)
authors of that high-level code really shouldn't have to waste their
time dealing with details of the underlying low-level gateway.

You've said you don't want to hear "Python 3" as the reason, but it
provides some useful examples: in high-level code you'll commonly want
to be doing things like, say, comparing parts of the requested URL
path to known strings or patterns. And that high-level code will
almost certainly use strings, while WSGI, in theory, will be using
bytes. That's just a recipe for disaster; if WSGI mandates bytes, then
bytes will have to start "infecting" much higher-level code (since
Python 3 -- rightly -- doesn't let you be nearly as promiscuous about
mixing bytes and strings).
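
As an illustration of the mixing problem (a sketch only, assuming Python 3 semantics and a made-up path value), comparing a bytes path against a text literal quietly fails, and prefix matching raises outright:

```python
# Hypothetical value as a bytes-only gateway might hand it over.
raw_path = b"/blog/2009/"

# Equality between bytes and str is simply False in Python 3,
# so a route comparison silently never matches.
equal_mixed = (raw_path == "/blog/2009/")

# Method calls that mix the two types raise TypeError instead.
try:
    raw_path.startswith("/blog/")
    mixed_prefix_error = None
except TypeError as exc:
    mixed_prefix_error = exc

# Decoding once at the boundary lets high-level code use plain strings.
text_path = raw_path.decode("utf-8")
equal_text = (text_path == "/blog/2009/")
```

The silent False is arguably the worse of the two failures, since nothing points the author of the high-level code back at the gateway.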

Once I'm at a point where I can use Python 3, I know I'll personally
be looking for some library which will normalize everything for me
before I interact with it, precisely to avoid this sort of leakage; if
WSGI itself would at least *allow* that normalization to happen at the
low level (mandating it is another discussion entirely) I'd feel much
happier about it going forward.


_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
