-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jim Fulton wrote:
> On Tue, Aug 4, 2009 at 12:05 PM, P.J. Eby<p...@telecommunity.com> wrote:
>> At 10:44 PM 8/4/2009 +1000, Graham Dumpleton wrote:
>>> In summary, what are the practical uses cases that would make passing
>>> bytes over UTF-8 or even latin-1 worthwhile?
>> My concern at this point is a nagging feeling that we are abandoning
>> WSGI<->HTTP equivalence for convenience in the face of changes in Python's
>> defaults. Had Python 3 been the standard version in existence when WSGI 1
>> was created, I would've argued for making *everything* bytes, in order to:
>>
>> 1. Force all encodings to be explicit, and
>> 2. Ensure WSGI<->HTTP equivalence (i.e., WSGI==HTTP encoded in Python
>> objects)
>>
>> And this is why the original spec said that Unicode strings should be
>> treated as bytes -- because byte strings were always the original target of
>> the spec.
>>
>> Please remember that WSGI is not primarily intended to provide application
>> developers with a convenient API; its first and most important job is to
>> ship the data around without mangling it in the process.
>>
>> HTTP moves bytes, therefore WSGI should move bytes. For practical reasons,
>> it would be good to *also* support strings on the application side,
>> especially for application migration. However, I see no reason to make
>> *servers* provide decoded strings instead of bytes.
>
> +1
>
> I haven't had enough time to follow this and earlier encoding
> discussions and so haven't commented up to now, but I've always been
> uncomfortable with WSGI using anything but bytes or assuming any
> encoding. I agree that application frameworks should deal with
> conversion between bytes and unicode.

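To make the split of responsibilities quoted above concrete, here is a
rough sketch of framework-side conversion. It assumes a hypothetical
server that hands over raw bytes, as argued for in this thread; the
function names and default charsets are illustrative assumptions only
and do not come from any WSGI spec.

```python
# Hypothetical sketch: assumes a server that passes raw bytes through
# untouched. None of these names are part of WSGI or any real framework.

def decode_request_value(raw, charset="utf-8"):
    """Framework side: turn a bytes value from the server into a str.

    The charset is an explicit choice made above the transport layer;
    the server itself never guesses an encoding.
    """
    return raw.decode(charset)


def encode_response_headers(headers, charset="latin-1"):
    """Framework side: turn (str, str) header pairs into bytes pairs."""
    return [(name.encode(charset), value.encode(charset))
            for name, value in headers]


# The server supplies bytes exactly as they arrived on the wire.
raw_path = b"/caf\xc3\xa9"
path = decode_request_value(raw_path)  # the application only sees a str

# The application works in str; the framework encodes on the way out.
wire_headers = encode_response_headers(
    [("Content-Type", "text/plain; charset=utf-8")]
)
```

The point of the sketch is only that every encoding decision is made
explicitly in the framework layer, while the transport stays bytes-only.
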
+1 from me as well. The fact that Python 3 now calls 'string' what used
to be 'unicode' doesn't change the fact that "transport-level"
operations have to be done in bytes. It should be the framework's or
application's job to handle conversion of byte inputs from the request
into strings, and of string response fields into bytes: ideally, the
framework will do this in a way which keeps the application writer
blissfully ignorant of the distinction.

Note that I think Python 3 gets the os.environ bit wrong for exactly
the same reasons: I think anybody wanting to use the
environment-as-provided-by-the-OS should deal in bytes (or whatever the
OS provides), with a convenience wrapper for those who don't care about
the difference.

I lost that argument, but that doesn't mean I was wrong. :)


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tsea...@palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFKeHLg+gerLs4ltQ4RAiFjAJ9uZIkfxwh5w1aYiEdIpr+2yQ+iBwCeJiFM
eUfWBoPwyzwHThkMwd24SZE=
=lod9
-----END PGP SIGNATURE-----
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig