James Y Knight wrote:
> In addition, I know of nobody who actually implements RFC 2047
> decoding of http header values...nothing really uses it. (of
> course I don't know of all implementations out there.)

Certainly no browser supports it, which makes the point moot for WSGI. Most browsers, when quoting a header parameter, simply encode using the previous page's charset and put quotes around it... even if the parameter has a quote or control codes in it.

Ian wrote:
> Is this all compatible with os.environ in py3k?

In 3.0a2 os.environ has Unicode strings for both keys and values. This is correct for Windows, where environment variables are explicitly Unicode, but questionable (IMO) for Unix, where they're really bytes that may or may not represent decodable Unicode strings.

>> SCRIPT_NAME/PATH_INFO

This already causes problems in Windows CGI applications! Because these are passed in environment variables, IIS* has to decode the submitted bytes to Unicode first. It seems always to choose UTF-8 for this job, which I suppose is the least bad guess, but hardly infallible.

(* - haven't tested this with Apache for Windows yet.)

In Python 2.x, os.environ being byte strings, Python/the C library then has to encode them back to bytes, which I believe ends up using the system codepage. Since the system codepage is never UTF-8 on Windows, this means not only that the bytes read back from e.g. PATH_INFO are not the same as the original bytes submitted to the web server, but also that any submitted characters outside the system codepage will be unrecoverable.

If os.environ remains Unicode in Unix and WSGI follows it (as it must if CGI-invoked WSGI is to continue working smoothly), webapps that try to allow for non-ASCII characters in URLs are likely to get some nasty deployment problems that depend on the system encoding setting, something that will be particularly troublesome for end-users to debug and fix.

OTOH making the dictionaries reflect the underlying OS's conception of environment variables means users of os.environ and WSGI will have to be able to cope with both bytes and unicode, which would also be a big annoyance.

In summary: urgh, this is all messy and 'orrible.
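
To make the PATH_INFO round-trip above concrete, here is a rough simulation in Python. It is only a sketch under assumptions: the function name roundtrip is made up for illustration, cp1252 stands in for "the system codepage", and errors="replace" stands in for whatever the real C library conversion does with characters it cannot map, which may differ in detail.

```python
def roundtrip(raw_path, server_guess="utf-8", system_codepage="cp1252"):
    """Simulate server decode -> Unicode environment -> re-encode to bytes.

    Hypothetical helper: cp1252 and errors="replace" are assumptions
    standing in for the real system codepage conversion.
    """
    # Step 1: the web server decodes the submitted bytes (guessing UTF-8).
    text = raw_path.decode(server_guess, errors="replace")
    # Step 2: a byte-string os.environ re-encodes using the system codepage.
    return text.encode(system_codepage, errors="replace")


# A character that exists in cp1252: the bytes change, but survive.
latin = "/caf\u00e9".encode("utf-8")          # b'/caf\xc3\xa9'
print(roundtrip(latin))                        # b'/caf\xe9'

# Characters outside cp1252: the original bytes cannot be recovered.
greek = "/\u03b1\u03b2\u03b3".encode("utf-8")
print(roundtrip(greek))                        # b'/???'
```

In the first case the application sees different bytes from the ones submitted, but could in principle reverse the conversion if it knew both encodings. In the second case the information is simply gone.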

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/