ctypes.windll.kernel32.GetEnvironmentVariableW(u'PATH_INFO', ...)
Hmm... it turns out: no. IIS appears to be mangling characters that are
not in mbcs even *before* it puts the decoded value into the envvars.
The same is true with isapi_wsgi, which is the only other WSGI adapter I
know of
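For anyone who wants to repeat the experiment, here is a minimal sketch of reading the wide-character copy of a variable through ctypes. The helper name get_wide_env and the 32767-character buffer size are my own choices for illustration, not anything from the posts above; on non-Windows platforms the function just returns None. As noted above, under IIS the value has apparently been mangled before it reaches the environment, so this does not recover the lost characters.

```python
import ctypes
import sys


def get_wide_env(name, size=32767):
    """Read an environment variable via the Win32 wide-character API.

    Hypothetical helper for illustration. Returns None when not running
    on Windows, or when the variable is missing or empty.
    """
    if sys.platform != "win32":
        return None
    # Buffer that receives the UTF-16 value from the process environment.
    buf = ctypes.create_unicode_buffer(size)
    n = ctypes.windll.kernel32.GetEnvironmentVariableW(name, buf, size)
    if n == 0:
        # 0 characters copied: variable not found (or empty).
        return None
    return buf.value


print(repr(get_wide_env("PATH_INFO")))
```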
Mark Hammond wrote:
I don't think Python explicitly converts it - the CRT's ANSI version
of environ is used
Yes, it would be the CRT on Python 2.x. (Python 3.0 on non-NT does a
conversion always using UTF-8, if I'm reading convertenviron right.)
so the resulting strings should be encoded
Python decodes the environ to its own copy (wrapped in os.environ) at
interpreter startup time;
I don't think Python explicitly converts it - the CRT's ANSI version of environ
is used, so the resulting strings should be encoded using the 'mbcs' encoding.
What mangling do you see?
there's
Ian Bicking wrote:
As it is (in Python 2), you should do something like
environ['PATH_INFO'].decode('utf8') and it should work.
See the test cases in my original post: this doesn't work universally.
On WinNT platforms PATH_INFO has already gone through a decode/encode
cycle which almost
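A rough sketch of why a plain UTF-8 decode cannot be relied on after such a cycle. This is a simulation written for Python 3, not the original test cases: cp1252 stands in for the 'mbcs' codec (which exists only on Windows), the sample paths are made up, and real Windows conversions may substitute best-fit characters rather than '?'.

```python
def ansi_round_trip(path, ansi_codec="cp1252"):
    """Simulate a Unicode path being squeezed through an ANSI code page.

    Characters the code page cannot represent are replaced with '?'.
    """
    return path.encode(ansi_codec, errors="replace")


def try_utf8(raw):
    """Attempt the decode('utf8') step; return None if the bytes are not UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


# Hypothetical sample paths.
ascii_only = ansi_round_trip("/wiki/Python")
accented = ansi_round_trip("/wiki/caf\u00e9")           # e-acute is in cp1252
cjk = ansi_round_trip("/wiki/\u65e5\u672c\u8a9e")       # not in cp1252

for raw in (ascii_only, accented, cjk):
    print(repr(raw), "->", repr(try_utf8(raw)))
```

Only the ASCII path survives. The accented path comes back as a single code-page byte that is not valid UTF-8, and the CJK path has already been replaced by question marks, so no later decode step can restore it.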
Ian Bicking wrote:
This is something messed up with CGI on NT, and whatever server you are
using, and perhaps the CGI adapter (maybe there's a way to get the raw
environment without any encoding, for example?)
Python decodes the environ to its own copy (wrapped in os.environ) at
interpreter
It would be lovely if we could allow WSGI applications to reliably
accept Unicode paths.
That is to say, allow WSGI apps to have beautiful URLs like Wikipedia's,
without requiring URL-rewriting magic. (Which is so highly
server-specific, potentially unavailable to non-admin webmasters, and
Andrew Clover wrote:
If we could reliably read the bytes the browser sends to us in the GET
request that would be great, we could just decode those and be done with
it. Unfortunately, that's not reliable, because:
1. thanks to an old wart in the CGI specification, %XX hex escapes are
decoded
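To illustrate the kind of information that is lost once %XX escapes are decoded, here is a small Python 3 sketch. The message is cut off above, so which consequence was meant is an assumption on my part; the sample paths are made up. It uses urllib.parse.unquote_to_bytes to mimic the decoding a CGI gateway applies before filling in PATH_INFO.

```python
from urllib.parse import unquote_to_bytes

# Two different request paths as a browser might send them (hypothetical).
escaped_slash = "/files/a%2Fb"   # one segment whose name contains a slash
real_slash = "/files/a/b"        # two separate segments

# After decoding, both collapse to the same bytes, so the application
# can no longer tell which one was requested.
decoded_escaped = unquote_to_bytes(escaped_slash)
decoded_real = unquote_to_bytes(real_slash)
print(decoded_escaped == decoded_real)

# Non-ASCII escapes become raw bytes with no record of their encoding.
decoded_accent = unquote_to_bytes("/wiki/caf%C3%A9")
print(repr(decoded_accent))
```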
FWIW, there was a past discussion on these issues on mod_wsgi list. I
can't really remember what the outcome of the discussion was. The
discussion is at:
http://groups.google.com/group/modwsgi/browse_frm/thread/2471a1a71620629f
Graham
2008/11/13 Andrew Clover:
It would be