Re: [Web-SIG] WSGI Amendments thoughts: the horror of charsets

2008-11-18 Thread Andrew Clover

ctypes.windll.kernel32.GetEnvironmentVariableW(u'PATH_INFO', ...)


Hmm... it turns out: no. IIS appears to be mangling characters that are 
not in mbcs even *before* it puts the decoded value into the envvars.
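For reference, the kind of ctypes call being tried here can be sketched as follows. This is a minimal sketch, not the code from the original test: the helper name `getenv_wide` is hypothetical, and it simply returns None on non-Windows platforms. As noted above, under IIS the value is already mangled before it reaches the environment, so this does not rescue the original characters there.

```python
import ctypes
import sys


def getenv_wide(name):
    """Hypothetical helper: read an environment variable through the
    wide-character Win32 API, bypassing the mbcs-decoded os.environ.

    Returns None if the variable is unset (or empty), or if we are
    not running on Windows.
    """
    if sys.platform != 'win32':
        return None
    kernel32 = ctypes.windll.kernel32
    # First call with no buffer: returns the required size in characters,
    # including the terminating NUL, or 0 if the variable does not exist.
    size = kernel32.GetEnvironmentVariableW(name, None, 0)
    if size == 0:
        return None
    buf = ctypes.create_unicode_buffer(size)
    kernel32.GetEnvironmentVariableW(name, buf, size)
    return buf.value
```

On Windows, `getenv_wide('PATH_INFO')` would return the value as a Unicode string without going through the mbcs code page.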


The same is true with isapi_wsgi, which is the only other WSGI adapter I 
know of for IIS. This gets the same mangled byte string from 
GetServerVariable as Python gets from the envvars, so it looks like this 
is a mistake IIS is making further up before it even hits the CGI 
handler. Maybe someone more familiar with ISAPI knows a better way to 
read PATH_INFO than GetServerVariable, but I can't see anything 
promising in MSDN.


So it would seem to be impossible at the moment to have Unicode paths 
work under IIS at all.


The ctypes approach could rescue bytes for the Apache/nt/Py2 combination 
(perhaps also from libc.getenv for Apache/posix/Py3), but then Apache 
already gives us REQUEST_URI which is a much easier workaround. There 
might be CGI servers for Windows where ctypes could serve some purpose, 
but I can't think of any currently in use other than the Big Two.


In summary, to get the original submitted byte strings for PATH_INFO:

Apache/nt/Py2
    process REQUEST_URI
Apache/posix/Py2
    use PATH_INFO directly
    (or process REQUEST_URI)
Apache/nt/Py3
    encode PATH_INFO to ISO-8859-1
    (or process REQUEST_URI)
Apache/posix/Py3
    process REQUEST_URI
IIS/nt/Py2
    decode PATH_INFO from mbcs, then encode to UTF-8
    FAIL for characters not in current mbcs
    FAIL for non-UTF-8 input
IIS/nt/Py3
    encode PATH_INFO to UTF-8
    FAIL for characters not in current mbcs
    FAIL for non-UTF-8 input
wsgiref.simple_server/Py2
    use PATH_INFO directly
wsgiref.simple_server/Py3
    remains to be seen, but at the moment encode PATH_INFO to UTF-8
    FAIL for non-UTF-8 input
cherrypy.wsgiserver/Py2
    use PATH_INFO directly
cherrypy.wsgiserver/Py3
    remains to be seen, but at the moment encode PATH_INFO to UTF-8
    FAIL for non-UTF-8 input
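
The Py3 rows of the summary above could be sketched roughly as below. This is an illustration under stated assumptions, not a tested adapter: the function names and the `server` labels ('apache-nt', 'apache-posix', anything else) are hypothetical, and "process REQUEST_URI" is reduced here to dropping the query string and undoing the %-escapes. REQUEST_URI holds the whole request path, so a real application would still have to strip the SCRIPT_NAME portion.

```python
from urllib.parse import unquote_to_bytes


def path_from_request_uri(request_uri):
    """Recover raw path bytes from REQUEST_URI (still %-encoded)."""
    # Drop the query string, then undo the %-escapes to get the
    # bytes the client actually submitted.
    path = request_uri.split('?', 1)[0]
    return unquote_to_bytes(path)


def path_info_bytes(environ, server):
    """Hypothetical helper: best-effort original path bytes on Python 3.

    `server` is an assumed label chosen by the caller.
    """
    if server == 'apache-nt':
        # Each character stands for one original byte.
        return environ['PATH_INFO'].encode('iso-8859-1')
    if server == 'apache-posix':
        # Full request path; SCRIPT_NAME is NOT stripped in this sketch.
        return path_from_request_uri(environ['REQUEST_URI'])
    # IIS, wsgiref.simple_server, cherrypy.wsgiserver:
    # fails to round-trip non-UTF-8 input, as listed in the summary.
    return environ['PATH_INFO'].encode('utf-8')
```

For example, `path_from_request_uri('/app/caf%C3%A9?x=1')` gives the bytes `b'/app/caf\xc3\xa9'`.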

--
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] Revising environ['wsgi.input'].readline in the WSGI specification

2008-11-18 Thread Phillip J. Eby

At 09:30 AM 11/18/2008 +1100, Graham Dumpleton wrote:

> I would be for (1) errata or amendment as reality is that there is
> probably no WSGI implementation that disallows an argument to
> readline() given that certain Python code such as cgi.FieldStorage
> wouldn't work otherwise.


Please note that that was a change in Python 2.5; older Pythons 
(including Jython until very recently) would not have needed a 
readline() argument, and so are less likely to have been tested that way.
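
To illustrate the compatibility gap, here is a minimal sketch of a wrapper that gives a size-accepting readline() to an input stream whose own readline() takes no argument. Nothing like this is proposed in the thread; the class names `SizedReadline` and `NoArgStream` are hypothetical, and `NoArgStream` only stands in for a strictly spec-following wsgi.input.

```python
import io


class NoArgStream:
    """Stand-in for an input stream whose readline() takes no argument."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def readline(self):
        return self._buf.readline()


class SizedReadline:
    """Hypothetical wrapper adding an optional size to readline()."""

    def __init__(self, stream):
        self._stream = stream
        self._pending = b''  # tail of a line that was only partly returned

    def readline(self, size=-1):
        if not self._pending:
            self._pending = self._stream.readline()
        line = self._pending
        if size is not None and size >= 0:
            # Hand back at most `size` bytes and keep the rest for later.
            line, self._pending = line[:size], line[size:]
        else:
            self._pending = b''
        return line
```

A caller that passes a size, as cgi.FieldStorage does from Python 2.5 on, can then use `SizedReadline(stream).readline(1024)` even when the underlying stream would reject the argument.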




Re: [Web-SIG] Revising environ['wsgi.input'].readline in the WSGI specification

2008-11-18 Thread Alan Kennedy
[Graham]
> I would be for (1) errata or amendment as reality is that there is
> probably no WSGI implementation that disallows an argument to
> readline() given that certain Python code such as cgi.FieldStorage
> wouldn't work otherwise.
>
> For such a clarification on existing practice, I see no point in
> having to change wsgi.version in environ as it would just cause
> confusion.

+1

[Graham]
> I would also like to see other changes to WSGI specification but now
> is not the time, let us at least though get this obvious issue with
> API dealt with. After that we can then perhaps have a discussion of
> future of WSGI specification and whether there really is any interest
> in future versions with more significant changes.

+1

Alan.