[issue10155] Add fixups for encoding problems to wsgiref

2010-10-20 Thread And Clover

New submission from And Clover :

Currently wsgiref's CGIHandler makes a WSGI environ from the CGI environ 
without changes.

Unfortunately the CGI environ is wrong in a number of common circumstances:

- on Windows, the native environ is Unicode, and different servers choose 
different decodings for HTTP bytes to store in the environ (most notably for 
PATH_INFO);

- on Windows with Python 2.x, os.environ is read from the Unicode native 
environ using the ANSI encoding, which will lose/mangle non-ASCII characters;

- on Posix with Python 3.x, os.environ is read from a native bytes environ 
using the filesystem encoding, which is probably not ISO-8859-1;

- on IIS, PATH_INFO inappropriately includes SCRIPT_NAME unless a hidden, 
rarely-used, and problematic config option is applied.
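
To illustrate the IIS point above, here is a minimal sketch of stripping a duplicated SCRIPT_NAME prefix from PATH_INFO. The function name strip_script_name and the sample paths are hypothetical and chosen only for illustration; this is not taken from the attached patches.

```python
def strip_script_name(environ):
    """Remove a leading SCRIPT_NAME from PATH_INFO if the server duplicated it.

    Hypothetical helper: a sketch of the idea, not the code in the patch.
    """
    path = environ.get("PATH_INFO", "")
    script = environ.get("SCRIPT_NAME", "")
    # Compare with a trailing slash so "/app" does not match "/application".
    if (path + "/").startswith(script + "/"):
        environ["PATH_INFO"] = path[len(script):]
    return environ


# IIS-style environ where PATH_INFO repeats SCRIPT_NAME at the front.
iis_environ = strip_script_name(
    {"SCRIPT_NAME": "/app.py", "PATH_INFO": "/app.py/users/1"}
)

# A path that merely shares a textual prefix is left alone.
other_environ = strip_script_name(
    {"SCRIPT_NAME": "/app", "PATH_INFO": "/application/x"}
)
```

Comparing on a segment boundary rather than a raw string prefix is a deliberate choice here, so that an unrelated path sharing the same leading characters is not truncated.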

Previously, it was not clear in PEP 333 what was supposed to happen with 
headers and encodings, especially under Python 3. PEP  clears this up. 
These patches add fixups to wsgiref to try to generate the nearest to a 
'correct' environ as per PEP  as possible for the current platform and 
server software.

They also fix simple_server to use the correct encoding for PATH_INFO, and 
include the fix for issue 9022, correspondingly updating the simple_server demo 
app and tests to conform to PEP 's expectation that headers will be 
ISO-8859-1-decoded Unicode strings. The test_bytes_validation test is removed: 
as I understand it, it's no longer allowed to use byte string headers/status.
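
As a rough sketch of what "ISO-8859-1-decoded Unicode strings" means for an application under Python 3: the status and headers are plain str, and a PATH_INFO value is turned back into bytes with ISO-8859-1 before being decoded with whatever encoding the application actually expects. The names app and call_app are hypothetical, and UTF-8 is an assumption about the application's URLs.

```python
def app(environ, start_response):
    # The environ value is a str holding bytes-as-ISO-8859-1; recover the
    # bytes, then decode them as UTF-8 (an assumption about this app's URLs).
    raw = environ.get("PATH_INFO", "").encode("iso-8859-1")
    path = raw.decode("utf-8", "replace")
    body = ("path: " + path).encode("utf-8")
    # Status and header names/values are str, not bytes.
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(application, environ):
    """Tiny test harness: run a WSGI app and capture status, headers, body."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], captured["headers"], body


# "/caf\xc3\xa9" is the UTF-8 bytes of "/café" decoded as ISO-8859-1.
status, headers, body = call_app(app, {"PATH_INFO": "/caf\xc3\xa9"})
```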

--
components: Library (Lib)
files: wsgiref-patches-3.2a3.patch
keywords: patch
messages: 119220
nosy: aclover
priority: normal
severity: normal
status: open
title: Add fixups for encoding problems to wsgiref
type: behavior
versions: Python 2.7, Python 3.2
Added file: http://bugs.python.org/file19303/wsgiref-patches-3.2a3.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10155] Add fixups for encoding problems to wsgiref

2010-10-20 Thread And Clover

And Clover  added the comment:

(patch for Python 2.x, for what it's worth)

--
Added file: http://bugs.python.org/file19304/wsgiref-patches-2.7.patch




[issue10155] Add fixups for encoding problems to wsgiref

2010-10-20 Thread And Clover

And Clover  added the comment:

(same again for PJ Eby's wsgiref svn branch: identical to the previous 2.7 
patch aside from the line numbers)

--
Added file: http://bugs.python.org/file19309/wsgiref-patches-eby2692.patch




[issue10155] Add fixups for encoding problems to wsgiref

2010-10-20 Thread Ned Deily

Changes by Ned Deily :


--
nosy: +pje




[issue10155] Add fixups for encoding problems to wsgiref

2010-10-20 Thread Senthil Kumaran

Changes by Senthil Kumaran :


--
nosy: +orsenthil




[issue10155] Add fixups for encoding problems to wsgiref

2010-10-22 Thread Éric Araujo

Éric Araujo  added the comment:

Your patch adds a new handler, which is arguably a new feature that has to be 
rejected in a bugfix branch.

--
nosy: +eric.araujo
versions: +Python 3.1




[issue10155] Add fixups for encoding problems to wsgiref

2010-10-23 Thread And Clover

Changes by And Clover :


Removed file: http://bugs.python.org/file19303/wsgiref-patches-3.2a3.patch




[issue10155] Add fixups for encoding problems to wsgiref

2010-10-23 Thread And Clover

And Clover  added the comment:

Ah, sorry, submitted wrong patch against 3.2, disregard. Here's the 'proper' 
version (the functionality isn't changed, just the former patch had an unused 
and-Falsed out clause for reading environb, which in the end I decided not to 
use as the surrogateescape approach already covers it just as well for values).
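
For readers unfamiliar with the surrogateescape approach mentioned above, a minimal sketch follows. It assumes the value was decoded the way os.environ does it on Posix under Python 3 (filesystem encoding with the surrogateescape error handler); the helper name to_wsgi_string is hypothetical and this is not the code from the patch.

```python
import sys


def to_wsgi_string(value, encoding=None):
    """Re-decode a surrogateescape-decoded str as ISO-8859-1.

    Hypothetical helper. Encoding with surrogateescape recovers the exact
    original bytes, even ones that were invalid in the source encoding.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    raw = value.encode(encoding, "surrogateescape")
    return raw.decode("iso-8859-1")


# Simulate a UTF-8 locale: the server sent these bytes for PATH_INFO.
sent = "/caf\u00e9".encode("utf-8")
seen = sent.decode("utf-8", "surrogateescape")   # what os.environ would hold
fixed = to_wsgi_string(seen, "utf-8")

# Bytes that are not valid UTF-8 survive the round trip as well.
bad = b"/\xff"
bad_fixed = to_wsgi_string(bad.decode("utf-8", "surrogateescape"), "utf-8")
```

The result is a str in which each character stands for one original byte, so the application can get the bytes back with encode("iso-8859-1") and decode them however it likes.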

@Éric: yes. Actually the whole patch is pretty much new functionality, which 
should not be considered for a 2.7.x bugfix release. I've submitted a patch 
against 2.7 for completeness and for the use of a separately-maintained 
post-2.7 wsgiref, but unless there is ever a Python 2.8 it should never hit 
stdlib.

The status quo wrt Unicode in environ is broken and inconsistent, which an 
accepted PEP  would finally clear up. But there may be webapps deployed 
that rely on their particular server's current inconsistent environ, and those 
shouldn't be broken by a bugfix 2.7 or 3.1 release.

--
versions:  -Python 3.1
Added file: http://bugs.python.org/file19348/wsgiref-patches-3.2a3.proper.patch




[issue10155] Add fixups for encoding problems to wsgiref

2010-11-03 Thread Phillip J. Eby

Phillip J. Eby  added the comment:

Committed to Py3K in r86146, with added docs and a larger list of transcodable 
CGI variables.

--




[issue10155] Add fixups for encoding problems to wsgiref

2010-11-03 Thread And Clover

And Clover  added the comment:

Thanks.

Some of those additions in _needs_transcode are potentially controversial, 
though. I'm not wholly sure it's the right thing to transcode these.

Some of them may not actually come from the request, eg `REMOTE_USER` may be 
filled in by IIS's Windows authentication using a native-Unicode string from 
the Windows user database. Is it the right thing to turn it into 
UTF-8-bytes-in-Unicode for consistency with Apache? Maybe. (At least most of 
the other new envvars will never see a non-ASCII character, or in 
`REMOTE_IDENT`'s case will never be used for anything.)

The case with the REDIRECT_HTTP_ and SSL_ envvars is an interesting one. Whilst 
transcoding them at some point will very probably be what applications need to 
do if they want to actually use them, is it within CGIHandler's remit to change 
Apache mod-specific variables that are not specified by CGI or WSGI?

(There might, after all, be lots of these to catch for other mods and servers, 
and it's *conceivable* that somebody might be re-using one of these names to 
set in the environment for some other purpose, in which case transcoding would 
be adding an unexpected mangling. We can't in the general case expect users to 
know to avoid envvar names that are used as non-standard extensions in all 
servers.)

REDIRECT_HTTP_ at least comes from the HTTP request, so I guess the consistency 
is good there. (But then I think the only header that actually may contain 
non-ASCII is REDIRECT_URL, which replaces the unescaped SCRIPT_NAME and 
PATH_INFO; that one isn't caught at the moment.)
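
To make the trade-off concrete, here is a sketch of a predicate that decides which variables get transcoded, with a switch between a narrower and a broader policy. The function needs_transcode, its liberal flag, and the exact contents of both name sets are assumptions for illustration only; they are not the committed _needs_transcode or either patch.

```python
# Assumed "conservative" set: names that clearly carry request bytes.
CORE_NAMES = {"SCRIPT_NAME", "PATH_INFO", "QUERY_STRING"}

# Assumed extra names a more liberal policy might also cover.
EXTRA_NAMES = {"REMOTE_USER", "REMOTE_IDENT", "AUTH_TYPE", "CONTENT_TYPE"}


def needs_transcode(name, liberal=True):
    """Return True if the environ variable should be re-decoded (sketch)."""
    if name in CORE_NAMES or name.startswith("HTTP_"):
        return True
    if not liberal:
        return False
    if name in EXTRA_NAMES or name.startswith("SSL_"):
        return True
    # REDIRECT_FOO is handled like FOO, so REDIRECT_HTTP_* is covered
    # while REDIRECT_URL is not.
    if name.startswith("REDIRECT_"):
        return needs_transcode(name[len("REDIRECT_"):], liberal)
    return False


liberal_hits = [n for n in ("PATH_INFO", "SSL_CLIENT_S_DN", "REDIRECT_HTTP_COOKIE",
                            "REDIRECT_URL", "MY_VAR") if needs_transcode(n)]
```

Under this sketch an unrelated variable such as MY_VAR is untouched by either policy; the risk described above only arises when someone re-uses a name that falls inside one of the matched prefixes.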

--
versions:  -Python 2.7




[issue10155] Add fixups for encoding problems to wsgiref

2010-12-17 Thread Phillip J. Eby

Phillip J. Eby  added the comment:

So, do you have any suggestions for a specific change to the patch?

--




[issue10155] Add fixups for encoding problems to wsgiref

2010-12-17 Thread And Clover

And Clover  added the comment:

No, not specifically. My patch is conservative about what variables it recodes, 
yours more liberal, but it's difficult to say which is the better approach, or 
what PEP  requires.

If you're happy with the current patch, go ahead, let's have it for 3.2; I 
don't foresee significant problems with it. It's unlikely anyone is going to be 
re-using the SSL_ or REDIRECT_ variable names for something other than what 
Apache uses them for. There might be some confusion from IIS users over what 
encoding REMOTE_USER should be in, but I can't see any consistent resolution 
for that issue, and we'll certainly be in a better position than we are now.

--




[issue10155] Add fixups for encoding problems to wsgiref

2012-12-17 Thread And Clover

And Clover added the comment:

(belated close-fixed)

--
resolution:  -> fixed
status: open -> closed
