[issue26717] wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO

2016-04-20 Thread Anthony Sottile
Anthony Sottile added the comment: The PEP states that environ variables are str values decoded using latin1: https://www.python.org/dev/peps/pep-/#id19 Therefore, to get the original bytes, one must encode using latin1.
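The round trip described in this comment can be sketched as follows. This is a minimal illustration, not wsgiref code; the sample path and the variable names are assumptions chosen for the example.

```python
# Hypothetical request path: the UTF-8 bytes for "/café" as sent on the wire.
raw = b"/caf\xc3\xa9"

# A WSGI server exposes the bytes to the application as a latin1-decoded str.
path_info = raw.decode("latin-1")

# Encoding with latin1 again gives back exactly the bytes from the wire.
recovered = path_info.encode("latin-1")

# The application can then decode with whatever encoding it expects.
text = recovered.decode("utf-8")
```

Because latin1 maps every byte value to exactly one code point, the decode/encode pair loses no information, which is why the application can still choose UTF-8 afterwards.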

[issue26717] wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO

2016-04-20 Thread Александр Эри
Александр Эри added the comment: Why does wsgiref use latin1? It must use utf-8. -- keywords: +patch nosy: +Александр Эри Added file: http://bugs.python.org/file42531/simple_server.py.diff

[issue26717] wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO

2016-04-17 Thread Martin Panter
Changes by Martin Panter: resolution: -> fixed; stage: patch review -> resolved; status: open -> closed

[issue26717] wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO

2016-04-16 Thread Roundup Robot
Roundup Robot added the comment: New changeset 1f2cfcd5a83f by Martin Panter in branch '3.5': Issue #26717: Stop encoding Latin-1-ized WSGI paths with UTF-8 https://hg.python.org/cpython/rev/1f2cfcd5a83f New changeset 815a4ac67e68 by Martin Panter in branch 'default': Issue #26717: Merge

[issue26717] wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO

2016-04-08 Thread Anthony Sottile
Anthony Sottile added the comment: Forgot to remove the pyver code (leaning a bit too much on pre-commit) -- Added file: http://bugs.python.org/file42405/patch

[issue26717] wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO

2016-04-08 Thread Martin Panter
Martin Panter added the comment: Thanks, this version looks pretty good to me.

[issue26717] wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO

2016-04-08 Thread Anthony Sottile
Anthony Sottile added the comment: Updates after review. -- Added file: http://bugs.python.org/file42404/patch

[issue26717] wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO

2016-04-08 Thread Martin Panter
Martin Panter added the comment: I was going to say your original fix was the reverse of a change in r86146. But you seem to be fixing the problems before I express them :) For the fix, I would suggest that something like unquote(path, "latin-1") would be simpler. I left some other review comments
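The effect of the suggested unquote(path, "latin-1") call can be seen in isolation with urllib.parse.unquote. This is only a sketch of the call's behaviour on a percent-encoded path, not the committed patch; the variable names are illustrative.

```python
from urllib.parse import unquote

# Percent-encoded form of the problem path from this issue.
quoted = "/%80"

# With the default encoding (UTF-8, errors="replace"), the lone 0x80 byte is
# not valid UTF-8 and is replaced with U+FFFD.
default_result = unquote(quoted)

# With latin-1, the byte maps straight to the code point U+0080, which is the
# latin1-decoded form expected in PATH_INFO.
latin1_result = unquote(quoted, "latin-1")
```

Encoding latin1_result with latin1 yields b'/\x80' again, so the application sees the same bytes that the client sent.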

[issue26717] wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO

2016-04-08 Thread Anthony Sottile
Anthony Sottile added the comment: Oops, broke b'/%80'. Here's a better fix that now takes: (on the wire) b'\x80' -(decode latin1)-> u'\x80' -(encode utf-8)-> b'\xc2\x80' -(decode latin1)-> u'\xc2\x80' to: (on the wire) b'\x80' -(decode latin1)-> u'\x80' -(encode latin1) -> b'\x80'
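The two chains in this comment can be reproduced with plain str and bytes operations. This is a sketch of the encoding steps only, assuming the wire bytes b'/\x80'; it does not call wsgiref, and the variable names are made up for the example.

```python
# Bytes as they arrive on the wire.
wire = b"/\x80"

# Step shared by both chains: decode the request path as latin1.
decoded = wire.decode("latin-1")

# Old behaviour: re-encode as UTF-8, then decode as latin1 for the environ.
# U+0080 becomes the two bytes 0xC2 0x80, which show up as two characters.
buggy_path_info = decoded.encode("utf-8").decode("latin-1")

# Fixed behaviour: re-encode as latin1, so the original byte survives.
fixed_path_info = decoded.encode("latin-1").decode("latin-1")

print(buggy_path_info.encode("latin-1"))
print(fixed_path_info.encode("latin-1"))
```

The first print shows the extra 0xC2 byte that the application received before the fix; the second shows the bytes matching the wire.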

[issue26717] wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO

2016-04-08 Thread Anthony Sottile
Anthony Sottile added the comment: A few typos in my previous comment, pressed enter too quickly, here's an updated comment: Patch attached with test. In summary: A request to the url b'/\x80' appears to the application as a request to b'/\xc2\x80' -- The issue being the latin1 decoded

[issue26717] wsgiref.simple_server: mojibake with cp1252 bytes in PATH_INFO

2016-04-08 Thread Anthony Sottile
New submission from Anthony Sottile: Patch attached with test. In summary: A request to the url b'/\x80' appears to the application as a request to b'\xc2\x80' -- The issue being the latin1 decoded PATH_INFO is re-encoded as UTF-8 and then decoded as latin1 (on the wire) b'\x80' -(decode