[issue22264] Add wsgiref.util.fix_decoding

2014-08-24 Thread STINNER Victor
STINNER Victor added the comment: Oh, I forgot to mention that I'm not convinced that we should add such a function to the Python stdlib.

[issue22264] Add wsgiref.util.fix_decoding

2014-08-24 Thread STINNER Victor
STINNER Victor added the comment: I don't think that applications are prepared to handle surrogate characters, so I'm not sure that the default error handler should be "surrogateescape". In my experience, text is later encoded to UTF-8 (or latin1 or ascii) and you then get an error from the encoder
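The failure mode described in the comment above can be reproduced with a few lines. This is only a sketch: the byte string and variable names are made up for illustration, and the point is simply that a string produced with "surrogateescape" cannot be encoded again with the default strict handler.

```python
# Hypothetical input: bytes that are not valid UTF-8.
raw = b"caf\xff"

# Decoding with surrogateescape succeeds and smuggles the bad byte
# through as a lone surrogate code point (U+DCFF).
text = raw.decode("utf-8", "surrogateescape")

# A later strict encode, as most applications do, fails.
try:
    text.encode("utf-8")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True

# Only an encode that uses the same handler round-trips the bytes.
round_trip = text.encode("utf-8", "surrogateescape")
```

Code that never re-encodes with "surrogateescape" will therefore see the error at output time rather than at decode time.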

[issue22264] Add wsgiref.util.fix_decoding

2014-08-24 Thread Graham Dumpleton
Graham Dumpleton added the comment: From memory, the term sometimes used on the WEB-SIG when this was discussed was "transcode". I find the idea that it needs 'fixing' or is 'incorrect', as in 'fix the original incorrect decoding to latin-1', a bit misleading as well. It was the only practical way o

[issue22264] Add wsgiref.util.fix_decoding

2014-08-24 Thread STINNER Victor
STINNER Victor added the comment: I don't like "fix" in the name "fix_encoding". It is negative. Why not "decode" or "decode_wsgi"?

[issue22264] Add wsgiref.util.fix_decoding

2014-08-24 Thread Graham Dumpleton
Graham Dumpleton added the comment: It is actually WSGI 1.0.1 and not 1.1. :-) -- nosy: +grahamd

[issue22264] Add wsgiref.util.fix_decoding

2014-08-24 Thread Nick Coghlan
Nick Coghlan added the comment: Current cryptic incantation that requires deep knowledge of the encoding system to follow:

    data = data.encode("latin-1").decode("utf-8", "surrogateescape")

Replacement that is not only more self-documenting, but also gives you something specific to look up
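To see why the incantation works, here is a small sketch of the round trip. It assumes a WSGI server that decodes request bytes as latin-1, and the path and variable names are hypothetical.

```python
# Bytes as they might arrive on the wire: a UTF-8 encoded path.
wire_bytes = "/café".encode("utf-8")

# The server hands the application a str decoded as latin-1, so each
# byte becomes exactly one code point and the text looks garbled.
path_info = wire_bytes.decode("latin-1")

# Encoding back to latin-1 recovers the original bytes unchanged;
# they can then be decoded with the encoding the client really used.
recovered = path_info.encode("latin-1").decode("utf-8", "surrogateescape")
```

The latin-1 step is lossless because latin-1 maps every byte value 0-255 to the code point with the same number.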

[issue22264] Add wsgiref.util.fix_decoding

2014-08-24 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- nosy: +pje

[issue22264] Add wsgiref.util.fix_decoding

2014-08-24 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Could you please provide an example of how this helper will improve stdlib or user code?

[issue22264] Add wsgiref.util.fix_decoding

2014-08-24 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- components: +Library (Lib), Unicode; nosy: +benjamin.peterson, ezio.melotti, haypo, lemburg, pitrou, serhiy.storchaka

[issue22264] Add wsgiref.util.fix_decoding

2014-08-24 Thread Nick Coghlan
Nick Coghlan added the comment: Last tweak, since the purpose is to fix the original incorrect decoding to latin-1, this should be defined as a decoding operation:

    def fix_decoding(data, encoding, errors="surrogateescape"):
        return data.encode("latin-1").decode(encoding, errors)
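A short usage sketch of the proposed helper follows. The function body is the one quoted in the comment; the `as_wsgi_str` helper and the sample values are hypothetical and only stand in for what a WSGI server might pass to an application. Nothing here implies the function was actually added to `wsgiref.util`.

```python
def fix_decoding(data, encoding, errors="surrogateescape"):
    # Proposed helper, as quoted in the comment above.
    return data.encode("latin-1").decode(encoding, errors)


def as_wsgi_str(raw):
    # Hypothetical stand-in for a server decoding raw bytes as latin-1.
    return raw.decode("latin-1")


# Valid UTF-8 input is recovered exactly.
good = fix_decoding(as_wsgi_str("naïve".encode("utf-8")), "utf-8")

# An invalid byte survives as a lone surrogate with the default handler,
# or becomes U+FFFD when the caller asks for "replace".
bad_value = as_wsgi_str(b"abc\xff")
escaped = fix_decoding(bad_value, "utf-8")
replaced = fix_decoding(bad_value, "utf-8", errors="replace")

# With errors="strict" the bad byte is reported at decode time.
try:
    fix_decoding(bad_value, "utf-8", errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

Passing `errors` explicitly lets a caller choose between failing early and carrying undecodable bytes forward, which is the trade-off debated elsewhere in this thread.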