2017-01-12 1:23 GMT+01:00 INADA Naoki <songofaca...@gmail.com>:
> I'm ±0 to surrogateescape by default. I feel +1 for stdout and -1 for stdin.

The use case is to be able to write a Python 3 program which works with
UNIX pipes without failing with encoding errors:
https://www.python.org/dev/peps/pep-0540/#producer-consumer-model-using-pipes

If you want something stricter, there is the UTF-8 Strict mode, which
prevents mojibake everywhere. I'm not sure that the UTF-8 Strict mode
is really useful. When I implemented it, I quickly understood that
using strict *everywhere* is just a dead end: it would fail in too many
places.
https://www.python.org/dev/peps/pep-0540/#use-the-strict-error-handler-for-operating-system-data

I'm not even sure yet that a Python 3 with stdin using strict is "usable".

> In output case, surrogateescape is weaker than strict, but it only allows
> surrogateescaped binary. If program carefully use surrogateescaped decode,
> surrogateescape on stdout is safe enough.

What do you mean by "carefully use surrogateescaped decode"?

The rationale for using surrogateescape on stdout is to support this use case:
https://www.python.org/dev/peps/pep-0540/#list-a-directory-into-stdout

> On the other hand, surrogateescape is very weak for input. It accepts
> arbitrary bytes.
> It should be used carefully.

In my experience with the Python bug tracker, almost nobody understands
Unicode and locales. For the "Producer-consumer model using pipes" use
case, the encoding issues of Python 3.6 can be a blocker. Some
developers may prefer a different programming language which doesn't
bother them with Unicode: basically, *all* other programming languages, no?

> But I agree different encoding handler between stdin/stdout is not beautiful.
> That's why I'm ±0.

That's why there are two modes: UTF-8 and UTF-8 Strict. But I'm not
100% sure yet which encodings and error handlers should be used ;-)

I started to play with my PEP 540 implementation. I already had to
update PEP 540 and its implementation for Windows.
On Windows, os.fsdecode/fsencode now uses surrogatepass, not
surrogateescape (Python 3.5 uses strict on Windows).

Victor
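
For readers less familiar with the three error handlers discussed in this message, here is a rough sketch of how they differ. It is only an illustration and is not taken from the PEP 540 implementation; the sample bytes and variable names are made up for the example.

```python
# Bytes as they might arrive on a pipe: valid UTF-8 ("café ") followed by
# a stray Latin-1 byte (0xE9) that is not valid UTF-8. Hypothetical input.
raw = b"caf\xc3\xa9 \xe9\n"

# strict: decoding stops at the first undecodable byte.
try:
    raw.decode("utf-8", "strict")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape: the undecodable byte 0xE9 is mapped to the lone
# surrogate U+DCE9, and encoding with the same handler restores the
# original bytes, so a pipe filter can pass the data through unchanged.
text = raw.decode("utf-8", "surrogateescape")
roundtrip = text.encode("utf-8", "surrogateescape")

# surrogatepass: a lone surrogate is encoded like an ordinary code point.
lone = "\ud800"
passed = lone.encode("utf-8", "surrogatepass")

# surrogateescape only handles U+DC80..U+DCFF when encoding, so it
# rejects U+D800.
try:
    lone.encode("utf-8", "surrogateescape")
    escape_ok = True
except UnicodeEncodeError:
    escape_ok = False

# ascii() is used so that printing does not itself fail on a strict stdout.
print(strict_ok, ascii(text), roundtrip == raw, passed, escape_ok)
```

The round trip is what makes surrogateescape attractive on stdout: bytes that were decoded with it come back out unchanged. On input, the same property means arbitrary bytes are accepted silently, which is the concern raised in the quoted text.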