Inada Naoki <songofaca...@gmail.com> added the comment:

> I think that it is more correct to use the locale encoding. If error messages 
> are translated for readability, we should not ruin this by outputting \xXX.

* PyUnicode_DecodeLocale() doesn't support the "backslashreplace" error handler.
* Error messages are usually encoded in the locale encoding, but this is not guaranteed.
* An error message may contain a path, which may not be in the locale encoding either.
* \xXX is far better than UnicodeDecodeError, anyway. We need to fix the 
UnicodeDecodeError first.
* Non-UTF-8 locales are rare. We have used this code for a long time, but nobody 
had reported this issue until now.
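
To illustrate the trade-off in the points above, here is a minimal Python-level 
sketch. It is only an analogue of the C code under discussion, not the actual 
patch: the message bytes and the decode_message() helper are hypothetical, and 
UTF-8 stands in for whatever encoding the decoder assumes.

```python
# Hypothetical error message bytes produced in a non-UTF-8 encoding (Latin-1).
raw = "Datei nicht gefunden: caf\xe9".encode("latin-1")


def decode_message(data: bytes, errors: str) -> str:
    """Decode message bytes as UTF-8 with the given error handler."""
    return data.decode("utf-8", errors=errors)


# "strict" fails on the undecodable byte 0xE9.
try:
    decode_message(raw, "strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "backslashreplace" keeps the message readable and shows the bad byte as \xe9.
escaped = decode_message(raw, "backslashreplace")

print(strict_failed)
print(escaped)
```

With "backslashreplace" the undecodable byte is shown as an escape sequence 
while the rest of the message is still delivered, instead of the message being 
lost to an exception.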

I am not against adding "backslashreplace" to PyUnicode_DecodeLocale(). But to 
backport the bugfix for the UnicodeDecodeError, the change should be minimal.

So the main question is: should we allow surrogateescape in error messages?
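
As a rough sketch of what is at stake in that question (Python-level only, with 
hypothetical sample bytes): "surrogateescape" never fails while decoding, but 
the resulting string contains lone surrogates, which cannot be encoded again 
unless the same handler is used.

```python
# Hypothetical undecodable message fragment (byte 0xE9 is not valid UTF-8 here).
raw = b"caf\xe9"

# Decoding succeeds; the bad byte becomes the lone surrogate U+DCE9.
text = raw.decode("utf-8", errors="surrogateescape")

# Round trip works only when the same handler is used for encoding.
round_trip = text.encode("utf-8", errors="surrogateescape")

# A strict encode (as a UTF-8 output stream might do) fails on the surrogate.
try:
    text.encode("utf-8")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True

print(ascii(text))
print(round_trip == raw)
print(strict_encode_failed)
```

So an error message decoded this way can raise again later, at the point where 
it is written out with a strict encoder.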

For the record, PyUnicode_DecodeLocale() uses mbstowcs(). I don't know how 
reliable that function is on various platforms. That is why I suggested 
PyUnicode_DecodeFSDefault() at first.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41894>
_______________________________________