STINNER Victor added the comment:

The attached patch works around the CODESET issue on OpenIndiana and FreeBSD. If the LC_CTYPE locale is "C" and nl_langinfo(CODESET) returns ASCII (or an alias of this encoding), b"\xE9" is decoded from the locale encoding: if the result is U+00E9, the patched Python uses ISO-8859-1. (If decoding fails, the locale encoding really is ASCII and the workaround is not used.)
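
The decision logic can be sketched in pure Python as follows. This is only an illustration, not the code from the attached patch: the names `is_ascii_alias`, `choose_encoding` and the `locale_decode` callable are hypothetical, with `locale_decode` standing in for whatever the platform's locale decoder actually does with the byte. The sketch also covers the mismatch case, where a ValueError is raised.

```python
import codecs


def is_ascii_alias(name):
    """Return True if `name` is ASCII or an alias of it known to Python."""
    try:
        return codecs.lookup(name).name == "ascii"
    except LookupError:
        return False


def choose_encoding(lc_ctype, codeset, locale_decode):
    """Pick the encoding to use, given the locale name, the value reported
    by nl_langinfo(CODESET), and a callable that decodes bytes the way the
    locale really does (hypothetical stand-in for the C-level decoder)."""
    # The workaround only applies to the "C" locale announcing ASCII.
    if lc_ctype != "C" or not is_ascii_alias(codeset):
        return codeset
    try:
        decoded = locale_decode(b"\xe9")
    except UnicodeDecodeError:
        # The locale encoding really is ASCII: keep it.
        return codeset
    if decoded == "\xe9":
        # The locale decodes bytes like ISO-8859-1 despite announcing ASCII.
        return "iso8859-1"
    # Any other result is unexpected and is reported as an error.
    raise ValueError("unexpected decoding of b'\\xe9': %r" % decoded)
```

For example, `choose_encoding("C", "US-ASCII", lambda b: b.decode("latin-1"))` models a platform whose "C" locale announces ASCII but decodes like Latin-1, while passing `lambda b: b.decode("ascii")` models a locale that is strictly ASCII.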

If the result is different (b'\xe9' is not decoded from the locale encoding to U+00E9), a ValueError is raised. I wrote this test to detect bugs, and I hope that our buildbots will validate the code. We may choose a different behaviour (e.g. keep ASCII).

Example on FreeBSD 8.2, original Python 3.4:

$ ./python
>>> import sys, locale
>>> sys.getfilesystemencoding()
'ascii'
>>> locale.getpreferredencoding()
'US-ASCII'

Example on FreeBSD 8.2, patched Python 3.4:

$ ./python
>>> import sys, locale
>>> sys.getfilesystemencoding()
'iso8859-1'
>>> locale.getpreferredencoding()
'iso8859-1'

----------
keywords: +patch
Added file: http://bugs.python.org/file27965/workaround_codeset.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16455>
_______________________________________