Jeroen Ruigrok van der Werven <asmo...@in-nomine.org> added the comment:
I will first point out where our current implementation is broken, in my opinion of course, after which I propose a small patch.

Both C90 (7.4.1.1) and C99 (7.11.1.1) state:

"A value of "C" for locale specifies the minimal environment for C translation; a value of "" for locale specifies the locale-specific native environment. Other implementation-defined strings may be passed as the second argument to setlocale. [...] If a pointer to a string is given for locale and the selection can be honored, the setlocale function returns a pointer to the string associated with the specified category for the new locale. If the selection cannot be honored, the setlocale function returns a null pointer and the program’s locale is not changed."

Note that neither C nor POSIX defines any errors for setlocale(), nor does either have it set errno or the like. It simply returns a null pointer.

In C you would typically start your program with a call like:

    #include <locale.h>

    int main(int argc, char *argv[]) {
        setlocale(LC_CTYPE, "");
        ...
    }

This will try to set the locale to what the native environment specifies, but will not error out if the value, if any, it receives does not map to a valid locale. It will return a null pointer if it cannot set the locale. Execution continues and the locale remains the default "C".

Our current behaviour in Python does not adhere to these semantics. To illustrate:

    # Obviously non-existent locale
    >>> from locale import setlocale, LC_CTYPE
    >>> setlocale(LC_CTYPE, 'B')
    Error: unsupported locale setting

    # Valid locale, but not available on my system
    >>> from os import getenv
    >>> from locale import setlocale, LC_CTYPE
    >>> getenv('LANG')
    'cy_GB.UTF-8'
    >>> setlocale(LC_CTYPE, '')
    Error: unsupported locale setting

Neither Perl nor PHP throws any error when setlocale() is passed an invalid locale. Python is being unnecessarily disruptive by throwing an error.

As such I think PyLocale_setlocale() in Modules/_localemodule.c needs to be adjusted. Patch against trunk enclosed.
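For comparison, the C-style "return a null value instead of raising" semantics can be approximated on top of the current behaviour by catching locale.Error. This is only a minimal sketch to illustrate the idea, not the enclosed patch; the helper name try_setlocale is hypothetical and does not exist in the locale module.

```python
import locale


def try_setlocale(category, value=""):
    """Hypothetical helper: mimic C setlocale() by returning None on failure."""
    try:
        # On success, locale.setlocale() returns the new locale string.
        return locale.setlocale(category, value)
    except locale.Error:
        # The selection could not be honoured; the locale is left unchanged.
        return None


if __name__ == "__main__":
    # "C" must always be available, so this prints the locale string.
    print(try_setlocale(locale.LC_CTYPE, "C"))
    # Whether 'B' is rejected depends on the platform's C library.
    print(try_setlocale(locale.LC_CTYPE, "B"))
```

A caller can then test the return value against None, much as a C program tests the pointer returned by setlocale().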
The patch changes the semantics of our current implementation to the following:

    >>> from locale import setlocale, LC_CTYPE
    >>> rv = setlocale(LC_CTYPE, 'B')
    >>> type(rv)
    <class 'NoneType'>
    >>> rv = setlocale(LC_CTYPE, 'C')
    >>> type(rv)
    <class 'str'>
    >>> rv
    'C'

----------
Added file: http://bugs.python.org/file13843/locale.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1443504>
_______________________________________