New submission from STINNER Victor <vstin...@redhat.com>:

The decimal module fails to format a number with the "n" format type if 
the LC_NUMERIC locale uses a different encoding than the LC_CTYPE locale. 

Example with attached decimal_locale.py on Fedora 29 with Python 3.7.2:

$ python3 decimal_locale.py 
LC_NUMERIC locale: uk_UA.koi8u
decimal_point: ',' = ',' = U+002c
thousands_sep: '\xa0' = '\xa0' = U+00a0
Traceback (most recent call last):
  File "/home/vstinner/decimal_locale.py", line 16, in <module>
    text = format(num, "n")
ValueError: invalid decimal point or unsupported combination of LC_CTYPE and 
LC_NUMERIC
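
The attached script is not reproduced in this message. A minimal sketch of a similar reproducer might look like the following; the helper name format_n and the chosen number are assumptions, not taken from decimal_locale.py:

```python
import decimal
import locale


def format_n(numeric_locale, num):
    """Format num with the "n" type under the given LC_NUMERIC locale.

    Hypothetical helper, not the attached script. Returns None if the
    requested locale is not installed on this system.
    """
    saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        try:
            locale.setlocale(locale.LC_NUMERIC, numeric_locale)
        except locale.Error:
            return None
        lc = locale.localeconv()
        print("decimal_point:", ascii(lc["decimal_point"]))
        print("thousands_sep:", ascii(lc["thousands_sep"]))
        # On affected versions, this call raises ValueError when the
        # LC_NUMERIC encoding differs from the LC_CTYPE encoding and
        # a separator is non-ASCII.
        return format(num, "n")
    finally:
        # Always restore the previous LC_NUMERIC locale.
        locale.setlocale(locale.LC_NUMERIC, saved)
```

Calling format_n("uk_UA.koi8u", decimal.Decimal(1234567)) with a UTF-8 LC_CTYPE locale should exercise the case from this report, provided the uk_UA.koi8u locale is installed.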

The attached PR modifies the _decimal module to support this corner case.

Note: I already wrote PR 5191 last year, but I abandoned it in the 
meantime.

--

Supporting non-ASCII decimal point and thousands separator has a long history 
and a list of now fixed issues:

* bpo-7442
* bpo-13706
* bpo-25812
* bpo-28604 (LC_MONETARY)
* bpo-31900
* bpo-33954

I even wrote an article about these bugs :-)
https://github.com/python/cpython/pull/5191

Python 3.7.2 now supports different encodings for the LC_NUMERIC, LC_MONETARY 
and LC_CTYPE locales. format(int, "n") temporarily sets LC_CTYPE to the 
LC_NUMERIC locale to decode decimal_point and thousands_sep from the correct 
encoding. The LC_CTYPE locale is only changed if it differs from the LC_NUMERIC 
locale and if the decimal point and/or thousands separator is non-ASCII. This 
is implemented in the following function:

int
_Py_GetLocaleconvNumeric(struct lconv *lc,
                         PyObject **decimal_point, PyObject **thousands_sep)

This function is used by locale.localeconv() and format() (for the "n" type).
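
The save/switch/restore part of that logic can be illustrated in Python. This is only a simplified sketch, not the C implementation: the name ctype_as_numeric is hypothetical, and the sketch omits the check that the decimal point or thousands separator is non-ASCII.

```python
import contextlib
import locale


@contextlib.contextmanager
def ctype_as_numeric():
    """Temporarily set LC_CTYPE to the LC_NUMERIC locale if they differ.

    Hypothetical illustration; yields True if LC_CTYPE was changed.
    """
    old_ctype = locale.setlocale(locale.LC_CTYPE)  # query only
    numeric = locale.setlocale(locale.LC_NUMERIC)  # query only
    changed = old_ctype != numeric
    if changed:
        locale.setlocale(locale.LC_CTYPE, numeric)
    try:
        yield changed
    finally:
        if changed:
            # Restore the original LC_CTYPE locale.
            locale.setlocale(locale.LC_CTYPE, old_ctype)
```

Inside the with block, locale-dependent byte strings such as decimal_point would be decoded using the LC_NUMERIC encoding. Outside of it, LC_CTYPE is back to its previous value.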

I decided to fix the bug while I was fixing other locale bugs, because we had 
received enough bug reports by then.

Copy of my msg309980:

"""
> I would not consider this a bug in Python, but rather in the locale settings 
> passed to setlocale().

For the past 10 years, I have told every single user I met that "Python 3 is 
right, your system setup is wrong". But that's a waste of time. People continue 
to associate Python 3 and Unicode with annoying bugs, because they don't 
understand how locales work.

Instead of having to repeat to each user that "hum, maybe your config is 
wrong", I prefer to support this unconventional setup and make it work as 
expected ("it just works"). With my latest implementation, setlocale() is only 
called when LC_CTYPE and LC_NUMERIC are different, which is the corner case 
that "shouldn't occur in practice".
"""

----------
components: Library (Lib)
files: decimal_locale.py
messages: 333302
nosy: vstinner
priority: normal
severity: normal
status: open
title: decimal: formatter error if LC_NUMERIC uses a different encoding than 
LC_CTYPE
versions: Python 3.8
Added file: https://bugs.python.org/file48038/decimal_locale.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35697>
_______________________________________