New submission from Bruno Chanal <bcha...@teksavvy.com>:

The short story: Small numbers are not displayed properly when output is 
formatted with str.format or string.Formatter() under a French (language) 
locale or similar. The problem probably extends to other locales.

Long story:

---
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.1 LTS
Release:        18.04
Codename:       bionic
$ python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL, '')
'fr_CA.UTF-8'
>>> print('{:n}'.format(10))    # Garbled output

>>> print('{:n}'.format(10000)) # OK
10 000
>>> # Note: narrow non-break space used as thousands separator
... pass
>>> locale.format_string('%d', 10, grouping=True)      # OK
'10'
>>> locale.format_string('%d', 10123)                  # OK
'10123'
>>> locale.format_string('%d', 10123, grouping=True)   # OK, thousands separator \u202f
'10\u202f123'
>>> import string
>>> print(string.Formatter().format('{:n}', 10))  # Same problem with Formatter
AB
>>> print(string.Formatter().format('{:n}', 10000))
10 000

Locale-aware functions implementing the {:n} format code, such as str.format 
and string.Formatter, generate garbled output for small numbers under a French 
locale.

However, locale.format_string('%d', numeric_value) produces valid strings. In 
other words, it's a workaround for the time being...
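
A minimal sketch of that workaround, plus a check that compares it against the 
{:n} path. The helper names format_int_grouped and find_mismatches are 
hypothetical (they are not part of the report or of the standard library), and 
the sketch assumes that both paths should agree for integers in a healthy 
locale:

```python
import locale


def format_int_grouped(value):
    """Workaround from the report: group digits via locale.format_string."""
    return locale.format_string('%d', value, grouping=True)


def find_mismatches(values):
    """Return (value, n_result, workaround_result) where the two paths disagree."""
    mismatches = []
    for value in values:
        via_n = '{:n}'.format(value)               # path reported as garbled
        via_workaround = format_int_grouped(value)  # path reported as OK
        if via_n != via_workaround:
            mismatches.append((value, via_n, via_workaround))
    return mismatches
```

After locale.setlocale(locale.LC_ALL, ''), calling find_mismatches(range(1000)) 
on an affected system would presumably list the small values. On an unaffected 
system it should return an empty list.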

The problem seems to originate from a new version of Ubuntu: I ran the same 
program about 18 months ago and didn't notice any problem.

My 0.02 $ worth of analysis: the output from str.format is a random, changing 
value for small numbers. The behavior is reminiscent of invalid memory reads in 
C functions, e.g., a parameter mismatch in a function call, or similar. It 
feels like format does not expect, or deal properly with, multi-byte Unicode 
characters as part of numbers. However, the thousands separator is a NARROW 
NO-BREAK SPACE (U+202F) in most Ubuntu French locales (and quite a few others).
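
To check which separator the active locale actually uses, something along these 
lines may help. It is only a diagnostic sketch: describe_separator and 
describe_thousands_sep are hypothetical helper names, and the sketch does not 
confirm that the multi-byte separator is the cause of the garbled output:

```python
import locale
import unicodedata


def describe_separator(sep):
    """Summarize a separator string: code points, Unicode names, UTF-8 size."""
    return {
        'separator': sep,
        'code_points': ['U+{:04X}'.format(ord(ch)) for ch in sep],
        'names': [unicodedata.name(ch, '<unnamed>') for ch in sep],
        'utf8_bytes': len(sep.encode('utf-8')),
    }


def describe_thousands_sep():
    """Describe the thousands separator of the currently active locale."""
    return describe_separator(locale.localeconv()['thousands_sep'])
```

For the separator seen in the session above, describe_separator('\u202f') 
reports a single code point, U+202F, that takes three bytes in UTF-8. A 
single-byte separator such as ',' would report one byte instead.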

The problem shows up in Python 3.6 and 3.7.

This might also be a security issue...

----------
components: Interpreter Core
messages: 331254
nosy: canuck7
priority: normal
severity: normal
status: open
title: str.format and string.Formatter bug with French (and other) locale
type: behavior
versions: Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35432>
_______________________________________