STINNER Victor <victor.stin...@haypocalc.com> added the comment:

> My remark is that utf-8 tend to be applied to all kind of files;
> if someone once decide that non-ascii chars are allowed in (some) 
> string constants, they will be stored in utf-8.

In this case, it would be better to raise an error on any non-ASCII byte 
in the format string. Raising an error is better than interpreting UTF-8 as 
ISO-8859-1 (mojibake!). Since nobody noticed this bug 
(PyUnicode_FromFormat/PyErr_Format expect ISO-8859-1), I suppose that nobody 
uses non-ASCII format strings.
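
To illustrate the mojibake concern, here is a minimal Python sketch (not part 
of the patch). It decodes UTF-8 bytes as ISO-8859-1, which silently produces 
garbage, and then as ASCII, which fails loudly. The sample string is only an 
illustration.

```python
# "cafe" with an acute accent on the e: one non-ASCII character (U+00E9)
text = "caf\u00e9"
utf8_bytes = text.encode("utf-8")  # b'caf\xc3\xa9'

# Lenient decoding: every byte maps to some ISO-8859-1 character, so this
# never fails, but the two UTF-8 bytes become two wrong characters.
mojibake = utf8_bytes.decode("iso-8859-1")

# Strict decoding: the first byte >= 0x80 raises UnicodeDecodeError.
try:
    utf8_bytes.decode("ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(repr(mojibake), strict_failed)
```

The lenient path returns a five-character string where the original had four, 
and nothing signals the corruption. The strict path makes the problem visible 
at once.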

Python builtin errors are not localised. If an application uses gettext, I 
suppose that the error will be raised in the Python code, not in the C API.

Attached patch changes PyUnicode_FromFormatV (and so PyUnicode_FromFormat and 
PyErr_Format) to reject any non-ASCII byte in the format string. I 
added a test and documented the format string encoding (which is now ASCII). 
See also #9738 for the documentation about function argument encoding.
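
The check itself is simple. Below is a rough Python model of the idea, not the 
C code from the patch: the function name check_ascii_format is hypothetical, 
and the choice of ValueError and the message text are assumptions made for 
this sketch.

```python
def check_ascii_format(fmt: bytes) -> bytes:
    """Return fmt unchanged if every byte is ASCII, else raise ValueError.

    Hypothetical helper: it models the rejection rule only and does no
    formatting.
    """
    for index, byte in enumerate(fmt):
        if byte > 127:
            # Exception type and wording are assumptions of this sketch.
            raise ValueError(
                "format string must be ASCII, got byte 0x%02x at offset %d"
                % (byte, index)
            )
    return fmt


# A pure ASCII format string passes through untouched.
ok = check_ascii_format(b"%s: %d items")

# A UTF-8 encoded format string is rejected at its first non-ASCII byte.
try:
    check_ascii_format("caf\u00e9 %s".encode("utf-8"))
    rejected = False
except ValueError:
    rejected = True
```

Non-ASCII text can still reach the result through the arguments (for example 
a %s argument with a documented encoding); only the format string itself is 
restricted.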

----------
keywords: +patch
Added file: http://bugs.python.org/file18800/pyunicode_fromformat_ascii.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9769>
_______________________________________