[issue9769] PyUnicode_FromFormatV() doesn't handle non-ascii text correctly

STINNER Victor Fri, 19 Nov 2010 16:16:04 -0800

STINNER Victor <[email protected]> added the comment:

On Friday 19 November 2010 21:58:25 you wrote:
> > I chose ASCII instead of UTF-8 because a UTF-8 decoder is long
> > (210 lines) and complex (see PyUnicode_DecodeUTF8Stateful()), whereas
> > an ASCII decoder is just "unicode_char = (Py_UNICODE)byte;" plus an if
> > beforehand to check that 0 <= byte <= 127.
> 
> I don't think we need 210 lines to replace "*s++ = *f" with proper
> UTF-8 logic.  Even if we do, the code can be shared with
> PyUnicode_DecodeUTF8 and a UTF-8 iterator may be a welcome addition to
> Python C API.


Why should we do that? An ASCII-only format string is just fine. Remember that
PyUnicode_FromFormatV() is part of the C API. I don't think that anyone would
pass a non-ASCII format string from C. If someone does need that, (s)he should
open a new issue for it :-) But I don't think that we should make the code
more complex for a case that nobody uses.
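
To make the ASCII-only approach concrete, here is a minimal sketch of the kind of loop I mean. It is not the actual code in unicodeobject.c: demo_unichar and demo_ascii_decode are hypothetical names invented for this example, and demo_unichar only stands in for Py_UNICODE so that the snippet compiles without Python.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Py_UNICODE (assumption: the real type
 * comes from Python.h and may have a different width). */
typedef uint32_t demo_unichar;

/* Copy an ASCII-only, NUL-terminated byte string into out, which has
 * room for out_len characters.
 * Returns the number of characters written, or -1 if a byte is outside
 * the range 0..127 or if the output buffer is too small. */
static ptrdiff_t
demo_ascii_decode(const char *s, demo_unichar *out, size_t out_len)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        /* Read the byte as unsigned so the range check also works
         * on platforms where plain char is signed. */
        unsigned char byte = (unsigned char)*s;

        if (byte > 127)
            return -1;      /* non-ASCII byte: reject the string */
        if (n >= out_len)
            return -1;      /* output buffer is full */
        out[n++] = (demo_unichar)byte;
    }
    return (ptrdiff_t)n;
}
```

The whole decoder is one cast and one range check per byte. A UTF-8 decoder would also have to handle multi-byte sequences and invalid input, and that is the extra complexity I want to avoid here.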

Victor

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue9769>
_______________________________________