STINNER Victor <victor.stin...@haypocalc.com> added the comment:

On Friday 19 November 2010 21:58:25 you wrote:
> > I chose to use ASCII instead of UTF-8, because a UTF-8 decoder is long
> > (210 lines) and complex (see PyUnicode_DecodeUTF8Stateful()), whereas
> > an ASCII decoder is just "unicode_char = (Py_UNICODE)byte;" (plus an
> > if before it to check that 0 <= byte <= 127).
>
> I don't think we need 210 lines to replace "*s++ = *f" with proper
> UTF-8 logic. Even if we do, the code can be shared with
> PyUnicode_DecodeUTF8 and a UTF-8 iterator may be a welcome addition to
> the Python C API.

Why should we do that? An ASCII format is just fine. Remember that
PyUnicode_FromFormatV() is part of the C API. I don't think anyone would
use a non-ASCII format string in C. If someone does, (s)he should open a
new issue for it :-) But I don't think we should make the code more
complex if the extra complexity is just useless.

Victor

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9769>
_______________________________________
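
For readers following the thread, the ASCII-only decoding described in the quoted text (a cast plus a range check on each byte) might look roughly like the sketch below. It is not the actual CPython code: sketch_unicode is a hypothetical stand-in for Py_UNICODE so the example compiles without Python.h, and sketch_decode_ascii() is an illustrative name, not a C API function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Py_UNICODE (assumption, so that the sketch
 * builds without Python.h). */
typedef uint32_t sketch_unicode;

/* Illustrative helper, not part of the Python C API.
 * Copies an ASCII-only C string into `out`, one character per byte.
 * Returns the number of characters written, or -1 if a byte is outside
 * the range 0..127 or if `out` is too small. */
static ptrdiff_t
sketch_decode_ascii(const char *s, sketch_unicode *out, size_t out_len)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        unsigned char byte = (unsigned char)*s;

        if (byte > 127)      /* the "if before" check: reject non-ASCII */
            return -1;
        if (n >= out_len)    /* output buffer exhausted */
            return -1;
        out[n++] = (sketch_unicode)byte;   /* the one-line "decoder" */
    }
    return (ptrdiff_t)n;
}
```

A format string such as "%s: %d" passes through unchanged, while any byte above 127 (for example the first byte of a UTF-8 multibyte sequence) makes the helper return -1 instead of producing a wrong character.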