Marc-Andre Lemburg <m...@egenix.com> added the comment:

Antoine Pitrou wrote:
>
> Antoine Pitrou <pit...@free.fr> added the comment:
>
>> I find that the null termination for 8-bit strings makes low-level
>> parsing operations (e.g., parsing a numeric string) safer and easier:
>
> Not to mention faster. The new IO library makes use of it (for newline
> detection), on both bytestrings and unicode strings.

I'd consider that a bug. The IO lib especially should be 8-bit clean, in
the sense that it doesn't attach any special meaning to NUL characters or
code points.

Besides, using a for-loop with a counter is both safer and faster than
checking each and every character for NUL.

Just think of what can happen if you have buggy code that overwrites the
NUL byte in some corner case and other code then relies on the NUL byte
being there as terminator: a classic buffer overrun.

If you're lucky, you get a segfault. If not, you end up with data
corruption or manipulation of data which could lead to unwanted code
execution.

The Python Unicode API deliberately tries to always use the combination
of a Py_UNICODE* pointer and a length integer to avoid such issues.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1943>
_______________________________________
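
To make the pointer-plus-length argument above concrete, here is a minimal C sketch of newline detection written both ways. Both function names are hypothetical and chosen only for illustration; neither is taken from the IO library or from the Python C API, and the sketch works on plain char buffers rather than Py_UNICODE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the index of the first '\n' in
 * buf[0..len), or len if there is none. The explicit length bounds
 * the scan, so an embedded NUL byte is treated as ordinary data and
 * the loop cannot read past the end of the buffer. */
static size_t find_newline_counted(const char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len; i++) {
        if (buf[i] == '\n')
            return i;
    }
    return len;
}

/* Hypothetical helper, for contrast: returns the index of the first
 * '\n' or of the first NUL byte, whichever comes first. It depends on
 * a terminating NUL being present; an embedded NUL ends the scan
 * early, and a missing terminator lets it read out of bounds. */
static size_t find_newline_nul_terminated(const char *buf)
{
    size_t i = 0;

    assert(buf != NULL);
    while (buf[i] != '\0' && buf[i] != '\n')
        i++;
    return i;
}
```

Given the six bytes 'a', 'b', NUL, 'c', '\n', 'd', the counted version reports the newline at index 4, while the NUL-terminated version stops at index 2 and never sees it. Calling the second function on a buffer without a terminator would be undefined behaviour, which is the overrun scenario described above.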