Serhiy Storchaka added the comment: I think the changeset that made decoders use _PyUnicodeWriter (issue16311) is responsible for the regression.
For example, consider b'\x80abc'.decode('utf-8', 'backslashreplace'). The writer reserves a string buffer of size 4 (every byte produces at most 1 character). The first byte is invalid and is replaced by the 4-character string '\\x80'. The writer increases min_length but doesn't resize the buffer, because its current size is enough to hold the replacement string. But the subsequent writes of the ASCII characters overflow the buffer.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23321>
_______________________________________
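The arithmetic in the comment above can be replayed in pure Python. This is only a toy model of the bookkeeping, not the actual _PyUnicodeWriter code: the names `reserved`, `written`, and `overflow` are illustrative, and the sketch assumes an interpreter where decoding with 'backslashreplace' works correctly (Python 3.5 or later).

```python
data = b'\x80abc'

# Initial reservation described in the comment: one character per input byte.
reserved = len(data)

# On a fixed interpreter the call returns the 4-character replacement
# followed by the ASCII tail.
decoded = data.decode('utf-8', 'backslashreplace')

# Replay the writes against the original reservation. The replacement
# string fills all 4 reserved slots, so each later character would land
# past the end of a buffer that was never resized.
written = 0
overflow = []
for ch in decoded:
    if written >= reserved:
        overflow.append(ch)
    written += 1

print(decoded)                  # \x80abc
print(reserved, len(decoded))   # 4 7
print(overflow)                 # ['a', 'b', 'c']
```

The gap between the 4 reserved slots and the 7 characters actually produced is exactly the three ASCII characters written after the replacement.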