Serhiy Storchaka added the comment:

I think the changeset that made the decoders use _PyUnicodeWriter
(issue16311) is responsible for the regression.

For example, consider b'\x80abc'.decode('utf-8', 'backslashreplace').

The writer reserves a string buffer of size 4 (every byte produces at most 1
character). The first byte is invalid and is replaced by the 4-character
string '\\x80'. The writer increases min_length but doesn't resize the buffer,
because the buffer is already large enough to hold the replacement string.
The subsequent writes of the three ASCII characters then overflow the buffer.
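
The arithmetic above can be sketched in Python. This only illustrates the
size accounting; it is not the C code of _PyUnicodeWriter, and the variable
names are made up for the example. It assumes an interpreter where
'backslashreplace' is supported for decoding and the overflow is fixed.

```python
data = b'\x80abc'

# Initial reservation: at most one character per input byte.
reserved = len(data)

# What a correct decoder returns for this input (assumes a fixed interpreter).
decoded = data.decode('utf-8', 'backslashreplace')

replacement = '\\x80'        # 4 characters written for the 1 invalid byte
remaining_ascii = data[1:]   # b'abc', one character per byte

# Characters actually needed versus characters initially reserved.
needed = len(replacement) + len(remaining_ascii)
overflow = needed - reserved

print(repr(decoded), reserved, needed, overflow)
```

The replacement alone fits in the reserved buffer, so no resize is triggered;
the shortfall only shows up when the remaining ASCII characters are written.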

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23321>
_______________________________________