New submission from Serhiy Storchaka:

The UTF-7 incremental decoder can crash in a debug build when it decodes an unfinished base-64 section. In a non-debug build it just produces an inconsistent unicode string. Minimal examples:

$ ./python -c "import codecs; codecs.utf_7_decode(b'a+AIA', 'strict')"
python: Objects/unicodeobject.c:403: _PyUnicode_CheckConsistency: Assertion `maxchar >= 128' failed.
Aborted (core dumped)
$ ./python -c "import codecs; codecs.utf_7_decode(b'+AIA-+AQA', 'strict')"
python: Objects/unicodeobject.c:410: _PyUnicode_CheckConsistency: Assertion `maxchar >= 0x100' failed.
Aborted (core dumped)
$ ./python -c "import codecs; codecs.utf_7_decode(b'+AQA-+2ADcAA', 'strict')"
python: Objects/unicodeobject.c:414: _PyUnicode_CheckConsistency: Assertion `maxchar >= 0x10000' failed.
Aborted (core dumped)

This happens because _PyUnicodeWriter moves its position back to the start of the unfinished base-64 section, but its buffer has already been widened for the characters in that section:

    if (inShift) {
        writer.pos = shiftOutStart; /* back off output */
        *consumed = startinpos;
    }

As a result, _PyUnicodeWriter generates a string with a kind larger than needed for the decoded characters.

This bug causes a lot of crashes on the buildbots, e.g.:

http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/1197
http://buildbot.python.org/all/builders/AMD64%20Ubuntu%20LTS%203.3/builds/1446

----------
components: Interpreter Core, Unicode
messages: 210444
nosy: ezio.melotti, haypo, serhiy.storchaka
priority: high
severity: normal
stage: needs patch
status: open
title: Segfault in UTF-7 incremental decoder
type: crash
versions: Python 3.3, Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20538>
_______________________________________
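
For reference, the incremental case described in this report can also be exercised from Python. The following is only a rough sketch, not a proposed test: it assumes an interpreter where an unfinished base-64 section is held back until more input arrives, and the helper name decode_in_chunks is made up for this illustration (codecs.getincrementaldecoder is the standard library call). The inputs are the three byte strings from the report with each base-64 section closed by '-'. On an affected debug build the loop would be expected to hit the same assertions rather than finish.

```python
import codecs


def decode_in_chunks(data, chunk_size=1):
    """Decode UTF-7 bytes incrementally, chunk_size bytes at a time.

    Hypothetical helper, for illustration only.
    """
    decoder = codecs.getincrementaldecoder('utf-7')('strict')
    pieces = []
    for i in range(0, len(data), chunk_size):
        # A chunk may end in the middle of a base-64 section; the decoder
        # is expected to buffer that section and emit nothing for it yet.
        pieces.append(decoder.decode(data[i:i + chunk_size]))
    # Flush whatever is still buffered.
    pieces.append(decoder.decode(b'', final=True))
    return ''.join(pieces)


# The inputs from the report, with the trailing base-64 section terminated.
samples = [b'a+AIA-', b'+AIA-+AQA-', b'+AQA-+2ADcAA-']

for sample in samples:
    expected = sample.decode('utf-7')
    # Every chunk size forces a different split point inside the
    # base-64 sections; the result must not depend on the split.
    for size in range(1, len(sample) + 1):
        assert decode_in_chunks(sample, size) == expected
```

Splitting at every possible chunk size is a deliberate choice here: it covers both the narrow-to-wide transitions (1-byte to 2-byte to 4-byte kinds) and the case where a chunk boundary falls between the two halves of a surrogate pair.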