New submission from Alexander Sturm <ast...@hbk.com>:

This issue can be reproduced using Python 3.5.2 - 3.5.5 compiled with VS2015 
(tested with both the official Python builds and local builds using VS2015 
Version 14.0.25431.01 Update 3) on 64-bit Windows.

To reproduce, run the attached file, or run:

    python -c "import operator; operator.attrgetter('x'*1000000)"
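
For checking several interpreters, it may be convenient to run the one-liner 
in a child process so that a crash is reported as an exit status rather than 
killing the calling script. The following is only a sketch: run_repro and its 
"python" parameter are names made up for this example, not part of the 
attached repro.py.

```python
import subprocess
import sys

# The one-liner from this report, unchanged apart from spacing.
SNIPPET = "import operator; operator.attrgetter('x' * 1000000)"


def run_repro(python=sys.executable):
    """Run the snippet in a child interpreter and return its exit status.

    `python` is the path of the interpreter to test (assumption: defaults
    to the interpreter running this script). A crash in the child shows up
    as a nonzero status instead of taking down the caller.
    """
    result = subprocess.run([python, "-c", SNIPPET])
    return result.returncode


if __name__ == "__main__":
    print("exit status:", run_repro())
```

An unaffected build should exit with status 0; the exact nonzero status 
reported after a crash depends on the platform.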

The segfault itself occurs in attrgetter_new [1], while scanning for dots 
inside a unicode string. From my reading, the C code itself is correct. 
Instead, it triggers a compiler bug in VS2015, which miscompiles the loop by 
applying an incorrect optimization.

The optimization VS2015 applies is twofold: It uses SIMD instructions to avoid 
looking at every byte of the input string individually, and it tries to 
eliminate the condition in PyUnicode_READ [2] by executing all three branches 
(UCS1, UCS2 and UCS4) at once and then in a separate step discarding the 
results for the two branches that would not be taken.
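
For readers unfamiliar with PyUnicode_READ, the intended (scalar) behaviour 
can be modelled in Python roughly as follows. This is a simplified model, not 
the CPython source: unicode_read, count_dots and encode_as_kind are names 
invented for this sketch, and the buffers are plain bytes objects rather than 
real string internals.

```python
import struct
import sys

# struct format code for one code unit of each kind (1, 2 or 4 bytes wide),
# standing in for the UCS1/UCS2/UCS4 branches of the C macro.
_UNIT_FORMAT = {1: "B", 2: "H", 4: "I"}


def unicode_read(kind, data, index):
    """Model of PyUnicode_READ: read code unit `index` from `data`,
    interpreting the buffer according to `kind`."""
    (value,) = struct.unpack_from("=" + _UNIT_FORMAT[kind], data, index * kind)
    return value


def count_dots(kind, data, length):
    """Scalar model of the dot scan: exactly one read per character,
    and only through the branch selected by `kind`."""
    dots = 0
    for i in range(length):
        if unicode_read(kind, data, i) == ord("."):
            dots += 1
    return dots


def encode_as_kind(text, kind):
    """Hypothetical helper: build a native-endian buffer of the given kind."""
    if kind == 1:
        return text.encode("latin-1")
    order = "le" if sys.byteorder == "little" else "be"
    return text.encode("utf-%d-%s" % (kind * 8, order))


if __name__ == "__main__":
    text = "a.b.c"
    for kind in (1, 2, 4):
        print(kind, count_dots(kind, encode_as_kind(text, kind), len(text)))
```

In this model, reading a kind-1 buffer through the kind-4 branch, for example 
count_dots(4, encode_as_kind("a.b.c", 1), 5), raises struct.error because the 
read runs off the end of the buffer. In the compiled C code there is no such 
check, so executing the wider branches on a UCS1 buffer reads out of bounds.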

When combining these optimizations, the generated code incorrectly only checks 
the precondition of the UCS1 branch to decide whether it can use the optimized 
SIMD version. Since the UCS1 branch looks at 8 bytes of the buffer in each 
iteration, it checks whether the buffer contains at least 8 bytes. However, 
each iteration of the UCS2/UCS4 branches looks at 16/32 bytes, respectively.

As a result, when passing a UCS1 string to attrgetter(), we end up reading 24 
bytes past the end of the buffer for every 8 bytes the string contains. In the 
reproduction example, we thus read about 24 MB past the buffer, almost 
guaranteeing a crash.

The issue does not occur when compiling Python using VS2017 (i.e. 
PlatformToolset v141, MSVC version 14.13.26128). Presumably this means the 
compiler bug has been fixed upstream.

[1]: https://github.com/python/cpython/blob/v3.5.5/Modules/_operator.c#L587
[2]: https://github.com/python/cpython/blob/v3.5.5/Include/unicodeobject.h#L521

----------
components: Interpreter Core
files: repro.py
messages: 314987
nosy: asturm
priority: normal
severity: normal
status: open
title: Segmentation fault in operator.attrgetter
versions: Python 3.5
Added file: https://bugs.python.org/file47521/repro.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33232>
_______________________________________