[issue26000] Crash in Tokenizer - Heap-use-after-free

2018-09-24 Thread Karthikeyan Singaravelan

Karthikeyan Singaravelan  added the comment:

As part of triaging, I am closing this issue as a duplicate and adding 
issue31852 as the superseder, since it has the relevant PR and the discussion 
of the fix. I have also verified the fix, as described in 
https://bugs.python.org/issue26000#msg326204. If a backport of the fix to 
Python 3.5 is needed, I think it can be opened as a separate issue with Larry 
added, since 3.5 is in security-fixes mode.

Thanks again everyone for the details.

resolution:  -> duplicate
stage:  -> resolved
status: open -> closed
superseder:  -> Crashes with lines of the form "async \"

Python tracker 

Python-bugs-list mailing list

[issue26000] Crash in Tokenizer - Heap-use-after-free

2018-09-24 Thread Karthikeyan Singaravelan

Karthikeyan Singaravelan  added the comment:

Thanks William for the information. I can reproduce this on 3.5.6. I was able 
to bisect this down to #31852, which deals with similar cases and was fixed by 
the commit checked out below:

$ cpython git:(master) git checkout 690c36f2f1085145d364a89bfed5944dd2470308
HEAD is now at 690c36f2f1 [3.6] bpo-31852: Fix segfault caused by using the 
async soft keyword (GH-4122)
$ cpython git:(690c36f2f1) git clean -xdf && ./configure --with-pydebug && make 
-s -j4
$ cpython git:(690c36f2f1) ./python.exe ../backups/vuln.py
  File "../backups/vuln.py", line 2
SyntaxError: Non-UTF-8 code starting with '\xef' in file ../backups/vuln.py on 
line 2, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for 
$ cpython git:(690c36f2f1) ./python.exe ../backups/vuln2.py
  File "../backups/vuln2.py", line 3
SyntaxError: Non-UTF-8 code starting with '\xdd' in file ../backups/vuln2.py on 
line 3, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for 

# Reproduce the crash

➜  cpython git:(690c36f2f1) git checkout 
Previous HEAD position was 690c36f2f1 [3.6] bpo-31852: Fix segfault caused by 
using the async soft keyword (GH-4122)
HEAD is now at 2702380870 bpo-31304: Update starmap_async documentation. 
(GH-4168) (GH-4177)
➜  cpython git:(2702380870) make
➜  cpython git:(2702380870) ./python.exe ../backups/vuln2.py
Assertion failed: (!PyErr_Occurred()), function PyObject_Call, file 
Objects/abstract.c, line 2247.
[2]71701 abort  ./python.exe ../backups/vuln2.py
➜  cpython git:(2702380870) ./python.exe ../backups/vuln.py
Assertion failed: (!PyErr_Occurred()), function PyObject_Call, file 
Objects/abstract.c, line 2247.
[2]71712 abort  ./python.exe ../backups/vuln.py

This doesn't affect master, 3.7.0, or v3.6.4+. The fix was not backported to 
3.5 in the linked ticket, and 3.5 is in security-fixes mode. I propose closing 
this ticket and opening a separate one, with Larry added to it, if the fix 
needs an explicit backport to 3.5.6 as a priority.
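
The manual check used above (run an interpreter build on the reproducer and 
see whether it aborts or reports a SyntaxError) could be scripted roughly as 
follows. This is only a sketch: the helper names `classify` and `demo` are 
made up for illustration, the stand-in file is not the original vuln.py (its 
contents are not shown in this thread), and treating a negative return code as 
a crash assumes a POSIX system, where it means the process was killed by a 
signal.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def classify(interpreter: str, script: str) -> str:
    """Run `script` under `interpreter` and summarise how the run ended."""
    proc = subprocess.run(
        [interpreter, script],
        capture_output=True,
        text=True,
        errors="replace",  # stderr may echo undecodable bytes
    )
    if proc.returncode < 0:
        # Assumption: POSIX, where a negative code means "killed by signal"
        # (an assertion failure in a debug build ends in SIGABRT).
        return "crash"
    if "SyntaxError" in proc.stderr:
        return "syntax-error"
    return "other"


def demo() -> str:
    """Classify the running interpreter on a stand-in file (not vuln.py)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "stand_in.py"
        # Hypothetical payload: a stray non-UTF-8 byte on line 2.
        path.write_bytes(b"pass\n\xef\n")
        return classify(sys.executable, str(path))


if __name__ == "__main__":
    print(demo())
```

Pointing `classify` at a freshly built ./python.exe and the real reproducer 
would give a single pass/fail answer per checkout, which is the kind of check 
a `git bisect run` wrapper needs.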




[issue26000] Crash in Tokenizer - Heap-use-after-free

2018-09-23 Thread William Bowling

William Bowling  added the comment:

> Is this still reproducible? On master (Python 3.8) with a debug build it 
> throws a SyntaxError. I don't have Python 3.5 installed to check this though

Looks like it's fixed in master and 3.6.6 but still happening in 3.5.6



[issue26000] Crash in Tokenizer - Heap-use-after-free

2018-09-23 Thread Karthikeyan Singaravelan

Karthikeyan Singaravelan  added the comment:

Is this still reproducible? On master (Python 3.8) with a debug build it throws 
a SyntaxError. I don't have Python 3.5 installed to check this though

$ ./python.exe
Python 3.8.0a0 (heads/master:c87d9f406b, Sep 23 2018, 19:48:30)
[Clang 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
➜  cpython git:(master) ./python.exe -c 'with open("vuln.py", "wb") as f: 
➜  cpython git:(master) ✗ ./python.exe vuln.py
  File "vuln.py", line 2
SyntaxError: Non-UTF-8 code starting with '\xef' in file vuln.py on line 2, but 
no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
➜  cpython git:(master) ✗ ./python.exe -c 'with open("vuln2.py", "wb") as f: 
➜  cpython git:(master) ✗ ./python.exe vuln2.py
  File "vuln2.py", line 3
SyntaxError: Non-UTF-8 code starting with '\xdd' in file vuln2.py on line 3, 
but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details


nosy: +xtreak


[issue26000] Crash in Tokenizer - Heap-use-after-free

2016-02-21 Thread Sean Gillespie

Sean Gillespie added the comment:

Went ahead and did it since I had the time. The issue arises when the 
tokenizer does one token of lookahead to see whether an 'async' at the top 
level begins an 'async def' function or is an identifier. A shallow copy of 
the current token is made and given to another call to tok_get, which frees 
the token's buffer if a decoding error occurs. Since the shallow copy cloned 
the token's buffer pointer, the still-live token is left holding a pointer to 
the freed buffer, which gets freed again later on.

By explicitly nulling out the token's buffer pointer, as tok_get does, 
whenever the copied token's buffer pointer was nulled out, we avoid the 
double free and present the correct syntax error:

$ ./python vuln.py 
  File "vuln.py", line 1
SyntaxError: Non-UTF-8 code starting with '\xef' in file vuln.py on line 2, but 
no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
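
The aliasing problem and the nulling-out fix described above can be modelled 
in a few lines. This is a toy model in Python, not the C code from 
tokenizer.c or from the attached patch: `Buffer`, `TokState`, 
`failing_lookahead`, `cleanup`, and `run` are hypothetical names, and a "free" 
is only counted rather than performed.

```python
import copy
from dataclasses import dataclass
from typing import Optional


class Buffer:
    """Stand-in for a heap buffer; counts how many times it was freed."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.free_count = 0

    def free(self) -> None:
        self.free_count += 1


@dataclass
class TokState:
    """Hypothetical, heavily reduced tokenizer state."""

    buf: Optional[Buffer]


def failing_lookahead(state: TokState) -> None:
    """Model a lookahead that hits a decoding error: it frees the buffer
    and clears the pointer, but only in the state it was handed."""
    if state.buf is not None:
        state.buf.free()
        state.buf = None


def cleanup(state: TokState) -> None:
    """Model the later teardown that frees whatever buffer is still set."""
    if state.buf is not None:
        state.buf.free()
        state.buf = None


def run(apply_fix: bool) -> int:
    """Return how many times the shared buffer ended up being freed."""
    buffer = Buffer(b"placeholder")
    tok = TokState(buf=buffer)
    ahead = copy.copy(tok)  # shallow copy: both states share `buffer`
    failing_lookahead(ahead)
    if apply_fix and ahead.buf is None:
        tok.buf = None  # propagate the cleared pointer to the live state
    cleanup(tok)
    return buffer.free_count


if __name__ == "__main__":
    print("without fix:", run(apply_fix=False))
    print("with fix:   ", run(apply_fix=True))
```

Without the propagation step the model frees the buffer twice; with it, the 
live state no longer refers to the buffer and teardown leaves it alone.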

William Bowling's second program is also fixed by this change, with one 
additional wrinkle: if a token contains a null byte as its first character, an 
invalid write occurs when we attempt to replace the null character with a 
newline. The fix checks that this is not the case before performing the 
newline insertion.

With this change, both of William Bowling's programs pass valgrind and 
present the appropriate syntax error. I tried to add this to the coroutine 
syntax tests, but any way of loading the file other than giving it to ./python 
itself fails (correctly) because the program contains a null byte.

keywords: +patch
Added file: http://bugs.python.org/file41995/tokenizer_double_free.patch


[issue26000] Crash in Tokenizer - Heap-use-after-free

2016-02-20 Thread Sean Gillespie

Sean Gillespie added the comment:

Is anyone currently working on this? If not, I'd like to try and fix this. I've 
debugged this a little and think I have an idea of what's going on.

nosy: +swgillespie


[issue26000] Crash in Tokenizer - Heap-use-after-free

2016-01-03 Thread William Bowling

William Bowling added the comment:

Also, a very similar source file causes a slightly different crash 
(heap-buffer-overflow instead of heap-use-after-free):

./python -c 'with open("vuln2.py", "wb") as f: 
./python vuln2.py

Python 3.5.1+ (default, Jan  4 2016, 00:05:40)

Attached the asan report

Added file: http://bugs.python.org/file41487/asan2.txt


[issue26000] Crash in Tokenizer - Heap-use-after-free

2016-01-03 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :

assignee:  -> serhiy.storchaka
nosy: +serhiy.storchaka
priority: normal -> high


[issue26000] Crash in Tokenizer - Heap-use-after-free

2016-01-03 Thread William Bowling

New submission from William Bowling:

Similar to https://bugs.python.org/issue25388, the following causes a crash on 
3.5.1 and the latest 3.5 branch:

./python -c 'with open("vuln.py", "wb") as f: 
./python vuln.py

Python 3.5.1+ (default, Jan  4 2016, 00:05:40) 
==24400==ERROR: AddressSanitizer: heap-use-after-free on address 0xf270f100 at 
pc 0x080ad09e bp 0xffef5ee8 sp 0xffef5ac0
READ of size 2 at 0xf270f100 thread T0
#0 0x80ad09d in strncpy (/home/will/python/cpython/python+0x80ad09d)
#1 0x8589b56 in parsetok /home/will/python/cpython/Parser/parsetok.c:235:13
#2 0x858b301 in PyParser_ParseFileObject 
#3 0x8439e0b in PyParser_ASTFromFileObject 
#4 0x843aa37 in PyRun_FileExFlags 
#5 0x8438a98 in PyRun_SimpleFileExFlags 
#6 0x84382a6 in PyRun_AnyFileExFlags 
#7 0x813f194 in run_file /home/will/python/cpython/Modules/main.c:318:11
#8 0x813f194 in Py_Main /home/will/python/cpython/Modules/main.c:768
#9 0x8138070 in main /home/will/python/cpython/./Programs/python.c:69:11
#10 0xf7558496 in __libc_start_main (/usr/lib32/libc.so.6+0x18496)
#11 0x80715b7 in _start (/home/will/python/cpython/python+0x80715b7)

0xf270f100 is located 0 bytes inside of 8194-byte region [0xf270f100,0xf2711102)
freed by thread T0 here:
#0 0x810c2a4 in __interceptor_cfree.localalias.1 
#1 0x8139560 in _PyMem_RawFree 
#2 0x813852b in PyMem_Free 
#3 0x8596b05 in error_ret /home/will/python/cpython/Parser/tokenizer.c:198:9
#4 0x8596b05 in decoding_fgets 
#5 0x8594df0 in tok_nextc 
#6 0x858ebba in tok_get /home/will/python/cpython/Parser/tokenizer.c:1457:13
#7 0x858fc79 in tok_get /home/will/python/cpython/Parser/tokenizer.c:1524:34
#8 0x858e1da in PyTokenizer_Get 
#9 0x85899a7 in parsetok /home/will/python/cpython/Parser/parsetok.c:208:16
#10 0x858b301 in PyParser_ParseFileObject 
#11 0x8439e0b in PyParser_ASTFromFileObject 
#12 0x843aa37 in PyRun_FileExFlags 
#13 0x8438a98 in PyRun_SimpleFileExFlags 
#14 0x84382a6 in PyRun_AnyFileExFlags 
#15 0x813f194 in run_file /home/will/python/cpython/Modules/main.c:318:11
#16 0x813f194 in Py_Main /home/will/python/cpython/Modules/main.c:768
#17 0x8138070 in main /home/will/python/cpython/./Programs/python.c:69:11
#18 0xf7558496 in __libc_start_main (/usr/lib32/libc.so.6+0x18496)

previously allocated by thread T0 here:
#0 0x810c784 in realloc (/home/will/python/cpython/python+0x810c784)
#1 0x8139541 in _PyMem_RawRealloc 
#2 0x8138506 in PyMem_Realloc 
#3 0x8594f1c in tok_nextc 
#4 0x858e4c9 in tok_get /home/will/python/cpython/Parser/tokenizer.c:1354:17
#5 0x858e1da in PyTokenizer_Get 
#6 0x85899a7 in parsetok /home/will/python/cpython/Parser/parsetok.c:208:16
#7 0x858b301 in PyParser_ParseFileObject 
#8 0x8439e0b in PyParser_ASTFromFileObject 
#9 0x843aa37 in PyRun_FileExFlags 
#10 0x8438a98 in PyRun_SimpleFileExFlags 
#11 0x84382a6 in PyRun_AnyFileExFlags 
#12 0x813f194 in run_file /home/will/python/cpython/Modules/main.c:318:11
#13 0x813f194 in Py_Main /home/will/python/cpython/Modules/main.c:768
#14 0x8138070 in main /home/will/python/cpython/./Programs/python.c:69:11
#15 0xf7558496 in __libc_start_main (/usr/lib32/libc.so.6+0x18496)

SUMMARY: AddressSanitizer: heap-use-after-free 
(/home/will/python/cpython/python+0x80ad09d) in strncpy
Shadow bytes around the buggy address:
  0x3e4e1dd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e4e1de0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3e4e1df0: fa fa fa fa fa