New submission from Erlend Egeberg Aasland <[email protected]>:
Incomplete Unicode literals abort the interpreter instead of generating a SyntaxError:
(lldb) target create "./python.exe"
Current executable set to '/Users/erlendaasland/src/cpython.git/python.exe'
(x86_64).
(lldb) r
Process 98955 launched: '/Users/erlendaasland/src/cpython.git/python.exe'
(x86_64)
Python 3.10.0a6+ (heads/main:9a50ef43e4, Mar 22 2021, 11:18:33) [Clang 12.0.0
(clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> "\u1f"
Assertion failed: (col_offset >= 0 && (unsigned long)col_offset <=
strlen(str)), function byte_offset_to_character_offset, file Parser/pegen.c,
line 150.
Process 98955 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = hit program assert
frame #4: 0x0000000100009bd6
python.exe`byte_offset_to_character_offset(line=0x00000001013f1220,
col_offset=7) at pegen.c:150:5
147 if (!str) {
148 return 0;
149 }
-> 150 assert(col_offset >= 0 && (unsigned long)col_offset <= strlen(str));
151 PyObject *text = PyUnicode_DecodeUTF8(str, col_offset, "replace");
152 if (!text) {
153 return 0;
Target 0: (python.exe) stopped.
(lldb) p col_offset
(Py_ssize_t) $0 = 7
(lldb) p str
(const char *) $1 = 0x00000001013f1250 "\"\\u1f\""
(lldb) p (size_t) strlen(str)
(size_t) $2 = 6
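The values above show why the assertion fires: the line holds 6 bytes, but the parser asks for byte offset 7. A rough Python sketch of the conversion follows. It assumes that the C function returns the number of characters in the decoded prefix (the excerpt stops before the return statement), and it raises ValueError where the C code asserts; both the Python function and that error choice are illustrative, not the actual fix.

```python
def byte_offset_to_character_offset(line: bytes, col_offset: int) -> int:
    """Convert a UTF-8 byte offset into a character offset (illustrative sketch)."""
    # Mirrors the C assertion: the offset must lie within the line.
    if not 0 <= col_offset <= len(line):
        raise ValueError(
            f"col_offset {col_offset} out of range for a {len(line)}-byte line"
        )
    # Decode only the prefix, as PyUnicode_DecodeUTF8(str, col_offset, "replace") does.
    text = line[:col_offset].decode("utf-8", "replace")
    return len(text)


line = b'"\\u1f"'  # the same 6 bytes lldb prints for str
print(len(line))                                 # 6
print(byte_offset_to_character_offset(line, 6))  # 6
try:
    byte_offset_to_character_offset(line, 7)     # the offset seen in the crash
except ValueError as exc:
    print(exc)
```

With multi-byte input the two offsets differ: the byte offset 2 in "é = 1" encoded as UTF-8 corresponds to character offset 1.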
Python 3.9 behaviour:
Python 3.9.2 (v3.9.2:1a79785e3e, Feb 19 2021, 09:06:10)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> "\u1f"
File "<stdin>", line 1
"\u1f"
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in
position 0-3: truncated \uXXXX escape
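The expected behaviour can also be checked without the interactive prompt. This is a small sketch that compiles the same literal in "single" mode, which is an assumption about how closely compile() matches the REPL code path; the helper name syntax_error_for is made up for illustration. On an affected debug build the call may abort the process rather than return.

```python
def syntax_error_for(source: str):
    """Return the SyntaxError raised when compiling source, or None if it compiles."""
    try:
        compile(source, "<stdin>", "single")
    except SyntaxError as exc:
        return exc
    return None


# The six characters typed at the prompt: "\u1f"
err = syntax_error_for('"\\u1f"')
print(type(err).__name__)
print(err)
```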
Git bisect says the regression was introduced by this commit:
commit 08fb8ac99ab03d767aa0f1cfab3573eddf9df018
Author: Pablo Galindo <[email protected]>
Date: Thu Mar 18 01:03:11 2021 +0000
bpo-42128: Add 'missing :' syntax error message to match statements
(GH-24733)
I made a workaround (see attached patch), but I guess that's far from the
correct solution :)
----------
components: Unicode
files: patch.diff
keywords: patch
messages: 389297
nosy: erlendaasland, ezio.melotti, lys.nikolaou, pablogsal, vstinner
priority: normal
severity: normal
status: open
title: Parser aborts on incomplete/incorrect unicode literals in interactive
mode
type: crash
versions: Python 3.10
Added file: https://bugs.python.org/file49900/patch.diff
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue43591>
_______________________________________