[issue46410] TypeError when parsing regexp with unicode named character sequence escape

2022-03-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: >>> import unicodedata >>> unicodedata.lookup('KEYCAP NUMBER SIGN') '#️' >>> print(ascii(unicodedata.lookup('KEYCAP NUMBER SIGN'))) '#\ufe0f\u20e3' Support of Unicode Named Character Sequences in the unicodeescape codec and in the RE parser would be a new

[issue46410] TypeError when parsing regexp with unicode named character sequence escape

2022-01-18 Thread Matthew Barnett
Matthew Barnett added the comment: They're not supported in string literals either: Python 3.10.1 (tags/v3.10.1:2cd268a, Dec 6 2021, 19:10:37) [MSC v.1929 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> "\N{KEYCAP NUMBER SIGN}" File "",

[issue46410] TypeError when parsing regexp with unicode named character sequence escape

2022-01-17 Thread Jirka Marsik
New submission from Jirka Marsik : re.compile(r"\N{name of Unicode Named Character Sequence}"), e.g. re.compile(r"\N{KEYCAP NUMBER SIGN}"), throws a TypeError. The regular expression parser relies on 'unicodedata' to lookup character names. The 'unicodedata' module recently added support for