New submission from Bob Kline <bkl...@rksystems.com>:

-    UNWANTED = re.compile("""['".,?!:;()[\]{}<>\u201C\u201D\u00A1\u00BF]+""")
+    UNWANTED = 
re.compile("""['".,?!:;()[\]{}<>\\u201C\\u201D\\u00A1\\u00BF]+""")

The non-ASCII characters in the original string are perfectly legitimate str 
characters, using valid standard escapes recognized and handled by the Python 
parser. It is unnecessary to lengthen the string argument passed to 
re.compile() and defer the conversion of the doubled escapes for the regular 
expression engine to handle.

----------
components: 2to3 (2.x to 3.x conversion tool)
messages: 350922
nosy: bkline
priority: normal
severity: normal
status: open
title: 2to3 introduces unwanted extra backslashes for unicode characters in 
regular expressions
type: behavior
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37996>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to