Bob Kline <[email protected]> added the comment:
Ah, this is worse than I first thought. It's not just that 2to3 adds extra
backslashes to regular expression strings. In that case, at least, the regular
expression engine will do what the original code was asking the Python parser
to do (unless user code checks for and enforces limits on regular expression
string lengths, in which case even that is broken). 2to3 is also mangling
strings in places where the change alters the behavior (that is, breaks it).
2to3 wants to change
if c not in ".-_:\u00B7\u0e87":
to
if c not in ".-_:\\u00B7\\u0e87":
Not the same thing at all, as illustrated here:
$ python
Python 3.7.3 (default, Jun 19 2019, 07:38:49)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> len("\u00B7")
1
>>> len("\\u00B7")
6
>>>
That breaks the original code. This is a serious bug.
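
For anyone who wants to reproduce both cases, here is a minimal sketch under Python 3. The variable names are made up for illustration and are not taken from the affected code. It shows why the doubled backslash is harmless inside a regular expression (the re module interprets \uXXXX itself in str patterns) but changes the result of a plain membership test.

```python
import re

# The string as written in the original source, and as 2to3 proposes to rewrite it.
original = ".-_:\u00B7\u0e87"
converted = ".-_:\\u00B7\\u0e87"

# Regular expression case: the re engine expands \u00B7 on its own,
# so both patterns match the middle dot character.
pattern_original = re.compile("[\u00B7]")
pattern_converted = re.compile("[\\u00B7]")
middle_dot = "\u00B7"
print(bool(pattern_original.fullmatch(middle_dot)))   # True
print(bool(pattern_converted.fullmatch(middle_dot)))  # True

# Plain string case: nothing expands the escape, so the converted string
# holds a literal backslash, 'u', and hex digits instead of the character.
print(len(original), len(converted))  # 6 16
print(middle_dot in original)         # True
print(middle_dot in converted)        # False
print("u" in original)                # False
print("u" in converted)               # True
```

So after the conversion the test stops recognizing the two non-ASCII characters and starts accepting a backslash, the letter "u", and several hex digits.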
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue37996>
_______________________________________