[issue27364] Deprecate invalid unicode escape sequences

2016-09-03 Thread Martin Panter
Martin Panter added the comment: Left some comments for invalid_stdlib_escapes_2.patch

[issue27364] Deprecate invalid unicode escape sequences

2016-09-01 Thread Emanuel Barry
Emanuel Barry added the comment: Thanks Serhiy; it does look better to me too! -- Added file: http://bugs.python.org/file44322/deprecate_invalid_escapes_both_3.patch

[issue27364] Deprecate invalid unicode escape sequences

2016-09-01 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I think "invalid escape sequence '\?'" would look cleaner than "invalid escape sequence '?'".

[issue27364] Deprecate invalid unicode escape sequences

2016-09-01 Thread Emanuel Barry
Emanuel Barry added the comment: Ping. I'd like to get this merged in time for 3.6. Is there anything I can do to speed up the review? Since the change itself is very straightforward, I think it would make sense to merge it now and then fix the invalid escapes that are found during the beta

[issue27364] Deprecate invalid unicode escape sequences

2016-08-23 Thread John Mark Vandenberg
Changes by John Mark Vandenberg: nosy: +jayvdb

[issue27364] Deprecate invalid unicode escape sequences

2016-08-14 Thread Emanuel Barry
Changes by Emanuel Barry: Added file: http://bugs.python.org/file44108/deprecate_invalid_escapes_both_2.patch

[issue27364] Deprecate invalid unicode escape sequences

2016-08-14 Thread Emanuel Barry
Emanuel Barry added the comment: Here's a new pair of patches for this. There are some small tweaks to the tests, and I properly fixed all instances of invalid escapes (I also made some strings into raw-strings at some places where it's not needed, solely for consistency with surrounding

[issue27364] Deprecate invalid unicode escape sequences

2016-08-11 Thread Emanuel Barry
Emanuel Barry added the comment: Hmm, that's odd, I recall some of the failures from testing, and thought I fixed them. Some of these are brand new, though, so thanks! I'll run and fix the tests (and modules as well); should likely have a patch by the weekend :) --

[issue27364] Deprecate invalid unicode escape sequences

2016-08-11 Thread Martin Panter
Martin Panter added the comment: I am trying out your patch at the moment. There are plenty of test suite failures; I ran the test suite with approximately the following: ./python -bWerror -m test -Wr -j0 -u network -x

[issue27364] Deprecate invalid unicode escape sequences

2016-07-18 Thread Emanuel Barry
Emanuel Barry added the comment: Here's a new patch which also deprecates invalid escape sequences in bytes. Tests included with test_codecs. Patch includes and supersedes deprecate_invalid_escapes_only_3.patch, and I have not found a single instance of an invalid escape sequence other than

[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread Martin Panter
Martin Panter added the comment: Forgot to say I reviewed invalid_stdlib_escapes_1.patch the other day and can’t see any problems.

[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread Emanuel Barry
Emanuel Barry added the comment: Just brought this to the attention of the code-quality mailing list, so linter maintainers should (hopefully!) catch up soon. Also new patch, I forgot to add '\c' in the tests. -- Added file:

[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread Emanuel Barry
Emanuel Barry added the comment: Easing transition is always a good idea. I'll contact the PyCQA people later today when I'm back home. On afterthought, it makes sense to wait more than two release cycles before making this an error. I don't really have a strong opinion when exactly that

[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread Guido van Rossum
Guido van Rossum added the comment: I think ultimately it has to become an error (otherwise I wouldn't have agreed to the warning, silent or not). But because there's so much 3rd party code that depends on it we indeed need to take "several" releases before we go there. Contacting the PyCQA

[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread R. David Murray
R. David Murray added the comment: Yes, this change is likely to break a lot of code, so an extended deprecation period (certainly longer than 3.7, which Guido has already mandated) is the minimum. Guido hasn't agreed to making it an error yet, as far as I can see ;)

[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread STINNER Victor
STINNER Victor added the comment: @ebarry: To move faster, you should also work with linters (pylint, pychecker, pyflakes, pycodestyle, flake8, ...) to log a warning to help projects to be prepared for this change. Linters are used on Python 2-only projects, so it will help them to be prepared

[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread Emanuel Barry
Emanuel Barry added the comment: I think ultimately a SyntaxError should be fine. I don't know *when* it becomes appropriate to change a warning into an error; I was thinking 3.7 but, as Serhiy said, there's no rush. I think waiting five release cycles is overkill though, that means the error

[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: DeprecationWarning is used when we want to remove a feature. It becomes an error in the future. FutureWarning is used when we want to change the meaning of a feature instead of removing it. For example re.split(':*', 'a:bc') emits a FutureWarning and returns
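
The distinction between the two categories can be sketched in Python. This is only an illustration: `removed_soon` and `meaning_will_change` are hypothetical functions invented for the example, not anything in the patch or the stdlib.

```python
import warnings

def removed_soon():
    # Hypothetical feature that is scheduled to go away entirely.
    warnings.warn("removed_soon() will be removed", DeprecationWarning, stacklevel=2)

def meaning_will_change():
    # Hypothetical feature that stays, but whose result will differ later.
    warnings.warn("the result will change in a future release", FutureWarning, stacklevel=2)
    return "old result"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record everything, ignoring default filters
    removed_soon()
    value = meaning_will_change()

categories = [w.category.__name__ for w in caught]
print(categories)  # ['DeprecationWarning', 'FutureWarning']
```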

[issue27364] Deprecate invalid unicode escape sequences

2016-06-27 Thread STINNER Victor
STINNER Victor added the comment: Guido: "I am okay with making it a silent warning." The current patch raises a DeprecationWarning which is silent by default, but seen using python3 -Wd. What is the "long term" plan: always raise an *exception* in Python 3.7? Which exception? Another option
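
For projects that want to treat the warning as a failure before it becomes one, a rough sketch (the helper name `compiles_cleanly` is made up for this example) is to compile source text under an "error" warnings filter. Which exception actually surfaces may differ between Python versions, so the sketch catches several.

```python
import warnings

def compiles_cleanly(source):
    """Return True if `source` compiles without any warning being raised."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")  # turn every warning into an exception
        try:
            compile(source, "<example>", "eval")
        except (SyntaxError, DeprecationWarning, SyntaxWarning):
            # The exact exception type is an assumption that varies by version.
            return False
    return True

# The source text below contains a real backslash followed by 'd'.
print(compiles_cleanly('"\\d"'))   # ordinary literal with an invalid escape
print(compiles_cleanly('r"\\d"'))  # raw literal, nothing to warn about
```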

[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Martin Panter
Martin Panter added the comment: Code samples in the documentation should also be fixed, like at . I think you can run “make -C Doc doctest” or something similar, which may help find some of these. Also, playing with your current patch, it

[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Emanuel Barry
Emanuel Barry added the comment: Indeed, we did, thanks for letting me know my mistake :) I didn't get very far into making bytes literals disallow invalid sequences, as I ran into issues with _codecs.escape_decode throwing the warning even when the literal was fine, and I think I stopped

[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Martin Panter
Martin Panter added the comment: Hah, we posted the same fix almost at the same time :)

[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Martin Panter
Martin Panter added the comment: Hello Emanuel, I think I have fixed your problem with -Werror, by handling the exception returned by PyErr_WarnFormat() (see my patch). Thanks for separating the actual change from the escape violation fixes; it made it easier to spot the real problem :)

[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Emanuel Barry
Emanuel Barry added the comment: Aaand I feel pretty stupid; I didn't check the return value of PyErr_WarnFormat, so it was my mistake. Attached new patch, actually done right this time. -- Added file: http://bugs.python.org/file43552/deprecate_invalid_escapes_only_2.patch

[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Emanuel Barry
Emanuel Barry added the comment: Ah right, assert() is only enabled in debug mode, I forgot that. My (very uneducated) guess is that compile() got the error (which was a warning) but then decided to return a value anyway, and the next thing that tries to call anything crashes Python. I opened

[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Guido van Rossum
Guido van Rossum added the comment: Hm, if you manage to trigger an assert() in the C code by writing some evil Python code, the C code is considered broken (unless it was using ctypes or one or two other explicit "void-the-warranty" exceptions). Maybe someone who has worked more with the C

[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Emanuel Barry
Emanuel Barry added the comment: I originally considered making two different patches, so there you go. deprecate_invalid_escapes_only_1.patch has the deprecation plus a test, and invalid_stdlib_escapes_1.patch fixes all invalid escapes in the stdlib. My code was the cause, although no

[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Emanuel Barry
Changes by Emanuel Barry: Added file: http://bugs.python.org/file43550/invalid_stdlib_escapes_1.patch

[issue27364] Deprecate invalid unicode escape sequences

2016-06-26 Thread Guido van Rossum
Guido van Rossum added the comment: I am okay with making it a silent warning. Can we do it in two stages though? It doesn't have to be two releases, I just mean two separate commits: (1) fix all places in the stdlib that violate this principle; (2) separately commit the code that causes the

[issue27364] Deprecate invalid unicode escape sequences

2016-06-24 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: nosy: +gvanrossum

[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Emanuel Barry
Emanuel Barry added the comment: I found the cause of the failed assertion, an invalid escape sequence slipped through in a file. Patch attached (also with Serhiy's comments). It worries me a little though that pure Python code can cause a hard crash. Ok, it worries me a lot. Please don't

[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Emanuel Barry
Emanuel Barry added the comment: Thanks, didn't find that one. Apparently Guido's stance is "Make this a silent warning, then we can discuss preventing it later", which happens to be what I'm doing here.

[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: There was a long discussion on Python-Dev. [1] Guido took part in it. [1] http://comments.gmane.org/gmane.comp.python.devel/151612 -- nosy: +serhiy.storchaka

[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Emanuel Barry
Emanuel Barry added the comment: Yes, it's in use in an awful lot of places (see my patch). The proper fix is to use raw strings, or, if you need actual escapes in the same string, manually escape them. However, as you'll see by looking at the patch, the vast majority of cases are fixed by
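
A small sketch of the two fixes mentioned here, using a regular expression as the example (the sample text and variable names are illustrative only):

```python
import re

# Two spellings of the same pattern text: a backslash, 'd', '+'.
raw_pattern = r"\d+"       # raw string: backslashes are left alone
escaped_pattern = "\\d+"   # ordinary string with the backslash escaped by hand

assert raw_pattern == escaped_pattern

# When a real escape is needed in the same literal, escape the other
# backslash manually: this is backslash, 'd', '+', then an actual newline.
mixed = "\\d+\n"

print(re.findall(raw_pattern, "issue 27364, file 44322"))  # ['27364', '44322']
```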

[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Antti Haapala
Antti Haapala added the comment: It is handy to be able to use `\w` and `\d` in non-raw-string *regular expressions*, without too much backslashitis. It seems to be in use in the Python standard library as well, for example in csv.py -- nosy: +ztane

[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread Emanuel Barry
Emanuel Barry added the comment: Now I have! I found nothing on Python-Dev, but apparently it's been discussed on Python-ideas before: https://mail.python.org/pipermail/python-ideas/2015-August/035031.html Guido hasn't participated in that discussion, and most of it was "This will break

[issue27364] Deprecate invalid unicode escape sequences

2016-06-23 Thread R. David Murray
R. David Murray added the comment: Have you searched the python-dev and python-ideas archives for the previous discussions of this issue? I don't remember for sure, but I think Guido might have made a ruling (not that the discussion couldn't be reopened if he has, but, well...)

[issue27364] Deprecate invalid unicode escape sequences

2016-06-21 Thread Emanuel Barry
New submission from Emanuel Barry: Attached patch deprecates invalid escape sequences in unicode strings. The point of this is to prevent issues such as #27356 (and possibly other similar ones) in the future. Without the patch: >>> "hello \world" 'hello \\world' With the patch: >>> "hello
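
One way to observe the behaviour described in this submission on an interpreter that includes the change is to compile a literal and record what the compiler reports. This is a minimal sketch, not part of the patch: `escape_warnings` is a helper name invented for the example, and the warning category is checked loosely because it is an assumption that may vary between Python versions.

```python
import warnings

def escape_warnings(source):
    """Compile `source` and return the warnings the compiler emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let default filters hide anything
        compile(source, "<example>", "eval")
    return caught

# Both source texts contain a real backslash followed by 'w'.
bad = escape_warnings('"hello \\world"')    # ordinary string literal
good = escape_warnings('r"hello \\world"')  # raw string literal

for w in bad:
    print(w.category.__name__, w.message)
print(len(good))  # the raw literal should not trigger anything
```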