[issue12063] tokenize module appears to treat unterminated single and double-quoted strings inconsistently

Amandine Lee added the comment:

I confirmed that the behavior is as described. I added a patch documenting it, built the docs with the patch applied, and visually confirmed that the rendered docs look appropriate. Ready for review!

--
keywords: +patch
nosy: +amandine
Added file: http://bugs.python.org/file35532/issue12063.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12063
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Roundup Robot added the comment:

New changeset 188e5f42d4aa by Benjamin Peterson in branch '2.7':
document TokenError and unclosed expression behavior (closes #12063)
http://hg.python.org/cpython/rev/188e5f42d4aa

New changeset ddc174c4c7e5 by Benjamin Peterson in branch '3.4':
document TokenError and unclosed expression behavior (closes #12063)
http://hg.python.org/cpython/rev/ddc174c4c7e5

New changeset 3f2f1ffc3ce2 by Benjamin Peterson in branch 'default':
merge 3.4 (#12063)
http://hg.python.org/cpython/rev/3f2f1ffc3ce2

--
nosy: +python-dev
resolution: -> fixed
stage: needs patch -> resolved
status: open -> closed

Changes by Dustin Haffner <nit...@gmail.com>:

--
nosy: +dhaffner

Changes by Petri Lehtinen <pe...@digip.org>:

--
keywords: +easy

R. David Murray <rdmur...@bitdance.com> added the comment:

I agree with Petri, so I'm setting this to a doc issue.

--
assignee: -> docs@python
components: +Documentation
nosy: +docs@python, r.david.murray
stage: -> needs patch
type: -> behavior

Petri Lehtinen <pe...@digip.org> added the comment:

tokenize processes one line at a time, and noticing that a closing triple quote is missing would mean reading the whole file in the worst case. Since tokenize works in a generator-like fashion, it's probably not desirable to cache all the input just to be able to restart from some earlier line.

So, I'd go with documenting the behavior.

--
nosy: +petri.lehtinen
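
A minimal sketch of the generator-like behavior described above, written for Python 3 rather than the Python 2 API used in the original report. The helper name `tokens_before_error` and the sample source are made up for illustration. It shows that tokens from earlier lines have already been handed out by the time the unclosed triple quote is detected, so recovering would mean going back over input that has already been consumed.

```python
import io
import tokenize


def tokens_before_error(source):
    """Collect tokens until tokenize stops; return (tokens, error_or_None)."""
    seen = []
    try:
        # generate_tokens pulls input through readline, one line at a time.
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            seen.append(tok)
    except tokenize.TokenError as exc:
        # Raised once the input ends inside the open triple-quoted string.
        return seen, exc
    return seen, None


# Hypothetical input: a valid first line, then a triple quote that never closes.
seen, error = tokens_before_error("x = 1\n''' never closed\n")
for tok in seen:
    print(tokenize.tok_name[tok.type], repr(tok.string))
print("error:", error)
```

The tokens printed before the error line all come from text ahead of the unclosed quote. The exact error message may differ between Python versions.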

Changes by Petri Lehtinen <pe...@digip.org>:

--
versions: +Python 2.7, Python 3.2, Python 3.3

New submission from Devin Jeanpierre <jeanpierr...@gmail.com>:

Tokenizing `' 1 2 3` versus `''' 1 2 3` yields different results.

Tokenizing `' 1 2 3` gives:

    1,0-1,1:    ERRORTOKEN  "'"
    1,2-1,3:    NUMBER      '1'
    1,4-1,5:    NUMBER      '2'
    1,6-1,7:    NUMBER      '3'
    2,0-2,0:    ENDMARKER   ''

while tokenizing `''' 1 2 3` yields:

    Traceback (most recent call last):
      File "prog.py", line 4, in <module>
        tokenize.tokenize(iter(["''' 1 2 3"]).next)
      File "/usr/lib/python2.6/tokenize.py", line 169, in tokenize
        tokenize_loop(readline, tokeneater)
      File "/usr/lib/python2.6/tokenize.py", line 175, in tokenize_loop
        for token_info in generate_tokens(readline):
      File "/usr/lib/python2.6/tokenize.py", line 296, in generate_tokens
        raise TokenError, ("EOF in multi-line string", strstart)
    tokenize.TokenError: ('EOF in multi-line string', (1, 0))

Apparently tokenize resumes tokenizing after the erroneous quote in the case of an unclosed single quote, but not an unclosed triple quote. I guess this is because re-tokenizing the rest of the file after an unclosed triple quote would be expensive; however, I've also been told it's very strange and possibly wrong for tokenize to be inconsistent in this way.

If this is the right behavior, I'd like it to be documented. This sort of thing is confusing and potentially misleading for users of the tokenize module. At least, when I saw how single quotes were handled, I incorrectly assumed that all quotes were handled that way.

--
messages: 135836
nosy: Devin Jeanpierre
priority: normal
severity: normal
status: open
title: tokenize module appears to treat unterminated single and double-quoted strings inconsistently
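
For anyone who wants to try the two inputs from the report on a current interpreter, here is a rough Python 3 sketch. The helper `scan` is a hypothetical name, and `tokenize.generate_tokens` is used in place of the Python 2 `tokenize.tokenize(readline)` call shown in the traceback. Results depend on the Python version: older releases emit an ERRORTOKEN for the lone single quote as shown above, while newer releases may raise TokenError for both inputs, so the sketch catches the error in either case and asserts nothing about the single-quote result.

```python
import io
import tokenize


def scan(source):
    """Return (list of (token name, text) pairs, TokenError or None)."""
    out = []
    try:
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            out.append((tokenize.tok_name[tok.type], tok.string))
    except tokenize.TokenError as exc:
        # Keep whatever was produced before the tokenizer gave up.
        return out, exc
    return out, None


# The two inputs from the report: unclosed single quote, unclosed triple quote.
for source in ("' 1 2 3\n", "''' 1 2 3\n"):
    tokens, error = scan(source)
    print(repr(source), "->", tokens, "| error:", error)
```

Returning the partial token list together with the exception makes it easy to compare how far the tokenizer got on each input.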