Gareth Rees <g...@garethrees.org> added the comment:

Having looked at some of the consumers of the tokenize module, I don't think my 
proposed solutions will work.

The resynchronization behaviour of tokenize.py appears to be important for 
consumers that use it to transform arbitrary Python source code (such as 
2to3.py). These consumers rely on the "roundtrip" property that 
X == untokenize(tokenize(X)). So solution (1) is necessary for the handling 
of tokenization errors.
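To make the roundtrip property concrete, here is a minimal sketch (the sample source string is my own; the tokenize/untokenize calls are the documented API):

```python
import io
import tokenize

# For well-formed source, untokenizing the full token stream (with
# positions) reproduces the input exactly, whitespace and comments
# included -- this is the roundtrip property consumers depend on.
source = b"x = 1 + 2  # comment\n"
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))
roundtripped = tokenize.untokenize(tokens)
assert roundtripped == source
```

Note that tokenize.tokenize() works on a bytes readline and emits an initial ENCODING token, which untokenize() uses to encode its result back to bytes.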

Also, the fact that TokenInfo is a 5-tuple is relied on in some places (e.g. 
lib2to3/patcomp.py line 38), so it can't be extended. And there are consumers 
(though none in the standard library) that rely on type == ERRORTOKEN as the 
way to detect errors in a token stream, so I can't overload that field of the 
structure either.
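A minimal sketch of the consumer patterns in question (the sample source and the names collected are my own illustration; the 5-tuple shape and the ERRORTOKEN constant are the real API):

```python
import io
import tokenize

# Consumers unpack TokenInfo as a plain 5-tuple
# (type, string, start, end, line) -- so a sixth field would break
# them -- and conventionally detect errors by comparing the type
# field to ERRORTOKEN.
source = b"x = 1 + 2\n"
names = []
errors = []
for tok_type, tok_string, start, end, line in tokenize.tokenize(
        io.BytesIO(source).readline):
    if tok_type == tokenize.NAME:
        names.append(tok_string)
    elif tok_type == tokenize.ERRORTOKEN:
        errors.append((tok_string, start))
```

Any extra error information would have to travel somewhere other than the tuple shape or the type field, since both are effectively frozen by code like this.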

Any good ideas for how to record the cause of an error without breaking 
backwards compatibility?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12675>
_______________________________________