[issue35107] untokenize() fails on tokenize output when a newline is missing

Terry J. Reedy Tue, 30 Oct 2018 08:00:09 -0700


Terry J. Reedy <[email protected]> added the comment:


It seems to me a bug that when the trailing '\n' is missing, tokenize adds 
both an NL and a NEWLINE token instead of just one of them.  Moreover, both 
of the added tuples look wrong.

If '\n' is present,
  TokenInfo(type=56 (NL), string='\n', start=(1, 1), end=(1, 2), line='#\n')
looks correct.

If NL represents a real character, the zero-length string='' in the generated
  TokenInfo(type=56 (NL), string='', start=(1, 1), end=(1, 1), line='#')
seems wrong.  I suspect that the idea was to misrepresent NL so that 
untokenize would not add a '\n'.  In
  TokenInfo(type=4 (NEWLINE), string='', start=(1, 1), end=(1, 2), line='')
the empty string='' does not match the span length, end - start = 2 - 1 = 1.  
I am inclined to think that the following would be the correct added token, 
which should untokenize correctly:
  TokenInfo(type=4 (NEWLINE), string='', start=(1, 1), end=(1, 1), line='')
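For anyone who wants to reproduce the comparison, here is a minimal sketch that lists the tokens for '#\n' and for '#'. The helper name dump_tokens is only for illustration. The numeric token types and the tokens synthesized for the missing newline differ between Python versions, so the output may not match the tuples quoted above.

```python
import io
import tokenize


def dump_tokens(source):
    """Return the list of TokenInfo tuples that tokenize produces for source."""
    readline = io.StringIO(source).readline
    return list(tokenize.generate_tokens(readline))


if __name__ == "__main__":
    # Compare the source with and without the trailing newline.
    for source in ("#\n", "#"):
        print(repr(source))
        for tok in dump_tokens(source):
            # tok_name maps the numeric token type to its symbolic name.
            name = tokenize.tok_name[tok.type]
            print("   ", name, repr(tok.string), tok.start, tok.end, repr(tok.line))
```

Comparing tok.string with tok.start and tok.end for the NL and NEWLINE entries is the quickest way to see whether a given version still reports a mismatched length.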

ast.dump(ast.parse(s)) returns 'Module(body=[])' for both versions of 's', so 
no help there.
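
The ast check can be repeated as below, assuming the two versions of 's' are '#' and '#\n'. The helper name parses_to_empty_module is made up for this sketch. It compares the body attribute rather than the ast.dump() text, because the dump format has changed across Python versions.

```python
import ast


def parses_to_empty_module(source):
    """Return True if source parses to a Module with no statements."""
    return ast.parse(source).body == []


if __name__ == "__main__":
    # Assumed inputs: the comment with and without its trailing newline.
    for source in ("#", "#\n"):
        print(repr(source), parses_to_empty_module(source))
```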

----------

_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue35107>
_______________________________________