subject:"\[issue17125\] tokenizer.tokenize passes a bytes object to str.startswith"

[issue17125] tokenizer.tokenize passes a bytes object to str.startswith

2013-02-04 Thread Tyler Crompton

New submission from Tyler Crompton: Line 402 in lib/python3.3/tokenize.py, contains the following line: if first.startswith(BOM_UTF8): BOM_UTF8 is a bytes object. str.startswith does not accept bytes objects. I was able to use tokenize.tokenize only after making the following changes:

[issue17125] tokenizer.tokenize passes a bytes object to str.startswith

2013-02-04 Thread R. David Murray

R. David Murray added the comment: The docs could certainly be more explicit...currently they state that tokenize is *detecting* the encoding of the file, which *implies* but does not make explicit that the input must be binary, not text. The doc problem will get fixed as part of the fix to