[issue3353] make built-in tokenizer available via Python C API

Anthony Sottile Wed, 27 Jan 2021 09:18:22 -0800


Anthony Sottile <asott...@umich.edu> added the comment:


You already have that right now because the `tokenize` module is exposed
(except that every change to the tokenization has to be implemented twice,
once in C and once in Python).

It's also much more frustrating when the two differ.
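
For reference, a minimal sketch of what the exposed `tokenize` module already gives you from pure Python. The sample source string is made up for illustration; `tokenize.generate_tokens` and `token.tok_name` are the stdlib names.

```python
import io
import token
import tokenize

# Hypothetical sample input, only for illustration.
source = "x = 1 + 2\n"

# generate_tokens() takes a readline callable that returns str lines
# and yields TokenInfo tuples (type, string, start, end, line).
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    # Print the symbolic token name, the matched text, and its position.
    print(token.tok_name[tok.type], repr(tok.string), tok.start, tok.end)
```

This is the pure-Python implementation in Lib/tokenize.py, not the C tokenizer the parser uses, which is where the duplicated work comes from.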

I don't think all the internals of the C tokenizer need to be exposed. My 
main goals would be:

- expose enough information to reimplement Lib/tokenize.py
- replace Lib/tokenize.py with the C tokenizer

and the reasons would be:

- eliminate the (potential) drift and complexity between the two
- get a fast tokenizer


Unlike the AST, the tokenization changes much less frequently (the last major 
addition I can remember is the `@` operator).
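
As a small illustration of how such an addition surfaces in Lib/tokenize.py, the sketch below tokenizes a made-up expression using `@`. The generic token type stays `OP`, and the specific operator is only visible through `exact_type`.

```python
import io
import token
import tokenize

# Hypothetical sample input using the `@` operator.
source = "a @ b\n"

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Tokens are: NAME 'a', OP '@', NAME 'b', NEWLINE, ENDMARKER.
at_tok = tokens[1]

# .type is the generic OP; .exact_type distinguishes the operator (AT).
print(token.tok_name[at_tok.type], token.tok_name[at_tok.exact_type])
```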


We can hide almost all of the details of the tokenization behind an opaque 
struct and getter functions.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue3353>
_______________________________________