[issue33337] Provide a supported Concrete Syntax Tree implementation in the standard library

Łukasz Langa Mon, 23 Apr 2018 01:02:27 -0700

Łukasz Langa <luk...@langa.pl> added the comment:

> These modifications are applied only before bytecode generation. The AST 
> presented to user is not modified.


This bit me when implementing PEP 563 but I was then on the compile path, 
right.  Still, the latest docstring folding would qualify as an example here, 
too, no?


> Is this a problem? 2.7 is a dead end, its support will end in less than 
> 2 years. Even 3.6 will move to a security-fixes-only stage shortly after 
> 3.8 is released.

Yes, it is a problem.  We will support Python 2 until 2020 but people will be 
running Python 2 code for a decade *at least*.  We need to provide those people 
a way to move their code forward.  Static analysis tools like formatters, 
linters, type checkers, or 2to3-style translators, are all soon going to run on 
Python 3.  It would be a shame if those programs were barred from helping users 
that are still struggling on Python 2.

A closer example is async/await.  It would be a shame if running on Python 3.7 
meant you can't write a tool that renames (or even just *detects*) invalid uses 
of async/await.  I firmly believe that the version of the runtime should be 
independent of the version it's able to analyze.
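
To make the async/await point concrete, here is a rough sketch (not taken from 
any existing tool) that uses the stdlib `tokenize` module to spot `async` used 
as a plain identifier.  It assumes a Python 3.7+ runtime, where `ast.parse` 
rejects such code outright but the tokenizer still reports `async` as a NAME 
token.  The function name is made up for illustration and the heuristic is 
deliberately naive.

```python
import io
import tokenize

# Keywords that may legitimately follow `async` in modern Python.
_VALID_FOLLOWERS = ("def", "for", "with")


def find_async_identifier_uses(source):
    """Return (row, col) of each `async` that looks like a plain identifier.

    Hypothetical helper: a naive heuristic, not a real parser.
    """
    readline = io.StringIO(source).readline
    # Drop comments and non-logical newlines so "next token" is meaningful.
    tokens = [
        tok for tok in tokenize.generate_tokens(readline)
        if tok.type not in (tokenize.COMMENT, tokenize.NL)
    ]
    hits = []
    for tok, nxt in zip(tokens, tokens[1:]):
        if (
            tok.type == tokenize.NAME
            and tok.string == "async"
            and nxt.string not in _VALID_FOLLOWERS
        ):
            hits.append(tok.start)
    return hits


if __name__ == "__main__":
    legacy = "async = 1\nasync def f():\n    pass\n"
    print(find_async_identifier_uses(legacy))
```

A tool built only on `ast` could not even load the first line of that input 
on 3.7+, which is the limitation I'm describing.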


> I'm in favor of updating Lib/lib2to3/pgen2/tokenize.py, but I don't 
> understand why Lib/tokenize.py should parse 2.7.

Hopefully I sufficiently explained that above.


> I'm in favor of reimplementing pgen in Python if this will simplify the code 
> and the building process. Python code is simpler than C code, this code is 
> not performance critical, and in any case we need an external Python when 
> we modify the grammar or bytecode.

Well, I didn't think about abandoning pgen.  I admit that's mostly because my 
knee-jerk reaction was that it would be too slow.  But you're right that this 
is not performance critical because every `pip install` runs `compileall`.

I guess we could parse in "strict" mode for Python itself but allow for 
multiple grammars for standard library use (as I explained in the reply to 
Guido).  And this would most likely give us opportunity to iterate on grammar 
improvements in the future.

And yet, I'm cautious here.  Even ignoring performance, that sounds like a more 
ambitious task than what I'm attempting.  Unless I find partners in crime for 
this, I wouldn't attempt that.  And I would need thumbs up from the BDFL and 
performance-wary contributors.


> For what purposes the CST is needed besides 2to3?

Anywhere you need the full view of the code, including its non-semantic 
pieces.  Those include:
- whitespace;
- comments;
- parentheses;
- commas;
- string prefixes.
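
As a small illustration of the list above, the sketch below compares what the 
stdlib `ast` and `tokenize` modules keep from the same line of source.  The 
helper names are made up for this example.  It only shows that comments 
vanish from the AST while the token stream still carries them; it is not a 
CST implementation.

```python
import ast
import io
import tokenize

SOURCE = "x = (1 +  2)  # keep me\n"


def ast_view(source):
    """Dump the AST: no comments, parentheses, or spacing survive."""
    return ast.dump(ast.parse(source))


def comment_tokens(source):
    """Return the comment strings that the tokenizer still reports."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.COMMENT
    ]


if __name__ == "__main__":
    print(ast_view(SOURCE))
    print(comment_tokens(SOURCE))
```

A formatter or refactoring tool needs both views at once: the tree structure 
and the pieces the AST throws away.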

The main use case is linters and refactoring tools.  For example, mypy uses 
a modified AST to support type comments.  YAPF and Black are based on lib2to3 
because as formatters they can't lose comments, string prefixes, or 
organizational parentheses.  Jedi uses parso, a lib2to3 fork, for similar 
reasons.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33337>
_______________________________________