[Python-Dev] Re: PEP 617: New PEG parser for CPython

Gregory P. Smith Tue, 21 Apr 2020 22:02:25 -0700

On Tue, Apr 21, 2020 at 9:35 PM Gregory P. Smith <g...@krypto.org> wrote:


> Could we go ahead and mark lib2to3 as Pending Deprecation in 3.9 so we can
> get it out of the stdlib by 3.11 or 3.12?
>

I'm going ahead and tracking the idea in https://bugs.python.org/issue40360.
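
For illustration only, here is a minimal sketch of how a module could signal a pending deprecation at runtime. The helper name `warn_pending_deprecation` and the message wording are assumptions made for this example; this is not the actual lib2to3 code.

```python
import warnings


def warn_pending_deprecation(module_name: str, removal_hint: str) -> None:
    """Emit a PendingDeprecationWarning for a module slated for removal.

    Hypothetical helper: the name and message text are illustrative only.
    """
    warnings.warn(
        f"{module_name} is pending deprecation and may be removed {removal_hint}",
        PendingDeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )


# PendingDeprecationWarning is ignored by default, so opt in to observe it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_pending_deprecation("lib2to3", "in a future release")

print(caught[0].category.__name__, "-", caught[0].message)
```

Because this warning category is hidden by default, users would only see it when running with warnings enabled (for example `python -W always`), which is what makes it a gentle first step before a full DeprecationWarning.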


>
> lib2to3 is the basis of all sorts of general source code manipulation
> tooling.  It has outgrown its name and original raison d'être.  It is
> actively used to parse and rewrite Python 3 code all the time.  yapf uses
> it, black uses a fork of it.  Other Python code manipulation tooling uses
> it.  Modernize-like fixers are useful for all sorts of cleanups.
>
> IMNSHO it would be better if lib2to3 were *not* in the stdlib anymore -
> Black already chose to fork lib2to3
> <https://github.com/psf/black/tree/master/blib2to3>.  So given that it is
> eventually not going to be able to parse future syntax, the better answer
> seems like deprecation, putting the final version up on PyPI and letting
> any descendants of it live on PyPI where they can get more active care than
> a stdlib module ever does.
>
> -gps
>
>
> On Tue, Apr 21, 2020 at 6:58 PM Guido van Rossum <gu...@python.org> wrote:
>
>> Great! Please submit a PR to update the [lib]2to3 docs and CC me
>> (@gvanrossum).
>>
>> While perhaps it wouldn't hurt if the PEP mentioned lib2to3, it was just
>> accepted by the Steering Council without such language, and I wouldn't want
>> to imply that the SC agrees with everything I said. So I still think we
>> ought to deal with lib2to3 independently (and no, it won't need its own PEP
>> :-). A reasonable option would be to just deprecate it and recommend people
>> use parso, LibCST or something else (I wouldn't recommend pegen in its
>> current form yet).
>>
>> On Tue, Apr 21, 2020 at 6:21 PM Carl Meyer <c...@oddbird.net> wrote:
>>
>>> On Sat, Apr 18, 2020 at 10:38 PM Guido van Rossum <gu...@python.org>
>>> wrote:
>>> >
>>> > Note that, while there is indeed a docs page about 2to3, the only docs
>>> > for lib2to3 in the standard library reference are a link to the source code
>>> > and a single "Note: The lib2to3 API should be considered unstable and may
>>> > change drastically in the future."
>>> >
>>> > Fortunately, in order to support the 2to3 application, lib2to3
>>> > doesn't need to change, because the syntax of Python 2 is no longer
>>> > changing. :-) Choosing to remove 2to3 is an independent decision. And
>>> > lib2to3 does not depend in any way on the old parser module. (It doesn't
>>> > even use the standard tokenize module, but incorporates its own version
>>> > that is slightly tweaked to support Python 2.)
>>>
>>> Indeed! Thanks for clarifying, I now recall that I already knew it
>>> doesn't, but forgot.
>>>
>>> The docs page for 2to3 does currently say "lib2to3 could also be
>>> adapted to custom applications in which Python code needs to be edited
>>> automatically." Perhaps at least this sentence should be removed, and
>>> maybe also replaced with a clearer note that lib2to3 not only has an
>>> unstable API, but also should not necessarily be expected to continue
>>> to parse future Python versions, and thus building tools on top of it
>>> should be discouraged rather than recommended. (Maybe even use the
>>> word "deprecated.") Happy to submit a PR for this if you agree it's
>>> warranted.
>>>
>>> It still seems to me that it wouldn't hurt for PEP 617 itself to also
>>> mention this shift in lib2to3's effective status (from "available but
>>> no API stability guarantee" to "probably will not parse future Python
>>> versions") as one of its indirect effects.
>>>
>>> > You've mentioned a few different tools that already use different
>>> > technologies: LibCST depends on parso which has a fork of pgen2, lib2to3
>>> > which has the original pgen2. I wonder if this would be an opportunity to
>>> > move such parsing support out of the standard library completely. There are
>>> > already two versions of pegen, but neither is in the standard library:
>>> > there is the original pegen repo which is where things started, and there
>>> > is a fork of that code in the CPython Tools directory (not yet in the
>>> > upstream repo, but see PR 19503).
>>> >
>>> > The pegen tool has two generators, one generating C code and one
>>> > generating Python code. I think that the C generator is really only
>>> > relevant for CPython itself: it relies on the builtin tokenizer (the one
>>> > written in C, not the stdlib tokenize.py) and the generated C code depends
>>> > on many internal APIs. In fact the C generator in the original pegen repo
>>> > doesn't work with Python 3.9 because those internal APIs are no longer
>>> > exported. (It also doesn't work with Python 3.7 or older because it makes
>>> > critical use of the walrus operator. :-) Also, once we started getting
>>> > serious about replacing the old parser, we worked exclusively on the C
>>> > generator in the CPython Tools directory, so the version in the original
>>> > pegen repo is lagging quite a bit behind (as is the Python grammar in that
>>> > repo). But as I said you're not gonna need it.
>>> >
>>> > On the other hand, the Python generator is designed to be flexible,
>>> > and while it defaults to using the stdlib tokenize.py tokenizer, you can
>>> > easily hook up your own. Putting this version in the stdlib would be a
>>> > mistake, because the code is pretty immature; it is really waiting for a
>>> > good home, and if parso or LibCST were to decide to incorporate a fork of
>>> > it and develop it into a high quality parser generator for Python-like
>>> > languages that would be great. I wouldn't worry much about the duplication
>>> > of code -- the Python generator in the CPython Tools directory is only used
>>> > for one purpose, and that is to produce the meta-parser (the parser for
>>> > grammars) from the meta-grammar. And I would happily stop developing the
>>> > original pegen once a fork is being developed.
>>>
>>> Thanks, this is all very clarifying! I hadn't even found the original
>>> gvanrossum/pegen repo, and was just looking at the CPython PR for PEP
>>> 617. Clearly I haven't been following this work closely.
>>>
>>> > Another option would be to just improve the Python generator in the
>>> > original pegen repo to satisfy the needs of parso and LibCST. Reading the
>>> > blurb for parso it looks like it really just parses *Python*, which is less
>>> > ambitious than pegen. But it also seems to support error recovery, which
>>> > currently isn't part of pegen. (However, we've thought about it.) Anyway,
>>> > regardless of how exactly this is structured someone will probably have to
>>> > take over development and support. Pegen started out as a hobby project to
>>> > educate myself about PEG parsers. Then I wrote a bunch of blog posts about
>>> > my approach, and finally I started working on using it to generate a
>>> > replacement for the old pgen-based parser. But I never found the time to
>>> > make it an appealing parser generator tool for other languages, even though
>>> > that was on my mind as a future possibility. It will take some time to
>>> > disentangle all this, and I'd be happy to help someone who wants to work on
>>> > this.
>>>
>>> This seems like the place to start. When we start work on Python 3.10
>>> support for LibCST, we can start with trying to use and adapt pegen in
>>> place of the vendored fork of parso we currently use, and if that's
>>> promising enough, consider taking over maintenance of it.
>>>
>>> Carl
>>>
>>
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>> *Pronouns: he/him **(why is my pronoun here?)*
>> <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
>> _______________________________________________
>> Python-Dev mailing list -- python-dev@python.org
>> To unsubscribe send an email to python-dev-le...@python.org
>> https://mail.python.org/mailman3/lists/python-dev.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-dev@python.org/message/RDV3GAL2SPBGDAOX27ZMXMX7SETZ3D7M/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
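
On Guido's point above that the Python generator defaults to the stdlib tokenize.py tokenizer but lets you hook up your own, here is a rough sketch of what a swappable token source could look like. Only the `tokenize` calls are real stdlib APIs; `TokenSource`, `stdlib_tokens`, `significant_tokens` and `token_strings` are hypothetical names invented for this example and are not pegen's actual interface.

```python
import io
import tokenize
from typing import Callable, Iterator, List

# Hypothetical alias: any callable that turns source text into a token stream.
TokenSource = Callable[[str], Iterator[tokenize.TokenInfo]]


def stdlib_tokens(source: str) -> Iterator[tokenize.TokenInfo]:
    """Default token source: the stdlib tokenize.py tokenizer."""
    return tokenize.generate_tokens(io.StringIO(source).readline)


def significant_tokens(source: str) -> Iterator[tokenize.TokenInfo]:
    """Custom token source: drop comments and non-logical newlines."""
    skip = {tokenize.COMMENT, tokenize.NL}
    return (tok for tok in stdlib_tokens(source) if tok.type not in skip)


def token_strings(
    source: str, token_source: TokenSource = stdlib_tokens
) -> List[str]:
    """Stand-in for a parser front end: collect the non-blank token strings."""
    return [tok.string for tok in token_source(source) if tok.string.strip()]


code = "x = 1  # set x\n"
print(token_strings(code))                      # default tokenizer
print(token_strings(code, significant_tokens))  # custom tokenizer hooked in
```

The design choice here is simply to pass the tokenizer in as a callable, so a consumer can change how tokens are produced without touching the code that consumes them.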

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Q4SBIFUGFKYIJ437FG5IS7YNESS2C5LI/
Code of Conduct: http://python.org/psf/codeofconduct/
