On Aug 15, 2019, at 13:02, Caleb Donovick <donov...@cs.stanford.edu> wrote:
> 
> When walking an ast it impossible to know the type of an empty list without 
> writing down some giant lookup from node types and field names to field types.

This part could be solved without making the lists AST nodes at all. Just use 
new bare subclasses of list for each of the kinds of lists, so a field in a 
node never has an empty list, it has an empty ExprList or StmtList or whatever. 
If that’s sufficient for your needs, that seems pretty easy, and I don’t think 
it would have any compatibility problems.

If that’s not sufficient, and you really do need StmtList, etc. to be nodes, 
it’s a bit trickier, but I think still doable.

Before I get to that, it seems like either the simple version above or the more 
involved version below is something you could write pretty easily (replacing 
ast.py with a fork but still using the same _ast.so under the covers), and 
share as a third-party module. I’m not sure how useful it would be on PyPI, but 
it would at least be useful to demonstrate to people here. Of course the real
implementation will probably require substantial changes to the C code, but if 
you can build a playable prototype without any such changes, that’s always 
handy.

Also, while a I can see a use for StmtList (because every StmtList is a block 
or a module body, so they all have quite a bit in common), where would you use 
ExprList? Is there anything you want to do with both decorator chains and del 
targets but not with non-ExprList nodes? I think to actually be useful, you’d 
need a bunch of separate subclasses of ExprList. And meanwhile, some of the 
lists that the grammar requires, like WithItemList, don’t seem like they’d ever 
be useful either. So, you are exposing a lot of new public types that nobody 
will ever care about, making it harder to find the ones they do care about in 
help, etc. I could be wrong; maybe you do need to distinguish among all these 
different kinds of lists by type, but don’t need to distinguish the different 
kinds of ExprList by type. But this implies that you probably need to give us a 
more concrete use case before the idea can be judged fairly.

Anyway, on to the fun bit:

> More concretely it would nice be to able to programatically visit all blocks 
> (stmt*)  without having to something like:
> 
> ```
> class BlockVisitor(NodeVisitor):
>     def visit_If(self, node: If):
>         self.visit(node.test)
>         self.visit_block(node.body)
>         self.visit_block(node.orelse)
>        
>     def visit_FunctionDef(self, node: FunctionDef):
>         for field, value in iter_fields(node):
>             if field == 'body':
>                 self.visit_block(value)
>             else:
>                 # the implementation of generic_visit
> ```
> Now it turns out that all fields that are lists and are named "body", 
> "orelse", or "finalbody" are stmt* and only such fields are stmt*.  A rule 
> could also be synthesized to identify expr* and so forth but this seems 
> incredibly hacky to me.
> 
> It would be much cleaner if <type>* were actual nodes in the ast. E.g. 
> something like: 
> ```
> class ast_list(AST, MutableSequence[T_co]): ... 
> class StmtList(ast_list[stmt]): ... 
> class ExprList(ast_list[expr]): ...
> ...
> class FunctionDef(stmt):
>     name: identifier
>     args: arguments
>     body: StmtList
>     decorator_list: ExprList
>     returns: Optional[expr]
> ```

The first problem with treating lists as nodes is that the grammar doesn’t 
treat lists as nodes, so you’re introducing a disconnect. But it’s obviously a 
useful disconnect in some cases. So maybe the right answer is to add a new flag 
param lists_as_nodes=False to iter_child_nodes (and any functions that pass 
through to it like walk), NodeVisitor, and NodeTransformer?

Making them nodes also means visiting an AST in 3.9 would see a very different 
tree than visiting the same AST in 3.8. And, while ast explicitly allows for 
breaking changes like that between versions, it still seems like a good idea to 
avoid gratuitous massive changes when possible. But the flag solves that too.

The bigger problem is that if you have any code that walks ASTs manually— 
including generic code similar to walk or NodeVisitor—that code isn’t going to 
know how to walk the elements of an ast_list, because they aren’t fields. 
That’s a huge backward compatibility problem. But actually, I think this can be 
solved very easily: just add a field to ast_list named elts that contains a 
reference to self (or even a copy of the same list as a plain old list).

So that just leaves the type issue you started with. You want ast_list to be an 
AST node, ideally without breaking the fact that it’s a list. But you can’t 
subclass both list and AST because AST is a structseq, and you can only have 
one base class with struct members. But I don’t think there’s any actual 
behavior in AST that you need (except maybe for looking like a structseq, with 
_fields, but you can fake that with a plain old class a la namedtuple or 
dataclass, as long as there’s no C code that relies on being able to iterate 
the structseq fields). It I’m right, there are two possible solutions you could 
explore.

Option 1 is to rename AST to _AST, then make a new empty base class AST that 
both _AST and ast_list subclass. (The lists_as_nodes flag then just selects 
between testing AST and _AST.)

Option 2 is to only virtually subclass AST. The new ast_list inherits only from 
list, but AST has a subclass hook that accepts ast_list as a subclass. (The 
lists_as_nodes flag becomes slightly more complicated, but still pretty 
simple—and besides, that’s all under-the-covers code.)


_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/MHWYZSKMMYDR24XYV4BYNY7WFUZU6EQD/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to