On Aug 15, 2019, at 13:02, Caleb Donovick <donov...@cs.stanford.edu> wrote: > > When walking an ast it impossible to know the type of an empty list without > writing down some giant lookup from node types and field names to field types.
This part could be solved without making the lists AST nodes at all. Just use new bare subclasses of list for each of the kinds of lists, so a field in a node never has an empty list, it has an empty ExprList or StmtList or whatever. If that’s sufficient for your needs, that seems pretty easy, and I don’t think it would have any compatibility problems. If that’s not sufficient, and you really do need StmtList, etc. to be nodes, it’s a bit trickier, but I think still doable. Before I get to that, it seems like either the simple version above or the more involved version below is something you could write pretty easily (replacing ast.py with a fork but still using the same _ast.so under the covers), and share as a third-party module. I’m not sure how useful it would be on PyPI, but it would at least be useful to demonstrate to people here. Of course the real implementation will probably require substantial changes to the C code, but if you can build a playable prototype without any such changes, that’s always handy. Also, while a I can see a use for StmtList (because every StmtList is a block or a module body, so they all have quite a bit in common), where would you use ExprList? Is there anything you want to do with both decorator chains and del targets but not with non-ExprList nodes? I think to actually be useful, you’d need a bunch of separate subclasses of ExprList. And meanwhile, some of the lists that the grammar requires, like WithItemList, don’t seem like they’d ever be useful either. So, you are exposing a lot of new public types that nobody will ever care about, making it harder to find the ones they do care about in help, etc. I could be wrong; maybe you do need to distinguish among all these different kinds of lists by type, but don’t need to distinguish the different kinds of ExprList by type. But this implies that you probably need to give us a more concrete use case before the idea can be judged fairly. Anyway, on to the fun bit: > More concretely it would nice be to able to programatically visit all blocks > (stmt*) without having to something like: > > ``` > class BlockVisitor(NodeVisitor): > def visit_If(self, node: If): > self.visit(node.test) > self.visit_block(node.body) > self.visit_block(node.orelse) > > def visit_FunctionDef(self, node: FunctionDef): > for field, value in iter_fields(node): > if field == 'body': > self.visit_block(value) > else: > # the implementation of generic_visit > ``` > Now it turns out that all fields that are lists and are named "body", > "orelse", or "finalbody" are stmt* and only such fields are stmt*. A rule > could also be synthesized to identify expr* and so forth but this seems > incredibly hacky to me. > > It would be much cleaner if <type>* were actual nodes in the ast. E.g. > something like: > ``` > class ast_list(AST, MutableSequence[T_co]): ... > class StmtList(ast_list[stmt]): ... > class ExprList(ast_list[expr]): ... > ... > class FunctionDef(stmt): > name: identifier > args: arguments > body: StmtList > decorator_list: ExprList > returns: Optional[expr] > ``` The first problem with treating lists as nodes is that the grammar doesn’t treat lists as nodes, so you’re introducing a disconnect. But it’s obviously a useful disconnect in some cases. So maybe the right answer is to add a new flag param lists_as_nodes=False to iter_child_nodes (and any functions that pass through to it like walk), NodeVisitor, and NodeTransformer? Making them nodes also means visiting an AST in 3.9 would see a very different tree than visiting the same AST in 3.8. And, while ast explicitly allows for breaking changes like that between versions, it still seems like a good idea to avoid gratuitous massive changes when possible. But the flag solves that too. The bigger problem is that if you have any code that walks ASTs manually— including generic code similar to walk or NodeVisitor—that code isn’t going to know how to walk the elements of an ast_list, because they aren’t fields. That’s a huge backward compatibility problem. But actually, I think this can be solved very easily: just add a field to ast_list named elts that contains a reference to self (or even a copy of the same list as a plain old list). So that just leaves the type issue you started with. You want ast_list to be an AST node, ideally without breaking the fact that it’s a list. But you can’t subclass both list and AST because AST is a structseq, and you can only have one base class with struct members. But I don’t think there’s any actual behavior in AST that you need (except maybe for looking like a structseq, with _fields, but you can fake that with a plain old class a la namedtuple or dataclass, as long as there’s no C code that relies on being able to iterate the structseq fields). It I’m right, there are two possible solutions you could explore. Option 1 is to rename AST to _AST, then make a new empty base class AST that both _AST and ast_list subclass. (The lists_as_nodes flag then just selects between testing AST and _AST.) Option 2 is to only virtually subclass AST. The new ast_list inherits only from list, but AST has a subclass hook that accepts ast_list as a subclass. (The lists_as_nodes flag becomes slightly more complicated, but still pretty simple—and besides, that’s all under-the-covers code.) _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/MHWYZSKMMYDR24XYV4BYNY7WFUZU6EQD/ Code of Conduct: http://python.org/psf/codeofconduct/