I think I'll reply to both of you in one post since the thoughts are related.
Ary Borenszweig wrote:
> Ellery Newcomer wrote:
>>
>> Not that I know anything about DMD outside parse.c, but in
>> expression.h, there be decls along the lines of
>>
>> struct UnaExp{
>>   Expression* e1;
>> }
>> struct BinExp{
>>   Expression* e1;
>>   Expression* e2;
>> }
>>
>> Iterating them would just be a tree walk. That takes care of 90% and ...
>>
>> Oh. Yeah. There are a bunch of special cases.
>>
>> But from what I've seen, iterating over an expression or any part of
>> the AST involves implementing a function (like semantic) at each
>> subclass. I'm betting there is no other way to do this.
>>

D'oh. Yep that's what I was noticing too. Thanks for confirming that.

>> I love ANTLR. It generates an arbitrary child/sibling AST, and walking
>> it is a piece of cake.
>
> It would be better if dmd's source code implemented the visitor pattern.
> Descent's port implements it and it uses it a a lot of places.

I wonder if Walter would go for it. Is he very receptive to purely architectural patches for dmd?

...

If you care for a bit of reading, I'll tell you my story and ping an idea.

This is about the property expression rewrite of course. I'd love to just use the current convention in dmd and write the rewrite as a non-recursive function that gets called at every point in the tree whenever someExpr->semantic(sc) is called. However, there's a snag. When doing the property rewrite, the inner-most subexpression will need to generate the outermost temporaries. An example:

    prop1.prop2 += foo;

AFAIK, the compiler sees this as

    ((prop1).prop2) += foo;

or

    ((prop1()).prop2()) += foo;

after resolveProperties is called. So prop1 is the innermost expression. It would be rewritten as this comma expression:

    (auto t1 = prop1()),
    (auto t2 = t1.prop2()),
    (t2 += foo),
    (t1.prop2(t2)),
    (prop1(t1))

So t1 is the outermost temporary, and it corresponds to prop1, the innermost expression.
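To make the need for those two setter calls concrete, here is a small C++ analogy. It is not dmd code and not D; `Inner`, `Owner`, `naiveAddAssign` and `rewrittenAddAssign` are hypothetical names invented for illustration, and I'm assuming the getters return by value (as they would for struct-typed properties). `prop1` is modelled as a member of `Owner` only so the example has somewhere to store state.

```cpp
#include <cassert>

// Hypothetical value type returned by prop1; prop2 is a getter/setter pair on it.
struct Inner {
    int value = 0;
    int prop2() const { return value; }   // getter
    void prop2(int v) { value = v; }      // setter
};

// Hypothetical owner of prop1; the getter hands out a copy, not a reference.
struct Owner {
    Inner inner;
    Inner prop1() const { return inner; } // getter
    void prop1(Inner v) { inner = v; }    // setter
};

// Only the getters are called, so the += lands on a temporary and is lost.
int naiveAddAssign(Owner &o, int foo) {
    int t = o.prop1().prop2();
    t += foo;
    return t;
}

// Mirrors the comma expression: getters first, then the operation,
// then the setters in the reverse order of the getters.
int rewrittenAddAssign(Owner &o, int foo) {
    auto t1 = o.prop1();   // auto t1 = prop1()
    auto t2 = t1.prop2();  // auto t2 = t1.prop2()
    t2 += foo;             // t2 += foo
    t1.prop2(t2);          // t1.prop2(t2)
    o.prop1(t1);           // prop1(t1)
    return t2;
}

// Small usage check: only the rewritten form updates the stored value.
void demo() {
    Owner o;
    o.inner.value = 10;
    naiveAddAssign(o, 5);
    assert(o.inner.value == 10);
    rewrittenAddAssign(o, 5);
    assert(o.inner.value == 15);
}
```

The write-backs run in the reverse order of the getter calls, so the temporary made for the innermost expression (t1) ends up bracketing everything else.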
This is problematic in dmd's traditional approach, because the semantic pass on prop1 does not have access to its enclosing expression (as far as I can tell). If it did, I'd just go to the root of the tree, uproot the tree, stick it in a comma expression, and just start nesting:

    CommaExp  // Step 1a for the rewrite.
    |
    +--> auto t1 = prop1()
    |
    +--> CommaExp  // Step 1b
         |
         +--> CommaExp  // Step 2a
         |    |
         |    +--> auto t2 = t1.prop2()
         |    |
         |    +--> CommaExp  // Step 2b
         |         |
         |         +--> t2 += foo;  // Originally the root.
         |         |
         |         +--> t1.prop2(t2);
         |
         +--> prop1(t1)

Then as long as the innermost expression does its rewrite before its parent, it will work. If it doesn't go first, well I dunno.

I don't seem to have this option anyways. I can't find a way to get the root. Perhaps adding a root expression pointer to the Scope (sc) object that gets tossed around semantic would work, but I haven't considered the bookkeeping involved with that. I already started taking a different approach before thinking of that.

My plan then is to do the rewrite after the root's expr->semantic(sc) call. It will probably look like expr->doPropertyRewrite(), and be called from statement.c. But in there will be the guts of the function that has a signature like this:

    Expression *doPropertyRewriteGreedily(
        Expression *expr,     // The current expression being rewritten.
        Expressions *prolog,  // Temporary decls and getter calls go here.
        Expressions *epilog,  // Setter calls go here.
        int treatAsLvalue     // Is this expression mutated by its parent?
        )

This works out pretty nicely since adding the temporaries and such into the prolog and epilog looks very natural. The function just calls itself to walk the AST.

The problem is the AST walking itself. The function has to know about expr's children in order to recurse on them. This is easy in some cases like postfix ++ and --, because that pattern is represented by one subclass of Expression: PostExp. Same for the properties themselves (CallExp).
But then there are the more general patterns like all other impure expressions (+=, -=, *=, /=, etc) and all pure expressions ever (+, -, *, /, %, &, |, pure functions, literals, etc). In a lot of those cases the rewrite doesn't care about analyzing the expression beyond a certain point, or at all, and it just wants to forward the children to the next recursion level. That's where I hit this iteration problem and all of the special cases.

Now for the idea.

Currently I'm planning an approach that's a bit more minimalistic than the visitor pattern; just to solve my immediate problem. I will define these methods in Expression:

    virtual Expression *getChild(unsigned index);
    virtual void setChild(unsigned index, Expression* val);
    virtual unsigned numChildren();

Then I will implement them as needed in the subclasses of Expression. I figure I'd have to write this logic into my function anyways, so I might as well make this usable to others. I've already written a draft of the property rewrite that uses these, so I know they get the job done in principle. I just hope this will all be OK in the patch (or even hold up under debugging).

So yeah, if I haven't put you to sleep already, please let me know if I messed up somewhere.

- Chad
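
For what it's worth, a minimal sketch of how the three child accessors might look, written in modern C++ against simplified stand-ins rather than dmd's real classes. The `Expression`, `UnaExp` and `BinExp` here only mirror the shapes quoted at the top of this post; `countNodes` and `replaceChild` are hypothetical helpers added to show that a generic walk and a generic in-place rewrite need no knowledge of the subclasses. The special cases in the real hierarchy would each need their own overrides.

```cpp
#include <cassert>

// Simplified stand-in for dmd's Expression; by default a node is a leaf.
struct Expression {
    virtual ~Expression() {}
    virtual unsigned numChildren() { return 0; }
    virtual Expression *getChild(unsigned) {
        assert(false && "leaf has no children");
        return nullptr;
    }
    virtual void setChild(unsigned, Expression *) {
        assert(false && "leaf has no children");
    }
};

// One child, like the UnaExp quoted above.
struct UnaExp : Expression {
    Expression *e1;
    explicit UnaExp(Expression *a) : e1(a) {}
    unsigned numChildren() override { return 1; }
    Expression *getChild(unsigned i) override { assert(i == 0); return e1; }
    void setChild(unsigned i, Expression *v) override { assert(i == 0); e1 = v; }
};

// Two children, like the BinExp quoted above.
struct BinExp : Expression {
    Expression *e1;
    Expression *e2;
    BinExp(Expression *a, Expression *b) : e1(a), e2(b) {}
    unsigned numChildren() override { return 2; }
    Expression *getChild(unsigned i) override {
        assert(i < 2);
        return i == 0 ? e1 : e2;
    }
    void setChild(unsigned i, Expression *v) override {
        assert(i < 2);
        if (i == 0) e1 = v; else e2 = v;
    }
};

// Hypothetical helper: generic tree walk that counts nodes.
unsigned countNodes(Expression *e) {
    unsigned n = 1;
    for (unsigned i = 0; i < e->numChildren(); ++i)
        n += countNodes(e->getChild(i));
    return n;
}

// Hypothetical helper: replace every occurrence of `from` below `root` with `to`.
void replaceChild(Expression *root, Expression *from, Expression *to) {
    for (unsigned i = 0; i < root->numChildren(); ++i) {
        if (root->getChild(i) == from)
            root->setChild(i, to);
        else
            replaceChild(root->getChild(i), from, to);
    }
}
```

Index-based accessors keep the change small (three virtual methods) compared with a full visitor, at the cost of each caller writing its own loop.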