This sounds like a very reasonable approach. Pruning clinits based on the whole-program CFG is definitely interesting, but is there an estimate of the savings (speed/size) that would be gained from that optimization? I ask just as a cross-check.
On Wed, Jun 17, 2009 at 1:16 PM, Ray Cromwell <cromwell...@gmail.com> wrote:
>
> Miguel, for what it's worth, I agree wholeheartedly. Object-oriented
> programming tends to encourage small block sizes except where inlining
> helps to enlarge them, so inter-block definitely helps a lot. Moreover,
> you can't really build a good CFA/DFA without inter-block information
> propagation. Here's an off-the-cuff proposal for doing this later:
>
> Step 1, as I outlined, is to take JExpressions which are not literals,
> fieldrefs, etc., and introduce temporaries so as to introduce a
> three-address style into the AST, e.g.:
>
>   a + b, where a is a unary/binary/etc. op, becomes t = a, t + b
>   return expr becomes t = expr; return t
>   while(expr) becomes t = expr, while(t) { t = expr; }
>
> This gives us an AST structure amenable to traditional analysis without
> introducing a totally new IR; we can reuse most of the existing JNode
> classes. Unlike what you'd find in the Dragon Book, though, we keep
> high-level ops for loops, conditionals, method calls, and everything else.
>
> Step 2: introduce a new JPhi node (extends JNode). Build the CFG. Add
> steps to compute dominator trees/dominance frontiers and use these to
> insert JPhi nodes into the ASTs. Modify all of the existing visitors to
> deal with JPhi. Add a stage to un-SSA.
>
> Step 3: introduce optimizations on this structure. Edge-split form.
> Faster "optimal" dominance algorithms (I say, use something simpler like
> the Cooper technique at first:
> http://www.hipersoft.rice.edu/grads/publications/dom14.pdf).
>
> The crucial question is whether to do a procedure-level CFG or a
> whole-program CFG. One of the things that interested me about the
> potential of building a limited whole-program CFG was the ability to
> aggressively (and correctly) prune clinits. Now, if the eager clinit
> hoisting is acceptable, then this would be overkill, but if not, then
> the way to do it is to compute the immediate dominator hierarchy for
> each block.
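The iterative dominance computation referenced above (the Cooper technique) is compact enough to sketch. This is a standalone illustrative version, not actual GWT compiler code; it assumes blocks are numbered 0..n-1 in reverse postorder with block 0 as the entry, and all class and parameter names are hypothetical:

```java
import java.util.*;

// Iterative dominator computation in the spirit of the simple-fast-dominance
// technique linked above. Blocks are numbered 0..n-1 in reverse postorder,
// with block 0 as the entry node.
class Dominators {
    // preds[b] lists the predecessors of block b. Returns idom[b] for every
    // block, with idom[entry] == entry.
    static int[] compute(int n, int[][] preds) {
        int[] idom = new int[n];
        Arrays.fill(idom, -1);          // -1 means "not yet processed"
        idom[0] = 0;                    // the entry dominates itself
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int b = 1; b < n; b++) {           // reverse postorder sweep
                int newIdom = -1;
                for (int p : preds[b]) {
                    if (idom[p] == -1) continue;    // skip unprocessed preds
                    newIdom = (newIdom == -1) ? p : intersect(idom, p, newIdom);
                }
                if (newIdom != idom[b]) { idom[b] = newIdom; changed = true; }
            }
        }
        return idom;
    }

    // Walk both "fingers" up the dominator tree until they meet; with
    // reverse-postorder numbering, walking up strictly decreases the index.
    static int intersect(int[] idom, int a, int b) {
        while (a != b) {
            while (a > b) a = idom[a];
            while (b > a) b = idom[b];
        }
        return a;
    }
}
```

For a diamond CFG (0 branching to 1 and 2, both joining at 3), the result is that block 0 immediately dominates all of 1, 2, and 3, which is exactly the information the clinit-pruning idea below needs.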
>
> If block B_j contains clinitA() and it is dominated by some block B_i in
> its immediate dominator chain which contains a call to clinitA(), then
> it is safe to remove clinitA() from B_j, since all possible paths of
> execution reaching B_j must first go through B_i, guaranteeing that
> clinitA() would already have been invoked.
>
> -Ray
>
> 2009/6/17 Miguel Méndez <mmen...@google.com>
>
>> If you are going to tackle the intra-block optimization, you might as
>> well go for inter-block opt with block summaries that merge across the
>> control flow. The latter is not that much harder, since you summarize
>> the intra-block data and propagate. But yes, doing SSA for real is a
>> non-trivial amount of work.
>>
>> On Wed, Jun 17, 2009 at 11:44 AM, Ray Cromwell <cromwell...@gmail.com> wrote:
>>
>>> I agree. My current proposal really isn't SSA, but a poor man's SSA:
>>> simply flatten blocks by introducing temporaries, renumber the
>>> existing variables, perform classic intra-block optimizations, and
>>> then undo. This would at least result in some modest block-level
>>> improvements, including making the builder pattern more optimal. The
>>> in-block renumbering procedure I've come up with essentially
>>> constructs the liveness range information as you're doing it, without
>>> having to deal with the more exotic SSA algorithms. It might be small
>>> enough to perform a useful experiment without too much work.
>>>
>>> Doing real SSA would require quite a bit more work, especially if you
>>> want to deal fully with exceptions and arrays. That to me is too big a
>>> change to the current compiler infrastructure at the moment.
>>>
>>> -Ray
>>>
>>> 2009/6/17 Miguel Méndez <mmen...@google.com>
>>>
>>>> Longer term, moving towards an SSA variant has a lot of advantages.
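The dominance-based clinit rule quoted above (remove clinitA() from B_j when some dominator of B_j already calls it) can be expressed as a short walk up the immediate-dominator chain. A minimal sketch under toy data structures — not the real JNode/CFG classes; `clinits` and `idom` are illustrative names:

```java
import java.util.*;

// Sketch of the clinit-pruning rule: a clinit call in block b is redundant
// if any proper dominator of b already calls the same clinit, because every
// path reaching b passes through that dominator first.
class ClinitPruner {
    // clinits.get(b) = set of clinit names invoked in block b.
    // idom[b] = immediate dominator of b, with idom[entry] == entry.
    static void prune(List<Set<String>> clinits, int[] idom) {
        for (int b = 0; b < clinits.size(); b++) {
            Iterator<String> it = clinits.get(b).iterator();
            while (it.hasNext()) {
                String c = it.next();
                // climb the dominator tree looking for an earlier call to c
                for (int d = idom[b]; ; d = idom[d]) {
                    if (d != b && clinits.get(d).contains(c)) {
                        it.remove();        // a dominator guarantees the call
                        break;
                    }
                    if (d == idom[d]) break; // reached the entry block
                }
            }
        }
    }
}
```

Note that pruning a call from a dominator is still safe for its dominees: the dominator's call was only removed because a yet-higher dominator carries it, so the guarantee holds transitively.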
>>>> In the near term, simply experimenting with performing the classic
>>>> intra-procedural (block-global) optimizations wouldn't necessarily
>>>> require SSA, although SSA does make the bookkeeping easier and
>>>> faster. If you introduced a pass to perform the intra-procedural
>>>> optimizations that would not violate the invariants expected by
>>>> subsequent optimizations (the procedure is still in tree form),
>>>> you'd go a long way.
>>>>
>>>> If you do go with SSA, it's my experience that you should go all the
>>>> way and deal with the phi functions; otherwise you end up with a lot
>>>> more temps than you would otherwise, and therefore more "live" refs,
>>>> etc.
>>>>
>>>> On Wed, Jun 17, 2009 at 1:12 AM, Ray Cromwell <cromwell...@gmail.com> wrote:
>>>>
>>>>> I agree, although I find having a wiki with inline responses rather
>>>>> convenient. Here is what I have so far from the Wave; at the end,
>>>>> there is a possible solution to the problem of optimizing away the
>>>>> builder pattern using general-purpose optimizations.
>>>>>
>>>>> -Ray
>>>>>
>>>>> Advanced GWT Compiler Optimizations
>>>>>
>>>>> The purpose of this wave is to talk about a list of future compiler
>>>>> improvements/optimizations for GWT that go beyond the scope of those
>>>>> listed in the current project wiki.
>>>>>
>>>>> Hybrid Intermediate Representations (Tree SSA)
>>>>> Control Flow / Data Flow Analysis
>>>>> Common Subexpression Elimination / Partial Redundancy Elimination
>>>>> Block-Level Dead Code Elimination
>>>>> Copy Propagation
>>>>> Register Allocation (temporary/local reduction)
>>>>> Escape Analysis
>>>>> Object Inlining / Destructuring
>>>>> Speculative Inlining / Function Cloning
>>>>> Efficient Anonymous Inner Classes
>>>>> Loop Hoisting / Loop Induction
>>>>>
>>>>> Intermediate Representations
>>>>>
>>>>> In order to perform many optimizations, it is easier to deal with a
>>>>> flatter data structure than the traditional AST, such as classic
>>>>> three-address code, but using pure three-address code for everything
>>>>> has its own downsides, as well as requiring substantial changes to
>>>>> the compiler.
>>>>>
>>>>> Matt M suggested a hybrid approach of only converting JExpressions
>>>>> to SSA form. This would have the benefit of localizing the changes.
>>>>> It reminded me of GCC's Tree SSA, which, upon re-review, looks like
>>>>> it can offer us some useful lessons.
>>>>>
>>>>> As an example, consider the expression x = a + b + c + d. In the
>>>>> current AST, this will look something like
>>>>> (assign x (+ d (+ c (+ a b)))). We can flatten this by introducing
>>>>> temporaries, arriving at:
>>>>>
>>>>>   t1 = a + b
>>>>>   t2 = t1 + c
>>>>>   t3 = t2 + d
>>>>>   x = t3
>>>>>
>>>>> In addition, if we are only dealing with simple blocks (no
>>>>> branches), we can use a simpler kind of SSA, which is simply to
>>>>> rename variables on assignment.
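The temporary-introducing flattening shown above (x = a + b + c + d becoming t1..t3) fits in a few lines over a toy expression tree. This is an illustrative sketch only — a made-up `Expr` class, not the real JExpression nodes:

```java
import java.util.*;

// Flattens a nested binary-expression tree into three-address statements by
// introducing a fresh temporary per interior node, in evaluation order.
class Flattener {
    static class Expr {
        String op, name;              // op == null marks a leaf variable
        Expr left, right;
        Expr(String name) { this.name = name; }
        Expr(String op, Expr l, Expr r) { this.op = op; left = l; right = r; }
    }

    int nextTemp = 1;
    List<String> code = new ArrayList<>();

    // Returns the name holding the expression's value, emitting temps as needed.
    String flatten(Expr e) {
        if (e.op == null) return e.name;
        String l = flatten(e.left), r = flatten(e.right);
        String t = "t" + nextTemp++;
        code.add(t + " = " + l + " " + e.op + " " + r);
        return t;
    }
}
```

Running it on the tree for (+ d (+ c (+ a b))) and then appending the final assignment reproduces exactly the four-statement sequence in the example above.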
>>>>>
>>>>> Thus, if you have
>>>>>
>>>>>   x = 1
>>>>>   y = 2
>>>>>   z = x + y
>>>>>   x++
>>>>>   y++
>>>>>   z = x + y
>>>>>
>>>>> you can rename each variable on assignment, yielding the following:
>>>>>
>>>>>   x1 = 1
>>>>>   y1 = 2
>>>>>   z1 = x1 + y1
>>>>>   x2 = x1 + 1
>>>>>   y2 = y1 + 1
>>>>>   z2 = x2 + y2
>>>>>
>>>>> This produces an SSA-like form, with each variable defined only
>>>>> once. At first glance, it looks like a de-optimization, but a later
>>>>> pass that does constant folding, copy propagation, and dead code
>>>>> elimination will clean this up. As an example, after copy
>>>>> propagation:
>>>>>
>>>>>   x1 = 1
>>>>>   y1 = 2
>>>>>   z1 = 1 + 2
>>>>>   x2 = 1 + 1
>>>>>   y2 = 2 + 1
>>>>>   z2 = x2 + y2
>>>>>
>>>>> after constant folding:
>>>>>
>>>>>   x1 = 1
>>>>>   y1 = 2
>>>>>   z1 = 3
>>>>>   x2 = 2
>>>>>   y2 = 3
>>>>>   z2 = x2 + y2
>>>>>
>>>>> after DCE (x1/y1/z1 no longer referenced):
>>>>>
>>>>>   x2 = 2
>>>>>   y2 = 3
>>>>>   z2 = x2 + y2
>>>>>
>>>>> after copy prop:
>>>>>
>>>>>   x2 = 2
>>>>>   y2 = 3
>>>>>   z2 = 2 + 3
>>>>>
>>>>> after constant folding:
>>>>>
>>>>>   x2 = 2
>>>>>   y2 = 3
>>>>>   z2 = 5
>>>>>
>>>>> after DCE:
>>>>>
>>>>>   z2 = 5
>>>>>
>>>>> Finally, after renaming registers back to their original names:
>>>>>
>>>>>   z = 5
>>>>>
>>>>> A register-allocation-style pass can further eliminate temporaries
>>>>> later.
>>>>>
>>>>> However, to do proper CFA/DFA, we may want to build a real SSA form,
>>>>> complete with phi functions, so we can do cross-block optimizations.
>>>>>
>>>>> Effects on the Builder Pattern
>>>>>
>>>>> Another recent list discussion concerned optimizing the builder
>>>>> pattern. Flattening the AST and using SSA techniques can have a
>>>>> positive effect on this as well.
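The rename-on-assignment scheme walked through above is only a few lines over a token-level model of straight-line code. Illustrative only — statements are modeled as string arrays like {"z", "=", "x", "+", "y"}, with x++ pre-expanded to x = x + 1, nothing like the real renumbering pass:

```java
import java.util.*;

// Poor man's SSA for a single basic block: every assignment defines a fresh
// numbered version of its target, and every use refers to the latest version.
class BlockRenamer {
    static List<String[]> rename(List<String[]> stmts) {
        Map<String, Integer> version = new HashMap<>();
        List<String[]> out = new ArrayList<>();
        for (String[] s : stmts) {
            String[] r = s.clone();
            // rewrite uses on the right-hand side first (tokens 2..n);
            // constants and operators have no version entry and pass through
            for (int i = 2; i < r.length; i++) {
                Integer v = version.get(r[i]);
                if (v != null) r[i] = r[i] + v;
            }
            // then define a fresh version for the target (token 0)
            int v = version.merge(r[0], 1, Integer::sum);
            r[0] = r[0] + v;
            out.add(r);
        }
        return out;
    }
}
```

Reading the RHS before bumping the target's version is what makes x = x + 1 come out as x2 = x1 + 1, matching the example above.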
>>>>> Consider:
>>>>>
>>>>>   Builder b = new Builder();
>>>>>   b.setA(10).setB(20).setC(30);
>>>>>
>>>>> After conversion to static methods:
>>>>>
>>>>>   setC(setB(setA(b, 10), 20), 30);
>>>>>
>>>>> Flattening/SSA:
>>>>>
>>>>>   t1 = setA(b, 10)
>>>>>   t2 = setB(t1, 20)
>>>>>   t3 = setC(t2, 30)
>>>>>
>>>>> After inlining:
>>>>>
>>>>>   b.A = 10
>>>>>   t1 = b
>>>>>   t1.B = 20
>>>>>   t2 = t1
>>>>>   t2.C = 30
>>>>>   t3 = t2
>>>>>
>>>>> After copy prop:
>>>>>
>>>>>   b.A = 10
>>>>>   t1 = b
>>>>>   b.B = 20
>>>>>   t2 = b
>>>>>   b.C = 30
>>>>>
>>>>> After dead code elimination:
>>>>>
>>>>>   b.A = 10
>>>>>   b.B = 20
>>>>>   b.C = 30
>>>>>
>>>>> On Tue, Jun 16, 2009 at 6:30 PM, Bruce Johnson <br...@google.com> wrote:
>>>>> > Re: Wave access, I was really mostly just being playful, but I'll
>>>>> > most certainly beg and plead to get this group early in line if at
>>>>> > all possible.
>>>>> >
>>>>> > Meanwhile, I agree with Cameron that we should make sure to
>>>>> > discuss the major ideas here on the mailing list, even if some of
>>>>> > the initial specifics are fleshed out in a wave. I definitely
>>>>> > don't want people who don't yet have Wave accounts to feel
>>>>> > penalized just because they weren't able to go to I/O. (But if you
>>>>> > didn't go to I/O this year, I really hope you can next year --
>>>>> > it's fantastic to meet together in person.)
>>>>> >
>>>>> > On Tue, Jun 16, 2009 at 9:22 PM, Cameron Braid <came...@braid.com.au> wrote:
>>>>> >>
>>>>> >> Unfortunately I wasn't able to attend Google I/O; however, I did
>>>>> >> watch a few of the sessions online.
>>>>> >>
>>>>> >> I signed up for the sandbox, but it appears that Google isn't
>>>>> >> going to grant access to non-I/O attendees for a little while yet.
>>>>> >>
>>>>> >> I'd appreciate an invite, but I understand if that's not feasible.
>>>>> >>
>>>>> >> Cheers
>>>>> >>
>>>>> >> Cameron
>>>>> >>
>>>>> >> 2009/6/17 Ray Cromwell <cromwell...@gmail.com>
>>>>> >>>
>>>>> >>> Cameron, were you at Google I/O or did you sign up for the
>>>>> >>> sandbox? I can ask a Googler to invite you if not.
>>>>> >>>
>>>>> >>> -Ray
>>>>> >>>
>>>>> >>> On Mon, Jun 15, 2009 at 6:21 PM, Cameron Braid <came...@braid.com.au> wrote:
>>>>> >>>>
>>>>> >>>> 2009/6/16 Bruce Johnson <br...@google.com>
>>>>> >>>>>
>>>>> >>>>> I'm starting to make a bit o' progress on this. I'll send out
>>>>> >>>>> a design doc "real soon now".
>>>>> >>>>>
>>>>> >>>>> BTW, anyone on the Contributors list here have Wave sandbox
>>>>> >>>>> accounts? Sure would be easier to discuss this in a wave...
>>>>> >>>>
>>>>> >>>> No :(
>>>>> >>>>
>>>>> >>>> But if you are offering invites, send one this way :)
>>>>> >>>> Otherwise, it's probably wise to keep the discussion in a place
>>>>> >>>> where all interested parties can participate.
>>>>> >>>>
>>>>> >>>> Cam
>
> --
> Miguel

--
http://groups.google.com/group/Google-Web-Toolkit-Contributors