On Sat, Mar 26, 2005 at 12:00:43PM -0500, Kazu Hirata wrote: > Have you considered merging CCP and VRP (as suggested by Kenny last > year at the summit)? > By merging, do you mean *replacing* CCP with VRP? Yes, it's doable. No, it's not a good idea.
Because of its lattice evaluation, VRP is about 3x slower than CCP. Consider that we currently schedule a single VRP pass and 3 CCP passes. On cc1-i-files, that single VRP accounts for ~4 seconds of compile time, the three CCP passes account for ~5 seconds of compile time. Also consider that currently our VRP pass is pretty quick because it does not propagate branch probabilities. It only evaluates probabilities 0 and 1. I have plans to change it so that it can feed information to branch prediction. That will make it even more heavyweight. Furthermore, VRP only deals with GIMPLE registers. Our CCP pass can propagate load/store constants. If we burdened VRP with loads and stores, it would be even slower. So, while VRP can subsume some of the actions of CCP, it's much slower and can't really be run all that often. It's fine to allow it to do some constant propagation. But morphing the two passes into one will not gain us much. > <bb 0>: > if (a_1 == 0) goto <L0>; else goto <L2>; > > <L0>:; > a_2 = 0; > bar (a_2); <-- a_2 isn't replaced with 0 yet! > > Note that we don't have bar (0) at the end. This is because currently > VRP does not have the equivalent of substitute_and_fold. We just use > the range information to fold COND_EXPR; we don't fold each statement > using constants and ranges gathered by VRP. > Right. And that is something that is easily doable in vrp_finalize. It's in my todo pile, but if you want to do it, go right ahead. > So I am thinking about inserting ASSERT_EXPRs up front *before* going > into SSA, without much of pruning, and then run an enhanced version of > I was doing this originally. It turns out to be easier and faster to insert the assertions once we are in SSA form. If you go back in time in the VRP code, you'll see that it evolved from there. What we could do is insert the assertions, do the various passes that take advantage of them and then remove the assertions once they are no longer necessary. I still haven't read in detail your plan for using ASSERT_EXPRs in the jump threader, but at first sight it sounded decent. > statements just before hitting loop optimizers. :-) I have not > figured out how to deal with ASSERT_EXPRs in FRE, but Daniel Berlin > says I just have to teach tree-vn.c how to deal with it. > At one point, all the passes had to deal with ASSERT_EXPRs. Mostly by ignoring them. Which is additional, unwanted work because some of them had to actively know about them being nothing but fancy copy operations. That gets in the way of their work. I think that ASSERT_EXPRs should only survive as long as they're useful. > Last but not least, I'm willing to do the work, but I'd like to be > more or less on the same page with other people hacking these scalar > optimizations before I do any significant work. > Sure. Go ahead. My short term plan is to start merging the various components into mainline. I'll start with the incremental SSA patches, followed by VRP, the CCP and copy-prop improvements. Perhaps we could leave the changes to the threader in TCB for a little while longer, but first I need to read your proposal in detail. > By the way, looking at Kenny's slides from last year, one thing we > have not implemented in our propagation engine is to process the CFG > worklist in the DFS order of a dominator tree. I haven't looked > closely at this, but if the speed of propagation is a concern, we > should come back to this. > ISTR either stevenb or dberlin implementing a dom-order propagation. I think they got a minor speedup that could be worth having. Diego.