------- Comment #2 from rakdver at kam dot mff dot cuni dot cz 2010-03-03 09:21 -------
Subject: Re: Teaching SCEV about ADDR_EXPR causes regression
> This together with the patch mentioned in the previous comment allows to
> generate:
> ivtmp.0 = &a[0][0]
> <bb1>
> L1:
> s.0 = PHI(0, s.2)
> i.0 = PHI(122, i.1)
> ivtmp.1 = PHI(ivtmp.0, ivtmp.4)
> <bb2>
> L2:
> s.1 = PHI(s.0, s.2)
> j.0 = PHI(122, j.1)
> ivtmp.2 = PHI(ivtmp.1, ivtmp.3)
> s.2 = s.1 + MEM(ivtmp.2)
> ivtmp.3 = ivtmp.2 + 4
> j.1 = j.0 - 1
> if (j.1 >= 0) goto L2
> <bb3>
> ivtmp.4 = ivtmp.3 // would be ivtmp.4 = ivtmp.1 + stride
> i.1 = i.0 - 1
> if (i.1 >= 0) goto L1
>
> The improvement is that ivtmp.1 is not live across the inner loop.
>
> The approach is to store final values of IVs in a hashtable, mapping SSA_NAME
> of initial value in the preheader to aff_tree with final value, and then try to
> replace increments of new IVs with uses of IVs from inner loops (currently I
> just implemented a brute force loop over all IV uses to find a useful entry in
> that hashtable).
> Does this make sense and sound acceptable?

The approach seems ok. However, it is not immediately clear that performing the
replacement is a good idea -- it trades off register pressure for creating new
dependences, i.e., it makes register allocation easier, but scheduling harder.
So, some performance testing is necessary to check this.

Zdenek

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43174
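
For reference, the trade-off discussed above can be seen at the source level. This is only a rough C sketch, not the code from the PR: the flat array layout, the size N = 123 (inferred from the counters running 122..0), and both function names are assumptions made for illustration. The first function keeps a separate outer IV that is live across the inner loop; the second reuses the final value of the inner IV, which removes that live range but makes the outer IV depend on the inner loop's result.

```c
#define N 123   /* assumed: each loop runs 123 times (counters 122..0) */

/* Separate IVs: `row` stays live across the inner loop so that
   `row + stride` can be computed afterwards (more register pressure). */
int sum_separate_ivs(const int *a)
{
    int s = 0;
    const int *row = a;
    for (int i = N - 1; i >= 0; i--) {
        const int *p = row;
        for (int j = N - 1; j >= 0; j--) {
            s += *p;
            p += 1;
        }
        row += N;   /* ivtmp.4 = ivtmp.1 + stride */
    }
    return s;
}

/* Reused final value: the outer IV is taken from the inner IV's final
   value, so nothing extra is live across the inner loop, but the outer
   update now depends on the inner loop having finished. */
int sum_reused_final_value(const int *a)
{
    int s = 0;
    const int *outer = a;
    for (int i = N - 1; i >= 0; i--) {
        const int *p = outer;
        for (int j = N - 1; j >= 0; j--) {
            s += *p;
            p += 1;
        }
        outer = p;  /* ivtmp.4 = ivtmp.3 */
    }
    return s;
}
```

Both functions compute the same sum; which form is faster would have to be measured, as noted above.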