> -----Original Message-----
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of
> Jiangning Liu
> Sent: Tuesday, February 28, 2012 11:19 AM
> To: 'William J. Schmidt'
> Cc: gcc@gcc.gnu.org; wschm...@gcc.gnu.org
> Subject: RE: A question about redundant PHI expression stmt
> 
> 
> 
> > -----Original Message-----
> > From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf
> Of
> > William J. Schmidt
> > Sent: Monday, February 27, 2012 11:32 PM
> > To: Jiangning Liu
> > Cc: gcc@gcc.gnu.org; wschm...@gcc.gnu.org
> > Subject: Re: A question about redundant PHI expression stmt
> >
> > On Fri, 2012-02-24 at 16:07 +0800, Jiangning Liu wrote:
> > > Hi,
> > >
> > > For the small case below, there are some redundant PHI expression
> > stmt
> > > generated, and finally cause back-end generates redundant copy
> > instructions
> > > due to some reasons around IRA.
> > >
> > > int *l, *r, *g;
> > > void test_func(int n)
> > > {
> > >   int i;
> > >   static int j;
> > >   static int pos, direction, direction_pre;
> > >
> > >   pos = 0;
> > >   direction = 1;
> > >
> > >   for ( i = 0; i < n; i++ )
> > >   {
> > >           direction_pre = direction;
> > >
> > >           for ( j = 0; j <= 400; j++ )
> > >           {
> > >
> > >                   if ( g[pos] == 200 )
> > >                           break;
> > >
> > >                   if ( direction == 0 )
> > >                           pos = l[pos];
> > >                   else
> > >                           pos = r[pos];
> > >
> > >                   if ( pos == -1 )
> > >                   {
> > >                           if ( direction_pre != direction )
> > >                                   break;
> > >
> > >                           pos = 0;
> > >                           direction = !direction;
> > >                   }
> > >
> > >           }
> > >
> > >           f(g[pos]);
> > >   }
> > > }
> > >
> > > I know PR39976 has something to do with this case, and check-in
> > r182140
> > > caused big degradation on the real benchmark, but I'm still
> confusing
> > around
> > > this issue.
> > >
> > > The procedure is like this,
> > >
> > > Loop store motion generates code below,
> > >
> > > <bb 6>:
> > >   D.4084_17 = l.4_13 + D.4077_70;
> > >   pos.5_18 = *D.4084_17;
> > >   pos_lsm.34_103 = pos.5_18;
> > >   goto <bb 8>;
> > >
> > > <bb 7>:
> > >   D.4088_23 = r.6_19 + D.4077_70;
> > >   pos.7_24 = *D.4088_23;
> > >   pos_lsm.34_104 = pos.7_24;
> > >
> > > <bb 8>:
> > >   # prephitmp.29_89 = PHI <pos.5_18(6), pos.7_24(7)>
> > >   # pos_lsm.34_53 = PHI <pos_lsm.34_103(6), pos_lsm.34_104(7)>
> > >   pos.2_25 = prephitmp.29_89;
> > >   if (pos.2_25 == -1)
> > >     goto <bb 9>;
> > >   else
> > >     goto <bb 20>;
> > >
> > > And then, copy propagation transforms block 8 it into
> > >
> > > <bb 8>:
> > >   # prephitmp.29_89 = PHI <pos.5_18(11), pos.7_24(12)>
> > >   # pos_lsm.34_53 = PHI <pos.5_18(11), pos.7_24(12)>
> > >   ...
> > >
> > > And then, these two duplicated PHI stmts have been kept until
> expand
> > pass.
> > >
> > > In dom2, a stmt like below
> > >
> > >   # pos_lsm.34_54 = PHI <pos_lsm.34_53(13), 0(16)>
> > >
> > > is transformed into
> > >
> > >   # pos_lsm.34_54 = PHI <prephitmp.29_89(13), 0(16)>
> > >
> > > just because they are equal.
> > >
> > > Unfortunately, this transformation changed back-end behavior to
> > generate
> > > redundant copy instructions and hurt performance. This case is from
> a
> > real
> > > benchmark and hurt performance a lot.
> > >
> > > Does the fix in r182140 intend to totally clean up this kind of
> > redundancy?
> > > Where should we get it fixed?
> > >
> >
> > Hi, sorry not to have responded sooner -- I just now got some time to
> > look at this.
> >
> > I compiled your code with -O3 for revisions 182139 and 182140 of GCC
> > trunk, and found no difference between the code produced by the
> middle
> > end for the two versions.  So something else has apparently come
> along
> > since then that helped produce the problematic code generation for
> you.
> > Either that or I need to use different compile flags; you didn't
> > specify
> > what you used.
> >
> > The fix in r182140 does just what you saw in dom2:  identifies
> > duplicate
> > PHIs in the same block and ensures only one of them is used.  This
> > actually avoids inserting extra blocks during expand in certain loop
> > cases.  I am not sure why you are getting redundant copies as a
> result,
> > but it sounds from your comments like IRA didn't coalesce a register
> > copy or something like that.  You may want to bisect revisions on the
> > trunk to see where your bad code generation started to occur to get a
> > better handle on what happened.
> >
> > As Richard said, the dom pass is likely to be removed someday,
> whenever
> > someone can get around to it.  My redundant-phi band-aid in dom would
> > go
> > away then as well, but presumably an extra pass of PRE would replace
> it,
> > and redundant PHIs would still be removed.
> >
> 
> Bill,
> 
> Thanks a lot for your explanation!
> 
> The bug could be exposed with -O2 if you apply the check-in r184435
> made by Richard.
> 
> BTW, at present is there anybody working on the NEW pass replacing dom?
> If no, in short term, it would be good to still get it fixed just
> because it is hurting benchmark. If Yes, may I know what that NEW pass
> will be focusing on?
> 

With dump enabled, I do see a PHI copy is identified by 
tree_ssa_dominator_optimize as below,

Optimizing block #12

LKUP STMT prephitmp.29_89 = PHI <pos.5_18, pos.7_24>
          prephitmp.29_89 = PHI <pos.5_18(10), pos.7_24(11)>
2>>> STMT prephitmp.29_89 = PHI <pos.5_18, pos.7_24>
          prephitmp.29_89 = PHI <pos.5_18(10), pos.7_24(11)>
LKUP STMT pos_lsm.34_53 = PHI <pos.5_18, pos.7_24>
          pos_lsm.34_53 = PHI <pos.5_18(10), pos.7_24(11)>
FIND: prephitmp.29_89
0>>> COPY pos_lsm.34_53 = prephitmp.29_89
<<<< STMT prephitmp.29_89 = PHI <pos.5_18, pos.7_24>
          prephitmp.29_89 = PHI <pos.5_18(10), pos.7_24(11)>
Optimizing statement if (prephitmp.29_89 == -1)
LKUP STMT prephitmp.29_89 eq_expr -1
          if (prephitmp.29_89 == -1)

So does it mean phicprop (eliminate_degenerate_phis) needs to be improved as 
well?

> Thanks,
> -Jiangning
> 
> > Thanks,
> > Bill
> >
> > > Thanks,
> > > -Jiangning
> > >
> > >
> > >
> >
> 
> 
> 
> 




Reply via email to