https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103802

--- Comment #6 from luoxhu at gcc dot gnu.org ---
(In reply to Richard Biener from comment #5)
> So the point is that P is invariant but we do not hoist it because it's
> computed in a (estimated) cold block?  I notice that the condition is
> invariant, too, so
> in principle we could hoist as
> 
>   if (d > 0.01)
>     P = ( W < E ) ? (W - E)/d : (E - W)/d;
>   for (i=0; i < 2; i++ )
>     if( d > 0.01 )
>       F[i] += P;


Yes. But this loop iterates only twice, so the bbs in the loop are colder than
the preheader.
-funswitch-loops should move the condition out of the loop, but that also
requires increasing the loop iteration count:

"/home/luoxhu/workspace/gcc-master/gcc/testsuite/gcc.dg/tree-ssa/recip-3.c:16:14:
note: Not unswitching, loop is not expected to iterate"
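
For reference, a hand-written sketch of what unswitching would do to a loop of
this shape.  The function names and parameters are made up for illustration;
they are not from recip-3.c, and this is not compiler output:

```c
#include <stddef.h>

/* Original shape: the invariant condition is tested on every iteration. */
void loop_original(double *F, size_t n, double d, double W, double E)
{
    for (size_t i = 0; i < n; i++)
        if (d > 0.01) {
            double P = (W < E) ? (W - E) / d : (E - W) / d;
            F[i] += P;
        }
}

/* Unswitched by hand: the invariant condition is tested once, outside the
   loop.  When it is false the loop body is empty, so no loop is needed. */
void loop_unswitched(double *F, size_t n, double d, double W, double E)
{
    if (d > 0.01) {
        for (size_t i = 0; i < n; i++) {
            double P = (W < E) ? (W - E) / d : (E - W) / d;
            F[i] += P;
        }
    }
}
```

Both functions should leave F in the same state for any input; the second one
just makes the invariant condition visible outside the loop.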

> 
> alternatively one might argue that invariant expressions (unconditionally
> computed or in a special way under invariant conditions) should be costed
> differently.
> 
> I think best would be to restore the original intent of the testcase which
> was added with the fix for PRs 23109, 23948 and 24123.  I suppose there
> we saw the invariant hoisted(?) and the loop unrolled so I would suggest
> to either apply the hoisting or the unrolling manually to the testcase.
> (just look at the PRs whether you get a better idea of the origin of the
> testcase).

To restore the original intent of the testcase, increasing the loop count is
better than "either apply the hoisting or the unrolling".  Changing it from "2"
to at least "5" turns the cold bb into a hot bb, so the two divides can be
hoisted out by the LIM pass again (I verified that the change below passes on
both power-m32 and x86-i686):

(This is much more reasonable than the other two directions, as the loop
iteration count is not key to the test code.)
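
As a rough illustration of what the test presumably wants to see after LIM and
the recip pass, here is a hand-written equivalent.  The function names and
parameters are hypothetical, and replacing the divisions with multiplications
by a reciprocal is only valid under relaxed FP rules such as
-funsafe-math-optimizations:

```c
#include <stddef.h>

/* Invariant divisions hoisted out of the loop by hand: three divisions by d
   remain in the source, but none of them is inside the loop. */
void accumulate_divides(double *F, size_t n, double d, double W, double E)
{
    if (d > 0.01) {
        double P = (W < E) ? (W - E) / d : (E - W) / d;
        for (size_t i = 0; i < n; i++)
            F[i] += P;
    }
    F[0] += E / d;
}

/* Same computation with a single division: 1/d is computed once and every
   former division by d becomes a multiplication by the reciprocal. */
void accumulate_reciprocal(double *F, size_t n, double d, double W, double E)
{
    double rd = 1.0 / d;
    if (d > 0.01) {
        double P = (W < E) ? (W - E) * rd : (E - W) * rd;
        for (size_t i = 0; i < n; i++)
            F[i] += P;
    }
    F[0] += E * rd;
}
```

The second form contains exactly one division, which matches the count of 1
expected by the scan-tree-dump-times directive in the change below.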

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/recip-3.c b/gcc/testsuite/gcc.dg/tree-ssa/recip-3.c
index 641c91e..a1d2d87 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/recip-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/recip-3.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O1 -fno-trapping-math -funsafe-math-optimizations -fdump-tree-recip" } */

-double F[2] = { 0.0, 0.0 }, e;
+double F[5] = { 0.0, 0.0 }, e;

 /* In this case the optimization is interesting.  */
 float h ()
@@ -13,7 +13,7 @@ float h ()
        d = 2.*e;
        E = 1. - d;

-       for( i=0; i < 2; i++ )
+       for( i=0; i < 5; i++ )
                if( d > 0.01 )
                {
                        P = ( W < E ) ? (W - E)/d : (E - W)/d;
@@ -23,4 +23,4 @@ float h ()
        F[0] += E / d;
 }

-/* { dg-final { scan-tree-dump-times " / " 5 "recip" } } */
+/* { dg-final { scan-tree-dump-times " / " 1 "recip" } } */
