https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89584

            Bug ID: 89584
           Summary: CPU2000 degradations with r268448 (172.mgrid -22%,
                    252.eon -8%)
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: ipa
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pthaugen at linux dot ibm.com
                CC: dje at gcc dot gnu.org, hubicka at gcc dot gnu.org,
                    marxin at gcc dot gnu.org, rguenth at gcc dot gnu.org,
                    segher at gcc dot gnu.org, wschmidt at gcc dot gnu.org
  Target Milestone: ---
              Host: powerpc64-unknown-linux-gnu
            Target: powerpc64-unknown-linux-gnu
             Build: powerpc64-unknown-linux-gnu

Revision 268448 introduced the noted degradations. Compile flags are -m64 -O3
-mcpu=power7 -fpeel-loops -funroll-loops -ffast-math -mpopcntd -mrecip=all.

I dug further into the mgrid degradation to get more detail. The main
difference appears to be that the last call to RESID() in the main function is
now inlined. RESID() is actually cloned, and this call is to the clone,
resid_.constprop.0. I'm not sure whether this is another instance of losing
RESTRICT on the parameters, as seen in prior PRs (54497/55334 and 84737), or
simply that inlining that specific call into an inner loop now creates too much
register pressure and we spill too much (I suspect the latter). Following is a
simple static instruction count comparison of the vectorized loop from
resid_.constprop.0() and the same loop after inlining. Note the obvious
increase in load/store insns.

Old = constprop.s
New = constprop_inline.s
INSTR            Old      New     Change
-----------    -----    -----     ------
addi               1       29         28
bc                 1        1          0
cmpl               1        1          0
ld                 0       17         17
lxvd2x            19       33         14
ori                0        5          5
stxvd2x            1       15         14
xvadddp           17       17          0
xvnmsubadp         1        1          0
xvnmsubmdp         3        3          0
xxlor              3        2         -1
-----------    -----    -----     ------
load              19       50         31
store              1       15         14
total             47      124         77
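
For anyone wanting to reproduce this kind of static count, here is a minimal
Python sketch. It is not necessarily how the numbers above were produced. It
assumes the loop body has already been cut out of each .s file by hand and is
passed in as text, that '#' starts a comment, and that lines beginning with
'.' are assembler directives. The helper names count_mnemonics and compare are
made up for illustration.

```python
from collections import Counter


def count_mnemonics(asm_text):
    """Count instruction mnemonics in a snippet of assembler text."""
    counts = Counter()
    for raw in asm_text.splitlines():
        # Assumption: '#' introduces a comment in this assembler dialect.
        line = raw.split("#", 1)[0].strip()
        # Drop any leading labels such as ".L5:".
        while line:
            head = line.split(None, 1)[0]
            if not head.endswith(":"):
                break
            line = line[len(head):].strip()
        # Skip blank lines and directives such as ".align 4".
        if not line or line.startswith("."):
            continue
        counts[line.split(None, 1)[0]] += 1
    return counts


def compare(old_text, new_text):
    """Return (mnemonic, old, new, change) rows sorted by mnemonic."""
    old = count_mnemonics(old_text)
    new = count_mnemonics(new_text)
    return [(m, old[m], new[m], new[m] - old[m])
            for m in sorted(set(old) | set(new))]


def format_rows(rows):
    """Render the rows as a fixed-width table."""
    out = ["%-12s %6s %6s %8s" % ("INSTR", "Old", "New", "Change")]
    for m, o, n, d in rows:
        out.append("%-12s %6d %6d %8d" % (m, o, n, d))
    return "\n".join(out)
```

To use it, paste the loop from constprop.s and from constprop_inline.s into two
strings and print format_rows(compare(old_loop, new_loop)). The load and store
summary rows would still have to be added up by hand from the ld/lxvd2x and
stxvd2x rows.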
