https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89154

            Bug ID: 89154
           Summary: 5% degradation of CPU2006 473.astar starting with
                    r266305
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pthaugen at gcc dot gnu.org
                CC: dje at gcc dot gnu.org, rguenth at gcc dot gnu.org,
                    segher at gcc dot gnu.org, wschmidt at gcc dot gnu.org
  Target Milestone: ---
              Host: powerpc64le-unknown-linux-gnu
            Target: powerpc64le-unknown-linux-gnu
             Build: powerpc64le-unknown-linux-gnu

I am not sure this is really a tree-optimization issue; I picked that as the
initial component only because the fix dealt with it. It could possibly be an
rtl-optimization/shrink-wrap issue brought about by additional register
pressure from CSE'ing/hoisting some additional code.

Function way2obj::releasepoint() degrades 20% starting with r266305. Looking at
perf output, the main difference seems to be that we're no longer
shrink-wrapping the early exit test at the start of the function.

Following is the annotated assembly of the start of the function.

r266304:
--------
0000000010006a40 <_ZN7way2obj12releasepointEii>: /* way2obj::releasepoint(int, int) total: 2032811 22.9279 */
               :    10006a40:   lis     r2,4098
               :    10006a44:   addi    r2,r2,32512
 95384  1.0758 :    10006a48:   lwz     r9,4424(r3)
               :    10006a4c:   ld      r8,8(r3)
119001  1.3422 :    10006a50:   lhz     r7,16(r3)
     1 1.1e-05 :    10006a54:   mullw   r9,r9,r5
               :    10006a58:   add     r9,r9,r4
               :    10006a5c:   extsw   r9,r9
169526  1.9121 :    10006a60:   rldicr  r9,r9,2,61
               :    10006a64:   lhzx    r10,r8,r9
 21865  0.2466 :    10006a68:   cmpw    r10,r7
               :    10006a6c:   beqlr



r266305:
--------
0000000010006a40 <_ZN7way2obj12releasepointEii>: /* way2obj::releasepoint(int, int) total: 2440798 26.2354 */
               :    10006a40:   lis     r2,4098
               :    10006a44:   addi    r2,r2,32512
 35498  0.3816 :    10006a48:   lwa     r6,4424(r3)
               :    10006a4c:   ld      r7,8(r3)
 26361  0.2833 :    10006a50:   std     r30,-16(r1)
               :    10006a54:   mr      r30,r3
157660  1.6946 :    10006a58:   mfcr    r12
162000  1.7413 :    10006a5c:   lhz     r3,16(r3)
    17 1.8e-04 :    10006a60:   std     r23,-72(r1)
   139  0.0015 :    10006a64:   mr      r23,r4
     2 2.1e-05 :    10006a68:   mullw   r9,r6,r5
    59 6.3e-04 :    10006a6c:   stw     r12,8(r1)
244832  2.6316 :    10006a70:   stdu    r1,-112(r1)
     4 4.3e-05 :    10006a74:   add     r9,r9,r4
     5 5.4e-05 :    10006a78:   extsw   r9,r9
   201  0.0022 :    10006a7c:   rldicr  r8,r9,2,61
   343  0.0037 :    10006a80:   add     r4,r7,r8
     9 9.7e-05 :    10006a84:   lhzx    r10,r7,r8
151595  1.6294 :    10006a88:   cmpw    r10,r3
               :    10006a8c:   beq     10006c64 <_ZN7way2obj12releasepointEii+0x224>

The target of the conditional branch in the slow version is just the epilogue
code to restore R1, R23, R30 and CR3/CR4 and return.
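
For readers without the benchmark source at hand, the shape of the early exit
test can be sketched roughly as follows. This is a hypothetical stand-in
inferred only from the assembly above, not the 473.astar code: the names Cell,
Grid, fill, current_fill, width, slow_path and release are all assumptions, as
is the 4-byte cell size (suggested by the shift-by-2 in the rldicr).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 4-byte cell; only the first halfword takes part in the test.
struct Cell {
    std::uint16_t fill;   // halfword loaded by lhzx in the listings
    std::uint16_t other;  // assumed padding to make the cell 4 bytes wide
};

// Hypothetical container standing in for way2obj; layout is an assumption.
struct Grid {
    std::vector<Cell> cells;      // base pointer (the ld from 8(r3))
    std::uint16_t current_fill;   // halfword loaded from 16(r3)
    int width;                    // word loaded from 4424(r3)
    int slow_path_calls;          // bookkeeping for this sketch only

    Grid(int w, int h, std::uint16_t fill)
        : cells(static_cast<std::size_t>(w) * static_cast<std::size_t>(h)),
          current_fill(fill), width(w), slow_path_calls(0) {}

    // Stand-in for the rest of the function, which is the part that needs
    // callee-saved registers and a stack frame.
    void slow_path(int index) {
        cells[index].fill = current_fill;
        ++slow_path_calls;
    }

    void release(int x, int y) {
        const int index = width * y + x;  // mullw / add / extsw
        assert(index >= 0 && static_cast<std::size_t>(index) < cells.size());
        // Early exit: with shrink-wrapping this returns before any prologue
        // code runs (the beqlr in the r266304 listing).
        if (cells[index].fill == current_fill)
            return;
        slow_path(index);
    }
};
```

In the r266304 listing the compare and return happen before any register is
saved; in the r266305 listing the saves of r30, r23 and CR and the stdu of r1
are executed first, even when the function returns immediately.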
