https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113827

            Bug ID: 113827
           Summary: MrBayes benchmark redundant load
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rdapp at gcc dot gnu.org
                CC: juzhe.zhong at rivai dot ai, law at gcc dot gnu.org,
                    pan2.li at intel dot com
            Blocks: 79704
  Target Milestone: ---
            Target: riscv

A hot block in the MrBayes benchmark (as used in the Phoronix Test Suite) has a
redundant scalar load when vectorized.

Minimal example, compiled with -march=rv64gcv -O3:

void foo (float **a, float f, int n)
{
  for (int i = 0; i < n; i++)
    {
      a[i][0] /= f;
      a[i][1] /= f;
      a[i][2] /= f;
      a[i][3] /= f;
      a[i] += 4;
    }
}

GCC:
.L3:
        ld      a5,0(a0)
        vle32.v v1,0(a5)
        vfmul.vv        v1,v1,v2
        vse32.v v1,0(a5)
        addi    a5,a5,16
        sd      a5,0(a0)
        addi    a0,a0,8
        bne     a0,a4,.L3

The value of a5 doesn't change after the store to 0(a0).

LLVM:
.L3:
        vle32.v   v8,(a1)
        addi      a3,a1,16
        sd        a3,0(a2)
        vfdiv.vf  v8,v8,fa5
        addi      a2,a2,8
        vse32.v   v8,(a1)
        bne       a2,a0,.L3
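
As a source-level illustration of the desired result, here is a minimal sketch
that reads a[i] once per iteration into a local and stores the advanced pointer
back without re-reading it. It is not taken from the report: the name
foo_single_load is hypothetical, and the sketch assumes the float stores cannot
modify a[i], which should hold under C's type-based aliasing rules because a[i]
has type float *.

```c
/* Hypothetical hand-written variant of foo: a[i] is read once per
   iteration and kept in a local pointer.  */
void foo_single_load (float **a, float f, int n)
{
  for (int i = 0; i < n; i++)
    {
      float *p = a[i];   /* the only load of a[i] in this iteration */
      p[0] /= f;
      p[1] /= f;
      p[2] /= f;
      p[3] /= f;
      a[i] = p + 4;      /* store the advanced pointer; no reload of a[i] */
    }
}
```

Under that aliasing assumption this computes the same result as the minimal
example, so a compiler is free to generate the same loop for both.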


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79704
[Bug 79704] [meta-bug] Phoronix Test Suite compiler performance issues
