https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95760

            Bug ID: 95760
           Summary: ivopts with loop variables
           Product: gcc
           Version: tree-ssa
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hailey.chiu at sifive dot com
  Target Milestone: ---

C Source:

int **matrix;
int n;
void foo()
{
    static int row;
    static int col;
    static int sum = 0;

    for( row = 0 ; row < n ; row++ )
    {
        for( col = 0 ; col < n ; col++ )
        {
            sum += matrix[row][col];
        }
    }
}

Compiling:
$ ./RISCV-GCC-10.1/bin/riscv64-unknown-elf-gcc foo.c -Os -S -o foo.s \
    -march=rv32imac -mabi=ilp32

Asm:

foo:
...skip load/store variables...
.L5:
    slli    a7,a1,2 #row*4 
    add a7,t3,a7    #matrix + (row*4) 
    li  a0,0
.L3:
    bgt a5,a0,.L4
    addi    a1,a1,1 #row++
    mv  a0,a5
    li  a7,1
    j   .L2
.L4:
    lw  t1,0(a7)
    slli    t4,a0,2 #col*4
    addi    a0,a0,1 #col++
    add t1,t1,t4    #matrix[row] + (col*4)
    lw  t1,0(t1)
    add a6,a6,t1
    li  t1,1
    j   .L3

The matrix offset is recomputed from the row index (a shift and an add) instead
of being increased by 4 after each iteration. I also checked RISCV-GCC-8.3, and
it emits code like "add a7, a7, 4" after each iteration. GCC-10.1 takes two
instructions to do this, while GCC-8.3 takes one. I think this might slightly
affect code size and performance.

I am also wondering whether the "col" offset can be optimized into the same
add-by-4 operation, because gcc-8.3 doesn't optimize it either.
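
To illustrate the form being asked about, here is a hand-written C sketch in
which both the row address and the element address advance by one element per
iteration (4 bytes on ilp32) instead of being recomputed from the index. It is
not compiler output. The name foo_strength_reduced is hypothetical, and for
simplicity it uses local variables and returns the sum instead of using the
static row, col and sum from the test case.

```c
#include <stddef.h>

int **matrix;
int n;

/* Hypothetical hand-written variant of foo(): the index computations
   matrix + row*4 and matrix[row] + col*4 are replaced by pointers that
   are incremented once per iteration. */
int foo_strength_reduced(void)
{
    int sum = 0;

    /* The original loops do not execute when n <= 0. */
    if (n <= 0)
        return 0;

    int **rowp = matrix;         /* plays the role of &matrix[row] */
    int **row_end = matrix + n;

    for (; rowp < row_end; rowp++) {
        int *colp = *rowp;       /* plays the role of &matrix[row][col] */
        int *col_end = colp + n;

        for (; colp < col_end; colp++)
            sum += *colp;
    }
    return sum;
}
```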

Thanks.
