https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95760
Bug ID: 95760
Summary: ivopts with loop variables
Product: gcc
Version: tree-ssa
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: hailey.chiu at sifive dot com
Target Milestone: ---

C Source:

int **matrix;
int n;

void foo() {
  static int row;
  static int col;
  static int sum = 0;
  for( row = 0 ; row < n ; row++ ) {
    for( col = 0 ; col < n ; col++ ) {
      sum += matrix[row][col];
    }
  }
}

Compiling:

$./RISCV-GCC-10.1/bin/riscv64-unknown-elf-gcc foo.c -Os -S -o foo.s -march=rv32imac -mabi=ilp32

Asm:

foo:
        ...skip load/store variables...
.L5:
        slli a7,a1,2       #row*4
        add a7,t3,a7       #matrix + (row*4)
        li a0,0
.L3:
        bgt a5,a0,.L4
        addi a1,a1,1       #row++
        mv a0,a5
        li a7,1
        j .L2
.L4:
        lw t1,0(a7)
        slli t4,a0,2       #col*4
        addi a0,a0,1       #col++
        add t1,t1,t4       #*matrix + (col*4)
        lw t1,0(t1)
        add a6,a6,t1
        li t1,1
        j .L3

The matrix offset is recomputed from "row" on every iteration (the
slli/add pair at .L5) instead of being increased by 4 after each
iteration.

I also checked with RISCV-GCC-8.3, which emits code like
"add a7, a7, 4" after each iteration. GCC-10.1 takes two instructions to
do this, while GCC-8.3 takes one. I think this might affect code size /
performance slightly.

I am also wondering whether "col" could be optimized into the same
add-by-4 operation, because GCC-8.3 does not optimize it either.

Thanks.
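
For illustration only, the induction-variable rewrite being asked about can be sketched at the source level. This is a hand-written approximation, not what ivopts produces internally. The function names sum_indexed and sum_strength_reduced are hypothetical, and the sketch uses local variables and a return value instead of the static row/col/sum from the report, so it shows the addressing pattern only.

```c
#include <stddef.h>

int **matrix;
int n;

/* Indexed form, as in the report: every access scales an index,
   i.e. matrix + row*sizeof(int *) and matrix[row] + col*sizeof(int)
   (both sizes are 4 bytes under -mabi=ilp32). */
int sum_indexed(void) {
    int sum = 0;
    for (int row = 0; row < n; row++) {
        for (int col = 0; col < n; col++) {
            sum += matrix[row][col];
        }
    }
    return sum;
}

/* Hand strength-reduced form (illustrative assumption): the scaled
   index computations are replaced by pointers that advance by one
   element per iteration, which corresponds to the "add by 4" pattern
   described in the report. */
int sum_strength_reduced(void) {
    int sum = 0;
    int **row_ptr = matrix;              /* steps through the row pointers */
    for (int row = 0; row < n; row++, row_ptr++) {
        const int *elem = *row_ptr;      /* steps through one row */
        for (int col = 0; col < n; col++, elem++) {
            sum += *elem;
        }
    }
    return sum;
}
```

Both functions visit the same elements in the same order. The second one just makes the per-iteration pointer increments explicit, which is the shape the question about "row" and "col" refers to.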