https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98196
            Bug ID: 98196
           Summary: [11 Regression] aarch64: Wrong code at -O3 -march=armv8.2-a+sve -msve-vector-bits=256
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

AArch64 GCC miscompiles the following testcase:

static unsigned long long a;
static long b;
static signed char c[17][11];
static signed char (*d)[11] = c;

static void e(unsigned long long *f, int h) { *f ^= h; }

int main() {
  for (long g = 0; g < 17; ++g)
    for (long i = 0; i < 1; ++i)
      c[g][i] = 2;
  for (int g = 0; g < 16; g += 4)
    for (short i = 0; i < 10; i += 10)
      b = d[g][i];
  e(&a, b);
  if (a != 2)
    __builtin_abort();
}

at -O3 -march=armv8.2-a+sve -msve-vector-bits=256 -fvect-cost-model=unlimited
since r11-3705-gdae673abd37d400408959497e50fe1f3fbef5533.

The broken code is below. It seems that we load a without first storing to it,
so we just load 0 (instead of 2) and make the call to abort.

main:
        adrp    x2, .LANCHOR0
        add     x0, x2, :lo12:.LANCHOR0
        mov     w1, 2
        strb    w1, [x2, #:lo12:.LANCHOR0] // c[0][0] = 2
        ldr     x2, [x0, 200]              // x2 <- a (= 0)
        strb    w1, [x0, 11]
        strb    w1, [x0, 22]
        strb    w1, [x0, 33]
        strb    w1, [x0, 44]
        strb    w1, [x0, 55]
        strb    w1, [x0, 66]
        strb    w1, [x0, 77]
        strb    w1, [x0, 88]
        strb    w1, [x0, 99]
        strb    w1, [x0, 110]
        strb    w1, [x0, 121]
        strb    w1, [x0, 132]
        strb    w1, [x0, 143]
        strb    w1, [x0, 154]
        strb    w1, [x0, 165]
        strb    w1, [x0, 176]
        str     xzr, [x0, 192]             // b = 0
        cmp     x2, 2
        bne     .L7
        mov     w0, 0
        ret
.L7:
        stp     x29, x30, [sp, -16]!
        mov     x29, sp
        bl      abort
        .size   main, .-main
        .bss
        .align  4
        .set    .LANCHOR0,. + 0
        .type   c, %object
        .size   c, 187
c:
        .zero   187
        .zero   5
        .type   b, %object
        .size   b, 8
b:
        .zero   8
        .type   a, %object
        .size   a, 8
a:
        .zero   8
        .ident  "GCC: (unknown) 11.0.0 20201207 (experimental)"
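
For reference, here is a minimal sketch of the values the testcase should produce when compiled correctly. It is a hand-simplified model, not part of the original report: the ref_* names are hypothetical, the single-iteration inner loops are collapsed to i == 0, and the pointer d is replaced by direct indexing of the array.

```c
#include <assert.h>

/* Hypothetical stand-in for the testcase's array c (name not from the report). */
static signed char ref_c[17][11];

/* Returns the value 'a' is expected to hold when the abort check runs. */
unsigned long long ref_expected_a(void)
{
    unsigned long long a = 0;
    long b = 0;

    /* First loop nest: the inner loop runs once, so only column 0
       of each of the 17 rows is set to 2. */
    for (long g = 0; g < 17; ++g)
        ref_c[g][0] = 2;

    /* Second loop nest: the inner loop also runs once (i == 0), so b is
       overwritten with c[0][0], c[4][0], c[8][0] and c[12][0] in turn. */
    for (int g = 0; g < 16; g += 4)
        b = ref_c[g][0];

    /* e(&a, b): b is converted to int for the parameter h, then XORed in. */
    a ^= (int)b;
    return a;
}

/* Hypothetical helper: asserts the value the original testcase checks for. */
void ref_self_check(void)
{
    assert(ref_expected_a() == 2);
}
```

Under this model b ends up as 2 and a as 0 ^ 2 = 2, so the abort branch should not be taken. The generated code above instead reads a before any store to it and writes zero to b.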