https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68106
            Bug ID: 68106
           Summary: c-c++-common/torture/builtin-arith-overflow-11.c FAILs with
                    -flra-remat @ aarch64
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zsojka at seznam dot cz
  Target Milestone: ---

Created attachment 36594
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36594&action=edit
reduced testcase

The testcase fails on aarch64 on both trunk and 5-branch with -O -flra-remat.
I haven't managed to generate wrong code with -O2, which enables -flra-remat.
I am using qemu userspace emulation to run the testcase.

$ gcc -O -flra-remat testcase.c
$ ./a.out
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted

$ gcc -O -flra-remat testcase.c -S
$ gcc -O -fno-lra-remat testcase.c -S -o testcase-no-lra-remat.s
$ diff -u testcase.s testcase-no-lra-remat.s
...
@@ -104,20 +104,21 @@
 .L25:
        lsl     x20, x23, 56
        add     x19, x20, x26
+       str     x19, [x29, 128]
        asr     x22, x19, 63
        adds    x27, x21, x19
        adc     x0, x24, x22
-       str     x0, [x29, 120]
-       add     x2, x29, 140
+       str     x0, [x29, 136]
+       add     x2, x29, 156
        mov     x1, x19
        mov     x0, x25
        bl      upseu
        cmp     x0, x27
        bne     .L14
-       adc     x0, x24, x22
+       ldr     x0, [x29, 136]
        cmp     x0, xzr
        cset    w1, ne
-       ldr     w0, [x29, 140]
+       ldr     w0, [x29, 156]
        cmp     w1, w0
        bne     .L14
        subs    x1, x21, x20
...

If I am reading the assembly correctly, the important difference is:
...
        adds    x27, x21, x19
        adc     x0, x24, x22
-       str     x0, [x29, 120]
-       add     x2, x29, 140
+       str     x0, [x29, 136]
+       add     x2, x29, 156
...
-       adc     x0, x24, x22
+       ldr     x0, [x29, 136]
...

Normally, without lra-remat, the result of "adc x0, x24, x22" is stored to the
stack and then reloaded. With -flra-remat, the value is stored as well, but
later "adc" is used to recompute the value instead of reloading it. That saves
one access to the stack, but cpsr has changed in the meantime, so the second
"adc" uses the wrong value of the C bit.

Tested revisions:
r229293 - FAIL
5-branch r229305 - FAIL
4_9-branch - doesn't know -flra-remat