https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68106

--- Comment #1 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Zdenek Sojka from comment #0)
> Created attachment 36594 [details]
> reduced testcase
> 
> The testcase fails on aarch64 on both trunk and 5-branch with -O
> -flra-remat. I haven't managed to generate wrong code with -O2, which
> enables -flra-remat. I am using qemu userspace emulation to run the testcase.
> 
> $ gcc -O -flra-remat testcase.c
> $ ./a.out 
> qemu: uncaught target signal 6 (Aborted) - core dumped
> Aborted
> 
> $ gcc -O -flra-remat testcase.c -S
> $ gcc -O -fno-lra-remat testcase.c -S -o testcase-no-lra-remat.s
> $ diff -u testcase.s testcase-no-lra-remat.s
> ...
> @@ -104,20 +104,21 @@
>  .L25:
>         lsl     x20, x23, 56
>         add     x19, x20, x26
> +       str     x19, [x29, 128]
>         asr     x22, x19, 63
>         adds    x27, x21, x19
>         adc     x0, x24, x22
> -       str     x0, [x29, 120]
> -       add     x2, x29, 140
> +       str     x0, [x29, 136]
> +       add     x2, x29, 156
>         mov     x1, x19
>         mov     x0, x25
>         bl      upseu
>         cmp     x0, x27
>         bne     .L14
> -       adc     x0, x24, x22
> +       ldr     x0, [x29, 136]
>         cmp     x0, xzr
>         cset    w1, ne
> -       ldr     w0, [x29, 140]
> +       ldr     w0, [x29, 156]
>         cmp     w1, w0
>         bne     .L14
>         subs    x1, x21, x20
> ...
> 
> If I am reading the assembly correctly, the important difference is:
> ...
>         adds    x27, x21, x19
>         adc     x0, x24, x22
> -       str     x0, [x29, 120]
> -       add     x2, x29, 140
> +       str     x0, [x29, 136]
> +       add     x2, x29, 156
> ...
> -       adc     x0, x24, x22
> +       ldr     x0, [x29, 136]
> ...
> 
> Normally, without lra-remat, the result of "adc     x0, x24, x22" is stored
> to the stack and then reloaded.
> With -flra-remat, the value is stored as well, but later "adc" is used to
> recompute the value instead of reloading it. That saves one stack access, but
> cpsr has changed in the meantime (at least "cmp x0, x27" sets the flags), so
> the second "adc" uses the wrong value of the C bit.
> 
> Tested revisions:
> r229293 - FAIL
> 5-branch r229305 - FAIL
> 4_9-branch - doesn't know -flra-remat

I was not able to reproduce it on the current trunk, but I have reproduced it
on r229293.  I am working on it and plan to submit a patch for the trunk
today.
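
For illustration only, the failure mode described in comment #0 can be
modelled in a few lines of C.  This is a sketch under stated assumptions, not
GCC or LRA code: flags_t, adds, adc, cmp and remat_differs are hypothetical
names, and the model tracks only the carry bit.  It shows that re-executing
an add-with-carry after an intervening compare can produce a different value
from the one computed originally.

```c
#include <stdint.h>

/* Hypothetical model of the carry (C) flag only; not GCC code. */
typedef struct { int c; } flags_t;

/* Models "adds": add and set C on unsigned overflow. */
static uint64_t adds(flags_t *f, uint64_t a, uint64_t b)
{
    uint64_t r = a + b;
    f->c = r < a;
    return r;
}

/* Models "adc": add with carry-in; the flags are read, not written. */
static uint64_t adc(const flags_t *f, uint64_t a, uint64_t b)
{
    return a + b + (uint64_t)f->c;
}

/* Models "cmp a, b": on AArch64, C is set when there is no borrow (a >= b). */
static void cmp(flags_t *f, uint64_t a, uint64_t b)
{
    f->c = a >= b;
}

/* Returns 1 if recomputing the adc after a compare yields a value that
   differs from the one computed (and saved) before the compare. */
static int remat_differs(uint64_t lo_a, uint64_t lo_b,
                         uint64_t hi_a, uint64_t hi_b,
                         uint64_t cmp_a, uint64_t cmp_b)
{
    flags_t f = { 0 };
    (void)adds(&f, lo_a, lo_b);             /* low half sets C            */
    uint64_t saved = adc(&f, hi_a, hi_b);   /* value that gets spilled    */
    cmp(&f, cmp_a, cmp_b);                  /* intervening flag clobber   */
    uint64_t remat = adc(&f, hi_a, hi_b);   /* recomputed instead of load */
    return saved != remat;
}
```

In this model the recomputation is only safe when the compare happens to
leave C in the same state as the original "adds", which is why reloading the
spilled value is the correct choice here.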
