[Bug rtl-optimization/68106] c-c++-common/torture/builtin-arith-overflow-11.c FAILs with -flra-remat @ aarch64

2016-01-23 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68106

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |5.4

--- Comment #6 from Andrew Pinski  ---
Fixed.

[Bug rtl-optimization/68106] c-c++-common/torture/builtin-arith-overflow-11.c FAILs with -flra-remat @ aarch64

2016-01-13 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68106

Bernd Schmidt  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bernds at gcc dot gnu.org

--- Comment #5 from Bernd Schmidt  ---
Can this be closed?

[Bug rtl-optimization/68106] c-c++-common/torture/builtin-arith-overflow-11.c FAILs with -flra-remat @ aarch64

2015-11-06 Thread vmakarov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68106

--- Comment #4 from Vladimir Makarov  ---
Author: vmakarov
Date: Fri Nov  6 17:33:01 2015
New Revision: 229868

URL: https://gcc.gnu.org/viewcvs?rev=229868&root=gcc&view=rev
Log:
2015-11-06  Vladimir Makarov  

PR rtl-optimization/68106
* lra-remat.c (input_regno_present_p): Process hard regs
explicitly present in machine description insns.
(call_used_input_regno_present_p): Ditto.
(calculate_gen_cands): Ditto.
(do_remat): Ditto.

2015-11-06  Vladimir Makarov  

PR rtl-optimization/68106
* testsuite/gcc.target/aarch64/pr68106.c: New.


Added:
branches/gcc-5-branch/gcc/testsuite/gcc.target/aarch64/pr68106.c
Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/lra-remat.c
branches/gcc-5-branch/gcc/testsuite/ChangeLog

[Bug rtl-optimization/68106] c-c++-common/torture/builtin-arith-overflow-11.c FAILs with -flra-remat @ aarch64

2015-10-30 Thread vmakarov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68106

--- Comment #3 from Vladimir Makarov  ---
The problem was that the LRA rematerialization subpass ignored hard registers
that are explicitly present in machine description insns.

I'll wait a few days before backporting this to the gcc-5-branch.
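
To illustrate the kind of check the fix adds, here is a minimal sketch in Python. It is a toy model, not GCC's actual data structures or the code in lra-remat.c: the `Insn` class, the `can_rematerialize` helper, and the register name "cc" are all hypothetical. The assumption is that an insn such as "adc" reads the condition-code hard register directly from its machine description pattern rather than through an operand, so a check that only looks at operand registers misses it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    """Hypothetical, simplified insn: just the registers it reads and writes."""
    name: str
    operand_inputs: frozenset = frozenset()   # registers read through operands
    implicit_inputs: frozenset = frozenset()  # hard regs named directly in the pattern
    outputs: frozenset = frozenset()          # registers written or clobbered


def can_rematerialize(cand, between, include_implicit=True):
    """Return True if no input of `cand` is overwritten by the insns in `between`.

    With include_implicit=False the check only sees operand registers,
    which models the behaviour described as the bug.
    """
    inputs = set(cand.operand_inputs)
    if include_implicit:
        inputs |= cand.implicit_inputs
    return all(inputs.isdisjoint(insn.outputs) for insn in between)


# Candidate: adc x0, x24, x22 (also reads the carry flag, modelled as "cc").
ADC = Insn("adc", frozenset({"x24", "x22"}), frozenset({"cc"}), frozenset({"x0"}))

# Insns between the original adc and the point where its value is needed again.
BETWEEN = [
    Insn("bl upseu", outputs=frozenset({"x0", "cc"})),
    Insn("cmp x0, x27", frozenset({"x0", "x27"}), outputs=frozenset({"cc"})),
]

print(can_rematerialize(ADC, BETWEEN, include_implicit=False))  # operand-only check
print(can_rematerialize(ADC, BETWEEN, include_implicit=True))   # also checks "cc"
```

In this model the operand-only check accepts the candidate because x24 and x22 are untouched, while the check that also considers "cc" rejects it.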


[Bug rtl-optimization/68106] c-c++-common/torture/builtin-arith-overflow-11.c FAILs with -flra-remat @ aarch64

2015-10-30 Thread vmakarov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68106

--- Comment #2 from Vladimir Makarov  ---
Author: vmakarov
Date: Fri Oct 30 17:45:16 2015
New Revision: 229593

URL: https://gcc.gnu.org/viewcvs?rev=229593&root=gcc&view=rev
Log:
2015-10-30  Vladimir Makarov  

PR rtl-optimization/68106
* lra-remat.c (input_regno_present_p): Process hard regs
explicitly present in machine description insns.
(call_used_input_regno_present_p): Ditto.
(calculate_gen_cands): Ditto.
(do_remat): Ditto.

2015-10-30  Vladimir Makarov  

PR rtl-optimization/68106
* gcc.target/aarch64/pr68106.c: New.


Added:
trunk/gcc/testsuite/gcc.target/aarch64/pr68106.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/lra-remat.c
trunk/gcc/testsuite/ChangeLog


[Bug rtl-optimization/68106] c-c++-common/torture/builtin-arith-overflow-11.c FAILs with -flra-remat @ aarch64

2015-10-30 Thread vmakarov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68106

--- Comment #1 from Vladimir Makarov  ---
(In reply to Zdenek Sojka from comment #0)
> Created attachment 36594 [details]
> reduced testcase
> 
> The testcase fails on aarch64 on both trunk and the 5-branch with -O
> -flra-remat. I haven't managed to generate wrong code with -O2, which
> enables -flra-remat. I am using qemu userspace emulation to run the testcase.
> 
> $ gcc -O -flra-remat testcase.c
> $ ./a.out 
> qemu: uncaught target signal 6 (Aborted) - core dumped
> Aborted
> 
> $ gcc -O -flra-remat testcase.c -S
> $ gcc -O -fno-lra-remat testcase.c -S -o testcase-no-lra-remat.s
> $ diff -u testcase.s testcase-no-lra-remat.s
> ...
> @@ -104,20 +104,21 @@
>  .L25:
>      lsl x20, x23, 56
>      add x19, x20, x26
> +    str x19, [x29, 128]
>      asr x22, x19, 63
>      adds x27, x21, x19
>      adc x0, x24, x22
> -    str x0, [x29, 120]
> -    add x2, x29, 140
> +    str x0, [x29, 136]
> +    add x2, x29, 156
>      mov x1, x19
>      mov x0, x25
>      bl  upseu
>      cmp x0, x27
>      bne .L14
> -    adc x0, x24, x22
> +    ldr x0, [x29, 136]
>      cmp x0, xzr
>      cset w1, ne
> -    ldr w0, [x29, 140]
> +    ldr w0, [x29, 156]
>      cmp w1, w0
>      bne .L14
>      subs x1, x21, x20
> ...
> 
> If I am reading the assembly correctly, the important difference is:
> ...
>      adds x27, x21, x19
>      adc x0, x24, x22
> -    str x0, [x29, 120]
> -    add x2, x29, 140
> +    str x0, [x29, 136]
> +    add x2, x29, 156
> ...
> -    adc x0, x24, x22
> +    ldr x0, [x29, 136]
> ...
> 
> Normally, without -flra-remat, the result of "adc x0, x24, x22" is stored
> to the stack and then reloaded.
> With -flra-remat, the value is stored as well, but later "adc" is used to
> recompute the value. That saves one access to the stack, but cpsr has
> changed in the meantime, so the recomputation uses the wrong value of the C bit.
> 
> Tested revisions:
> r229293 - FAIL
> 5-branch r229305 - FAIL
> 4_9-branch - doesn't know -flra-remat

I was not able to reproduce it on the current trunk.  But I've reproduced it on
r229293.  I've been working on it and I am planning to submit a patch for the
trunk today.
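
The failure mode described in the quoted report can be modelled with a small Python sketch. This is a toy simulation of the carry flag only, not the testcase from the attachment: the `Flags` class, the helper names, and the input values are assumptions chosen for illustration. It shows why re-executing "adc" after an intervening "cmp" can give a different result than reloading the value that was stored earlier.

```python
MASK64 = (1 << 64) - 1


class Flags:
    """Toy model holding only the carry (C) flag."""
    def __init__(self):
        self.c = 0


def adds(flags, a, b):
    """64-bit add that sets C on unsigned overflow."""
    total = a + b
    flags.c = 1 if total > MASK64 else 0
    return total & MASK64


def adc(flags, a, b):
    """64-bit add with carry-in; leaves the flags unchanged."""
    return (a + b + flags.c) & MASK64


def cmp(flags, a, b):
    """Compare as a subtraction; C is set when there is no borrow (a >= b)."""
    flags.c = 1 if a >= b else 0


def demo():
    flags = Flags()
    lo = adds(flags, MASK64, 1)   # low half wraps to 0 and sets C
    hi_saved = adc(flags, 0, 0)   # high half computed with the correct carry
    cmp(flags, lo, 1)             # intervening compare overwrites C
    hi_remat = adc(flags, 0, 0)   # "rematerialized" adc sees the stale flag
    return hi_saved, hi_remat


print(demo())
```

The two values returned by `demo()` differ, which corresponds to the choice between reloading the stored result (ldr) and recomputing it (adc) in the diff above.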