https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683
Bug ID: 82683
Summary: GCC generates bad code with -tune=thunderx2t99
Product: gcc
Version: unknown
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: sje at gcc dot gnu.org
Target Milestone: ---
Created attachment 42448
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42448&action=edit
Test case
I am compiling the GCC benchmark from SPEC 2017 on aarch64. If I compile it with
-mtune=thunderxt88 it works, and if I compile with -mtune=thunderx2t99 it fails.
The tune option should only affect the speed of a program on different
microarchitectures; it should never result in bad code.
I have attached a cut-down test case (compilable but not runnable) to show the
problem. In the good case you should see two sxtw sign-extend instructions:
        sxtw    x20, w0
        cbz     x1, .L2
        ldr     w0, [x1, x20, lsl 2]
        sxtw    x20, w0 // 21
.L2:
In the bad case we only get one:
        sxtw    x20, w0
        cbz     x1, .L2
        ldr     w0, [x1, x20, lsl 2]
.L2:
If I insert the missing sxtw by hand everything works fine for me. The sxtw
seems to go missing during combine but I do not know why. Notice that in
addition to not doing the sxtw, we leave the loaded value in w0 and do not
put it in x20 like the good code does.
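To make the effect of the missing instruction concrete, here is a minimal C sketch of the pattern in the two listings. It is not the attached test case: the function names and the table layout are hypothetical, and the second function assumes that later code ends up reading the full 64-bit register after the 32-bit load (on AArch64, a write to w0 zeroes the upper half of x0), which is only one way the bad sequence could misbehave.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for the good sequence: both widenings are
   sign extensions, matching the two sxtw instructions. */
int64_t follow_index(const int32_t *table, int32_t idx)
{
    int64_t i = idx;                 /* first sxtw: widen the incoming index */
    if (table != NULL) {
        i = table[i];                /* 32-bit load, then second sxtw */
    }
    return i;
}

/* Assumed model of the bad sequence: the loaded value is consumed
   without the second sign extension, i.e. zero-extended. */
int64_t follow_index_no_sxtw(const int32_t *table, int32_t idx)
{
    int64_t i = idx;                 /* first sxtw is still present */
    if (table != NULL) {
        i = (int64_t)(uint32_t)table[i];   /* no sign extension after the load */
    }
    return i;
}
```

With a table whose first entry is -1, follow_index(table, 0) yields -1 while follow_index_no_sxtw(table, 0) yields 4294967295, so any later use of the value as a 64-bit index or offset would differ between the two builds.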
In addition to the -mtune argument I am compiling with:
-std=c11 -O2 -fno-inline -fno-schedule-insns -fno-schedule-insns2
-fno-strict-aliasing