https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115568
Uroš Bizjak <ubizjak at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ra
--- Comment #6 from Uroš Bizjak <ubizjak at gmail dot com> ---
It looks like the problem is in LRA. The sched1 pass merely reorders the
instructions in just the right way to expose the problem.
When compiling with -O2 -fno-tree-sink -fno-tree-ter -fschedule-insns, we have
the following dump in _.317r.asmcons:
120: NOTE_INSN_BASIC_BLOCK 9
121: r100:SI=~r165:SI
122: r109:SI=r165:SI
REG_DEAD r165:SI
123: r112:SI=~r167:SI
124: r104:SI=r167:SI
16: r170:SI=r164:SI
REG_DEAD r164:SI
17: r164:SI=r109:SI
18: r145:SI=r168:SI
REG_DEAD r168:SI
19: r168:SI=r116:SI
REG_DEAD r116:SI
312: r163:QI=0x1
335: pc=L160
336: barrier
FTR, the sched1 pass just reorders these instructions to:
120: NOTE_INSN_BASIC_BLOCK 9
122: r109:SI=r165:SI
REG_DEAD r165:SI
16: r170:SI=r164:SI
REG_DEAD r164:SI
18: r145:SI=r168:SI
REG_DEAD r168:SI
121: r100:SI=~r165:SI
123: r112:SI=~r167:SI
312: r163:QI=0x1
124: r104:SI=r167:SI
17: r164:SI=r109:SI
19: r168:SI=r116:SI
REG_DEAD r116:SI
335: pc=L160
336: barrier
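As a sanity check, the two orderings above can be compared with a toy Python model (not GCC code). It is only a sketch: the live-in values are hypothetical and chosen for illustration, and insn 312 is left out because it loads a constant into a register no other insn in the block touches. Under those assumptions both orders leave the same final register state, which is consistent with sched1 performing a plain reordering.

```python
# Toy model of basic block 9: uid -> (dest, src, complement?).
MASK = 0xFFFFFFFF  # SImode values are 32 bits wide

INSNS = {
    121: ("r100", "r165", True),   # r100 = ~r165
    122: ("r109", "r165", False),  # r109 = r165
    123: ("r112", "r167", True),   # r112 = ~r167
    124: ("r104", "r167", False),  # r104 = r167
    16: ("r170", "r164", False),   # r170 = r164
    17: ("r164", "r109", False),   # r164 = r109
    18: ("r145", "r168", False),   # r145 = r168
    19: ("r168", "r116", False),   # r168 = r116
}

# Insn order in the asmcons dump and in the sched1 dump (insn 312 omitted).
BEFORE_SCHED1 = [121, 122, 123, 124, 16, 17, 18, 19]
AFTER_SCHED1 = [122, 16, 18, 121, 123, 124, 17, 19]


def run(order, initial):
    """Execute the insns in the given order and return the register file."""
    regs = dict(initial)
    for uid in order:
        dest, src, complement = INSNS[uid]
        value = regs[src]
        regs[dest] = (~value & MASK) if complement else value
    return regs


# Hypothetical live-in values; they do not come from the report.
LIVE_IN = {"r164": 11, "r165": MASK, "r167": 22, "r168": 33, "r116": 44}

before = run(BEFORE_SCHED1, LIVE_IN)
after = run(AFTER_SCHED1, LIVE_IN)
print(before == after)  # prints True
```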
The IRA pass renames a couple of registers (_.323r.ira):
316: r223:SI=0xffffffffffffffff
REG_EQUAL 0xffffffffffffffff
...
43: r164:SI=[`e']
361: r218:SI=r164:SI
REG_DEAD r164:SI
...
120: NOTE_INSN_BASIC_BLOCK 9
122: r109:SI=r223:SI
16: r170:SI=r218:SI
REG_DEAD r218:SI
18: r145:SI=r220:SI
REG_DEAD r220:SI
121: r100:SI=~r223:SI
REG_DEAD r223:SI
123: r112:SI=~r219:SI
312: r163:QI=0x1
124: r104:SI=r219:SI
17: r218:SI=r109:SI
19: r220:SI=r222:SI
REG_DEAD r222:SI
335: pc=L160
336: barrier
Now LRA reloads this sequence to (_.324r.reload):
316: di:SI=0xffffffffffffffff
REG_EQUAL 0xffffffffffffffff
...
43: r14:SI=[`e']
361: [sp:DI-0x34]=r14:SI
...
120: NOTE_INSN_BASIC_BLOCK 9
122: r14:SI=di:SI
476: [sp:DI-0x2c]=r14:SI
483: ax:SI=di:SI <--- here
482: NOTE_INSN_DELETED
16: [sp:DI-0x24]=ax:SI
18: [sp:DI-0x1c]=r11:SI
121: di:SI=~di:SI
413: r9:SI=di:SI
414: r8:SI=r13:SI
123: r8:SI=~r8:SI
312: [sp:DI-0x1]=0x1
124: di:SI=r13:SI
17: [sp:DI-0x34]=r14:SI
19: r11:SI=cx:SI
335: pc=L160
336: barrier
Please note (insn 483). While LRA substitutes register r218:SI in (insn 361)
and (insn 17) with [sp:DI-0x34], it fails to do so in (insn 16) and creates a
wrong reload (insn 483), effectively assigning -1 to [sp:DI-0x24].
(insn 483) should read
483: ax:SI=[sp:DI-0x34]
instead.
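The effect of the wrong reload can be illustrated with a small Python model of the registers and stack slots involved. This is a sketch under stated assumptions, not a reproduction of the testcase: the value loaded from `e' (E_VALUE) is hypothetical, and the insns elided from the dump are assumed to leave both di and [sp:DI-0x34] unchanged.

```python
E_VALUE = 0x1234  # hypothetical value loaded from `e'; not from the report


def run_block(fixed_reload):
    """Return what ends up in [sp-0x24] after insn 16."""
    regs = {}
    stack = {}
    regs["di"] = 0xFFFFFFFF        # insn 316: di = -1
    regs["r14"] = E_VALUE          # insn 43:  r14 = [`e']
    stack[-0x34] = regs["r14"]     # insn 361: spill slot that holds r218
    # Assumption: the elided insns do not modify di or [sp-0x34].
    regs["r14"] = regs["di"]       # insn 122
    stack[-0x2c] = regs["r14"]     # insn 476
    if fixed_reload:
        regs["ax"] = stack[-0x34]  # insn 483 as it should read
    else:
        regs["ax"] = regs["di"]    # insn 483 as emitted by LRA
    stack[-0x24] = regs["ax"]      # insn 16
    return stack[-0x24]


print(hex(run_block(fixed_reload=False)))  # prints 0xffffffff
print(hex(run_block(fixed_reload=True)))   # prints 0x1234
```

With the reload as emitted, the slot receives the -1 held in di; with the expected reload it receives the value previously spilled for r218.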
The above invalid RTX sequence results in the following assembly (please add
-fno-schedule-insns2 -fno-cprop-registers to the compile flags to avoid further
reordering of the RTX sequence):
movl %edi, %r14d # 122 [c=4 l=3] *movsi_internal/0
movl %r14d, -44(%rsp) # 476 [c=4 l=5] *movsi_internal/1
--> movl %edi, -36(%rsp) # 16 [c=4 l=4] *movsi_internal/1
movl %r11d, -28(%rsp) # 18 [c=4 l=5] *movsi_internal/1
notl %edi # 121 [c=4 l=2] *one_cmplsi2_1/0
movl %edi, %r9d # 413 [c=4 l=3] *movsi_internal/0
movl %r13d, %r8d # 414 [c=4 l=3] *movsi_internal/0
notl %r8d # 123 [c=4 l=3] *one_cmplsi2_1/0
movb $1, -1(%rsp) # 312 [c=4 l=5] *movqi_internal/8
movl %r13d, %edi # 124 [c=4 l=3] *movsi_internal/0
movl %r14d, -52(%rsp) # 17 [c=4 l=5] *movsi_internal/1
movl %ecx, %r11d # 19 [c=4 l=3] *movsi_internal/0
The insn #16 above is wrong due to the invalid reload.
When the offending insn #16 is substituted with the expected sequence in the
assembly dump:
...
movl -52(%rsp), %eax
movl %eax, -36(%rsp) # 16
...
the testcase executes without problems.