On 6/8/24 2:36 PM, Jeff Law wrote:
On 6/5/24 8:42 PM, Fei Gao wrote:
But let's back up and get a good explanation of what the problem is.
Based on patch 2/2 it looks like we have lost an assignment to the
return register.
To someone not familiar with this code, it sounds to me like we've made
a mistake earlier and we're now defining a hook that lets us go back and
fix that earlier mistake. I'm probably wrong, but so far that's what
it sounds like.
Hi Jeff
You're right. Let me rephrase patch 2/2 with more details. Search /*
feigao to location the point I'm
tring to explain.
code snippets from gcc/function.cc
void
thread_prologue_and_epilogue_insns (void)
{
...
/*feigao:
targetm.gen_epilogue () is called here to generate epilogue
sequence.
https://gcc.gnu.org/git/?
p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
Commit above tries in targetm.gen_epilogue () to detect if
there's li a0,0 insn at the end of insn chain, if so, cm.popret
is replaced by cm.popretz and li a0,0 insn is deleted.
So that seems like the critical issue. Generation of the prologue/
epilogue really shouldn't be changing other instructions in the
instruction stream. I'm not immediately aware of another target that
does that, an it seems like a rather risky thing to do.
It looks like the cm.popretz's RTL exposes the assignment to a0 and
there's a DCE pass that runs after insertion of the prologue/epilogue.
So I would suggest leaving the assignment to a0 in the RTL chain and see
if the later DCE pass after prologue generation eliminates the redundant
assignment. That seems a lot cleaner.
So I looked at this in a bit more detail. I'm going to explicitly
reject this patch now.
The removal of the set to a0 in riscv_gen_multi_pop_insn looks wrong on
multiple levels. I don't think you have enough context in that routine
or its callers to know if it's safe ie given this fragment of RTL:
(call_insn 12 11 13 3 (parallel [
(call (mem:SI (symbol_ref:SI ("test_1") [flags 0x41] <function_decl
0x7ffff7796d00 test_1>) [0 test_1 S4 A32])
(const_int 0 [0]))
(use (unspec:SI [
(const_int 0 [0])
] UNSPEC_CALLEE_CC))
(clobber (reg:SI 1 ra))
]) "j.c":14:9 441 {call_internal}
(expr_list:REG_CALL_DECL (symbol_ref:SI ("test_1") [flags 0x41]
<function_decl 0x7ffff7796d00 test_1>)
(nil))
(expr_list:SI (use (reg:SI 10 a0))
(nil)))
(code_label 13 12 14 4 2 (nil) [1 uses])
(note 14 13 19 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 19 14 20 4 (set (reg/i:SI 10 a0)
(const_int 0 [0])) "j.c":18:1 276 {*movsi_internal}
(nil))
(insn 20 19 24 4 (use (reg/i:SI 10 a0)) "j.c":18:1 -1
(nil))
You delete insns 19 and 20 resulting in this:
(call_insn 12 11 13 3 (parallel [
(call (mem:SI (symbol_ref:SI ("test_1") [flags 0x41] <function_decl
0x7ffff7796d00 test_1>) [0 test_1 S4 A32])
(const_int 0 [0]))
(use (unspec:SI [
(const_int 0 [0])
] UNSPEC_CALLEE_CC))
(clobber (reg:SI 1 ra))
]) "j.c":14:9 441 {call_internal}
(expr_list:REG_CALL_DECL (symbol_ref:SI ("test_1") [flags 0x41]
<function_decl 0x7ffff7796d00 test_1>)
(nil))
(expr_list:SI (use (reg:SI 10 a0))
(nil)))
(code_label 13 12 14 4 2 (nil) [1 uses])
(note 14 13 24 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(note 24 14 0 NOTE_INSN_DELETED)
Which is incorrect/inconsistent RTL. And as I've noted before, it's
conceptually wrong for the backend code to be removing insns from the
insn chain during prologue/epilogue generation.
I realize you're trying to use a hook to limit how this impacts other
targets, but if you're making a bad decision in the RISC-V backend code,
working around it with a target hook rather than fixing the core problem
in the RISC-V backend just makes the whole situation worse.
My suggest is this. Leave the assignment to a0 and use alone. That's
likely going to result in some kind of code size regression, but not a
correctness regression. Then address the code size regressions with the
invariant that prologue/epilogue generation must not change existing
insns on the insn chain.
jeff