On 6/8/24 2:36 PM, Jeff Law wrote:


On 6/5/24 8:42 PM, Fei Gao wrote:

But let's back up and get a good explanation of what the problem is.
Based on patch 2/2 it looks like we have lost an assignment to the
return register.

To someone not familiar with this code, it sounds to me like we've made
a mistake earlier and we're now defining a hook that lets us go back and
fix that earlier mistake.   I'm probably wrong, but so far that's what
it sounds like.
Hi Jeff

You're right. Let me rephrase  patch 2/2 with more details. Search /* feigao to location the point I'm
tring to explain.

code snippets from gcc/function.cc
void
thread_prologue_and_epilogue_insns (void)
{
...
   /*feigao:
         targetm.gen_epilogue () is called here to generate epilogue sequence.     https://gcc.gnu.org/git/? p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
    Commit above tries in targetm.gen_epilogue () to detect if
    there's li    a0,0 insn at the end of insn chain, if so, cm.popret
    is replaced by cm.popretz and li    a0,0 insn is deleted.
So that seems like the critical issue.  Generation of the prologue/ epilogue really shouldn't be changing other instructions in the instruction stream.  I'm not immediately aware of another target that does that, an it seems like a rather risky thing to do.


It looks like the cm.popretz's RTL exposes the assignment to a0 and there's a DCE pass that runs after insertion of the prologue/epilogue. So I would suggest leaving the assignment to a0 in the RTL chain and see if the later DCE pass after prologue generation eliminates the redundant assignment.  That seems a lot cleaner.
So I looked at this in a bit more detail. I'm going to explicitly reject this patch now.

The removal of the set to a0 in riscv_gen_multi_pop_insn looks wrong on multiple levels. I don't think you have enough context in that routine or its callers to know if it's safe ie given this fragment of RTL:

(call_insn 12 11 13 3 (parallel [
            (call (mem:SI (symbol_ref:SI ("test_1") [flags 0x41] <function_decl 
0x7ffff7796d00 test_1>) [0 test_1 S4 A32])
                (const_int 0 [0]))
            (use (unspec:SI [
                        (const_int 0 [0])
                    ] UNSPEC_CALLEE_CC))
            (clobber (reg:SI 1 ra))
        ]) "j.c":14:9 441 {call_internal}
     (expr_list:REG_CALL_DECL (symbol_ref:SI ("test_1") [flags 0x41] 
<function_decl 0x7ffff7796d00 test_1>)
        (nil))
    (expr_list:SI (use (reg:SI 10 a0))
        (nil)))

(code_label 13 12 14 4 2 (nil) [1 uses])

(note 14 13 19 4 [bb 4] NOTE_INSN_BASIC_BLOCK)

(insn 19 14 20 4 (set (reg/i:SI 10 a0)
        (const_int 0 [0])) "j.c":18:1 276 {*movsi_internal}
     (nil))

(insn 20 19 24 4 (use (reg/i:SI 10 a0)) "j.c":18:1 -1
     (nil))


You delete insns 19 and 20 resulting in this:

(call_insn 12 11 13 3 (parallel [
            (call (mem:SI (symbol_ref:SI ("test_1") [flags 0x41] <function_decl 
0x7ffff7796d00 test_1>) [0 test_1 S4 A32])
                (const_int 0 [0]))
            (use (unspec:SI [
                        (const_int 0 [0])
                    ] UNSPEC_CALLEE_CC))
            (clobber (reg:SI 1 ra))
        ]) "j.c":14:9 441 {call_internal}
     (expr_list:REG_CALL_DECL (symbol_ref:SI ("test_1") [flags 0x41] 
<function_decl 0x7ffff7796d00 test_1>)
        (nil))
    (expr_list:SI (use (reg:SI 10 a0))
        (nil)))

(code_label 13 12 14 4 2 (nil) [1 uses])

(note 14 13 24 4 [bb 4] NOTE_INSN_BASIC_BLOCK)

(note 24 14 0 NOTE_INSN_DELETED)

Which is incorrect/inconsistent RTL. And as I've noted before, it's conceptually wrong for the backend code to be removing insns from the insn chain during prologue/epilogue generation.

I realize you're trying to use a hook to limit how this impacts other targets, but if you're making a bad decision in the RISC-V backend code, working around it with a target hook rather than fixing the core problem in the RISC-V backend just makes the whole situation worse.

My suggest is this. Leave the assignment to a0 and use alone. That's likely going to result in some kind of code size regression, but not a correctness regression. Then address the code size regressions with the invariant that prologue/epilogue generation must not change existing insns on the insn chain.

jeff











Reply via email to