On 3/20/19 5:25 AM, Paulo Matos wrote:
I am working on trying to get RISC-V 32 emitting sibcalls even in the
presence of `-msave-restore`, for a client concerned with generated code
size.

This won't work unless you define a new set of restore functions. The current ones restore the return address from the stack and return, which is wrong if you want to do a sibcall. This is why we tail call (jump to) the restore functions: the actual function return is in the restore functions. You will need a new set of restore functions that restore regs without restoring ra. You will then probably also need other cascading changes to make this work.

The new set of restore functions will then increase code size a bit, offsetting the gain you get from using them. You would need enough sibling calls that can use -msave-restore to make this worthwhile. It isn't clear whether this would be a win or not.
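To make the trade-off concrete, here is a rough break-even sketch in C. It is only a back-of-the-envelope model: the function names and every number fed into it are hypothetical, not measurements from any toolchain. It assumes each sibcall site that can use the new restore routines saves a fixed number of bytes, and that the new routines cost a fixed number of bytes once per program.

```c
#include <stdio.h>

/* Hypothetical cost model. All inputs are assumptions supplied by the
   caller, not measured values. */

/* Net bytes saved: per-site savings summed over all sibcall sites,
   minus the one-time cost of the extra restore routines.
   A negative result means the program got bigger. */
static long net_bytes_saved(long sibcall_sites,
                            long bytes_saved_per_site,
                            long new_routines_bytes)
{
    return sibcall_sites * bytes_saved_per_site - new_routines_bytes;
}

/* Smallest number of sibcall sites that yields a strictly positive
   net saving, or -1 if a site saves nothing. */
static long break_even_sites(long bytes_saved_per_site,
                             long new_routines_bytes)
{
    if (bytes_saved_per_site <= 0)
        return -1;
    return new_routines_bytes / bytes_saved_per_site + 1;
}
```

With made-up inputs of 8 bytes saved per site and 100 bytes of new routines, `break_even_sites(8, 100)` gives 13 sites. Real numbers would have to come from measuring the actual build.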

I thought I was on the right path until I noticed that the CFG is messed
up because of assumptions related to emission of sibcall instead of a
libcall until the epilogue is expanded. During the pro_and_epilogue pass
I get an emergency dump and a segfault:
gcc/gcc/testsuite/gcc.target/riscv/save-restore-1.c:11:1: error: in
basic block 2:
gcc/gcc/testsuite/gcc.target/riscv/save-restore-1.c:11:1: error: flow
control insn inside a basic block
(jump_insn 24 23 6 2 (parallel [
             (return)
             (use (reg:SI 1 ra))
             (const_int 0 [0])
         ]) "gcc/gcc/testsuite/gcc.target/riscv/save-restore-1.c":11:1 -1
      (nil))

If you look at the epilogue code, you will see that it emits a regular instruction which hides the call to the restore routine, and then it emits a special fake return insn that doesn't do anything. You can just stop emitting the special fake return insn in this case. This of course assumes that you have a new set of restore functions that actually return to the caller, instead of to the caller's parent.

One of the issues with -msave-restore is that the limited offset ranges of calls and branches means that if you don't have a tiny program then each save/restore call/jump is probably an auipc/lui plus the call/tail, which limits the code size reduction you get from using it. If you can control where the -msave-restore routines are placed in memory, then putting them near address 0, or near the global pointer address, will allow linker relaxation to optimize these calls/jumps to a single instruction. This will probably help more than trying to get it to work with sibling calls.
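As a rough illustration of the range limit, the sketch below checks whether a jump from one address to another fits in a single RISC-V JAL instruction, whose pc-relative offset is a signed, 2-byte-aligned value covering -1 MiB to +1 MiB - 2. The helper name `fits_in_jal` is hypothetical, and the model is an assumption: it only covers the plain JAL form, not the compressed or near-zero forms the linker may also use, which have their own ranges.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a jump from `pc` to `target` can be encoded as one
   JAL instruction. JAL reaches offsets in [-2^20, 2^20 - 2] and the
   offset must be a multiple of 2 bytes.
   64-bit arithmetic avoids wraparound on 32-bit addresses. */
static bool fits_in_jal(uint32_t pc, uint32_t target)
{
    int64_t offset = (int64_t)target - (int64_t)pc;

    if (offset % 2 != 0)
        return false;
    return offset >= -(INT64_C(1) << 20) &&
           offset <= (INT64_C(1) << 20) - 2;
}
```

A call site at 0x200000 can reach a save/restore routine at 0x100000 with one instruction under this model, but not one placed 2 bytes lower. When the check fails, the call stays as a two-instruction sequence.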

If you can modify the hardware, you might try adding load/store multiple instructions and using that instead of the -msave-restore option. I don't know if anyone has tried this yet, but it would be an interesting experiment that might result in smaller code size.

Jim
