https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121498
Bug ID: 121498
Summary: RISC-V: The ra register is not saved on the stack
when the -fshrink-wrap option is in use
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: lw297073896 at gmail dot com
Target Milestone: ---
Bug Description
Target Architecture: RISC-V
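Trigger Conditions:
When compiling a very large function (here, a switch statement with thousands
of cases, so that branch targets lie more than 4KB apart)
With -fshrink-wrap optimization enabled (on by default at -O and higher)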
Observed Behavior:
The function prologue does not save the return address (ra) register on the
stack before ra is overwritten. As a result, the function returns to an invalid
address held in ra.
Source code snippet:
typedef unsigned short trek_uint16_t;
#define trek_mem_ddr1 0x180000000ULL
#define trek_hart0_T0_state (trek_mem_ddr1+0x093497f4)
#define trek_read32(addr) (*((volatile trek_uint16_t *)(addr)))
#define trek_read32_shared(addr) trek_read32(addr)
void trek_hart0_T0(void) {
    switch (trek_read32_shared(trek_hart0_T0_state)) {
    case (0x1): {
        trek_c2t_event(0, 0x2);
        trek_c2t_event(0, 0x3);
    }
    case (0x2): {
    }
    //many same cases ...
    case (0x1c5d): {
        trek_c2t_event(0, 0x1283);
        break;
    }
    default:
        if (trek_read32_shared(trek_hart0_T0_state) != 0x1c5e) {
            trek_c2t_event(0, 0x1284);
        }
        break;
    }
}
The compilation command is:
gcc -mcmodel=medany -static -std=gnu99 -O2 -march=rv64gcvh -c trek_pss.c -o trek_pss.o
The generated assembly is as follows:
trek_hart0_T0:
.LFB15:
li a5,1649221632
addi a5,a5,1533
slli a5,a5,2
li a4,8192
addi a3,a4,-931
bleu a5,a3,1f; jump .L33362, ra; 1:
sext.w a4,a5
lla a5,.L461
sh2add a4,a4,a5
lw a4,0(a4)
addi sp,sp,-48
sd ra,40(sp)
add a5,a4,a5
jr a5
....
.L33362:
li a5,1649221632
addi a5,a5,1533
slli a5,a5,2
lw a4,0(a5)
li a5,8192
addi a5,a5,-930
beq a4,a5,.L33364
...
.L33364:
ret
From the above assembly, it can be seen that ra is saved on the stack only
after the jump instruction "jump .L33362". During the linking phase, because
the distance between this jump instruction and the label ".L33362" exceeds 4KB,
the jump instruction is transformed into "auipc ra,0x16f" followed by
"jalr x0,-338(ra)", which overwrites ra. Since ra has not yet been saved at that
point, and the path through .L33364 reaches ret without restoring it, the
address in the ra register is invalid when ret is executed.
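
As a rough cross-check of the immediates quoted above, the distance covered by
the auipc/jalr pair can be computed by hand. The following is only a sketch and
is not taken from the GCC or binutils sources; the helper names
auipc_jalr_offset, fits_range and check_reported_jump are made up for
illustration. It assumes the usual RISC-V semantics (auipc adds the upper
immediate shifted left by 12 to pc, jalr adds a sign-extended 12-bit offset)
and the nominal ranges of about +-4 KiB for conditional branches and +-1 MiB
for jal.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset reached by "auipc rd,hi20" followed by "jalr x0,lo12(rd)":
   auipc adds (hi20 << 12) to pc, then jalr adds the sign-extended lo12. */
static int64_t auipc_jalr_offset(int64_t hi20, int64_t lo12)
{
    return hi20 * 4096 + lo12;
}

/* Nonzero if a signed byte offset lies in [-limit, limit).
   Instruction alignment is ignored; this is only a rough range test. */
static int fits_range(int64_t offset, int64_t limit)
{
    return offset >= -limit && offset < limit;
}

/* Immediates copied from the report: auipc ra,0x16f ; jalr x0,-338(ra). */
void check_reported_jump(void)
{
    int64_t off = auipc_jalr_offset(0x16f, -338);

    assert(off == 1502894);             /* 0x16f000 - 338 bytes, about 1.43 MiB */
    assert(!fits_range(off, 4096));     /* too far for a conditional branch */
    assert(!fits_range(off, 1 << 20));  /* too far for a single jal as well */
}
```

Under these assumptions the target is roughly 1.43 MiB away from the jump, so
neither a conditional branch nor a single jal can reach it. That is consistent
with the two-instruction sequence that uses ra as a scratch register remaining
in the final binary.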