https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117081
--- Comment #13 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #10)
> (In reply to Hongtao Liu from comment #9)
> > (In reply to Hongtao Liu from comment #8)
> > > (In reply to H.J. Lu from comment #7)
> > > > Created attachment 60350 [details]
> > > > ira: Don't increase callee-saved register cost by 1000x
> > >
> > > NOTE, r15-1619-g3b9b8d6cfdf593 improved 500.perlbench_r on many different
> > > platforms, let me help verify the patch with SPEC2017.
> >
> > There's a 5% regression on alderlake for 511.povray_r.
> > With the patch, there are more PUSH/POPs for callee-saved registers. (Those
> > PUSH/POPs had been eliminated by r15-1619-g3b9b8d6cfdf593.)
>
> We need testcases to show that. Without them, we can't be sure that the
> improvement won't go away.
I think the testcase in PR111673 demonstrates it:
int f(int);
int advance(int dz)
{
  if (dz > 0)
    return (dz + dz) * dz;
  else
    return dz * f(dz);
}
Before r15-1619-g3b9b8d6cfdf593:
advance(int):
        push    rbx
        mov     ebx, edi
        test    edi, edi
        jle     .L2
        imul    ebx, edi
        lea     eax, [rbx+rbx]
        pop     rbx
        ret
.L2:
        call    f(int)
        imul    eax, ebx
        pop     rbx
        ret
After:
advance(int):
        test    edi, edi
        jle     .L2
        imul    edi, edi
        lea     eax, [rdi+rdi]
        ret
.L2:
        sub     rsp, 24
        mov     DWORD PTR [rsp+12], edi
        call    f(int)
        imul    eax, DWORD PTR [rsp+12]
        add     rsp, 24
        ret
Unlike the testcase in #c6 (which has a call in both the if and the else
branch), there's no call in the if branch here, so it's not optimal to push
rbx at function entry; the save can be sunk into the else branch (as sub +
mov). When jle .L2 is not taken, this saves one push (and the matching pop).
That's why 511.povray_r improved.
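
To make the contrast concrete, here is a rough sketch of the two shapes side by side. advance() is the PR111673 testcase from above. advance_both() is a made-up variant with a call on both paths; it is not the actual testcase from #c6, which isn't reproduced here. The body of f() is also just a placeholder so the file links and runs.

```c
/* Placeholder body (assumption): the original testcase only declares f().  */
int f(int x)
{
  return x + 1;
}

/* PR111673 testcase: dz is live across a call only on the else path,
   so saving it at function entry penalizes the dz > 0 path.  */
int advance(int dz)
{
  if (dz > 0)
    return (dz + dz) * dz;
  else
    return dz * f(dz);
}

/* Hypothetical variant: dz is live across a call on both paths, so a
   save of dz is needed whichever way the branch goes.  */
int advance_both(int dz)
{
  if (dz > 0)
    return dz * f(dz + dz);
  else
    return dz * f(dz);
}
```

When inspecting the generated assembly, f() should probably remain an external declaration as in the original testcase; with its body visible in the same translation unit, the compiler may inline it or otherwise use knowledge of it, and the register allocation would no longer be comparable.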