https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108315

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |linkw at gcc dot gnu.org

--- Comment #2 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Alexander Monakov from comment #0)
> Created attachment 54202 [details]
> testcase
> 
> At least the documentation should mention that if intentional.
> 
> In the attached example, the function bar is compiled to
> 
> bar:
>         .localentry     bar,1
>         mtctr 3
>         mr 12,3
>         bctr
>         .long 0
>         .byte 0,0,0,0,0,0,0,0
> 
> i.e. it does not preserve r2 (it's compiled with -mcpu=power10). If the
> caller is not compiled with -mcpu=power10, it needs r2 preserved (bar has a
> localentry, so the nop in the caller stays a nop after linking).

My local 64bit-elfv2-abi spec v1.5 has the following description:

3.4.1. Symbol Values

"The values of these three most significant bits of the st_other field have the
following meanings:

...

1 The local and global entry points are the same, and r2 should be treated as
caller-saved for local and global callers. "

...

"The value of st_other is determined from the .localentry directive as follows:
If the .localentry value is 0, the value of st_other is 0. If the .localentry
value is 1, the value of st_other is 1. Otherwise, the value of st_other is the
logarithm (base 2) of the .localentry value."

The function bar has st_other value 1, so r2 should be treated as
caller-saved; that is why bar takes no action to preserve r2.
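
As a rough illustration of the rule quoted above, here is a minimal sketch of
the .localentry to st_other mapping. It is not taken from the spec text or
from binutils: the function name is hypothetical, and rejecting values that
are not a power of two of at least 4, or whose logarithm does not fit in
three bits, is an assumption made for the sketch.

```python
def st_other_from_localentry(value: int) -> int:
    """Map a .localentry value to the 3-bit st_other field value.

    Follows the quoted rule: 0 and 1 map to themselves, anything else
    maps to its base-2 logarithm.
    """
    if value in (0, 1):
        # 1 means: same local/global entry point, r2 is caller-saved.
        return value
    # Assumption: other values must be a power of two and at least 4.
    if value < 4 or value & (value - 1):
        raise ValueError(f"unsupported .localentry value: {value}")
    encoded = value.bit_length() - 1  # exact log2 for a power of two
    # Assumption: the result has to fit in the three st_other bits.
    if encoded > 7:
        raise ValueError(f".localentry value too large: {value}")
    return encoded


# bar is emitted with ".localentry bar,1"
print(st_other_from_localentry(1))
```

For bar this gives 1, i.e. the case where r2 is caller-saved for local and
global callers.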

> 
> I verified the testcase misbehaves on Compile Farm's gcc135: as it does not
> use any power10-specific instructions, it's runnable there.

I tried the attachment on a local machine (also ppc64le, Power9) and noticed
that the linker had already applied a fix-up through a long_branch.bar stub:

Dump of assembler code for function main:
   0x0000000010000540 <+0>:     lis     r2,4098
   0x0000000010000544 <+4>:     addi    r2,r2,32512
   0x0000000010000548 <+8>:     mflr    r0
   0x000000001000054c <+12>:    nop
   0x0000000010000550 <+16>:    ld      r3,-32728(r2)
   0x0000000010000554 <+20>:    std     r0,16(r1)
   0x0000000010000558 <+24>:    stdu    r1,-32(r1)
   0x000000001000055c <+28>:    bl      0x10000510 <00000038.long_branch.bar>
=> 0x0000000010000560 <+32>:    ld      r2,24(r1)
   0x0000000010000564 <+36>:    addis   r3,r2,-2
   0x0000000010000568 <+40>:    addi    r3,r3,-30328

Dump of assembler code for function 00000038.long_branch.bar:
=> 0x0000000010000510 <+0>:     std     r2,24(r1)
   0x0000000010000514 <+4>:     b       0x10000710 <bar>

The stub saves r2 to the corresponding stack slot before branching to bar, and
main reloads it after the call, so the program runs as expected. I'm not sure
why it doesn't work on your side; maybe this inter-operation requires support
in a newer binutils? Locally, GNU ld 2.34 does the final link (and 2.35 is
used for power10 support, i.e. for generating bar.o).
