[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #15 from CVS Commits  ---
The master branch has been updated by H.J. Lu:

https://gcc.gnu.org/g:745d04e796c1a7ebcea0185d0742d58b0c0030ab

commit r11-6557-g745d04e796c1a7ebcea0185d0742d58b0c0030ab
Author: H.J. Lu 
Date:   Fri Jan 8 08:41:38 2021 -0800

x86-64: Require lp64 for PR target/98482 tests

Require lp64 for PR target/98482 tests since -mcmodel=large isn't
supported for x32.

PR target/98482
* gcc.target/i386/pr98482-1.c: Require lp64.
* gcc.target/i386/pr98482-2.c: Likewise.

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #14 from Uroš Bizjak  ---
(In reply to Jakub Jelinek from comment #10)
> If we are emitting for nested functions
>         pushq   %r10
> 1:      call    __fentry__
>         popq    %r10
> (is it ok to misalign the stack for __fentry__? but then even plain call
> __fentry__ actually misaligns it), then perhaps we can do similarly for the
> PIC case.  But I wonder how does __fentry__ then find the caller if it can't
> rely on the return address being right above the return address to the
> function that called __fentry__ (apart from unwind info of course, but we
> don't really emit .cfi_* directives here either, do we?).

Generic part of the compiler pushes static chain register for nested functions,
so there is little we can do in the target part. If there is a problem with
misaligned stack, then I think __mcount_internal will have to be realigned,
because calls to both mcount and __fentry__ can be misaligned.

I don't know what to do with __fentry__ argument. Luckily, mcount finds its
argument via frame pointer, so it works there.

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-08 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #13 from H.J. Lu  ---
Fixed for GCC 11 so far.  Please open a new GCC bug for mcount stack
alignment.

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #12 from CVS Commits  ---
The master branch has been updated by H.J. Lu:

https://gcc.gnu.org/g:76be18f442948d1a4bc49a7d670b07097f9e5983

commit r11-6552-g76be18f442948d1a4bc49a7d670b07097f9e5983
Author: H.J. Lu 
Date:   Fri Jan 8 05:20:19 2021 -0800

x86-64: Use R10 and R11 for profiling large model with PIC

For NO_PROFILE_COUNTERS targets, R11 is a scratch register.  We can use
R10 and R11 to call mcount in large model with PIC.

gcc/

PR target/98482
* config/i386/i386.c (x86_function_profiler): Use R10 and R11
to call mcount in large model with PIC for NO_PROFILE_COUNTERS
targets.

gcc/testsuite/

PR target/98482
* gcc.target/i386/pr98482-2.c: Updated.
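
Read together with the earlier "Use R10 for profiling large model" commit, the
resulting choice of call sequence can be summarized as a small decision
function. This is only a sketch of the logic as described in the commit
messages, not the code in x86_function_profiler; the enum and function names
are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical labels for the call sequences discussed in this PR. */
enum profiler_call_kind {
    CALL_UNCHANGED,        /* not large model: outside the scope of this PR */
    CALL_VIA_R10,          /* large model, non-PIC: load address into %r10 */
    CALL_VIA_R10_R11_PIC,  /* large model, PIC: needs both %r10 and %r11 */
    CALL_UNSUPPORTED       /* large model, PIC, %r11 taken: sorry */
};

/* Sketch of the selection described in the two commit messages. */
enum profiler_call_kind
choose_profiler_call(bool large_model, bool pic, bool no_profile_counters)
{
    if (!large_model)
        return CALL_UNCHANGED;
    if (!pic)
        return CALL_VIA_R10;
    /* %r11 is only a free scratch register for NO_PROFILE_COUNTERS targets. */
    if (no_profile_counters)
        return CALL_VIA_R10_R11_PIC;
    return CALL_UNSUPPORTED;
}
```

For example, choose_profiler_call(true, true, false) yields CALL_UNSUPPORTED,
which corresponds to the "sorry" case for targets where %r11 holds the profile
counter.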

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #11 from CVS Commits  ---
The master branch has been updated by H.J. Lu:

https://gcc.gnu.org/g:1b885264a48dcd71b7aeb26c0abeb91246724897

commit r11-6548-g1b885264a48dcd71b7aeb26c0abeb91246724897
Author: H.J. Lu 
Date:   Thu Jan 7 14:27:49 2021 -0800

x86-64: Use R10 for profiling large model

R10 is caller-saved.  Although it can be used as a static chain register,
it is preserved when calling mcount for nested functions.  Use R10 as a
scratch register to call mcount in large model.

gcc/

PR target/98482
* config/i386/i386.c (x86_function_profiler): Use R10 to call
mcount in large model.  Sorry for large model with PIC.

gcc/testsuite/

PR target/98482
* gcc.target/i386/pr98482-1.c: New test.
* gcc.target/i386/pr98482-2.c: Likewise.

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-08 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #10 from Jakub Jelinek  ---
If we are emitting for nested functions
        pushq   %r10
1:      call    __fentry__
        popq    %r10
(is it ok to misalign the stack for __fentry__? but then even plain call
__fentry__ actually misaligns it), then perhaps we can do similarly for the PIC
case.  But I wonder how does __fentry__ then find the caller if it can't rely
on the return address being right above the return address to the function that
called __fentry__ (apart from unwind info of course, but we don't really emit
.cfi_* directives here either, do we?).
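
The alignment question can be checked with a little arithmetic. The following
is a minimal sketch, assuming the psABI rule that %rsp is 16-byte aligned
immediately before a call instruction, so that %rsp % 16 == 8 on entry to the
profiled function. The helper name is hypothetical.

```c
#include <stdint.h>

/* Returns %rsp % 16 as seen on entry to __fentry__, given the number of
   8-byte pushes executed before the call.  Assumes the profiled function
   was itself entered with %rsp % 16 == 8.  */
unsigned rsp_mod16_at_fentry(unsigned pushes_before_call)
{
    uint64_t rsp = 8;                         /* %rsp % 16 at function entry */
    rsp -= 8 * (uint64_t)pushes_before_call;  /* each pushq lowers %rsp by 8 */
    rsp -= 8;                                 /* call pushes the return address */
    return (unsigned)(rsp & 15);              /* wraparound is harmless mod 16 */
}
```

Under that assumption, a plain call (0 pushes) enters __fentry__ with
%rsp % 16 == 0, while the pushq %r10 variant (1 push) enters with
%rsp % 16 == 8.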

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-08 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #9 from Uroš Bizjak  ---
(In reply to Topi Miettinen from comment #8)
> I'm unfortunately ignorant of GCC internals and the usage of %r10, but
> otherwise the patch looks good to me.
> 
> For -mcmodel=large -fPIC, the call sequence probably needs to be similar to
> how other extern functions are called under those flags:
> 
> .L2:
>         movabsq $_GLOBAL_OFFSET_TABLE_-.L2, %reg0
>         leaq    .L2(%rip), %reg1
>         movabsq $__fentry__@PLTOFF, %reg2
>         addq    %reg0, %reg1
>         addq    %reg1, %reg2
>         call    *%reg2

We are lucky to get even one temporary register (%r10), so perhaps the above
could be implemented for NO_PROFILE_COUNTERS targets, where %r11 is also
available.

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-08 Thread toiwoton at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #8 from Topi Miettinen  ---
I'm unfortunately ignorant of GCC internals and the usage of %r10, but
otherwise the patch looks good to me.

For -mcmodel=large -fPIC, the call sequence probably needs to be similar to how
other extern functions are called under those flags:

.L2:
        movabsq $_GLOBAL_OFFSET_TABLE_-.L2, %reg0
        leaq    .L2(%rip), %reg1
        movabsq $__fentry__@PLTOFF, %reg2
        addq    %reg0, %reg1
        addq    %reg1, %reg2
        call    *%reg2
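
The register arithmetic in this sequence can be modelled in C to see why the
final register holds the address of the PLT entry. This is an illustrative
sketch only: it assumes that @PLTOFF denotes the offset of the symbol's PLT
entry from the GOT base, and the function and parameter names are hypothetical.

```c
#include <stdint.h>

/* Models the six-instruction sequence above with plain 64-bit integers.
   label_addr: run-time address of .L2
   got_addr:   run-time address of _GLOBAL_OFFSET_TABLE_
   plt_entry:  run-time address of the PLT entry for __fentry__  */
uint64_t large_pic_call_target(uint64_t label_addr, uint64_t got_addr,
                               uint64_t plt_entry)
{
    uint64_t reg0 = got_addr - label_addr;  /* movabsq $_GLOBAL_OFFSET_TABLE_-.L2 */
    uint64_t reg1 = label_addr;             /* leaq .L2(%rip) */
    uint64_t reg2 = plt_entry - got_addr;   /* movabsq $__fentry__@PLTOFF */
    reg1 += reg0;                           /* reg1 now holds the GOT address */
    reg2 += reg1;                           /* reg2 now holds the PLT entry */
    return reg2;                            /* target of call *%reg2 */
}
```

Both link-time constants are full 64-bit immediates, so no step depends on the
target being within a 32-bit displacement of the call site.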

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-07 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #7 from Hongtao.liu  ---
(In reply to H.J. Lu from comment #6)
> A patch is posted at
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563033.html

Yes, %r10 is pushed before __fentry__

cat test.c

void func(int (*param)(int));

void outer(int x)
{
    int nested(int y)
    {
        // If x is not used somewhere in here,
        // then the function will be "lifted" into
        // a normal, non-nested function.
        return x + y;
    }
    func(nested);
}

With -O2 -pg -mfentry, this gives:

nested.0:
.LFB1:
        .cfi_startproc
        pushq   %r10
1:      call    __fentry__
        popq    %r10
        movl    (%r10), %eax
        addl    %edi, %eax
        ret
        .cfi_endproc
.LFE1:
        .size   nested.0, .-nested.0
        .p2align 4
        .globl  outer
        .type   outer, @function
outer:
.LFB0:
        .cfi_startproc
1:      call    __fentry__
        subq    $56, %rsp
        .cfi_def_cfa_offset 64
        leaq    64(%rsp), %rax
        movq    %rax, 32(%rsp)
        movl    $-17599, %eax
        movl    %edi, (%rsp)
        movw    %ax, 4(%rsp)
        movl    $-17847, %edx
        movl    $nested.0, %eax
        leaq    4(%rsp), %rdi
        movl    %eax, 6(%rsp)
        movw    %dx, 10(%rsp)
        movq    %rsp, 12(%rsp)
        movl    $-1864106167, 20(%rsp)
        call    func
        addq    $56, %rsp
        .cfi_def_cfa_offset 8

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-07 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

H.J. Lu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|2021-01-04 00:00:00         |2021-01-07
             Status|UNCONFIRMED                 |NEW

--- Comment #6 from H.J. Lu  ---
A patch is posted at

https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563033.html

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #5 from Uroš Bizjak  ---
(In reply to Topi Miettinen from comment #4)
> Sorry, I didn't check the ABI. It seems that %r11 and maybe %r10 should be
> usable:

%r11 is already used as PROFILE_COUNT_REGISTER for !NO_PROFILE_COUNTERS
targets.

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-07 Thread toiwoton at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #4 from Topi Miettinen  ---
Sorry, I didn't check the ABI. It seems that %r11 and maybe %r10 should be
usable:

Figure 3.4: Register Usage

Register | Usage                                        | Preserved across
         |                                              | function calls
---------+----------------------------------------------+-----------------
%r10     | temporary register, used for passing a       | No
         | function’s static chain pointer              |
%r11     | temporary register                           | No

Otherwise, I suppose any register could be used if it's saved:
        pushq   %reg
        movabsq $__fentry__, %reg
        call    *%reg
        popq    %reg

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #3 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #2)
> (In reply to Hongtao.liu from comment #1)
> > And by the time GCC outputs the __fentry__ call, registers are already
> > allocated. Are there any registers that are supposed to be safe on function
> > entry? Or do we need to spill a register to the stack and reload it after
> > the call? That looks inefficient.
> 
> You can use any callee-saved register here.

Eh, no - __fentry__ is called before pushes.

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #2 from Uroš Bizjak  ---
(In reply to Hongtao.liu from comment #1)
> And by the time GCC outputs the __fentry__ call, registers are already
> allocated. Are there any registers that are supposed to be safe on function
> entry? Or do we need to spill a register to the stack and reload it after
> the call? That looks inefficient.

You can use any callee-saved register here.

[Bug target/98482] -mfentry creates invalid call for -mcmodel=large

2021-01-07 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482

--- Comment #1 from Hongtao.liu  ---
(In reply to Topi Miettinen from comment #0)
> GCC on x86_64 with `-mfentry` generates invalid code for `-mcmodel=large`.
> The call to `__fentry__` uses plain `call` instruction, but this can only
> address locations within 32 bit range while the target may be anywhere in
> the 64 bit range due to `-mcmodel=large`.
> 
> The expected code would be something which loads a 64 bit value to a
> register and then uses register indirect call, so instead of
>         call    __fentry__
> there needs to be
>         movabsq $__fentry__, %rax
>         call    *%rax

According to the psABI, %rax is not safe:

Register | Usage                                        | Preserved across
         |                                              | function calls
---------+----------------------------------------------+-----------------
%rax     | temporary register; with variable arguments  | No
         | passes information about the number of       |
         | vector registers used; 1st return register   |

And by the time GCC outputs the __fentry__ call, registers are already
allocated. Are there any registers that are supposed to be safe on function
entry? Or do we need to spill a register to the stack and reload it after
the call? That looks inefficient.
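
The range limitation described in the original report can be made concrete. The
sketch below assumes the usual encoding of a direct near call on x86-64 (a
5-byte instruction whose signed 32-bit displacement is relative to the address
of the following instruction); the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Length in bytes of a direct `call rel32` instruction (opcode + 4-byte
   displacement).  */
enum { CALL_REL32_LENGTH = 5 };

/* Returns true if a direct call placed at call_site can reach target,
   i.e. if the displacement fits in a signed 32-bit field.  */
bool direct_call_reaches(uint64_t call_site, uint64_t target)
{
    /* Displacement is measured from the instruction after the call.  */
    int64_t disp = (int64_t)(target - (call_site + CALL_REL32_LENGTH));
    return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

With -mcmodel=large the compiler cannot assume this predicate holds for
__fentry__, which is why the address has to be loaded into a register with
movabsq and called indirectly.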