Re: Ping: [PATCH] Darwini,X86: Adjust call clobbers to allow for lazy-binding [PR100152].
> On 9 Jul 2021, at 09:57, Uros Bizjak wrote: > > On Fri, Jul 9, 2021 at 10:25 AM Iain Sandoe wrote: >> >> (early) ping; >> if possible I’d like to get this onto master in time to back-port for 11.2. >> >>> On 4 Jul 2021, at 21:08, Iain Sandoe wrote: >>> >>> Hi, >>> >>> (I’m not going to defend the status quo here, it seems a bit prone >>> to confusing a user [different interposition behaviour between the >>> inlined and non-inlined cases] however, this is what the platform >>> compilers implement). >>> >>> >>> >>> We allow public functions defined in a TU to bind locally for PIC >>> code (the default) on 64bit Mach-O. >>> >>> If such functions are not inlined, we cannot tell at compile-time if >>> they might be called via the lazy symbol resolver (this can depend on >>> options given at link-time). Therefore, we must assume that the lazy >>> resolver could be used which clobbers R11 and R10. >>> >>> The solution here is similar in form to the one used for veneer regs >>> on Arm (but I’m open to alternate suggestions). >>> >>> tested on X86_64-darwin, linux >>> OK for master? >>> Iain >>> >>> Signed-off-by: Iain Sandoe >>> >>> PR target/100152 - [10/11/12 Regression] used caller-saved register not >>> preserved across a call. >>> >>> PR target/100152 >>> >>> gcc/ChangeLog: >>> >>> * config/i386/i386-expand.c (ix86_expand_call): If a call is >>> to a non-local-binding, or local but to a public symbol, then >>> assume that it might be indirected via the lazy symbol binder. >>> Mark R10 and R10 as clobbered in that case. > > LGTM, but this is Darwin specific patch, so you could approve it yourself. thanks, maybe I should have made it “RFC” (I was wondering if there was any better way to do this). Anyway, I’ll get it in and baking on master. thanks again, Iain
Re: Ping: [PATCH] Darwini,X86: Adjust call clobbers to allow for lazy-binding [PR100152].
On Fri, Jul 9, 2021 at 10:25 AM Iain Sandoe wrote: > > (early) ping; > if possible I’d like to get this onto master in time to back-port for 11.2. > > > On 4 Jul 2021, at 21:08, Iain Sandoe wrote: > > > > Hi, > > > > (I’m not going to defend the status quo here, it seems a bit prone > > to confusing a user [different interposition behaviour between the > > inlined and non-inlined cases] however, this is what the platform > > compilers implement). > > > > > > > > We allow public functions defined in a TU to bind locally for PIC > > code (the default) on 64bit Mach-O. > > > > If such functions are not inlined, we cannot tell at compile-time if > > they might be called via the lazy symbol resolver (this can depend on > > options given at link-time). Therefore, we must assume that the lazy > > resolver could be used which clobbers R11 and R10. > > > > The solution here is similar in form to the one used for veneer regs > > on Arm (but I’m open to alternate suggestions). > > > > tested on X86_64-darwin, linux > > OK for master? > > Iain > > > > Signed-off-by: Iain Sandoe > > > > PR target/100152 - [10/11/12 Regression] used caller-saved register not > > preserved across a call. > > > >PR target/100152 > > > > gcc/ChangeLog: > > > > * config/i386/i386-expand.c (ix86_expand_call): If a call is > > to a non-local-binding, or local but to a public symbol, then > > assume that it might be indirected via the lazy symbol binder. > > Mark R10 and R10 as clobbered in that case. LGTM, but this is Darwin specific patch, so you could approve it yourself. Uros. > > --- > > gcc/config/i386/i386-expand.c | 16 +++- > > 1 file changed, 15 insertions(+), 1 deletion(-) > > > > diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c > > index b37642e35ee..1b860e027b0 100644 > > --- a/gcc/config/i386/i386-expand.c > > +++ b/gcc/config/i386/i386-expand.c > > @@ -8380,6 +8380,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx > > callarg1, > > pop = NULL; > > gcc_assert (!TARGET_64BIT || !pop); > > > > + rtx addr = XEXP (fnaddr, 0); > > if (TARGET_MACHO && !TARGET_64BIT) > > { > > #if TARGET_MACHO > > @@ -8392,7 +8393,6 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx > > callarg1, > > /* Static functions and indirect calls don't need the pic register. > > Also, > >check if PLT was explicitly avoided via no-plt or "noplt" attribute, > > making > >it an indirect call. */ > > - rtx addr = XEXP (fnaddr, 0); > > if (flag_pic > > && GET_CODE (addr) == SYMBOL_REF > > && !SYMBOL_REF_LOCAL_P (addr)) > > @@ -8555,6 +8555,20 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx > > callarg1, > > } > > } > > > > + if (TARGET_MACHO && TARGET_64BIT && !sibcall > > + && ((GET_CODE (addr) == SYMBOL_REF && !SYMBOL_REF_LOCAL_P (addr)) > > + || !fndecl || TREE_PUBLIC (fndecl))) > > +{ > > + /* We allow public functions defined in a TU to bind locally for PIC > > + code (the default) on 64bit Mach-O. > > + If such functions are not inlined, we cannot tell at compile-time if > > + they will be called via the lazy symbol resolver (this can depend on > > + options given at link-time). Therefore, we must assume that the lazy > > + resolver could be used which clobbers R11 and R10. */ > > + clobber_reg (, gen_rtx_REG (DImode, R11_REG)); > > + clobber_reg (, gen_rtx_REG (DImode, R10_REG)); > > +} > > + > > if (vec_len > 1) > > call = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (vec_len, vec)); > > rtx_insn *call_insn = emit_call_insn (call); > > -- > > 2.24.1 > > >
Ping: [PATCH] Darwini,X86: Adjust call clobbers to allow for lazy-binding [PR100152].
(early) ping; if possible I’d like to get this onto master in time to back-port for 11.2. > On 4 Jul 2021, at 21:08, Iain Sandoe wrote: > > Hi, > > (I’m not going to defend the status quo here, it seems a bit prone > to confusing a user [different interposition behaviour between the > inlined and non-inlined cases] however, this is what the platform > compilers implement). > > > > We allow public functions defined in a TU to bind locally for PIC > code (the default) on 64bit Mach-O. > > If such functions are not inlined, we cannot tell at compile-time if > they might be called via the lazy symbol resolver (this can depend on > options given at link-time). Therefore, we must assume that the lazy > resolver could be used which clobbers R11 and R10. > > The solution here is similar in form to the one used for veneer regs > on Arm (but I’m open to alternate suggestions). > > tested on X86_64-darwin, linux > OK for master? > Iain > > Signed-off-by: Iain Sandoe > > PR target/100152 - [10/11/12 Regression] used caller-saved register not > preserved across a call. > >PR target/100152 > > gcc/ChangeLog: > > * config/i386/i386-expand.c (ix86_expand_call): If a call is > to a non-local-binding, or local but to a public symbol, then > assume that it might be indirected via the lazy symbol binder. > Mark R10 and R10 as clobbered in that case. > --- > gcc/config/i386/i386-expand.c | 16 +++- > 1 file changed, 15 insertions(+), 1 deletion(-) > > diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c > index b37642e35ee..1b860e027b0 100644 > --- a/gcc/config/i386/i386-expand.c > +++ b/gcc/config/i386/i386-expand.c > @@ -8380,6 +8380,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1, > pop = NULL; > gcc_assert (!TARGET_64BIT || !pop); > > + rtx addr = XEXP (fnaddr, 0); > if (TARGET_MACHO && !TARGET_64BIT) > { > #if TARGET_MACHO > @@ -8392,7 +8393,6 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1, > /* Static functions and indirect calls don't need the pic register. > Also, >check if PLT was explicitly avoided via no-plt or "noplt" attribute, > making >it an indirect call. */ > - rtx addr = XEXP (fnaddr, 0); > if (flag_pic > && GET_CODE (addr) == SYMBOL_REF > && !SYMBOL_REF_LOCAL_P (addr)) > @@ -8555,6 +8555,20 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1, > } > } > > + if (TARGET_MACHO && TARGET_64BIT && !sibcall > + && ((GET_CODE (addr) == SYMBOL_REF && !SYMBOL_REF_LOCAL_P (addr)) > + || !fndecl || TREE_PUBLIC (fndecl))) > +{ > + /* We allow public functions defined in a TU to bind locally for PIC > + code (the default) on 64bit Mach-O. > + If such functions are not inlined, we cannot tell at compile-time if > + they will be called via the lazy symbol resolver (this can depend on > + options given at link-time). Therefore, we must assume that the lazy > + resolver could be used which clobbers R11 and R10. */ > + clobber_reg (, gen_rtx_REG (DImode, R11_REG)); > + clobber_reg (, gen_rtx_REG (DImode, R10_REG)); > +} > + > if (vec_len > 1) > call = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (vec_len, vec)); > rtx_insn *call_insn = emit_call_insn (call); > -- > 2.24.1 >
[PATCH] Darwini, X86: Adjust call clobbers to allow for lazy-binding [PR100152].
Hi, (I’m not going to defend the status quo here, it seems a bit prone to confusing a user [different interposition behaviour between the inlined and non-inlined cases] however, this is what the platform compilers implement). We allow public functions defined in a TU to bind locally for PIC code (the default) on 64bit Mach-O. If such functions are not inlined, we cannot tell at compile-time if they might be called via the lazy symbol resolver (this can depend on options given at link-time). Therefore, we must assume that the lazy resolver could be used which clobbers R11 and R10. The solution here is similar in form to the one used for veneer regs on Arm (but I’m open to alternate suggestions). tested on X86_64-darwin, linux OK for master? Iain Signed-off-by: Iain Sandoe PR target/100152 - [10/11/12 Regression] used caller-saved register not preserved across a call. PR target/100152 gcc/ChangeLog: * config/i386/i386-expand.c (ix86_expand_call): If a call is to a non-local-binding, or local but to a public symbol, then assume that it might be indirected via the lazy symbol binder. Mark R10 and R10 as clobbered in that case. --- gcc/config/i386/i386-expand.c | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index b37642e35ee..1b860e027b0 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -8380,6 +8380,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1, pop = NULL; gcc_assert (!TARGET_64BIT || !pop); + rtx addr = XEXP (fnaddr, 0); if (TARGET_MACHO && !TARGET_64BIT) { #if TARGET_MACHO @@ -8392,7 +8393,6 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1, /* Static functions and indirect calls don't need the pic register. Also, check if PLT was explicitly avoided via no-plt or "noplt" attribute, making it an indirect call. */ - rtx addr = XEXP (fnaddr, 0); if (flag_pic && GET_CODE (addr) == SYMBOL_REF && !SYMBOL_REF_LOCAL_P (addr)) @@ -8555,6 +8555,20 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1, } } + if (TARGET_MACHO && TARGET_64BIT && !sibcall + && ((GET_CODE (addr) == SYMBOL_REF && !SYMBOL_REF_LOCAL_P (addr)) + || !fndecl || TREE_PUBLIC (fndecl))) +{ + /* We allow public functions defined in a TU to bind locally for PIC +code (the default) on 64bit Mach-O. +If such functions are not inlined, we cannot tell at compile-time if +they will be called via the lazy symbol resolver (this can depend on +options given at link-time). Therefore, we must assume that the lazy +resolver could be used which clobbers R11 and R10. */ + clobber_reg (, gen_rtx_REG (DImode, R11_REG)); + clobber_reg (, gen_rtx_REG (DImode, R10_REG)); +} + if (vec_len > 1) call = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (vec_len, vec)); rtx_insn *call_insn = emit_call_insn (call); -- 2.24.1