Re: PING [PATCH] x86: Add ix86_ifunc_ref_local_ok

2022-07-31 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 27, 2022 at 4:47 PM H.J. Lu  wrote:
>
> On Thu, Jul 21, 2022 at 11:53 AM H.J. Lu  wrote:
> >
> > We can't always use the PLT entry as the function address for local IFUNC
> > functions.  When the PIC register is needed for PLT call, indirect call
> > via the PLT entry will fail since the PIC register may not be set up
> > properly for indirect call.  Add ix86_ifunc_ref_local_ok to return false
> > when the PLT entry can't be used as local IFUNC function pointers.
> >
> > gcc/
> >
> > PR target/83782
> > * config/i386/i386.cc (ix86_ifunc_ref_local_ok): New.
> > (TARGET_IFUNC_REF_LOCAL_OK): Use it.
> >
> > gcc/testsuite/
> >
> > PR target/83782
> > * gcc.target/i386/pr83782-1.c: Require non-ia32.
> > * gcc.target/i386/pr83782-2.c: Likewise.
> > * gcc.target/i386/pr83782-3.c: New test.

You are the expert in this area, I'll blindly rubber-stamp OK.

Thanks,
Uros.

> > ---
> >  gcc/config/i386/i386.cc   | 15 ++-
> >  gcc/testsuite/gcc.target/i386/pr83782-1.c |  8 +++---
> >  gcc/testsuite/gcc.target/i386/pr83782-2.c |  4 +--
> >  gcc/testsuite/gcc.target/i386/pr83782-3.c | 32 +++
> >  4 files changed, 50 insertions(+), 9 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr83782-3.c
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index e03f86d4a23..5e30dc884bf 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -16070,6 +16070,19 @@ ix86_call_use_plt_p (rtx call_op)
> > >    return true;
> >  }
> >
> > +/* Implement TARGET_IFUNC_REF_LOCAL_OK.  If this hook returns true,
> > +   the PLT entry will be used as the function address for local IFUNC
> > +   functions.  When the PIC register is needed for PLT call, indirect
> > +   call via the PLT entry will fail since the PIC register may not be
> > +   set up properly for indirect call.  In this case, we should return
> > +   false.  */
> > +
> > +static bool
> > +ix86_ifunc_ref_local_ok (void)
> > +{
> > +  return !flag_pic || (TARGET_64BIT && ix86_cmodel != CM_LARGE_PIC);
> > +}
> > +
> > >  /* Return true if the function being called was marked with attribute
> > >     "noplt" or using -fno-plt and we are compiling for non-PIC.  We need
> > >     to handle the non-PIC case in the backend because there is no easy
> > > @@ -24953,7 +24966,7 @@ ix86_libgcc_floating_mode_supported_p
> > >    ix86_get_multilib_abi_name
> >
> >  #undef TARGET_IFUNC_REF_LOCAL_OK
> > -#define TARGET_IFUNC_REF_LOCAL_OK hook_bool_void_true
> > +#define TARGET_IFUNC_REF_LOCAL_OK ix86_ifunc_ref_local_ok
> >
> >  #if !TARGET_MACHO && !TARGET_DLLIMPORT_DECL_ATTRIBUTES
> >  # undef TARGET_ASM_RELOC_RW_MASK
> > > diff --git a/gcc/testsuite/gcc.target/i386/pr83782-1.c b/gcc/testsuite/gcc.target/i386/pr83782-1.c
> > index ce97b12e65d..85674346aec 100644
> > --- a/gcc/testsuite/gcc.target/i386/pr83782-1.c
> > +++ b/gcc/testsuite/gcc.target/i386/pr83782-1.c
> > @@ -1,4 +1,4 @@
> > -/* { dg-do compile } */
> > +/* { dg-do compile { target { ! ia32 } } } */
> >  /* { dg-require-ifunc "" } */
> >  /* { dg-options "-O2 -fpic" } */
> >
> > > @@ -20,7 +20,5 @@ bar(void)
> > >    return foo;
> > >  }
> > >
> > > -/* { dg-final { scan-assembler {leal[ \t]foo@GOTOFF\(%[^,]*\),[ \t]%eax} { target ia32 } } } */
> > > -/* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ \t]%(?:e|r)ax} { target { ! ia32 } } } } */
> > > -/* { dg-final { scan-assembler-not "foo@GOT\\\(" { target ia32 } } } */
> > > -/* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" { target { ! ia32 } } } } */
> > > +/* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ \t]%(?:e|r)ax} } } */
> > > +/* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" } } */
> > > diff --git a/gcc/testsuite/gcc.target/i386/pr83782-2.c b/gcc/testsuite/gcc.target/i386/pr83782-2.c
> > index e25d258bbda..a654ded771f 100644
> > --- a/gcc/testsuite/gcc.target/i386/pr83782-2.c
> > +++ b/gcc/testsuite/gcc.target/i386/pr83782-2.c
> > @@ -1,4 +1,4 @@
> > -/* { dg-do compile } */
> > +/* { dg-do compile { target { ! ia32 } } } */
> >  /* { dg-require-ifunc "" } */
> >  /* { dg-options "-O2 -fpic" } */
> >
> > > @@ -20,7 +20,5 @@ bar(void)
> > >    return foo;
> > >  }
> > >
> > > -/* { dg-final { scan-assembler {leal[ \t]foo@GOTOFF\(%[^,]*\),[ \t]%eax} { target ia32 } } } */
> > >  /* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ \t]%(?:e|r)ax} { target { ! ia32 } } } } */
> > > -/* { dg-final { scan-assembler-not "foo@GOT\\\(" { target ia32 } } } */
> > >  /* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" { target { ! ia32 } } } } */
> > > diff --git a/gcc/testsuite/gcc.target/i386/pr83782-3.c b/gcc/testsuite/gcc.target/i386/pr83782-3.c
> > new file mode 100644
> > index 000..1536481cb79
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr83782-3.c
> > @@ -0,0 +1,32 @@
> > +/* { dg-do run }  */
> > +/* { dg-require-ifunc "" } */
> > +/* { 

PING [PATCH] x86: Add ix86_ifunc_ref_local_ok

2022-07-27 Thread H.J. Lu via Gcc-patches
On Thu, Jul 21, 2022 at 11:53 AM H.J. Lu  wrote:
>
> We can't always use the PLT entry as the function address for local IFUNC
> functions.  When the PIC register is needed for PLT call, indirect call
> via the PLT entry will fail since the PIC register may not be set up
> properly for indirect call.  Add ix86_ifunc_ref_local_ok to return false
> when the PLT entry can't be used as local IFUNC function pointers.
>
> gcc/
>
> PR target/83782
> * config/i386/i386.cc (ix86_ifunc_ref_local_ok): New.
> (TARGET_IFUNC_REF_LOCAL_OK): Use it.
>
> gcc/testsuite/
>
> PR target/83782
> * gcc.target/i386/pr83782-1.c: Require non-ia32.
> * gcc.target/i386/pr83782-2.c: Likewise.
> * gcc.target/i386/pr83782-3.c: New test.
> ---
>  gcc/config/i386/i386.cc   | 15 ++-
>  gcc/testsuite/gcc.target/i386/pr83782-1.c |  8 +++---
>  gcc/testsuite/gcc.target/i386/pr83782-2.c |  4 +--
>  gcc/testsuite/gcc.target/i386/pr83782-3.c | 32 +++
>  4 files changed, 50 insertions(+), 9 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr83782-3.c
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index e03f86d4a23..5e30dc884bf 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -16070,6 +16070,19 @@ ix86_call_use_plt_p (rtx call_op)
>    return true;
>  }
>
> +/* Implement TARGET_IFUNC_REF_LOCAL_OK.  If this hook returns true,
> +   the PLT entry will be used as the function address for local IFUNC
> +   functions.  When the PIC register is needed for PLT call, indirect
> +   call via the PLT entry will fail since the PIC register may not be
> +   set up properly for indirect call.  In this case, we should return
> +   false.  */
> +
> +static bool
> +ix86_ifunc_ref_local_ok (void)
> +{
> +  return !flag_pic || (TARGET_64BIT && ix86_cmodel != CM_LARGE_PIC);
> +}
> +
>  /* Return true if the function being called was marked with attribute
>     "noplt" or using -fno-plt and we are compiling for non-PIC.  We need
>     to handle the non-PIC case in the backend because there is no easy
> @@ -24953,7 +24966,7 @@ ix86_libgcc_floating_mode_supported_p
>    ix86_get_multilib_abi_name
>
>  #undef TARGET_IFUNC_REF_LOCAL_OK
> -#define TARGET_IFUNC_REF_LOCAL_OK hook_bool_void_true
> +#define TARGET_IFUNC_REF_LOCAL_OK ix86_ifunc_ref_local_ok
>
>  #if !TARGET_MACHO && !TARGET_DLLIMPORT_DECL_ATTRIBUTES
>  # undef TARGET_ASM_RELOC_RW_MASK
> diff --git a/gcc/testsuite/gcc.target/i386/pr83782-1.c b/gcc/testsuite/gcc.target/i386/pr83782-1.c
> index ce97b12e65d..85674346aec 100644
> --- a/gcc/testsuite/gcc.target/i386/pr83782-1.c
> +++ b/gcc/testsuite/gcc.target/i386/pr83782-1.c
> @@ -1,4 +1,4 @@
> -/* { dg-do compile } */
> +/* { dg-do compile { target { ! ia32 } } } */
>  /* { dg-require-ifunc "" } */
>  /* { dg-options "-O2 -fpic" } */
>
> @@ -20,7 +20,5 @@ bar(void)
>    return foo;
>  }
>
> -/* { dg-final { scan-assembler {leal[ \t]foo@GOTOFF\(%[^,]*\),[ \t]%eax} { target ia32 } } } */
> -/* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ \t]%(?:e|r)ax} { target { ! ia32 } } } } */
> -/* { dg-final { scan-assembler-not "foo@GOT\\\(" { target ia32 } } } */
> -/* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ \t]%(?:e|r)ax} } } */
> +/* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr83782-2.c b/gcc/testsuite/gcc.target/i386/pr83782-2.c
> index e25d258bbda..a654ded771f 100644
> --- a/gcc/testsuite/gcc.target/i386/pr83782-2.c
> +++ b/gcc/testsuite/gcc.target/i386/pr83782-2.c
> @@ -1,4 +1,4 @@
> -/* { dg-do compile } */
> +/* { dg-do compile { target { ! ia32 } } } */
>  /* { dg-require-ifunc "" } */
>  /* { dg-options "-O2 -fpic" } */
>
> @@ -20,7 +20,5 @@ bar(void)
>    return foo;
>  }
>
> -/* { dg-final { scan-assembler {leal[ \t]foo@GOTOFF\(%[^,]*\),[ \t]%eax} { target ia32 } } } */
>  /* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ \t]%(?:e|r)ax} { target { ! ia32 } } } } */
> -/* { dg-final { scan-assembler-not "foo@GOT\\\(" { target ia32 } } } */
>  /* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" { target { ! ia32 } } } } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr83782-3.c b/gcc/testsuite/gcc.target/i386/pr83782-3.c
> new file mode 100644
> index 000..1536481cb79
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr83782-3.c
> @@ -0,0 +1,32 @@
> +/* { dg-do run }  */
> +/* { dg-require-ifunc "" } */
> +/* { dg-require-effective-target pie } */
> +/* { dg-options "-fpie -pie" } */
> +
> +#include <stdio.h>
> +
> +static int __attribute__((noinline))
> +implementation (void)
> +{
> +  printf ("'ere I am JH\n");
> +  return 0;
> +}
> +
> +static __typeof__ (implementation) *resolver (void)
> +{
> +  return (void *)implementation;
> +}
> +
> +extern int magic (void) __attribute__ 

PING: [PATCH] x86: Add <x86gprintrin.h>

2020-10-02 Thread H.J. Lu via Gcc-patches
On Wed, Sep 23, 2020 at 10:58 AM H.J. Lu  wrote:
>
> For sources which can't use any vector instructions, <x86intrin.h> and
> <immintrin.h> cannot be included for compiler intrinsics:
>
> $ echo "#include <x86intrin.h>" | gcc -S -O2 -mno-sse -mno-mmx -x c -
> In file included from /usr/include/stdlib.h:1013,
>                  from /usr/lib/gcc/x86_64-redhat-linux/10/include/mm_malloc.h:27,
>                  from /usr/lib/gcc/x86_64-redhat-linux/10/include/xmmintrin.h:34,
>                  from /usr/lib/gcc/x86_64-redhat-linux/10/include/immintrin.h:29,
>                  from /usr/lib/gcc/x86_64-redhat-linux/10/include/x86intrin.h:32,
>                  from <stdin>:1:
> /usr/include/bits/stdlib-float.h: In function ‘atof’:
> /usr/include/bits/stdlib-float.h:26:1: error: SSE register return with SSE disabled
>26 | {
>   | ^
> $
>
> libgcc/config/i386/shadow-stack-unwind.h has a workaround:
>
> /* NB: We need _get_ssp and _inc_ssp from .  But we can't
>include  which ends up including , which
>includes  and  unconditionally.  But we can't
>include any libc system headers unconditionally from libgcc.  Avoid
>including  here by defining _IMMINTRIN_H_INCLUDED.  */
>  #define _IMMINTRIN_H_INCLUDED
>  #include 
>  #undef _IMMINTRIN_H_INCLUDED
>
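> Add a standalone intrinsic header file, <x86gprintrin.h>, to provide
> integer only intrinsics.  All integer only intrinsics are placed in
> <x86gprintrin.h>.  <immintrin.h> and <x86intrin.h> simply include
> <x86gprintrin.h>.
>
> Add the FSF copyright to <pconfigintrin.h>, <tsxldtrkintrin.h> and
> <wbnoinvdintrin.h>.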
>
> gcc/
>
> PR target/97148
> * config.gcc (extra_headers): Add x86gprintrin.h.
> * config/i386/adxintrin.h: Check _X86GPRINTRIN_H_INCLUDED for
> <x86gprintrin.h>.
> * config/i386/bmi2intrin.h: Likewise.
> * config/i386/bmiintrin.h: Likewise.
> * config/i386/cetintrin.h: Likewise.
> * config/i386/cldemoteintrin.h: Likewise.
> * config/i386/clflushoptintrin.h: Likewise.
> * config/i386/clwbintrin.h: Likewise.
> * config/i386/fxsrintrin.h: Likewise.
> * config/i386/ia32intrin.h: Likewise.
> * config/i386/lwpintrin.h: Likewise.
> * config/i386/lzcntintrin.h: Likewise.
> * config/i386/movdirintrin.h: Likewise.
> * config/i386/pkuintrin.h: Likewise.
> * config/i386/rdseedintrin.h: Likewise.
> * config/i386/rtmintrin.h: Likewise.
> * config/i386/serializeintrin.h: Likewise.
> * config/i386/tbmintrin.h: Likewise.
> * config/i386/waitpkgintrin.h: Likewise.
> * config/i386/xsavecintrin.h: Likewise.
> * config/i386/xsaveintrin.h: Likewise.
> * config/i386/xsaveoptintrin.h: Likewise.
> * config/i386/xsavesintrin.h: Likewise.
> * config/i386/xtestintrin.h: Likewise.
> * config/i386/enqcmdintrin.h: Check _X86GPRINTRIN_H_INCLUDED for
> .  Replace  with 
> in the error message.
> * config/i386/immintrin.h: Include  instead of
> , , ,
> , , ,
> , , , ,
> , ,
> , , ,
> , , ,
> , ,  and
> .
> (_wbinvd): Moved to config/i386/x86gprintrin.h.
> (_rdrand16_step): Likewise.
> (_rdrand32_step): Likewise.
> (_rdpid_u32): Likewise.
> (_readfsbase_u32): Likewise.
> (_readfsbase_u64): Likewise.
> (_readgsbase_u32): Likewise.
> (_readgsbase_u64): Likewise.
> (_writefsbase_u32): Likewise.
> (_writefsbase_u64): Likewise.
> (_writegsbase_u32): Likewise.
> (_writegsbase_u64): Likewise.
> (_rdrand64_step): Likewise.
> (_ptwrite64): Likewise.
> (_ptwrite32): Likewise.
> * config/i386/x86gprintrin.h: New file.
> * config/i386/pconfigintrin.h: Add the FSF copyright.  Check
> _X86GPRINTRIN_H_INCLUDED for <x86gprintrin.h>.
> * config/i386/tsxldtrkintrin.h: Likewise.
> * config/i386/wbnoinvdintrin.h: Likewise.
> * config/i386/x86intrin.h: Include .  Don't
> include , , ,
> ,  and .
>
> gcc/testsuite/
>
> * gcc.target/i386/avx-1.c (__builtin_ia32_lwpval32): New to
> support  included in .
> (__builtin_ia32_lwpval64): Likewise.
> (__builtin_ia32_lwpins32): Likewise.
> (__builtin_ia32_lwpins64): Likewise.
> (__builtin_ia32_bextri_u32): New to support 
> included in .
> (__builtin_ia32_bextri_u64): Likewise.
> * gcc.target/i386/x86gprintrin-1.c: New test.
> * gcc.target/i386/x86gprintrin-2.c: Likewise.
> * gcc.target/i386/x86gprintrin-3.c: Likewise.
> * gcc.target/i386/x86gprintrin-4.c: Likewise.
> * gcc.target/i386/x86gprintrin-4a.c: Likewise.
> * gcc.target/i386/x86gprintrin-5.c: Likewise.
> * gcc.target/i386/x86gprintrin-5a.c: Likewise.
> * gcc.target/i386/x86gprintrin-5b.c: Likewise.
> * gcc.target/i386/x86gprintrin-6.c: Likewise.
>
> libgcc/
>
> PR target/97148
> * config/i386/shadow-stack-unwind.h: Include 
> 

PING [PATCH] x86: Add cmpmemsi for -minline-all-stringops

2020-08-19 Thread H.J. Lu via Gcc-patches
On Tue, May 19, 2020 at 5:14 AM H.J. Lu  wrote:
>
> On Tue, May 19, 2020 at 1:48 AM Uros Bizjak  wrote:
> >
> > On Sun, May 17, 2020 at 7:06 PM H.J. Lu  wrote:
> > >
> > > Duplicate the cmpstrn pattern for cmpmem.  The only difference is that
> > > the length argument of cmpmem is guaranteed to be less than or equal to
> > > lengths of 2 memory areas.  Since "repz cmpsb" can be much slower than
> > > memcmp function implemented with vector instruction, see
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > >
> > > expand cmpmem to "repz cmpsb" only with -mgeneral-regs-only.
> >
> > If there is no benefit compared to the library implementation, then
> > enable these patterns only when -minline-all-stringops is used.
>
> Fixed.
>
> > Eventually these should be reimplemented with SSE4 string instructions.
> >
> > Honza is the author of the block handling x86 system, I'll leave the
> > review to him.
>
> We used to expand memcmp to "repz cmpsb" via cmpstrnsi.  It was changed
> by
>
> commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
> Author: Nick Clifton 
> Date:   Fri Aug 12 16:26:11 2011 +
>
> builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern.
>
> * builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi
> pattern.
> * doc/md.texi (cmpstrn): Note that the comparison stops if both
> fetched bytes are zero.
> (cmpstr): Likewise.
> (cmpmem): Note that the comparison does not stop if both of the
> fetched bytes are zero.
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95151
>
> is a regression.
>
> Honza, can you take a look at this?
>

PING:

https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546921.html

-- 
H.J.


PING: [PATCH] x86: Add UNSPECV_PATCHABLE_AREA

2020-05-22 Thread H.J. Lu via Gcc-patches
On Sat, May 2, 2020 at 4:55 AM H.J. Lu  wrote:
>
> Currently patchable area is at the wrong place.  It is placed immediately
> after function label, before both .cfi_startproc and ENDBR.  This patch
> adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
> changes ENDBR insertion pass to also insert patchable area instruction.
> TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY is defined to avoid placing
> patchable area before .cfi_startproc and ENDBR.
>
> OK for master?
>
> Thanks.
>
> H.J.
> ---
> gcc/
>
> PR target/93492
> * config/i386/i386-features.c (rest_of_insert_endbranch):
> Renamed to ...
> (rest_of_insert_endbr_and_patchable_area): Change return type
> to void. Add need_endbr and patchable_area_size arguments.
> Don't call timevar_push nor timevar_pop.  Replace
> endbr_queued_at_entrance with insn_queued_at_entrance.  Insert
> UNSPECV_PATCHABLE_AREA for patchable area.
> (pass_data_insert_endbranch): Renamed to ...
> (pass_data_insert_endbr_and_patchable_area): This.  Change
> pass name to endbr_and_patchable_area.
> (pass_insert_endbranch): Renamed to ...
> (pass_insert_endbr_and_patchable_area): This.  Add need_endbr
> and patchable_area_size.
> (pass_insert_endbr_and_patchable_area::gate): Set and check
> need_endbr and patchable_area_size.
> (pass_insert_endbr_and_patchable_area::execute): Call
> timevar_push and timevar_pop.  Pass need_endbr and
> patchable_area_size to rest_of_insert_endbr_and_patchable_area.
> (make_pass_insert_endbranch): Renamed to ...
> (make_pass_insert_endbr_and_patchable_area): This.
> * config/i386/i386-passes.def: Replace pass_insert_endbranch
> with pass_insert_endbr_and_patchable_area.
> * config/i386/i386-protos.h (ix86_output_patchable_area): New.
> (make_pass_insert_endbranch): Renamed to ...
> (make_pass_insert_endbr_and_patchable_area): This.
> * config/i386/i386.c (ix86_asm_output_function_label): Set
> function_label_emitted to true.
> (ix86_print_patchable_function_entry): New function.
> (ix86_output_patchable_area): Likewise.
> (x86_function_profiler): Replace endbr_queued_at_entrance with
> insn_queued_at_entrance.  Generate ENDBR only for TYPE_ENDBR.
> Call ix86_output_patchable_area to generate patchable area if
> needed.
> (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): New.
> * i386.h (queued_insn_type): New.
> (machine_function): Add function_label_emitted.  Replace
> endbr_queued_at_entrance with insn_queued_at_entrance.
> * config/i386/i386.md (UNSPECV_PATCHABLE_AREA): New.
> (patchable_area): New.
>
> gcc/testsuite/
>
> PR target/93492
> * gcc.target/i386/pr93492-1.c: New test.
> * gcc.target/i386/pr93492-2.c: Likewise.
> * gcc.target/i386/pr93492-3.c: Likewise.
> * gcc.target/i386/pr93492-4.c: Likewise.
> * gcc.target/i386/pr93492-5.c: Likewise.

PING:

https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545021.html

-- 
H.J.


PING [PATCH] x86: Add -mzero-caller-saved-regs=[skip|used-gpr|all-gpr|used|all]

2018-11-29 Thread H.J. Lu
On Wed, Oct 31, 2018 at 12:42 PM H.J. Lu  wrote:
>
> On Thu, Sep 27, 2018 at 7:58 AM Richard Biener
>  wrote:
> >
> > On Thu, Sep 27, 2018 at 3:16 PM H.J. Lu  wrote:
> > >
> > > On Thu, Sep 27, 2018 at 6:08 AM, Szabolcs Nagy  
> > > wrote:
> > > > On 26/09/18 19:10, H.J. Lu wrote:
> > > >>
> > > >> Add -mzero-caller-saved-regs=[skip|used|all] command-line option and
> > > > >> zero_caller_saved_regs("skip|used|all") function attribute:
> > > >>
> > > >> 1. -mzero-caller-saved-regs=skip and zero_caller_saved_regs("skip")
> > > >>
> > > >> Don't zero caller-saved integer registers upon function return.
> > > >>
> > > >> 2. -mzero-caller-saved-regs=used and zero_caller_saved_regs("used")
> > > >>
> > > >> Zero used caller-saved integer registers upon function return.
> > > >>
> > > >> 3. -mzero-caller-saved-regs=all and zero_caller_saved_regs("all")
> > > >>
> > > >> Zero all caller-saved integer registers upon function return.
> > > >>
> > > >> Tested on i686 and x86-64 with bootstrapping GCC trunk and
> > > >> -mzero-caller-saved-regs=used as well as -mzero-caller-saved-regs=all
> > > >> enabled by default.
> > > >>
> > > >
> > > > from this description and the documentation it's
> > > > not clear to me what this tries to achieve.
> > > >
> > > > is it trying to prevent information leak?
> > > > or some pcs hack the caller may rely on?
> > > >
> > > > if it's for information leak then i'd expect such
> > > > attribute to be used on crypto code.. however i'd
> > > > expect crypto code to use simd registers as well,
> > > > so integer only cleaning needs explanation.
> > >
> > > The target usage is in Linux kernel.
> >
> > Maybe still somehow encode that in the option since it otherwise raises
> > expectations that are not met?
> > -mzero-call-clobbered-regs=used-int|all-int|skip|used-simd|used-fp,etc.?
> > and sorry() on unimplemented ones?  Or simply zero also non-integer
> > regs the same way?  I suppose
> > there isn't sth like vzeroupper that zeros all SIMD regs and completely?
> >
>
> Here is the updated patch to zero caller-saved vector registers.   I don't
> mind a different option name if it is preferred.  I may be able to create
> some generic utility functions which can be used by other backends.  But
> actual implementation must be target specific.
>
> Any comments?

PING.

https://gcc.gnu.org/ml/gcc-patches/2018-10/msg02079.html

-- 
H.J.