Re: PING [PATCH] x86: Add ix86_ifunc_ref_local_ok
On Wed, Jul 27, 2022 at 4:47 PM H.J. Lu wrote: > > On Thu, Jul 21, 2022 at 11:53 AM H.J. Lu wrote: > > > > We can't always use the PLT entry as the function address for local IFUNC > > functions. When the PIC register is needed for PLT call, indirect call > > via the PLT entry will fail since the PIC register may not be set up > > properly for indirect call. Add ix86_ifunc_ref_local_ok to return false > > when the PLT entry can't be used as local IFUNC function pointers. > > > > gcc/ > > > > PR target/83782 > > * config/i386/i386.cc (ix86_ifunc_ref_local_ok): New. > > (TARGET_IFUNC_REF_LOCAL_OK): Use it. > > > > gcc/testsuite/ > > > > PR target/83782 > > * gcc.target/i386/pr83782-1.c: Require non-ia32. > > * gcc.target/i386/pr83782-2.c: Likewise. > > * gcc.target/i386/pr83782-3.c: New test. You are the expert in this area, I'll blindly rubber-stamp OK. Thanks, Uros. > > --- > > gcc/config/i386/i386.cc | 15 ++- > > gcc/testsuite/gcc.target/i386/pr83782-1.c | 8 +++--- > > gcc/testsuite/gcc.target/i386/pr83782-2.c | 4 +-- > > gcc/testsuite/gcc.target/i386/pr83782-3.c | 32 +++ > > 4 files changed, 50 insertions(+), 9 deletions(-) > > create mode 100644 gcc/testsuite/gcc.target/i386/pr83782-3.c > > > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > > index e03f86d4a23..5e30dc884bf 100644 > > --- a/gcc/config/i386/i386.cc > > +++ b/gcc/config/i386/i386.cc > > @@ -16070,6 +16070,19 @@ ix86_call_use_plt_p (rtx call_op) > >return true; > > } > > > > +/* Implement TARGET_IFUNC_REF_LOCAL_OK. If this hook returns true, > > + the PLT entry will be used as the function address for local IFUNC > > + functions. When the PIC register is needed for PLT call, indirect > > + call via the PLT entry will fail since the PIC register may not be > > + set up properly for indirect call. In this case, we should return > > + false. */ > > + > > +static bool > > +ix86_ifunc_ref_local_ok (void) > > +{ > > + return !flag_pic || (TARGET_64BIT && ix86_cmodel != CM_LARGE_PIC); > > +} > > + > > /* Return true if the function being called was marked with attribute > > "noplt" or using -fno-plt and we are compiling for non-PIC. We need > > to handle the non-PIC case in the backend because there is no easy > > @@ -24953,7 +24966,7 @@ ix86_libgcc_floating_mode_supported_p > >ix86_get_multilib_abi_name > > > > #undef TARGET_IFUNC_REF_LOCAL_OK > > -#define TARGET_IFUNC_REF_LOCAL_OK hook_bool_void_true > > +#define TARGET_IFUNC_REF_LOCAL_OK ix86_ifunc_ref_local_ok > > > > #if !TARGET_MACHO && !TARGET_DLLIMPORT_DECL_ATTRIBUTES > > # undef TARGET_ASM_RELOC_RW_MASK > > diff --git a/gcc/testsuite/gcc.target/i386/pr83782-1.c > > b/gcc/testsuite/gcc.target/i386/pr83782-1.c > > index ce97b12e65d..85674346aec 100644 > > --- a/gcc/testsuite/gcc.target/i386/pr83782-1.c > > +++ b/gcc/testsuite/gcc.target/i386/pr83782-1.c > > @@ -1,4 +1,4 @@ > > -/* { dg-do compile } */ > > +/* { dg-do compile { target { ! ia32 } } } */ > > /* { dg-require-ifunc "" } */ > > /* { dg-options "-O2 -fpic" } */ > > > > @@ -20,7 +20,5 @@ bar(void) > >return foo; > > } > > > > -/* { dg-final { scan-assembler {leal[ \t]foo@GOTOFF\(%[^,]*\),[ \t]%eax} { > > target ia32 } } } */ > > -/* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ > > \t]%(?:e|r)ax} { target { ! ia32 } } } } */ > > -/* { dg-final { scan-assembler-not "foo@GOT\\\(" { target ia32 } } } */ > > -/* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" { target { ! ia32 } > > } } } */ > > +/* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ > > \t]%(?:e|r)ax} } } */ > > +/* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" } } */ > > diff --git a/gcc/testsuite/gcc.target/i386/pr83782-2.c > > b/gcc/testsuite/gcc.target/i386/pr83782-2.c > > index e25d258bbda..a654ded771f 100644 > > --- a/gcc/testsuite/gcc.target/i386/pr83782-2.c > > +++ b/gcc/testsuite/gcc.target/i386/pr83782-2.c > > @@ -1,4 +1,4 @@ > > -/* { dg-do compile } */ > > +/* { dg-do compile { target { ! ia32 } } } */ > > /* { dg-require-ifunc "" } */ > > /* { dg-options "-O2 -fpic" } */ > > > > @@ -20,7 +20,5 @@ bar(void) > >return foo; > > } > > > > -/* { dg-final { scan-assembler {leal[ \t]foo@GOTOFF\(%[^,]*\),[ \t]%eax} { > > target ia32 } } } */ > > /* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ > > \t]%(?:e|r)ax} { target { ! ia32 } } } } */ > > -/* { dg-final { scan-assembler-not "foo@GOT\\\(" { target ia32 } } } */ > > /* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" { target { ! ia32 } > > } } } */ > > diff --git a/gcc/testsuite/gcc.target/i386/pr83782-3.c > > b/gcc/testsuite/gcc.target/i386/pr83782-3.c > > new file mode 100644 > > index 000..1536481cb79 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/pr83782-3.c > > @@ -0,0 +1,32 @@ > > +/* { dg-do run } */ > > +/* { dg-require-ifunc "" } */ > > +/* {
PING [PATCH] x86: Add ix86_ifunc_ref_local_ok
On Thu, Jul 21, 2022 at 11:53 AM H.J. Lu wrote: > > We can't always use the PLT entry as the function address for local IFUNC > functions. When the PIC register is needed for PLT call, indirect call > via the PLT entry will fail since the PIC register may not be set up > properly for indirect call. Add ix86_ifunc_ref_local_ok to return false > when the PLT entry can't be used as local IFUNC function pointers. > > gcc/ > > PR target/83782 > * config/i386/i386.cc (ix86_ifunc_ref_local_ok): New. > (TARGET_IFUNC_REF_LOCAL_OK): Use it. > > gcc/testsuite/ > > PR target/83782 > * gcc.target/i386/pr83782-1.c: Require non-ia32. > * gcc.target/i386/pr83782-2.c: Likewise. > * gcc.target/i386/pr83782-3.c: New test. > --- > gcc/config/i386/i386.cc | 15 ++- > gcc/testsuite/gcc.target/i386/pr83782-1.c | 8 +++--- > gcc/testsuite/gcc.target/i386/pr83782-2.c | 4 +-- > gcc/testsuite/gcc.target/i386/pr83782-3.c | 32 +++ > 4 files changed, 50 insertions(+), 9 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr83782-3.c > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > index e03f86d4a23..5e30dc884bf 100644 > --- a/gcc/config/i386/i386.cc > +++ b/gcc/config/i386/i386.cc > @@ -16070,6 +16070,19 @@ ix86_call_use_plt_p (rtx call_op) >return true; > } > > +/* Implement TARGET_IFUNC_REF_LOCAL_OK. If this hook returns true, > + the PLT entry will be used as the function address for local IFUNC > + functions. When the PIC register is needed for PLT call, indirect > + call via the PLT entry will fail since the PIC register may not be > + set up properly for indirect call. In this case, we should return > + false. */ > + > +static bool > +ix86_ifunc_ref_local_ok (void) > +{ > + return !flag_pic || (TARGET_64BIT && ix86_cmodel != CM_LARGE_PIC); > +} > + > /* Return true if the function being called was marked with attribute > "noplt" or using -fno-plt and we are compiling for non-PIC. We need > to handle the non-PIC case in the backend because there is no easy > @@ -24953,7 +24966,7 @@ ix86_libgcc_floating_mode_supported_p >ix86_get_multilib_abi_name > > #undef TARGET_IFUNC_REF_LOCAL_OK > -#define TARGET_IFUNC_REF_LOCAL_OK hook_bool_void_true > +#define TARGET_IFUNC_REF_LOCAL_OK ix86_ifunc_ref_local_ok > > #if !TARGET_MACHO && !TARGET_DLLIMPORT_DECL_ATTRIBUTES > # undef TARGET_ASM_RELOC_RW_MASK > diff --git a/gcc/testsuite/gcc.target/i386/pr83782-1.c > b/gcc/testsuite/gcc.target/i386/pr83782-1.c > index ce97b12e65d..85674346aec 100644 > --- a/gcc/testsuite/gcc.target/i386/pr83782-1.c > +++ b/gcc/testsuite/gcc.target/i386/pr83782-1.c > @@ -1,4 +1,4 @@ > -/* { dg-do compile } */ > +/* { dg-do compile { target { ! ia32 } } } */ > /* { dg-require-ifunc "" } */ > /* { dg-options "-O2 -fpic" } */ > > @@ -20,7 +20,5 @@ bar(void) >return foo; > } > > -/* { dg-final { scan-assembler {leal[ \t]foo@GOTOFF\(%[^,]*\),[ \t]%eax} { > target ia32 } } } */ > -/* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ \t]%(?:e|r)ax} > { target { ! ia32 } } } } */ > -/* { dg-final { scan-assembler-not "foo@GOT\\\(" { target ia32 } } } */ > -/* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" { target { ! ia32 } } > } } */ > +/* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ \t]%(?:e|r)ax} > } } */ > +/* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" } } */ > diff --git a/gcc/testsuite/gcc.target/i386/pr83782-2.c > b/gcc/testsuite/gcc.target/i386/pr83782-2.c > index e25d258bbda..a654ded771f 100644 > --- a/gcc/testsuite/gcc.target/i386/pr83782-2.c > +++ b/gcc/testsuite/gcc.target/i386/pr83782-2.c > @@ -1,4 +1,4 @@ > -/* { dg-do compile } */ > +/* { dg-do compile { target { ! ia32 } } } */ > /* { dg-require-ifunc "" } */ > /* { dg-options "-O2 -fpic" } */ > > @@ -20,7 +20,5 @@ bar(void) >return foo; > } > > -/* { dg-final { scan-assembler {leal[ \t]foo@GOTOFF\(%[^,]*\),[ \t]%eax} { > target ia32 } } } */ > /* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ \t]%(?:e|r)ax} > { target { ! ia32 } } } } */ > -/* { dg-final { scan-assembler-not "foo@GOT\\\(" { target ia32 } } } */ > /* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" { target { ! ia32 } } > } } */ > diff --git a/gcc/testsuite/gcc.target/i386/pr83782-3.c > b/gcc/testsuite/gcc.target/i386/pr83782-3.c > new file mode 100644 > index 000..1536481cb79 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr83782-3.c > @@ -0,0 +1,32 @@ > +/* { dg-do run } */ > +/* { dg-require-ifunc "" } */ > +/* { dg-require-effective-target pie } */ > +/* { dg-options "-fpie -pie" } */ > + > +#include > + > +static int __attribute__((noinline)) > +implementation (void) > +{ > + printf ("'ere I am JH\n"); > + return 0; > +} > + > +static __typeof__ (implementation) *resolver (void) > +{ > + return (void *)implementation; > +} > + > +extern int magic (void) __attribute__
PING: [PATCH] x86: Add
On Wed, Sep 23, 2020 at 10:58 AM H.J. Lu wrote: > > For sources which can't use any vector instructions, and > cannot be included for compiler intrinsics: > > $ echo "#include " | gcc -S -O2 -mno-sse -mno-mmx -x c - > In file included from /usr/include/stdlib.h:1013, > from > /usr/lib/gcc/x86_64-redhat-linux/10/include/mm_malloc.h:27, > from > /usr/lib/gcc/x86_64-redhat-linux/10/include/xmmintrin.h:34, > from > /usr/lib/gcc/x86_64-redhat-linux/10/include/immintrin.h:29, > from > /usr/lib/gcc/x86_64-redhat-linux/10/include/x86intrin.h:32, > from :1: > /usr/include/bits/stdlib-float.h: In function ‘atof’: > /usr/include/bits/stdlib-float.h:26:1: error: SSE register return with SSE > disabled >26 | { > | ^ > $ > > libgcc/config/i386/shadow-stack-unwind.h has a workaround: > > /* NB: We need _get_ssp and _inc_ssp from . But we can't >include which ends up including , which >includes and unconditionally. But we can't >include any libc system headers unconditionally from libgcc. Avoid >including here by defining _IMMINTRIN_H_INCLUDED. */ > #define _IMMINTRIN_H_INCLUDED > #include > #undef _IMMINTRIN_H_INCLUDED > > Add a standalone intrinsic header file, , to provide > integer only intrinsics. All integer only intrinsics are placed in > . and simply include > . > > Add the FSF copyright to , and > . > > gcc/ > > PR target/97148 > * config.gcc (extra_headers): Add x86gprintrin.h. > * config/i386/adxintrin.h: Check _X86GPRINTRIN_H_INCLUDED for > . > * config/i386/bmi2intrin.h: Likewise. > * config/i386/bmiintrin.h: Likewise. > * config/i386/cetintrin.h: Likewise. > * config/i386/cldemoteintrin.h: Likewise. > * config/i386/clflushoptintrin.h: Likewise. > * config/i386/clwbintrin.h: Likewise. > * config/i386/fxsrintrin.h: Likewise. > * config/i386/ia32intrin.h: Likewise. > * config/i386/lwpintrin.h: Likewise. > * config/i386/lzcntintrin.h: Likewise. > * config/i386/movdirintrin.h: Likewise. > * config/i386/pkuintrin.h: Likewise. > * config/i386/rdseedintrin.h: Likewise. > * config/i386/rtmintrin.h: Likewise. > * config/i386/serializeintrin.h: Likewise. > * config/i386/tbmintrin.h: Likewise. > * config/i386/waitpkgintrin.h: Likewise. > * config/i386/xsavecintrin.h: Likewise. > * config/i386/xsaveintrin.h: Likewise. > * config/i386/xsaveoptintrin.h: Likewise. > * config/i386/xsavesintrin.h: Likewise. > * config/i386/xtestintrin.h: Likewise. > * config/i386/enqcmdintrin.h: Check _X86GPRINTRIN_H_INCLUDED for > . Replace with > in the error message. > * config/i386/immintrin.h: Include instead of > , , , > , , , > , , , , > , , > , , , > , , , > , , and > . > (_wbinvd): Moved to config/i386/x86gprintrin.h. > (_rdrand16_step): Likewise. > (_rdrand32_step): Likewise. > (_rdpid_u32): Likewise. > (_readfsbase_u32): Likewise. > (_readfsbase_u64): Likewise. > (_readgsbase_u32): Likewise. > (_readgsbase_u64): Likewise. > (_writefsbase_u32): Likewise. > (_writefsbase_u64): Likewise. > (_writegsbase_u32): Likewise. > (_writegsbase_u64): Likewise. > (_rdrand64_step): Likewise. > (_ptwrite64): Likewise. > (_ptwrite32): Likewise. > * config/i386/x86gprintrin.h: New file. > * config/i386/pconfigintrin.h: Add the FSF copyright. Check > _X86GPRINTRIN_H_INCLUDED for . > * config/i386/tsxldtrkintrin.h: Likewise. > * config/i386/wbnoinvdintrin.h: Likewise. > * config/i386/x86intrin.h: Include . Don't > include , , , > , and . > > gcc/testsuite/ > > * gcc.target/i386/avx-1.c (__builtin_ia32_lwpval32): New to > support included in . > (__builtin_ia32_lwpval64): Likewise. > (__builtin_ia32_lwpins32): Likewise. > (__builtin_ia32_lwpins64): Likewise. > (__builtin_ia32_bextri_u32): New to support > included in . > (__builtin_ia32_bextri_u64): Likewise. > * gcc.target/i386/x86gprintrin-1.c: New test. > * gcc.target/i386/x86gprintrin-2.c: Likewise. > * gcc.target/i386/x86gprintrin-3.c: Likewise. > * gcc.target/i386/x86gprintrin-4.c: Likewise. > * gcc.target/i386/x86gprintrin-4a.c: Likewise. > * gcc.target/i386/x86gprintrin-5.c: Likewise. > * gcc.target/i386/x86gprintrin-5a.c: Likewise. > * gcc.target/i386/x86gprintrin-5b.c: Likewise. > * gcc.target/i386/x86gprintrin-6.c: Likewise. > > libgcc/ > > PR target/97148 > * config/i386/shadow-stack-unwind.h: Include >
PING [PATCH] x86: Add cmpmemsi for -minline-all-stringops
On Tue, May 19, 2020 at 5:14 AM H.J. Lu wrote: > > On Tue, May 19, 2020 at 1:48 AM Uros Bizjak wrote: > > > > On Sun, May 17, 2020 at 7:06 PM H.J. Lu wrote: > > > > > > Duplicate the cmpstrn pattern for cmpmem. The only difference is that > > > the length argument of cmpmem is guaranteed to be less than or equal to > > > lengths of 2 memory areas. Since "repz cmpsb" can be much slower than > > > memcmp function implemented with vector instruction, see > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052 > > > > > > expand cmpmem to "repz cmpsb" only with -mgeneral-regs-only. > > > > If there is no benefit compared to the library implementation, then > > enable these patterns only when -minline-all-stringops is used. > > Fixed. > > > Eventually these should be reimplemented with SSE4 string instructions. > > > > Honza is the author of the block handling x86 system, I'll leave the > > review to him. > > We used to expand memcmp to "repz cmpsb" via cmpstrnsi. It was changed > by > > commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f > Author: Nick Clifton > Date: Fri Aug 12 16:26:11 2011 + > > builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern. > > * builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi > pattern. > * doc/md.texi (cmpstrn): Note that the comparison stops if both > fetched bytes are zero. > (cmpstr): Likewise. > (cmpmem): Note that the comparison does not stop if both of the > fetched bytes are zero. > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95151 > > is a regression. > > Honza, can you take a look at this? > PING: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546921.html -- H.J.
PING: [PATCH] x86: Add UNSPECV_PATCHABLE_AREA
On Sat, May 2, 2020 at 4:55 AM H.J. Lu wrote: > > Currently patchable area is at the wrong place. It is placed immediately > after function label, before both .cfi_startproc and ENDBR. This patch > adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and > changes ENDBR insertion pass to also insert patchable area instruction. > TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY is defined to avoid placing > patchable area before .cfi_startproc and ENDBR. > > OK for master? > > Thanks. > > H.J. > --- > gcc/ > > PR target/93492 > * config/i386/i386-features.c (rest_of_insert_endbranch): > Renamed to ... > (rest_of_insert_endbr_and_patchable_area): Change return type > to void. Add need_endbr and patchable_area_size arguments. > Don't call timevar_push nor timevar_pop. Replace > endbr_queued_at_entrance with insn_queued_at_entrance. Insert > UNSPECV_PATCHABLE_AREA for patchable area. > (pass_data_insert_endbranch): Renamed to ... > (pass_data_insert_endbr_and_patchable_area): This. Change > pass name to endbr_and_patchable_area. > (pass_insert_endbranch): Renamed to ... > (pass_insert_endbr_and_patchable_area): This. Add need_endbr > and patchable_area_size;. > (pass_insert_endbr_and_patchable_area::gate): Set and check > need_endbr and patchable_area_size. > (pass_insert_endbr_and_patchable_area::execute): Call > timevar_push and timevar_pop. Pass need_endbr and > patchable_area_size to rest_of_insert_endbr_and_patchable_area. > (make_pass_insert_endbranch): Renamed to ... > (make_pass_insert_endbr_and_patchable_area): This. > * config/i386/i386-passes.def: Replace pass_insert_endbranch > with pass_insert_endbr_and_patchable_area. > * config/i386/i386-protos.h (ix86_output_patchable_area): New. > (make_pass_insert_endbranch): Renamed to ... > (make_pass_insert_endbr_and_patchable_area): This. > * config/i386/i386.c (ix86_asm_output_function_label): Set > function_label_emitted to true. > (ix86_print_patchable_function_entry): New function. > (ix86_output_patchable_area): Likewise. > (x86_function_profiler): Replace endbr_queued_at_entrance with > insn_queued_at_entrance. Generate ENDBR only for TYPE_ENDBR. > Call ix86_output_patchable_area to generate patchable area if > needed. > (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): New. > * i386.h (queued_insn_type): New. > (machine_function): Add function_label_emitted. Replace > endbr_queued_at_entrance with insn_queued_at_entrance. > * config/i386/i386.md (UNSPECV_PATCHABLE_AREA): New. > (patchable_area): New. > > gcc/testsuite/ > > PR target/93492 > * gcc.target/i386/pr93492-1.c: New test. > * gcc.target/i386/pr93492-2.c: Likewise. > * gcc.target/i386/pr93492-3.c: Likewise. > * gcc.target/i386/pr93492-4.c: Likewise. > * gcc.target/i386/pr93492-5.c: Likewise. PING: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545021.html -- H.J.
PING [PATCH] x86: Add -mzero-caller-saved-regs=[skip|used-gpr|all-gpr|used|all]
On Wed, Oct 31, 2018 at 12:42 PM H.J. Lu wrote: > > On Thu, Sep 27, 2018 at 7:58 AM Richard Biener > wrote: > > > > On Thu, Sep 27, 2018 at 3:16 PM H.J. Lu wrote: > > > > > > On Thu, Sep 27, 2018 at 6:08 AM, Szabolcs Nagy > > > wrote: > > > > On 26/09/18 19:10, H.J. Lu wrote: > > > >> > > > >> Add -mzero-caller-saved-regs=[skip|used|all] command-line option and > > > >> zero_caller_saved_regs("skip|used|all") function attribue: > > > >> > > > >> 1. -mzero-caller-saved-regs=skip and zero_caller_saved_regs("skip") > > > >> > > > >> Don't zero caller-saved integer registers upon function return. > > > >> > > > >> 2. -mzero-caller-saved-regs=used and zero_caller_saved_regs("used") > > > >> > > > >> Zero used caller-saved integer registers upon function return. > > > >> > > > >> 3. -mzero-caller-saved-regs=all and zero_caller_saved_regs("all") > > > >> > > > >> Zero all caller-saved integer registers upon function return. > > > >> > > > >> Tested on i686 and x86-64 with bootstrapping GCC trunk and > > > >> -mzero-caller-saved-regs=used as well as -mzero-caller-saved-regs=all > > > >> enabled by default. > > > >> > > > > > > > > from this description and the documentation it's > > > > not clear to me what this tries to achieve. > > > > > > > > is it trying to prevent information leak? > > > > or some pcs hack the caller may rely on? > > > > > > > > if it's for information leak then i'd expect such > > > > attribute to be used on crypto code.. however i'd > > > > expect crypto code to use simd registers as well, > > > > so integer only cleaning needs explanation. > > > > > > The target usage is in Linux kernel. > > > > Maybe still somehow encode that in the option since it otherwise raises > > expectations that are not met? > > -mzero-call-clobbered-regs=used-int|all-int|skip|used-simd|used-fp,etc.? > > and sorry() on unimplemented ones? Or simply zero also non-integer > > regs the same way? I suppose > > there isn't sth like vzeroupper that zeros all SIMD regs and completely? > > > > Here is the updated patch to zero caller-saved vector registers. I don't > mind a different option name if it is preferred. I may be able to create > some generic utility functions which can be used by other backends. But > actual implementation must be target specific. > > Any comments? PING. https://gcc.gnu.org/ml/gcc-patches/2018-10/msg02079.html -- H.J.