Re: [PATCH v5] : Add pragma GCC target("general-regs-only")
On Sun, Jul 18, 2021 at 3:46 AM H.J. Lu wrote: > > On Thu, Apr 22, 2021 at 7:30 AM Richard Biener via Gcc-patches > wrote: > > > > On Thu, Apr 22, 2021 at 2:52 PM Richard Biener > > wrote: > > > > > > On Thu, Apr 22, 2021 at 2:22 PM Jakub Jelinek wrote: > > > > > > > > On Thu, Apr 22, 2021 at 01:23:20PM +0200, Richard Biener via > > > > Gcc-patches wrote: > > > > > > The question is if the pragma GCC target right now behaves > > > > > > incrementally > > > > > > or not, whether > > > > > > #pragma GCC target("avx2") > > > > > > adds -mavx2 to options if it was missing before and nothing > > > > > > otherwise, or if > > > > > > it switches other options off. If it is incremental, we could e.g. > > > > > > try to > > > > > > use the second least significant bit of global_options_set.x_* to > > > > > > mean > > > > > > this option has been set explicitly by some surrounding #pragma GCC > > > > > > target. > > > > > > The normal tests - global_options_set.x_flag_whatever could still > > > > > > work > > > > > > fine because they wouldn't care if the option was explicit from > > > > > > anywhere > > > > > > (command line or GCC target or target attribute) and just & 2 would > > > > > > mean > > > > > > it was explicit from pragma GCC target; though there is the case of > > > > > > bitfields... And then the inlining decision could check the & 2 > > > > > > flags to > > > > > > see what is required and what is just from command line. > > > > > > Or we can have some other pragma GCC that would be like target but > > > > > > would > > > > > > have flags that are explicit (and could e.g. be more restricted, to > > > > > > ISA > > > > > > options only, and let those use in addition to #pragma GCC target. > > > > > > > > > > I'm still curious as to what you think will break if always-inline > > > > > does what > > > > > it is documented to do. > > > > > > > > We will silently accept calling intrinsics that must be used only in > > > > certain > > > > ISA contexts, which will lead to people writing non-portable code. > > > > > > > > So -O2 -mno-avx > > > > #include > > > > > > > > void > > > > foo (__m256 *x) > > > > { > > > > x[0] = _mm256_sub_ps (x[1], x[2]); > > > > } > > > > etc. will now be accepted when it shouldn't be. > > > > clang rejects it like gcc with: > > > > 1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target > > > > feature 'avx', but would be inlined into function 'foo' that is > > > > compiled without support for 'avx' > > > > x[0] = _mm256_sub_ps (x[1], x[2]); > > > > ^ > > > > > > > > Note, if I do: > > > > #include > > > > > > > > __attribute__((target ("no-sse3"))) void > > > > foo (__m256 *x) > > > > { > > > > x[0] = _mm256_sub_ps (x[1], x[2]); > > > > } > > > > and compile > > > > clang -S -O2 -mavx2 1.c > > > > 1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target > > > > feature 'avx', but would be inlined into function 'foo' that is > > > > compiled without support for 'avx' > > > > x[0] = _mm256_sub_ps (x[1], x[2]); > > > > ^ > > > > then from the error message it seems that unlike GCC, clang remembers > > > > the exact target features that are needed for the intrinsics and checks > > > > just > > > > those. > > > > Though, looking at the preprocessed source, seems it uses > > > > static __inline __m256 __attribute__((__always_inline__, __nodebug__, > > > > __target__("avx"), __min_vector_width__(256))) > > > > _mm256_sub_ps(__m256 __a, __m256 __b) > > > > { > > > > return (__m256)((__v8sf)__a-(__v8sf)__b); > > > > } > > > > and not target pragmas. > > > > > > > > Anyway, if we tweak our intrinsic headers so that > > > > -#ifndef __AVX__ > > > > #pragma GCC push_options > > > > #pragma GCC target("avx") > > > > -#define __DISABLE_AVX__ > > > > -#endif /* __AVX__ */ > > > > > > > > ... > > > > -#ifdef __DISABLE_AVX__ > > > > -#undef __DISABLE_AVX__ > > > > #pragma GCC pop_options > > > > -#endif /* __DISABLE_AVX__ */ > > > > and do the opts_set->x_* & 2 stuff on explicit options coming out of > > > > target/optimize pragmas and attributes, perhaps we don't even need > > > > to introduce a new attribute and can handle everything magically: > > > > Oh, and any such changes will likely interact with Martins ideas to rework > > how optimize and target attributes work (aka adding ontop of the > > commandline options). That is, attribute target will then not be enough > > to remember the exact set of needed ISA features (as opposed to what > > likely clang implements?) > > > > > > 1) if it is gnu_inline extern inline, allow indirect calls, otherwise > > > > disallow them for always_inline functions > > > > > > There are a lot of intrinsics using extern inline __gnu_inline though... > > > > > > > 2) for the isa flags and option mismatches, only disallow opts_set->x_* > > > > & 2 > > > > stuff > > > > This will keep both intrinsics and glibc fortify macros working fine > > > > in all the needed use
PING^1 [PATCH v5] : Add pragma GCC target("general-regs-only")
On Sat, Jul 17, 2021 at 6:45 PM H.J. Lu wrote: > > On Thu, Apr 22, 2021 at 7:30 AM Richard Biener via Gcc-patches > wrote: > > > > On Thu, Apr 22, 2021 at 2:52 PM Richard Biener > > wrote: > > > > > > On Thu, Apr 22, 2021 at 2:22 PM Jakub Jelinek wrote: > > > > > > > > On Thu, Apr 22, 2021 at 01:23:20PM +0200, Richard Biener via > > > > Gcc-patches wrote: > > > > > > The question is if the pragma GCC target right now behaves > > > > > > incrementally > > > > > > or not, whether > > > > > > #pragma GCC target("avx2") > > > > > > adds -mavx2 to options if it was missing before and nothing > > > > > > otherwise, or if > > > > > > it switches other options off. If it is incremental, we could e.g. > > > > > > try to > > > > > > use the second least significant bit of global_options_set.x_* to > > > > > > mean > > > > > > this option has been set explicitly by some surrounding #pragma GCC > > > > > > target. > > > > > > The normal tests - global_options_set.x_flag_whatever could still > > > > > > work > > > > > > fine because they wouldn't care if the option was explicit from > > > > > > anywhere > > > > > > (command line or GCC target or target attribute) and just & 2 would > > > > > > mean > > > > > > it was explicit from pragma GCC target; though there is the case of > > > > > > bitfields... And then the inlining decision could check the & 2 > > > > > > flags to > > > > > > see what is required and what is just from command line. > > > > > > Or we can have some other pragma GCC that would be like target but > > > > > > would > > > > > > have flags that are explicit (and could e.g. be more restricted, to > > > > > > ISA > > > > > > options only, and let those use in addition to #pragma GCC target. > > > > > > > > > > I'm still curious as to what you think will break if always-inline > > > > > does what > > > > > it is documented to do. > > > > > > > > We will silently accept calling intrinsics that must be used only in > > > > certain > > > > ISA contexts, which will lead to people writing non-portable code. > > > > > > > > So -O2 -mno-avx > > > > #include > > > > > > > > void > > > > foo (__m256 *x) > > > > { > > > > x[0] = _mm256_sub_ps (x[1], x[2]); > > > > } > > > > etc. will now be accepted when it shouldn't be. > > > > clang rejects it like gcc with: > > > > 1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target > > > > feature 'avx', but would be inlined into function 'foo' that is > > > > compiled without support for 'avx' > > > > x[0] = _mm256_sub_ps (x[1], x[2]); > > > > ^ > > > > > > > > Note, if I do: > > > > #include > > > > > > > > __attribute__((target ("no-sse3"))) void > > > > foo (__m256 *x) > > > > { > > > > x[0] = _mm256_sub_ps (x[1], x[2]); > > > > } > > > > and compile > > > > clang -S -O2 -mavx2 1.c > > > > 1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target > > > > feature 'avx', but would be inlined into function 'foo' that is > > > > compiled without support for 'avx' > > > > x[0] = _mm256_sub_ps (x[1], x[2]); > > > > ^ > > > > then from the error message it seems that unlike GCC, clang remembers > > > > the exact target features that are needed for the intrinsics and checks > > > > just > > > > those. > > > > Though, looking at the preprocessed source, seems it uses > > > > static __inline __m256 __attribute__((__always_inline__, __nodebug__, > > > > __target__("avx"), __min_vector_width__(256))) > > > > _mm256_sub_ps(__m256 __a, __m256 __b) > > > > { > > > > return (__m256)((__v8sf)__a-(__v8sf)__b); > > > > } > > > > and not target pragmas. > > > > > > > > Anyway, if we tweak our intrinsic headers so that > > > > -#ifndef __AVX__ > > > > #pragma GCC push_options > > > > #pragma GCC target("avx") > > > > -#define __DISABLE_AVX__ > > > > -#endif /* __AVX__ */ > > > > > > > > ... > > > > -#ifdef __DISABLE_AVX__ > > > > -#undef __DISABLE_AVX__ > > > > #pragma GCC pop_options > > > > -#endif /* __DISABLE_AVX__ */ > > > > and do the opts_set->x_* & 2 stuff on explicit options coming out of > > > > target/optimize pragmas and attributes, perhaps we don't even need > > > > to introduce a new attribute and can handle everything magically: > > > > Oh, and any such changes will likely interact with Martins ideas to rework > > how optimize and target attributes work (aka adding ontop of the > > commandline options). That is, attribute target will then not be enough > > to remember the exact set of needed ISA features (as opposed to what > > likely clang implements?) > > > > > > 1) if it is gnu_inline extern inline, allow indirect calls, otherwise > > > > disallow them for always_inline functions > > > > > > There are a lot of intrinsics using extern inline __gnu_inline though... > > > > > > > 2) for the isa flags and option mismatches, only disallow opts_set->x_* > > > > & 2 > > > > stuff > > > > This will keep both intrinsics and glibc fortify macros working fine > > > > in all the needed use
[PATCH v5] : Add pragma GCC target("general-regs-only")
new attribute and can handle everything magically: > > Oh, and any such changes will likely interact with Martins ideas to rework > how optimize and target attributes work (aka adding ontop of the > commandline options). That is, attribute target will then not be enough > to remember the exact set of needed ISA features (as opposed to what > likely clang implements?) > > > > 1) if it is gnu_inline extern inline, allow indirect calls, otherwise > > > disallow them for always_inline functions > > > > There are a lot of intrinsics using extern inline __gnu_inline though... > > > > > 2) for the isa flags and option mismatches, only disallow opts_set->x_* & > > > 2 > > > stuff > > > This will keep both intrinsics and glibc fortify macros working fine > > > in all the needed use cases. > > > > Yes, see my example in the other mail. > > > > I think before we add any new attributes we should sort out the > > current mess, eventually adding some testcases for desired > > diagnostic. > > > > Richard. > > > > > Jakub Here is the v5 patch: 1. Intrinsics in only require GPR ISAs. Add #if defined __MMX__ || defined __SSE__ #pragma GCC push_options #pragma GCC target("general-regs-only") #define __DISABLE_GENERAL_REGS_ONLY__ #endif and #ifdef __DISABLE_GENERAL_REGS_ONLY__ #undef __DISABLE_GENERAL_REGS_ONLY__ #pragma GCC pop_options #endif /* __DISABLE_GENERAL_REGS_ONLY__ */ to to disable non-GPR ISAs so that they can be used in functions with __attribute__ ((target("general-regs-only"))). 2. When checking always_inline attribute, if callee only uses GPRs, ignore MASK_80387 since enable MASK_80387 in caller has no impact on callee inline. OK for master? Thanks. -- H.J. From ac2c100dbaf838a378bd8b1209b57afc8a5c72fc Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Sat, 17 Jul 2021 07:44:45 -0700 Subject: [PATCH v5] : Add pragma GCC target("general-regs-only") 1. Intrinsics in only require GPR ISAs. Add #if defined __MMX__ || defined __SSE__ #pragma GCC push_options #pragma GCC target("general-regs-only") #define __DISABLE_GENERAL_REGS_ONLY__ #endif and #ifdef __DISABLE_GENERAL_REGS_ONLY__ #undef __DISABLE_GENERAL_REGS_ONLY__ #pragma GCC pop_options #endif /* __DISABLE_GENERAL_REGS_ONLY__ */ to to disable non-GPR ISAs so that they can be used in functions with __attribute__ ((target("general-regs-only"))). 2. When checking always_inline attribute, if callee only uses GPRs, ignore MASK_80387 since enable MASK_80387 in caller has no impact on callee inline. gcc/ PR target/99744 * config/i386/i386.c (ix86_can_inline_p): Ignore MASK_80387 if callee only uses GPRs. * config/i386/ia32intrin.h: Revert commit 5463cee2770. * config/i386/serializeintrin.h: Revert commit 71958f740f1. * config/i386/x86gprintrin.h: Add #pragma GCC target("general-regs-only") and #pragma GCC pop_options to disable non-GPR ISAs. gcc/testsuite/ PR target/99744 * gcc.target/i386/pr99744-3.c: New test. * gcc.target/i386/pr99744-4.c: Likewise. * gcc.target/i386/pr99744-5.c: Likewise. * gcc.target/i386/pr99744-6.c: Likewise. * gcc.target/i386/pr99744-7.c: Likewise. * gcc.target/i386/pr99744-8.c: Likewise. --- gcc/config/i386/i386.c| 6 +- gcc/config/i386/ia32intrin.h | 14 +- gcc/config/i386/serializeintrin.h | 7 +- gcc/config/i386/x86gprintrin.h| 11 + gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 + gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 ++ gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 ++ gcc/testsuite/gcc.target/i386/pr99744-6.c | 23 ++ gcc/testsuite/gcc.target/i386/pr99744-7.c | 14 + gcc/testsuite/gcc.target/i386/pr99744-8.c | 15 + 10 files changed, 481 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-6.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-7.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-8.c diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 9d74b7a191b..da067b1586f 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -554,7 +554,7 @@ ix86_can_inline_p (tree caller, tree callee) /* Changes of those flags can be tolerated for always inlines. Lets hope user knows what he is doing. */ - const unsigned HOST_WIDE_INT always_inline_safe_mask + unsigned HOST_WIDE_INT always_inline_safe_mask = (MASK_USE_8BIT_IDIV | MASK_ACCUMULATE_OUTGOING_ARGS | MASK_NO_ALIGN_STRINGOPS | MASK_AVX256_SPLIT_UNALIGNED_LOAD | MASK_AVX256_SPLIT_UNALIGNED_STORE | MA