On Thu, Apr 22, 2021 at 01:23:20PM +0200, Richard Biener via Gcc-patches wrote: > > The question is if the pragma GCC target right now behaves incrementally > > or not, whether > > #pragma GCC target("avx2") > > adds -mavx2 to options if it was missing before and nothing otherwise, or if > > it switches other options off. If it is incremental, we could e.g. try to > > use the second least significant bit of global_options_set.x_* to mean > > this option has been set explicitly by some surrounding #pragma GCC target. > > The normal tests - global_options_set.x_flag_whatever could still work > > fine because they wouldn't care if the option was explicit from anywhere > > (command line or GCC target or target attribute) and just & 2 would mean > > it was explicit from pragma GCC target; though there is the case of > > bitfields... And then the inlining decision could check the & 2 flags to > > see what is required and what is just from command line. > > Or we can have some other pragma GCC that would be like target but would > > have flags that are explicit (and could e.g. be more restricted, to ISA > > options only, and let those use in addition to #pragma GCC target. > > I'm still curious as to what you think will break if always-inline does what > it is documented to do.
We will silently accept calling intrinsics that must be used only in certain ISA contexts, which will lead to people writing non-portable code. So -O2 -mno-avx #include <x86intrin.h> void foo (__m256 *x) { x[0] = _mm256_sub_ps (x[1], x[2]); } etc. will now be accepted when it shouldn't be. clang rejects it like gcc with: 1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target feature 'avx', but would be inlined into function 'foo' that is compiled without support for 'avx' x[0] = _mm256_sub_ps (x[1], x[2]); ^ Note, if I do: #include <x86intrin.h> __attribute__((target ("no-sse3"))) void foo (__m256 *x) { x[0] = _mm256_sub_ps (x[1], x[2]); } and compile clang -S -O2 -mavx2 1.c 1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target feature 'avx', but would be inlined into function 'foo' that is compiled without support for 'avx' x[0] = _mm256_sub_ps (x[1], x[2]); ^ then from the error message it seems that unlike GCC, clang remembers the exact target features that are needed for the intrinsics and checks just those. Though, looking at the preprocessed source, seems it uses static __inline __m256 __attribute__((__always_inline__, __nodebug__, __target__("avx"), __min_vector_width__(256))) _mm256_sub_ps(__m256 __a, __m256 __b) { return (__m256)((__v8sf)__a-(__v8sf)__b); } and not target pragmas. Anyway, if we tweak our intrinsic headers so that -#ifndef __AVX__ #pragma GCC push_options #pragma GCC target("avx") -#define __DISABLE_AVX__ -#endif /* __AVX__ */ ... -#ifdef __DISABLE_AVX__ -#undef __DISABLE_AVX__ #pragma GCC pop_options -#endif /* __DISABLE_AVX__ */ and do the opts_set->x_* & 2 stuff on explicit options coming out of target/optimize pragmas and attributes, perhaps we don't even need to introduce a new attribute and can handle everything magically: 1) if it is gnu_inline extern inline, allow indirect calls, otherwise disallow them for always_inline functions 2) for the isa flags and option mismatches, only disallow opts_set->x_* & 2 stuff This will keep both intrinsics and glibc fortify macros working fine in all the needed use cases. Jakub