On Thu, Apr 22, 2021 at 01:23:20PM +0200, Richard Biener via Gcc-patches wrote:
> > The question is if the pragma GCC target right now behaves incrementally
> > or not, whether
> > #pragma GCC target("avx2")
> > adds -mavx2 to options if it was missing before and nothing otherwise, or if
> > it switches other options off.  If it is incremental, we could e.g. try to
> > use the second least significant bit of global_options_set.x_* to mean
> > this option has been set explicitly by some surrounding #pragma GCC target.
> > The normal tests - global_options_set.x_flag_whatever could still work
> > fine because they wouldn't care if the option was explicit from anywhere
> > (command line or GCC target or target attribute) and just & 2 would mean
> > it was explicit from pragma GCC target; though there is the case of
> > bitfields... And then the inlining decision could check the & 2 flags to
> > see what is required and what is just from command line.
> > Or we can have some other pragma GCC that would be like target but would
> > have flags that are explicit (and could e.g. be more restricted, to ISA
> > options only, and let those use in addition to #pragma GCC target.
> 
> I'm still curious as to what you think will break if always-inline does what
> it is documented to do.

We will silently accept calling intrinsics that must be used only in certain
ISA contexts, which will lead to people writing non-portable code.

So -O2 -mno-avx
#include <x86intrin.h>

void
foo (__m256 *x)
{
  x[0] = _mm256_sub_ps (x[1], x[2]);
}
etc. will now be accepted when it shouldn't be.
clang rejects it like gcc with:
1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target feature 
'avx', but would be inlined into function 'foo' that is compiled without 
support for 'avx'
  x[0] = _mm256_sub_ps (x[1], x[2]);
         ^

Note, if I do:
#include <x86intrin.h>

__attribute__((target ("no-sse3"))) void
foo (__m256 *x)
{
  x[0] = _mm256_sub_ps (x[1], x[2]);
}
and compile
clang -S -O2 -mavx2 1.c
1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target feature 
'avx', but would be inlined into function 'foo' that is compiled without 
support for 'avx'
  x[0] = _mm256_sub_ps (x[1], x[2]);
         ^
then from the error message it seems that unlike GCC, clang remembers
the exact target features that are needed for the intrinsics and checks just
those.
Though, looking at the preprocessed source, seems it uses
static __inline __m256 __attribute__((__always_inline__, __nodebug__, 
__target__("avx"), __min_vector_width__(256)))
_mm256_sub_ps(__m256 __a, __m256 __b)
{
  return (__m256)((__v8sf)__a-(__v8sf)__b);
}
and not target pragmas.

Anyway, if we tweak our intrinsic headers so that
-#ifndef __AVX__
 #pragma GCC push_options
 #pragma GCC target("avx")
-#define __DISABLE_AVX__
-#endif /* __AVX__ */

...
-#ifdef __DISABLE_AVX__
-#undef __DISABLE_AVX__
 #pragma GCC pop_options
-#endif /* __DISABLE_AVX__ */
and do the opts_set->x_* & 2 stuff on explicit options coming out of
target/optimize pragmas and attributes, perhaps we don't even need
to introduce a new attribute and can handle everything magically:

1) if it is gnu_inline extern inline, allow indirect calls, otherwise
disallow them for always_inline functions
2) for the isa flags and option mismatches, only disallow opts_set->x_* & 2
stuff
This will keep both intrinsics and glibc fortify macros working fine
in all the needed use cases.

        Jakub

Reply via email to