Re: Intel AVX10.1 Compiler Design and Support

Richard Biener via Gcc-patches Mon, 21 Aug 2023 00:37:55 -0700

On Mon, Aug 21, 2023 at 3:20 AM Hongtao Liu via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Sun, Aug 20, 2023 at 6:44 AM ZiNgA BuRgA via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi,
> >
> > With the proposed design of these switches, how would I restrict AVX10.1
> > to particular AVX-512 subsets?
> We can't; avx10.1 is treated as an indivisible ISA which contains all
> AVX512-related instructions.
>
> > We’ve been taking these cases as bugs (but yes, intrinsics are still 
> > allowed, so in some cases it might prove difficult to guarantee this).
> Intel SDE supports an avx10.1-256 target, which can be used to validate the
> binary (i.e. to detect whether an invalid 512-bit vector register or 64-bit
> kmask register is used).
> > I don’t see any other way of doing what you want within the constraints of 
> > this design.
> It looks like the requirement is that we want a
> -mavx10-vector-width=256 (or maybe reuse -mprefer-vector-width=256)
> option that acts on the original -mavx512XXX options to produce an
> avx10.1-256 compatible binary. We can't use -mavx10.1-256 since it may
> include avx512fp16 instructions and thus not be backward compatible with
> SKX/CLX/ICX.


Yes.  Note we cannot really re-purpose -mprefer-vector-width=256 since that
would also make uses of 512-bit intrinsics ill-formed.  So we'd need a new
flag that would restrict AVX512VL to 256 bits, possibly using a common internal
flag for this and for the vector size effect of -mavx10.1-256.

Maybe -mdisable-vector-width-512 or -mavx512vl-for-avx10.1-256 or
-mavx512vl-256?  Having written these out, the last looks most sensible to me.
Note that combining it with -mavx512vl should result in -mavx512vl-256, so
that -march=native -mavx512vl-256 works (I think we should also allow the
flag together with -mavx10.1*?)

mavx512vl-256
Target ...
Disable the 512-bit vector ISA subset of AVX512 or AVX10, enable
the 256-bit vector ISA subset of AVX512.
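
To make the use case concrete, here is a minimal sketch (not part of any
patch) of the kind of code the flag is meant for: a 256-bit
_mm256_rol_epi32 kernel that should run on both AVX512VL-without-AVX10
parts and AVX10.1/256 parts.  Since -mavx512vl-256 is only a proposal in
this thread, the sketch assumes the existing target("avx512f,avx512vl")
function attribute and a __builtin_cpu_supports runtime check instead;
the names rol5_scalar, rol5_avx512vl_256 and rol5_x8 are made up for
illustration.

```c
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_X86_SIMD 1
#endif

/* Scalar reference: rotate each 32-bit element left by 5 bits. */
static void rol5_scalar(const uint32_t *in, uint32_t *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = (in[i] << 5) | (in[i] >> 27);
}

#ifdef HAVE_X86_SIMD
/* 256-bit path.  The intrinsic needs AVX512F+AVX512VL encodings but only
 * touches ymm registers.  Note that today nothing in this attribute stops
 * the compiler from using zmm registers elsewhere in such a function;
 * that guarantee is what the proposed flag would add. */
__attribute__((target("avx512f,avx512vl")))
static void rol5_avx512vl_256(const uint32_t *in, uint32_t *out)
{
    __m256i v = _mm256_loadu_si256((const __m256i *)in);
    v = _mm256_rol_epi32(v, 5);
    _mm256_storeu_si256((__m256i *)out, v);
}
#endif

/* Rotate 8 elements.  Returns 1 if the vector path ran, 0 if the scalar
 * fallback was used. */
static int rol5_x8(const uint32_t *in, uint32_t *out)
{
#ifdef HAVE_X86_SIMD
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f") &&
        __builtin_cpu_supports("avx512vl")) {
        rol5_avx512vl_256(in, out);
        return 1;
    }
#endif
    rol5_scalar(in, out, 8);
    return 0;
}
```

With the proposed flag the attribute and the runtime check would not be
needed for this purpose; the whole translation unit could be built with
-mavx512vl -mavx512vl-256 and validated with SDE as Hongtao describes.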

Richard.

> >
> > For example, usage of the |_mm256_rol_epi32| intrinsic should be
> > compatible with any AVX10/256 implementation, /as well as/ any AVX-512VL
> > without AVX10 implementation (e.g. Skylake-X).  But how do I signal that
> > I want compatibility with both these targets?
> >
> >   * |-mavx512vl| lets the compiler use 512-bit registers -> incompatible
> >     with 256-bit AVX10.
> >   * |-mavx512vl -mprefer-vector-width=256| might steer the compiler away
> >     from 512-bit registers, but I don't think it guarantees it.
> >   * |-mavx10.1-256| lets the compiler use all Sapphire Rapids AVX-512
> >     features at 256-bit wide (so in theory, it could choose to compile
> >     it with |vpshldd|) -> incompatible with Skylake-X.
> >   * |-mavx10.1-256 -mno-avx512fp16 -mno-avx512...| will emit a warning
> >     and ignore the attempts at disabling AVX-512 subsets.
> >   * |-mavx10.1-256 -mavx512vl| takes the /union/ of the features, not
> >     the /intersection/.
> >
> > Is there something like |-mavx512vl -mmax-vector-width=256|, or am I
> > misunderstanding the situation?
> >
> > Thanks!
>
>
>
> --
> BR,
> Hongtao
