On Sun, Aug 23, 2020 at 9:03 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Sun, Aug 23, 2020 at 5:23 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > On Sun, Aug 23, 2020 at 10:18:28AM +0200, Uros Bizjak wrote:
> > > On Sat, Aug 22, 2020 at 9:09 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > > > > Compile CPUID check with "-mno-sse -mfpmath=387" to disable SSE, 
> > > > > > AVX and
> > > > > > AVX512 during CPUID check to avoid vector and mask register 
> > > > > > operations.
> > > > >
> > > > > -mgeneral-regs-only ?
> > > > >
> > > >
> > > > Here is a patch to add target("general-regs-only") function
> > > > attribute and use it for CPUID check.   OK for master if there
> > > > are no regressions?
> > >
> > > Please test it first, then ask for an approval.
> > >
> > > Please submit the general-regs-only part as an independent patch. (I
> > > think this is the option linux should use for compilation).
> > >
> > > OTOH, wrapping CPUID check in a target attribute is a bad idea. We
> > > should disable spills to mask registers for generic targets by either
> > > raising costs of moves between general and mask registers and/or (as
> > > suggested earlier) introducing TARGET_SPILL_TO_MASK_REGS tuning and
> > > use it in secondary_memory_needed to prevent inter register unit
> > > spills.
> > >
> > > So, compiling with -mavx512bw would NOT enable spills by default,
> > > where compiling with -march=skylake-avx512 (or using equivalent
> > > -mtune) would. This is IMO the least surprising approach, and would
> > > avoid changing sources (as you now have to do for several testcases).
> >
> > We have 2 orthogonal issues here:
> >
> > 1. When mask register spill should be enabled.
> > 2. CPUID check should be done with general registers only.
> >
> > As shown in GCC testcases, CPUID check may be done with arbitrary ISAs
> > or -march/-mtune options enabled.  We should either
> >
> > 1. Enable only general registers for CPUID check.  Or
> > 2. Issue an error for CPUID check if non-general registers are used.
>
> We should follow the same approach as with SSE2, where DI/SImode
> spills to XMM registers were effectively disabled for a generic
> target. So, unless the tuning target is also specified, spills to mask
> registers should not be generated. It was my oversight to approve the
> patch that enables spills for a generic target, and without the tuning
> flag, the patch will be reverted.
>
> Now, we have -mgeneral-regs-only functionality in place, so if a
> package wants to enable spills, the correct -mtune (ro -march that
> implies -mtune) should be used, and it is expected that the detection
> code is amended with general-regs-only pragmas.
>
> <footnote
>
> Speaking of pragmas, these should be added outside cpuid.h, like:
>
> #pragma GCC push_options
> #pragma GCC target("general-regs-only")
>
> #include <cpuid.h>
>
> void cpuid_check ()
> ...
>
> #pragma GCC pop_options
>
> >footnote
>
> Nowadays, -march=native is mostly used outside generic target
> compilations, so for relevant avx512 targets, we still generate spills
> to mask regs. In future, we can review the setting of the tuning flag
> for a generic target in the same way as with SSE2 inter-reg moves.
>

Florian raised an issue that we need to limit <cpuid.h> to the basic ISAs.
<cpuid.h> should be handled similarly to other intrinsic header files.
That is <cpuid.h> should use

#pragma GCC push_options
#ifdef __x86_64__
#pragma GCC target("arch=x86-64")
#else
#pragma GCC target("arch=i386")
...
#pragma GCC pop_options

Here is a patch.  OK for master?

Thanks.

-- 
H.J.

Reply via email to