On Tue, Aug 8, 2023 at 9:58 AM liuhongt wrote:
>
> Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
> supported via EAX.
>
> Intel documentation says invalid subleaves return 0. We had been
> relying on that behavior instead of checking the max sublef number.
>
> It appears that some Sandy Bridge CPUs return at least the subleaf 0
> EDX value for subleaf 1. Best guess is that this is a bug in a
> microcode patch since all of the bits we're seeing set in EDX were
> introduced after Sandy Bridge was originally released.
>
> This is causing avxvnniint16 to be incorrectly enabled with
> -march=native on these CPUs.
>
> BTW: Thanks for reminder from llvm forks Phoebe and Craig.
>
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Please rather do it in a more self-descriptive way, as proposed in the
attached patch. You won't need a comment then.
(Please note that indentation is wrong in the patch in order to better
see the changes).
Uros.
diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 30ef0d334ca..49724d2cba1 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -762,7 +762,9 @@ get_available_features (struct __processor_model *cpu_model,
/* Get Advanced Features at level 7 (eax = 7, ecx = 0/1). */
if (max_cpuid_level >= 7)
{
- __cpuid_count (7, 0, eax, ebx, ecx, edx);
+ unsigned subleaf_level;
+
+ __cpuid_count (7, 0, subleaf_level, ebx, ecx, edx);
if (ebx & bit_BMI)
set_feature (FEATURE_BMI);
if (ebx & bit_SGX)
@@ -873,8 +875,9 @@ get_available_features (struct __processor_model *cpu_model,
if (edx & bit_AVX512FP16)
set_feature (FEATURE_AVX512FP16);
}
-
- __cpuid_count (7, 1, eax, ebx, ecx, edx);
+ if (subleaf_level >= 1)
+ {
+ __cpuid_count (7, 1, eax, ebx, ecx, edx);
if (eax & bit_HRESET)
set_feature (FEATURE_HRESET);
if (eax & bit_CMPCCXADD)
@@ -914,6 +917,7 @@ get_available_features (struct __processor_model *cpu_model,
if (edx & bit_AMX_COMPLEX)
set_feature (FEATURE_AMX_COMPLEX);
}
+ }
}
/* Get Advanced Features at level 0xd (eax = 0xd, ecx = 1). */