On Tue, Jan 23, 2018 at 02:53:33PM +0000, David Woodhouse wrote:
> We were doing a fresh CPUID for every single bit in every single output
> register. Do it once and then harvest *all* the bits we want.
>
> We were also doing the max_level check with a new CPUID invocation for
> every single bit. Stop that too.
>
> Signed-off-by: David Woodhouse <d...@amazon.co.uk>
> ---
> Spotted this in my travels; it offended me.

Ok, I see that it's itching so let's scratch it properly :-)

If we're going to optimize scattered.c, let's do something like this:

* do CPUID for each function once.
* for each set bit in there, set feature flag

which means we'd have to change the data structure.

struct cpuid_leaf {
	u32 level;
	u32 sub_leaf;
	struct cpuid_bit bits[];
};

and that last thing is:

struct cpuid_bit {
	u16 feature;
	u8 reg;
	u8 bit;
};

So that you have something like (for example with leaf 0x10):

struct cpuid_leaf leafs[] = {
	...
	{
		.level = 0x00000010,
		.sub_leaf = 0,
		.bits = {
			{ X86_FEATURE_CAT_L3, CPUID_EBX, 1 },
			{ X86_FEATURE_CAT_L2, CPUID_EBX, 2 },
			{ X86_FEATURE_MBA   , CPUID_EBX, 3 },
			{ 0 }
		}
	}
	...
}

This way you get the CPUID only once and then iterate over bits[] and
you can do the cleaner max level computation

	cpuid_eax(level & 0xffff0000);

without having to do the extended level checks.

Anyway, something like that. It's probably not even worth doing anything
though - I doubt the speedup is visible at all. But I certainly
understand the intent to fix an annoying thing like that. :-))

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
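
For illustration only, here is a rough user-space sketch of the per-leaf
table idea described above. It is not the kernel code and makes several
assumptions: the FEAT_* and REG_* enums are hypothetical stand-ins for
X86_FEATURE_* and CPUID_EAX..CPUID_EDX, the CPUID instruction is replaced
by a callback (fake_cpuid is a made-up stub so the sketch runs anywhere),
and features are collected into a plain bitmask instead of calling a
set-capability helper. Standard C does not allow an array of structs that
end in a flexible array member, so this sketch points each leaf at a
separate, sentinel-terminated bits table instead. The range check on the
max level is likewise an assumption about what the check should do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index of each output register in one CPUID result (hypothetical names). */
enum cpuid_reg { REG_EAX, REG_EBX, REG_ECX, REG_EDX, REG_COUNT };

/* Hypothetical feature numbers; FEAT_NONE terminates a bits table. */
enum feature { FEAT_NONE, FEAT_CAT_L3, FEAT_CAT_L2, FEAT_MBA };

struct cpuid_bit {
	uint16_t feature;
	uint8_t reg;
	uint8_t bit;
};

struct cpuid_leaf {
	uint32_t level;
	uint32_t sub_leaf;
	const struct cpuid_bit *bits;	/* ends with a FEAT_NONE entry */
};

/* Callback standing in for the CPUID instruction. */
typedef void (*cpuid_fn)(uint32_t level, uint32_t sub_leaf,
			 uint32_t regs[REG_COUNT]);

static const struct cpuid_bit leaf10_bits[] = {
	{ FEAT_CAT_L3, REG_EBX, 1 },
	{ FEAT_CAT_L2, REG_EBX, 2 },
	{ FEAT_MBA,    REG_EBX, 3 },
	{ FEAT_NONE,   0,       0 },
};

static const struct cpuid_leaf leafs[] = {
	{ 0x00000010, 0, leaf10_bits },
};

/* Walk the table: one max-level query and one CPUID query per leaf. */
static uint32_t scan_scattered(cpuid_fn cpuid)
{
	uint32_t found = 0;

	for (size_t i = 0; i < sizeof(leafs) / sizeof(leafs[0]); i++) {
		const struct cpuid_leaf *l = &leafs[i];
		uint32_t regs[REG_COUNT];
		uint32_t max_level;

		/* Highest supported leaf in this leaf's range, read once. */
		cpuid(l->level & 0xffff0000, 0, regs);
		max_level = regs[REG_EAX];
		if (max_level < l->level || max_level > (l->level | 0xffff))
			continue;

		/* Query the leaf once, then harvest every listed bit. */
		cpuid(l->level, l->sub_leaf, regs);
		for (const struct cpuid_bit *cb = l->bits;
		     cb->feature != FEAT_NONE; cb++) {
			assert(cb->reg < REG_COUNT && cb->bit < 32);
			if (regs[cb->reg] & (UINT32_C(1) << cb->bit))
				found |= UINT32_C(1) << cb->feature;
		}
	}
	return found;
}

/* Made-up CPUID stub: max basic leaf 0x10, leaf 0x10 EBX has bits 1 and 3. */
static unsigned int fake_cpuid_calls;

static void fake_cpuid(uint32_t level, uint32_t sub_leaf,
		       uint32_t regs[REG_COUNT])
{
	(void)sub_leaf;
	fake_cpuid_calls++;
	regs[REG_EAX] = regs[REG_EBX] = regs[REG_ECX] = regs[REG_EDX] = 0;
	if (level == 0x00000000)
		regs[REG_EAX] = 0x00000010;
	else if (level == 0x00000010)
		regs[REG_EBX] = 0xa;
}
```

Calling scan_scattered(fake_cpuid) with this stub issues two CPUID
queries in total for leaf 0x10, however many bits are listed for it,
which is the point of grouping the bits by leaf.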