On Mon, Mar 29 2021 at 09:14, Len Brown wrote:
> On Sat, Mar 20, 2021 at 6:14 PM Thomas Gleixner <t...@linutronix.de> wrote:
>>
>> On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote:
>> > +
>> > +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */
>> > +static inline void xdisable_switch(struct fpu *prev, struct fpu *next)
>> > +{
>> > +        if (!static_cpu_has(X86_FEATURE_XFD) || !xfirstuse_enabled())
>> > +                return;
>> > +
>> > +        if (unlikely(prev->state_mask != next->state_mask))
>> > +                xdisable_setbits(xfirstuse_not_detected(next));
>> > +}
>>
>> So this is invoked on context switch. Toggling bit 18 of MSR_IA32_XFD
>> when it does not match. The spec document says:
>>
>> "System software may disable use of Intel AMX by clearing XCR0[18:17], by
>> clearing CR4.OSXSAVE, or by setting IA32_XFD[18]. It is recommended that
>> system software initialize AMX state (e.g., by executing TILERELEASE)
>> before doing so. This is because maintaining AMX state in a
>> non-initialized state may have negative power and performance
>> implications."
>>
>> I'm not seeing anything related to this. Is this a recommendation
>> which can be ignored or is that going to be duct taped into the code
>> base once the first user complains about slowdowns of their non AMX
>> workloads on that machine?
>
> I found the author of this passage, and he agreed to revise it to say this
> was targeted primarily at VMMs.

Why would this only be a problem for VMMs?

> "negative power and performance implications" refers to the fact that
> the processor will not enter C6 when AMX INIT=0, instead it will demote
> to the next shallower C-state, eg C1E.
>
> (this is because the C6 flow doesn't save the AMX registers)
>
> For customers that have C6 enabled, the inability of a core to enter C6
> may impact the maximum turbo frequency of other cores.

That's the same on bare metal, right?

Thanks,

        tglx