On 2018-09-12 08:47:19 [-0700], Andy Lutomirski wrote:
> > --- a/arch/x86/kernel/fpu/core.c
> > +++ b/arch/x86/kernel/fpu/core.c
> > @@ -101,14 +101,14 @@ void __kernel_fpu_begin(void)
> >
> >         kernel_fpu_disable();
> >
> > -       if (fpu->initialized) {
> > +       __cpu_invalidate_fpregs_state();
> > +
> > +       if (!test_and_set_thread_flag(TIF_LOAD_FPU)) {
>
> Since the already-TIF_LOAD_FPU path is supposed to be fast here, use
> test_thread_flag() instead.  test_and_set operations do unconditional RMW
> operations and are always full barriers, so they’re slow.

okay.
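
For illustration, the suggested fast path could look roughly like the
userspace sketch below. It uses C11 atomics instead of the kernel's
thread-flag helpers, and FLAG_LOAD_FPU, thread_flags, test_flag(),
test_and_set_flag() and begin_section() are hypothetical names, not the
actual patch. Splitting the check from the set is only safe under the
assumption that nothing else can set the flag in between (in the kernel that
would be the current thread with preemption disabled).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for TIF_LOAD_FPU; not the kernel's definition. */
#define FLAG_LOAD_FPU (1UL << 0)

/* Hypothetical per-thread flag word; zero-initialized. */
static atomic_ulong thread_flags;

/* Plain load: no write to the flag word and no full barrier. */
static inline bool test_flag(unsigned long mask)
{
	return (atomic_load_explicit(&thread_flags, memory_order_relaxed) & mask) != 0;
}

/*
 * Unconditional atomic read-modify-write with sequentially consistent
 * ordering. It writes the flag word even when the bit is already set,
 * which is the cost the review comment points at.
 */
static inline bool test_and_set_flag(unsigned long mask)
{
	return (atomic_fetch_or(&thread_flags, mask) & mask) != 0;
}

/*
 * Returns true if the caller has to save the register state, i.e. the
 * flag was clear on entry. The already-set case only performs a load.
 */
static inline bool begin_section(void)
{
	if (test_flag(FLAG_LOAD_FPU))
		return false;	/* fast path: flag already set, nothing to do */

	/* Slow path: set the flag; the state save would happen here. */
	atomic_fetch_or_explicit(&thread_flags, FLAG_LOAD_FPU, memory_order_relaxed);
	return true;
}
```

The first call to begin_section() takes the slow path and sets the flag;
later calls return after a single load without touching the flag word.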

> Also, on top of this patch, there should be lots of cleanups available.  In
> particular, all the fpu state accessors could probably be reworked to take
> TIF_LOAD_FPU into account, which would simplify the callers and maybe even
> the mess of variables tracking whether the state is in regs.

Do you refer to the fpu->initialized check or something else?

Sebastian