This series is an attempt at making the x86 FPU context switching code much lazier. By only reloading the FPU context when a task switches to user mode, we can avoid switching FPU context for tasks that spin in kernel mode, and avoid reloading the FPU context for tasks that get interrupted by a kernel thread or briefly go idle.
It also allows us to skip restoring the userspace FPU context when exiting a KVM guest.

This series is still BROKEN. The first 3 patches seem to work fine in my tests (but should not, due to missing signal path code), while the 4th patch makes it easier to trigger bugs.

I am posting this to ask about obvious issues people may see, to get ideas on which direction I should take this series in, and to avoid code conflicts with Andy's plans wrt. lazy fpu mode.
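
To illustrate the idea of deferring the FPU reload until a task returns to user mode, here is a toy user-space model. It is not code from the series: every name in it (struct task, struct cpu, switch_to, return_to_user, fpu_owner) is made up for illustration, and a plain int stands in for the FPU register file. It only shows why a task that is interrupted by a kernel thread does not need a reload.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; none of these names are kernel APIs. */
struct task {
	int fpu_state;			/* saved copy of the task's FPU registers */
};

struct cpu {
	struct task *current;		/* task currently running on this CPU */
	struct task *fpu_owner;		/* task whose state is live in fpu_regs */
	int fpu_regs;			/* stands in for the hardware FPU registers */
	unsigned long saves;		/* number of register -> memory saves */
	unsigned long restores;		/* number of memory -> register reloads */
};

/*
 * Context switch: save the outgoing task's registers if it owns them,
 * but do NOT load the incoming task's state. The registers keep holding
 * the previous owner's contents.
 */
void switch_to(struct cpu *c, struct task *next)
{
	if (c->current != NULL && c->fpu_owner == c->current) {
		c->current->fpu_state = c->fpu_regs;
		c->saves++;
	}
	c->current = next;
}

/*
 * Return to user mode: the only place the registers are reloaded, and
 * only if they do not already hold the current task's state.
 */
void return_to_user(struct cpu *c)
{
	assert(c->current != NULL);

	if (c->fpu_owner != c->current) {
		c->fpu_regs = c->current->fpu_state;
		c->fpu_owner = c->current;
		c->restores++;
	}
}
```

In this model a kernel thread is simply a task that never calls return_to_user(), so switching from a user task to a kernel thread and back costs one save and no reload. Switching between two user tasks still costs a reload each time one of them returns to user mode.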