GCC generates lousy code in __switch_to_xtra(). This patch series is an updated version of tglx's patches from last year (https://lkml.org/lkml/2016/12/15/432) that addresses the review comments.

Since v1:

Part 1 - x86/process: Optimize TIF checks in __switch_to_xtra()
  - READ_ONCE annotations added as requested by Andy Lutomirski

Part 2 - x86/process: Correct and optimize TIF_BLOCKSTEP switch
  - DEBUGCTLMSR_BTF is now modified when either the previous or next or
    both tasks use it, because the MSR is "highly magical".

Part 3 - x86/process: Optimize TIF_NOTSC switch
  - Unchanged

I didn't introduce a cpufeature for blockstep because that would add
additional overhead compared to the existing code, where it's generally
known at compile time that blockstep is supported. Perhaps we should
just BUG_ON(!arch_has_block_step()) here if we really care to check
anything.

 arch/x86/include/asm/msr-index.h |  1 +
 arch/x86/include/asm/tlbflush.h  | 10 ++++++++++
 arch/x86/kernel/process.c        | 76 +++++++++++++++++++++++++++++++++++----------------------------------------
 3 files changed, 46 insertions(+), 41 deletions(-)
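
For illustration only, the flag handling described in Parts 2 and 3 can be sketched in userspace C. This is not the code from the patches: the SK_* flag values, struct sk_actions and sk_switch_xtra() are hypothetical names made up for this sketch, and it returns booleans where the kernel would read the flags with READ_ONCE and then touch CR4 or the MSR. It only shows the difference between acting when a flag changes (NOTSC) and acting when either task has the flag set (BLOCKSTEP).

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not the kernel's real values. */
#define SK_TIF_NOTSC     (1UL << 0)
#define SK_TIF_BLOCKSTEP (1UL << 1)

/* What the sketch decides to do on a context switch. */
struct sk_actions {
	bool toggle_tsc;     /* the TSC-disable state would be flipped */
	bool write_debugctl; /* the debug control MSR would be rewritten */
	bool btf_on;         /* BTF value to program if write_debugctl is set */
};

/*
 * tifp: flags of the previous task, tifn: flags of the next task.
 * Both are assumed to have been read exactly once by the caller.
 */
static struct sk_actions sk_switch_xtra(unsigned long tifp, unsigned long tifn)
{
	struct sk_actions a = { false, false, false };

	/* NOTSC: only act when the bit differs between prev and next. */
	if ((tifp ^ tifn) & SK_TIF_NOTSC)
		a.toggle_tsc = true;

	/*
	 * BLOCKSTEP: act when prev, next, or both use it, and always
	 * program the value the next task wants.
	 */
	if ((tifp | tifn) & SK_TIF_BLOCKSTEP) {
		a.write_debugctl = true;
		a.btf_on = (tifn & SK_TIF_BLOCKSTEP) != 0;
	}

	return a;
}
```

With these definitions, switching between two tasks that both have SK_TIF_BLOCKSTEP set still reports write_debugctl, while switching between two tasks that both have SK_TIF_NOTSC set reports no TSC toggle.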

