On Tue, Sep 1, 2015 at 3:41 PM, Andy Lutomirski <l...@kernel.org> wrote:
> This reinstates 2c7577a75837 ("sched/x86_64: Don't save flags on
> context switch"), which was reverted in 512255a2ad2c.

Hi Ingo and Thomas-

I just realized that there's no good reason that this patch belongs
with the rest of the entry series -- it's totally independent. Should
I resend it by itself, or would you rather just apply it as is? I'll
send an updated entry series soon.

--Andy

>
> Historically, Linux has always saved and restored EFLAGS across
> context switches. As far as I know, the only reason to do this is
> because of the NT flag. In particular, if something calls switch_to
> with the NT flag set, then we don't want to leak the NT flag into a
> different task that might try to IRET and fail because NT is set.
>
> Before 8c7aa698baca ("x86_64, entry: Filter RFLAGS.NT on entry from
> userspace"), we could run system call bodies with NT set. This
> would be a DoS or possibly privilege escalation hole if scheduling
> in such a system call would leak NT into a different task.
>
> Importantly, we don't need to worry about NT being set while
> preemptible or across page faults. The only way we can schedule due
> to preemption or a page fault is in an interrupt entry that nests
> inside the SYSENTER prologue. The CPU will clear NT when entering
> through an interrupt gate, so we won't schedule with NT set.
>
> The only other interesting flags are IOPL and AC. Allowing
> switch_to to change IOPL has no effect, as the value loaded during
> kernel execution doesn't matter at all except between a SYSENTER
> entry and the subsequent PUSHF, and anything that interrupts in that
> window will restore IOPL on return.
>
> If we call __switch_to with AC set, we have bigger problems.
>
> Signed-off-by: Andy Lutomirski <l...@kernel.org>
> ---
>  arch/x86/include/asm/switch_to.h | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
> index d7f3b3b78ac3..751bf4b7bf11 100644
> --- a/arch/x86/include/asm/switch_to.h
> +++ b/arch/x86/include/asm/switch_to.h
> @@ -79,12 +79,12 @@ do {                                                       \
>  #else /* CONFIG_X86_32 */
>
>  /* frame pointer must be last for get_wchan */
> -#define SAVE_CONTEXT    "pushf ; pushq %%rbp ; movq %%rsi,%%rbp\n\t"
> -#define RESTORE_CONTEXT "movq %%rbp,%%rsi ; popq %%rbp ; popf\t"
> +#define SAVE_CONTEXT    "pushq %%rbp ; movq %%rsi,%%rbp\n\t"
> +#define RESTORE_CONTEXT "movq %%rbp,%%rsi ; popq %%rbp\t"
>
>  #define __EXTRA_CLOBBER  \
>         , "rcx", "rbx", "rdx", "r8", "r9", "r10", "r11", \
> -         "r12", "r13", "r14", "r15"
> +         "r12", "r13", "r14", "r15", "flags"
>
>  #ifdef CONFIG_CC_STACKPROTECTOR
>  #define __switch_canary                                                 \
> @@ -100,7 +100,11 @@ do {                                                      \
>  #define __switch_canary_iparam
>  #endif /* CC_STACKPROTECTOR */
>
> -/* Save restore flags to clear handle leaking NT */
> +/*
> + * There is no need to save or restore flags, because flags are always
> + * clean in kernel mode, with the possible exception of IOPL. Kernel IOPL
> + * has no effect.
> + */
>  #define switch_to(prev, next, last) \
>         asm volatile(SAVE_CONTEXT                                         \
>              "movq %%rsp,%P[threadrsp](%[prev])\n\t" /* save RSP */       \
> --
> 2.4.3
>

--
Andy Lutomirski
AMA Capital Management, LLC
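
For anyone following the flag argument in the commit message, here is a minimal sketch in plain C of which RFLAGS bits it is about. This is not kernel code. The macro and function names are made up for illustration, and the interrupt-gate model only covers NT and IF, although the CPU also clears TF, RF and VM on that path. The bit positions themselves are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural RFLAGS bit positions; the macro names are illustrative. */
#define FLAG_IF   (UINT64_C(1) << 9)   /* interrupt enable */
#define FLAG_IOPL (UINT64_C(3) << 12)  /* I/O privilege level (2 bits) */
#define FLAG_NT   (UINT64_C(1) << 14)  /* nested task */
#define FLAG_AC   (UINT64_C(1) << 18)  /* alignment check */

/*
 * Hypothetical helper: if switch_to does not save and restore flags,
 * NT and AC must already be clear when we switch. IOPL is ignored
 * here because, per the commit message, its kernel value has no effect.
 */
static bool flags_safe_for_switch(uint64_t rflags)
{
	return (rflags & (FLAG_NT | FLAG_AC)) == 0;
}

/*
 * Hypothetical, simplified model of entry through an interrupt gate:
 * the CPU clears NT and IF in the flags the handler runs with.
 * (TF, RF and VM are cleared too; they are not modeled here.)
 */
static uint64_t enter_interrupt_gate(uint64_t rflags)
{
	return rflags & ~(FLAG_NT | FLAG_IF);
}
```

With these definitions, a flags value with NT set is not safe to switch with, but the same value after `enter_interrupt_gate()` is. That is the reason preemption and page faults are not a concern in the reasoning above.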