On Tue, Sep 1, 2015 at 3:41 PM, Andy Lutomirski <l...@kernel.org> wrote:
> This reinstates 2c7577a75837 ("sched/x86_64: Don't save flags on
> context switch"), which was reverted in 512255a2ad2c.

Hi Ingo and Thomas-

I just realized that there's no good reason that this patch belongs
with the rest of the entry series -- it's totally independent.  Should
I resend it by itself, or would you rather just apply it as is?

I'll send an updated entry series soon.

--Andy

>
> Historically, Linux has always saved and restored EFLAGS across
> context switches.  As far as I know, the only reason to do this is
> because of the NT flag.  In particular, if something calls switch_to
> with the NT flag set, then we don't want to leak the NT flag into a
> different task that might try to IRET and fail because NT is set.
>
> Before 8c7aa698baca ("x86_64, entry: Filter RFLAGS.NT on entry from
> userspace"), we could run system call bodies with NT set.  This
> would be a DoS or possibly privilege escalation hole if scheduling
> in such a system call would leak NT into a different task.
>
> Importantly, we don't need to worry about NT being set while
> preemptible or across page faults.  The only way we can schedule due
> to preemption or a page fault is in an interrupt entry that nests
> inside the SYSENTER prologue.  The CPU will clear NT when entering
> through an interrupt gate, so we won't schedule with NT set.
>
> The only other interesting flags are IOPL and AC.  Allowing
> switch_to to change IOPL has no effect, as the value loaded during
> kernel execution doesn't matter at all except between a SYSENTER
> entry and the subsequent PUSHF, and anything that interrupts in that
> window will restore IOPL on return.
>
> If we call __switch_to with AC set, we have bigger problems.
>
> Signed-off-by: Andy Lutomirski <l...@kernel.org>
> ---
>  arch/x86/include/asm/switch_to.h | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
> index d7f3b3b78ac3..751bf4b7bf11 100644
> --- a/arch/x86/include/asm/switch_to.h
> +++ b/arch/x86/include/asm/switch_to.h
> @@ -79,12 +79,12 @@ do {                                                    \
>  #else /* CONFIG_X86_32 */
>
>  /* frame pointer must be last for get_wchan */
> -#define SAVE_CONTEXT    "pushf ; pushq %%rbp ; movq %%rsi,%%rbp\n\t"
> -#define RESTORE_CONTEXT "movq %%rbp,%%rsi ; popq %%rbp ; popf\t"
> +#define SAVE_CONTEXT    "pushq %%rbp ; movq %%rsi,%%rbp\n\t"
> +#define RESTORE_CONTEXT "movq %%rbp,%%rsi ; popq %%rbp\t"
>
>  #define __EXTRA_CLOBBER  \
>         , "rcx", "rbx", "rdx", "r8", "r9", "r10", "r11", \
> -         "r12", "r13", "r14", "r15"
> +         "r12", "r13", "r14", "r15", "flags"
>
>  #ifdef CONFIG_CC_STACKPROTECTOR
>  #define __switch_canary                                                  \
> @@ -100,7 +100,11 @@ do {                                                   \
>  #define __switch_canary_iparam
>  #endif /* CC_STACKPROTECTOR */
>
> -/* Save restore flags to clear handle leaking NT */
> +/*
> + * There is no need to save or restore flags, because flags are always
> + * clean in kernel mode, with the possible exception of IOPL.  Kernel IOPL
> + * has no effect.
> + */
>  #define switch_to(prev, next, last) \
>         asm volatile(SAVE_CONTEXT                                         \
>              "movq %%rsp,%P[threadrsp](%[prev])\n\t" /* save RSP */       \
> --
> 2.4.3
>
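For anyone following the flag reasoning in the changelog, here is a
minimal userspace sketch of the three flag fields it discusses.  The bit
positions are the architectural ones (IOPL in bits 12-13, NT in bit 14,
AC in bit 18); the macro and function names are hypothetical, made up
for this illustration, and are not the kernel's own helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFLAGS/RFLAGS bit positions; names are illustrative only. */
#define FLAG_IOPL_MASK 0x00003000u /* bits 12-13: I/O privilege level */
#define FLAG_NT        0x00004000u /* bit 14: nested task */
#define FLAG_AC        0x00040000u /* bit 18: alignment check */

/* True if NT is set, i.e. a later IRET in another task could fail. */
static bool flags_have_nt(uint64_t flags)
{
	return (flags & FLAG_NT) != 0;
}

/* Extract the two-bit IOPL field (0..3). */
static unsigned int flags_iopl(uint64_t flags)
{
	return (unsigned int)((flags & FLAG_IOPL_MASK) >> 12);
}

/*
 * Hypothetical predicate for the changelog's argument: skipping the
 * save/restore is harmless when neither NT nor AC is set.  IOPL is
 * deliberately ignored, since the changelog argues its kernel-mode
 * value has no effect.
 */
static bool flags_clean_for_switch(uint64_t flags)
{
	return (flags & (FLAG_NT | FLAG_AC)) == 0;
}
```

With these definitions, a value such as 0x3202 (IF set, IOPL 3) counts
as clean, while 0x4202 (NT set) does not.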



-- 
Andy Lutomirski
AMA Capital Management, LLC