On Wed, Feb 13, 2019 at 04:45:32PM +0100, Peter Zijlstra wrote: > On Wed, Feb 13, 2019 at 03:41:45PM +0100, Peter Zijlstra wrote: > > On Wed, Feb 13, 2019 at 02:39:22PM +0000, Julien Thierry wrote: > > > Hi Peter, > > > > > > On 13/02/2019 14:25, Peter Zijlstra wrote: > > > > On Wed, Feb 13, 2019 at 02:00:26PM +0000, Will Deacon wrote: > > > >> The difference is because getting preempted in the sequence above is > > > >> triggered off the back of an interrupt. On arm64, and I think also on > > > >> x86, > > > >> the user access state (SMAP or PAN) is saved and restored across > > > >> exceptions > > > >> but not across context switch. > > > > > > > > A quick reading of the SDM seems to suggest the SMAP state is part of > > > > EFLAGS, which is context switched just fine AFAIK. > > > > > > > I fail to see where this is happening when looking at the switch_to() > > > logic in x86_64. > > > > Yeah, me too.. we obviously preserve EFLAGS for user context, but for > > kernel-kernel switches we do not seem to preserve it :-( > > So I dug around the context switch code a little, and I think we lost it > here: > > 0100301bfdf5 ("sched/x86: Rewrite the switch_to() code") > > Before that, x86_64 switch_to() read like (much simplified): > > asm volatile ( /* do RSP twiddle */ > : /* output */ > : /* input */ > : "memory", "cc", .... "flags"); > > (see __EXTRA_CLOBBER) > > Which I suppose means that GCC generates the PUSHF/POPF to preserve the > EFLAGS, since we mark those explicitly clobbered. > > Before that: > > f05e798ad4c0 ("Disintegrate asm/system.h for X86") > > We had explicit PUSHF / POPF in SAVE_CONTEXT / RESTORE_CONTEXT resp. > > Now I cannot see how the current code preserves EFLAGS (if indeed it > does), and the changelog doesn't mention this change _AT_ALL_. > > > For a little bit of context; it turns out that user_access_begin() / > user_access_end() sets EFLAGS.AC and scheduling in between there wrecks > that because we're apparently not saving that anymore. > > Now, I'm tempted to add the PUSHF / POPF right back because of this, but > first I suppose we need to figure out if that change was on purpose and > why that went missing from the Changelog.
Just for giggles; the below patch seems to boot. --- arch/x86/entry/entry_32.S | 2 ++ arch/x86/entry/entry_64.S | 2 ++ arch/x86/include/asm/switch_to.h | 1 + 3 files changed, 5 insertions(+) diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index d309f30cf7af..5fc76b755510 100644 --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -650,6 +650,7 @@ ENTRY(__switch_to_asm) pushl %ebx pushl %edi pushl %esi + pushfl /* switch stack */ movl %esp, TASK_threadsp(%eax) @@ -672,6 +673,7 @@ ENTRY(__switch_to_asm) #endif /* restore callee-saved registers */ + popfl popl %esi popl %edi popl %ebx diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 1f0efdb7b629..4fe27b67d7e2 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -291,6 +291,7 @@ ENTRY(__switch_to_asm) pushq %r13 pushq %r14 pushq %r15 + pushfq /* switch stack */ movq %rsp, TASK_threadsp(%rdi) @@ -313,6 +314,7 @@ ENTRY(__switch_to_asm) #endif /* restore callee-saved registers */ + popfq popq %r15 popq %r14 popq %r13 diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h index 7cf1a270d891..157149d4129c 100644 --- a/arch/x86/include/asm/switch_to.h +++ b/arch/x86/include/asm/switch_to.h @@ -40,6 +40,7 @@ asmlinkage void ret_from_fork(void); * order of the fields must match the code in __switch_to_asm(). */ struct inactive_task_frame { + unsigned long flags; #ifdef CONFIG_X86_64 unsigned long r15; unsigned long r14;