On Wed, Feb 13, 2019 at 03:41:45PM +0100, Peter Zijlstra wrote:
> On Wed, Feb 13, 2019 at 02:39:22PM +0000, Julien Thierry wrote:
> > Hi Peter,
> >
> > On 13/02/2019 14:25, Peter Zijlstra wrote:
> > > On Wed, Feb 13, 2019 at 02:00:26PM +0000, Will Deacon wrote:
> > >> The difference is because getting preempted in the sequence above is
> > >> triggered off the back of an interrupt. On arm64, and I think also on
> > >> x86, the user access state (SMAP or PAN) is saved and restored across
> > >> exceptions but not across context switch.
> > >
> > > A quick reading of the SDM seems to suggest the SMAP state is part of
> > > EFLAGS, which is context switched just fine AFAIK.
> > >
> > I fail to see where this is happening when looking at the switch_to()
> > logic in x86_64.
>
> Yeah, me too.. we obviously preserve EFLAGS for user context, but for
> kernel-kernel switches we do not seem to preserve it :-(

So I dug around the context switch code a little, and I think we lost it
here:

  0100301bfdf5 ("sched/x86: Rewrite the switch_to() code")

Before that, x86_64 switch_to() read like (much simplified):

	asm volatile ( /* do RSP twiddle */
	  : /* output */
	  : /* input */
	  : "memory", "cc", .... "flags");

(see __EXTRA_CLOBBER)

Which I suppose means that GCC generates the PUSHF/POPF to preserve the
EFLAGS, since we mark those explicitly clobbered.

Before that:

  f05e798ad4c0 ("Disintegrate asm/system.h for X86")

We had explicit PUSHF / POPF in SAVE_CONTEXT / RESTORE_CONTEXT resp.

Now I cannot see how the current code preserves EFLAGS (if indeed it
does), and the changelog doesn't mention this change _AT_ALL_.

For a little bit of context; it turns out that user_access_begin() /
user_access_end() sets EFLAGS.AC and scheduling in between there wrecks
that because we're apparently not saving that anymore.

Now, I'm tempted to add the PUSHF / POPF right back because of this, but
first I suppose we need to figure out if that change was on purpose and
why that went missing from the Changelog.
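
To make the failure mode concrete, here is a minimal userspace C model of
it. This is only a sketch and not kernel code: every toy_* name is
hypothetical, the "PUSHF / POPF" is modelled as saving and restoring one
flags word per task, and the real switch_to() is assembly that does much
more than this. The only fact taken from the hardware is that AC is bit 18
of EFLAGS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_FLAGS_AC (1u << 18)	/* EFLAGS.AC is bit 18 */

struct toy_cpu  { uint32_t flags; };		/* the one shared flags register */
struct toy_task { uint32_t saved_flags; };	/* per-task saved copy */

struct toy_result {
	bool b_saw_ac;	/* task B started running with AC set */
	bool a_kept_ac;	/* task A still had AC set when it resumed */
};

/* Toy context switch; preserve_flags models an explicit PUSHF / POPF pair. */
static void toy_switch_to(struct toy_cpu *cpu, struct toy_task *prev,
			  struct toy_task *next, bool preserve_flags)
{
	if (!preserve_flags)
		return;				/* flags just carry over to next */
	prev->saved_flags = cpu->flags;		/* "PUSHF" for prev */
	cpu->flags = next->saved_flags;		/* "POPF" for next */
}

/* Task A opens a user access window, is scheduled out to B, then resumes. */
static struct toy_result toy_run(bool preserve_flags)
{
	struct toy_cpu cpu = { 0 };
	struct toy_task a = { 0 }, b = { 0 };
	struct toy_result r;

	cpu.flags |= TOY_FLAGS_AC;		/* A: user_access_begin() */
	toy_switch_to(&cpu, &a, &b, preserve_flags);
	r.b_saw_ac = (cpu.flags & TOY_FLAGS_AC) != 0;

	cpu.flags &= ~TOY_FLAGS_AC;		/* B: its own user_access_end() */
	toy_switch_to(&cpu, &b, &a, preserve_flags);
	r.a_kept_ac = (cpu.flags & TOY_FLAGS_AC) != 0;
	return r;
}

/* Print both variants and check that they behave as described. */
static void toy_demo(void)
{
	struct toy_result with = toy_run(true);
	struct toy_result without = toy_run(false);

	printf("preserved:     B saw AC=%d, A kept AC=%d\n",
	       with.b_saw_ac, with.a_kept_ac);
	printf("not preserved: B saw AC=%d, A kept AC=%d\n",
	       without.b_saw_ac, without.a_kept_ac);
	assert(!with.b_saw_ac && with.a_kept_ac);
	assert(without.b_saw_ac && !without.a_kept_ac);
}
```

Calling toy_demo() from a main() shows both halves of the problem in the
model: without the save/restore, B inherits an AC it never asked for, and
A comes back with its AC gone.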