On Thu, Apr 23, 2015 at 9:13 AM, Linus Torvalds <torva...@linux-foundation.org> wrote:
> On Thu, Apr 23, 2015 at 9:06 AM, Brian Gerst <brge...@gmail.com> wrote:
>>
>> So you are saying we should save and conditionally restore the
>> kernel's %ss during context switch? That shouldn't be too bad. Half
>> of the time you would be loading the null selector which is fast (no
>> GDT access, no validation).
>
> I'd almost prefer something along those lines, yes. Who knows *what*
> leaks? If the present bit state leaks, then likely so does the limit
> value etc etc..
>

I'll go out on a limb and guess the present bit doesn't leak. If I were implementing an x86 CPU, I wouldn't have a present bit at all in the descriptor cache, since you aren't supposed to be able to load a non-present descriptor in the first place. I bet it's the limit we're seeing.

But I think I prefer something closer to Denys' approach with alternatives instead. I think the only case that matters (if my hare-brained explanation of the actual crash is right) is when we sysret (q or l) while SS is 0. That only happens if we scheduled inside a syscall, and I'm guessing that testing whether SS is zero and reloading it on syscall return will be a smaller performance hit than reloading on all context switches. The latter could happen more than once per syscall, and it would also penalize tasks that aren't doing syscalls at all and are therefore unaffected by the bug.

I'll try to send out a patch and a test case later today, but no promises -- the test case will be a bit tedious, and I'm already overcommitted for today :(

A sketch of a reproducer: two threads. Thread 1 sets SS to a selector with a very low limit, and it loops doing mov $-1, %eax; int $0x80. Thread 2 is ordinary 32-bit code doing while (true) usleep(1);

--Andy
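
The cost argument above (reload SS on every context switch, versus test SS at syscall return and reload only if it is 0) can be illustrated with a toy counting model. This is only a sketch and not kernel code: the event names, the struct, the function count_ss_reloads() and the selector value are all made up for the example. It also assumes the worst case, that every context switch leaves SS null, which overstates the real rate.

```c
#include <assert.h>
#include <stddef.h>

/* Toy event trace for one CPU. All names here are hypothetical and exist
 * only for this illustration; none of this is kernel code. */
enum event {
    EV_SYSCALL_ENTRY,   /* SYSCALL loads a fixed, non-null SS */
    EV_CONTEXT_SWITCH,  /* model assumption (worst case): leaves SS == 0 */
    EV_SYSCALL_RETURN   /* the point where SYSRET would execute */
};

#define MODEL_KERNEL_SS 0x18u  /* arbitrary non-null selector for the model */

struct ss_reload_counts {
    unsigned on_every_switch;   /* strategy A: reload at each context switch */
    unsigned on_sysret_if_null; /* strategy B: reload at return only if SS == 0 */
};

/* Walk the trace once and count how many SS reloads each strategy performs. */
struct ss_reload_counts count_ss_reloads(const enum event *trace, size_t n)
{
    struct ss_reload_counts c = { 0, 0 };
    unsigned ss = MODEL_KERNEL_SS;  /* SS as strategy B would observe it */

    assert(trace != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        switch (trace[i]) {
        case EV_SYSCALL_ENTRY:
            ss = MODEL_KERNEL_SS;
            break;
        case EV_CONTEXT_SWITCH:
            c.on_every_switch++;  /* A pays here, in a syscall or not */
            ss = 0;               /* B does nothing at this point */
            break;
        case EV_SYSCALL_RETURN:
            if (ss == 0) {        /* B pays only in the case that matters */
                c.on_sysret_if_null++;
                ss = MODEL_KERNEL_SS;
            }
            break;
        }
    }
    return c;
}
```

For a trace with two switches inside one syscall, three switches while no syscall is in flight, and then a syscall with no switch at all, the model counts five reloads for the first strategy and one for the second. In this model the second strategy never does more than one reload per syscall, and it does none for switches outside a syscall.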