Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread H. Peter Anvin
On 04/23/2015 03:55 PM, Andy Lutomirski wrote: > On Thu, Apr 23, 2015 at 3:52 PM, H. Peter Anvin wrote: >> On 04/23/2015 03:38 PM, Andy Lutomirski wrote: Because there are way more sysrets than context switches, and Linux is particularly sensitive to system call latency, by design.

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Denys Vlasenko
On Fri, Apr 24, 2015 at 1:04 AM, Linus Torvalds wrote: > And at that point,> the only cost is a a single no-op on most CPU's (we still > don't know > _which_ AMD CPU's are affected, but I guess we could start off with > all of them and see if we can get an exhaustive list some way). This seems t

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Linus Torvalds
On Thu, Apr 23, 2015 at 3:55 PM, Andy Lutomirski wrote: > > It's a matter of the ratio, right? One cycle of syscall overhead > saved is worth some number of context switch cycles added, and the > ratio probably varies by workload. Yeah. That said, I do think Peter is right that there are usually

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Andy Lutomirski
On Thu, Apr 23, 2015 at 3:52 PM, H. Peter Anvin wrote: > On 04/23/2015 03:38 PM, Andy Lutomirski wrote: >>> >>> Because there are way more sysrets than context switches, and Linux is >>> particularly sensitive to system call latency, by design. >> > > Just to clarify: why would Linux be more sensi

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread H. Peter Anvin
On 04/23/2015 03:38 PM, Andy Lutomirski wrote: >> >> Because there are way more sysrets than context switches, and Linux is >> particularly sensitive to system call latency, by design. > Just to clarify: why would Linux be more sensitive to system call by design? It enables much simpler APIs and

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Andy Lutomirski
On Thu, Apr 23, 2015 at 3:31 PM, H. Peter Anvin wrote: > On 04/23/2015 03:29 PM, Andy Lutomirski wrote: >>> >>> Yes, the NULL SS is a special thing in 64-bit mode. I agree that >>> context-switching it is probably the way to go; it should be cheap >>> enough. We might even be able to conditional

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread H. Peter Anvin
On 04/23/2015 03:29 PM, Andy Lutomirski wrote: >> >> Yes, the NULL SS is a special thing in 64-bit mode. I agree that >> context-switching it is probably the way to go; it should be cheap >> enough. We might even be able to conditionalize it on an X86_BUG_ flag. > > I still don't see why context

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Andy Lutomirski
On Thu, Apr 23, 2015 at 2:37 PM, H. Peter Anvin wrote: > On 04/23/2015 02:10 PM, Borislav Petkov wrote: >> On Thu, Apr 23, 2015 at 10:01:16PM +0200, Denys Vlasenko wrote: >>> Naturally, CS can't be NULL, and up until today >>> I thought SS also can't. But the bit is probably implemented >>> for al

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Borislav Petkov
On Thu, Apr 23, 2015 at 02:37:27PM -0700, H. Peter Anvin wrote: > Yes, the NULL SS is a special thing in 64-bit mode. I agree that > context-switching it is probably the way to go; it should be cheap > enough. We might even be able to conditionalize it on an X86_BUG_ > flag. Oh sure, I was thinkin

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread H. Peter Anvin
On 04/23/2015 02:10 PM, Borislav Petkov wrote: > On Thu, Apr 23, 2015 at 10:01:16PM +0200, Denys Vlasenko wrote: >> Naturally, CS can't be NULL, and up until today >> I thought SS also can't. But the bit is probably implemented >> for all eight cached descriptors. > > There's this section about NU

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Borislav Petkov
On Thu, Apr 23, 2015 at 10:01:16PM +0200, Denys Vlasenko wrote: > Naturally, CS can't be NULL, and up until today > I thought SS also can't. But the bit is probably implemented > for all eight cached descriptors. There's this section about NULL selector in APM v2. It says that NULL selectors are u

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Denys Vlasenko
On Thu, Apr 23, 2015 at 6:27 PM, Andy Lutomirski wrote: > I'll go out on a limb and guess the present bit doesn't leak. If I > were implementing an x86 cpu, I wouldn't have a present bit at all in > the descriptor cache, since you aren't supposed to be able to load a > non-present descriptor in t

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Andy Lutomirski
On Thu, Apr 23, 2015 at 9:13 AM, Linus Torvalds wrote: > On Thu, Apr 23, 2015 at 9:06 AM, Brian Gerst wrote: >> >> So you are saying we should save and conditionally restore the >> kernel's %ss during context switch? That shouldn't be too bad. Half >> of the time you would be loading the null s

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Linus Torvalds
On Thu, Apr 23, 2015 at 9:06 AM, Brian Gerst wrote: > > So you are saying we should save and conditionally restore the > kernel's %ss during context switch? That shouldn't be too bad. Half > of the time you would be loading the null selector which is fast (no > GDT access, no validation). I'd a

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Borislav Petkov
On Thu, Apr 23, 2015 at 09:05:07AM -0700, Linus Torvalds wrote: > So it sounds very much like an AMD bug/misfeature. Because sysret > *should* reset the descriptor cache. Do we know whether this affects > *all* AMD CPU's or just a subset? I am being told that this might appear to be the case with

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Brian Gerst
On Thu, Apr 23, 2015 at 11:22 AM, Linus Torvalds wrote: > On Thu, Apr 23, 2015 at 5:34 AM, Denys Vlasenko wrote: >> >> It was observed to cause Wine crashes. Conjectured sequence of events >> causing it is as follows: >> >> 1. Wine process enters kernel via syscall insn. >> 2. Context switch to a

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Linus Torvalds
On Thu, Apr 23, 2015 at 8:50 AM, Andy Lutomirski wrote: > On Thu, Apr 23, 2015 at 8:22 AM, Linus Torvalds >> >> It is a bit scary to me that apparently we leak %ss values between >> processes, so that while we run in the kernel we can randomly have the >> ss descriptor either be 0 or __KERNEL_DS.

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Andy Lutomirski
On Thu, Apr 23, 2015 at 8:22 AM, Linus Torvalds wrote: > On Thu, Apr 23, 2015 at 5:34 AM, Denys Vlasenko wrote: >> >> It was observed to cause Wine crashes. Conjectured sequence of events >> causing it is as follows: >> >> 1. Wine process enters kernel via syscall insn. >> 2. Context switch to an

Re: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Linus Torvalds
On Thu, Apr 23, 2015 at 5:34 AM, Denys Vlasenko wrote: > > It was observed to cause Wine crashes. Conjectured sequence of events > causing it is as follows: > > 1. Wine process enters kernel via syscall insn. > 2. Context switch to any other task. > 3. Interrupt or exception happens, CPU loads %ss

[PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary

2015-04-23 Thread Denys Vlasenko
AMD docs say that SYSRET32 loads %ss selector with a value from a MSR, but *cached descriptor* of %ss is not modified. (Intel CPUs reset the descriptor to a fixed, valid state). It was observed to cause Wine crashes. Conjectured sequence of events causing it is as follows: 1. Wine process enters