On Thu, Apr 30, 2015 at 02:39:07PM -0700, H. Peter Anvin wrote:
> This is the microbenchmark I used.
>
> For the record, Intel's intention going forward is that 0F 1F will
> always be as fast or faster than any other alternative.
It looks like this is the case on AMD too.
So I took your benchmar
On Thu, Apr 30, 2015 at 04:23:26PM -0700, H. Peter Anvin wrote:
> I probably should have added that the microbenchmark specifically tests
> for an atomic 5-byte NOP (as required by tracepoints etc.) If the
> requirement for 5-byte atomic is dropped there might be faster
> combinations, e.g. 66 66
On 04/30/2015 02:39 PM, H. Peter Anvin wrote:
> This is the microbenchmark I used.
>
> For the record, Intel's intention going forward is that 0F 1F will
> always be as fast or faster than any other alternative.
>
I probably should have added that the microbenchmark specifically tests
for an ato
This is the microbenchmark I used.
For the record, Intel's intention going forward is that 0F 1F will
always be as fast or faster than any other alternative.
-hpa
#define _GNU_SOURCE
#include
#include
#include
#include
#include
static void nop_p6(void)
{
asm volatile(".rept 1000\
On Tue, Apr 28, 2015 at 10:16:33AM -0700, Linus Torvalds wrote:
> I suspect it might be related to things like getting performance
> counters and instruction debug traps etc right. There are quite
> possibly also simply constraints where the front end has to generate
> *something* just to keep the
On Tue, Apr 28, 2015 at 9:58 AM, Borislav Petkov wrote:
>
> Well, AFAIK, NOPs do require resources for tracking in the machine. I
> was hoping that hw would be smarter and discard at decode time but there
> probably are reasons that it can't be done (...yet).
I suspect it might be related to thin
On Tue, Apr 28, 2015 at 09:28:52AM -0700, Linus Torvalds wrote:
> On Tue, Apr 28, 2015 at 8:55 AM, Borislav Petkov wrote:
> >
> > Provided it is correct, it shows that the 0x66-prefixed 3-byte NOPs are
> > better than the 0F 1F 00 suggested by the manual (Haha!):
>
> That's which AMD CPU?
F16h.
On Tue, Apr 28, 2015 at 8:55 AM, Borislav Petkov wrote:
>
> Provided it is correct, it shows that the 0x66-prefixed 3-byte NOPs are
> better than the 0F 1F 00 suggested by the manual (Haha!):
That's which AMD CPU?
On my intel i7-4770S, they are the same cost (I cut down your loop
numbers by an o
On Mon, Apr 27, 2015 at 01:14:51PM -0700, H. Peter Anvin wrote:
> I did a microbenchmark in user space... let's see if I can find it.
How about the simple one below?
Provided it is correct, it shows that the 0x66-prefixed 3-byte NOPs are
better than the 0F 1F 00 suggested by the manual (Haha!):
On Mon, Apr 27, 2015 at 09:45:12PM +0200, Borislav Petkov wrote:
> > Maybe you are measuring random noise.
>
> Yeah. Last exercise tomorrow. Let's see what those numbers would look
> like.
Right, so with Mel's help, I did a simple microbenchmark to measure how
many cycles a syscall (getpid()) nee
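For readers who want to try a measurement of this kind, here is a minimal user-space sketch. It is not the benchmark from this thread: it reports wall-clock nanoseconds per call via clock_gettime() rather than cycles, and the helper names now_ns() and getpid_ns_per_call() are made up for illustration.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Monotonic timestamp in nanoseconds (hypothetical helper). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Average cost of one getpid() syscall in nanoseconds.
 * syscall(SYS_getpid) is used so that every iteration really
 * enters the kernel instead of hitting a libc-level cache.
 */
static double getpid_ns_per_call(unsigned long iters)
{
	uint64_t start = now_ns();

	for (unsigned long i = 0; i < iters; i++)
		syscall(SYS_getpid);

	return (double)(now_ns() - start) / (double)iters;
}
```

Pinning the task to one CPU (e.g. with taskset -c 0, as done elsewhere in this thread) and repeating the run several times should reduce noise; a single run says little.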
I did a microbenchmark in user space... let's see if I can find it.
On April 27, 2015 1:03:29 PM PDT, Borislav Petkov wrote:
>On Mon, Apr 27, 2015 at 12:59:11PM -0700, H. Peter Anvin wrote:
>> It really comes down to this: it seems older cores from both Intel
>> and AMD perform better with 66 66
On Mon, Apr 27, 2015 at 12:59:11PM -0700, H. Peter Anvin wrote:
> It really comes down to this: it seems older cores from both Intel
> and AMD perform better with 66 66 66 90, whereas the 0F 1F series is
> better on newer cores.
>
> When I measured it, the differences were sometimes dramatic.
How
It really comes down to this: it seems older cores from both Intel and AMD
perform better with 66 66 66 90, whereas the 0F 1F series is better on newer
cores.
When I measured it, the differences were sometimes dramatic.
On April 27, 2015 11:53:44 AM PDT, Borislav Petkov wrote:
>On Mon, Apr 27,
On Mon, Apr 27, 2015 at 09:21:34PM +0200, Denys Vlasenko wrote:
> On 04/27/2015 09:11 PM, Borislav Petkov wrote:
> > A: 709.528485252 seconds time elapsed
> > ( +- 0.02% )
> > B: 708.976557288 seconds time elapsed
On 04/27/2015 09:11 PM, Borislav Petkov wrote:
> A: 709.528485252 seconds time elapsed
> ( +- 0.02% )
> B: 708.976557288 seconds time elapsed
> ( +- 0.04% )
> C: 709.312844791 seconds time elapsed
On Mon, Apr 27, 2015 at 08:38:54PM +0200, Borislav Petkov wrote:
> I'm running them now and will report numbers relative to the last run
> once it is done. And those numbers should in practice get even better if
> we revert to the simpler canonical-ness check but let's see...
Results are done. New
On Mon, Apr 27, 2015 at 11:47:30AM -0700, Linus Torvalds wrote:
> On Mon, Apr 27, 2015 at 11:38 AM, Borislav Petkov wrote:
> >
> > So our current NOP-infrastructure does ASM_NOP_MAX NOPs of 8 bytes so
> > without more invasive changes, our longest NOPs are 8 byte long and then
> > we have to repea
On Mon, Apr 27, 2015 at 11:12:05AM -0700, Linus Torvalds wrote:
> So if one or two cycles in this code doesn't matter, then why are we
> adding alternate instructions just to avoid a few ALU instructions and
> a conditional branch that predicts perfectly? And if it does matter,
> then the 6-byte op
On Mon, Apr 27, 2015 at 11:38 AM, Borislav Petkov wrote:
>
> So our current NOP-infrastructure does ASM_NOP_MAX NOPs of 8 bytes so
> without more invasive changes, our longest NOPs are 8 byte long and then
> we have to repeat.
Btw (and I'm too lazy to check) do we take alignment into account?
Be
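The "longest NOP is 8 bytes, then repeat" behaviour described above can be sketched in plain C as follows. This is only an illustration, not the kernel's code: the table holds the 0F 1F-style multi-byte NOP encodings recommended in Intel's manuals (the AMD tables discussed in this thread differ), and NOP_MAX and pad_with_nops() are names invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOP_MAX 8	/* assumption: mirrors the 8-byte limit mentioned above */

/* nops[n] holds one n-byte NOP (Intel-recommended encodings). */
static const uint8_t nops[NOP_MAX + 1][NOP_MAX] = {
	{ 0 },							/* unused */
	{ 0x90 },
	{ 0x66, 0x90 },
	{ 0x0f, 0x1f, 0x00 },
	{ 0x0f, 0x1f, 0x40, 0x00 },
	{ 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	{ 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	{ 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
	{ 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
};

/*
 * Fill len bytes at buf with NOPs, using the longest available NOP
 * first and repeating until the area is covered.
 * Returns the number of NOP instructions emitted.
 */
static size_t pad_with_nops(uint8_t *buf, size_t len)
{
	size_t count = 0;

	while (len > 0) {
		size_t n = len < NOP_MAX ? len : NOP_MAX;

		memcpy(buf, nops[n], n);
		buf += n;
		len -= n;
		count++;
	}
	return count;
}
```

With this scheme a 19-byte hole becomes 8 + 8 + 3 bytes, i.e. three instructions to decode. The sketch ignores alignment entirely, which is the question raised above.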
On Mon, Apr 27, 2015 at 11:14:15AM -0700, Linus Torvalds wrote:
> Btw, please don't use the "more than three 66h overrides" version.
Oh yeah, a notorious "frontend choker".
> Sure, that's what the optimization manual suggests if you want
> single-instruction decode for all sizes up to 15 bytes, b
On Mon, Apr 27, 2015 at 9:40 AM, Borislav Petkov wrote:
>
> Either way, the NOPs-version is faster and I'm running the test with the
> F16h-specific NOPs to see how they perform.
Btw, please don't use the "more than three 66h overrides" version.
Sure, that's what the optimization manual suggests
On Mon, Apr 27, 2015 at 9:12 AM, Denys Vlasenko wrote:
>
> It is smaller, but not by much. It is two instructions smaller.
Ehh. That's _half_.
And on a decoding side, it's the difference between 6 bytes that
decode cleanly and can be decoded in parallel with other things
(assuming the 6-byte nop
On Mon, Apr 27, 2015 at 09:00:08AM -0700, Linus Torvalds wrote:
> On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov wrote:
> >
> > Right, what about the false positives:
>
> Anybody who tries to return to kernel addresses with sysret is
> suspect. It's more likely to be an attack vector than anyth
On 04/27/2015 04:57 PM, Linus Torvalds wrote:
> On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov wrote:
>>
>> /*
>> * Change top 16 bits to be the sign-extension of 47th bit, if this
>> * changed %rcx, it was not canonical.
>> */
>> ALTERNATIVE "", \
>>
On 04/27/2015 06:04 PM, Brian Gerst wrote:
> On Mon, Apr 27, 2015 at 11:56 AM, Andy Lutomirski wrote:
>> On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov wrote:
>>> On Mon, Apr 27, 2015 at 07:57:36AM -0700, Linus Torvalds wrote:
>>>> On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov wrote:
>>>> >
On Mon, Apr 27, 2015 at 11:56 AM, Andy Lutomirski wrote:
> On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov wrote:
>> On Mon, Apr 27, 2015 at 07:57:36AM -0700, Linus Torvalds wrote:
>>> On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov wrote:
>>> >
>>> > /*
>>> > * Change top 16
On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov wrote:
>
> Right, what about the false positives:
Anybody who tries to return to kernel addresses with sysret is
suspect. It's more likely to be an attack vector than anything else
(ie somebody who is trying to take advantage of a CPU bug).
I don'
On Mon, Apr 27, 2015 at 8:46 AM, Borislav Petkov wrote:
> On Mon, Apr 27, 2015 at 07:57:36AM -0700, Linus Torvalds wrote:
>> On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov wrote:
>> >
>> > /*
>> > * Change top 16 bits to be the sign-extension of 47th bit, if this
>> >
On Mon, Apr 27, 2015 at 07:57:36AM -0700, Linus Torvalds wrote:
> On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov wrote:
> >
> > /*
> > * Change top 16 bits to be the sign-extension of 47th bit, if this
> > * changed %rcx, it was not canonical.
> > */
> >
On Mon, Apr 27, 2015 at 08:06:16AM -0700, Linus Torvalds wrote:
> So maybe our AMD nop tables should be updated?
Ho-humm, we're using k8_nops on all 64-bit AMD. I better do some
opt-guide staring.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
On Mon, Apr 27, 2015 at 7:57 AM, Linus Torvalds
wrote:
>
> ..end result is just six bytes. That way you can use alternative to
> replace it with one single noop on AMD.
Actually, it looks like we have no good 6-byte no-ops on AMD. So you'd
get two three-byte ones. Oh well. It's still better than
On Mon, Apr 27, 2015 at 4:35 AM, Borislav Petkov wrote:
>
> /*
> * Change top 16 bits to be the sign-extension of 47th bit, if this
> * changed %rcx, it was not canonical.
> */
> ALTERNATIVE "", \
> "shl $(64 - (47+1)), %rcx; \
>
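The shl/sar trick in the quoted comment can be written out in C roughly like this. It is a sketch assuming 48-bit virtual addresses (bit 47 is the sign bit, matching the hard-coded 47 above); VADDR_BITS and is_canonical() are illustrative names, not kernel identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual address bits, so bit 47 is the sign bit. */
#define VADDR_BITS 48

/*
 * Shift left so bit 47 lands in bit 63, then shift right arithmetically
 * to copy it back into the top 16 bits (the shl/sar pair). If the
 * result differs from the input, the address was not canonical.
 *
 * Note: converting to int64_t and right-shifting a negative value is
 * implementation-defined in C; gcc and clang do an arithmetic shift,
 * which is what this sketch relies on.
 */
static bool is_canonical(uint64_t addr)
{
	const int shift = 64 - VADDR_BITS;
	uint64_t sext = (uint64_t)((int64_t)(addr << shift) >> shift);

	return sext == addr;
}
```

For example, 0x00007fffffffffff and 0xffff800000000000 pass the check, while 0x0000800000000000 does not.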
On Sun, Apr 26, 2015 at 04:39:38PM -0700, Andy Lutomirski wrote:
> I know it would be ugly, but would it be worth saving two bytes by
> using ALTERNATIVE "jmp 1f", "shl ...", ...?
Damn, it is actually visible even that saving the unconditional forward
JMP makes the numbers marginally nicer (E: row
On Mon, Apr 27, 2015 at 02:08:40PM +0200, Denys Vlasenko wrote:
> > 819ef40c: 48 c1 e1 10    shl $0x10,%rcx
> > 819ef410: 48 c1 f9 10    sar $0x10,%rcx
> > 819ef414: 49 39 cb       cmp %rcx,%r11
> > 819ef417:
On 04/27/2015 01:35 PM, Borislav Petkov wrote:
> On Mon, Apr 27, 2015 at 10:53:05AM +0200, Borislav Petkov wrote:
>> ALTERNATIVE "",
>> "shl $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \
>> sar $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \
>>
On Mon, Apr 27, 2015 at 10:53:05AM +0200, Borislav Petkov wrote:
> ALTERNATIVE "",
> "shl $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \
> sar $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \
> cmpq %rcx, %r11 \
> jne
On Mon, Apr 27, 2015 at 12:07:14PM +0200, Denys Vlasenko wrote:
> /* Only three 0x66 prefixes for NOP for fast decode on all CPUs */
> ALTERNATIVE ".byte 0x66,0x66,0x66,0x90 \
> .byte 0x66,0x66,0x66,0x90 \
> .byte 0x66,0x66,0x66,0x90",
>
On 04/27/2015 10:53 AM, Borislav Petkov wrote:
> On Sun, Apr 26, 2015 at 04:39:38PM -0700, Andy Lutomirski wrote:
>>> +#define X86_BUG_CANONICAL_RCX X86_BUG(8) /* SYSRET #GPs when %RCX
>>> non-canonical */
>>
>> I think that "sysret" should appear in the name.
>
> Yeah, I thought about it too, w
On Sun, Apr 26, 2015 at 04:39:38PM -0700, Andy Lutomirski wrote:
> > +#define X86_BUG_CANONICAL_RCX X86_BUG(8) /* SYSRET #GPs when %RCX
> > non-canonical */
>
> I think that "sysret" should appear in the name.
Yeah, I thought about it too, will fix.
> Oh no! My laptop is currently bug-free, a
>
> diff --git a/arch/x86/include/asm/cpufeature.h
> b/arch/x86/include/asm/cpufeature.h
> index 7ee9b94d9921..8d555b046fe9 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -265,6 +265,7 @@
> #define X86_BUG_11AP X86_BUG(5) /* Bad local API
On Fri, Apr 24, 2015 at 7:17 PM, Denys Vlasenko
wrote:
> On Fri, Apr 24, 2015 at 10:50 PM, Andy Lutomirski wrote:
>> On Fri, Apr 24, 2015 at 1:46 PM, Denys Vlasenko
>>>> This might be way more trouble than it's worth.
>>>
>>> Exactly my feeling. What are you trying to save? About four CPU
>>> cyc
On Fri, Apr 24, 2015 at 4:18 AM, Andy Lutomirski wrote:
> On Thu, Apr 23, 2015 at 7:15 PM, Andy Lutomirski wrote:
> Even if the issue affects SYSRETQ, it could be that we don't care. If
> the extent of the info leak is whether we context switched during a
> 64-bit syscall to a non-syscall contex
On Sat, Apr 25, 2015 at 11:12:06PM +0200, Borislav Petkov wrote:
> I've prepended the perf stat output with markers A:, B: or C: for easier
> comparing. The markers mean:
>
> A: Linus' master from a couple of days ago + tip/master + tip/x86/asm
> B: With Andy's SYSRET patch ontop
> C: Without RCX
On Thu, Apr 23, 2015 at 07:15:01PM -0700, Andy Lutomirski wrote:
> AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET
> with SS == 0 results in an invalid usermode state in which SS is
> apparently equal to __USER_DS but causes #SS if used.
>
> Work around the issue by replacing NU
On Fri, Apr 24, 2015 at 10:50 PM, Andy Lutomirski wrote:
> On Fri, Apr 24, 2015 at 1:46 PM, Denys Vlasenko
>>> This might be way more trouble than it's worth.
>>
>> Exactly my feeling. What are you trying to save? About four CPU
>> cycles of checking %ss != __KERNEL_DS on each switch_to?
>> That's
On 04/24/2015 01:50 PM, Andy Lutomirski wrote:
>>
>> Exactly my feeling. What are you trying to save? About four CPU
>> cycles of checking %ss != __KERNEL_DS on each switch_to?
>> That's not worth bothering about. Your last patch seems to be perfect.
>
> We'll have to do the write to ss almost eve
On Fri, Apr 24, 2015 at 1:21 PM, Andy Lutomirski wrote:
>
> 2. SYSRETQ. The only way that I know of to see the problem is SYSRETQ
> followed by a far jump or return. This is presumably *extremely*
> rare.
>
> What if we fixed #2 up in do_stack_segment. We should double-check
> the docs, but I t
On Fri, Apr 24, 2015 at 1:46 PM, Denys Vlasenko
wrote:
> On Fri, Apr 24, 2015 at 10:21 PM, Andy Lutomirski wrote:
>> On Thu, Apr 23, 2015 at 7:15 PM, Andy Lutomirski wrote:
>>> AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET
>>> with SS == 0 results in an invalid usermode stat
On Fri, Apr 24, 2015 at 10:21 PM, Andy Lutomirski wrote:
> On Thu, Apr 23, 2015 at 7:15 PM, Andy Lutomirski wrote:
>> AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET
>> with SS == 0 results in an invalid usermode state in which SS is
>> apparently equal to __USER_DS but causes
On Thu, Apr 23, 2015 at 7:15 PM, Andy Lutomirski wrote:
> AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET
> with SS == 0 results in an invalid usermode state in which SS is
> apparently equal to __USER_DS but causes #SS if used.
>
> Work around the issue by replacing NULL SS val
On Fri, Apr 24, 2015 at 12:59:06PM +0200, Borislav Petkov wrote:
> Yeah, that makes more sense. So I tested Andy's patch but changed it as
> above and I get
>
> $ taskset -c 0 ./sysret_ss_attrs_32
> [RUN] Syscalls followed by SS validation
> [OK] We survived
Andy, you wanted the 64-bit versi
On Fri, Apr 24, 2015 at 1:41 PM, Linus Torvalds
wrote:
> On Fri, Apr 24, 2015 at 10:33 AM, Brian Gerst wrote:
>>
>> To clarify, I was thinking of the CONFIG_PREEMPT case. A nested
>> interrupt wouldn't change SS, and IST interrupts can't schedule.
>
> It has absolutely nothing to do with nested
On Fri, Apr 24, 2015 at 10:33 AM, Brian Gerst wrote:
>
> To clarify, I was thinking of the CONFIG_PREEMPT case. A nested
> interrupt wouldn't change SS, and IST interrupts can't schedule.
It has absolutely nothing to do with nested interrupts or CONFIG_PREEMPT.
The problem happens simply becaus
On Fri, Apr 24, 2015 at 12:25 PM, Linus Torvalds
wrote:
> On Fri, Apr 24, 2015 at 5:00 AM, Brian Gerst wrote:
>>
>> So actually this isn't a preemption issue, as the NULL SS is coming
>> from an interrupt from userspace (timer tick, etc.).
>
> It *is* a preemption issue, in the sense that the int
On Fri, Apr 24, 2015 at 5:00 AM, Brian Gerst wrote:
>
> So actually this isn't a preemption issue, as the NULL SS is coming
> from an interrupt from userspace (timer tick, etc.).
It *is* a preemption issue, in the sense that the interrupt that
clears SS also then returns to user space using an "i
On Fri, Apr 24, 2015 at 7:27 AM, Denys Vlasenko wrote:
> On 04/24/2015 04:15 AM, Andy Lutomirski wrote:
>> AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET
>> with SS == 0 results in an invalid usermode state in which SS is
>> apparently equal to __USER_DS but causes #SS if used.
On 04/24/2015 04:15 AM, Andy Lutomirski wrote:
> AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET
> with SS == 0 results in an invalid usermode state in which SS is
> apparently equal to __USER_DS but causes #SS if used.
>
> Work around the issue by replacing NULL SS values with
On Fri, Apr 24, 2015 at 11:59:42AM +0200, Denys Vlasenko wrote:
> I propose a more conservative check:
>
> if (ss_sel != __KERNEL_DS)
> loadsegment(ss, __KERNEL_DS);
>
> I would propose this even if I would see no real case where it matters...
> but I even do s
On 04/24/2015 04:15 AM, Andy Lutomirski wrote:
> AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET
> with SS == 0 results in an invalid usermode state in which SS is
> apparently equal to __USER_DS but causes #SS if used.
>
> Work around the issue by replacing NULL SS values with
On Thu, Apr 23, 2015 at 10:15 PM, Andy Lutomirski wrote:
> AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET
> with SS == 0 results in an invalid usermode state in which SS is
> apparently equal to __USER_DS but causes #SS if used.
>
> Work around the issue by replacing NULL SS va
On Thu, Apr 23, 2015 at 7:15 PM, Andy Lutomirski wrote:
> AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET
> with SS == 0 results in an invalid usermode state in which SS is
> apparently equal to __USER_DS but causes #SS if used.
>
> Work around the issue by replacing NULL SS val
AMD CPUs don't reinitialize the SS descriptor on SYSRET, so SYSRET
with SS == 0 results in an invalid usermode state in which SS is
apparently equal to __USER_DS but causes #SS if used.
Work around the issue by replacing NULL SS values with __KERNEL_DS
in __switch_to, thus ensuring that SYSRET nev