Re: STIBP by default.. Revert?

2018-11-20 Thread Arjan van de Ven

On 11/20/2018 11:27 PM, Jiri Kosina wrote:

On Mon, 19 Nov 2018, Arjan van de Ven wrote:


In the documentation, AMD officially recommends against this by default,
and I can speak for Intel that our position is that as well: this really
must not be on by default.


Thanks for pointing to the AMD doc, it's indeed clearly stated there.

Is there any chance this could perhaps be added to Intel documentation as
well, so that we avoid cases like this in the future?


absolutely that's now already in progress;
the doc publishing process is a bit on the long side unfortunately so it won't
be today ;)


Re: Re: STIBP by default.. Revert?

2018-11-20 Thread Jiri Kosina
On Mon, 19 Nov 2018, Arjan van de Ven wrote:

> In the documentation, AMD officially recommends against this by default, 
> and I can speak for Intel that our position is that as well: this really 
> must not be on by default.

Thanks for pointing to the AMD doc, it's indeed clearly stated there.

Is there any chance this could perhaps be added to Intel documentation as 
well, so that we avoid cases like this in the future?

The revision 3.0 of Intel doc from may 2018 [1] I am always looking into 
doesn't say anything discouraging about STIBP to me.

[1] 
https://software.intel.com/security-software-guidance/api-app/sites/default/files/336996-Speculative-Execution-Side-Channel-Mitigations.pdf?_gclid=5b78f4d130faf8.22277271-5b78f4d130fb70.17467890&_utm_source=xakep&_utm_campaign=mention17&_utm_medium=inline&_utm_content=lnk223716354570

Thanks,

-- 
Jiri Kosina
SUSE Labs



Re: STIBP by default.. Revert?

2018-11-20 Thread Jiri Kosina
On Sun, 18 Nov 2018, Linus Torvalds wrote:

> This was marked for stable, and honestly, nowhere in the discussion
> did I see any mention of just *how* bad the performance impact of this
> was.
> 
> When performance goes down by 50% on some loads, people need to start
> asking themselves whether it was worth it. It's apparently better to
> just disable SMT entirely, which is what security-conscious people do
> anyway.
> 
> So why do that STIBP slow-down by default when the people who *really*
> care already disabled SMT?
> 
> I think we should use the same logic as for L1TF: we default to
> something that doesn't kill performance. Warn once about it, and let
> the  crazy people say "I'd rather take a 50% performance hit than
> worry about a theoretical issue".

Just to update status quo here -- Thomas is working on polishing Tim's set 
into mergeable state, I've just sent him small addition on top that makes 
IBPB also be controlled via the same toggle.

That should make the whole 'spectre v2 userspace-to-userspace' mitigation 
control consistent and undestandable. And also give us even few more % 
back that are lost due to IBPB as well.

-- 
Jiri Kosina
SUSE Labs



Re: STIBP by default.. Revert?

2018-11-19 Thread Thomas Gleixner
On Sun, 18 Nov 2018, Tim Chen wrote:
> On 11/18/2018 02:17 PM, Jiri Kosina wrote:
> > On Sun, 18 Nov 2018, Linus Torvalds wrote:
> > 
> >>> So, I think it's as theoretical as any other spectrev2 (only with the
> >>> extra "HT" condition added on top).
> >>
> >> What? No.
> >>
> >> It's *way* more theoretical than something like meltdown, which could
> >> be trivially used to get data from another protection domain.
> > 
> > Oh yeah, I absolutely agree that spectrev2 and Meltdown and completely 
> > different beasts.
> > 
> >> Have you seen any actual realistic attacks for normal human users?
> >> Things where the *kernel* should actually care?
> >>
> >> The javascript thing is for the browser to fix up, 
> > 
> > It's probably not just browsers, but anything running JITed sandboxed 
> > code. So the most straightforward way might be the prctl() aproach, where 
> > userspace would claim "I do care about this, please fix it up for me". So 
> > prctl() + perhaps SECCOMP.
> > 
> > Which gets us back to Tim's fixup patch. Do you still prefer the revert, 
> > given the existence of that? I think that if Tim's fixup makes it through 
> > (it's currently missing SECCOMP handling, but that is trivial to add on 
> > top), it might be the best compromise. We'd also have have to make IBPB 
> > obey it to be consistent (and get even a few more % of performance back), 
> > but that's easy as well.
> > 
> I think if Thomas can merge my patchset along with Jiri's, the default
> option will become opt in for tasks that want the extra security and we
> won't lose performance.

If it would be in mergeable state 

Thanks,

tglx


Re: STIBP by default.. Revert?

2018-11-19 Thread Jiri Kosina
On Mon, 19 Nov 2018, Ingo Molnar wrote:

> > This was marked for stable, and honestly, nowhere in the discussion
> > did I see any mention of just *how* bad the performance impact of this
> > was.
> 
> Yeah. This was an oversight - we'll fix it!
> 
> > When performance goes down by 50% on some loads, people need to start
> > asking themselves whether it was worth it. It's apparently better to
> > just disable SMT entirely, which is what security-conscious people do
> > anyway.
> > 
> > So why do that STIBP slow-down by default when the people who *really*
> > care already disabled SMT?
> > 
> > I think we should use the same logic as for L1TF: we default to
> > something that doesn't kill performance. Warn once about it, and let
> > the  crazy people say "I'd rather take a 50% performance hit than
> > worry about a theoretical issue".
> 
> Yeah, absolutely.
> 
> We'll also require performance measurements in changelogs enabling any 
> sort of mitigation feature from now on - this requirement was implicit 
> but 53c613fe6349 flew in under the radar, so it's going to be explicit an 
> explicit requirement.

Do you already have an idea whether you'd proceed with Tim's patchset for 
current cycle? If so, great as far as I am concerned. If not, I'll send a 
patch that switches this to opt-in via kernel cmdline (+boot-time warning 
if not mitigated).

Thanks,

-- 
Jiri Kosina
SUSE Labs



Re: STIBP by default.. Revert?

2018-11-19 Thread Ingo Molnar


* Linus Torvalds  wrote:

> This was marked for stable, and honestly, nowhere in the discussion
> did I see any mention of just *how* bad the performance impact of this
> was.

Yeah. This was an oversight - we'll fix it!

> When performance goes down by 50% on some loads, people need to start
> asking themselves whether it was worth it. It's apparently better to
> just disable SMT entirely, which is what security-conscious people do
> anyway.
> 
> So why do that STIBP slow-down by default when the people who *really*
> care already disabled SMT?
> 
> I think we should use the same logic as for L1TF: we default to
> something that doesn't kill performance. Warn once about it, and let
> the  crazy people say "I'd rather take a 50% performance hit than
> worry about a theoretical issue".

Yeah, absolutely.

We'll also require performance measurements in changelogs enabling any 
sort of mitigation feature from now on - this requirement was implicit 
but 53c613fe6349 flew in under the radar, so it's going to be explicit an 
explicit requirement.

Thanks,

Ingo


Re: STIBP by default.. Revert?

2018-11-18 Thread Willy Tarreau
On Sun, Nov 18, 2018 at 02:40:28PM -0800, Tim Chen wrote:
> Tasks that want extra security will enable that via prctl interface or
> making themselves non-dumpable.

Well, you need to be careful regarding the last part of your option
above, because a number of network daemons become non-dumpable by
executing setuid() at boot, and certainly don't want to suffer a
performance loss as a side effect of wanting to become "normally"
secure. I'd suggest to use the prctl only so that it doesn't
randomly hit innocent applications that would only have as a last
resort to turn off reasonable security features to avoid this impact.

Regards,
Willy


Re: STIBP by default.. Revert?

2018-11-18 Thread Andi Kleen
> So my patchset and Jiri's patchset should probably be merged together, so the
> users have a choice of the behavior.

... or delay Jiri's patchkit until the scheduler can actually check
properly for the cases when it is really needed. 

-Andi


Re: STIBP by default.. Revert?

2018-11-18 Thread Andi Kleen
Linus Torvalds  writes:

> On Sun, Nov 18, 2018 at 2:17 PM Jiri Kosina  wrote:
>> Which gets us back to Tim's fixup patch. Do you still prefer the revert,
>> given the existence of that?
>
> I don't think the code needs to be reverted, but the *behavior* of
> just unconditionally enabling STIBP needs to be reverted.

Actually I think it should be reverted. Yes of course opt-in
is needed.

But also when you opt-in it doesn't make sense to set STIBP
when the sibling is running the same security context, which
is actually a common case. So to even use it properly you would
need some scheduler support to detect these cases and only
enable it then with opt-in. These patches didn't even try to tackle
this problem.

-Andi


Re: Re: STIBP by default.. Revert?

2018-11-18 Thread Arjan van de Ven

On 11/19/2018 6:00 AM, Linus Torvalds wrote:

On Sun, Nov 18, 2018 at 1:49 PM Jiri Kosina  wrote:



So why do that STIBP slow-down by default when the people who *really*
care already disabled SMT?


BTW for them, there is no impact at all.


Right. People who really care about security and are anal about it do
not see *any* advantage of the patch.


In the documentation, AMD officially recommends against this by default, and I 
can
speak for Intel that our position is that as well: this really must not be on 
by default.

STIBP and its friends are there as tools, and were created early on as big 
hammers because
that is all that one can add in a microcode update.. expensive big hammers.

In some ways it's analogous to the "disable caches" bit in CR0. sure it's there 
as a big hammer,
but you don't set that always just because caches could be used for a side 
channel

Using these tools much more surgically is fine, if a paranoid task wants it for 
example,
or when you know you are doing a hard core security transition. But always on? 
Yikes.



Re: STIBP by default.. Revert?

2018-11-18 Thread Jiri Kosina
On Sun, 18 Nov 2018, Jiri Kosina wrote:

> It's probably not just browsers, but anything running JITed sandboxed 
> code. So the most straightforward way might be the prctl() aproach, where 
> userspace would claim "I do care about this, please fix it up for me". So 
> prctl() + perhaps SECCOMP.

I've just sent SECCOMP handling as a followup to Tim's set.

I still feel like we should make STIBP and IBPB behavior consistent (in a 
sense they should always be used both, or none of them), but that might be 
additional 4.21 optimization.

Thanks,

-- 
Jiri Kosina
SUSE Labs



Re: STIBP by default.. Revert?

2018-11-18 Thread Tim Chen
On 11/18/2018 02:36 PM, Linus Torvalds wrote:
> On Sun, Nov 18, 2018 at 2:17 PM Jiri Kosina  wrote:
>> Which gets us back to Tim's fixup patch. Do you still prefer the revert,
>> given the existence of that?
> 
> I don't think the code needs to be reverted, but the *behavior* of
> just unconditionally enabling STIBP needs to be reverted.

Yes, in the patchset I proposed on top of Jiri's, the default will
be enabling STIBP only for tasks that ask for STIBP via prctl or
those that are non-dumpable, like sshd.

> 
> Because it was clearly way more expensive than people were told.

I did realize the performance implications of making STIBP on for
all tasks.  Here's to recap performance impact of STIBP when I posted
my patchset:

"...leaving STIBP on all the time is expensive for certain
applications that have frequent indirect branches. One such application
is perlbench in the SpecInt Rate 2006 test suite which shows a
21% reduction in throughput."


I think if we include Jiri's patchset for 4.20, we should have
my patchset also included to avoid the performance impact on
regular tasks but still allow admin and applications to turn on STIBP if needed.

Thanks.

Tim


Re: STIBP by default.. Revert?

2018-11-18 Thread Dave Hansen


> On Nov 18, 2018, at 2:17 PM, Jiri Kosina  wrote:
> 
> It's probably not just browsers, but anything running JITed sandboxed 
> code. So the most straightforward way might be the prctl() aproach, where 
> userspace would claim "I do care about this, please fix it up for me". So 
> prctl() + perhaps SECCOMP.

Yeah, the prctl() shifts the pain to the right place: folks explicitly opting 
in.  Always-on seemed way too draconian to me.


Re: STIBP by default.. Revert?

2018-11-18 Thread Tim Chen
On 11/18/2018 02:17 PM, Jiri Kosina wrote:
> On Sun, 18 Nov 2018, Linus Torvalds wrote:
> 
>>> So, I think it's as theoretical as any other spectrev2 (only with the
>>> extra "HT" condition added on top).
>>
>> What? No.
>>
>> It's *way* more theoretical than something like meltdown, which could
>> be trivially used to get data from another protection domain.
> 
> Oh yeah, I absolutely agree that spectrev2 and Meltdown and completely 
> different beasts.
> 
>> Have you seen any actual realistic attacks for normal human users?
>> Things where the *kernel* should actually care?
>>
>> The javascript thing is for the browser to fix up, 
> 
> It's probably not just browsers, but anything running JITed sandboxed 
> code. So the most straightforward way might be the prctl() aproach, where 
> userspace would claim "I do care about this, please fix it up for me". So 
> prctl() + perhaps SECCOMP.
> 
> Which gets us back to Tim's fixup patch. Do you still prefer the revert, 
> given the existence of that? I think that if Tim's fixup makes it through 
> (it's currently missing SECCOMP handling, but that is trivial to add on 
> top), it might be the best compromise. We'd also have have to make IBPB 
> obey it to be consistent (and get even a few more % of performance back), 
> but that's easy as well.
> 
> Thanks,
> 

I think if Thomas can merge my patchset along with Jiri's, the default option 
will become
opt in for tasks that want the extra security and we won't lose performance.

Tasks that want extra security will enable that via prctl interface or
making themselves non-dumpable.

Admin that want security for all tasks will enable the strict option on boot to
enable STIBP for all tasks.

So my patchset and Jiri's patchset should probably be merged together, so the
users have a choice of the behavior.

Tim


Re: STIBP by default.. Revert?

2018-11-18 Thread Linus Torvalds
On Sun, Nov 18, 2018 at 2:17 PM Jiri Kosina  wrote:
> Which gets us back to Tim's fixup patch. Do you still prefer the revert,
> given the existence of that?

I don't think the code needs to be reverted, but the *behavior* of
just unconditionally enabling STIBP needs to be reverted.

Because it was clearly way more expensive than people were told.

Linus


Re: STIBP by default.. Revert?

2018-11-18 Thread Tony Luck
On Sun, Nov 18, 2018 at 2:19 PM Jiri Kosina  wrote:
> Which gets us back to Tim's fixup patch. Do you still prefer the revert,
> given the existence of that? I think that if Tim's fixup makes it through
> (it's currently missing SECCOMP handling, but that is trivial to add on
> top), it might be the best compromise. We'd also have have to make IBPB
> obey it to be consistent (and get even a few more % of performance back),
> but that's easy as well.

+1 for Tim's patch.  That make us more consistent with how we handled
L1TF (giving the system owner a control knob to decide whether they
want this level of fixup, based on their own analysis of their vulnerability).

-Tony


Re: STIBP by default.. Revert?

2018-11-18 Thread Jiri Kosina
On Sun, 18 Nov 2018, Linus Torvalds wrote:

> > So, I think it's as theoretical as any other spectrev2 (only with the
> > extra "HT" condition added on top).
> 
> What? No.
> 
> It's *way* more theoretical than something like meltdown, which could
> be trivially used to get data from another protection domain.

Oh yeah, I absolutely agree that spectrev2 and Meltdown and completely 
different beasts.

> Have you seen any actual realistic attacks for normal human users?
> Things where the *kernel* should actually care?
> 
> The javascript thing is for the browser to fix up, 

It's probably not just browsers, but anything running JITed sandboxed 
code. So the most straightforward way might be the prctl() aproach, where 
userspace would claim "I do care about this, please fix it up for me". So 
prctl() + perhaps SECCOMP.

Which gets us back to Tim's fixup patch. Do you still prefer the revert, 
given the existence of that? I think that if Tim's fixup makes it through 
(it's currently missing SECCOMP handling, but that is trivial to add on 
top), it might be the best compromise. We'd also have have to make IBPB 
obey it to be consistent (and get even a few more % of performance back), 
but that's easy as well.

Thanks,

-- 
Jiri Kosina
SUSE Labs



Re: STIBP by default.. Revert?

2018-11-18 Thread Willy Tarreau
On Sun, Nov 18, 2018 at 10:49:44PM +0100, Jiri Kosina wrote:
> odds are that people 
> who don't care about spectrev2 already have 'nospectre_v2' on their 
> command-line, so they are fine as well.

FWIW in our appliances, we never modify the boot loader's cmdline
in field, we only provide new kernel+rootfs images. We've however
disabled the config options for all this class of vulnerabilities.
As long as it remains possible to disable the new ones using config
options only, that's not an issue for me.

Cheers,
Willy


Re: STIBP by default.. Revert?

2018-11-18 Thread Linus Torvalds
On Sun, Nov 18, 2018 at 1:49 PM Jiri Kosina  wrote:
>
> > So why do that STIBP slow-down by default when the people who *really*
> > care already disabled SMT?
>
> BTW for them, there is no impact at all.

Right. People who really care about security and are anal about it do
not see *any* advantage of the patch.

But people who aren't that worried suddenly see potentially huge slowdowns.

In other words, the behavior of the patch is basically essentially
exactly the reverse of what you'd want. You penalize the people who
don't even want it and don't care.

> STIBP is only activated on systems with HT on; plus odds are that people
> who don't care about spectrev2 already have 'nospectre_v2' on their
> command-line, so they are fine as well.

I'm talking about *normal* people. People who simply aren't all that
invested in this all. People who just want to get their work done.

> So, I think it's as theoretical as any other spectrev2 (only with the
> extra "HT" condition added on top).

What? No.

It's *way* more theoretical than something like meltdown, which could
be trivially used to get data from another protection domain.

Have you seen any actual realistic attacks for normal human users?
Things where the *kernel* should actually care?

The javascript thing is for the browser to fix up, not for the kernel
to say "now everything should run up to 50% slower".

   Linus


Re: STIBP by default.. Revert?

2018-11-18 Thread Jiri Kosina
On Sun, 18 Nov 2018, Linus Torvalds wrote:

> This was marked for stable, and honestly, nowhere in the discussion
> did I see any mention of just *how* bad the performance impact of this
> was.

Frankly, I ran some benchmarks myself, and am seeing very, very 
varying/noisy results, which were rather inconclusive across the 
individual workloads.

The article someone pointed me to at Phoronix yesterday also was talking 
about something between 3% and 20% IIRC.

> When performance goes down by 50% on some loads, people need to start
> asking themselves whether it was worth it. It's apparently better to
> just disable SMT entirely, which is what security-conscious people do
> anyway.
> 
> So why do that STIBP slow-down by default when the people who *really*
> care already disabled SMT?

BTW for them, there is no impact at all. And they probably did it because 
of crypto libraries being implemented in a way that their observable 
operation depends on value of the secrets (tlbleed etc), but this is being 
fixed in the said crypto code.

STIBP is only activated on systems with HT on; plus odds are that people 
who don't care about spectrev2 already have 'nospectre_v2' on their 
command-line, so they are fine as well.

> I think we should use the same logic as for L1TF: we default to
> something that doesn't kill performance. Warn once about it, and let
> the  crazy people say "I'd rather take a 50% performance hit than
> worry about a theoretical issue".

So, I think it's as theoretical as any other spectrev2 (only with the 
extra "HT" condition added on top).

Namely, I think that scenarios such as "one browser tab's javascript is 
attacking other browser's tab private data on a sibling" is rather a 
practical one, I've seen such demos (not with the sibling condition being 
in place, but I don't think that matters that much). The same likely holds 
for server threads serving individual requests in separate threads, etc 
(but yeah, you need to have proper gadgets there in place already in 
server's .text, which makes it much less practical).

For L1TF, the basic argument AFAICS was, that the default situation is 
"only special people care about strict isolation between VMs". But this is 
isolation between individual processess.

Tim is currently working [1] on a patchset on top of my original 
STIBP-enabling commit, that will make STIBP to be used in much smaller 
number of cases (taking prctl()-based aproach as one of the possible ones, 
similarly as we did for SSBD), and as I indicated in that thread, I think 
it should be really considered for this -rc cycle still if it lands in tip 
in a reasonable future.

To conclude -- I am quite happy that this finally started to move 
(although it's sad that some of it is due to clickbaity article titles, 
but whatever), as Intel didn't really provide any patch / guidance (*) in 
past ~1 year to those who care about spectrev2 isolation on HT, which is 
something I wasn't really feeling comfortable with, and that's why I 
submitted the patch.

If we make it opt-in (on kernel cmdline) and issue big fat warning if not 
mitigated, fine, but then we're bit incosistent how we deal with 
cross-core (IBPB) and cross-thread (STIBP) spectrev2 security in the 
kernel code.

If we take Tim's aproach, even better -- there would be means available to 
make system secure, although not sacrifying performance by default.

I would not prefer going the plain revert path though, as that leaves no 
other option to mitigate rather than turning SMT off, which might possibly 
have even *worse* performance numbers depending on workload.

[1] 
https://lore.kernel.org/lkml/?q=cover.1542418936.git.tim.c.chen%40linux.intel.com

(*) no code to implement STIBP sanely, no recommendation about turning SMT 
off at least for some workloads

Thanks,

-- 
Jiri Kosina
SUSE Labs