Re: meltdown
Hello,

On Sat, 6 Jan 2018 07:33:50 + m...@netbsd.org wrote:

> Loongson-2 had an issue where from branch prediction it would prefetch
> instructions from the I/O area and deadlock.
>
> This happened in normal usage so we build the kernel with a binutils
> flag to output different jumps and flush the BTB on kernel entry.*

I remember when we added support for those flags to our gcc & binutils.

> I wouldn't count on MIPS CPUs to hold under the same level of scrutiny
> as x86 CPUs, luckily they're pretty obscure (and most probably aren't
> speculative).

Yeah, I doubt there are a lot of IRIX servers left, and embedded MIPS is
probably safe ;)

have fun
Michael
Re: meltdown
> On Jan 5, 2018, at 8:55 PM, Thor Lancelot Simon wrote:
>
> On Thu, Jan 04, 2018 at 04:58:30PM -0500, Mouse wrote:
>>> As I understand it, on intel cpus and possibly more, we'll need to
>>> unmap the kernel on userret, or else userland can read arbitrary
>>> kernel memory.
>>
>> "Possibly more"? Anything that does speculative execution needs a good
>> hard look, and that's damn near everything these days.
>
> I wonder about just "these days". The potential for this kind of problem
> goes all the way back to STRETCH or the 6600, doesn't it? If they had
> memory permissions, which I frankly don't know. And even in microprocessors
> it's got to go back to... the end of the 1980s (R6000?) certainly the 1990s.

No, the issue here isn't permissions, the issue is speculative execution
that leaves observable side effects (such as the existence of cache
entries) after the speculative path is abandoned. And in the case of
Meltdown (though not Spectre) it also requires having the speculative
load issue omit the access permission check.

CDC 6600 has memory relocation, but not permissions, and in any case it
does not have speculative execution of any type. It does have multiple
issue, of course, but that alone is not sufficient to create the
vulnerability.

> Though of course "fail early" is an obvious principle to security types,
> given the cost of aborting work in progress I can easily see the
> opposite being true for CPU designers (I'm not one, so I don't really
> know). Which idiom (check permissions, then speculate / speculate, then
> check permissions) is more common?

Clearly it depends on the design, either on what's straightforward or
efficient to do, or on what is considered essential by the particular
designers involved.

Presumably (one hopes) the result of the current work is that design
techniques will change. And especially, that a better and wider
understanding of side channel attacks will appear.

Side channel attacks can be quite strange and esoteric. They are worth
reading about. My favorite is one I read a year or so ago, a paper
describing capturing the sound made by the electronics inside a cell
phone as it was performing an RSA crypto operation. This allowed the
attacker to reconstruct the RSA secret key without having to install any
special software in the phone, and without having to tamper with the
phone physically in any way.

paul
Re: meltdown
>> Though of course "fail early" is an obvious principle to security
>> types, given the cost of aborting work in progress I can easily see
>> the opposite being true for CPU designers (I'm not one, so I don't
>> really know). Which idiom (check permissions, then speculate /
>> speculate, then check permissions) is more common?
> No idea, one would think that failing early in order to avoid
> unnecessary resource usage would be useful.

Perhaps, but _not_ failing is a win if it turns out the spec ex is
confirmed instead of annulled. And if the silicon would be sitting idle
otherwise, the only resource used is power. (And die area, but that's
used in a static sense, not a dynamic sense.)

> Then again, the problem seems to be that not everything from the
> speculative path gets canceled / annulled, not so much that the
> speculation took place.

I agree.

For cache issues... it might be useful to freeze spec ex on a cache
miss. Go ahead and service the cache miss, but keep the result in a
separate cache line, not part of the normal cache. On annulment, just
drop it; on confirmation, push it into the normal cache and unfreeze.
If you want to get really fancy, have multiple speculative cache lines,
kind of a small cache in front of the regular cache purely for
speculative use, and don't freeze speculation unless it fills up.
Though the spectre (ha ha) of coherency then raises its ugly head.

Does anyone know how the typical time to service a cache miss compares
with the typical time to determine whether spec ex is annulled or
confirmed? If the former is longer, or at least not much shorter, than
the latter, then this wouldn't even impair performance much in the miss
case.

Of course, this wouldn't do anything about covert channels other than
the cache. But it'd stop anything using the cache for a covert channel
between spec ex and mainline code cold (meltdown and some variants of
spectre). It's only a partial fix, but, for most purposes, that's
better than no fix.

Of course, some of the covert channels touched on in the spectre paper
are not fixable, such as power consumption and EMI generation;
fortunately, they are significantly harder to read from software.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
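To make the side-buffer idea above a bit more concrete, here is a toy model in Python. It is only a sketch of the proposal as described in this message, not of any real CPU: the class and method names (SpeculativeCache, speculative_load, annul, confirm, is_fast) are made up for illustration, the "cache" is a Python set, and coherency is ignored entirely.

```python
class SpeculativeCache:
    """Toy model: a main cache plus a small side buffer for speculative fills.

    All names here are hypothetical; this models the idea from the thread,
    not the behaviour of any real processor.
    """

    def __init__(self, spec_slots=2):
        self.main = set()             # lines a later timing probe could observe
        self.spec = []                # lines fetched under speculation only
        self.spec_slots = spec_slots  # capacity of the speculative side buffer

    def speculative_load(self, line):
        """Return True if speculation may continue, False if it must freeze."""
        if line in self.main or line in self.spec:
            return True               # hit: no new cache state is created
        if len(self.spec) >= self.spec_slots:
            return False              # side buffer full: freeze until resolved
        self.spec.append(line)        # service the miss into the side buffer
        return True

    def annul(self):
        """Speculation was wrong: drop the side buffer, main cache untouched."""
        self.spec.clear()

    def confirm(self):
        """Speculation was right: promote the side buffer into the main cache."""
        self.main.update(self.spec)
        self.spec.clear()

    def is_fast(self, line):
        """What a timing probe would see once speculation has been resolved."""
        return line in self.main
```

In this model a line touched on an annulled path never becomes "fast", so a probe run afterwards learns nothing from the cache; a confirmed path behaves like a normal fill. Whether the freeze costs anything depends on the miss-latency question asked above, which the model does not try to answer.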
Re: meltdown
On Sat, Jan 06, 2018 at 01:41:38AM -0500, Michael wrote:
> R10k had all sorts of weirdo speculative execution related problems
> ( see hardware workarounds in the O2 ), and I doubt it's the first to
> implement it.

Loongson-2 had an issue where from branch prediction it would prefetch
instructions from the I/O area and deadlock.

This happened in normal usage so we build the kernel with a binutils
flag to output different jumps and flush the BTB on kernel entry.*

I wouldn't count on MIPS CPUs to hold under the same level of scrutiny
as x86 CPUs, luckily they're pretty obscure (and most probably aren't
speculative).

* https://sourceware.org/ml/binutils/2009-11/msg00387.html
Re: meltdown
Hello,

On Fri, 5 Jan 2018 20:55:19 -0500 Thor Lancelot Simon wrote:

> On Thu, Jan 04, 2018 at 04:58:30PM -0500, Mouse wrote:
> > > As I understand it, on intel cpus and possibly more, we'll need to
> > > unmap the kernel on userret, or else userland can read arbitrary
> > > kernel memory.
> >
> > "Possibly more"? Anything that does speculative execution needs a good
> > hard look, and that's damn near everything these days.
>
> I wonder about just "these days". The potential for this kind of problem
> goes all the way back to STRETCH or the 6600, doesn't it? If they had
> memory permissions, which I frankly don't know. And even in microprocessors
> it's got to go back to... the end of the 1980s (R6000?) certainly the 1990s.

R10k had all sorts of weirdo speculative execution related problems
( see hardware workarounds in the O2 ), and I doubt it's the first to
implement it.

> Though of course "fail early" is an obvious principle to security types,
> given the cost of aborting work in progress I can easily see the
> opposite being true for CPU designers (I'm not one, so I don't really
> know). Which idiom (check permissions, then speculate / speculate, then
> check permissions) is more common?

No idea, one would think that failing early in order to avoid
unnecessary resource usage would be useful.
Then again, the problem seems to be that not everything from the
speculative path gets canceled / annulled, not so much that the
speculation took place.

have fun
Michael
Re: meltdown
>> "Possibly more"? Anything that does speculative execution needs a
>> good hard look, and that's damn near everything these days.
> I wonder about just "these days". The potential for this kind of
> problem goes all the way back to STRETCH or the 6600, doesn't it?

I don't know; I don't know enough about either.

> Though of course "fail early" is an obvious principle to security
> types, given the cost of aborting work in progress I can easily see
> the opposite being true for CPU designers

I think it's less the cost of aborting work in progress and more the
(performance) cost of not keeping silicon busy all the time.

> (I'm not one, so I don't really know).

Me neither. But it seems passing obvious to me that these hardware
bugs were at least partially driven by customer demand for performance.
And, to be sure, there are workloads for which neither meltdown nor
spectre is a significant risk, even if the hardware is vulnerable.

> Which idiom (check permissions, then speculate / speculate, then
> check permissions) is more common?

I don't know. But the problem is only partially when permissions get
checked. Consider spectre used by sandboxed code to read outside the
sandbox within a single process; this is doing nothing that, from the
hardware point of view, would violate permissions. I could easily see
a CPU designer saying "So what's the problem if the code can read that
memory? It can read it anytime it wants with a simple load anyway.".

The problem is also failure to roll back _all_ side effects when
annulling speculative execution. (To be sure, even if that were done
it wouldn't fix quite the whole problem; closing one side-channel
doesn't necessarily close other side-channels. But it would help.)

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: meltdown
On Thu, Jan 04, 2018 at 04:58:30PM -0500, Mouse wrote:
> > As I understand it, on intel cpus and possibly more, we'll need to
> > unmap the kernel on userret, or else userland can read arbitrary
> > kernel memory.
>
> "Possibly more"? Anything that does speculative execution needs a good
> hard look, and that's damn near everything these days.

I wonder about just "these days". The potential for this kind of problem
goes all the way back to STRETCH or the 6600, doesn't it? If they had
memory permissions, which I frankly don't know. And even in microprocessors
it's got to go back to... the end of the 1980s (R6000?) certainly the 1990s.

Though of course "fail early" is an obvious principle to security types,
given the cost of aborting work in progress I can easily see the
opposite being true for CPU designers (I'm not one, so I don't really
know). Which idiom (check permissions, then speculate / speculate, then
check permissions) is more common?

Thor
Re: meltdown
On Jan 5, 2018, at 8:52 AM, wrote:

> so the illegal read is also speculative, and is voided (exception
> and all) when the wrong branch prediction is sorted out. But it
> looks like the paper is saying that refinement has not been
> demonstrated, though such branch prediction hacks have been shown
> in other exploits. Still, if that can be done, a test for
> "SEGV too often" is no help.

Actually, the javascript exploit works exactly in this way. Sigh.
RE: meltdown
> I think you are confusing spectre and meltdown.

Yes, my apologies.

--Terry
Re: meltdown
> On Jan 4, 2018, at 6:01 PM, Warner Losh wrote:
>
> On Thu, Jan 4, 2018 at 2:58 PM, Mouse wrote:
> > As I understand it, on intel cpus and possibly more, we'll need to
> > unmap the kernel on userret, or else userland can read arbitrary
> > kernel memory.
>
> "Possibly more"? Anything that does speculative execution needs a good
> hard look, and that's damn near everything these days.
>
> > Also, I understand that to exploit this, one has to attempt to access
> > kernel memory a lot, and SEGV at least once per bit.
>
> I don't think so. Traps that would be taken during normal execution
> are not taken during speculative execution. The problem is, to quote
> one writeup I found, "Intel CPUs are allowed to access kernel memory
> when performing speculative execution, even when the application in
> question is running in user memory space. The CPU does check to see if
> an invalid memory access occurs, but it performs the check after
> speculative execution, not before.". This means that things like cache
> line loads can occur based on values the currently executing process
> should not be able to access; timing access to data that cache-collides
> with the cache lines of interest reveals the leaked bit(s).
>
> Nowhere in there is a SEGV generated.
>
> That's the meltdown stuff. Spectre targets other things (I've seen
> branch prediction mentioned) to leak information around protection
> barriers.
>
> I think you are confusing spectre and meltdown.
>
> meltdown requires a sequence like:
>
> exception (*0 = 0 or a = 1 / 0);
> do speculative read
>
> to force a trip into kernel land just before the speculative read so that
> otherwise not readable stuff gets (or does not get) read into cache which can
> then be probed for data.

No, that's not correct. You were being misled by the "Toy example".
The toy example demonstrates that speculative operations are done after
the point in the code that generates an exception, but it in itself is
NOT the exploit.

The exploit has the form:

    x = read(secret_memory_location);
    touch (cacheline[x]);
    while (1) ;

The first line will SEGV, of course, but in the vulnerable CPUs the
speculative load is issued before that happens. And also before the
SEGV happens, cacheline[x] is touched, making that line resident in the
cache. This "transmits to the side channel".

Next, the SEGV happens. The exploit catches that, and then it does a
timing test on references to cacheline[i] to see which i is now
resident. That i is the value of x.

As the paper points out, it would be possible in principle to prefix the
exploit with

    if (false)  // predict_true

so the illegal read is also speculative, and is voided (exception and
all) when the wrong branch prediction is sorted out. But it looks like
the paper is saying that refinement has not been demonstrated, though
such branch prediction hacks have been shown in other exploits. Still,
if that can be done, a test for "SEGV too often" is no help.

The Meltdown paper clearly says that the KAISER fix cures this
vulnerability. And while it doesn't say so, it is also clear that the
problem does not exist on CPUs where speculative memory references do
page protection checks.

All the above applies to Meltdown. Spectre is unrelated in its core
mechanism. The fact that both eventually end up using side channels and
were published at the same time seems to have caused some confusion
between the two. It is important to understand they are independent,
stem from different underlying problems, apply to a different set of
vulnerable chips, and have different cures.

paul
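For readers who want to see the transmit/probe logic above laid out step by step, here is a purely illustrative simulation in Python. It does not touch real memory, caches or timers: the "cache" is a Python set, and the function name run_round and its two flags are hypothetical names invented for this sketch. It only models the reasoning in this message.

```python
def run_round(secret, check_before_speculate=False,
              behind_mispredicted_branch=False):
    """Simulate one transmit/probe round for a single byte value.

    Returns (recovered, faulted): the value the probe phase finds
    (or None), and whether a SEGV would have been delivered.
    This is a model of the reasoning only, not an exploit.
    """
    cached = set()  # indices i for which cacheline[i] is resident

    # Transmit phase: x = read(secret); touch(cacheline[x]).
    if not check_before_speculate:
        # Vulnerable design: the speculative load forwards the value
        # before the permission check, so the dependent touch happens.
        x = secret
        cached.add(x)
    # If the permission check comes first, nothing is forwarded and
    # no cache line depends on the secret.

    # The fault is architecturally delivered unless the illegal read
    # sat behind a branch that was predicted taken but not taken.
    faulted = not behind_mispredicted_branch

    # Probe phase: "time" every line; here a resident line counts as fast.
    recovered = next((i for i in range(256) if i in cached), None)
    return recovered, faulted
```

In this model, run_round(0x5A) recovers the byte and reports a fault, run_round(0x5A, behind_mispredicted_branch=True) recovers it with no fault at all (which is why counting SEGVs would not help in that case), and run_round(0x5A, check_before_speculate=True) recovers nothing.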
Re: meltdown
> If there's anything this issue showed, it's that we definitely need
> fewer people independently considering the issue and openly
> discussing their own (occasionally wrong) suggestions.

Actually, it seems to me we need more. More minds looking at it, more
discussion of the various ramifications and workarounds. Lack of
public discussion serves nobody at this point, possibly excepting
chip-makers trying to downplay their bugs.

The hardware bugs behind these (that speculative execution doesn't make
security checks correctly and doesn't roll back all its side effects
when annulled) are so ubiquitous that the _correct_ fix - buying
non-buggy hardware - is close to impossible. The only thing most
people can do is try to find workarounds. I feel reasonably sure that,
at this point, there are at least a few exploitable side-channels and a
few workarounds that aren't known publicly (possibly at all), and more
people thinking about them is the only thing likely to fix that.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: meltdown
On Fri, Jan 05, 2018 at 02:48:11AM -0600, Dave Huang wrote:
> On Jan 4, 2018, at 15:22, Phil Nelson wrote:
> > How about turning on the workaround for any process that ignores
> > or catches SEGV. Any process that is terminated by a SEGV should
> > be safe, shouldn't it?
>
> Isn't there a suggested mitigation? Seems to me NetBSD should implement
> it as suggested, rather than coming up with its own special criteria
> for when to enable the workaround.

BTW: latest summary below:

https://security.googleblog.com/2018/01/more-details-about-mitigations-for-cpu_4.html
https://www.theregister.co.uk/2018/01/05/spectre_flaws_explained/

Regards,
-- 
Piotr 'aniou' Meyer
Re: meltdown
If there's anything this issue showed, it's that we definitely need
fewer people independently considering the issue and openly
discussing their own (occasionally wrong) suggestions.

It was just a suggestion, I'm not a source of authority.
Re: meltdown
On Jan 4, 2018, at 15:22, Phil Nelson wrote:
> How about turning on the workaround for any process that ignores
> or catches SEGV. Any process that is terminated by a SEGV should
> be safe, shouldn't it?

Isn't there a suggested mitigation? Seems to me NetBSD should implement
it as suggested, rather than coming up with its own special criteria
for when to enable the workaround.
-- 
Name: Dave Huang         |  Mammal, mammal / their names are called /
INet: k...@azeotrope.org |  they raise a paw / the bat, the cat /
Telegram: @DahanC        |  dolphin and dog / koala bear and hog -- TMBG
Dahan: Hani G Y+C 42 Y++ L+++ W- C++ T++ A+ E+ S++ V++ F- Q+++ P+ B+ PA+ PL++
Re: meltdown
On Thursday 04 January 2018 12:49:22 m...@netbsd.org wrote:
> I wonder if we can count the number of SEGVs and if we get a few, turn
> on the workaround?

How about turning on the workaround for any process that ignores
or catches SEGV? Any process that is terminated by a SEGV should
be safe, shouldn't it?

--Phil

-- 
Phil Nelson, http://pcnelson.net
Re: meltdown
On Thu, Jan 4, 2018 at 2:58 PM, Mouse wrote:

> > As I understand it, on intel cpus and possibly more, we'll need to
> > unmap the kernel on userret, or else userland can read arbitrary
> > kernel memory.
>
> "Possibly more"? Anything that does speculative execution needs a good
> hard look, and that's damn near everything these days.
>
> > Also, I understand that to exploit this, one has to attempt to access
> > kernel memory a lot, and SEGV at least once per bit.
>
> I don't think so. Traps that would be taken during normal execution
> are not taken during speculative execution. The problem is, to quote
> one writeup I found, "Intel CPUs are allowed to access kernel memory
> when performing speculative execution, even when the application in
> question is running in user memory space. The CPU does check to see if
> an invalid memory access occurs, but it performs the check after
> speculative execution, not before.". This means that things like cache
> line loads can occur based on values the currently executing process
> should not be able to access; timing access to data that cache-collides
> with the cache lines of interest reveals the leaked bit(s).
>
> Nowhere in there is a SEGV generated.
>
> That's the meltdown stuff. Spectre targets other things (I've seen
> branch prediction mentioned) to leak information around protection
> barriers.

I think you are confusing spectre and meltdown.

meltdown requires a sequence like:

    exception (*0 = 0 or a = 1 / 0);
    do speculative read

to force a trip into kernel land just before the speculative read so that
otherwise not readable stuff gets (or does not get) read into cache which can
then be probed for data.

spectre requires a trip into the kernel to force execution of

    if (from-userland < boundary) {
        array[from-userland]
RE: meltdown
> As I understand it, on intel cpus and possibly more, we'll need to unmap
> the kernel on userret, or else userland can read arbitrary kernel
> memory.
>
> People seem to be mentioning a 50% performance penalty and we might do
> worse (we don't have vDSOs...)

I suggest sticking to the original papers, which are at meltdownattack.com:

https://meltdownattack.com/meltdown.pdf
and
https://spectreattack.com/spectre.pdf

The problems are fairly subtle. They demonstrated (in the Meltdown paper)
that you can use JIT compiled code (they used JavaScript in Chrome) to read
anything in the Chrome process's memory.

The kernel memory exploit is troubling, but the kernel/user separation
problem actually looks relatively *easier* to mitigate. I think more
thought needs to go into how this affects sandboxing approaches (JIT,
interpreters, Docker, Xen, etc). JIT, Docker and Xen are all mentioned as
having difficulties.

If I understand the paper correctly, the K/U space unmapping must be
accompanied on x86 (32/64) by KASLR; otherwise the things that must always
be mapped in U space (trap handlers, etc.) can be used to get access anyway.

I don't think you need to SEGV to get this data - the SEGVs can be on the
speculative ("predicted taken") branch path, but the branch is not actually
taken, so the SEGV never gets reported. However, the cache state is changed
by the speculative path.

But really: we all need to read the primary papers, and not pay too much
attention to the press about this; the press discussion is not really very
helpful (at least as of this morning EDT).

--Terry
Re: meltdown
> On Jan 4, 2018, at 4:58 PM, Mouse wrote:
>
>> As I understand it, on intel cpus and possibly more, we'll need to
>> unmap the kernel on userret, or else userland can read arbitrary
>> kernel memory.
>
> "Possibly more"? Anything that does speculative execution needs a good
> hard look, and that's damn near everything these days.
>
>> Also, I understand that to exploit this, one has to attempt to access
>> kernel memory a lot, and SEGV at least once per bit.
>
> I don't think so. Traps that would be taken during normal execution
> are not taken during speculative execution. The problem is, to quote
> one writeup I found, "Intel CPUs are allowed to access kernel memory
> when performing speculative execution, even when the application in
> question is running in user memory space. The CPU does check to see if
> an invalid memory access occurs, but it performs the check after
> speculative execution, not before.". This means that things like cache
> line loads can occur based on values the currently executing process
> should not be able to access; timing access to data that cache-collides
> with the cache lines of interest reveals the leaked bit(s).
>
> Nowhere in there is a SEGV generated.

That depends. The straightforward case of Meltdown starts with an
illegal load, which the CPU will execute anyway speculatively, resulting
in downstream code execution that can be used to change the cache state.
In that form, the load eventually aborts.

There's a discussion in the paper that the load could be preceded by a
branch not taken that's predicted taken. If so, the SEGV would indeed
not happen, but it isn't clear how feasible this is.

In any case, the problem would not occur in any CPU that does protection
checks prior to issuing speculative memory references.

paul
Re: meltdown
> As I understand it, on intel cpus and possibly more, we'll need to
> unmap the kernel on userret, or else userland can read arbitrary
> kernel memory.

"Possibly more"? Anything that does speculative execution needs a good
hard look, and that's damn near everything these days.

> Also, I understand that to exploit this, one has to attempt to access
> kernel memory a lot, and SEGV at least once per bit.

I don't think so. Traps that would be taken during normal execution
are not taken during speculative execution. The problem is, to quote
one writeup I found, "Intel CPUs are allowed to access kernel memory
when performing speculative execution, even when the application in
question is running in user memory space. The CPU does check to see if
an invalid memory access occurs, but it performs the check after
speculative execution, not before.". This means that things like cache
line loads can occur based on values the currently executing process
should not be able to access; timing access to data that cache-collides
with the cache lines of interest reveals the leaked bit(s).

Nowhere in there is a SEGV generated.

That's the meltdown stuff. Spectre targets other things (I've seen
branch prediction mentioned) to leak information around protection
barriers.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: meltdown
On Thu, Jan 04, 2018 at 10:01:34PM +0100, Kamil Rytarowski wrote:
> We have: PaX Segvguard. Can we mitigate it with this feature?

That's what gave me the idea, but I think segvguard is per-binary, and I
could just make new binaries to keep on attacking the kernel.
Re: meltdown
On 04.01.2018 21:49, m...@netbsd.org wrote:
> Also, I understand that to exploit this, one has to attempt to access
> kernel memory a lot, and SEGV at least once per bit.
>
> I wonder if we can count the number of SEGVs and if we get a few, turn
> on the workaround? that would at least spare us the performance penalty
> for normal code.

We have: PaX Segvguard. Can we mitigate it with this feature?
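As a rough illustration of the "count SEGVs, then turn on the workaround" idea quoted above, here is a small policy model in Python. It is only a sketch: the class name SegvPolicy, the threshold of 3 and the string keys are all assumptions made up for this example, and it says nothing about how NetBSD or PaX Segvguard actually track faults.

```python
from collections import Counter


class SegvPolicy:
    """Toy policy: enable the workaround for a key after N observed SEGVs.

    'key' is whatever identity the counter is attached to (a binary, a
    user, a process); which one to pick is the open design question.
    All names and the default threshold are hypothetical.
    """

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.counts = Counter()   # SEGVs seen so far, per key
        self.isolated = set()     # keys that now run with the workaround

    def on_segv(self, key):
        """Record one SEGV; return True if the workaround is now on for key."""
        self.counts[key] += 1
        if self.counts[key] >= self.threshold:
            self.isolated.add(key)
        return key in self.isolated


# Counting per user: the third fault flips the workaround on.
per_user = SegvPolicy(threshold=3)
results = [per_user.on_segv("uid-1000") for _ in range(3)]

# Counting per binary: ten fresh binaries with one fault each never trip it.
per_binary = SegvPolicy(threshold=3)
evaded = not any(per_binary.on_segv(f"binary-{i}") for i in range(10))
```

The choice of key matters a great deal: a counter attached to something an attacker can cheaply replace never reaches the threshold, and any variant of the attack that avoids delivering a SEGV at all would not be counted by this kind of policy.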