On 11/12/23 00:44, Gernot Heiser via Devel wrote:
> On 11 Nov 2023, at 07:49, Demi Marie Obenour <demioben...@gmail.com> wrote:
>>
>> On 11/9/23 17:47, Gernot Heiser via Devel wrote:
>>> On 10 Nov 2023, at 06:03, Demi Marie Obenour <demioben...@gmail.com> wrote:
>>>
>>>> - Speculative taint tracking provides complete protection against
>>>>   speculative attacks.  This is sufficient to prevent leakage of
>>>>   cryptographic key material, even in fully dynamic systems.
>>>>   Furthermore, it is compatible with fast context switches between
>>>>   protection domains.
>>>
>>> It’s also a point solution that provides zero guarantees against
>>> unforeseen attacks.
>>
>> Unless I am severely mistaken, it provides complete protection for code
>> that has secret-independent timing, such as cryptographic software.  It
>> is also cheaper than some of the workarounds existing systems must use.
>
> Well, *if* speculative taint tracking is really completely and correctly
> implemented, *and* you have such magic hardware. That’s a strong statement
> (for which there’s no proof). But let’s assume it is true.
>
> Then you may have complete protection against *speculation* attacks.
>
> Remember, speculation attacks construct a Trojan in otherwise trustworthy
> code using speculative execution of gadgets. There are other ways,
> specifically control-flow attacks.
>
> So in order to be secure, you then “only” need:
> - the magic complete and performant implementation of taint tracking, AND
> - complete prevention of control-flow attacks, AND
> - all secret-handling code being free from algorithmic timing side channels
>   (i.e. no branching on or indexing by secrets), AND
> - no untrusted code, because any untrusted code may contain a Trojan that
>   actively leaks through caches etc.
>
> If you’re comfortable with all those ifs, fine. I’m not.
>
>>> - Full time partitioning eliminates all timing channels, but it is
>>>   possible only in fully static systems, which severely limits its
>>>   applicability.
>>>
>>>> I’m sorry, but this is simply false.
>>>>
>>>> What you need for time protection (I assume this is what you mean with
>>>> “full time partitioning”) are fixed time slices – “fixed” in that their
>>>> length cannot depend on any events in the system that can be controlled by
>>>> an untrusted domain. It doesn’t mean they cannot be changed as domains
>>>> come and go.
>>
>> Based on what information should I set these time slices?
>
> That’s OS/hypervisor policy. Every system I know assigns time slices; that’s
> normal.
>
> But note, the strictly fixed (in the sense of not influenceable by user code)
> time slices are only needed if you want to prevent *all* timing channels, in
> this case leaking by controlling the timing of context switches.
>
> This is a relatively low-bandwidth channel, i.e. it will need a minute or two
> to leak an SSL key. If you’re fine with that then there’s no need for fixed
> time slices.
>
>>>> - Time protection without time partitioning does _not_ fully prevent
>>>>   Spectre v1 attacks, and still imposes a large penalty on protection
>>>>   domain switches.
>>>>
>>> Time protection does *not* impose a large penalty. Its context-switching
>>> cost is completely hidden by the cost of an L1 D-cache flush – as has been
>>> demonstrated by published work. And if you don’t flush the L1 cache, you’ll
>>> have massive leakage, taint-tracking or not.
>>>
>>> Where time protection, *without further hardware support*, does have a cost
>>> is for partitioning the lower-level caches. This cost is two-fold:
>>>
>>> 1) Average cache utilisation is reduced due to the static partitioning (in
>>> contrast to the dynamic partitioning that happens as a side effect of the
>>> hardware’s cache replacement policy). This cost is generally in the
>>> single-digit percentage range (as per published work), but can also be
>>> negative – there’s plenty of work that uses static cache partitioning for
>>> performance *isolation/improvement*.
>>
>> Static partitioning based on _what_?  On a desktop system, the dynamic
>> behavior of a workload is generally the _only_ information known about
>> that workload, so any partitioning _must_ be dynamic.
>
> Again, if you trust all your code to not intentionally leak secrets, then you
> don’t have to do this.
>
> Cache channels are very high bandwidth. Even cache side-channels have high
> enough bandwidth to steal encryption keys in minutes.
>
> If your threat scenario doesn’t care about this, fine. But there’s no way of
> preventing cache channels other than flushing or partitioning.
>
> So, it all depends on your threat scenario.
>
> If your threat scenario is that:
> - your hypervisor/kernel is completely trusted
> - all secret-handling code is trusted to
>   - not have algorithmic timing channels
>   - not be susceptible to control-flow attacks
>   - be free of Trojans
> … then speculation attacks may be your main worry and you can ignore all the
> other timing channels, and you *may* be covered by (complete and
> properly-implemented) speculation taint tracking.
>
> That’s a lot of ifs – too many for my comfort.
>
> Note, this tracking adds a fair amount of complexity to the processor, which
> means there’s a high likelihood the implementation is buggy. This is in
> contrast to fence.t, which is extremely simple to implement.
>
>>> And, of course, without partitioning the lower-level caches you have
>>> leakage again, and taint tracking isn’t going to help there either.
>>>
>>> If people want to improve the hardware, focussing on generic mechanisms
>>> such as support for partitioning L2-LL caches would be far more beneficial
>>> than point-solutions that will be defeated by the next class of attacks.
>>
>> I would much rather have a theoretically sound solution than an unsound one.
>> However, it is even more important that my desktop actually be able to do the
>> tasks I expect of it.
>> To the best of my knowledge, time protection and a usable desktop are
>> incompatible with each other.  I do hope you can prove me wrong here.
>
> If you want a theoretically sound solution – Welcome to Time Protection!
>
> In contrast to your implied claim that time protection is unsound: It’s been
> formalised, and it’s in the process of being proved correct and complete in
> an seL4 implementation. Its minimal hardware support has also been
> implemented in RISC-V processors and demonstrated to be cheap and complete
> (may even be in silicon by now).

A key storage service _must_ be able to respond to signing and decryption
requests to be usable at all, and that means that the requester _will_ know
how long the operation took.  One can try to hide this information by padding
the operation to its worst-case value, but this only works if there _is_ a
worst-case value.  In Qubes OS, responding to a request will require heap
allocation, fork(), disk I/O, and sometimes user interaction.  That means
that the worst-case operation time is _infinite_, so time protection is
simply not possible and will not be possible for the foreseeable future.

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
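
The padding idea in the last paragraph can be sketched as follows. This is only an illustration under stated assumptions: `padded_call`, `WORST_CASE_SECONDS` and `DeadlineExceeded` are hypothetical names, not part of Qubes OS or seL4, and the bound itself is simply assumed to exist. The sketch pads every response to the same duration, and it shows the failure case the paragraph describes: once an operation overruns the bound, the overrun is itself visible to the requester.

```python
import time

# Hypothetical worst-case bound in seconds. The whole scheme assumes such a
# finite bound exists; the value here is made up for illustration.
WORST_CASE_SECONDS = 0.05


class DeadlineExceeded(Exception):
    """The operation ran longer than the assumed worst-case bound."""


def padded_call(operation, budget=WORST_CASE_SECONDS,
                clock=time.monotonic, sleep=time.sleep):
    """Run operation() and delay the reply until `budget` seconds have passed.

    `clock` and `sleep` are injectable so the behaviour can be checked
    without real delays.
    """
    start = clock()
    result = operation()
    elapsed = clock() - start
    if elapsed > budget:
        # Padding cannot hide an overrun: the requester sees either a late
        # reply or this error, so timing information leaks anyway.
        raise DeadlineExceeded(
            f"took {elapsed:.3f}s, assumed bound was {budget:.3f}s")
    # Sleep for the remainder so every successful reply takes about `budget`.
    sleep(budget - elapsed)
    return result
```

Note that padding only masks the requester's view of the response time; it does nothing about cache state or other shared-resource channels discussed earlier in the thread.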
_______________________________________________
Devel mailing list -- devel@sel4.systems
To unsubscribe send an email to devel-leave@sel4.systems