[seL4] Re: Conclusions regarding speculation

Demi Marie Obenour Fri, 17 Nov 2023 13:03:36 -0800

On 11/12/23 00:44, Gernot Heiser via Devel wrote:
> On 11 Nov 2023, at 07:49, Demi Marie Obenour <demioben...@gmail.com> wrote:
>>
>> On 11/9/23 17:47, Gernot Heiser via Devel wrote:
>>> On 10 Nov 2023, at 06:03, Demi Marie Obenour <demioben...@gmail.com> wrote:
>>>
>>>> - Speculative taint tracking provides complete protection against
>>>> speculative attacks.  This is sufficient to prevent leakage of
>>>> cryptographic key material, even in fully dynamic systems.
>>>> Furthermore, it is compatible with fast context switches between
>>>> protection domains.
>>>
>>> It’s also a point solution that provides zero guarantees against
>>> unforeseen attacks.
>>
>> Unless I am severely mistaken, it provides complete protection for code
>> that has secret-independent timing, such as cryptographic software.  It
>> is also cheaper than some of the workarounds existing systems must use.
> 
> Well, *if* speculative taint tracking is really completely and correctly 
> implemented, *and* you have such magic hardware. That’s a strong statement 
> (for which there’s no proof). But let’s assume it is true.
> 
> Then you may have complete protection against *speculation* attacks.
> 
> Remember, speculation attacks construct a Trojan in otherwise trustworthy 
> code using speculative execution of gadgets. There are other ways, 
> specifically control-flow attacks.
> 
> So in order to be secure, you then “only” need:
> - the magic complete and performant implementation of taint tracking, AND
> - complete prevention of control-flow attacks, AND
> - all secret-handling code being free from algorithmic timing side channels 
> (i.e. no branching on or indexing by secrets), AND
> - no untrusted code, because any untrusted code may contain a Trojan that 
> actively leaks through caches etc.
> 
> If you’re comfortable with all those ifs, fine. I’m not.
> 
>>>> - Full time partitioning eliminates all timing channels, but it is
>>>> possible only in fully static systems, which severely limits its
>>>> applicability.
>>>
>>> I’m sorry, but this is simply false.
>>>
>>> What you need for time protection (I assume this is what you mean with 
>>> “full time partitioning”) are fixed time slices – “fixed” in that their 
>>> length cannot depend on any events in the system that can be controlled by 
>>> an untrusted domain. It doesn’t mean they cannot be changed as domains 
>>> come and go.
>>
>> Based on what information should I set these time slices?
> 
> That’s OS/hypervisor policy. Every system I know assigns time slices, that’s 
> normal.
> 
> But note, the strictly fixed (in the sense of not influenceable by user code) 
> time slices are only needed if you want to prevent *all* timing channels, in 
> this case leaking by controlling the timing of context switches.
> 
> This is a relatively low-bandwidth channel, i.e. it will need a minute or two 
> to leak an SSL key. If you’re fine with that then there’s no need for fixed 
> time slices.
> 
>>>> - Time protection without time partitioning does _not_ fully prevent
>>>> Spectre v1 attacks, and still imposes a large penalty on protection
>>>> domain switches.
>>>>
>>> Time protection does *not* impose a large penalty. Its context-switching 
>>> cost is completely hidden by the cost of an L1 D-cache flush – as has been 
>>> demonstrated by published work. And if you don’t flush the L1 cache, you’ll 
>>> have massive leakage, taint-tracking or not.
>>>
>>> Where time protection, *without further hardware support*, does have a cost 
>>> is for partitioning the lower-level caches. This cost is two-fold:
>>>
>>> 1) Average cache utilisation is reduced due to the static partitioning (in 
>>> contrast to the dynamic partitioning that happens as a side effect of the 
>>> hardware’s cache replacement policy). This cost is generally in the 
>>> single-digit percentage range (as per published work), but can also be 
>>> negative – there’s plenty of work that uses static cache partitioning for 
>>> performance *isolation/improvement*.
>>
>> Static partitioning based on _what_?  On a desktop system, the dynamic
>> behavior of a workload is generally the _only_ information known about that
>> workload, so any partitioning _must_ be dynamic.
> 
> Again, if you trust all your code to not intentionally leak secrets, then you 
> don’t have to do this.
> 
> Cache channels are very high bandwidth. Even cache side-channels have high 
> enough bandwidth to steal encryption keys in minutes.
> 
> If your threat scenario doesn’t care about this, fine. But there’s no way of 
> preventing cache channels other than flushing or partitioning.
> 
> So, it all depends on your threat scenario.
> 
> If your threat scenario is that:
> - your hypervisor/kernel is completely trusted
> - all secret-handling code is trusted to
>    - not have algorithmic timing channels
>    - not be susceptible to control-flow attacks
>    - be free of Trojans
> … then speculation attacks may be your main worry and you can ignore all the 
> other timing channels, and you *may* be covered by (complete and 
> properly-implemented) speculation taint tracking.
> 
> That’s a lot of ifs – too many for my comfort.
> 
> Note, this tracking adds a fair amount of complexity to the processor, which 
> means there’s a high likelihood the implementation is buggy. This is in 
> contrast to fence.t, which is extremely simple to implement.
> 
>>> And, of course, without partitioning the lower-level caches you have 
>>> leakage again, and taint tracking isn’t going to help there either.
>>>
>>> If people want to improve the hardware, focussing on generic mechanisms 
>>> such as support for partitioning L2-LL caches would be far more beneficial 
>>> than point-solutions that will be defeated by the next class of attacks.
>>
>> I would much rather have a theoretically sound solution than an unsound one.
>> However, it is even more important that my desktop actually be able to do the
>> tasks I expect of it.  To the best of my knowledge, time protection and a
>> usable desktop are incompatible with each other.  I do hope you can prove me
>> wrong here.
> 
> If you want a theoretically sound solution – Welcome to Time Protection!
> 
> In contrast to your implied claim that time protection is unsound: It’s been 
> formalised, and it’s in the process of being proved correct and complete in 
> an seL4 implementation. Its minimal hardware support has also been 
> implemented in RISC-V processors and demonstrated to be cheap and complete 
> (may even be in silicon by now).

A key storage service _must_ be able to respond to signing and
decryption requests to be usable at all, and that means that the
requester _will_ know how long the operation took.  One can try to hide
this information by padding the operation to its worst-case value,
but this only works if there _is_ a worst-case value.  In Qubes OS,
responding to a request will require heap allocation, fork(), disk
I/O, and sometimes user interaction.  That means that the worst-case
operation time is _infinite_, so time protection is simply not
possible and will not be possible for the foreseeable future.
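
To make the padding idea concrete, here is a minimal sketch in Python.
The names padded_call and worst_case_seconds are hypothetical and do
not come from Qubes OS or any real key service, and sleep-based padding
is only an illustration, not a hardened implementation.  It shows both
the case where padding works and the case where the operation overruns
the assumed bound, so that its real duration becomes observable:

```python
import time


def padded_call(operation, worst_case_seconds,
                clock=time.monotonic, sleep=time.sleep):
    """Run operation(), then wait until worst_case_seconds have elapsed.

    Returns (result, overran).  overran is True when the operation took
    longer than the assumed bound, in which case no padding was possible
    and the caller could observe the real duration.
    """
    start = clock()
    result = operation()
    remaining = worst_case_seconds - (clock() - start)
    if remaining > 0:
        # Pad the response time up to the assumed worst case.
        sleep(remaining)
        return result, False
    # The assumed bound was wrong: timing information leaks.
    return result, True
```

If no finite worst_case_seconds can be chosen, the second branch can
always be reached, which is the problem described above.
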
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)

