Re: [RFC PATCH v9 12/13] xpfo, mm: Defer TLB flushes for non-current CPUs (x86 only)

Khalid Aziz Fri, 05 Apr 2019 08:59:05 -0700

On 4/5/19 9:24 AM, Andy Lutomirski wrote:
> 
> 
>> On Apr 5, 2019, at 8:44 AM, Dave Hansen <dave.han...@intel.com> wrote:
>>
>> On 4/5/19 12:17 AM, Thomas Gleixner wrote:
>>>> process. Is that an acceptable trade-off?
>>> You are not seriously asking whether creating a user controllable ret2dir
>>> attack window is an acceptable trade-off? April 1st was a few days ago.
>>
>> Well, let's not forget that this set at least takes us from "always
>> vulnerable to ret2dir" to a choice between:
>>
>> 1. fast-ish and "vulnerable to ret2dir for a user-controllable window"
>> 2. slow and "mitigated against ret2dir"
>>
>> Sounds like we need a mechanism that will do the deferred XPFO TLB
>> flushes whenever the kernel is entered, and not _just_ at context switch
>> time.  This permits an app to run in userspace with stale kernel TLB
>> entries as long as it wants... that's harmless.
> 
> I don’t think this is good enough. The bad guys can enter the kernel and 
> arrange for the kernel to wait, *in kernel*, for long enough to set up the 
> attack.  userfaultfd is the most obvious way, but there are plenty. I suppose 
> we could do the flush at context switch *and* entry.  I bet that performance 
> still utterly sucks, though — on many workloads, this turns every entry into 
> a full flush, and we already know exactly how much that sucks — it’s 
> identical to KPTI without PCID.  (And yes, if we go this route, we need to 
> merge this logic together — we shouldn’t write CR3 twice on entry).


The performance impact of flushing at kernel entry might not be all that
large. The flush will happen only if there is a pending flush posted to
the processor, and it will be done in lieu of the flush at the next
context switch. So we are not adding more TLB flushes, just changing
where they happen. That can still have some performance impact, and
measuring it with real code is the only way to get that number.
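
To make the "same number of flushes, different place" argument concrete,
here is a rough userspace model in C. It is only a sketch: the struct,
the flag, and the function names are made up for illustration and are not
taken from the patch series, and the counter merely stands in for a real
TLB flush.

```c
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical, small for illustration */

/* Hypothetical per-CPU state; not the names used in the patches. */
struct cpu_state {
	bool xpfo_flush_pending;	/* another CPU posted a deferred flush */
	unsigned long nr_flushes;	/* flushes actually performed */
};

static struct cpu_state cpus[NR_CPUS];

/* A remote CPU changed a physmap mapping: post a flush, send no IPI. */
static void post_deferred_flush(int cpu)
{
	cpus[cpu].xpfo_flush_pending = true;
}

/* Flush only if one is pending, then clear the flag. */
static void do_pending_flush(int cpu)
{
	if (!cpus[cpu].xpfo_flush_pending)
		return;
	cpus[cpu].nr_flushes++;		/* stands in for the real TLB flush */
	cpus[cpu].xpfo_flush_pending = false;
}

/* Both hooks share the helper, so whichever runs first does the flush. */
static void kernel_entry(int cpu)
{
	do_pending_flush(cpu);
}

static void context_switch(int cpu)
{
	do_pending_flush(cpu);
}
```

In this model, one posted flush followed by several kernel entries and a
context switch still results in exactly one flush on that CPU. What the
model cannot show is the cost of checking the flag on every entry.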

> 
> I feel like this whole approach is misguided. ret2dir is not such a game 
> changer that fixing it is worth huge slowdowns. I think all this effort 
> should be spent on some kind of sensible CFI. For example, we should be able 
> to mostly squash ret2anything by inserting a check that the high bits of RSP 
> match the value on the top of the stack before any code that pops RSP.  On an 
> FPO build, there aren’t all that many hot POP RSP instructions, I think.
> 
> (Actually, checking the bits is suboptimal. Do:
> 
> unsigned long offset = *rsp - rsp;
> offset >>= THREAD_SHIFT;
> if (unlikely(offset))
>         BUG();
> POP RSP;
> 
> This means that it’s also impossible to trick a function to return into a 
> buffer that is on that function’s stack.)
> 
> In other words, I think that ret2dir is an insufficient justification for 
> XPFO.
> 
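
For reference, the check quoted above can be written as a plain C
predicate. This is a sketch under assumptions: the function name is
hypothetical, THREAD_SHIFT is set to 14 (16 KiB stacks) purely for
illustration, and 64-bit addresses are assumed. It returns true where the
quoted pseudocode would fall through to POP RSP and false where it would
hit BUG().

```c
#include <stdbool.h>
#include <stdint.h>

#define THREAD_SHIFT 14	/* assumption: 16 KiB kernel stacks */

/*
 * rsp:     current stack pointer
 * new_rsp: the value at the top of the stack, about to be loaded into RSP
 *
 * The subtraction is unsigned, so a new_rsp below rsp wraps around to a
 * huge offset and is rejected along with anything a full stack size or
 * more above rsp.
 */
static bool rsp_load_ok(uint64_t rsp, uint64_t new_rsp)
{
	uint64_t offset = new_rsp - rsp;

	offset >>= THREAD_SHIFT;
	return offset == 0;
}
```

Only values in [rsp, rsp + 2^THREAD_SHIFT) pass. A pointer into the
direct map fails because it is far away from the stack, and a buffer in
the current function's own frame fails because it sits below rsp.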

That is something we may want to explore further. Closing down
/proc/<pid>/pagemap has already helped close off one way to mount a
ret2dir attack, but the physmap spraying technique remains viable. The
XPFO implementation is expensive. Can we do something different to
mitigate physmap spraying?

Thanks,
Khalid


_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
