Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE

2021-02-08 Thread Sean Christopherson
On Mon, Feb 08, 2021, Andy Lutomirski wrote:
> On Mon, Feb 8, 2021 at 9:11 AM Sean Christopherson  wrote:
> >
> > On Sun, Feb 07, 2021, Andy Lutomirski wrote:
> > >
> 
> > > How much of the register state is revealed to the VMM when we do a 
> > > TDVMCALL?
> > > Presumably we should fully sanitize all register state that shows up in
> > > cleartext on the other end, and we should treat all regs that can be 
> > > modified
> > > by the VMM as clobbered.
> >
> > The guest gets to choose, with a few restrictions.  RSP cannot be exposed 
> > to the
> > host.  RAX, RCX, R10, and R11 are always exposed as they hold mandatory info
> > about the TDVMCALL (TDCALL fn, GPR mask, GHCI vs. vendor, and TDVMCALL fn). 
> >  All
> > other GPRs are exposed and clobbered if their bit in RCX is set, otherwise 
> > they
> > are saved/restored by the TDX-Module.
> >
> > I agree with Dave, pass everything required by the GHCI in the main 
> > routine, and
> > sanitize and save/restore all such GPRs.
> 
> Sounds okay to me.

One clarification: only non-volatile GPRs (R12-R15) need to be saved/restored.

And I think it makes sense to sanitize any exposed GPRs (that don't hold an
output value) after TDVMCALL to avoid speculating with a host-controlled value.


Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE

2021-02-08 Thread Andy Lutomirski
On Mon, Feb 8, 2021 at 9:11 AM Sean Christopherson  wrote:
>
> On Sun, Feb 07, 2021, Andy Lutomirski wrote:
> >

> > How much of the register state is revealed to the VMM when we do a TDVMCALL?
> > Presumably we should fully sanitize all register state that shows up in
> > cleartext on the other end, and we should treat all regs that can be 
> > modified
> > by the VMM as clobbered.
>
> The guest gets to choose, with a few restrictions.  RSP cannot be exposed to 
> the
> host.  RAX, RCX, R10, and R11 are always exposed as they hold mandatory info
> about the TDVMCALL (TDCALL fn, GPR mask, GHCI vs. vendor, and TDVMCALL fn).  
> All
> other GPRs are exposed and clobbered if their bit in RCX is set, otherwise 
> they
> are saved/restored by the TDX-Module.
>
> I agree with Dave, pass everything required by the GHCI in the main routine, 
> and
> sanitize and save/restore all such GPRs.

Sounds okay to me.


Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE

2021-02-08 Thread Sean Christopherson
On Sun, Feb 07, 2021, Andy Lutomirski wrote:
> 
> > On Feb 7, 2021, at 2:31 PM, Dave Hansen  wrote:
> > 
> > On 2/7/21 12:29 PM, Kirill A. Shutemov wrote:
> >>> Couldn't you just have one big helper that takes *all* the registers
> >>> that get used in any TDVMCALL and sets all the rcx bits?  The users
> >>> could just pass 0's for the things they don't use.

IIRC, having exactly one helper is a big mess (my original code used a single
main helper).  CPUID has a large number of outputs, and because outputs are
handled as pointers, the assembly routine needs to check output params for NULL.

And if we want to write up port I/O directly to TDVMCALL to avoid the #VE, IN
and OUT need separate helpers to implement a non-standard register ABI in order
to play nice with ALTERANTIVES.

This also has my vote, mainly because gcc doesn't allow directly specifying
r8-r15 as register constraints to inline asm.  That creates a nasty hole where
a register can get corrupted if code is inserted between writing the local
variable and passing it to the inline asm.

Case in point, patch 08 has this exact bug.

> +static u64 tdx_read_msr_safe(unsigned int msr, int *err)
> +{
> +       register long r10 asm("r10") = TDVMCALL_STANDARD;
> +       register long r11 asm("r11") = EXIT_REASON_MSR_READ;
> +       register long r12 asm("r12") = msr;
> +       register long rcx asm("rcx");
> +       long ret;
> +
> +       WARN_ON_ONCE(tdx_is_context_switched_msr(msr));

This can corrupt r10, r11 and/or r12, e.g. if tdx_is_context_switched_msr() is
not inlined, it can use r10 and r11 as scratch registers, and gcc isn't smart
enough to know it needs to save registers before the call.

Even if the code as committed is guaranteed to work, IMO this approach is
hostile toward future debuggers/developers, e.g. adding a printk() in here to
debug can introduce a completely different failure.

> +
> +       if (msr == MSR_CSTAR)
> +               return 0;
> +
> +       /* Allow to pass R10, R11 and R12 down to the VMM */
> +       rcx = BIT(10) | BIT(11) | BIT(12);
> +
> +       asm volatile(TDCALL
> +                       : "=a"(ret), "=r"(r10), "=r"(r11), "=r"(r12)
> +                       : "a"(TDVMCALL), "r"(rcx), "r"(r10), "r"(r11), 
> "r"(r12)
> +                       : );
> +
> +       /* XXX: Better error handling needed? */
> +       *err = (ret || r10) ? -EIO : 0;
> +
> +       return r11;
> +}

> >>> Then you've got the ugly inline asm in one place.  It also makes it
> >>> harder to screw up the 'rcx' mask and end up passing registers you
> >>> didn't want into a malicious VMM.
> >> For now we only pass down R10-R15, but the interface allows to pass down
> >> much wider set of registers, including XMM. How far do we want to get it?
> > 
> > Just do what we immediately need: R10-R15
> > .
> > 
> 
> How much of the register state is revealed to the VMM when we do a TDVMCALL?
> Presumably we should fully sanitize all register state that shows up in
> cleartext on the other end, and we should treat all regs that can be modified
> by the VMM as clobbered.

The guest gets to choose, with a few restrictions.  RSP cannot be exposed to the
host.  RAX, RCX, R10, and R11 are always exposed as they hold mandatory info
about the TDVMCALL (TDCALL fn, GPR mask, GHCI vs. vendor, and TDVMCALL fn).  All
other GPRs are exposed and clobbered if their bit in RCX is set, otherwise they
are saved/restored by the TDX-Module.

I agree with Dave, pass everything required by the GHCI in the main routine, and
sanitize and save/restore all such GPRs.


Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE

2021-02-07 Thread Andy Lutomirski


> On Feb 7, 2021, at 2:31 PM, Dave Hansen  wrote:
> 
> On 2/7/21 12:29 PM, Kirill A. Shutemov wrote:
>>> Couldn't you just have one big helper that takes *all* the registers
>>> that get used in any TDVMCALL and sets all the rcx bits?  The users
>>> could just pass 0's for the things they don't use.
>>> 
>>> Then you've got the ugly inline asm in one place.  It also makes it
>>> harder to screw up the 'rcx' mask and end up passing registers you
>>> didn't want into a malicious VMM.
>> For now we only pass down R10-R15, but the interface allows to pass down
>> much wider set of registers, including XMM. How far do we want to get it?
> 
> Just do what we immediately need: R10-R15
> .
> 

How much of the register state is revealed to the VMM when we do a TDVMCALL?  
Presumably we should fully sanitize all register state that shows up in 
cleartext on the other end, and we should treat all regs that can be modified 
by the VMM as clobbered.

Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE

2021-02-07 Thread Dave Hansen
On 2/7/21 12:29 PM, Kirill A. Shutemov wrote:
>> Couldn't you just have one big helper that takes *all* the registers
>> that get used in any TDVMCALL and sets all the rcx bits?  The users
>> could just pass 0's for the things they don't use.
>>
>> Then you've got the ugly inline asm in one place.  It also makes it
>> harder to screw up the 'rcx' mask and end up passing registers you
>> didn't want into a malicious VMM.
> For now we only pass down R10-R15, but the interface allows to pass down
> much wider set of registers, including XMM. How far do we want to get it?

Just do what we immediately need: R10-R15.



Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE

2021-02-07 Thread Kirill A. Shutemov
On Sun, Feb 07, 2021 at 08:01:50AM -0800, Dave Hansen wrote:
> On 2/7/21 6:13 AM, Kirill A. Shutemov wrote:
> >>> +   /* Allow to pass R10, R11, R12, R13, R14 and R15 down to the VMM 
> >>> */
> >>> +   rcx = BIT(10) | BIT(11) | BIT(12) | BIT(13) | BIT(14) | BIT(15);
> >>> +
> >>> +   asm volatile(TDCALL
> >>> +   : "=a"(ret), "=r"(r10), "=r"(r11), "=r"(r12), 
> >>> "=r"(r13),
> >>> + "=r"(r14), "=r"(r15)
> >>> +   : "a"(TDVMCALL), "r"(rcx), "r"(r10), "r"(r11), 
> >>> "r"(r12),
> >>> + "r"(r13)
> >>> +   : );
> >> Some "+" constraints would make this simpler.  But I think you should
> >> factor the TDCALL helper out into its own function.
> > Factor out TDCALL into a helper is tricky: different TDCALLs have
> > different list of registers passed to VMM.
> 
> Couldn't you just have one big helper that takes *all* the registers
> that get used in any TDVMCALL and sets all the rcx bits?  The users
> could just pass 0's for the things they don't use.
> 
> Then you've got the ugly inline asm in one place.  It also makes it
> harder to screw up the 'rcx' mask and end up passing registers you
> didn't want into a malicious VMM.

For now we only pass down R10-R15, but the interface allows to pass down
much wider set of registers, including XMM. How far do we want to get it?

-- 
 Kirill A. Shutemov


Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE

2021-02-07 Thread Dave Hansen
On 2/7/21 6:13 AM, Kirill A. Shutemov wrote:
>>> +   /* Allow to pass R10, R11, R12, R13, R14 and R15 down to the VMM */
>>> +   rcx = BIT(10) | BIT(11) | BIT(12) | BIT(13) | BIT(14) | BIT(15);
>>> +
>>> +   asm volatile(TDCALL
>>> +   : "=a"(ret), "=r"(r10), "=r"(r11), "=r"(r12), 
>>> "=r"(r13),
>>> + "=r"(r14), "=r"(r15)
>>> +   : "a"(TDVMCALL), "r"(rcx), "r"(r10), "r"(r11), 
>>> "r"(r12),
>>> + "r"(r13)
>>> +   : );
>> Some "+" constraints would make this simpler.  But I think you should
>> factor the TDCALL helper out into its own function.
> Factor out TDCALL into a helper is tricky: different TDCALLs have
> different list of registers passed to VMM.

Couldn't you just have one big helper that takes *all* the registers
that get used in any TDVMCALL and sets all the rcx bits?  The users
could just pass 0's for the things they don't use.

Then you've got the ugly inline asm in one place.  It also makes it
harder to screw up the 'rcx' mask and end up passing registers you
didn't want into a malicious VMM.


Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE

2021-02-07 Thread Kirill A. Shutemov
On Fri, Feb 05, 2021 at 03:42:01PM -0800, Andy Lutomirski wrote:
> On Fri, Feb 5, 2021 at 3:39 PM Kuppuswamy Sathyanarayanan
>  wrote:
> >
> > From: "Kirill A. Shutemov" 
> >
> > TDX has three classes of CPUID leaves: some CPUID leaves
> > are always handled by the CPU, others are handled by the TDX module,
> > and some others are handled by the VMM. Since the VMM cannot directly
> > intercept the instruction these are reflected with a #VE exception
> > to the guest, which then converts it into a TDCALL to the VMM,
> > or handled directly.
> >
> > The TDX module EAS has a full list of CPUID leaves which are handled
> > natively or by the TDX module in 16.2. Only unknown CPUIDs are handled by
> > the #VE method. In practice this typically only applies to the
> > hypervisor specific CPUIDs unknown to the native CPU.
> >
> > Therefore there is no risk of causing this in early CPUID code which
> > runs before the #VE handler is set up because it will never access
> > those exotic CPUID leaves.
> >
> > Signed-off-by: Kirill A. Shutemov 
> > Reviewed-by: Andi Kleen 
> > Signed-off-by: Kuppuswamy Sathyanarayanan 
> > 
> > ---
> >  arch/x86/kernel/tdx.c | 32 
> >  1 file changed, 32 insertions(+)
> >
> > diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
> > index 5d961263601e..e98058c048b5 100644
> > --- a/arch/x86/kernel/tdx.c
> > +++ b/arch/x86/kernel/tdx.c
> > @@ -172,6 +172,35 @@ static int tdx_write_msr_safe(unsigned int msr, 
> > unsigned int low,
> > return ret || r10 ? -EIO : 0;
> >  }
> >
> > +static void tdx_handle_cpuid(struct pt_regs *regs)
> > +{
> > +   register long r10 asm("r10") = TDVMCALL_STANDARD;
> > +   register long r11 asm("r11") = EXIT_REASON_CPUID;
> > +   register long r12 asm("r12") = regs->ax;
> > +   register long r13 asm("r13") = regs->cx;
> > +   register long r14 asm("r14");
> > +   register long r15 asm("r15");
> > +   register long rcx asm("rcx");
> > +   long ret;
> > +
> > +   /* Allow to pass R10, R11, R12, R13, R14 and R15 down to the VMM */
> > +   rcx = BIT(10) | BIT(11) | BIT(12) | BIT(13) | BIT(14) | BIT(15);
> > +
> > +   asm volatile(TDCALL
> > +   : "=a"(ret), "=r"(r10), "=r"(r11), "=r"(r12), 
> > "=r"(r13),
> > + "=r"(r14), "=r"(r15)
> > +   : "a"(TDVMCALL), "r"(rcx), "r"(r10), "r"(r11), 
> > "r"(r12),
> > + "r"(r13)
> > +   : );
> 
> Some "+" constraints would make this simpler.  But I think you should
> factor the TDCALL helper out into its own function.

Factor out TDCALL into a helper is tricky: different TDCALLs have
different list of registers passed to VMM.

-- 
 Kirill A. Shutemov


Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE

2021-02-05 Thread Andy Lutomirski
On Fri, Feb 5, 2021 at 3:39 PM Kuppuswamy Sathyanarayanan
 wrote:
>
> From: "Kirill A. Shutemov" 
>
> TDX has three classes of CPUID leaves: some CPUID leaves
> are always handled by the CPU, others are handled by the TDX module,
> and some others are handled by the VMM. Since the VMM cannot directly
> intercept the instruction these are reflected with a #VE exception
> to the guest, which then converts it into a TDCALL to the VMM,
> or handled directly.
>
> The TDX module EAS has a full list of CPUID leaves which are handled
> natively or by the TDX module in 16.2. Only unknown CPUIDs are handled by
> the #VE method. In practice this typically only applies to the
> hypervisor specific CPUIDs unknown to the native CPU.
>
> Therefore there is no risk of causing this in early CPUID code which
> runs before the #VE handler is set up because it will never access
> those exotic CPUID leaves.
>
> Signed-off-by: Kirill A. Shutemov 
> Reviewed-by: Andi Kleen 
> Signed-off-by: Kuppuswamy Sathyanarayanan 
> 
> ---
>  arch/x86/kernel/tdx.c | 32 
>  1 file changed, 32 insertions(+)
>
> diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
> index 5d961263601e..e98058c048b5 100644
> --- a/arch/x86/kernel/tdx.c
> +++ b/arch/x86/kernel/tdx.c
> @@ -172,6 +172,35 @@ static int tdx_write_msr_safe(unsigned int msr, unsigned 
> int low,
> return ret || r10 ? -EIO : 0;
>  }
>
> +static void tdx_handle_cpuid(struct pt_regs *regs)
> +{
> +   register long r10 asm("r10") = TDVMCALL_STANDARD;
> +   register long r11 asm("r11") = EXIT_REASON_CPUID;
> +   register long r12 asm("r12") = regs->ax;
> +   register long r13 asm("r13") = regs->cx;
> +   register long r14 asm("r14");
> +   register long r15 asm("r15");
> +   register long rcx asm("rcx");
> +   long ret;
> +
> +   /* Allow to pass R10, R11, R12, R13, R14 and R15 down to the VMM */
> +   rcx = BIT(10) | BIT(11) | BIT(12) | BIT(13) | BIT(14) | BIT(15);
> +
> +   asm volatile(TDCALL
> +   : "=a"(ret), "=r"(r10), "=r"(r11), "=r"(r12), 
> "=r"(r13),
> + "=r"(r14), "=r"(r15)
> +   : "a"(TDVMCALL), "r"(rcx), "r"(r10), "r"(r11), 
> "r"(r12),
> + "r"(r13)
> +   : );

Some "+" constraints would make this simpler.  But I think you should
factor the TDCALL helper out into its own function.