Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE
On Mon, Feb 08, 2021, Andy Lutomirski wrote: > On Mon, Feb 8, 2021 at 9:11 AM Sean Christopherson wrote: > > > > On Sun, Feb 07, 2021, Andy Lutomirski wrote: > > > > > > > How much of the register state is revealed to the VMM when we do a > > > TDVMCALL? > > > Presumably we should fully sanitize all register state that shows up in > > > cleartext on the other end, and we should treat all regs that can be > > > modified > > > by the VMM as clobbered. > > > > The guest gets to choose, with a few restrictions. RSP cannot be exposed > > to the > > host. RAX, RCX, R10, and R11 are always exposed as they hold mandatory info > > about the TDVMCALL (TDCALL fn, GPR mask, GHCI vs. vendor, and TDVMCALL fn). > > All > > other GPRs are exposed and clobbered if their bit in RCX is set, otherwise > > they > > are saved/restored by the TDX-Module. > > > > I agree with Dave, pass everything required by the GHCI in the main > > routine, and > > sanitize and save/restore all such GPRs. > > Sounds okay to me. One clarification: only non-volatile GPRs (R12-R15) need to be saved/restored. And I think it makes sense to sanitize any exposed GPRs (that don't hold an output value) after TDVMCALL to avoid speculating with a host-controlled value.
Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE
On Mon, Feb 8, 2021 at 9:11 AM Sean Christopherson wrote: > > On Sun, Feb 07, 2021, Andy Lutomirski wrote: > > > > How much of the register state is revealed to the VMM when we do a TDVMCALL? > > Presumably we should fully sanitize all register state that shows up in > > cleartext on the other end, and we should treat all regs that can be > > modified > > by the VMM as clobbered. > > The guest gets to choose, with a few restrictions. RSP cannot be exposed to > the > host. RAX, RCX, R10, and R11 are always exposed as they hold mandatory info > about the TDVMCALL (TDCALL fn, GPR mask, GHCI vs. vendor, and TDVMCALL fn). > All > other GPRs are exposed and clobbered if their bit in RCX is set, otherwise > they > are saved/restored by the TDX-Module. > > I agree with Dave, pass everything required by the GHCI in the main routine, > and > sanitize and save/restore all such GPRs. Sounds okay to me.
Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE
On Sun, Feb 07, 2021, Andy Lutomirski wrote: > > > On Feb 7, 2021, at 2:31 PM, Dave Hansen wrote: > > > > On 2/7/21 12:29 PM, Kirill A. Shutemov wrote: > >>> Couldn't you just have one big helper that takes *all* the registers > >>> that get used in any TDVMCALL and sets all the rcx bits? The users > >>> could just pass 0's for the things they don't use. IIRC, having exactly one helper is a big mess (my original code used a single main helper). CPUID has a large number of outputs, and because outputs are handled as pointers, the assembly routine needs to check output params for NULL. And if we want to write up port I/O directly to TDVMCALL to avoid the #VE, IN and OUT need separate helpers to implement a non-standard register ABI in order to play nice with ALTERANTIVES. This also has my vote, mainly because gcc doesn't allow directly specifying r8-r15 as register constraints to inline asm. That creates a nasty hole where a register can get corrupted if code is inserted between writing the local variable and passing it to the inline asm. Case in point, patch 08 has this exact bug. > +static u64 tdx_read_msr_safe(unsigned int msr, int *err) > +{ > + register long r10 asm("r10") = TDVMCALL_STANDARD; > + register long r11 asm("r11") = EXIT_REASON_MSR_READ; > + register long r12 asm("r12") = msr; > + register long rcx asm("rcx"); > + long ret; > + > + WARN_ON_ONCE(tdx_is_context_switched_msr(msr)); This can corrupt r10, r11 and/or r12, e.g. if tdx_is_context_switched_msr() is not inlined, it can use r10 and r11 as scratch registers, and gcc isn't smart enough to know it needs to save registers before the call. Even if the code as committed is guaranteed to work, IMO this approach is hostile toward future debuggers/developers, e.g. adding a printk() in here to debug can introduce a completely different failure. > + > + if (msr == MSR_CSTAR) > + return 0; > + > + /* Allow to pass R10, R11 and R12 down to the VMM */ > + rcx = BIT(10) | BIT(11) | BIT(12); > + > + asm volatile(TDCALL > + : "=a"(ret), "=r"(r10), "=r"(r11), "=r"(r12) > + : "a"(TDVMCALL), "r"(rcx), "r"(r10), "r"(r11), > "r"(r12) > + : ); > + > + /* XXX: Better error handling needed? */ > + *err = (ret || r10) ? -EIO : 0; > + > + return r11; > +} > >>> Then you've got the ugly inline asm in one place. It also makes it > >>> harder to screw up the 'rcx' mask and end up passing registers you > >>> didn't want into a malicious VMM. > >> For now we only pass down R10-R15, but the interface allows to pass down > >> much wider set of registers, including XMM. How far do we want to get it? > > > > Just do what we immediately need: R10-R15 > > . > > > > How much of the register state is revealed to the VMM when we do a TDVMCALL? > Presumably we should fully sanitize all register state that shows up in > cleartext on the other end, and we should treat all regs that can be modified > by the VMM as clobbered. The guest gets to choose, with a few restrictions. RSP cannot be exposed to the host. RAX, RCX, R10, and R11 are always exposed as they hold mandatory info about the TDVMCALL (TDCALL fn, GPR mask, GHCI vs. vendor, and TDVMCALL fn). All other GPRs are exposed and clobbered if their bit in RCX is set, otherwise they are saved/restored by the TDX-Module. I agree with Dave, pass everything required by the GHCI in the main routine, and sanitize and save/restore all such GPRs.
Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE
> On Feb 7, 2021, at 2:31 PM, Dave Hansen wrote: > > On 2/7/21 12:29 PM, Kirill A. Shutemov wrote: >>> Couldn't you just have one big helper that takes *all* the registers >>> that get used in any TDVMCALL and sets all the rcx bits? The users >>> could just pass 0's for the things they don't use. >>> >>> Then you've got the ugly inline asm in one place. It also makes it >>> harder to screw up the 'rcx' mask and end up passing registers you >>> didn't want into a malicious VMM. >> For now we only pass down R10-R15, but the interface allows to pass down >> much wider set of registers, including XMM. How far do we want to get it? > > Just do what we immediately need: R10-R15 > . > How much of the register state is revealed to the VMM when we do a TDVMCALL? Presumably we should fully sanitize all register state that shows up in cleartext on the other end, and we should treat all regs that can be modified by the VMM as clobbered.
Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE
On 2/7/21 12:29 PM, Kirill A. Shutemov wrote: >> Couldn't you just have one big helper that takes *all* the registers >> that get used in any TDVMCALL and sets all the rcx bits? The users >> could just pass 0's for the things they don't use. >> >> Then you've got the ugly inline asm in one place. It also makes it >> harder to screw up the 'rcx' mask and end up passing registers you >> didn't want into a malicious VMM. > For now we only pass down R10-R15, but the interface allows to pass down > much wider set of registers, including XMM. How far do we want to get it? Just do what we immediately need: R10-R15.
Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE
On Sun, Feb 07, 2021 at 08:01:50AM -0800, Dave Hansen wrote: > On 2/7/21 6:13 AM, Kirill A. Shutemov wrote: > >>> + /* Allow to pass R10, R11, R12, R13, R14 and R15 down to the VMM > >>> */ > >>> + rcx = BIT(10) | BIT(11) | BIT(12) | BIT(13) | BIT(14) | BIT(15); > >>> + > >>> + asm volatile(TDCALL > >>> + : "=a"(ret), "=r"(r10), "=r"(r11), "=r"(r12), > >>> "=r"(r13), > >>> + "=r"(r14), "=r"(r15) > >>> + : "a"(TDVMCALL), "r"(rcx), "r"(r10), "r"(r11), > >>> "r"(r12), > >>> + "r"(r13) > >>> + : ); > >> Some "+" constraints would make this simpler. But I think you should > >> factor the TDCALL helper out into its own function. > > Factor out TDCALL into a helper is tricky: different TDCALLs have > > different list of registers passed to VMM. > > Couldn't you just have one big helper that takes *all* the registers > that get used in any TDVMCALL and sets all the rcx bits? The users > could just pass 0's for the things they don't use. > > Then you've got the ugly inline asm in one place. It also makes it > harder to screw up the 'rcx' mask and end up passing registers you > didn't want into a malicious VMM. For now we only pass down R10-R15, but the interface allows to pass down much wider set of registers, including XMM. How far do we want to get it? -- Kirill A. Shutemov
Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE
On 2/7/21 6:13 AM, Kirill A. Shutemov wrote: >>> + /* Allow to pass R10, R11, R12, R13, R14 and R15 down to the VMM */ >>> + rcx = BIT(10) | BIT(11) | BIT(12) | BIT(13) | BIT(14) | BIT(15); >>> + >>> + asm volatile(TDCALL >>> + : "=a"(ret), "=r"(r10), "=r"(r11), "=r"(r12), >>> "=r"(r13), >>> + "=r"(r14), "=r"(r15) >>> + : "a"(TDVMCALL), "r"(rcx), "r"(r10), "r"(r11), >>> "r"(r12), >>> + "r"(r13) >>> + : ); >> Some "+" constraints would make this simpler. But I think you should >> factor the TDCALL helper out into its own function. > Factor out TDCALL into a helper is tricky: different TDCALLs have > different list of registers passed to VMM. Couldn't you just have one big helper that takes *all* the registers that get used in any TDVMCALL and sets all the rcx bits? The users could just pass 0's for the things they don't use. Then you've got the ugly inline asm in one place. It also makes it harder to screw up the 'rcx' mask and end up passing registers you didn't want into a malicious VMM.
Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE
On Fri, Feb 05, 2021 at 03:42:01PM -0800, Andy Lutomirski wrote: > On Fri, Feb 5, 2021 at 3:39 PM Kuppuswamy Sathyanarayanan > wrote: > > > > From: "Kirill A. Shutemov" > > > > TDX has three classes of CPUID leaves: some CPUID leaves > > are always handled by the CPU, others are handled by the TDX module, > > and some others are handled by the VMM. Since the VMM cannot directly > > intercept the instruction these are reflected with a #VE exception > > to the guest, which then converts it into a TDCALL to the VMM, > > or handled directly. > > > > The TDX module EAS has a full list of CPUID leaves which are handled > > natively or by the TDX module in 16.2. Only unknown CPUIDs are handled by > > the #VE method. In practice this typically only applies to the > > hypervisor specific CPUIDs unknown to the native CPU. > > > > Therefore there is no risk of causing this in early CPUID code which > > runs before the #VE handler is set up because it will never access > > those exotic CPUID leaves. > > > > Signed-off-by: Kirill A. Shutemov > > Reviewed-by: Andi Kleen > > Signed-off-by: Kuppuswamy Sathyanarayanan > > > > --- > > arch/x86/kernel/tdx.c | 32 > > 1 file changed, 32 insertions(+) > > > > diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c > > index 5d961263601e..e98058c048b5 100644 > > --- a/arch/x86/kernel/tdx.c > > +++ b/arch/x86/kernel/tdx.c > > @@ -172,6 +172,35 @@ static int tdx_write_msr_safe(unsigned int msr, > > unsigned int low, > > return ret || r10 ? -EIO : 0; > > } > > > > +static void tdx_handle_cpuid(struct pt_regs *regs) > > +{ > > + register long r10 asm("r10") = TDVMCALL_STANDARD; > > + register long r11 asm("r11") = EXIT_REASON_CPUID; > > + register long r12 asm("r12") = regs->ax; > > + register long r13 asm("r13") = regs->cx; > > + register long r14 asm("r14"); > > + register long r15 asm("r15"); > > + register long rcx asm("rcx"); > > + long ret; > > + > > + /* Allow to pass R10, R11, R12, R13, R14 and R15 down to the VMM */ > > + rcx = BIT(10) | BIT(11) | BIT(12) | BIT(13) | BIT(14) | BIT(15); > > + > > + asm volatile(TDCALL > > + : "=a"(ret), "=r"(r10), "=r"(r11), "=r"(r12), > > "=r"(r13), > > + "=r"(r14), "=r"(r15) > > + : "a"(TDVMCALL), "r"(rcx), "r"(r10), "r"(r11), > > "r"(r12), > > + "r"(r13) > > + : ); > > Some "+" constraints would make this simpler. But I think you should > factor the TDCALL helper out into its own function. Factor out TDCALL into a helper is tricky: different TDCALLs have different list of registers passed to VMM. -- Kirill A. Shutemov
Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE
On Fri, Feb 5, 2021 at 3:39 PM Kuppuswamy Sathyanarayanan wrote: > > From: "Kirill A. Shutemov" > > TDX has three classes of CPUID leaves: some CPUID leaves > are always handled by the CPU, others are handled by the TDX module, > and some others are handled by the VMM. Since the VMM cannot directly > intercept the instruction these are reflected with a #VE exception > to the guest, which then converts it into a TDCALL to the VMM, > or handled directly. > > The TDX module EAS has a full list of CPUID leaves which are handled > natively or by the TDX module in 16.2. Only unknown CPUIDs are handled by > the #VE method. In practice this typically only applies to the > hypervisor specific CPUIDs unknown to the native CPU. > > Therefore there is no risk of causing this in early CPUID code which > runs before the #VE handler is set up because it will never access > those exotic CPUID leaves. > > Signed-off-by: Kirill A. Shutemov > Reviewed-by: Andi Kleen > Signed-off-by: Kuppuswamy Sathyanarayanan > > --- > arch/x86/kernel/tdx.c | 32 > 1 file changed, 32 insertions(+) > > diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c > index 5d961263601e..e98058c048b5 100644 > --- a/arch/x86/kernel/tdx.c > +++ b/arch/x86/kernel/tdx.c > @@ -172,6 +172,35 @@ static int tdx_write_msr_safe(unsigned int msr, unsigned > int low, > return ret || r10 ? -EIO : 0; > } > > +static void tdx_handle_cpuid(struct pt_regs *regs) > +{ > + register long r10 asm("r10") = TDVMCALL_STANDARD; > + register long r11 asm("r11") = EXIT_REASON_CPUID; > + register long r12 asm("r12") = regs->ax; > + register long r13 asm("r13") = regs->cx; > + register long r14 asm("r14"); > + register long r15 asm("r15"); > + register long rcx asm("rcx"); > + long ret; > + > + /* Allow to pass R10, R11, R12, R13, R14 and R15 down to the VMM */ > + rcx = BIT(10) | BIT(11) | BIT(12) | BIT(13) | BIT(14) | BIT(15); > + > + asm volatile(TDCALL > + : "=a"(ret), "=r"(r10), "=r"(r11), "=r"(r12), > "=r"(r13), > + "=r"(r14), "=r"(r15) > + : "a"(TDVMCALL), "r"(rcx), "r"(r10), "r"(r11), > "r"(r12), > + "r"(r13) > + : ); Some "+" constraints would make this simpler. But I think you should factor the TDCALL helper out into its own function.