Re: [PATCH] x86/tdx: Support vmalloc() for tdx_enc_status_changed()

2024-06-28 Thread Kirill A. Shutemov
On Mon, May 20, 2024 at 07:12:38PM -0700, Dexuan Cui wrote:
> @@ -785,15 +799,22 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t 
> end, bool enc)
>   */
>  static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool 
> enc)
>  {
> - phys_addr_t start = __pa(vaddr);
> - phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
> + unsigned long start = vaddr;
> + unsigned long end = start + numpages * PAGE_SIZE;
> + unsigned long step = end - start;
> + unsigned long addr;
>  
> - if (!tdx_map_gpa(start, end, enc))
> - return false;
> + /* Step through page-by-page for vmalloc() mappings */
> + if (is_vmalloc_addr((void *)vaddr))
> + step = PAGE_SIZE;
>  
> - /* shared->private conversion requires memory to be accepted before use 
> */
> - if (enc)
> - return tdx_accept_memory(start, end);
> + for (addr = start; addr < end; addr += step) {
> + phys_addr_t start_pa = slow_virt_to_phys((void *)addr);
> + phys_addr_t end_pa   = start_pa + step;
> +
> + if (!tdx_enc_status_changed_phys(start_pa, end_pa, enc))
> + return false;
> + }
>  
>   return true;
>  }

This patch collied with kexec changes. tdx_kexec_finish() calls
tdx_enc_status_changed() after clearing pte, so slow_virt_to_phys()
crashes on in.

Daxuan, could you check if the fixup below works for you on vmalloc
addresses?

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index ef8ec2425998..5e455c883bcc 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -813,8 +813,15 @@ static bool tdx_enc_status_changed(unsigned long vaddr, 
int numpages, bool enc)
step = PAGE_SIZE;
 
for (addr = start; addr < end; addr += step) {
-   phys_addr_t start_pa = slow_virt_to_phys((void *)addr);
-   phys_addr_t end_pa   = start_pa + step;
+   phys_addr_t start_pa;
+   phys_addr_t end_pa;
+
+   if (virt_addr_valid(addr))
+   start_pa = __pa(addr);
+   else
+   start_pa = slow_virt_to_phys((void *)addr);
+
+   end_pa = start_pa + step;
 
if (!tdx_enc_status_changed_phys(start_pa, end_pa, enc))
return false;
-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH] x86/tdx: Enhance code generation for TDCALL and SEAMCALL wrappers

2024-06-04 Thread Kirill A. Shutemov
On Mon, Jun 03, 2024 at 06:37:45AM -0700, Dave Hansen wrote:
> On 6/2/24 04:54, Kirill A. Shutemov wrote:
> > Sean observed that the compiler is generating inefficient code to clear
> > the tdx_module_args struct for TDCALL and SEAMCALL wrappers. The
> > compiler is generating numerous instructions at each call site to clear
> > the unused fields of the structure.
> > 
> > To address this issue, avoid using C99-initializer and instead
> > explicitly use string instructions to clear the struct.
> > 
> > With Clang, this change results in a savings of approximately 3K with my
> > configuration:
> > 
> > add/remove: 0/0 grow/shrink: 0/21 up/down: 0/-3187 (-3187)
> > 
> > With GCC, the savings are less significant at around 300 bytes:
> > 
> > add/remove: 0/0 grow/shrink: 3/22 up/down: 17/-313 (-296)
> > 
> > GCC tends to generate string instructions more frequently to clear the
> > struct.
> 
> 
> 
> I don't think moving away from perfectly normal C struct initialization
> is worth it for 300 bytes of text in couple of slow paths.
> 
> If someone out there is using clang, is confident that it is doing the
> right thing and not just being silly, _and_ is horribly bothered by its
> code generation, then please speak up.

Conceptually, I like my previous attempt more. But it is much more
intrusive and I am not sure it is worth the risk.

This patch feels like hack around compiler.

Sean, do you have any comments?

> > +static __always_inline void tdx_arg_init(struct tdx_module_args *args)
> > +{
> > +   asm ("rep stosb"
> > +: "+D" (args)
> > +: "c" (sizeof(*args)), "a" (0)
> > +: "memory");
> > +}
> 
> The inline assembly also has the side-effect of tripping up the
> compiler.  The compiler can't optimize across these at all and it
> probably has the effect of bloating the code.

It can, but it is limited. Compiler has to flush registers content back to
memory before asm() and cannot assume anything that read from memory
before the asm() is still valid after.
 
> Oh, and if we're going to leave this weirdo initialization idiom for
> TDX, it needs to be well commented:
> 
> /*
>  * Using normal " = {};" to initialize tdx_module_args results in
>  * bloated hard-to-read assembly.  Zero it using the most compact way
>  * available.
>  */
> 
> Eh?

Okay.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



[PATCH] x86/tdx: Enhance code generation for TDCALL and SEAMCALL wrappers

2024-06-02 Thread Kirill A. Shutemov
Sean observed that the compiler is generating inefficient code to clear
the tdx_module_args struct for TDCALL and SEAMCALL wrappers. The
compiler is generating numerous instructions at each call site to clear
the unused fields of the structure.

To address this issue, avoid using C99-initializer and instead
explicitly use string instructions to clear the struct.

With Clang, this change results in a savings of approximately 3K with my
configuration:

add/remove: 0/0 grow/shrink: 0/21 up/down: 0/-3187 (-3187)

With GCC, the savings are less significant at around 300 bytes:

add/remove: 0/0 grow/shrink: 3/22 up/down: 17/-313 (-296)

GCC tends to generate string instructions more frequently to clear the
struct.

Signed-off-by: Kirill A. Shutemov 
Suggested-by: Dave Hansen 
Cc: Sean Christopherson 
---
 arch/x86/boot/compressed/tdx.c|  32 ---
 arch/x86/coco/tdx/tdx-shared.c|   3 +-
 arch/x86/coco/tdx/tdx.c   | 150 +-
 arch/x86/hyperv/ivm.c |  33 ---
 arch/x86/include/asm/shared/tdx.h |  25 +++--
 arch/x86/virt/vmx/tdx/tdx.c   |  28 +++---
 6 files changed, 155 insertions(+), 116 deletions(-)

diff --git a/arch/x86/boot/compressed/tdx.c b/arch/x86/boot/compressed/tdx.c
index 8451d6a1030c..a6784a9153e4 100644
--- a/arch/x86/boot/compressed/tdx.c
+++ b/arch/x86/boot/compressed/tdx.c
@@ -18,13 +18,14 @@ void __tdx_hypercall_failed(void)
 
 static inline unsigned int tdx_io_in(int size, u16 port)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION),
-   .r12 = size,
-   .r13 = 0,
-   .r14 = port,
-   };
+   struct tdx_module_args args;
+
+   tdx_arg_init();
+   args.r10 = TDX_HYPERCALL_STANDARD;
+   args.r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION);
+   args.r12 = size;
+   args.r13 = 0;
+   args.r14 = port;
 
if (__tdx_hypercall())
return UINT_MAX;
@@ -34,14 +35,15 @@ static inline unsigned int tdx_io_in(int size, u16 port)
 
 static inline void tdx_io_out(int size, u16 port, u32 value)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION),
-   .r12 = size,
-   .r13 = 1,
-   .r14 = port,
-   .r15 = value,
-   };
+   struct tdx_module_args args;
+
+   tdx_arg_init();
+   args.r10 = TDX_HYPERCALL_STANDARD;
+   args.r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION);
+   args.r12 = size;
+   args.r13 = 1;
+   args.r14 = port;
+   args.r15 = value;
 
__tdx_hypercall();
 }
diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
index 1655aa56a0a5..b8d1b3d940d2 100644
--- a/arch/x86/coco/tdx/tdx-shared.c
+++ b/arch/x86/coco/tdx/tdx-shared.c
@@ -5,7 +5,7 @@ static unsigned long try_accept_one(phys_addr_t start, unsigned 
long len,
enum pg_level pg_level)
 {
unsigned long accept_size = page_level_size(pg_level);
-   struct tdx_module_args args = {};
+   struct tdx_module_args args;
u8 page_size;
 
if (!IS_ALIGNED(start, accept_size))
@@ -34,6 +34,7 @@ static unsigned long try_accept_one(phys_addr_t start, 
unsigned long len,
return 0;
}
 
+   tdx_arg_init();
args.rcx = start | page_size;
if (__tdcall(TDG_MEM_PAGE_ACCEPT, ))
return 0;
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index c1cb90369915..8112b2910ca2 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -49,13 +49,14 @@ noinstr void __noreturn __tdx_hypercall_failed(void)
 long tdx_kvm_hypercall(unsigned int nr, unsigned long p1, unsigned long p2,
   unsigned long p3, unsigned long p4)
 {
-   struct tdx_module_args args = {
-   .r10 = nr,
-   .r11 = p1,
-   .r12 = p2,
-   .r13 = p3,
-   .r14 = p4,
-   };
+   struct tdx_module_args args;
+
+   tdx_arg_init();
+   args.r10 = nr;
+   args.r11 = p1;
+   args.r12 = p2;
+   args.r13 = p3;
+   args.r14 = p4;
 
return __tdx_hypercall();
 }
@@ -89,13 +90,14 @@ static inline void tdcall(u64 fn, struct tdx_module_args 
*args)
  */
 int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport)
 {
-   struct tdx_module_args args = {
-   .rcx = virt_to_phys(tdreport),
-   .rdx = virt_to_phys(reportdata),
-   .r8 = TDREPORT_SUBTYPE_0,
-   };
+   struct tdx_module_args args;
u64 ret;
 
+   tdx_arg_init();
+   args.rcx = virt_to_phys(tdreport);
+   args.rdx = virt_to_phys(reportdata);
+   args.r8 = TDREPORT_SUBTYPE_0;
+
ret = __tdcall(TDG_MR_REPORT, );
if (ret) {
if (TDCALL_RETURN_CODE(ret

Re: [RFC PATCH] clocksource: hyper-v: Enable the tsc_page for a TDX VM in TD mode

2024-05-23 Thread Kirill A. Shutemov
On Wed, May 22, 2024 at 07:24:41PM -0700, Dexuan Cui wrote:
> A TDX VM on Hyper-V may run in TD mode or Partitioned TD mode (L2). For
> the former, the VM has not enabled the Hyper-V TSC page (which is defined
> in drivers/clocksource/hyperv_timer.c: "... tsc_pg __bss_decrypted ...")
> because, for such a VM, the hypervisor requires that the page should be
> shared, but currently the __bss_decrypted is not working for such a VM
> yet.

I don't see how it is safe. It opens guest clock for manipulations form
VMM. Could you elaborate on security implications?

> Hyper-V TSC page can work as a clocksource device similar to KVM pv
> clock, and it's also used by the Hyper-V timer code to get the current
> time: see hv_init_tsc_clocksource(), which sets the global function
> pointer hv_read_reference_counter to read_hv_clock_tsc(); when
> Hyper-V TSC page is not enabled, hv_read_reference_counter defaults to
> be drivers/hv/hv_common.c: __hv_read_ref_counter(), which is suboptimal
> as it uses the slow MSR interface to get the time info.

Why can't the guest just read the TSC directly? Why do we need the page?
I am confused.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH 16/20] x86/tdx: Convert VP_INFO tdcall to use new TDCALL_5() macro

2024-05-20 Thread Kirill A. Shutemov
On Fri, May 17, 2024 at 08:57:10AM -0700, Dave Hansen wrote:
> On 5/17/24 07:19, Kirill A. Shutemov wrote:
> > -   /*
> > -* TDINFO TDX module call is used to get the TD execution environment
> > -* information like GPA width, number of available vcpus, debug mode
> > -* information, etc. More details about the ABI can be found in TDX
> > -* Guest-Host-Communication Interface (GHCI), section 2.4.2 TDCALL
> > -* [TDG.VP.INFO].
> > -*/
> > -   tdcall(TDG_VP_INFO, );
> > +   tdg_vp_info(_width, _attr);
> 
> Why is the comment going away?

By mistake. Will fix.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH 02/20] x86/tdx: Add macros to generate TDVMCALL wrappers

2024-05-20 Thread Kirill A. Shutemov
On Fri, May 17, 2024 at 06:54:15PM +0200, Paolo Bonzini wrote:
> On 5/17/24 16:19, Kirill A. Shutemov wrote:
> > Introduce a set of macros that allow to generate wrappers for TDVMCALL
> > leafs. The macros uses tdvmcall_trmapoline() and provides SYSV-complaint
> > ABI on top of it.
> 
> Not really SYSV-compliant, more like "The macros use asm() to call
> tdvmcall_trampoline with its custom parameter passing convention".

Sounds better, thanks.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH 01/20] x86/tdx: Introduce tdvmcall_trampoline()

2024-05-20 Thread Kirill A. Shutemov
On Fri, May 17, 2024 at 07:02:25PM +0200, Paolo Bonzini wrote:
> On 5/17/24 16:19, Kirill A. Shutemov wrote:
> > The function will be used from inline assembly to handle most TDVMCALL
> > cases.
> 
> Perhaps add that the calling convention is designed to allow using the asm
> constraints a/b/c/d/S/D and keep the asm blocks simpler?

Sure.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH 01/20] x86/tdx: Introduce tdvmcall_trampoline()

2024-05-20 Thread Kirill A. Shutemov
On Fri, May 17, 2024 at 08:21:37AM -0700, Dave Hansen wrote:
> On 5/17/24 07:19, Kirill A. Shutemov wrote:
> > TDCALL calls are centralized into a few megawrappers that take the
> > struct tdx_module_args as input. Most of the call sites only use a few
> > arguments, but they have to zero out unused fields in the structure to
> > avoid data leaks to the VMM. This leads to the compiler generating
> > inefficient code: dozens of instructions per call site to clear unused
> > fields of the structure.
> 
> I agree that this is what the silly compiler does in practice.  But my
> first preference for fixing it would just be an out-of-line memset() or
> a pretty bare REP;MOV.
> 
> In other words, I think this as the foundational justification for the
> rest of the series leaves a little to be desired.

See the patch below. Is it what you had in mind?

This patch saves ~2K of code, comparing to ~3K for my patchset:

add/remove: 0/0 grow/shrink: 1/17 up/down: 8/-2266 (-2258)

But it is considerably simpler.

 arch/x86/boot/compressed/tdx.c|  32 
 arch/x86/coco/tdx/tdx-shared.c|   3 +-
 arch/x86/coco/tdx/tdx.c   | 159 +-
 arch/x86/hyperv/ivm.c |  32 
 arch/x86/include/asm/shared/tdx.h |  25 --
 5 files changed, 142 insertions(+), 109 deletions(-)

diff --git a/arch/x86/boot/compressed/tdx.c b/arch/x86/boot/compressed/tdx.c
index 8451d6a1030c..a6784a9153e4 100644
--- a/arch/x86/boot/compressed/tdx.c
+++ b/arch/x86/boot/compressed/tdx.c
@@ -18,13 +18,14 @@ void __tdx_hypercall_failed(void)
 
 static inline unsigned int tdx_io_in(int size, u16 port)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION),
-   .r12 = size,
-   .r13 = 0,
-   .r14 = port,
-   };
+   struct tdx_module_args args;
+
+   tdx_arg_init();
+   args.r10 = TDX_HYPERCALL_STANDARD;
+   args.r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION);
+   args.r12 = size;
+   args.r13 = 0;
+   args.r14 = port;
 
if (__tdx_hypercall())
return UINT_MAX;
@@ -34,14 +35,15 @@ static inline unsigned int tdx_io_in(int size, u16 port)
 
 static inline void tdx_io_out(int size, u16 port, u32 value)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION),
-   .r12 = size,
-   .r13 = 1,
-   .r14 = port,
-   .r15 = value,
-   };
+   struct tdx_module_args args;
+
+   tdx_arg_init();
+   args.r10 = TDX_HYPERCALL_STANDARD;
+   args.r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION);
+   args.r12 = size;
+   args.r13 = 1;
+   args.r14 = port;
+   args.r15 = value;
 
__tdx_hypercall();
 }
diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
index 1655aa56a0a5..b8d1b3d940d2 100644
--- a/arch/x86/coco/tdx/tdx-shared.c
+++ b/arch/x86/coco/tdx/tdx-shared.c
@@ -5,7 +5,7 @@ static unsigned long try_accept_one(phys_addr_t start, unsigned 
long len,
enum pg_level pg_level)
 {
unsigned long accept_size = page_level_size(pg_level);
-   struct tdx_module_args args = {};
+   struct tdx_module_args args;
u8 page_size;
 
if (!IS_ALIGNED(start, accept_size))
@@ -34,6 +34,7 @@ static unsigned long try_accept_one(phys_addr_t start, 
unsigned long len,
return 0;
}
 
+   tdx_arg_init();
args.rcx = start | page_size;
if (__tdcall(TDG_MEM_PAGE_ACCEPT, ))
return 0;
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index cadd583d6f62..e8bb8afe04a9 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -53,13 +53,14 @@ noinstr void __noreturn __tdx_hypercall_failed(void)
 long tdx_kvm_hypercall(unsigned int nr, unsigned long p1, unsigned long p2,
   unsigned long p3, unsigned long p4)
 {
-   struct tdx_module_args args = {
-   .r10 = nr,
-   .r11 = p1,
-   .r12 = p2,
-   .r13 = p3,
-   .r14 = p4,
-   };
+   struct tdx_module_args args;
+
+   tdx_arg_init();
+   args.r10 = nr;
+   args.r11 = p1;
+   args.r12 = p2;
+   args.r13 = p3;
+   args.r14 = p4;
 
return __tdx_hypercall();
 }
@@ -80,11 +81,12 @@ static inline void tdcall(u64 fn, struct tdx_module_args 
*args)
 /* Read TD-scoped metadata */
 static inline u64 tdg_vm_rd(u64 field, u64 *value)
 {
-   struct tdx_module_args args = {
-   .rdx = field,
-   };
+   struct tdx_module_args args;
u64 ret;
 
+   tdx_arg_init();
+   args.rdx = field,
+
ret = __tdcall_ret(TDG_VM_RD, );
*value = args.r8;

[PATCH 20/20] x86/tdx: Remove old TDCALL wrappers

2024-05-17 Thread Kirill A. Shutemov
All code has been converted to new TDCALL wrappers.

Drop the old wrappers.

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/boot/compressed/tdx.c|  6 
 arch/x86/coco/tdx/tdcall.S| 60 ++-
 arch/x86/coco/tdx/tdx-shared.c| 20 ---
 arch/x86/coco/tdx/tdx.c   | 18 --
 arch/x86/include/asm/shared/tdx.h | 43 +-
 arch/x86/virt/vmx/tdx/tdxcall.S   | 29 +--
 tools/objtool/noreturns.h |  1 -
 7 files changed, 12 insertions(+), 165 deletions(-)

diff --git a/arch/x86/boot/compressed/tdx.c b/arch/x86/boot/compressed/tdx.c
index 0ae05edc7d42..b74084a46f2f 100644
--- a/arch/x86/boot/compressed/tdx.c
+++ b/arch/x86/boot/compressed/tdx.c
@@ -10,12 +10,6 @@
 
 #include 
 
-/* Called from __tdx_hypercall() for unrecoverable failure */
-void __tdx_hypercall_failed(void)
-{
-   error("TDVMCALL failed. TDX module bug?");
-}
-
 static inline unsigned int tdx_io_in(int size, u16 port)
 {
u64 out;
diff --git a/arch/x86/coco/tdx/tdcall.S b/arch/x86/coco/tdx/tdcall.S
index 5b60b9c8799f..407e2b7ae515 100644
--- a/arch/x86/coco/tdx/tdcall.S
+++ b/arch/x86/coco/tdx/tdcall.S
@@ -1,66 +1,12 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include 
 #include 
+#include 
 
 #include 
-#include 
 
-#include "../../virt/vmx/tdx/tdxcall.S"
-
-.section .noinstr.text, "ax"
-
-/*
- * __tdcall()  - Used by TDX guests to request services from the TDX
- * module (does not include VMM services) using TDCALL instruction.
- *
- * __tdcall() function ABI:
- *
- * @fn   (RDI) - TDCALL Leaf ID, moved to RAX
- * @args (RSI) - struct tdx_module_args for input
- *
- * Only RCX/RDX/R8-R11 are used as input registers.
- *
- * Return status of TDCALL via RAX.
- */
-SYM_FUNC_START(__tdcall)
-   TDX_MODULE_CALL host=0
-SYM_FUNC_END(__tdcall)
-
-/*
- * __tdcall_ret() - Used by TDX guests to request services from the TDX
- * module (does not include VMM services) using TDCALL instruction, with
- * saving output registers to the 'struct tdx_module_args' used as input.
- *
- * __tdcall_ret() function ABI:
- *
- * @fn   (RDI) - TDCALL Leaf ID, moved to RAX
- * @args (RSI) - struct tdx_module_args for input and output
- *
- * Only RCX/RDX/R8-R11 are used as input/output registers.
- *
- * Return status of TDCALL via RAX.
- */
-SYM_FUNC_START(__tdcall_ret)
-   TDX_MODULE_CALL host=0 ret=1
-SYM_FUNC_END(__tdcall_ret)
-
-/*
- * __tdcall_saved_ret() - Used by TDX guests to request services from the
- * TDX module (including VMM services) using TDCALL instruction, with
- * saving output registers to the 'struct tdx_module_args' used as input.
- *
- * __tdcall_saved_ret() function ABI:
- *
- * @fn   (RDI) - TDCALL leaf ID, moved to RAX
- * @args (RSI) - struct tdx_module_args for input/output
- *
- * All registers in @args are used as input/output registers.
- *
- * On successful completion, return the hypercall error code.
- */
-SYM_FUNC_START(__tdcall_saved_ret)
-   TDX_MODULE_CALL host=0 ret=1 saved=1
-SYM_FUNC_END(__tdcall_saved_ret)
+/* TDCALL is supported in Binutils >= 2.36 */
+#define tdcall .byte 0x66,0x0f,0x01,0xcc
 
 /*
  * tdvmcall_trampoline() - Wrapper for TDG.VP.VMCALL. Covers common cases: up
diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
index 9104e96eeefd..b181f7d4d3b9 100644
--- a/arch/x86/coco/tdx/tdx-shared.c
+++ b/arch/x86/coco/tdx/tdx-shared.c
@@ -69,23 +69,3 @@ bool tdx_accept_memory(phys_addr_t start, phys_addr_t end)
 
return true;
 }
-
-noinstr u64 __tdx_hypercall(struct tdx_module_args *args)
-{
-   /*
-* For TDVMCALL explicitly set RCX to the bitmap of shared registers.
-* The caller isn't expected to set @args->rcx anyway.
-*/
-   args->rcx = TDVMCALL_EXPOSE_REGS_MASK;
-
-   /*
-* Failure of __tdcall_saved_ret() indicates a failure of the TDVMCALL
-* mechanism itself and that something has gone horribly wrong with
-* the TDX module.  __tdx_hypercall_failed() never returns.
-*/
-   if (__tdcall_saved_ret(TDG_VP_VMCALL, args))
-   __tdx_hypercall_failed();
-
-   /* TDVMCALL leaf return code is in R10 */
-   return args->r10;
-}
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 45be53d5eeb4..7d9306bd67af 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -38,13 +38,6 @@
 
 static atomic_long_t nr_shared;
 
-/* Called from __tdx_hypercall() for unrecoverable failure */
-noinstr void __noreturn __tdx_hypercall_failed(void)
-{
-   instrumentation_begin();
-   panic("TDVMCALL failed. TDX module bug?");
-}
-
 #ifdef CONFIG_KVM_GUEST
 long tdx_kvm_hypercall(unsigned int nr, unsigned long p1, unsigned long p2,
   unsigned long p3, unsigned long p4)
@@ -62,17 +55,6 @@ long tdx_kvm_hypercall(unsigned int nr, unsigned long p1, 
unsigned long p2,
 EXPORT_SYMBOL_GPL(tdx_k

[PATCH 17/20] x86/tdx: Convert VM_RD/VM_WR tdcalls to use new TDCALL macros

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDCALL instead of tdcall() to issue VM_RD/VM_WR
tdcalls

It increase code slightly:

Function old new   delta
tdx_early_init   744 776 +32

but combined with VP_INFO changes the total effect on tdx_early_init()
is code reduction.

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx.c | 18 ++
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index e1849878f3bc..6559f3842f67 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -76,27 +76,13 @@ static inline void tdcall(u64 fn, struct tdx_module_args 
*args)
 /* Read TD-scoped metadata */
 static inline u64 tdg_vm_rd(u64 field, u64 *value)
 {
-   struct tdx_module_args args = {
-   .rdx = field,
-   };
-   u64 ret;
-
-   ret = __tdcall_ret(TDG_VM_RD, );
-   *value = args.r8;
-
-   return ret;
+   return TDCALL_1(TDG_VM_RD, 0, field, 0, 0, value);
 }
 
 /* Write TD-scoped metadata */
 static inline u64 tdg_vm_wr(u64 field, u64 value, u64 mask)
 {
-   struct tdx_module_args args = {
-   .rdx = field,
-   .r8 = value,
-   .r9 = mask,
-   };
-
-   return __tdcall(TDG_VM_WR, );
+   return TDCALL_1(TDG_VM_WR, 0, field, value, mask, value);
 }
 
 /**
-- 
2.43.0




[PATCH 19/20] x86/tdx: Convert MR_REPORT tdcall to use new TDCALL_0() macro

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDCALL_0() instead of tdcall() to issue
MR_REPORT tdcall.

It cuts code bloat substantially:

Function old new   delta
tdx_mcall_get_report0229 111-118

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx.c | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 42436a43bb49..45be53d5eeb4 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -101,19 +101,15 @@ static inline u64 tdg_vm_wr(u64 field, u64 value, u64 
mask)
  */
 int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport)
 {
-   struct tdx_module_args args = {
-   .rcx = virt_to_phys(tdreport),
-   .rdx = virt_to_phys(reportdata),
-   .r8 = TDREPORT_SUBTYPE_0,
-   };
u64 ret;
 
-   ret = __tdcall(TDG_MR_REPORT, );
-   if (ret) {
-   if (TDCALL_RETURN_CODE(ret) == TDCALL_INVALID_OPERAND)
-   return -EINVAL;
+   ret = TDCALL_0(TDG_MR_REPORT, virt_to_phys(tdreport),
+  virt_to_phys(reportdata), TDREPORT_SUBTYPE_0, 0);
+
+   if (TDCALL_RETURN_CODE(ret) == TDCALL_INVALID_OPERAND)
+   return -EINVAL;
+   else if (ret)
return -EIO;
-   }
 
return 0;
 }
-- 
2.43.0




[PATCH 18/20] x86/tdx: Convert VP_VEINFO_GET tdcall to use new TDCALL_5() macro

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDCALL_5() instead of tdcall() to issue
VP_VEINFO_GET tdcall.

It cuts code bloat substantially:

Function old new   delta
tdx_get_ve_info  253 116-137

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx.c | 31 +++
 1 file changed, 7 insertions(+), 24 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 6559f3842f67..42436a43bb49 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -615,32 +615,15 @@ __init bool tdx_early_handle_ve(struct pt_regs *regs)
 
 void tdx_get_ve_info(struct ve_info *ve)
 {
-   struct tdx_module_args args = {};
+   u64 instr_info, ret;
 
-   /*
-* Called during #VE handling to retrieve the #VE info from the
-* TDX module.
-*
-* This has to be called early in #VE handling.  A "nested" #VE which
-* occurs before this will raise a #DF and is not recoverable.
-*
-* The call retrieves the #VE info from the TDX module, which also
-* clears the "#VE valid" flag. This must be done before anything else
-* because any #VE that occurs while the valid flag is set will lead to
-* #DF.
-*
-* Note, the TDX module treats virtual NMIs as inhibited if the #VE
-* valid flag is set. It means that NMI=>#VE will not result in a #DF.
-*/
-   tdcall(TDG_VP_VEINFO_GET, );
+   ret = TDCALL_5(TDG_VP_VEINFO_GET, 0, 0, 0, 0,
+ve->exit_reason, ve->exit_qual, ve->gla, ve->gpa, instr_info);
 
-   /* Transfer the output parameters */
-   ve->exit_reason = args.rcx;
-   ve->exit_qual   = args.rdx;
-   ve->gla = args.r8;
-   ve->gpa = args.r9;
-   ve->instr_len   = lower_32_bits(args.r10);
-   ve->instr_info  = upper_32_bits(args.r10);
+   BUG_ON(ret);
+
+   ve->instr_len   = lower_32_bits(instr_info);
+   ve->instr_info  = upper_32_bits(instr_info);
 }
 
 /*
-- 
2.43.0




[PATCH 16/20] x86/tdx: Convert VP_INFO tdcall to use new TDCALL_5() macro

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDCALL_5() instead of tdcall() to issue VP_INFO
tdcall.

It cuts code bloat slightly:

Function old new   delta
tdx_early_init   780 744 -36

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx.c | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index e7ffe1cd6d32..e1849878f3bc 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -247,20 +247,22 @@ static void enable_cpu_topology_enumeration(void)
tdg_vm_wr(TDCS_TD_CTLS, TD_CTLS_ENUM_TOPOLOGY, TD_CTLS_ENUM_TOPOLOGY);
 }
 
+static void tdg_vp_info(u64 *gpa_width, u64 *attributes)
+{
+   u64 dummy, ret;
+
+   ret = TDCALL_5(TDG_VP_INFO, 0, 0, 0, 0, *gpa_width, *attributes, dummy,
+  dummy, dummy);
+   BUG_ON(ret);
+
+   *gpa_width &= GENMASK(5, 0);
+}
+
 static void tdx_setup(u64 *cc_mask)
 {
-   struct tdx_module_args args = {};
-   unsigned int gpa_width;
-   u64 td_attr;
+   u64 gpa_width, td_attr;
 
-   /*
-* TDINFO TDX module call is used to get the TD execution environment
-* information like GPA width, number of available vcpus, debug mode
-* information, etc. More details about the ABI can be found in TDX
-* Guest-Host-Communication Interface (GHCI), section 2.4.2 TDCALL
-* [TDG.VP.INFO].
-*/
-   tdcall(TDG_VP_INFO, );
+   tdg_vp_info(_width, _attr);
 
/*
 * The highest bit of a guest physical address is the "sharing" bit.
@@ -269,11 +271,8 @@ static void tdx_setup(u64 *cc_mask)
 * The GPA width that comes out of this call is critical. TDX guests
 * can not meaningfully run without it.
 */
-   gpa_width = args.rcx & GENMASK(5, 0);
*cc_mask = BIT_ULL(gpa_width - 1);
 
-   td_attr = args.rdx;
-
/* Kernel does not use NOTIFY_ENABLES and does not need random #VEs */
tdg_vm_wr(TDCS_NOTIFY_ENABLES, 0, -1ULL);
 
-- 
2.43.0




[PATCH 15/20] x86/tdx: Convert PAGE_ACCEPT tdcall to use new TDCALL_0() macro

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDCALL_0() instead of __tdcall() to issue
PAGE_ACCEPT tdcall.

It cuts code bloat substantially:

Function old new   delta
tdx_accept_memory592 233-359

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx-shared.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
index 1655aa56a0a5..9104e96eeefd 100644
--- a/arch/x86/coco/tdx/tdx-shared.c
+++ b/arch/x86/coco/tdx/tdx-shared.c
@@ -5,8 +5,8 @@ static unsigned long try_accept_one(phys_addr_t start, unsigned 
long len,
enum pg_level pg_level)
 {
unsigned long accept_size = page_level_size(pg_level);
-   struct tdx_module_args args = {};
u8 page_size;
+   u64 ret;
 
if (!IS_ALIGNED(start, accept_size))
return 0;
@@ -34,8 +34,8 @@ static unsigned long try_accept_one(phys_addr_t start, 
unsigned long len,
return 0;
}
 
-   args.rcx = start | page_size;
-   if (__tdcall(TDG_MEM_PAGE_ACCEPT, ))
+   ret = TDCALL_0(TDG_MEM_PAGE_ACCEPT, start | page_size, 0, 0, 0);
+   if (ret)
return 0;
 
return accept_size;
-- 
2.43.0




[PATCH 08/20] x86/tdx: Convert MMIO handling to use new TDVMCALL macros

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDVMCALL_0() and TDVMCALL_1() instead of
__tdx_hypercall() to handle MMIO emulation.

It cuts code bloat substantially:

Function old new   delta
tdx_handle_virt_exception   17471383-364

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx.c | 30 +-
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index c436cab355e0..df3e10d899b3 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -438,38 +438,34 @@ static int handle_cpuid(struct pt_regs *regs, struct 
ve_info *ve)
return ve_instr_len(ve);
 }
 
-static bool mmio_read(int size, unsigned long addr, unsigned long *val)
+static bool mmio_read(int size, unsigned long gpa, u64 *val)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = hcall_func(EXIT_REASON_EPT_VIOLATION),
-   .r12 = size,
-   .r13 = EPT_READ,
-   .r14 = addr,
-   .r15 = *val,
-   };
+   bool ret;
+   u64 out;
 
-   if (__tdx_hypercall())
-   return false;
+   ret = !TDVMCALL_1(hcall_func(EXIT_REASON_EPT_VIOLATION),
+ size, EPT_READ, gpa, 0, out);
+   if (ret)
+   *val = out;
 
-   *val = args.r11;
-   return true;
+   return ret;
 }
 
-static bool mmio_write(int size, unsigned long addr, unsigned long val)
+static bool mmio_write(int size, u64 gpa, u64 val)
 {
-   return !_tdx_hypercall(hcall_func(EXIT_REASON_EPT_VIOLATION), size,
-  EPT_WRITE, addr, val);
+   return !TDVMCALL_0(hcall_func(EXIT_REASON_EPT_VIOLATION),
+size, EPT_WRITE, gpa, val);
 }
 
 static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 {
-   unsigned long *reg, val, vaddr;
+   unsigned long *reg, vaddr;
char buffer[MAX_INSN_SIZE];
enum insn_mmio_type mmio;
struct insn insn = {};
int size, extend_size;
u8 extend_val = 0;
+   u64 val;
 
/* Only in-kernel MMIO is supported */
if (WARN_ON_ONCE(user_mode(regs)))
-- 
2.43.0




[PATCH 12/20] x86/tdx: Rewrite tdx_kvm_hypercall() without __tdx_hypercall()

2024-05-17 Thread Kirill A. Shutemov
tdx_kvm_hypercall() issues KVM hypercall. Rewrite it without using
__tdx_hypercall(). Use tdvmcall_trampoline() instead.

It cuts code bloat substantially:

Function old new   delta
tdx_kvm_hypercall160  53-107

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index b7299e668564..e7ffe1cd6d32 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -49,15 +49,15 @@ noinstr void __noreturn __tdx_hypercall_failed(void)
 long tdx_kvm_hypercall(unsigned int nr, unsigned long p1, unsigned long p2,
   unsigned long p3, unsigned long p4)
 {
-   struct tdx_module_args args = {
-   .r10 = nr,
-   .r11 = p1,
-   .r12 = p2,
-   .r13 = p3,
-   .r14 = p4,
-   };
+   long ret;
 
-   return __tdx_hypercall();
+   asm("call   tdvmcall_trampoline\n\t"
+   "movq   %%r10, %0\n\t"
+   : "=r" (ret)
+   : "a" (nr), "b" (p1), "D" (p2), "S"(p3), "d"(p4), "c" (0)
+   : "r12", "r13", "r14", "r15");
+
+   return ret;
 }
 EXPORT_SYMBOL_GPL(tdx_kvm_hypercall);
 #endif
-- 
2.43.0




[PATCH 11/20] x86/tdx: Rewrite tdx_panic() without __tdx_hypercall()

2024-05-17 Thread Kirill A. Shutemov
tdx_panic() uses REPORT_FATAL_ERROR hypercall to deliver panic message
in ealy boot. Rewrite it without using __tdx_hypercall().

REPORT_FATAL_ERROR hypercall is special. It uses pretty much all
available registers to pass down the error message. TDVMCALL macros are
not usable here.

Implement the hypercall directly in assembly.

It cuts code bloat substantially:

Function old new   delta
tdx_panic222  59-163

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdcall.S | 28 
 arch/x86/coco/tdx/tdx.c| 31 +++
 arch/x86/include/asm/tdx.h |  2 ++
 tools/objtool/noreturns.h  |  1 +
 4 files changed, 34 insertions(+), 28 deletions(-)

diff --git a/arch/x86/coco/tdx/tdcall.S b/arch/x86/coco/tdx/tdcall.S
index 12185fbd33ba..269e5789672a 100644
--- a/arch/x86/coco/tdx/tdcall.S
+++ b/arch/x86/coco/tdx/tdcall.S
@@ -110,3 +110,31 @@ SYM_FUNC_START(tdvmcall_trampoline)
ud2
 SYM_FUNC_END(tdvmcall_trampoline)
 .popsection
+
+SYM_FUNC_START(tdvmcall_report_fatal_error)
+   movq$TDX_HYPERCALL_STANDARD, %r10
+   movq$TDVMCALL_REPORT_FATAL_ERROR, %r11
+   movq%rdi, %r12
+   movq$0, %r13
+
+   movq%rsi, %rcx
+
+   /* Order according to the GHCI */
+   movq0*8(%rcx), %r14
+   movq1*8(%rcx), %r15
+   movq2*8(%rcx), %rbx
+   movq3*8(%rcx), %rdi
+   movq4*8(%rcx), %rsi
+   movq5*8(%rcx), %r8
+   movq6*8(%rcx), %r9
+   movq7*8(%rcx), %rdx
+
+   movq$TDG_VP_VMCALL, %rax
+   movq$(TDX_RDX | TDX_RBX | TDX_RSI | TDX_RDI | TDX_R8  | TDX_R9  | \
+ TDX_R10 | TDX_R11 | TDX_R12 | TDX_R13 | TDX_R14 | TDX_R15), \
+   %rcx
+
+   tdcall
+
+   ud2
+SYM_FUNC_END(tdvmcall_report_fatal_error)
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 3f0be1d3cccb..b7299e668564 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -157,37 +157,12 @@ EXPORT_SYMBOL_GPL(tdx_hcall_get_quote);
 
 static void __noreturn tdx_panic(const char *msg)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = TDVMCALL_REPORT_FATAL_ERROR,
-   .r12 = 0, /* Error code: 0 is Panic */
-   };
-   union {
-   /* Define register order according to the GHCI */
-   struct { u64 r14, r15, rbx, rdi, rsi, r8, r9, rdx; };
-
-   char str[64];
-   } message;
+   char str[64];
 
/* VMM assumes '\0' in byte 65, if the message took all 64 bytes */
-   strtomem_pad(message.str, msg, '\0');
+   strtomem_pad(str, msg, '\0');
 
-   args.r8  = message.r8;
-   args.r9  = message.r9;
-   args.r14 = message.r14;
-   args.r15 = message.r15;
-   args.rdi = message.rdi;
-   args.rsi = message.rsi;
-   args.rbx = message.rbx;
-   args.rdx = message.rdx;
-
-   /*
-* This hypercall should never return and it is not safe
-* to keep the guest running. Call it forever if it
-* happens to return.
-*/
-   while (1)
-   __tdx_hypercall();
+   tdvmcall_report_fatal_error(0, str);
 }
 
 /*
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index eba178996d84..f67e5e6b66ad 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -54,6 +54,8 @@ struct ve_info {
 
 void __init tdx_early_init(void);
 
+void __noreturn tdvmcall_report_fatal_error(u64 error_code, const char 
str[64]);
+
 void tdx_get_ve_info(struct ve_info *ve);
 
 bool tdx_handle_virt_exception(struct pt_regs *regs, struct ve_info *ve);
diff --git a/tools/objtool/noreturns.h b/tools/objtool/noreturns.h
index 7ebf29c91184..0670cacf0734 100644
--- a/tools/objtool/noreturns.h
+++ b/tools/objtool/noreturns.h
@@ -39,6 +39,7 @@ NORETURN(sev_es_terminate)
 NORETURN(snp_abort)
 NORETURN(start_kernel)
 NORETURN(stop_this_cpu)
+NORETURN(tdvmcall_report_fatal_error)
 NORETURN(usercopy_abort)
 NORETURN(x86_64_start_kernel)
 NORETURN(x86_64_start_reservations)
-- 
2.43.0




[PATCH 14/20] x86/tdx: Add macros to generate TDCALL wrappers

2024-05-17 Thread Kirill A. Shutemov
Introduce a set of macros that allow to generate wrappers for TDCALL
leafs.

There are three macros differentiated by number of return parameters.

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/include/asm/shared/tdx.h | 58 +++
 1 file changed, 58 insertions(+)

diff --git a/arch/x86/include/asm/shared/tdx.h 
b/arch/x86/include/asm/shared/tdx.h
index 46c299dc9cf0..70190ebc63ca 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -80,6 +80,64 @@
 
 #include 
 
+#define TDCALL ".byte  0x66,0x0f,0x01,0xcc\n\t"
+
+#define TDCALL_0(reason, in_rcx, in_rdx, in_r8, in_r9) 
\
+({ 
\
+   long __ret; 
\
+   
\
+   asm(
\
+   "movq   %[r8_in], %%r8\n\t" 
\
+   "movq   %[r9_in], %%r9\n\t" 
\
+   TDCALL  
\
+   : "=a" (__ret), ASM_CALL_CONSTRAINT 
\
+   : "a" (reason), "c" (in_rcx), "d" (in_rdx), 
\
+ [r8_in] "rm" ((u64)in_r8), [r9_in] "rm" ((u64)in_r9)  
\
+   : "r8", "r9"
\
+   );  
\
+   __ret;  
\
+})
+
+#define TDCALL_1(reason, in_rcx, in_rdx, in_r8, in_r9, out_r8) 
\
+({ 
\
+   long __ret; 
\
+   
\
+   asm(
\
+   "movq   %[r8_in], %%r8\n\t" 
\
+   "movq   %[r9_in], %%r9\n\t" 
\
+   TDCALL  
\
+   "movq   %%r8, %[r8_out]\n\t"
\
+   : "=a" (__ret), ASM_CALL_CONSTRAINT, [r8_out] "=rm" (out_r8)
\
+   : "a" (reason), "c" (in_rcx), "d" (in_rdx), 
\
+ [r8_in] "rm" ((u64)in_r8), [r9_in] "rm" ((u64)in_r9)  
\
+   : "r8", "r9"
\
+   );  
\
+   __ret;  
\
+})
+
+#define TDCALL_5(reason, in_rcx, in_rdx, in_r8, in_r9, 
\
+out_rcx, out_rdx, out_r8, out_r9, out_r10) 
\
+({ 
\
+   long __ret; 
\
+   
\
+   asm(
\
+   "movq   %[r8_in], %%r8\n\t" 
\
+   "movq   %[r9_in], %%r9\n\t" 
\
+   TDCALL  
\
+   "movq   %%r8, %[r8_out]\n\t"
\
+   "movq   %%r9, %[r9_out]\n\t"
\
+   "movq   %%r10, %[r10_out]\n\t"  
\
+   : "=a" (__ret), ASM_CALL_CONSTRAINT,
\
+ "=c" (out_rcx), "=d" (out_rdx),   
\
+ [r8_out] "=rm" (out_r8), [r9_out] "=rm" (out_r9), 
\
+ [r10_out] "=rm" (out_r10) 
\
+   : "a" (reason), "c" (in_rcx), "d" (in_rdx), 
\
+ [r8_in] "rm" ((u64)in_r8), [r9_in] "rm" ((u64)in_r9)  
\
+   : "r8", "r9",

[PATCH 09/20] x86/tdx: Convert MAP_GPA hypercall to use new TDVMCALL macros

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDVMCALL_1() instead of __tdx_hypercall() to issue
MAP_GPA hypercall.

It cuts code bloat substantially:

Function old new   delta
tdx_enc_status_changed   352 242-110
tdx_kexec_finish 645 530-115
tdx_enc_status_change_prepare326 181-145
Total: Before=5553, After=5183, chg -6.66%

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index df3e10d899b3..7c874a50a319 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -797,15 +797,10 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t 
end, bool enc)
}
 
while (retry_count < max_retries_per_page) {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = TDVMCALL_MAP_GPA,
-   .r12 = start,
-   .r13 = end - start };
-
-   u64 map_fail_paddr;
-   u64 ret = __tdx_hypercall();
+   u64 map_fail_paddr, ret;
 
+   ret = TDVMCALL_1(TDVMCALL_MAP_GPA,
+start, end - start, 0, 0, map_fail_paddr);
if (ret != TDVMCALL_STATUS_RETRY)
return !ret;
/*
@@ -813,7 +808,6 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, 
bool enc)
 * region starting at the GPA specified in R11. R11 comes
 * from the untrusted VMM. Sanity check it.
 */
-   map_fail_paddr = args.r11;
if (map_fail_paddr < start || map_fail_paddr >= end)
return false;
 
-- 
2.43.0




[PATCH 07/20] x86/tdx: Convert CPUID handling to use new TDVMCALL_4()

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDVMCALL_4() instead of __tdx_hypercall() to handle
CPUID instruction emulation.

It cuts code bloat substantially:

Function old new   delta
tdx_handle_virt_exception   18191747 -72

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx.c | 20 ++--
 1 file changed, 2 insertions(+), 18 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index f59a2b3500db..c436cab355e0 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -414,13 +414,6 @@ static int write_msr(struct pt_regs *regs, struct ve_info 
*ve)
 
 static int handle_cpuid(struct pt_regs *regs, struct ve_info *ve)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = hcall_func(EXIT_REASON_CPUID),
-   .r12 = regs->ax,
-   .r13 = regs->cx,
-   };
-
/*
 * Only allow VMM to control range reserved for hypervisor
 * communication.
@@ -438,19 +431,10 @@ static int handle_cpuid(struct pt_regs *regs, struct 
ve_info *ve)
 * ABI can be found in TDX Guest-Host-Communication Interface
 * (GHCI), section titled "VP.VMCALL".
 */
-   if (__tdx_hypercall())
+   if (TDVMCALL_4(EXIT_REASON_CPUID, regs->ax, regs->cx, 0, 0,
+  regs->ax, regs->bx, regs->cx, regs->dx))
return -EIO;
 
-   /*
-* As per TDX GHCI CPUID ABI, r12-r15 registers contain contents of
-* EAX, EBX, ECX, EDX registers after the CPUID instruction execution.
-* So copy the register contents back to pt_regs.
-*/
-   regs->ax = args.r12;
-   regs->bx = args.r13;
-   regs->cx = args.r14;
-   regs->dx = args.r15;
-
return ve_instr_len(ve);
 }
 
-- 
2.43.0




[PATCH 13/20] x86/tdx: Rewrite hv_tdx_hypercall() without __tdx_hypercall()

2024-05-17 Thread Kirill A. Shutemov
Rewrite hv_tdx_hypercall() in assembly to remove one more
__tdx_hypercall() user.

tdvmcall_trampoline() cannot be used here as Hyper-V uses R8 and RDX to
pass down parameters which is incompatible with tdvmcall_trampoline()

The rewrite cuts code bloat substantially:

Function old new   delta
hv_tdx_hypercall 171  42-129

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdcall.S | 30 ++
 arch/x86/hyperv/ivm.c  | 14 --
 2 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/arch/x86/coco/tdx/tdcall.S b/arch/x86/coco/tdx/tdcall.S
index 269e5789672a..5b60b9c8799f 100644
--- a/arch/x86/coco/tdx/tdcall.S
+++ b/arch/x86/coco/tdx/tdcall.S
@@ -138,3 +138,33 @@ SYM_FUNC_START(tdvmcall_report_fatal_error)
 
ud2
 SYM_FUNC_END(tdvmcall_report_fatal_error)
+
+#ifdef CONFIG_HYPERV
+/*
+ * hv_tdx_hypercall() - Issue Hyper-V hypercall
+ *
+ * RDI - Hypercall ID
+ * RSI - Parameter 1
+ * RCX - Parameter 2
+ */
+SYM_FUNC_START(hv_tdx_hypercall)
+   movq%rdi, %r10
+   movq%rsi, %rdx
+   movq%rcx, %r8
+
+   movq$TDG_VP_VMCALL, %rax
+   movq$(TDX_R8 | TDX_R10 | TDX_RDX), %rcx
+
+   tdcall
+
+   /* TDG.VP.VMCALL never fails on correct use. Panic if it fails. */
+   testq   %rax, %rax
+   jnz .Lpanic_hv
+
+   movq%r11, %rax
+
+   RET
+.Lpanic_hv:
+   ud2
+SYM_FUNC_END(hv_tdx_hypercall)
+#endif
diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index 18d0892d9fc4..562980e19d68 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -401,20 +401,6 @@ static void hv_tdx_msr_read(u64 msr, u64 *val)
if (WARN_ONCE(ret, "Failed to emulate MSR read: %lld\n", ret))
*val = 0;
 }
-
-u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2)
-{
-   struct tdx_module_args args = { };
-
-   args.r10 = control;
-   args.rdx = param1;
-   args.r8  = param2;
-
-   (void)__tdx_hypercall();
-
-   return args.r11;
-}
-
 #else
 static inline void hv_tdx_msr_write(u64 msr, u64 value) {}
 static inline void hv_tdx_msr_read(u64 msr, u64 *value) {}
-- 
2.43.0




[PATCH 10/20] x86/tdx: Convert GET_QUOTE hypercall to use new TDVMCALL macros

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDVMCALL_0() instead of __tdx_hypercall() to issue
GET_QUOTE hypercall.

It cuts code bloat substantially:

Function old new   delta
tdx_hcall_get_quote  188  76-112

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 7c874a50a319..3f0be1d3cccb 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -150,7 +150,8 @@ EXPORT_SYMBOL_GPL(tdx_mcall_get_report0);
 u64 tdx_hcall_get_quote(u8 *buf, size_t size)
 {
/* Since buf is a shared memory, set the shared (decrypted) bits */
-   return _tdx_hypercall(TDVMCALL_GET_QUOTE, cc_mkdec(virt_to_phys(buf)), 
size, 0, 0);
+   return TDVMCALL_0(TDVMCALL_GET_QUOTE,
+ cc_mkdec(virt_to_phys(buf)), size, 0, 0);
 }
 EXPORT_SYMBOL_GPL(tdx_hcall_get_quote);
 
-- 
2.43.0




[PATCH 05/20] x86/tdx: Convert MSR read handling to use new TDVMCALL_1()

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDVMCALL_1() instead of __tdx_hypercall() to handle
MSR read emulation.

It cuts code bloat substantially:

Function old new   delta
tdx_handle_virt_exception   20521947-105

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx.c | 15 +++
 arch/x86/hyperv/ivm.c   | 10 ++
 2 files changed, 9 insertions(+), 16 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index dce7d6f9f895..32c519d096de 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -379,22 +379,21 @@ void __cpuidle tdx_safe_halt(void)
 
 static int read_msr(struct pt_regs *regs, struct ve_info *ve)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = hcall_func(EXIT_REASON_MSR_READ),
-   .r12 = regs->cx,
-   };
+   u64 val;
 
/*
 * Emulate the MSR read via hypercall. More info about ABI
 * can be found in TDX Guest-Host-Communication Interface
 * (GHCI), section titled "TDG.VP.VMCALL".
 */
-   if (__tdx_hypercall())
+   if (TDVMCALL_1(hcall_func(EXIT_REASON_MSR_READ),
+  regs->cx, 0, 0, 0, val)) {
return -EIO;
+   }
+
+   regs->ax = lower_32_bits(val);
+   regs->dx = upper_32_bits(val);
 
-   regs->ax = lower_32_bits(args.r11);
-   regs->dx = upper_32_bits(args.r11);
return ve_instr_len(ve);
 }
 
diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index b4a851d27c7c..3e2cbfb2203d 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -399,18 +399,12 @@ static void hv_tdx_msr_write(u64 msr, u64 val)
 
 static void hv_tdx_msr_read(u64 msr, u64 *val)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = EXIT_REASON_MSR_READ,
-   .r12 = msr,
-   };
+   u64 ret;
 
-   u64 ret = __tdx_hypercall();
+   ret = TDVMCALL_1(hcall_func(EXIT_REASON_MSR_READ), msr, 0, 0, 0, *val);
 
if (WARN_ONCE(ret, "Failed to emulate MSR read: %lld\n", ret))
*val = 0;
-   else
-   *val = args.r11;
 }
 
 u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2)
-- 
2.43.0




[PATCH 01/20] x86/tdx: Introduce tdvmcall_trampoline()

2024-05-17 Thread Kirill A. Shutemov
TDCALL calls are centralized into a few megawrappers that take the
struct tdx_module_args as input. Most of the call sites only use a few
arguments, but they have to zero out unused fields in the structure to
avoid data leaks to the VMM. This leads to the compiler generating
inefficient code: dozens of instructions per call site to clear unused
fields of the structure.

This issue can be avoided by using more targeted wrappers.
tdvmcall_trampoline() provides a common base for them.

The function will be used from inline assembly to handle most TDVMCALL
cases.

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdcall.S | 49 ++
 1 file changed, 49 insertions(+)

diff --git a/arch/x86/coco/tdx/tdcall.S b/arch/x86/coco/tdx/tdcall.S
index 52d9786da308..12185fbd33ba 100644
--- a/arch/x86/coco/tdx/tdcall.S
+++ b/arch/x86/coco/tdx/tdcall.S
@@ -61,3 +61,52 @@ SYM_FUNC_END(__tdcall_ret)
 SYM_FUNC_START(__tdcall_saved_ret)
TDX_MODULE_CALL host=0 ret=1 saved=1
 SYM_FUNC_END(__tdcall_saved_ret)
+
+/*
+ * tdvmcall_trampoline() - Wrapper for TDG.VP.VMCALL. Covers common cases: up
+ * to five input and out arguments.
+ *
+ * tdvmcall_trampoline() function ABI is not SYSV ABI compliant. Caller has to
+ * deal with it.
+ *
+ * Input:
+ * RAX - Type of call, TDX_HYPERCALL_STANDARD for calls defined in GHCI spec
+ * RBX - 1st argument (R11), leaf ID if RAX is TDX_HYPERCALL_STANDARD
+ * RDI - 2nd argument (R12)
+ * RSI - 3rd argument (R13)
+ * RDX - 4th argument (R14)
+ * RCX - 5th argument (R15)
+ *
+ * Output:
+ * R10 - TDVMCALL error code
+ * R11 - Output 1
+ * R12 - Output 2
+ * R13 - Output 3
+ * R14 - Output 4
+ * R15 - Output 5
+ */
+.pushsection .noinstr.text, "ax"
+SYM_FUNC_START(tdvmcall_trampoline)
+   movq%rax, %r10
+   movq%rbx, %r11
+   movq%rdi, %r12
+   movq%rsi, %r13
+   movq%rdx, %r14
+   movq%rcx, %r15
+
+   movq$TDG_VP_VMCALL, %rax
+
+   /* RCX is bitmap of registers exposed to VMM on TDG.VM.VMCALL */
+   movq$(TDX_R10 | TDX_R11 | TDX_R12 | TDX_R13 | TDX_R14 | TDX_R15), 
%rcx
+
+   tdcall
+
+   /* TDG.VP.VMCALL never fails on correct use. Panic if it fails. */
+   testq   %rax, %rax
+   jnz .Lpanic
+
+   RET
+.Lpanic:
+   ud2
+SYM_FUNC_END(tdvmcall_trampoline)
+.popsection
-- 
2.43.0




[PATCH 06/20] x86/tdx: Convert MSR write handling to use new TDVMCALL_0()

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDVMCALL_0() instead of __tdx_hypercall() to handle
MSR write emulation.

It cuts code bloat substantially:

Function old new   delta
tdx_handle_virt_exception   19471819-128

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx.c | 9 ++---
 arch/x86/hyperv/ivm.c   | 9 ++---
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 32c519d096de..f59a2b3500db 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -399,19 +399,14 @@ static int read_msr(struct pt_regs *regs, struct ve_info 
*ve)
 
 static int write_msr(struct pt_regs *regs, struct ve_info *ve)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = hcall_func(EXIT_REASON_MSR_WRITE),
-   .r12 = regs->cx,
-   .r13 = (u64)regs->dx << 32 | regs->ax,
-   };
+   u64 val = (u64)regs->dx << 32 | regs->ax;
 
/*
 * Emulate the MSR write via hypercall. More info about ABI
 * can be found in TDX Guest-Host-Communication Interface
 * (GHCI) section titled "TDG.VP.VMCALL".
 */
-   if (__tdx_hypercall())
+   if (TDVMCALL_0(hcall_func(EXIT_REASON_MSR_WRITE), regs->cx, val, 0, 0))
return -EIO;
 
return ve_instr_len(ve);
diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index 3e2cbfb2203d..18d0892d9fc4 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -385,14 +385,9 @@ static inline void hv_ghcb_msr_read(u64 msr, u64 *value) {}
 #ifdef CONFIG_INTEL_TDX_GUEST
 static void hv_tdx_msr_write(u64 msr, u64 val)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = EXIT_REASON_MSR_WRITE,
-   .r12 = msr,
-   .r13 = val,
-   };
+   u64 ret;
 
-   u64 ret = __tdx_hypercall();
+   ret = TDVMCALL_0(hcall_func(EXIT_REASON_MSR_WRITE), msr, val, 0, 0);
 
WARN_ONCE(ret, "Failed to emulate MSR write: %lld\n", ret);
 }
-- 
2.43.0




[PATCH 02/20] x86/tdx: Add macros to generate TDVMCALL wrappers

2024-05-17 Thread Kirill A. Shutemov
Introduce a set of macros that allow to generate wrappers for TDVMCALL
leafs. The macros uses tdvmcall_trmapoline() and provides SYSV-complaint
ABI on top of it.

There are three macros differentiated by number of return parameters.

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/include/asm/shared/tdx.h | 54 +++
 1 file changed, 54 insertions(+)

diff --git a/arch/x86/include/asm/shared/tdx.h 
b/arch/x86/include/asm/shared/tdx.h
index 89f7fcade8ae..ddf2cc4a45da 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -76,6 +76,60 @@
 
 #include 
 
+#define TDVMCALL_0(reason, in_r12, in_r13, in_r14, in_r15) 
\
+({ 
\
+   long __ret; 
\
+   
\
+   asm(
\
+   "call   tdvmcall_trampoline\n\t"
\
+   "movq   %%r10, %[r10_out]\n\t"  
\
+   : [r10_out] "=r" (__ret), ASM_CALL_CONSTRAINT   
\
+   : "a" (TDX_HYPERCALL_STANDARD), "b" (reason),   
\
+ "D" (in_r12), "S"(in_r13), "d"(in_r14), "c" (in_r15)  
\
+   : "r12", "r13", "r14", "r15"
\
+   );  
\
+   __ret;  
\
+})
+
+#define TDVMCALL_1(reason, in_r12, in_r13, in_r14, in_r15, out_r11)
\
+({ 
\
+   long __ret; 
\
+   
\
+   asm(
\
+   "call   tdvmcall_trampoline\n\t"
\
+   "movq   %%r10, %[r10_out]\n\t"  
\
+   "movq   %%r11, %[r11_out]\n\t"  
\
+   : [r10_out] "=r" (__ret), [r11_out] "=r" (out_r11), 
\
+ ASM_CALL_CONSTRAINT   
\
+   : "a" (TDX_HYPERCALL_STANDARD), "b" (reason),   
\
+ "D" (in_r12), "S"(in_r13), "d"(in_r14), "c" (in_r15)  
\
+   : "r10", "r11", "r12", "r13", "r14", "r15"  
\
+   );  
\
+   __ret;  
\
+})
+
+#define TDVMCALL_4(reason, in_r12, in_r13, in_r14, in_r15, 
\
+  out_r12, out_r13, out_r14, out_r15)  
\
+({ 
\
+   long __ret; 
\
+   
\
+   asm(
\
+   "call   tdvmcall_trampoline\n\t"
\
+   "movq   %%r10, %[r10_out]\n\t"  
\
+   "movq   %%r12, %[r12_out]\n\t"  
\
+   "movq   %%r13, %[r13_out]\n\t"  
\
+   "movq   %%r14, %[r14_out]\n\t"  
\
+   "movq   %%r15, %[r15_out]\n\t"  
\
+   : [r10_out] "=r" (__ret), ASM_CALL_CONSTRAINT,  
\
+ [r12_out] "=r" (out_r12), [r13_out] "=r" (out_r13),   
\
+ [r14_out] "=r" (out_r14), [r15_out] "=r" (out_r15)
\
+   : "a" (TDX_HYPERCALL_STANDARD), "b" (reason),   
\
+ "D" (in_r12), "S"(in_r13), "d"(in_r14), "c" (in_r15)  
\
+   : "r10", "r12", "r13", "r14", "r15" 
\
+   );  
\
+   __ret;  
\
+})
+
 /*
  * Used in __tdcall*() to gather the input/output registers' values of the
  * TDCALL instruction when requesting services from the TDX module. This is a
-- 
2.43.0




[PATCH 03/20] x86/tdx: Convert port I/O handling to use new TDVMCALL macros

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDVMCALL_0() and TDVMCALL_1() instead of
__tdx_hypercall() to handle port I/O in TDX guest.

It cuts handle_io() size in half:

Function old new   delta
handle_io436 202-234

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/boot/compressed/tdx.c| 26 +++---
 arch/x86/coco/tdx/tdx.c   | 23 +++
 arch/x86/include/asm/shared/tdx.h |  4 
 3 files changed, 18 insertions(+), 35 deletions(-)

diff --git a/arch/x86/boot/compressed/tdx.c b/arch/x86/boot/compressed/tdx.c
index 8451d6a1030c..0ae05edc7d42 100644
--- a/arch/x86/boot/compressed/tdx.c
+++ b/arch/x86/boot/compressed/tdx.c
@@ -18,32 +18,20 @@ void __tdx_hypercall_failed(void)
 
 static inline unsigned int tdx_io_in(int size, u16 port)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION),
-   .r12 = size,
-   .r13 = 0,
-   .r14 = port,
-   };
+   u64 out;
 
-   if (__tdx_hypercall())
+   if (TDVMCALL_1(hcall_func(EXIT_REASON_IO_INSTRUCTION),
+  size, TDX_PORT_READ, port, 0, out)) {
return UINT_MAX;
+   }
 
-   return args.r11;
+   return out;
 }
 
 static inline void tdx_io_out(int size, u16 port, u32 value)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION),
-   .r12 = size,
-   .r13 = 1,
-   .r14 = port,
-   .r15 = value,
-   };
-
-   __tdx_hypercall();
+   TDVMCALL_0(hcall_func(EXIT_REASON_IO_INSTRUCTION),
+  size, TDX_PORT_WRITE, port, value);
 }
 
 static inline u8 tdx_inb(u16 port)
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index cadd583d6f62..6e0e5648ebd1 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -21,10 +21,6 @@
 #define EPT_READ   0
 #define EPT_WRITE  1
 
-/* Port I/O direction */
-#define PORT_READ  0
-#define PORT_WRITE 1
-
 /* See Exit Qualification for I/O Instructions in VMX documentation */
 #define VE_IS_IO_IN(e) ((e) & BIT(3))
 #define VE_GET_IO_SIZE(e)  (((e) & GENMASK(2, 0)) + 1)
@@ -612,14 +608,7 @@ static int handle_mmio(struct pt_regs *regs, struct 
ve_info *ve)
 
 static bool handle_in(struct pt_regs *regs, int size, int port)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION),
-   .r12 = size,
-   .r13 = PORT_READ,
-   .r14 = port,
-   };
-   u64 mask = GENMASK(BITS_PER_BYTE * size, 0);
+   u64 mask, out;
bool success;
 
/*
@@ -627,12 +616,14 @@ static bool handle_in(struct pt_regs *regs, int size, int 
port)
 * in TDX Guest-Host-Communication Interface (GHCI) section titled
 * "TDG.VP.VMCALL".
 */
-   success = !__tdx_hypercall();
+   success = !TDVMCALL_1(hcall_func(EXIT_REASON_IO_INSTRUCTION),
+ size, TDX_PORT_READ, port, 0, out);
 
/* Update part of the register affected by the emulated instruction */
+   mask = GENMASK(BITS_PER_BYTE * size, 0);
regs->ax &= ~mask;
if (success)
-   regs->ax |= args.r11 & mask;
+   regs->ax |= out & mask;
 
return success;
 }
@@ -646,8 +637,8 @@ static bool handle_out(struct pt_regs *regs, int size, int 
port)
 * in TDX Guest-Host-Communication Interface (GHCI) section titled
 * "TDG.VP.VMCALL".
 */
-   return !_tdx_hypercall(hcall_func(EXIT_REASON_IO_INSTRUCTION), size,
-  PORT_WRITE, port, regs->ax & mask);
+   return !TDVMCALL_0(hcall_func(EXIT_REASON_IO_INSTRUCTION),
+  size, TDX_PORT_WRITE, port, regs->ax & mask);
 }
 
 /*
diff --git a/arch/x86/include/asm/shared/tdx.h 
b/arch/x86/include/asm/shared/tdx.h
index ddf2cc4a45da..46c299dc9cf0 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -72,6 +72,10 @@
 #define TDX_PS_1G  2
 #define TDX_PS_NR  (TDX_PS_1G + 1)
 
+/* Port I/O direction */
+#define TDX_PORT_READ  0
+#define TDX_PORT_WRITE 1
+
 #ifndef __ASSEMBLY__
 
 #include 
-- 
2.43.0




[PATCH 04/20] x86/tdx: Convert HLT handling to use new TDVMCALL_0()

2024-05-17 Thread Kirill A. Shutemov
Use newly introduced TDVMCALL_0() instead of __tdx_hypercall() to handle
HLT instruction emulation.

It cuts code bloat substantially:

Function old new   delta
tdx_safe_halt 58  88 +30
tdx_handle_virt_exception   20232052 +29
__pfx___halt  16   - -16
__halt   171   --171
Total: Before=6350, After=6222, chg -2.02%

Signed-off-by: Kirill A. Shutemov 
---
 arch/x86/coco/tdx/tdx.c | 29 ++---
 1 file changed, 6 insertions(+), 23 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 6e0e5648ebd1..dce7d6f9f895 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -350,18 +350,12 @@ static int ve_instr_len(struct ve_info *ve)
}
 }
 
-static u64 __cpuidle __halt(const bool irq_disabled)
+static int handle_halt(struct ve_info *ve)
 {
-   struct tdx_module_args args = {
-   .r10 = TDX_HYPERCALL_STANDARD,
-   .r11 = hcall_func(EXIT_REASON_HLT),
-   .r12 = irq_disabled,
-   };
-
/*
 * Emulate HLT operation via hypercall. More info about ABI
 * can be found in TDX Guest-Host-Communication Interface
-* (GHCI), section 3.8 TDG.VP.VMCALL.
+* (GHCI), section TDG.VP.VMCALL.
 *
 * The VMM uses the "IRQ disabled" param to understand IRQ
 * enabled status (RFLAGS.IF) of the TD guest and to determine
@@ -370,14 +364,7 @@ static u64 __cpuidle __halt(const bool irq_disabled)
 * can keep the vCPU in virtual HLT, even if an IRQ is
 * pending, without hanging/breaking the guest.
 */
-   return __tdx_hypercall();
-}
-
-static int handle_halt(struct ve_info *ve)
-{
-   const bool irq_disabled = irqs_disabled();
-
-   if (__halt(irq_disabled))
+   if (TDVMCALL_0(hcall_func(EXIT_REASON_HLT), irqs_disabled(), 0, 0, 0))
return -EIO;
 
return ve_instr_len(ve);
@@ -385,13 +372,9 @@ static int handle_halt(struct ve_info *ve)
 
 void __cpuidle tdx_safe_halt(void)
 {
-   const bool irq_disabled = false;
-
-   /*
-* Use WARN_ONCE() to report the failure.
-*/
-   if (__halt(irq_disabled))
-   WARN_ONCE(1, "HLT instruction emulation failed\n");
+   /* See comment in handle_halt() */
+   WARN_ONCE(TDVMCALL_0(hcall_func(EXIT_REASON_HLT), false, 0, 0, 0),
+ "HLT instruction emulation failed");
 }
 
 static int read_msr(struct pt_regs *regs, struct ve_info *ve)
-- 
2.43.0




[PATCH 00/20] x86/tdx: Rewrite TDCALL wrappers

2024-05-17 Thread Kirill A. Shutemov
Sean noticing that the TDCALL wrappers were generating a lot of awful
code.

TDCALL calls are centralized into a few megawrappers that take the
struct tdx_module_args as input. Most of the call sites only use a few
arguments, but they have to zero out unused fields in the structure to
avoid data leaks to the VMM. This leads to the compiler generating
inefficient code: dozens of instructions per call site to clear unused
fields of the structure.

This issue can be avoided by using more targeted wrappers.

After the rewrite code size is cut by ~3K:

add/remove: 7/15 grow/shrink: 1/17 up/down: 212/-3502 (-3290)

Please take a look. I would appreciate any feedback.

Kirill A. Shutemov (20):
  x86/tdx: Introduce tdvmcall_trampoline()
  x86/tdx: Add macros to generate TDVMCALL wrappers
  x86/tdx: Convert port I/O handling to use new TDVMCALL macros
  x86/tdx: Convert HLT handling to use new TDVMCALL_0()
  x86/tdx: Convert MSR read handling to use new TDVMCALL_1()
  x86/tdx: Convert MSR write handling to use new TDVMCALL_0()
  x86/tdx: Convert CPUID handling to use new TDVMCALL_4()
  x86/tdx: Convert MMIO handling to use new TDVMCALL macros
  x86/tdx: Convert MAP_GPA hypercall to use new TDVMCALL macros
  x86/tdx: Convert GET_QUOTE hypercall to use new TDVMCALL macros
  x86/tdx: Rewrite tdx_panic() without __tdx_hypercall()
  x86/tdx: Rewrite tdx_kvm_hypercall() without __tdx_hypercall()
  x86/tdx: Rewrite hv_tdx_hypercall() without __tdx_hypercall()
  x86/tdx: Add macros to generate TDCALL wrappers
  x86/tdx: Convert PAGE_ACCEPT tdcall to use new TDCALL_0() macro
  x86/tdx: Convert VP_INFO tdcall to use new TDCALL_5() macro
  x86/tdx: Convert VM_RD/VM_WR tdcalls to use new TDCALL macros
  x86/tdx: Convert VP_VEINFO_GET tdcall to use new TDCALL_5() macro
  x86/tdx: Convert MR_REPORT tdcall to use new TDCALL_0() macro
  x86/tdx: Remove old TDCALL wrappers

 arch/x86/boot/compressed/tdx.c|  32 +---
 arch/x86/coco/tdx/tdcall.S| 145 ++-
 arch/x86/coco/tdx/tdx-shared.c|  26 +--
 arch/x86/coco/tdx/tdx.c   | 298 --
 arch/x86/hyperv/ivm.c |  33 +---
 arch/x86/include/asm/shared/tdx.h | 159 +++-
 arch/x86/include/asm/tdx.h|   2 +
 arch/x86/virt/vmx/tdx/tdxcall.S   |  29 +--
 tools/objtool/noreturns.h |   2 +-
 9 files changed, 322 insertions(+), 404 deletions(-)

-- 
2.43.0




Re: [PATCH 0/5] Handle set_memory_XXcrypted() errors in Hyper-V

2024-03-12 Thread Kirill A. Shutemov
On Mon, Mar 11, 2024 at 09:15:53AM -0700, mhkelle...@gmail.com wrote:
> From: Michael Kelley 
> Michael Kelley (1):
>   Drivers: hv: vmbus: Don't free ring buffers that couldn't be
> re-encrypted
> 
> Rick Edgecombe (4):
>   Drivers: hv: vmbus: Leak pages if set_memory_encrypted() fails
>   Drivers: hv: vmbus: Track decrypted status in vmbus_gpadl
>   hv_netvsc: Don't free decrypted memory
>   uio_hv_generic: Don't free decrypted memory
> 
>  drivers/hv/channel.c | 16 
>  drivers/hv/connection.c  | 11 +++
>  drivers/net/hyperv/netvsc.c  |  7 +--
>  drivers/uio/uio_hv_generic.c | 12 
>  include/linux/hyperv.h   |  1 +
>  5 files changed, 33 insertions(+), 14 deletions(-)

Acked-by: Kirill A. Shutemov 

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH v1 1/3] x86/tdx: Check for TDX partitioning during early TDX init

2023-12-07 Thread Kirill A. Shutemov
On Thu, Dec 07, 2023 at 06:06:38PM +0100, Jeremi Piotrowski wrote:
> > 
> >> This doesn't work in partitioning when TDVMCALLs go to L0: 
> >> TDVMCALL_MAP_GPA bypasses
> >> L1 and TDX_ACCEPT_PAGE is L1 responsibility.
> >>
> >> If you want to see how this is currently supported take a look at 
> >> arch/x86/hyperv/ivm.c.
> >> All memory starts as private and there is a hypercall to notify the 
> >> paravisor for both
> >> TDX (when partitioning) and SNP (when VMPL). This guarantees that all page 
> >> conversions
> >> go through L1.
> > 
> > But L1 guest control anyway during page conversion and it has to manage
> > aliases with TDG.MEM.PAGE.ATTR.RD/WR. Why do you need MAP_GPA for that?
> >
> 
> When the L2 wants to perform a page conversion it needs to notify L1 of this 
> so that it
> can do its part managing the aliases. Without L1 involvement the conversion 
> doesn't
> happen. MAP_GPA is not suitable for this purpose as I've described and you've 
> confirmed
> above.

Memory conversion causes exit to L1 as there will be no aliases in L2
otherwise. There's no need to intercept MAP_GPA for that. See section
21.8 of TD partitioning spec.

>  
> > One possible change I mentioned above: make TDVMCALL exit to L1 for some
> > TDVMCALL leafs (or something along the line).
> > 
> 
> You can explore changes to TDVMCALL handling in the TDX module but I don't 
> see any reason
> this would be adopted, because a shared hypercall to control page visibility 
> for SNP & TDX is
> already part of Hyper-V ABI and works great for this purpose.
> 
> > I would like to keep it transparent for enlightened TDX Linux guest. It
> > should not care if it runs as L1 or as L2 in your environment.
> 
> I understand that is how you would prefer it but, as we've established in 
> these emails,
> that doesn't work when the L1 paravisor provides services to the L2 with an 
> L1 specific
> protocol and TDVMCALLs are routed to L0 for performance reasons. It can't be 
> done
> transparently with TDX 1.5 calls alone and we already have TDX 1.5 deployed 
> to users with
> an upstream kernel.

TDX 1.5 is not set in stone (yet). The spec is still draft. We can add
capabilities if we make case for them.

Let's try to shift the discussion to how to make TDX better rather than
adding workaround to kernel.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH v1 1/3] x86/tdx: Check for TDX partitioning during early TDX init

2023-12-06 Thread Kirill A. Shutemov
On Wed, Dec 06, 2023 at 06:49:11PM +0100, Jeremi Piotrowski wrote:
> On 05/12/2023 11:54, Kirill A. Shutemov wrote:
> > On Mon, Dec 04, 2023 at 08:07:38PM +0100, Jeremi Piotrowski wrote:
> >> On 04/12/2023 10:17, Reshetova, Elena wrote:
> >>>> Check for additional CPUID bits to identify TDX guests running with Trust
> >>>> Domain (TD) partitioning enabled. TD partitioning is like nested 
> >>>> virtualization
> >>>> inside the Trust Domain so there is a L1 TD VM(M) and there can be L2 TD 
> >>>> VM(s).
> >>>>
> >>>> In this arrangement we are not guaranteed that the TDX_CPUID_LEAF_ID is
> >>>> visible
> >>>> to Linux running as an L2 TD VM. This is because a majority of TDX 
> >>>> facilities
> >>>> are controlled by the L1 VMM and the L2 TDX guest needs to use TD 
> >>>> partitioning
> >>>> aware mechanisms for what's left. So currently such guests do not have
> >>>> X86_FEATURE_TDX_GUEST set.
> >>>
> >>> Back to this concrete patch. Why cannot L1 VMM emulate the correct value 
> >>> of
> >>> the TDX_CPUID_LEAF_ID to L2 VM? It can do this per TDX partitioning arch.
> >>> How do you handle this and other CPUID calls call currently in L1? Per 
> >>> spec,
> >>> all CPUIDs calls from L2 will cause L2 --> L1 exit, so what do you do in 
> >>> L1?
> >> The disclaimer here is that I don't have access to the paravisor (L1) 
> >> code. But
> >> to the best of my knowledge the L1 handles CPUID calls by calling into the 
> >> TDX
> >> module, or synthesizing a response itself. TDX_CPUID_LEAF_ID is not 
> >> provided to
> >> the L2 guest in order to discriminate a guest that is solely responsible 
> >> for every
> >> TDX mechanism (running at L1) from one running at L2 that has to cooperate 
> >> with L1.
> >> More below.
> >>
> >>>
> >>> Given that you do that simple emulation, you already end up with TDX guest
> >>> code being activated. Next you can check what features you wont be able to
> >>> provide in L1 and create simple emulation calls for the TDG calls that 
> >>> must be
> >>> supported and cannot return error. The biggest TDG call (TDVMCALL) is 
> >>> already
> >>> direct call into L0 VMM, so this part doesn’t require L1 VMM support. 
> >>
> >> I don't see anything in the TD-partitioning spec that gives the TDX guest 
> >> a way
> >> to detect if it's running at L2 or L1, or check whether TDVMCALLs go to 
> >> L0/L1.
> >> So in any case this requires an extra cpuid call to establish the 
> >> environment.
> >> Given that, exposing TDX_CPUID_LEAF_ID to the guest doesn't help.
> >>
> >> I'll give some examples of where the idea of emulating a TDX environment
> >> without attempting L1-L2 cooperation breaks down.
> >>
> >> hlt: if the guest issues a hlt TDVMCALL it goes to L0, but if it issues a 
> >> classic hlt
> >> it traps to L1. The hlt should definitely go to L1 so that L1 has a chance 
> >> to do
> >> housekeeping.
> > 
> > Why would L2 issue HLT TDVMCALL? It only happens in response to #VE, but
> > if partitioning enabled #VEs are routed to L1 anyway.
> 
> What about tdx_safe_halt? When X86_FEATURE_TDX_GUEST is defined I see
> "using TDX aware idle routing" in dmesg.

Yeah. I forgot about this one. My bad. :/

I think it makes a case for more fine-grained control on where TDVMCALL
routed: to L1 or to L0. I think TDX module can do that.

BTW, what kind of housekeeping do you do in L1 for HLT case?

> >> map gpa: say the guest uses MAP_GPA TDVMCALL. This goes to L0, not L1 
> >> which is the actual
> >> entity that needs to have a say in performing the conversion. L1 can't act 
> >> on the request
> >> if L0 would forward it because of the CoCo threat model. So L1 and L2 get 
> >> out of sync.
> >> The only safe approach is for L2 to use a different mechanism to trap to 
> >> L1 explicitly.
> > 
> > Hm? L1 is always in loop on share<->private conversion. I don't know why
> > you need MAP_GPA for that.
> > 
> > You can't rely on MAP_GPA anyway. It is optional (unfortunately). Conversion
> > doesn't require MAP_GPA call.
> > 
> 
> I'm sorry, I don't quite follow. I'm reading tdx_enc_status_changed():
> - TDVMCALL_MAP_GPA is issued for all transitions
> - TDX_A

Re: [PATCH v1 1/3] x86/tdx: Check for TDX partitioning during early TDX init

2023-12-05 Thread Kirill A. Shutemov
On Mon, Dec 04, 2023 at 08:07:38PM +0100, Jeremi Piotrowski wrote:
> On 04/12/2023 10:17, Reshetova, Elena wrote:
> >> Check for additional CPUID bits to identify TDX guests running with Trust
> >> Domain (TD) partitioning enabled. TD partitioning is like nested 
> >> virtualization
> >> inside the Trust Domain so there is a L1 TD VM(M) and there can be L2 TD 
> >> VM(s).
> >>
> >> In this arrangement we are not guaranteed that the TDX_CPUID_LEAF_ID is
> >> visible
> >> to Linux running as an L2 TD VM. This is because a majority of TDX 
> >> facilities
> >> are controlled by the L1 VMM and the L2 TDX guest needs to use TD 
> >> partitioning
> >> aware mechanisms for what's left. So currently such guests do not have
> >> X86_FEATURE_TDX_GUEST set.
> > 
> > Back to this concrete patch. Why cannot L1 VMM emulate the correct value of
> > the TDX_CPUID_LEAF_ID to L2 VM? It can do this per TDX partitioning arch.
> > How do you handle this and other CPUID calls call currently in L1? Per spec,
> > all CPUIDs calls from L2 will cause L2 --> L1 exit, so what do you do in L1?
> The disclaimer here is that I don't have access to the paravisor (L1) code. 
> But
> to the best of my knowledge the L1 handles CPUID calls by calling into the TDX
> module, or synthesizing a response itself. TDX_CPUID_LEAF_ID is not provided 
> to
> the L2 guest in order to discriminate a guest that is solely responsible for 
> every
> TDX mechanism (running at L1) from one running at L2 that has to cooperate 
> with L1.
> More below.
> 
> > 
> > Given that you do that simple emulation, you already end up with TDX guest
> > code being activated. Next you can check what features you wont be able to
> > provide in L1 and create simple emulation calls for the TDG calls that must 
> > be
> > supported and cannot return error. The biggest TDG call (TDVMCALL) is 
> > already
> > direct call into L0 VMM, so this part doesn’t require L1 VMM support. 
> 
> I don't see anything in the TD-partitioning spec that gives the TDX guest a 
> way
> to detect if it's running at L2 or L1, or check whether TDVMCALLs go to L0/L1.
> So in any case this requires an extra cpuid call to establish the environment.
> Given that, exposing TDX_CPUID_LEAF_ID to the guest doesn't help.
> 
> I'll give some examples of where the idea of emulating a TDX environment
> without attempting L1-L2 cooperation breaks down.
> 
> hlt: if the guest issues a hlt TDVMCALL it goes to L0, but if it issues a 
> classic hlt
> it traps to L1. The hlt should definitely go to L1 so that L1 has a chance to 
> do
> housekeeping.

Why would L2 issue HLT TDVMCALL? It only happens in response to #VE, but
if partitioning enabled #VEs are routed to L1 anyway.

> map gpa: say the guest uses MAP_GPA TDVMCALL. This goes to L0, not L1 which 
> is the actual
> entity that needs to have a say in performing the conversion. L1 can't act on 
> the request
> if L0 would forward it because of the CoCo threat model. So L1 and L2 get out 
> of sync.
> The only safe approach is for L2 to use a different mechanism to trap to L1 
> explicitly.

Hm? L1 is always in loop on share<->private conversion. I don't know why
you need MAP_GPA for that.

You can't rely on MAP_GPA anyway. It is optional (unfortunately). Conversion
doesn't require MAP_GPA call.

> Having a paravisor is required to support a TPM and having TDVMCALLs go to L0 
> is
> required to make performance viable for real workloads.
> 
> > 
> > Until we really see what breaks with this approach, I don’t think it is 
> > worth to
> > take in the complexity to support different L1 hypervisors view on 
> > partitioning.
> > 
> 
> I'm not asking to support different L1 hypervisors view on partitioning, I 
> want to
> clean up the code (by fixing assumptions that no longer hold) for the model 
> that I'm
> describing that: the kernel already supports, has an implementation that 
> works and
> has actual users. This is also a model that Intel intentionally created the 
> TD-partitioning
> spec to support.
> 
> So lets work together to make X86_FEATURE_TDX_GUEST match reality.

I think the right direction is to make TDX architecture good enough
without that. If we need more hooks in TDX module that give required
control to L1, let's do that. (I don't see it so far)

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH v1 1/3] x86/tdx: Check for TDX partitioning during early TDX init

2023-11-24 Thread Kirill A. Shutemov
On Fri, Nov 24, 2023 at 12:04:56PM +0100, Jeremi Piotrowski wrote:
> On 24/11/2023 11:43, Kirill A. Shutemov wrote:
> > On Fri, Nov 24, 2023 at 11:31:44AM +0100, Jeremi Piotrowski wrote:
> >> On 23/11/2023 14:58, Kirill A. Shutemov wrote:
> >>> On Wed, Nov 22, 2023 at 06:01:04PM +0100, Jeremi Piotrowski wrote:
> >>>> Check for additional CPUID bits to identify TDX guests running with Trust
> >>>> Domain (TD) partitioning enabled. TD partitioning is like nested 
> >>>> virtualization
> >>>> inside the Trust Domain so there is a L1 TD VM(M) and there can be L2 TD 
> >>>> VM(s).
> >>>>
> >>>> In this arrangement we are not guaranteed that the TDX_CPUID_LEAF_ID is 
> >>>> visible
> >>>> to Linux running as an L2 TD VM. This is because a majority of TDX 
> >>>> facilities
> >>>> are controlled by the L1 VMM and the L2 TDX guest needs to use TD 
> >>>> partitioning
> >>>> aware mechanisms for what's left. So currently such guests do not have
> >>>> X86_FEATURE_TDX_GUEST set.
> >>>>
> >>>> We want the kernel to have X86_FEATURE_TDX_GUEST set for all TDX guests 
> >>>> so we
> >>>> need to check these additional CPUID bits, but we skip further 
> >>>> initialization
> >>>> in the function as we aren't guaranteed access to TDX module calls.
> >>>
> >>> I don't follow. The idea of partitioning is that L2 OS can be
> >>> unenlightened and have no idea if it runs indide of TD. But this patch
> >>> tries to enumerate TDX anyway.
> >>>
> >>> Why?
> >>>
> >>
> >> That's not the only idea of partitioning. Partitioning provides different 
> >> privilege
> >> levels within the TD, and unenlightened L2 OS can be made to work but are 
> >> inefficient.
> >> In our case Linux always runs enlightened (both with and without TD 
> >> partitioning), and
> >> uses TDX functionality where applicable (TDX vmcalls, PTE encryption bit).
> > 
> > What value L1 adds in this case? If L2 has to be enlightened just run the
> > enlightened OS directly as L1 and ditch half-measures. I think you can
> > gain some performance this way.
> > 
> 
> It's primarily about the privilege separation, performance is a reason
> one doesn't want to run unenlightened. The L1 makes the following possible:
> - TPM emulation within the trust domain but isolated from the OS
> - infrastructure interfaces for things like VM live migration
> - support for Virtual Trust Levels[1], Virtual Secure Mode[2]
> 
> These provide a lot of value to users, it's not at all about half-measures.

Hm. Okay.

Can we take a step back? What is bigger picture here? What enlightenment
do you expect from the guest when everything is in-place?

So far I see that you try to get kernel think that it runs as TDX guest,
but not really. This is not very convincing model.

Why does L2 need to know if it runs under TDX or SEV? Can't it just think
it runs as Hyper-V guest and all difference between TDX and SEV abstracted
by L1?

So far, I failed to see coherent design. Maybe I just don't know where to
look.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH v1 1/3] x86/tdx: Check for TDX partitioning during early TDX init

2023-11-24 Thread Kirill A. Shutemov
On Fri, Nov 24, 2023 at 11:31:44AM +0100, Jeremi Piotrowski wrote:
> On 23/11/2023 14:58, Kirill A. Shutemov wrote:
> > On Wed, Nov 22, 2023 at 06:01:04PM +0100, Jeremi Piotrowski wrote:
> >> Check for additional CPUID bits to identify TDX guests running with Trust
> >> Domain (TD) partitioning enabled. TD partitioning is like nested 
> >> virtualization
> >> inside the Trust Domain so there is a L1 TD VM(M) and there can be L2 TD 
> >> VM(s).
> >>
> >> In this arrangement we are not guaranteed that the TDX_CPUID_LEAF_ID is 
> >> visible
> >> to Linux running as an L2 TD VM. This is because a majority of TDX 
> >> facilities
> >> are controlled by the L1 VMM and the L2 TDX guest needs to use TD 
> >> partitioning
> >> aware mechanisms for what's left. So currently such guests do not have
> >> X86_FEATURE_TDX_GUEST set.
> >>
> >> We want the kernel to have X86_FEATURE_TDX_GUEST set for all TDX guests so 
> >> we
> >> need to check these additional CPUID bits, but we skip further 
> >> initialization
> >> in the function as we aren't guaranteed access to TDX module calls.
> > 
> > I don't follow. The idea of partitioning is that L2 OS can be
> > unenlightened and have no idea if it runs indide of TD. But this patch
> > tries to enumerate TDX anyway.
> > 
> > Why?
> > 
> 
> That's not the only idea of partitioning. Partitioning provides different 
> privilege
> levels within the TD, and unenlightened L2 OS can be made to work but are 
> inefficient.
> In our case Linux always runs enlightened (both with and without TD 
> partitioning), and
> uses TDX functionality where applicable (TDX vmcalls, PTE encryption bit).

What value L1 adds in this case? If L2 has to be enlightened just run the
enlightened OS directly as L1 and ditch half-measures. I think you can
gain some performance this way.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH v1 2/3] x86/coco: Disable TDX module calls when TD partitioning is active

2023-11-23 Thread Kirill A. Shutemov
On Wed, Nov 22, 2023 at 06:01:05PM +0100, Jeremi Piotrowski wrote:
> Introduce CC_ATTR_TDX_MODULE_CALLS to allow code to check whether TDX module
> calls are available. When TD partitioning is enabled, a L1 TD VMM handles most
> TDX facilities and the kernel running as an L2 TD VM does not have access to
> TDX module calls. The kernel still has access to TDVMCALL(0) which is 
> forwarded
> to the VMM for processing, which is the L1 TD VM in this case.

Sounds like a problem introduced by patch 1/3 :/

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH v1 3/3] x86/tdx: Provide stub tdx_accept_memory() for non-TDX configs

2023-11-23 Thread Kirill A. Shutemov
On Wed, Nov 22, 2023 at 06:01:06PM +0100, Jeremi Piotrowski wrote:
> When CONFIG_INTEL_TDX_GUEST is not defined but CONFIG_UNACCEPTED_MEMORY=y is,
> the kernel fails to link with an undefined reference to tdx_accept_memory from
> arch_accept_memory. Provide a stub for tdx_accept_memory to fix the build for
> that configuration.
> 
> CONFIG_UNACCEPTED_MEMORY is also selected by CONFIG_AMD_MEM_ENCRYPT, and there
> are stubs for snp_accept_memory for when it is not defined. Previously this 
> did
> not result in an error when CONFIG_INTEL_TDX_GUEST was not defined because the
> branch that references tdx_accept_memory() was being discarded due to
> DISABLE_TDX_GUEST being set.

And who unsets it now?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH v1 1/3] x86/tdx: Check for TDX partitioning during early TDX init

2023-11-23 Thread Kirill A. Shutemov
On Wed, Nov 22, 2023 at 06:01:04PM +0100, Jeremi Piotrowski wrote:
> Check for additional CPUID bits to identify TDX guests running with Trust
> Domain (TD) partitioning enabled. TD partitioning is like nested 
> virtualization
> inside the Trust Domain so there is a L1 TD VM(M) and there can be L2 TD 
> VM(s).
> 
> In this arrangement we are not guaranteed that the TDX_CPUID_LEAF_ID is 
> visible
> to Linux running as an L2 TD VM. This is because a majority of TDX facilities
> are controlled by the L1 VMM and the L2 TDX guest needs to use TD partitioning
> aware mechanisms for what's left. So currently such guests do not have
> X86_FEATURE_TDX_GUEST set.
> 
> We want the kernel to have X86_FEATURE_TDX_GUEST set for all TDX guests so we
> need to check these additional CPUID bits, but we skip further initialization
> in the function as we aren't guaranteed access to TDX module calls.

I don't follow. The idea of partitioning is that L2 OS can be
unenlightened and have no idea if it runs indide of TD. But this patch
tries to enumerate TDX anyway.

Why?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov