Re: [PATCH v5 17/65] i386/tdx: Adjust the supported CPUID based on TDX restrictions
On 6/13/2024 4:26 PM, Duan, Zhenzhong wrote:
>>>> + *
>>>> + * It also has side effect to enable unsupported bits, e.g., the
>>>> + * bits of "fixed0" type while present natively. It's safe because
>>>> + * the unsupported bits will be masked off by .fixed0 later.
>>>> + */
>>>> + *ret |= host_cpuid_reg(function, index, reg);
>>>
>>> Looks like KVM capabilities are merged with native bits, is this
>>> intentional?
>>
>> yes, if we change the order, it would be more clear for you I guess.
>>
>>     host_cpuid_reg() | kvm_capabilities
>>
>> The base is the host's native value, while any bit that is absent from
>> native but that KVM can emulate is also added to the base.
>
> Imagine there is a 'type native' bit that's absent from native but KVM
> emulated. With the above code we pass 1 to tdx module but it wants
> native 0, is it an issue?

yes, it will be an issue, but it's not "we pass 1 to tdx module". A
"Native" bit is not configurable in the view of TDX module, and QEMU/KVM
cannot configure it. But it does cause a mismatch in the above case:
QEMU sees the bit as supported while in the TD the bit is not supported.

This is one of the reasons why we are going to drop the solution that
QEMU maintains the CPUID configurability in this series.
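To make the mismatch concrete, here is a minimal sketch in C. It is not code from the patch: the bit position and the helper names (DEMO_NATIVE_BIT, merge_base, td_visible) are invented for illustration, and only the "host | kvm" merge mirrors the quoted line. It assumes that for a bit of "Native" type the TD always observes the host value, whatever the VMM believes.

```c
#include <stdint.h>

/* Hypothetical bit of "Native" type (illustration only). */
#define DEMO_NATIVE_BIT (1U << 3)

/* Base as in the quoted line: host native bits OR KVM capabilities. */
static uint32_t merge_base(uint32_t host, uint32_t kvm_caps)
{
    return host | kvm_caps;
}

/*
 * Value the TD would observe under the assumption above: bits in
 * native_mask come from the host, all other bits from the VMM's view.
 */
static uint32_t td_visible(uint32_t host, uint32_t native_mask,
                           uint32_t vmm_view)
{
    return (host & native_mask) | (vmm_view & ~native_mask);
}
```

With host = 0 and kvm_caps = DEMO_NATIVE_BIT, merge_base() reports the bit as supported while td_visible() still yields 0 for it, which is the QEMU/TD disagreement described above.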
RE: [PATCH v5 17/65] i386/tdx: Adjust the supported CPUID based on TDX restrictions
>-----Original Message-----
>From: Li, Xiaoyao
>Subject: Re: [PATCH v5 17/65] i386/tdx: Adjust the supported CPUID based
>on TDX restrictions
>
>On 5/31/2024 4:47 PM, Duan, Zhenzhong wrote:
>>
>> On 2/29/2024 2:36 PM, Xiaoyao Li wrote:
>>> According to Chapter "CPUID Virtualization" in TDX module spec, CPUID
>>> bits of TD can be classified into 6 types:
>>>
>>> 1 | As configured | configurable by VMM, independent of native value;
>>> 2 | As configured | configurable by VMM if the bit is supported natively
>>>   | (if native)   | Otherwise it equals as native(0).
>>> 3 | Fixed         | fixed to 0/1
>>> 4 | Native        | reflect the native value
>>> 5 | Calculated    | calculated by TDX module.
>>> 6 | Inducing #VE  | get #VE exception
>>>
>>> Note:
>>> 1. All the configurable XFAM related features and TD attributes related
>>>    features fall into type #2. And fixed0/1 bits of XFAM and TD
>>>    attributes fall into type #3.
>>>
>>> 2. For CPUID leaves not listed in "CPUID virtualization Overview" table
>>>    in TDX module spec, TDX module injects #VE to TDs when those are
>>>    queried. For this case, TDs can request CPUID emulation from VMM via
>>>    TDVMCALL and the values are fully controlled by VMM.
>>>
>>> Because the TDX module has its own virtualization policy on CPUID bits,
>>> what is reported via KVM_GET_SUPPORTED_CPUID diverges from the supported
>>> CPUID bits for TDs. In order to keep a consistent CPUID configuration
>>> between VMM and TDs, adjust supported CPUID for TDs based on TDX
>>> restrictions.
>>>
>>> Currently only focus on the CPUID leaves recognized by QEMU's
>>> feature_word_info[] that are indexed by a FeatureWord.
>>>
>>> Introduce a TDX CPUID lookup table, which maintains 1 entry for each
>>> FeatureWord. Each entry has the fields below:
>>>
>>> - tdx_fixed0/1: The bits that are fixed as 0/1;
>>>
>>> - depends_on_vmm_cap: The bits that are configurable from the view of
>>>                       TDX module, but require emulation by the VMM
>>>                       when configured as enabled. They are
>>>                       not supported if VMM doesn't report them as
>>>                       supported, so they need to be fixed up by
>>>                       checking if VMM supports them.
>>
>> Previously I thought bits configurable for TDX module are emulated by
>> TDX module, but it looks like they are not. Just curious why don't
>> those bits belong to "Inducing #VE" type?
>
>Because when a TD guest queries this type of CPUID leaf, it doesn't get #VE.
>
>The bits in this category are free to be configured as on/off by VMM and
>passed to TDX module via TD_PARAM. Once they get configured, they are
>queried directly by TD guest without getting #VE.
>
>The problem is whether VMM allows them to be configured freely. E.g.,
>features TME and PCONFIG are configurable. However, when VMM
>configures them to 1, VMM needs to provide support for their related
>MSRs. If VMM cannot provide such support, VMM should not allow the user
>to configure them to 1.
>
>That's why we have this kind of type.
>
>BTW, we are going to abandon the solution that lets QEMU maintain the
>CPUID configurability table in this series. Next version we will come up
>with
>
>> Then guest can get KVM reported capabilities with tdvmcall directly.
>>
>>>
>>> - inducing_ve: TD gets #VE when querying this CPUID leaf. The result is
>>>                totally configurable by VMM.
>>>
>>> - supported_on_ve: It's valid only when @inducing_ve is true. It
>>>                    represents the maximum feature set that can be
>>>                    emulated for TDs.
>>
>> This is never used together with depends_on_vmm_cap, maybe one variable
>> is enough?
>>>
>>> By
Re: [PATCH v5 17/65] i386/tdx: Adjust the supported CPUID based on TDX restrictions
On 5/31/2024 4:47 PM, Duan, Zhenzhong wrote:
>
> On 2/29/2024 2:36 PM, Xiaoyao Li wrote:
>> According to Chapter "CPUID Virtualization" in TDX module spec, CPUID
>> bits of TD can be classified into 6 types:
>>
>> 1 | As configured | configurable by VMM, independent of native value;
>> 2 | As configured | configurable by VMM if the bit is supported natively
>>   | (if native)   | Otherwise it equals as native(0).
>> 3 | Fixed         | fixed to 0/1
>> 4 | Native        | reflect the native value
>> 5 | Calculated    | calculated by TDX module.
>> 6 | Inducing #VE  | get #VE exception
>>
>> Note:
>> 1. All the configurable XFAM related features and TD attributes related
>>    features fall into type #2. And fixed0/1 bits of XFAM and TD
>>    attributes fall into type #3.
>>
>> 2. For CPUID leaves not listed in "CPUID virtualization Overview" table
>>    in TDX module spec, TDX module injects #VE to TDs when those are
>>    queried. For this case, TDs can request CPUID emulation from VMM via
>>    TDVMCALL and the values are fully controlled by VMM.
>>
>> Because the TDX module has its own virtualization policy on CPUID bits,
>> what is reported via KVM_GET_SUPPORTED_CPUID diverges from the supported
>> CPUID bits for TDs. In order to keep a consistent CPUID configuration
>> between VMM and TDs, adjust supported CPUID for TDs based on TDX
>> restrictions.
>>
>> Currently only focus on the CPUID leaves recognized by QEMU's
>> feature_word_info[] that are indexed by a FeatureWord.
>>
>> Introduce a TDX CPUID lookup table, which maintains 1 entry for each
>> FeatureWord. Each entry has the fields below:
>>
>> - tdx_fixed0/1: The bits that are fixed as 0/1;
>>
>> - depends_on_vmm_cap: The bits that are configurable from the view of
>>                       TDX module, but require emulation by the VMM
>>                       when configured as enabled. They are
>>                       not supported if VMM doesn't report them as
>>                       supported, so they need to be fixed up by
>>                       checking if VMM supports them.
>
> Previously I thought bits configurable for TDX module are emulated by
> TDX module, but it looks like they are not. Just curious why don't
> those bits belong to "Inducing #VE" type?

Because when a TD guest queries this type of CPUID leaf, it doesn't get #VE.

The bits in this category are free to be configured as on/off by VMM and
passed to TDX module via TD_PARAM. Once they get configured, they are
queried directly by TD guest without getting #VE.

The problem is whether VMM allows them to be configured freely. E.g.,
features TME and PCONFIG are configurable. However, when VMM
configures them to 1, VMM needs to provide support for their related
MSRs. If VMM cannot provide such support, VMM should not allow the user
to configure them to 1.

That's why we have this kind of type.

BTW, we are going to abandon the solution that lets QEMU maintain the
CPUID configurability table in this series. Next version we will come up
with

> Then guest can get KVM reported capabilities with tdvmcall directly.
>
>> - inducing_ve: TD gets #VE when querying this CPUID leaf. The result is
>>                totally configurable by VMM.
>>
>> - supported_on_ve: It's valid only when @inducing_ve is true. It
>>                    represents the maximum feature set that can be
>>                    emulated for TDs.
>
> This is never used together with depends_on_vmm_cap, maybe one variable
> is enough?
>
>> By applying TDX CPUID lookup table and TDX capabilities reported from
>> TDX module, the supported CPUID for TDs can be obtained from following
>> steps:
>>
>> - get the base of VMM supported feature set;
>>
>> - if the leaf is not a FeatureWord just return VMM's value without
>>   modification;
>>
>> - if the leaf is an inducing_ve type, apply the supported_on_ve mask
>>   and return;
>>
>> - include all native bits, it covers type #2, #4, and parts of type #1.
>>   (it also includes some unsupported bits. The following step will
>>    correct it.)
>>
>> - apply fixed0/1 to it (it covers #3, and rectifies the previous step);
>>
>> - add configurable bits (it covers the other part of type #1);
>>
>> - fix the ones in vmm_fixup;
>>
>> (Calculated type is ignored since it's determined at runtime).
>>
>> Co-developed-by: Chenyi Qiang
>> Signed-off-by: Chenyi Qiang
>> Signed-off-by: Xiaoyao Li
>> ---
>>  target/i386/cpu.h     |  16 +++
>>  target/i386/kvm/kvm.c |   4 +
>>  target/i386/kvm/tdx.c | 263 ++
>>  target/i386/kvm/tdx.h |   3 +
>>  4 files changed, 286 insertions(+)
>>
>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> index 952174bb6f52..7bd604f802a1 100644
>> ---
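The depends_on_vmm_cap rule explained in the reply above (a bit the TDX module accepts as configurable must still be refused if the VMM cannot back it, e.g. with MSR emulation) can be sketched as a single mask operation. This is only an illustration under that reading; fixup_vmm_dependent() and its parameters are hypothetical names, not functions from the series.

```c
#include <stdint.h>

/*
 * Hypothetical helper: clear every bit that depends on VMM support
 * (depends_on_vmm_cap) but is not reported as supported by the VMM.
 * Bits outside depends_on_vmm_cap pass through unchanged.
 */
static uint32_t fixup_vmm_dependent(uint32_t configured,
                                    uint32_t depends_on_vmm_cap,
                                    uint32_t vmm_supported)
{
    uint32_t unsupported = depends_on_vmm_cap & ~vmm_supported;

    return configured & ~unsupported;
}
```

For example, with configured = 0x6, depends_on_vmm_cap = 0x4 and vmm_supported = 0x2, bit 2 is dropped because the VMM does not report it, and the result is 0x2.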
Re: [PATCH v5 17/65] i386/tdx: Adjust the supported CPUID based on TDX restrictions
On 2/29/2024 2:36 PM, Xiaoyao Li wrote:
> According to Chapter "CPUID Virtualization" in TDX module spec, CPUID
> bits of TD can be classified into 6 types:
>
> 1 | As configured | configurable by VMM, independent of native value;
> 2 | As configured | configurable by VMM if the bit is supported natively
>   | (if native)   | Otherwise it equals as native(0).
> 3 | Fixed         | fixed to 0/1
> 4 | Native        | reflect the native value
> 5 | Calculated    | calculated by TDX module.
> 6 | Inducing #VE  | get #VE exception
>
> Note:
> 1. All the configurable XFAM related features and TD attributes related
>    features fall into type #2. And fixed0/1 bits of XFAM and TD
>    attributes fall into type #3.
>
> 2. For CPUID leaves not listed in "CPUID virtualization Overview" table
>    in TDX module spec, TDX module injects #VE to TDs when those are
>    queried. For this case, TDs can request CPUID emulation from VMM via
>    TDVMCALL and the values are fully controlled by VMM.
>
> Because the TDX module has its own virtualization policy on CPUID bits,
> what is reported via KVM_GET_SUPPORTED_CPUID diverges from the supported
> CPUID bits for TDs. In order to keep a consistent CPUID configuration
> between VMM and TDs, adjust supported CPUID for TDs based on TDX
> restrictions.
>
> Currently only focus on the CPUID leaves recognized by QEMU's
> feature_word_info[] that are indexed by a FeatureWord.
>
> Introduce a TDX CPUID lookup table, which maintains 1 entry for each
> FeatureWord. Each entry has the fields below:
>
> - tdx_fixed0/1: The bits that are fixed as 0/1;
>
> - depends_on_vmm_cap: The bits that are configurable from the view of
>                       TDX module, but require emulation by the VMM
>                       when configured as enabled. They are
>                       not supported if VMM doesn't report them as
>                       supported, so they need to be fixed up by
>                       checking if VMM supports them.

Previously I thought bits configurable for TDX module are emulated by
TDX module, but it looks like they are not. Just curious why don't those
bits belong to "Inducing #VE" type? Then guest can get KVM reported
capabilities with tdvmcall directly.

> - inducing_ve: TD gets #VE when querying this CPUID leaf. The result is
>                totally configurable by VMM.
>
> - supported_on_ve: It's valid only when @inducing_ve is true. It
>                    represents the maximum feature set that can be
>                    emulated for TDs.

This is never used together with depends_on_vmm_cap, maybe one variable
is enough?

> By applying TDX CPUID lookup table and TDX capabilities reported from
> TDX module, the supported CPUID for TDs can be obtained from following
> steps:
>
> - get the base of VMM supported feature set;
>
> - if the leaf is not a FeatureWord just return VMM's value without
>   modification;
>
> - if the leaf is an inducing_ve type, apply the supported_on_ve mask
>   and return;
>
> - include all native bits, it covers type #2, #4, and parts of type #1.
>   (it also includes some unsupported bits. The following step will
>    correct it.)
>
> - apply fixed0/1 to it (it covers #3, and rectifies the previous step);
>
> - add configurable bits (it covers the other part of type #1);
>
> - fix the ones in vmm_fixup;
>
> (Calculated type is ignored since it's determined at runtime).
>
> Co-developed-by: Chenyi Qiang
> Signed-off-by: Chenyi Qiang
> Signed-off-by: Xiaoyao Li
> ---
>  target/i386/cpu.h     |  16 +++
>  target/i386/kvm/kvm.c |   4 +
>  target/i386/kvm/tdx.c | 263 ++
>  target/i386/kvm/tdx.h |   3 +
>  4 files changed, 286 insertions(+)
>
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 952174bb6f52..7bd604f802a1 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -787,6 +787,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>  /* Support RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE */
>  #define CPUID_7_0_EBX_FSGSBASE (1U << 0)
> +/* Support for TSC adjustment MSR 0x3B */
> +#define CPUID_7_0_EBX_TSC_ADJUST (1U << 1)
>  /* Support SGX */
>  #define CPUID_7_0_EBX_SGX (1U << 2)
>  /* 1st Group of Advanced Bit Manipulation Extensions */
> @@ -805,8 +807,12 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>  #define CPUID_7_0_EBX_INVPCID (1U << 10)
>  /* Restricted Transactional Memory */
>  #define CPUID_7_0_EBX_RTM (1U << 11)
> +/* Cache QoS Monitoring */
> +#define CPUID_7_0_EBX_PQM (1U << 12)
>  /* M
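For readers following the field discussion above, one way to picture a lookup-table entry is the struct below. It is a sketch only: the type name DemoTdxCpuidEntry and the checker demo_entry_is_consistent() are hypothetical, the field names simply follow the commit message, and the consistency rules are assumptions derived from the field descriptions (fixed0 and fixed1 cannot overlap, a configurable bit is not fixed, supported_on_ve only matters when inducing_ve is set).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-FeatureWord entry, mirroring the described fields. */
typedef struct DemoTdxCpuidEntry {
    uint32_t tdx_fixed0;         /* bits fixed to 0 */
    uint32_t tdx_fixed1;         /* bits fixed to 1 */
    uint32_t depends_on_vmm_cap; /* configurable, but need VMM support */
    bool     inducing_ve;        /* querying the leaf causes #VE */
    uint32_t supported_on_ve;    /* valid only when inducing_ve is true */
} DemoTdxCpuidEntry;

/* Assumed sanity rules for an entry; not taken from the patch. */
static bool demo_entry_is_consistent(const DemoTdxCpuidEntry *e)
{
    if (e->tdx_fixed0 & e->tdx_fixed1) {
        return false; /* a bit cannot be fixed to both 0 and 1 */
    }
    if (e->depends_on_vmm_cap & (e->tdx_fixed0 | e->tdx_fixed1)) {
        return false; /* a configurable bit cannot also be fixed */
    }
    if (!e->inducing_ve && e->supported_on_ve) {
        return false; /* mask is meaningless without inducing_ve */
    }
    return true;
}
```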
[PATCH v5 17/65] i386/tdx: Adjust the supported CPUID based on TDX restrictions
According to Chapter "CPUID Virtualization" in TDX module spec, CPUID
bits of TD can be classified into 6 types:

1 | As configured | configurable by VMM, independent of native value;
2 | As configured | configurable by VMM if the bit is supported natively
  | (if native)   | Otherwise it equals as native(0).
3 | Fixed         | fixed to 0/1
4 | Native        | reflect the native value
5 | Calculated    | calculated by TDX module.
6 | Inducing #VE  | get #VE exception

Note:
1. All the configurable XFAM related features and TD attributes related
   features fall into type #2. And fixed0/1 bits of XFAM and TD
   attributes fall into type #3.

2. For CPUID leaves not listed in "CPUID virtualization Overview" table
   in TDX module spec, TDX module injects #VE to TDs when those are
   queried. For this case, TDs can request CPUID emulation from VMM via
   TDVMCALL and the values are fully controlled by VMM.

Because the TDX module has its own virtualization policy on CPUID bits,
what is reported via KVM_GET_SUPPORTED_CPUID diverges from the supported
CPUID bits for TDs. In order to keep a consistent CPUID configuration
between VMM and TDs, adjust supported CPUID for TDs based on TDX
restrictions.

Currently only focus on the CPUID leaves recognized by QEMU's
feature_word_info[] that are indexed by a FeatureWord.

Introduce a TDX CPUID lookup table, which maintains 1 entry for each
FeatureWord. Each entry has the fields below:

 - tdx_fixed0/1: The bits that are fixed as 0/1;

 - depends_on_vmm_cap: The bits that are configurable from the view of
                       TDX module, but require emulation by the VMM
                       when configured as enabled. They are
                       not supported if VMM doesn't report them as
                       supported, so they need to be fixed up by
                       checking if VMM supports them.

 - inducing_ve: TD gets #VE when querying this CPUID leaf. The result is
                totally configurable by VMM.

 - supported_on_ve: It's valid only when @inducing_ve is true. It
                    represents the maximum feature set that can be
                    emulated for TDs.

By applying TDX CPUID lookup table and TDX capabilities reported from
TDX module, the supported CPUID for TDs can be obtained from following
steps:

- get the base of VMM supported feature set;

- if the leaf is not a FeatureWord just return VMM's value without
  modification;

- if the leaf is an inducing_ve type, apply the supported_on_ve mask
  and return;

- include all native bits, it covers type #2, #4, and parts of type #1.
  (it also includes some unsupported bits. The following step will
   correct it.)

- apply fixed0/1 to it (it covers #3, and rectifies the previous step);

- add configurable bits (it covers the other part of type #1);

- fix the ones in vmm_fixup;

(Calculated type is ignored since it's determined at runtime).

Co-developed-by: Chenyi Qiang
Signed-off-by: Chenyi Qiang
Signed-off-by: Xiaoyao Li
---
 target/i386/cpu.h     |  16 +++
 target/i386/kvm/kvm.c |   4 +
 target/i386/kvm/tdx.c | 263 ++
 target/i386/kvm/tdx.h |   3 +
 4 files changed, 286 insertions(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 952174bb6f52..7bd604f802a1 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -787,6 +787,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 /* Support RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE */
 #define CPUID_7_0_EBX_FSGSBASE (1U << 0)
+/* Support for TSC adjustment MSR 0x3B */
+#define CPUID_7_0_EBX_TSC_ADJUST (1U << 1)
 /* Support SGX */
 #define CPUID_7_0_EBX_SGX (1U << 2)
 /* 1st Group of Advanced Bit Manipulation Extensions */
@@ -805,8 +807,12 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_7_0_EBX_INVPCID (1U << 10)
 /* Restricted Transactional Memory */
 #define CPUID_7_0_EBX_RTM (1U << 11)
+/* Cache QoS Monitoring */
+#define CPUID_7_0_EBX_PQM (1U << 12)
 /* Memory Protection Extension */
 #define CPUID_7_0_EBX_MPX (1U << 14)
+/* Resource Director Technology Allocation */
+#define CPUID_7_0_EBX_RDT_A (1U << 15)
 /* AVX-512 Foundation */
 #define CPUID_7_0_EBX_AVX512F (1U << 16)
 /* AVX-512 Doubleword & Quadword Instruction */
@@ -862,10 +868,16 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUI
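The steps listed in the commit message can be read as a short sequence of mask operations on one 32-bit feature word. The following is a hedged sketch of that reading, not the code in target/i386/kvm/tdx.c: DemoLookup, demo_tdx_supported() and all parameter names are hypothetical, a NULL entry stands in for "the leaf is not a FeatureWord", and the last step assumes that "fix the ones in vmm_fixup" means dropping depends_on_vmm_cap bits the VMM does not report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup entry; field names follow the commit message. */
typedef struct DemoLookup {
    uint32_t fixed0;
    uint32_t fixed1;
    uint32_t depends_on_vmm_cap;
    bool     inducing_ve;
    uint32_t supported_on_ve;
} DemoLookup;

static uint32_t demo_tdx_supported(const DemoLookup *e, uint32_t vmm,
                                   uint32_t native, uint32_t configurable)
{
    uint32_t ret = vmm;                     /* base: VMM supported set */

    if (e == NULL) {
        return ret;                         /* not a FeatureWord */
    }
    if (e->inducing_ve) {
        return ret & e->supported_on_ve;    /* #VE leaf: mask and return */
    }
    ret |= native;                          /* types #2, #4, part of #1 */
    ret &= ~e->fixed0;                      /* type #3, fixed to 0 */
    ret |= e->fixed1;                       /* type #3, fixed to 1 */
    ret |= configurable;                    /* rest of type #1 */
    ret &= ~(e->depends_on_vmm_cap & ~vmm); /* assumed VMM fixup */
    return ret;
}
```

As a worked case: with fixed0 = 0x1, fixed1 = 0x2, depends_on_vmm_cap = 0x8, vmm = 0x10, native = 0x5 and configurable = 0x28, the value goes 0x10, 0x15, 0x14, 0x16, 0x3E and finally 0x36 once bit 3 is removed because the VMM does not report it.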