[RFC for-4.20 v1 0/1] x86/hvm: Introduce Xen-wide ASID allocator
Motivation:
-----------
This is part of the effort to enable AMD SEV technologies in Xen. For AMD
SEV support, we need a fixed ASID associated with all vcpus of the same
domain throughout the domain's lifetime. This is because for SEV and
SEV-{ES,SNP} VMs, the ASID is the index associated with the encryption key.

Currently, ASID generation and management is done per-pCPU in Xen, and at
the time of each VMENTER the ASID associated with the domain's vcpus is
changed. This implementation is incompatible with SEV technologies for the
above-mentioned reasons.

In a discussion with Andrew Cooper, it came up that it would be nice to have
fixed ASIDs not only for SEV VMs but for all VMs, because it opens up the
opportunity to use instructions like TLBSYNC and INVLPGB (Section 5.5.3 in
the AMD Architecture manual[0]) for broadcasting TLB invalidations.

Why is this an RFC?
-------------------
This is only tested on AMD SVM at the moment. There are a few points that I
would like to discuss and get feedback on from the community before further
development and testing. I've also submitted a design session for this RFC
to discuss it further at the Xen Summit.

Points of discussion:
---------------------
1. I'm not sure how this should be handled for nestedhvm. To start with, at
   the moment all the values seem to be handled via struct nestedvcpu. Do we
   want to keep it that way, or do we want to have something like a
   nestedhvm_domain to associate certain values like the ASID? I've not
   handled this as part of this RFC, as I would like to know the opinions
   and plans of those working on nested virtualization.

2. I'm doing the initialization of Xen-wide ASIDs in setup.c at the moment,
   but is that the right place to do it? I'm asking because I've been seeing
   a weird bug with the code in this RFC: Dom0 is able to keep a fixed ASID
   throughout its lifecycle, but if I start a domU with 2/4 vcpus via xl,
   sometimes it only brings up one vcpu and shows
   'tsc: Unable to calibrate against PIT' while booting the kernel.

Notes:
------
1. Currently the RFC doesn't demonstrate the use of TLBSYNC and INVLPGB.
   That can be added later if required; I'm not sure whether it should be
   part of the same patch series or not.

2. This is a basic RFC to start the discussion on the above points, but I
   further plan to add logic to reclaim ASIDs that are no longer in use, and
   a check to pick an ASID from such a stack before doing
   hvm_asid_flush_all().

[0] https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf

Vaishali Thakkar (1):
  x86/hvm: Introduce Xen-wide ASID Allocator

 xen/arch/x86/flushtlb.c                |  1 -
 xen/arch/x86/hvm/asid.c                | 61 ++
 xen/arch/x86/hvm/emulate.c             |  3 --
 xen/arch/x86/hvm/hvm.c                 |  2 +-
 xen/arch/x86/hvm/nestedhvm.c           |  4 +-
 xen/arch/x86/hvm/svm/asid.c            | 28 +++-
 xen/arch/x86/hvm/svm/nestedsvm.c       |  2 +-
 xen/arch/x86/hvm/svm/svm.c             | 33 ++
 xen/arch/x86/hvm/svm/svm.h             |  1 -
 xen/arch/x86/hvm/vmx/vmcs.c            |  2 +-
 xen/arch/x86/hvm/vmx/vmx.c             | 15 +++
 xen/arch/x86/hvm/vmx/vvmx.c            |  6 +--
 xen/arch/x86/include/asm/hvm/asid.h    | 19
 xen/arch/x86/include/asm/hvm/domain.h  |  1 +
 xen/arch/x86/include/asm/hvm/hvm.h     |  2 +-
 xen/arch/x86/include/asm/hvm/svm/svm.h |  1 +
 xen/arch/x86/include/asm/hvm/vcpu.h    |  6 +--
 xen/arch/x86/include/asm/hvm/vmx/vmx.h |  3 +-
 xen/arch/x86/mm/hap/hap.c              |  4 +-
 xen/arch/x86/mm/p2m.c                  |  6 +--
 xen/arch/x86/mm/paging.c               |  2 +-
 xen/arch/x86/setup.c                   |  7 +++
 22 files changed, 108 insertions(+), 101 deletions(-)

--
2.45.0
[RFC for-4.20 v1 1/1] x86/hvm: Introduce Xen-wide ASID Allocator
Currently ASID generation and management is done per-pCPU. This scheme is
incompatible with SEV technologies, as SEV VMs need to have a fixed ASID
associated with all vcpus of the VM throughout its lifetime.

This commit introduces a Xen-wide allocator which initializes the ASIDs at
the start of Xen and allows having a fixed ASID throughout the lifecycle of
all domains. Having a fixed ASID for non-SEV domains also presents us with
the opportunity to make further use of AMD instructions like TLBSYNC and
INVLPGB for broadcasting TLB invalidations.

Signed-off-by: Vaishali Thakkar
---
 xen/arch/x86/flushtlb.c                |  1 -
 xen/arch/x86/hvm/asid.c                | 61 ++
 xen/arch/x86/hvm/emulate.c             |  3 --
 xen/arch/x86/hvm/hvm.c                 |  2 +-
 xen/arch/x86/hvm/nestedhvm.c           |  4 +-
 xen/arch/x86/hvm/svm/asid.c            | 28 +++-
 xen/arch/x86/hvm/svm/nestedsvm.c       |  2 +-
 xen/arch/x86/hvm/svm/svm.c             | 33 ++
 xen/arch/x86/hvm/svm/svm.h             |  1 -
 xen/arch/x86/hvm/vmx/vmcs.c            |  2 +-
 xen/arch/x86/hvm/vmx/vmx.c             | 15 +++
 xen/arch/x86/hvm/vmx/vvmx.c            |  6 +--
 xen/arch/x86/include/asm/hvm/asid.h    | 19
 xen/arch/x86/include/asm/hvm/domain.h  |  1 +
 xen/arch/x86/include/asm/hvm/hvm.h     |  2 +-
 xen/arch/x86/include/asm/hvm/svm/svm.h |  1 +
 xen/arch/x86/include/asm/hvm/vcpu.h    |  6 +--
 xen/arch/x86/include/asm/hvm/vmx/vmx.h |  3 +-
 xen/arch/x86/mm/hap/hap.c              |  4 +-
 xen/arch/x86/mm/p2m.c                  |  6 +--
 xen/arch/x86/mm/paging.c               |  2 +-
 xen/arch/x86/setup.c                   |  7 +++
 22 files changed, 108 insertions(+), 101 deletions(-)

diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c
index 18748b2bc8..69d72944d7 100644
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -124,7 +124,6 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long cr4)
     if ( tlb_clk_enabled )
         t = pre_flush();
-    hvm_flush_guest_tlbs();

     old_cr4 = read_cr4();
     ASSERT(!(old_cr4 & X86_CR4_PCIDE) || !(old_cr4 & X86_CR4_PGE));
diff --git a/xen/arch/x86/hvm/asid.c b/xen/arch/x86/hvm/asid.c
index 8d27b7dba1..f343bfcdb9 100644
--- a/xen/arch/x86/hvm/asid.c
+++ b/xen/arch/x86/hvm/asid.c
@@ -27,8 +27,8 @@ boolean_param("asid", opt_asid_enabled);
  * the TLB.
  *
  * Sketch of the Implementation:
- *
- * ASIDs are a CPU-local resource. As preemption of ASIDs is not possible,
+ * TODO(vaishali): Update this comment
+ * ASIDs are a Xen-wide resource. As preemption of ASIDs is not possible,
  * ASIDs are assigned in a round-robin scheme. To minimize the overhead of
  * ASID invalidation, at the time of a TLB flush, ASIDs are tagged with a
  * 64-bit generation. Only on a generation overflow the code needs to
@@ -38,20 +38,21 @@ boolean_param("asid", opt_asid_enabled);
  * ASID usage to retain correctness.
  */

-/* Per-CPU ASID management. */
+/* Xen-wide ASID management */
 struct hvm_asid_data {
-    uint64_t core_asid_generation;
+    uint64_t asid_generation;
     uint32_t next_asid;
     uint32_t max_asid;
+    uint32_t min_asid;
     bool disabled;
 };

-static DEFINE_PER_CPU(struct hvm_asid_data, hvm_asid_data);
+static struct hvm_asid_data asid_data;

 void hvm_asid_init(int nasids)
 {
     static int8_t g_disabled = -1;
-    struct hvm_asid_data *data = &this_cpu(hvm_asid_data);
+    struct hvm_asid_data *data = &asid_data;

     data->max_asid = nasids - 1;
     data->disabled = !opt_asid_enabled || (nasids <= 1);
@@ -64,67 +65,73 @@ void hvm_asid_init(int nasids)
     }

     /* Zero indicates 'invalid generation', so we start the count at one. */
-    data->core_asid_generation = 1;
+    data->asid_generation = 1;

     /* Zero indicates 'ASIDs disabled', so we start the count at one. */
     data->next_asid = 1;
 }

-void hvm_asid_flush_vcpu_asid(struct hvm_vcpu_asid *asid)
+void hvm_asid_flush_domain_asid(struct hvm_domain_asid *asid)
 {
     write_atomic(&asid->generation, 0);
 }

-void hvm_asid_flush_vcpu(struct vcpu *v)
+void hvm_asid_flush_domain(struct domain *d)
 {
-    hvm_asid_flush_vcpu_asid(&v->arch.hvm.n1asid);
-    hvm_asid_flush_vcpu_asid(&vcpu_nestedhvm(v).nv_n2asid);
+    hvm_asid_flush_domain_asid(&d->arch.hvm.n1asid);
+    //hvm_asid_flush_domain_asid(&vcpu_nestedhvm(v).nv_n2asid);
 }

-void hvm_asid_flush_core(void)
+void hvm_asid_flush_all(void)
 {
-    struct hvm_asid_data *data = &this_cpu(hvm_asid_data);
+    struct hvm_asid_data *data = &asid_data;

-    if ( data->disabled )
+    if ( data->disabled)
         return;

-    if ( likely(++data->core_asid_generation != 0) )
+    if ( likely(++data->asid_generation != 0) )
         return;

     /*
-     * ASID generations are 64 bit. Overflow of generations never happens.
-     * For safety, we simply disable ASIDs, so correctness is established; it
-     * only runs a
Re: [PATCH 5/5] x86/cpu-policy: Introduce some SEV features
On 4/29/24 5:16 PM, Andrew Cooper wrote:

For display purposes only right now.

Signed-off-by: Andrew Cooper

Reviewed-by: Vaishali Thakkar

---
CC: Jan Beulich
CC: Roger Pau Monné
CC: Stefano Stabellini
CC: Xenia Ragiadakou
CC: Sergiy Kibrik
CC: George Dunlap
CC: Andrei Semenov
CC: Vaishali Thakkar

This is only half the work to get SEV working nicely. The other half
(rearranging __start_xen() so we can move the host policy collection
earlier) is still a work-in-progress.
---
 tools/misc/xen-cpuid.c                      | 3 +++
 xen/arch/x86/include/asm/cpufeature.h       | 3 +++
 xen/include/public/arch-x86/cpufeatureset.h | 4 ++++
 xen/tools/gen-cpuid.py                      | 6 +++++-
 4 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index 0d01b0e797f1..1463e0429ba1 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -281,6 +281,9 @@ static const char *const str_eAd[32] =

 static const char *const str_e1Fa[32] =
 {
+    [ 0] = "sme",            [ 1] = "sev",
+    /* 2 */                  [ 3] = "sev-es",
+    [ 4] = "sev-snp",
 };

 static const struct {
diff --git a/xen/arch/x86/include/asm/cpufeature.h b/xen/arch/x86/include/asm/cpufeature.h
index b6fb8c24423c..732f0d2bf758 100644
--- a/xen/arch/x86/include/asm/cpufeature.h
+++ b/xen/arch/x86/include/asm/cpufeature.h
@@ -230,6 +230,9 @@ static inline bool boot_cpu_has(unsigned int feat)
 #define cpu_has_v_gif           boot_cpu_has(X86_FEATURE_V_GIF)
 #define cpu_has_v_spec_ctrl     boot_cpu_has(X86_FEATURE_V_SPEC_CTRL)

+/* CPUID level 0x8000001f.eax */
+#define cpu_has_sev             boot_cpu_has(X86_FEATURE_SEV)
+
 /* Synthesized. */
 #define cpu_has_arch_perfmon    boot_cpu_has(X86_FEATURE_ARCH_PERFMON)
 #define cpu_has_cpuid_faulting  boot_cpu_has(X86_FEATURE_CPUID_FAULTING)
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 80d252a38c2d..7ee0f2329151 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -374,6 +374,10 @@ XEN_CPUFEATURE(NPT_SSS, 18*32+19) /* NPT Supervisor Shadow Stacks */
 XEN_CPUFEATURE(V_SPEC_CTRL,   18*32+20) /* Virtualised MSR_SPEC_CTRL */

 /* AMD-defined CPU features, CPUID level 0x8000001f.eax, word 19 */
+XEN_CPUFEATURE(SME,           19*32+ 0) /* Secure Memory Encryption */
+XEN_CPUFEATURE(SEV,           19*32+ 1) /* Secure Encrypted VM */
+XEN_CPUFEATURE(SEV_ES,        19*32+ 3) /* SEV Encrypted State */
+XEN_CPUFEATURE(SEV_SNP,       19*32+ 4) /* SEV Secure Nested Paging */

 #endif /* XEN_CPUFEATURE */
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index f07b1f4cf905..bff4d9389ff6 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -281,7 +281,7 @@ def crunch_numbers(state):
         _3DNOW: [_3DNOWEXT],

         # The SVM bit enumerates the whole SVM leaf.
-        SVM: list(range(NPT, NPT + 32)),
+        SVM: list(range(NPT, NPT + 32)) + [SEV],

         # This is just the dependency between AVX512 and AVX2 of XSTATE
         # feature flags. If want to use AVX512, AVX2 must be supported and
@@ -341,6 +341,10 @@ def crunch_numbers(state):
         # The behaviour described by RRSBA depend on eIBRS being active.
         EIBRS: [RRSBA],
+
+        SEV: [SEV_ES],
+
+        SEV_ES: [SEV_SNP],
     }

     deep_features = tuple(sorted(deps.keys()))
Re: [PATCH 4/5] x86/svm: Switch SVM features over normal cpu_has_*
On 4/29/24 5:16 PM, Andrew Cooper wrote:

Delete the boot time rendering of advanced features. It's entirely ad hoc,
and not even everything printed here is used by Xen. It is available in
`xen-cpuid` now.

With (only) svm_load_segs_{,prefetch}() now declared in svm.h, it is only
svm.c and domain.c which need the header. Clean up all the others.

No functional change.

Signed-off-by: Andrew Cooper

Reviewed-by: Vaishali Thakkar

---
CC: Jan Beulich
CC: Roger Pau Monné
CC: Stefano Stabellini
CC: Xenia Ragiadakou
CC: Sergiy Kibrik
CC: George Dunlap
CC: Andrei Semenov
CC: Vaishali Thakkar
---
 xen/arch/x86/hvm/svm/asid.c            |  5 ++-
 xen/arch/x86/hvm/svm/emulate.c         |  3 +-
 xen/arch/x86/hvm/svm/intr.c            |  1 -
 xen/arch/x86/hvm/svm/nestedsvm.c       | 14 ----
 xen/arch/x86/hvm/svm/svm.c             | 50 +++---
 xen/arch/x86/hvm/svm/vmcb.c            |  1 -
 xen/arch/x86/include/asm/cpufeature.h  | 10 ++
 xen/arch/x86/include/asm/hvm/svm/svm.h | 36 ---
 8 files changed, 31 insertions(+), 89 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/asid.c b/xen/arch/x86/hvm/svm/asid.c
index 7977a8e86b53..6117a362d310 100644
--- a/xen/arch/x86/hvm/svm/asid.c
+++ b/xen/arch/x86/hvm/svm/asid.c
@@ -6,7 +6,6 @@
 #include
 #include
-#include

 #include "svm.h"

@@ -39,7 +38,7 @@ void svm_asid_handle_vmrun(void)
     {
         vmcb_set_asid(vmcb, true);
         vmcb->tlb_control =
-            cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;
+            cpu_has_flush_by_asid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;
         return;
     }

@@ -48,7 +47,7 @@ void svm_asid_handle_vmrun(void)
     vmcb->tlb_control =
         !need_flush ? TLB_CTRL_NO_FLUSH :
-        cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;
+        cpu_has_flush_by_asid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;
 }

 /*
diff --git a/xen/arch/x86/hvm/svm/emulate.c b/xen/arch/x86/hvm/svm/emulate.c
index 93ac1d3435f9..da6e21b2e270 100644
--- a/xen/arch/x86/hvm/svm/emulate.c
+++ b/xen/arch/x86/hvm/svm/emulate.c
@@ -11,7 +11,6 @@
 #include
 #include
 #include
-#include
 #include

 #include "svm.h"

@@ -20,7 +19,7 @@ static unsigned long svm_nextrip_insn_length(struct vcpu *v)
 {
     struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;

-    if ( !cpu_has_svm_nrips )
+    if ( !cpu_has_nrips )
         return 0;

 #ifndef NDEBUG
diff --git a/xen/arch/x86/hvm/svm/intr.c b/xen/arch/x86/hvm/svm/intr.c
index 4805c5567213..facd2894a2c6 100644
--- a/xen/arch/x86/hvm/svm/intr.c
+++ b/xen/arch/x86/hvm/svm/intr.c
@@ -17,7 +17,6 @@
 #include
 #include
 #include
-#include
 #include /* for nestedhvm_vcpu_in_guestmode */
 #include
 #include
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index 35a2cbfd7d13..255af112661f 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -6,7 +6,6 @@
  */

 #include
-#include
 #include
 #include
 #include
@@ -1620,7 +1619,7 @@ void svm_nested_features_on_efer_update(struct vcpu *v)
     {
         if ( !vmcb->virt_ext.fields.vloadsave_enable &&
              paging_mode_hap(v->domain) &&
-             cpu_has_svm_vloadsave )
+             cpu_has_v_loadsave )
         {
             vmcb->virt_ext.fields.vloadsave_enable = 1;
             general2_intercepts = vmcb_get_general2_intercepts(vmcb);
@@ -1629,8 +1628,7 @@ void svm_nested_features_on_efer_update(struct vcpu *v)
             vmcb_set_general2_intercepts(vmcb, general2_intercepts);
         }

-        if ( !vmcb->_vintr.fields.vgif_enable &&
-             cpu_has_svm_vgif )
+        if ( !vmcb->_vintr.fields.vgif_enable && cpu_has_v_gif )
         {
             vintr = vmcb_get_vintr(vmcb);
             vintr.fields.vgif = svm->ns_gif;
@@ -1675,8 +1673,8 @@ void __init start_nested_svm(struct hvm_function_table *hvm_function_table)
      */
     hvm_function_table->caps.nested_virt =
         hvm_function_table->caps.hap &&
-        cpu_has_svm_lbrv &&
-        cpu_has_svm_nrips &&
-        cpu_has_svm_flushbyasid &&
-        cpu_has_svm_decode;
+        cpu_has_v_lbr &&
+        cpu_has_nrips &&
+        cpu_has_flush_by_asid &&
+        cpu_has_decode_assist;
 }
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 4719fffae589..16eb875aab94 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1287,7 +1287,7 @@ static void cf_check svm_inject_event(const struct x86_event *event)
      * that hardware doesn't perform DPL checking on injection.
      */
     if ( event->type == X86_EVENTTYPE_PRI_SW_EXCEPTION ||
-         (!cpu_has_svm_nrips && (event->type >= X86_EVENTTYPE_SW_INTERRUPT)) )
+
Re: [PATCH] x86/hvm: Allow supplying a dynamic start ASID
On 4/16/24 4:25 PM, Vaishali Thakkar wrote:

On 4/16/24 4:12 PM, Andrew Cooper wrote:

On 16/04/2024 9:54 am, Vaishali Thakkar wrote:

Currently, Xen always starts the ASID allocation at 1. But for SEV
technologies the ASID space is divided. This is because it's a security
issue if a guest is started as ES/SNP and is migrated to SEV-only. So,
the types are tracked explicitly.

Thus, in preparation for SEV support in Xen, add min_asid to allow
supplying a dynamic start ASID during the allocation process.

Signed-off-by: Vaishali Thakkar

Mechanically, this is fine, but is it going to be useful in the long term?

For SEV and SEV-ES/SNP, we must run the VM in the single fixed ASID
negotiated with the ASP. This is not optional.

For non-encrypted VMs, we are free to choose between using the remaining
ASID space as we used to (i.e. run it per-pCPU and tick it over to flush
the TLBs), or to run it in a fixed ASID per guest too. The latter lets us
use INVLPGB, and would avoid maintaining two different TLB-tagging
schemes.

I'd say that we absolutely do want INVLPGB support for managing
non-encrypted VMs, and we cannot mix both fixed and floating ASIDs at the
same time.

I agree that having both schemes at the same time is not practical. And
I've some RFC patches in the works to propose a fixed ASID scheme for all
domains (encrypted or not) to start a discussion regarding that.

IMO, this patch is still useful because my idea was to then use min_asid
as a holder to separate out the non-encrypted space, the encrypted space
and the SEV-ES/SNP ASID space.

If it's easier to see the usefulness of this patch along with the other
ASID fixes, I'm fine with putting it in that RFC patchset. But I thought
that this change was independent enough to be sent by itself.

s/encrypted/SEV

We should seriously consider whether it's worth maintaining two schemes,
or just switching wholesale to a fixed ASID scheme. I don't have a good
answer here...

If it were anything but TLB flushing, it would be an obvious choice, but
TLB handling bugs are some of the nastiest to show up.

~Andrew
Re: [PATCH] x86/svm: Add flushbyasid in the supported features
On 4/16/24 3:38 PM, Andrew Cooper wrote:

On 16/04/2024 10:08 am, Vaishali Thakkar wrote:

TLB Flush by ASID is missing from the list of supported features here.
So, add it.

Signed-off-by: Vaishali Thakkar
---
 xen/arch/x86/hvm/svm/svm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index a745acd903..4719fffae5 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2510,6 +2510,7 @@ const struct hvm_function_table * __init start_svm(void)
     P(cpu_has_svm_lbrv, "Last Branch Record (LBR) Virtualisation");
     P(cpu_has_svm_nrips, "Next-RIP Saved on #VMEXIT");
     P(cpu_has_svm_cleanbits, "VMCB Clean Bits");
+    P(cpu_has_svm_flushbyasid, "TLB flush by ASID");
     P(cpu_has_svm_decode, "DecodeAssists");
     P(cpu_has_svm_vloadsave, "Virtual VMLOAD/VMSAVE");
     P(cpu_has_svm_vgif, "Virtual GIF");

This is consistent with pre-existing behaviour, so
Acked-by: Andrew Cooper

Thanks.

However, an ever increasing list of lines like this is something I'm
trying to push back against. They don't match the configured state of VMs
in the system, not least

Right, makes sense to not add more stuff to print here.

because one of the things required to fix security vulnerabilities in
nested virt is to break the (false) assumption that there is a single
global state of how a VM is configured.

These ones in particular are just about to appear in CPU policies.

As part of the nested virt work?

~Andrew
[PATCH] x86/hvm: Allow supplying a dynamic start ASID
Currently, Xen always starts the ASID allocation at 1. But for SEV
technologies the ASID space is divided. This is because it's a security
issue if a guest is started as ES/SNP and is migrated to SEV-only. So, the
types are tracked explicitly.

Thus, in preparation for SEV support in Xen, add min_asid to allow
supplying a dynamic start ASID during the allocation process.

Signed-off-by: Vaishali Thakkar
---
 xen/arch/x86/hvm/asid.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/asid.c b/xen/arch/x86/hvm/asid.c
index 8d27b7dba1..e14b64f2c8 100644
--- a/xen/arch/x86/hvm/asid.c
+++ b/xen/arch/x86/hvm/asid.c
@@ -41,6 +41,7 @@ boolean_param("asid", opt_asid_enabled);
 /* Per-CPU ASID management. */
 struct hvm_asid_data {
     uint64_t core_asid_generation;
+    uint32_t min_asid;
     uint32_t next_asid;
     uint32_t max_asid;
     bool disabled;
@@ -53,7 +54,8 @@ void hvm_asid_init(int nasids)
     static int8_t g_disabled = -1;
     struct hvm_asid_data *data = &this_cpu(hvm_asid_data);

-    data->max_asid = nasids - 1;
+    data->min_asid = 1;
+    data->max_asid = nasids - data->min_asid;
     data->disabled = !opt_asid_enabled || (nasids <= 1);

     if ( g_disabled != data->disabled )
@@ -66,8 +68,8 @@ void hvm_asid_init(int nasids)
     /* Zero indicates 'invalid generation', so we start the count at one. */
     data->core_asid_generation = 1;

-    /* Zero indicates 'ASIDs disabled', so we start the count at one. */
-    data->next_asid = 1;
+    /* Zero indicates 'ASIDs disabled', so we start the count at min_asid. */
+    data->next_asid = data->min_asid;
 }

 void hvm_asid_flush_vcpu_asid(struct hvm_vcpu_asid *asid)
@@ -117,7 +119,7 @@ bool hvm_asid_handle_vmenter(struct hvm_vcpu_asid *asid)
     if ( unlikely(data->next_asid > data->max_asid) )
     {
         hvm_asid_flush_core();
-        data->next_asid = 1;
+        data->next_asid = data->min_asid;

         if ( data->disabled )
             goto disabled;
--
2.44.0
Re: [PATCH v1 1/2] Implemented AMD SEV discovery and enabling.
On 4/12/24 5:07 PM, Andrew Cooper wrote:

On 12/04/2024 3:38 pm, Vaishali Thakkar wrote:

On 4/12/24 4:06 PM, Andrei Semenov wrote:

On 4/11/24 20:32, Andrew Cooper wrote:

On 10/04/2024 4:36 pm, Andrei Semenov wrote:

+    }
+
+    if (!(cpu_has_sme || cpu_has_sev))
+        return;
+
+    if (!smp_processor_id()) {
+        if (cpu_has_sev)
+            printk(XENLOG_INFO "SEV: ASID range [0x%x - 0x%x]\n",
+                   min_sev_asid, max_sev_asid);

Why do we have a min as well as a max? Isn't min always 1?

In the case of SEV, it's not true. Some BIOSes allow setting the
min_asid, so Xen will also need to be adapted for that. I've a WIP patch
allowing dynamic generation of the ASID in such a case.

I also got an answer to this out of a contact of mine at AMD.

The ASID space is divided, 1->$N for SEV-ES/SNP guests, and $N->$M for
SEV guests. It is a security issue to start a guest as ES/SNP, then
"migrate" it to being SEV-only, so the different types are tracked
explicitly.

Aha, yeah that seems like a better explanation. Thanks for checking with
the AMD person.

~Andrew
Re: [PATCH v1 1/2] Implemented AMD SEV discovery and enabling.
On 4/12/24 4:06 PM, Andrei Semenov wrote:

On 4/11/24 20:32, Andrew Cooper wrote:

On 10/04/2024 4:36 pm, Andrei Semenov wrote:

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index ab92333673..a5903613f0 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -1030,6 +1031,54 @@ static void amd_check_erratum_1485(void)
     wrmsrl(MSR_AMD64_BP_CFG, val | chickenbit);
 }

+#ifdef CONFIG_HVM
+static void amd_enable_mem_encrypt(const struct cpuinfo_x86 *c)
+{
+    unsigned int eax, ebx, ecx, edx;
+    uint64_t syscfg;
+
+    if (!smp_processor_id()) {

c == &boot_cpu_info.

Agree, will fix.

+        cpuid_count(0x80000000, 0, &eax, &ebx, &ecx, &edx);
+
+        if (eax < 0x8000001f)
+            return;

Max leaf is already collected. c->extended_cpuid_level

Agree, will fix.

+        cpuid_count(0x8000001f, 0, &eax, &ebx, &ecx, &edx);
+
+        if (eax & 0x1)
+            setup_force_cpu_cap(X86_FEATURE_SME);
+
+        if (eax & 0x2) {
+            setup_force_cpu_cap(X86_FEATURE_SEV);
+            max_sev_asid = ecx;
+            min_sev_asid = edx;
+        }
+
+        if (eax & 0x3)
+            pte_c_bit_mask = 1UL << (ebx & 0x3f);

This is decoding the main SEV feature leaf, but outside of normal
mechanisms. I've got half a mind to brute-force through the remaining
work to un-screw our boot sequence order, and express this in a
cpu-policy straight away. This is wanted for the SVM leaf info too.
Leave it with me for a bit.

OK. I'll wait for your insights on this then.

+    }
+
+    if (!(cpu_has_sme || cpu_has_sev))
+        return;
+
+    if (!smp_processor_id()) {
+        if (cpu_has_sev)
+            printk(XENLOG_INFO "SEV: ASID range [0x%x - 0x%x]\n",
+                   min_sev_asid, max_sev_asid);

Why do we have a min as well as a max? Isn't min always 1?

In the case of SEV, it's not true. Some BIOSes allow setting the
min_asid, so Xen will also need to be adapted for that. I've a WIP patch
allowing dynamic generation of the ASID in such a case.

Well, "normally it is". But this is part of the CPUID leaf spec. Do they
plan to potentially change it? No idea.

+    }
+
+    rdmsrl(MSR_K8_SYSCFG, syscfg);
+
+    if (syscfg & SYSCFG_MEM_ENCRYPT) {
+        return;
+    }
+
+    syscfg |= SYSCFG_MEM_ENCRYPT;
+    wrmsrl(MSR_K8_SYSCFG, syscfg);
+}
+#endif
+
 static void cf_check init_amd(struct cpuinfo_x86 *c)
 {
     u32 l, h;
@@ -1305,6 +1354,10 @@ static void cf_check init_amd(struct cpuinfo_x86 *c)
     check_syscfg_dram_mod_en();

     amd_log_freq(c);
+
+#ifdef CONFIG_HVM
+    amd_enable_mem_encrypt(c);
+#endif

I think we want to drop the CONFIG_HVM here.

Memory encryption is an all-or-nothing thing. If it's active, it affects
all pagetables that Xen controls, even dom0's. And we likely do want to
get to the point of Xen running on encrypted mappings even if dom0 can't
operate it very nicely.

Thoughts?

Basically I put CONFIG_HVM here because I also wanted to put the related
variables (max/min_asid) in sev.c, and sev.c is in the "HVM" part of the
code as SEV is only related to HVM guests. Now, basically I agree that:
- Xen would potentially like to use encrypted memory for itself;
- in the SME case, some encryption could be offered to non-HVM guests, so
  they can protect their memory (even though the key is shared and the
  hypervisor can read it).

OK, so I will drop CONFIG_HVM and put these variables elsewhere. amd.h is
probably a good candidate?

 }

 const struct cpu_dev __initconst_cf_clobber amd_cpu_dev = {
diff --git a/xen/arch/x86/hvm/svm/Makefile b/xen/arch/x86/hvm/svm/Makefile
index 760d2954da..9773d539ef 100644
--- a/xen/arch/x86/hvm/svm/Makefile
+++ b/xen/arch/x86/hvm/svm/Makefile
@@ -6,3 +6,4 @@ obj-y += nestedsvm.o
 obj-y += svm.o
 obj-y += svmdebug.o
 obj-y += vmcb.o
+obj-y += sev.o

Please keep this sorted by object file name.

Got it. Will do.

diff --git a/xen/arch/x86/hvm/svm/sev.c b/xen/arch/x86/hvm/svm/sev.c
new file mode 100644
index 00..336fad25f5
--- /dev/null
+++ b/xen/arch/x86/hvm/svm/sev.c
@@ -0,0 +1,4 @@
+#include
+uint64_t __read_mostly pte_c_bit_mask;
+unsigned int __read_mostly min_sev_asid;
+unsigned int __read_mostly max_sev_asid;

Several things.

All new files should come with an SPDX tag. Unless you have other
constraints, GPL-2.0-only is preferred. There also wants to be at least a
one-line summary of what's going on here.

Will do.

All these variables look like they should be __ro_after_init. However,
it's rather hard to judge, given no users yet.

Yes, this is not supposed to change dynamically. Will fix.

pte_c_bit_mask may want to be an intpte_t rather than uint64_t.

Agree. Will fix.

~Andrew

Andrei.
Re: [PATCH v3 2/3] x86/svm: Drop the suffix _guest from vmcb bit
On 3/14/24 17:04, Jan Beulich wrote:

On 13.03.2024 17:41, Vaishali Thakkar wrote:

The suffix _guest is redundant for the asid bit. Drop it to avoid adding
extra code volume. While we're here, replace 0/1 with false/true and use
VMCB accessors instead of open coding.

Suggested-by: Andrew Cooper
Signed-off-by: Vaishali Thakkar

Reviewed-by: Jan Beulich
with ...

--- a/xen/arch/x86/hvm/svm/asid.c
+++ b/xen/arch/x86/hvm/svm/asid.c
@@ -37,14 +37,14 @@ void svm_asid_handle_vmrun(void)
     /* ASID 0 indicates that ASIDs are disabled. */
     if ( p_asid->asid == 0 )
     {
-        vmcb_set_guest_asid(vmcb, 1);
+        vmcb_set_asid(vmcb,true);

... the blank put back that was lost here (can be done while committing).

Thanks

Jan
[PATCH v3 3/3] x86/svmdebug: Print np, sev and sev_es vmcb bits
Currently only the raw _np_ctrl is being printed. It can be informative to
know which particular bits are enabled, so this commit adds the bit-by-bit
decode for the np, sev and sev_es bits. Note that while only np is enabled
in certain scenarios at the moment, work on enabling sev and sev_es is in
progress, and it'll be useful to have this information as part of svmdebug.

Signed-off-by: Vaishali Thakkar
---
Changes since v1:
- Pretty printing
Changes since v2:
- Minor changes in pretty printing to make information clear
- Improve commit log and subject to include the _np bit
---
 xen/arch/x86/hvm/svm/svmdebug.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svmdebug.c b/xen/arch/x86/hvm/svm/svmdebug.c
index 0d714c728c..9d3badcf5d 100644
--- a/xen/arch/x86/hvm/svm/svmdebug.c
+++ b/xen/arch/x86/hvm/svm/svmdebug.c
@@ -51,8 +51,11 @@ void svm_vmcb_dump(const char *from, const struct vmcb_struct *vmcb)
            vmcb->exitcode, vmcb->exit_int_info.raw);
     printk("exitinfo1 = %#"PRIx64" exitinfo2 = %#"PRIx64"\n",
            vmcb->exitinfo1, vmcb->exitinfo2);
-    printk("np_ctrl = %#"PRIx64" asid = %#x\n",
-           vmcb_get_np_ctrl(vmcb), vmcb_get_asid(vmcb));
+    printk("asid = %#x np_ctrl = %#"PRIx64":%s%s%s\n",
+           vmcb_get_asid(vmcb), vmcb_get_np_ctrl(vmcb),
+           vmcb_get_np(vmcb)     ? " NP"     : "",
+           vmcb_get_sev(vmcb)    ? " SEV"    : "",
+           vmcb_get_sev_es(vmcb) ? " SEV_ES" : "");
     printk("virtual vmload/vmsave = %d, virt_ext = %#"PRIx64"\n",
            vmcb->virt_ext.fields.vloadsave_enable, vmcb->virt_ext.bytes);
     printk("cpl = %d efer = %#"PRIx64" star = %#"PRIx64" lstar = %#"PRIx64"\n",
--
2.44.0
[PATCH v3 1/3] x86/svm: Drop the _enabled suffix from vmcb bits
The suffix is redundant for the np/sev/sev-es bits. Drop it to avoid
adding extra code volume. Also, while we're here, drop the double negation
in one of the instances of the _np bit, replace 0/1 with false/true in the
use cases of _np, and use VMCB accessors instead of open coding.

Suggested-by: Andrew Cooper
Signed-off-by: Vaishali Thakkar
Reviewed-by: Andrew Cooper
---
Changes since v1:
- Address Andrew and Jan's reviews related to dropping double negation
  and replacing 0/1 with false/true
- Fix the typo around signed-off-by
Changes since v2:
- Use VMCB accessors instead of open coding for the code that's touched
  by this commit
---
 xen/arch/x86/hvm/svm/nestedsvm.c        | 14 +++----
 xen/arch/x86/hvm/svm/svm.c              |  2 +-
 xen/arch/x86/hvm/svm/vmcb.c             |  2 +-
 xen/arch/x86/include/asm/hvm/svm/vmcb.h | 20 ++--
 4 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index e4e01add8c..07630d74d3 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -571,7 +571,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs)
     if ( nestedhvm_paging_mode_hap(v) )
     {
         /* host nested paging + guest nested paging. */
-        n2vmcb->_np_enable = 1;
+        vmcb_set_np(n2vmcb, true);

         nestedsvm_vmcb_set_nestedp2m(v, ns_vmcb, n2vmcb);

@@ -585,7 +585,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs)
     else if ( paging_mode_hap(v->domain) )
     {
         /* host nested paging + guest shadow paging. */
-        n2vmcb->_np_enable = 1;
+        vmcb_set_np(n2vmcb, true);
         /* Keep h_cr3 as it is. */
         n2vmcb->_h_cr3 = n1vmcb->_h_cr3;
         /* When l1 guest does shadow paging
@@ -601,7 +601,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs)
     else
     {
         /* host shadow paging + guest shadow paging. */
-        n2vmcb->_np_enable = 0;
+        vmcb_set_np(n2vmcb, false);
         n2vmcb->_h_cr3 = 0x0;

         /* TODO: Once shadow-shadow paging is in place come back to here
@@ -706,7 +706,7 @@ nsvm_vcpu_vmentry(struct vcpu *v, struct cpu_user_regs *regs,
     }

     /* nested paging for the guest */
-    svm->ns_hap_enabled = !!ns_vmcb->_np_enable;
+    svm->ns_hap_enabled = vmcb_get_np(ns_vmcb);

     /* Remember the V_INTR_MASK in hostflags */
     svm->ns_hostflags.fields.vintrmask = !!ns_vmcb->_vintr.fields.intr_masking;
@@ -1084,7 +1084,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs)
     if ( nestedhvm_paging_mode_hap(v) )
     {
         /* host nested paging + guest nested paging. */
-        ns_vmcb->_np_enable = n2vmcb->_np_enable;
+        vmcb_set_np(ns_vmcb, vmcb_get_np(n2vmcb));
         ns_vmcb->_cr3 = n2vmcb->_cr3;
         /* The vmcb->h_cr3 is the shadowed h_cr3. The original
          * unshadowed guest h_cr3 is kept in ns_vmcb->h_cr3,
@@ -1093,7 +1093,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs)
     else if ( paging_mode_hap(v->domain) )
     {
         /* host nested paging + guest shadow paging. */
-        ns_vmcb->_np_enable = 0;
+        vmcb_set_np(ns_vmcb, false);
         /* Throw h_cr3 away. Guest is not allowed to set it or
          * it can break out, otherwise (security hole!) */
         ns_vmcb->_h_cr3 = 0x0;
@@ -1104,7 +1104,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs)
     else
     {
         /* host shadow paging + guest shadow paging. */
-        ns_vmcb->_np_enable = 0;
+        vmcb_set_np(ns_vmcb, false);
         ns_vmcb->_h_cr3 = 0x0;
         /* The vmcb->_cr3 is the shadowed cr3. The original
          * unshadowed guest cr3 is kept in ns_vmcb->_cr3,
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index b551eac807..b1ab0b568b 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -473,7 +473,7 @@ static int svm_vmcb_restore(struct vcpu *v, struct hvm_hw_cpu *c)

     if ( paging_mode_hap(v->domain) )
     {
-        vmcb_set_np_enable(vmcb, 1);
+        vmcb_set_np(vmcb, true);
         vmcb_set_g_pat(vmcb, MSR_IA32_CR_PAT_RESET /* guest PAT */);
         vmcb_set_h_cr3(vmcb, pagetable_get_paddr(p2m_get_pagetable(p2m)));
     }
diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
index 282fe7cdbe..4e1f61dbe0 100644
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -133,7 +133,7 @@ static int construct_vmcb(struct vcpu *v)

     if ( paging_mode_hap(v->domain) )
     {
-        vmcb->_np_enable = 1; /* enable nested paging */
+        vmcb_set_np(vmcb, true); /* enable nested paging */
         vmcb->_g_pat = MSR_IA32_CR_PAT_RESET; /* guest PAT */
         vmcb-
[PATCH v3 2/3] x86/svm: Drop the suffix _guest from vmcb bit
The suffix _guest is redundant for the asid bit. Drop it to avoid adding extra code volume. While we're here, replace 0/1 with false/true and use VMCB accessors instead of open coding. Suggested-by: Andrew Cooper Signed-off-by: Vaishali Thakkar --- Changes since v1: - This patch wasn't part of v1. It's been added to address Andrew's suggestion. Changes since v2: - Replace 0/1 with false/true in one of the instances - Use VMCB accessors instead of open coding --- xen/arch/x86/hvm/svm/asid.c | 6 +++--- xen/arch/x86/hvm/svm/nestedsvm.c | 8 xen/arch/x86/hvm/svm/svmdebug.c | 4 ++-- xen/arch/x86/include/asm/hvm/svm/nestedsvm.h | 2 +- xen/arch/x86/include/asm/hvm/svm/vmcb.h | 6 +++--- 5 files changed, 13 insertions(+), 13 deletions(-) diff --git a/xen/arch/x86/hvm/svm/asid.c b/xen/arch/x86/hvm/svm/asid.c index 28e0dbc176..39e3c56919 100644 --- a/xen/arch/x86/hvm/svm/asid.c +++ b/xen/arch/x86/hvm/svm/asid.c @@ -37,14 +37,14 @@ void svm_asid_handle_vmrun(void) /* ASID 0 indicates that ASIDs are disabled. */ if ( p_asid->asid == 0 ) { -vmcb_set_guest_asid(vmcb, 1); +vmcb_set_asid(vmcb, true); vmcb->tlb_control = cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL; return; } -if ( vmcb_get_guest_asid(vmcb) != p_asid->asid ) -vmcb_set_guest_asid(vmcb, p_asid->asid); +if ( vmcb_get_asid(vmcb) != p_asid->asid ) +vmcb_set_asid(vmcb, p_asid->asid); vmcb->tlb_control = !need_flush ?
TLB_CTRL_NO_FLUSH : diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c index 07630d74d3..a8d5f4ee95 100644 --- a/xen/arch/x86/hvm/svm/nestedsvm.c +++ b/xen/arch/x86/hvm/svm/nestedsvm.c @@ -157,7 +157,7 @@ int cf_check nsvm_vcpu_reset(struct vcpu *v) svm->ns_hap_enabled = 0; svm->ns_vmcb_guestcr3 = 0; svm->ns_vmcb_hostcr3 = 0; -svm->ns_guest_asid = 0; +svm->ns_asid = 0; svm->ns_hostflags.bytes = 0; svm->ns_vmexit.exitinfo1 = 0; svm->ns_vmexit.exitinfo2 = 0; @@ -698,11 +698,11 @@ nsvm_vcpu_vmentry(struct vcpu *v, struct cpu_user_regs *regs, /* Convert explicitely to boolean. Deals with l1 guests * that use flush-by-asid w/o checking the cpuid bits */ nv->nv_flushp2m = !!ns_vmcb->tlb_control; -if ( svm->ns_guest_asid != ns_vmcb->_guest_asid ) +if ( svm->ns_asid != vmcb_get_asid(ns_vmcb)) { nv->nv_flushp2m = 1; hvm_asid_flush_vcpu_asid(_nestedhvm(v).nv_n2asid); -svm->ns_guest_asid = ns_vmcb->_guest_asid; +svm->ns_asid = vmcb_get_asid(ns_vmcb); } /* nested paging for the guest */ @@ -1046,7 +1046,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs) /* Keep it. It's maintainted by the l1 guest. 
*/ /* ASID */ -/* ns_vmcb->_guest_asid = n2vmcb->_guest_asid; */ +/* vmcb_set_asid(ns_vmcb, vmcb_get_asid(n2vmcb)); */ /* TLB control */ ns_vmcb->tlb_control = 0; diff --git a/xen/arch/x86/hvm/svm/svmdebug.c b/xen/arch/x86/hvm/svm/svmdebug.c index 24358c6eea..0d714c728c 100644 --- a/xen/arch/x86/hvm/svm/svmdebug.c +++ b/xen/arch/x86/hvm/svm/svmdebug.c @@ -51,8 +51,8 @@ void svm_vmcb_dump(const char *from, const struct vmcb_struct *vmcb) vmcb->exitcode, vmcb->exit_int_info.raw); printk("exitinfo1 = %#"PRIx64" exitinfo2 = %#"PRIx64"\n", vmcb->exitinfo1, vmcb->exitinfo2); -printk("np_ctrl = %#"PRIx64" guest_asid = %#x\n", - vmcb_get_np_ctrl(vmcb), vmcb_get_guest_asid(vmcb)); +printk("np_ctrl = %#"PRIx64" asid = %#x\n", + vmcb_get_np_ctrl(vmcb), vmcb_get_asid(vmcb)); printk("virtual vmload/vmsave = %d, virt_ext = %#"PRIx64"\n", vmcb->virt_ext.fields.vloadsave_enable, vmcb->virt_ext.bytes); printk("cpl = %d efer = %#"PRIx64" star = %#"PRIx64" lstar = %#"PRIx64"\n", diff --git a/xen/arch/x86/include/asm/hvm/svm/nestedsvm.h b/xen/arch/x86/include/asm/hvm/svm/nestedsvm.h index 406fc082b1..7767cd6080 100644 --- a/xen/arch/x86/include/asm/hvm/svm/nestedsvm.h +++ b/xen/arch/x86/include/asm/hvm/svm/nestedsvm.h @@ -51,7 +51,7 @@ struct nestedsvm { * the l1 guest nested page table */ uint64_t ns_vmcb_guestcr3, ns_vmcb_hostcr3; -uint32_t ns_guest_asid; +uint32_t ns_asid; bool ns_hap_enabled; diff --git a/xen/arch/x86/include/asm/hvm/svm/vmcb.h b/xen/arch/x86/include/asm/hvm/svm/vmcb.h index bf2b8d9a94..0396d10b90 100644 --- a/xen/arch/x86/include/asm/hvm/svm/vmcb.h +++ b/xen/arch/x86/include/asm/hvm/svm/vmcb.h @@ -383,7 +383,7 @@ typedef union bool intercepts:1; /* 0: cr/dr/exception/general interc
[PATCH v3 0/3] Misc changes for a few VMCB bits
Hi, In this patchset, the first and second patches remove the unnecessary suffixes from a bunch of vmcb bits. The third patch prints the status of the sev and sev-es bits while dumping the VMCB. Changes since v1: - Address comments from Andrew and Jan - Add an extra patch to drop the suffix _guest as per Andrew's suggestion in one of the reviews - Address Andrew's comment with respect to pretty printing Changes since v2: - Use VMCB accessors instead of open coding in patches 1 & 2 - Fix one remaining instance to use false/true instead of 0/1 in patch 2 - Improve the pretty printing in svmdebug based on Jan's comments - Improve the commit log and subject of patch 3 to reflect the changes done in v3 Vaishali Thakkar (3): x86/svm: Drop the _enabled suffix from vmcb bits x86/svm: Drop the suffix _guest from vmcb bit x86/svmdebug: Print np, sev and sev_es vmcb bits xen/arch/x86/hvm/svm/asid.c | 6 ++--- xen/arch/x86/hvm/svm/nestedsvm.c | 22 - xen/arch/x86/hvm/svm/svm.c | 2 +- xen/arch/x86/hvm/svm/svmdebug.c | 7 -- xen/arch/x86/hvm/svm/vmcb.c | 2 +- xen/arch/x86/include/asm/hvm/svm/nestedsvm.h | 2 +- xen/arch/x86/include/asm/hvm/svm/vmcb.h | 26 ++-- 7 files changed, 35 insertions(+), 32 deletions(-) -- 2.44.0
Re: [PATCH v2 3/3] x86/svmdebug: Print sev and sev_es vmcb bits
On 3/12/24 09:05, Jan Beulich wrote: On 11.03.2024 13:40, Vaishali Thakkar wrote: While sev and sev_es bits are not yet enabled in xen, including their status in the VMCB dump could be informational. Therefore, print it via svmdebug. Yet there are more bits there. I'm okay with leaving off printing of them here, but it wants clarifying why some are printed and some are not. Well, the idea is to print the bits that are either enabled or have WIP to enable them (e.g. sev and sev_es). I didn't include other bits as I'm not sure if there is any WIP to enable them. Particularly including sev and sev_es is useful for us while working on the enablement of it. Does a commit log like the following make it clear for you? " Currently only raw _np_ctrl is being printed. It can be informational to know about which particular bits are enabled. So, this commit adds the bit-by-bit decode for np, sev and sev_es bits. Note that while only np is enabled in certain scenarios at the moment, work for enabling sev and sev_es is in progress. And it's useful to have this information as part of svmdebug. " I'm also fine with including other bits here if that's preferred. --- a/xen/arch/x86/hvm/svm/svmdebug.c +++ b/xen/arch/x86/hvm/svm/svmdebug.c @@ -51,8 +51,11 @@ void svm_vmcb_dump(const char *from, const struct vmcb_struct *vmcb) vmcb->exitcode, vmcb->exit_int_info.raw); printk("exitinfo1 = %#"PRIx64" exitinfo2 = %#"PRIx64"\n", vmcb->exitinfo1, vmcb->exitinfo2); -printk("np_ctrl = %#"PRIx64" asid = %#x\n", - vmcb_get_np_ctrl(vmcb), vmcb_get_asid(vmcb)); +printk("asid = %#x np_ctrl = %#"PRIx64" - %s%s%s\n", + vmcb_get_asid(vmcb), vmcb_get_np_ctrl(vmcb), + vmcb_get_np(vmcb) ? "NP" : "", + vmcb_get_sev(vmcb)? "SEV": "", + vmcb_get_sev_es(vmcb) ? "SEV_ES" : ""); Each of these three string literals needs a leading blank as separator. In exchange the one in the format string immediately after '-' then will want dropping.
That'll still lead to slightly odd output if none of the bits is set; imo it would be slightly less odd if you used printk("asid = %#x np_ctrl = %#"PRIx64":%s%s%s\n", instead. Jan
Re: [PATCH v2 1/3] x86/svm: Drop the _enabled suffix from vmcb bits
On 3/12/24 11:49, Jan Beulich wrote: On 12.03.2024 11:00, Vaishali Thakkar wrote: On 3/12/24 08:54, Jan Beulich wrote: On 11.03.2024 13:40, Vaishali Thakkar wrote: --- a/xen/arch/x86/hvm/svm/nestedsvm.c +++ b/xen/arch/x86/hvm/svm/nestedsvm.c @@ -571,7 +571,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs) if ( nestedhvm_paging_mode_hap(v) ) { /* host nested paging + guest nested paging. */ -n2vmcb->_np_enable = 1; +n2vmcb->_np = true; nestedsvm_vmcb_set_nestedp2m(v, ns_vmcb, n2vmcb); @@ -585,7 +585,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs) else if ( paging_mode_hap(v->domain) ) { /* host nested paging + guest shadow paging. */ -n2vmcb->_np_enable = 1; +n2vmcb->_np = true; /* Keep h_cr3 as it is. */ n2vmcb->_h_cr3 = n1vmcb->_h_cr3; /* When l1 guest does shadow paging @@ -601,7 +601,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs) else { /* host shadow paging + guest shadow paging. */ -n2vmcb->_np_enable = 0; +n2vmcb->_np = false; n2vmcb->_h_cr3 = 0x0; /* TODO: Once shadow-shadow paging is in place come back to here @@ -706,7 +706,7 @@ nsvm_vcpu_vmentry(struct vcpu *v, struct cpu_user_regs *regs, } /* nested paging for the guest */ -svm->ns_hap_enabled = !!ns_vmcb->_np_enable; +svm->ns_hap_enabled = ns_vmcb->_np; /* Remember the V_INTR_MASK in hostflags */ svm->ns_hostflags.fields.vintrmask = !!ns_vmcb->_vintr.fields.intr_masking; @@ -1084,7 +1084,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs) if ( nestedhvm_paging_mode_hap(v) ) { /* host nested paging + guest nested paging. */ -ns_vmcb->_np_enable = n2vmcb->_np_enable; +ns_vmcb->_np = n2vmcb->_np; ns_vmcb->_cr3 = n2vmcb->_cr3; /* The vmcb->h_cr3 is the shadowed h_cr3. 
The original * unshadowed guest h_cr3 is kept in ns_vmcb->h_cr3, @@ -1093,7 +1093,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs) else if ( paging_mode_hap(v->domain) ) { /* host nested paging + guest shadow paging. */ -ns_vmcb->_np_enable = 0; +ns_vmcb->_np = false; /* Throw h_cr3 away. Guest is not allowed to set it or * it can break out, otherwise (security hole!) */ ns_vmcb->_h_cr3 = 0x0; @@ -1104,7 +1104,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs) else { /* host shadow paging + guest shadow paging. */ -ns_vmcb->_np_enable = 0; +ns_vmcb->_np = false; ns_vmcb->_h_cr3 = 0x0; /* The vmcb->_cr3 is the shadowed cr3. The original * unshadowed guest cr3 is kept in ns_vmcb->_cr3, While spotting the small issue below it occurred to me: Why is it that vmcb_set_...() is open-coded everywhere here? I think this would be pretty nice to avoid at the same time (for lines touched anyway, or in a separate prereq patch, or alternatively [and only ideally] for all other instances in a follow-on patch). Thoughts? Yes, I noticed this too. My plan was to send a followup patch for fixing all the instances where vmcb_set/get_...() can be used. There are bunch of other vmcb bits (apart from the ones being handled in this patchset) in this file and in svm.c which can benefit from using VMCB accessors. To keep churn as well as effort to find commits touching individual lines low, doing the conversion when touching lines anyway is imo preferable. A follow-on patch can then convert what's left. Alright, I'll replace open coding with VMCB accessors for the lines which are touched by this patchset. And others in a followup patchset. Jan
Re: [PATCH v2 2/3] x86/svm: Drop the suffix _guest from vmcb bit
On 3/12/24 08:59, Jan Beulich wrote: On 11.03.2024 13:40, Vaishali Thakkar wrote: @@ -698,11 +698,11 @@ nsvm_vcpu_vmentry(struct vcpu *v, struct cpu_user_regs *regs, /* Convert explicitely to boolean. Deals with l1 guests * that use flush-by-asid w/o checking the cpuid bits */ nv->nv_flushp2m = !!ns_vmcb->tlb_control; -if ( svm->ns_guest_asid != ns_vmcb->_guest_asid ) +if ( svm->ns_asid != ns_vmcb->_asid ) { nv->nv_flushp2m = 1; hvm_asid_flush_vcpu_asid(_nestedhvm(v).nv_n2asid); -svm->ns_guest_asid = ns_vmcb->_guest_asid; +svm->ns_asid = ns_vmcb->_asid; } /* nested paging for the guest */ @@ -1046,7 +1046,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs) /* Keep it. It's maintainted by the l1 guest. */ /* ASID */ -/* ns_vmcb->_guest_asid = n2vmcb->_guest_asid; */ +/* ns_vmcb->_asid = n2vmcb->_asid; */ Unlike in the earlier patch, where I could accept the request to switch to using accessor functions as scope-creep-ish, here I'm pretty firm with my request to stop their open-coding at the same time. Unless of course there's a technical reason the accessors cannot be used here. Yes, so as mentioned in the other patch's reply, I plan to tackle this instance too in the followup patchset along with others. So, if you're fine with it, I'll leave this one here for now. Unless you prefer otherwise. Jan
Re: [PATCH v2 1/3] x86/svm: Drop the _enabled suffix from vmcb bits
On 3/12/24 08:54, Jan Beulich wrote: On 11.03.2024 13:40, Vaishali Thakkar wrote: --- a/xen/arch/x86/hvm/svm/nestedsvm.c +++ b/xen/arch/x86/hvm/svm/nestedsvm.c @@ -571,7 +571,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs) if ( nestedhvm_paging_mode_hap(v) ) { /* host nested paging + guest nested paging. */ -n2vmcb->_np_enable = 1; +n2vmcb->_np = true; nestedsvm_vmcb_set_nestedp2m(v, ns_vmcb, n2vmcb); @@ -585,7 +585,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs) else if ( paging_mode_hap(v->domain) ) { /* host nested paging + guest shadow paging. */ -n2vmcb->_np_enable = 1; +n2vmcb->_np = true; /* Keep h_cr3 as it is. */ n2vmcb->_h_cr3 = n1vmcb->_h_cr3; /* When l1 guest does shadow paging @@ -601,7 +601,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs) else { /* host shadow paging + guest shadow paging. */ -n2vmcb->_np_enable = 0; +n2vmcb->_np = false; n2vmcb->_h_cr3 = 0x0; /* TODO: Once shadow-shadow paging is in place come back to here @@ -706,7 +706,7 @@ nsvm_vcpu_vmentry(struct vcpu *v, struct cpu_user_regs *regs, } /* nested paging for the guest */ -svm->ns_hap_enabled = !!ns_vmcb->_np_enable; +svm->ns_hap_enabled = ns_vmcb->_np; /* Remember the V_INTR_MASK in hostflags */ svm->ns_hostflags.fields.vintrmask = !!ns_vmcb->_vintr.fields.intr_masking; @@ -1084,7 +1084,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs) if ( nestedhvm_paging_mode_hap(v) ) { /* host nested paging + guest nested paging. */ -ns_vmcb->_np_enable = n2vmcb->_np_enable; +ns_vmcb->_np = n2vmcb->_np; ns_vmcb->_cr3 = n2vmcb->_cr3; /* The vmcb->h_cr3 is the shadowed h_cr3. The original * unshadowed guest h_cr3 is kept in ns_vmcb->h_cr3, @@ -1093,7 +1093,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs) else if ( paging_mode_hap(v->domain) ) { /* host nested paging + guest shadow paging. 
*/ -ns_vmcb->_np_enable = 0; +ns_vmcb->_np = false; /* Throw h_cr3 away. Guest is not allowed to set it or * it can break out, otherwise (security hole!) */ ns_vmcb->_h_cr3 = 0x0; @@ -1104,7 +1104,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs) else { /* host shadow paging + guest shadow paging. */ -ns_vmcb->_np_enable = 0; +ns_vmcb->_np = false; ns_vmcb->_h_cr3 = 0x0; /* The vmcb->_cr3 is the shadowed cr3. The original * unshadowed guest cr3 is kept in ns_vmcb->_cr3, While spotting the small issue below it occurred to me: Why is it that vmcb_set_...() is open-coded everywhere here? I think this would be pretty nice to avoid at the same time (for lines touched anyway, or in a separate prereq patch, or alternatively [and only ideally] for all other instances in a follow-on patch). Thoughts? Yes, I noticed this too. My plan was to send a followup patch for fixing all the instances where vmcb_set/get_...() can be used. There are bunch of other vmcb bits (apart from the ones being handled in this patchset) in this file and in svm.c which can benefit from using VMCB accessors. --- a/xen/arch/x86/hvm/svm/svm.c +++ b/xen/arch/x86/hvm/svm/svm.c @@ -473,7 +473,7 @@ static int svm_vmcb_restore(struct vcpu *v, struct hvm_hw_cpu *c) if ( paging_mode_hap(v->domain) ) { -vmcb_set_np_enable(vmcb, 1); +vmcb_set_np(vmcb, 1); No switching to "true" here? (If the answer to the other question is "No" for whatever reason, I'd nevertheless like to see this on adjusted, which could then be done while committing.) Sorry, I missed this instance. I'll fix it if I'll need to send another revised patchset for some other fixes (based on review comments), else feel free to fix it while committing. Thanks. Jan
[PATCH v2 1/3] x86/svm: Drop the _enabled suffix from vmcb bits
The suffix is redundant for np/sev/sev-es bits. Drop it to avoid adding extra code volume. While we're here, drop the double negations in one of the instances of _np bit and replace 0/1 with false/true in the use cases of _np. Suggested-by: Andrew Cooper Signed-off-by: Vaishali Thakkar Reviewed-by: Andrew Cooper --- Changes since v1: - Address Andrew and Jan's reviews related to dropping double negation and replacing 0/1 with false/true - Fix the typo around signed-off-by --- xen/arch/x86/hvm/svm/nestedsvm.c| 14 +++--- xen/arch/x86/hvm/svm/svm.c | 2 +- xen/arch/x86/hvm/svm/vmcb.c | 2 +- xen/arch/x86/include/asm/hvm/svm/vmcb.h | 18 +- 4 files changed, 18 insertions(+), 18 deletions(-) diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c index e4e01add8c..37548cf05c 100644 --- a/xen/arch/x86/hvm/svm/nestedsvm.c +++ b/xen/arch/x86/hvm/svm/nestedsvm.c @@ -571,7 +571,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs) if ( nestedhvm_paging_mode_hap(v) ) { /* host nested paging + guest nested paging. */ -n2vmcb->_np_enable = 1; +n2vmcb->_np = true; nestedsvm_vmcb_set_nestedp2m(v, ns_vmcb, n2vmcb); @@ -585,7 +585,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs) else if ( paging_mode_hap(v->domain) ) { /* host nested paging + guest shadow paging. */ -n2vmcb->_np_enable = 1; +n2vmcb->_np = true; /* Keep h_cr3 as it is. */ n2vmcb->_h_cr3 = n1vmcb->_h_cr3; /* When l1 guest does shadow paging @@ -601,7 +601,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs) else { /* host shadow paging + guest shadow paging. 
*/ -n2vmcb->_np_enable = 0; +n2vmcb->_np = false; n2vmcb->_h_cr3 = 0x0; /* TODO: Once shadow-shadow paging is in place come back to here @@ -706,7 +706,7 @@ nsvm_vcpu_vmentry(struct vcpu *v, struct cpu_user_regs *regs, } /* nested paging for the guest */ -svm->ns_hap_enabled = !!ns_vmcb->_np_enable; +svm->ns_hap_enabled = ns_vmcb->_np; /* Remember the V_INTR_MASK in hostflags */ svm->ns_hostflags.fields.vintrmask = !!ns_vmcb->_vintr.fields.intr_masking; @@ -1084,7 +1084,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs) if ( nestedhvm_paging_mode_hap(v) ) { /* host nested paging + guest nested paging. */ -ns_vmcb->_np_enable = n2vmcb->_np_enable; +ns_vmcb->_np = n2vmcb->_np; ns_vmcb->_cr3 = n2vmcb->_cr3; /* The vmcb->h_cr3 is the shadowed h_cr3. The original * unshadowed guest h_cr3 is kept in ns_vmcb->h_cr3, @@ -1093,7 +1093,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs) else if ( paging_mode_hap(v->domain) ) { /* host nested paging + guest shadow paging. */ -ns_vmcb->_np_enable = 0; +ns_vmcb->_np = false; /* Throw h_cr3 away. Guest is not allowed to set it or * it can break out, otherwise (security hole!) */ ns_vmcb->_h_cr3 = 0x0; @@ -1104,7 +1104,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs) else { /* host shadow paging + guest shadow paging. */ -ns_vmcb->_np_enable = 0; +ns_vmcb->_np = false; ns_vmcb->_h_cr3 = 0x0; /* The vmcb->_cr3 is the shadowed cr3. 
The original * unshadowed guest cr3 is kept in ns_vmcb->_cr3, diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c index b551eac807..1b38f6a494 100644 --- a/xen/arch/x86/hvm/svm/svm.c +++ b/xen/arch/x86/hvm/svm/svm.c @@ -473,7 +473,7 @@ static int svm_vmcb_restore(struct vcpu *v, struct hvm_hw_cpu *c) if ( paging_mode_hap(v->domain) ) { -vmcb_set_np_enable(vmcb, 1); +vmcb_set_np(vmcb, 1); vmcb_set_g_pat(vmcb, MSR_IA32_CR_PAT_RESET /* guest PAT */); vmcb_set_h_cr3(vmcb, pagetable_get_paddr(p2m_get_pagetable(p2m))); } diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c index 282fe7cdbe..770a0228d4 100644 --- a/xen/arch/x86/hvm/svm/vmcb.c +++ b/xen/arch/x86/hvm/svm/vmcb.c @@ -133,7 +133,7 @@ static int construct_vmcb(struct vcpu *v) if ( paging_mode_hap(v->domain) ) { -vmcb->_np_enable = 1; /* enable nested paging */ +vmcb->_np = true; /* enable nested paging */ vmcb->_g_pat = MSR_IA32_CR_PAT_RESET; /* guest PAT */ vmcb->_h_cr3 = pagetable_get_paddr( p2m_get_pagetable(p2m_get_hostp2m(v->domain))); diff --git a/xen/arch/x86/include/asm/hvm/svm/vmcb.h b/xen/arch/x86/include/asm/hvm/svm/vmcb.h index 91221ff4e2..76507576
[PATCH v2 3/3] x86/svmdebug: Print sev and sev_es vmcb bits
While the sev and sev_es bits are not yet enabled in Xen, including their status in the VMCB dump could be informational. Therefore, print it via svmdebug. Signed-off-by: Vaishali Thakkar --- Changes since v1: - Pretty printing --- xen/arch/x86/hvm/svm/svmdebug.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/hvm/svm/svmdebug.c b/xen/arch/x86/hvm/svm/svmdebug.c index 0d714c728c..ba5fa40071 100644 --- a/xen/arch/x86/hvm/svm/svmdebug.c +++ b/xen/arch/x86/hvm/svm/svmdebug.c @@ -51,8 +51,11 @@ void svm_vmcb_dump(const char *from, const struct vmcb_struct *vmcb) vmcb->exitcode, vmcb->exit_int_info.raw); printk("exitinfo1 = %#"PRIx64" exitinfo2 = %#"PRIx64"\n", vmcb->exitinfo1, vmcb->exitinfo2); -printk("np_ctrl = %#"PRIx64" asid = %#x\n", - vmcb_get_np_ctrl(vmcb), vmcb_get_asid(vmcb)); +printk("asid = %#x np_ctrl = %#"PRIx64" - %s%s%s\n", + vmcb_get_asid(vmcb), vmcb_get_np_ctrl(vmcb), + vmcb_get_np(vmcb) ? "NP" : "", + vmcb_get_sev(vmcb)? "SEV": "", + vmcb_get_sev_es(vmcb) ? "SEV_ES" : ""); printk("virtual vmload/vmsave = %d, virt_ext = %#"PRIx64"\n", vmcb->virt_ext.fields.vloadsave_enable, vmcb->virt_ext.bytes); printk("cpl = %d efer = %#"PRIx64" star = %#"PRIx64" lstar = %#"PRIx64"\n", -- 2.44.0
[PATCH v2 2/3] x86/svm: Drop the suffix _guest from vmcb bit
The suffix _guest is redundant for asid bit. Drop it to avoid adding extra code volume. Suggested-by: Andrew Cooper Signed-off-by: Vaishali Thakkar --- Changes since v1: - This patch wasn't part of v1. It's been added to address Andrew's suggestion. --- xen/arch/x86/hvm/svm/asid.c | 6 +++--- xen/arch/x86/hvm/svm/nestedsvm.c | 8 xen/arch/x86/hvm/svm/svmdebug.c | 4 ++-- xen/arch/x86/include/asm/hvm/svm/nestedsvm.h | 2 +- xen/arch/x86/include/asm/hvm/svm/vmcb.h | 6 +++--- 5 files changed, 13 insertions(+), 13 deletions(-) diff --git a/xen/arch/x86/hvm/svm/asid.c b/xen/arch/x86/hvm/svm/asid.c index 28e0dbc176..d5f70f8848 100644 --- a/xen/arch/x86/hvm/svm/asid.c +++ b/xen/arch/x86/hvm/svm/asid.c @@ -37,14 +37,14 @@ void svm_asid_handle_vmrun(void) /* ASID 0 indicates that ASIDs are disabled. */ if ( p_asid->asid == 0 ) { -vmcb_set_guest_asid(vmcb, 1); +vmcb_set_asid(vmcb, 1); vmcb->tlb_control = cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL; return; } -if ( vmcb_get_guest_asid(vmcb) != p_asid->asid ) -vmcb_set_guest_asid(vmcb, p_asid->asid); +if ( vmcb_get_asid(vmcb) != p_asid->asid ) +vmcb_set_asid(vmcb, p_asid->asid); vmcb->tlb_control = !need_flush ? TLB_CTRL_NO_FLUSH : diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c index 37548cf05c..adbd49b5a2 100644 --- a/xen/arch/x86/hvm/svm/nestedsvm.c +++ b/xen/arch/x86/hvm/svm/nestedsvm.c @@ -157,7 +157,7 @@ int cf_check nsvm_vcpu_reset(struct vcpu *v) svm->ns_hap_enabled = 0; svm->ns_vmcb_guestcr3 = 0; svm->ns_vmcb_hostcr3 = 0; -svm->ns_guest_asid = 0; +svm->ns_asid = 0; svm->ns_hostflags.bytes = 0; svm->ns_vmexit.exitinfo1 = 0; svm->ns_vmexit.exitinfo2 = 0; @@ -698,11 +698,11 @@ nsvm_vcpu_vmentry(struct vcpu *v, struct cpu_user_regs *regs, /* Convert explicitely to boolean. 
Deals with l1 guests * that use flush-by-asid w/o checking the cpuid bits */ nv->nv_flushp2m = !!ns_vmcb->tlb_control; -if ( svm->ns_guest_asid != ns_vmcb->_guest_asid ) +if ( svm->ns_asid != ns_vmcb->_asid ) { nv->nv_flushp2m = 1; hvm_asid_flush_vcpu_asid(_nestedhvm(v).nv_n2asid); -svm->ns_guest_asid = ns_vmcb->_guest_asid; +svm->ns_asid = ns_vmcb->_asid; } /* nested paging for the guest */ @@ -1046,7 +1046,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs) /* Keep it. It's maintainted by the l1 guest. */ /* ASID */ -/* ns_vmcb->_guest_asid = n2vmcb->_guest_asid; */ +/* ns_vmcb->_asid = n2vmcb->_asid; */ /* TLB control */ ns_vmcb->tlb_control = 0; diff --git a/xen/arch/x86/hvm/svm/svmdebug.c b/xen/arch/x86/hvm/svm/svmdebug.c index 24358c6eea..0d714c728c 100644 --- a/xen/arch/x86/hvm/svm/svmdebug.c +++ b/xen/arch/x86/hvm/svm/svmdebug.c @@ -51,8 +51,8 @@ void svm_vmcb_dump(const char *from, const struct vmcb_struct *vmcb) vmcb->exitcode, vmcb->exit_int_info.raw); printk("exitinfo1 = %#"PRIx64" exitinfo2 = %#"PRIx64"\n", vmcb->exitinfo1, vmcb->exitinfo2); -printk("np_ctrl = %#"PRIx64" guest_asid = %#x\n", - vmcb_get_np_ctrl(vmcb), vmcb_get_guest_asid(vmcb)); +printk("np_ctrl = %#"PRIx64" asid = %#x\n", + vmcb_get_np_ctrl(vmcb), vmcb_get_asid(vmcb)); printk("virtual vmload/vmsave = %d, virt_ext = %#"PRIx64"\n", vmcb->virt_ext.fields.vloadsave_enable, vmcb->virt_ext.bytes); printk("cpl = %d efer = %#"PRIx64" star = %#"PRIx64" lstar = %#"PRIx64"\n", diff --git a/xen/arch/x86/include/asm/hvm/svm/nestedsvm.h b/xen/arch/x86/include/asm/hvm/svm/nestedsvm.h index 406fc082b1..7767cd6080 100644 --- a/xen/arch/x86/include/asm/hvm/svm/nestedsvm.h +++ b/xen/arch/x86/include/asm/hvm/svm/nestedsvm.h @@ -51,7 +51,7 @@ struct nestedsvm { * the l1 guest nested page table */ uint64_t ns_vmcb_guestcr3, ns_vmcb_hostcr3; -uint32_t ns_guest_asid; +uint32_t ns_asid; bool ns_hap_enabled; diff --git a/xen/arch/x86/include/asm/hvm/svm/vmcb.h 
b/xen/arch/x86/include/asm/hvm/svm/vmcb.h index 76507576ba..5d539fcdcf 100644 --- a/xen/arch/x86/include/asm/hvm/svm/vmcb.h +++ b/xen/arch/x86/include/asm/hvm/svm/vmcb.h @@ -383,7 +383,7 @@ typedef union bool intercepts:1; /* 0: cr/dr/exception/general intercepts, * pause_filter_count, tsc_offset */ bool iopm:1; /* 1: iopm_base_pa, msrpm_base_pa */ -bool asid:1; /* 2: guest_asid */ +bool asid:1; /* 2: asid */ bool tpr:1;
[PATCH v2 0/3] x86/svm: Misc changes for a few vmcb bits
Hi, In this patchset, the first and second patches remove the unnecessary suffixes from a bunch of vmcb bits. The third patch prints the status of the sev and sev-es bits while dumping the VMCB. Changes since v1: - Address comments from Andrew and Jan - Add an extra patch to drop the suffix _guest as per Andrew's suggestion in one of the reviews - Address Andrew's comment with respect to pretty printing Vaishali Thakkar (3): x86/svm: Drop the _enabled suffix from vmcb bits x86/svm: Drop the suffix _guest from vmcb bit x86/svmdebug: Print sev and sev_es vmcb bits xen/arch/x86/hvm/svm/asid.c | 6 ++--- xen/arch/x86/hvm/svm/nestedsvm.c | 22 +- xen/arch/x86/hvm/svm/svm.c | 2 +- xen/arch/x86/hvm/svm/svmdebug.c | 7 -- xen/arch/x86/hvm/svm/vmcb.c | 2 +- xen/arch/x86/include/asm/hvm/svm/nestedsvm.h | 2 +- xen/arch/x86/include/asm/hvm/svm/vmcb.h | 24 ++-- 7 files changed, 34 insertions(+), 31 deletions(-) -- 2.44.0
Re: [PATCH 2/2] x86/svmdebug: Print sev and sev_es vmcb bits
On 3/8/24 00:34, Andrew Cooper wrote: On 07/03/2024 9:40 pm, Vaishali Thakkar wrote: diff --git a/xen/arch/x86/hvm/svm/svmdebug.c b/xen/arch/x86/hvm/svm/svmdebug.c index 24358c6eea..f54b426fb3 100644 --- a/xen/arch/x86/hvm/svm/svmdebug.c +++ b/xen/arch/x86/hvm/svm/svmdebug.c @@ -53,6 +53,8 @@ void svm_vmcb_dump(const char *from, const struct vmcb_struct *vmcb) vmcb->exitinfo1, vmcb->exitinfo2); printk("np_ctrl = %#"PRIx64" guest_asid = %#x\n", vmcb_get_np_ctrl(vmcb), vmcb_get_guest_asid(vmcb)); +printk("sev = %d sev_es = %d\n", + vmcb_get_sev(vmcb), vmcb_get_sev_es(vmcb)); Hmm. These are covered by the previous line printing all of np_ctrl. What about rearranging the previous line to be something like: printk("asid: %#x, np_ctrl: %#"PRIx64" -%s%s%s\n", vmcb->_asid, vmcb->_np_ctrl, vmcb->_np ? " NP" : "", vmcb->_sev ? " SEV" : "", ...); This is more compact (things like "guest" in "guest asid" is entirely redundant), and provides both the raw _np_ctrl field and a bit-by-bit decode on the same line, rather than having different parts of the info on different lines and bools written out in longhand? Good point. Will change it in the revised v2. See xen/arch/x86/spec_ctrl.c: print_details() for a rather more complete example of this style of printk() rendering for bits, including how to tabulate it for better readability. Thanks for pointing to the reference. ~Andrew
Re: [PATCH 1/2] x86/svm: Drop the _enabled suffix from vmcb bits
On 3/8/24 00:22, Andrew Cooper wrote: On 07/03/2024 9:40 pm, Vaishali Thakkar wrote: The suffix is redundant for np/sev/sev-es bits. Drop it to avoid adding extra code volume. Suggested-by: Andrew Cooper Signed-off-by: Vaishali Thakkar i Typo on the end of your email address? Oops, thanks for catching it. diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c index e4e01add8c..7e285cf85a 100644 --- a/xen/arch/x86/hvm/svm/nestedsvm.c +++ b/xen/arch/x86/hvm/svm/nestedsvm.c @@ -706,7 +706,7 @@ nsvm_vcpu_vmentry(struct vcpu *v, struct cpu_user_regs *regs, } /* nested paging for the guest */ -svm->ns_hap_enabled = !!ns_vmcb->_np_enable; +svm->ns_hap_enabled = !!ns_vmcb->_np; Because the type of _np is bool, the !! can be dropped too while changing this line. Thanks for the review. As I'm sending the revised patchset anyway, will fix both things in this patch too. It seems that this was missing cleanup from f57ae00635 "SVM: split _np_enable VMCB field". Anyway, Reviewed-by: Andrew Cooper and I'm happy to fix on commit.
[PATCH 0/2] x86/svm: Misc changes for a few vmcb bits
Hi,

In this patchset, the first patch removes an unnecessary suffix from a
bunch of vmcb bits, and the second patch prints the status of the sev
and sev-es bits while dumping the VMCB.

Vaishali Thakkar (2):
  x86/svm: Drop the _enabled suffix from vmcb bits
  x86/svmdebug: Print sev and sev_es vmcb bits

 xen/arch/x86/hvm/svm/nestedsvm.c        | 14 +++---
 xen/arch/x86/hvm/svm/svm.c              |  2 +-
 xen/arch/x86/hvm/svm/svmdebug.c         |  2 ++
 xen/arch/x86/hvm/svm/vmcb.c             |  2 +-
 xen/arch/x86/include/asm/hvm/svm/vmcb.h | 18 +-
 5 files changed, 20 insertions(+), 18 deletions(-)

-- 
2.44.0
[PATCH 2/2] x86/svmdebug: Print sev and sev_es vmcb bits
While the sev and sev_es bits are not yet enabled in Xen, including
their status in the VMCB dump could be informational. Therefore, print
them via svmdebug.

Signed-off-by: Vaishali Thakkar
---
JFYI, we'll send the follow-up patches with the enablement of sev and
the ASP driver.
---
 xen/arch/x86/hvm/svm/svmdebug.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/x86/hvm/svm/svmdebug.c b/xen/arch/x86/hvm/svm/svmdebug.c
index 24358c6eea..f54b426fb3 100644
--- a/xen/arch/x86/hvm/svm/svmdebug.c
+++ b/xen/arch/x86/hvm/svm/svmdebug.c
@@ -53,6 +53,8 @@ void svm_vmcb_dump(const char *from, const struct vmcb_struct *vmcb)
            vmcb->exitinfo1, vmcb->exitinfo2);
     printk("np_ctrl = %#"PRIx64" guest_asid = %#x\n",
            vmcb_get_np_ctrl(vmcb), vmcb_get_guest_asid(vmcb));
+    printk("sev = %d sev_es = %d\n",
+           vmcb_get_sev(vmcb), vmcb_get_sev_es(vmcb));
     printk("virtual vmload/vmsave = %d, virt_ext = %#"PRIx64"\n",
            vmcb->virt_ext.fields.vloadsave_enable, vmcb->virt_ext.bytes);
     printk("cpl = %d efer = %#"PRIx64" star = %#"PRIx64" lstar = %#"PRIx64"\n",
-- 
2.44.0
[PATCH 1/2] x86/svm: Drop the _enabled suffix from vmcb bits
The suffix is redundant for np/sev/sev-es bits. Drop it to avoid
adding extra code volume.

Suggested-by: Andrew Cooper
Signed-off-by: Vaishali Thakkar i
---
 xen/arch/x86/hvm/svm/nestedsvm.c        | 14 +++---
 xen/arch/x86/hvm/svm/svm.c              |  2 +-
 xen/arch/x86/hvm/svm/vmcb.c             |  2 +-
 xen/arch/x86/include/asm/hvm/svm/vmcb.h | 18 +-
 4 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index e4e01add8c..7e285cf85a 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -571,7 +571,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs)
     if ( nestedhvm_paging_mode_hap(v) )
     {
         /* host nested paging + guest nested paging. */
-        n2vmcb->_np_enable = 1;
+        n2vmcb->_np = 1;

         nestedsvm_vmcb_set_nestedp2m(v, ns_vmcb, n2vmcb);
@@ -585,7 +585,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs)
     else if ( paging_mode_hap(v->domain) )
     {
         /* host nested paging + guest shadow paging. */
-        n2vmcb->_np_enable = 1;
+        n2vmcb->_np = 1;
         /* Keep h_cr3 as it is. */
         n2vmcb->_h_cr3 = n1vmcb->_h_cr3;
         /* When l1 guest does shadow paging
@@ -601,7 +601,7 @@ static int nsvm_vmcb_prepare4vmrun(struct vcpu *v, struct cpu_user_regs *regs)
     else
     {
         /* host shadow paging + guest shadow paging. */
-        n2vmcb->_np_enable = 0;
+        n2vmcb->_np = 0;
         n2vmcb->_h_cr3 = 0x0;

         /* TODO: Once shadow-shadow paging is in place come back to here
@@ -706,7 +706,7 @@ nsvm_vcpu_vmentry(struct vcpu *v, struct cpu_user_regs *regs,
     }

     /* nested paging for the guest */
-    svm->ns_hap_enabled = !!ns_vmcb->_np_enable;
+    svm->ns_hap_enabled = !!ns_vmcb->_np;

     /* Remember the V_INTR_MASK in hostflags */
     svm->ns_hostflags.fields.vintrmask = !!ns_vmcb->_vintr.fields.intr_masking;
@@ -1084,7 +1084,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs)
     if ( nestedhvm_paging_mode_hap(v) )
     {
         /* host nested paging + guest nested paging. */
-        ns_vmcb->_np_enable = n2vmcb->_np_enable;
+        ns_vmcb->_np = n2vmcb->_np;
         ns_vmcb->_cr3 = n2vmcb->_cr3;
         /* The vmcb->h_cr3 is the shadowed h_cr3. The original
          * unshadowed guest h_cr3 is kept in ns_vmcb->h_cr3,
@@ -1093,7 +1093,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs)
     else if ( paging_mode_hap(v->domain) )
     {
         /* host nested paging + guest shadow paging. */
-        ns_vmcb->_np_enable = 0;
+        ns_vmcb->_np = 0;
         /* Throw h_cr3 away. Guest is not allowed to set it or
          * it can break out, otherwise (security hole!) */
         ns_vmcb->_h_cr3 = 0x0;
@@ -1104,7 +1104,7 @@ nsvm_vmcb_prepare4vmexit(struct vcpu *v, struct cpu_user_regs *regs)
     else
     {
         /* host shadow paging + guest shadow paging. */
-        ns_vmcb->_np_enable = 0;
+        ns_vmcb->_np = 0;
         ns_vmcb->_h_cr3 = 0x0;
         /* The vmcb->_cr3 is the shadowed cr3. The original
          * unshadowed guest cr3 is kept in ns_vmcb->_cr3,
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index b551eac807..1b38f6a494 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -473,7 +473,7 @@ static int svm_vmcb_restore(struct vcpu *v, struct hvm_hw_cpu *c)
     if ( paging_mode_hap(v->domain) )
     {
-        vmcb_set_np_enable(vmcb, 1);
+        vmcb_set_np(vmcb, 1);
         vmcb_set_g_pat(vmcb, MSR_IA32_CR_PAT_RESET /* guest PAT */);
         vmcb_set_h_cr3(vmcb, pagetable_get_paddr(p2m_get_pagetable(p2m)));
     }
diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
index 282fe7cdbe..a8dd87ca36 100644
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -133,7 +133,7 @@ static int construct_vmcb(struct vcpu *v)
     if ( paging_mode_hap(v->domain) )
     {
-        vmcb->_np_enable = 1; /* enable nested paging */
+        vmcb->_np = 1; /* enable nested paging */
         vmcb->_g_pat = MSR_IA32_CR_PAT_RESET; /* guest PAT */
         vmcb->_h_cr3 = pagetable_get_paddr(
             p2m_get_pagetable(p2m_get_hostp2m(v->domain)));
diff --git a/xen/arch/x86/include/asm/hvm/svm/vmcb.h b/xen/arch/x86/include/asm/hvm/svm/vmcb.h
index 91221ff4e2..76507576ba 100644
--- a/xen/arch/x86/include/asm/hvm/svm/vmcb.h
+++ b/xen/arch/x86/include/asm/hvm/svm/vmcb.h
@@ -473,12 +473,12 @@ struct vmcb_struct {
     intinfo_t exit_int_info;    /* offset 0x88 */
     union {                     /* offset 0x90 - cleanbit 4 */
         struct {
-            bool _np_enable :1;
-            bool _sev_enable:1;
-
AMD SEV Enablement plans in Xen
Hi All,

This is an informational post about our plans for the enablement of AMD
SEV support in Xen. This work will be done as part of the Hyper OpenX
project[1].

Phase Zero:

Our primary intention is to gather the necessary information required to
commence the upstream work for this project. This phase also encompasses
the development of a small demo to demonstrate the SME technology
itself. Please note, this small demo won't be part of the upstream work,
as its primary purpose is to demonstrate the SME technology within the
Hyper OpenX project, and it is not integral to the enablement work in
Xen. The demo code will be open source, though, so feel free to keep an
eye on this repo[2] as we'll add the related links there.

Phase One:

The main goal of this phase is to achieve basic SEV support using XTF
guests. This work will entail:

- Xen adaptations to make it compliant with SEV technology (for
  example, how Xen currently manages multiple ASIDs for a single VM)
- An ASP/PSP driver for platform and key management
- An ASID allocation mechanism

Dom0:
- Introduction of a security hypercall and an ASP sub-op
- Support for enabling SEV during guest creation

DomU XTF:
- Support for the security hypercall and ASP sub-op
- Support for the C bit
- Test cases for OSS-Test to launch the XTF guest

Phase Two:

This phase emphasizes achieving full support for PVH VMs with
paravirtualized devices capable of running in the SEV-ES environment.
The primary tasks include:

- SEV-ES support addition
- GHCB MSR protocol implementation and a #VC handler
- Enhancements in the PV protocol related to the PV devices framework
  (Xenstore/Xen console)
- Adjustments in VMEXIT handling
- Establishing ABI rules for the HVM ABI redesign
- Dom0 developments concerning the HVM ABI redesign
- OSS Test: PVH Linux+initrd+metadata+signature mimicking the phase one
  XTF test
- XTF (testing): comprehensive test cases for the new HVM ABI

Phase Three:

This phase revolves around enabling SEV-SNP support for PVH Linux
guests. Predominantly, this will require:

- Addition of the alternative SNP commands supporting the API, and
  extending the flow in the PSP/ASP driver
- Hypercall expansion for domain creation
- Developments in RMP management
- Dom0 and DomU developments related to the enablement of SNP in guests
- Testing that includes support for guest RMP instructions

We're also looking forward to integrating CI and documenting the
various project stages. Like any significant upstream project,
implementation details may change as we advance. However, we are
committed to collaborating and communicating with the Xen community
regarding any modifications.

We would also like to thank Andrew and folks from Apertus Solutions for
doing the early research with regards to defining the tasks for the AMD
SEV-SNP enablement in Xen.

Please don't hesitate to reply here or email me & Andrei (CC'ed here)
if you have any further inquiries.

Thank you,
Vaishali

[1] https://www.lemondeinformatique.fr/actualites/lire-hyper-open-x-sort-de-terre-avec-10-meteuro-de-financements-90954.html
[2] https://github.com/xcp-ng/hyper-sev-project