Re: [PATCH 1/2] X86/KVM: Properly update 'tsc_offset' to represent the running guest
On Fri, 2018-04-13 at 17:35 +0200, Paolo Bonzini wrote:
> On 13/04/2018 14:40, Raslan, KarimAllah wrote:
> > >
> > >  static void update_ia32_tsc_adjust_msr(struct kvm_vcpu *vcpu, s64 offset)
> > >  {
> > > -	u64 curr_offset = vcpu->arch.tsc_offset;
> > > +	u64 curr_offset = kvm_x86_ops->read_l1_tsc_offset(vcpu);
> > I might be missing something, but is this strictly needed, or is it
> > really a bug?
> >
> > I can see update_ia32_tsc_adjust_msr called from kvm_write_tsc only,
> > which is called from a) vmx_set_msr or b) kvm_arch_vcpu_postcreate.
> > The adjust_msr would only be called if !host_initiated. So only
> > vmx_set_msr, which is coming from an L1 write (or a restore, but that
> > would not be !host_initiated). So the only time that tsc_adjust is
> > called is when !is_guest_mode.
>
> It can also be called from guest mode if the MSR bitmap says there's no
> L1 vmexit for that MSR; that's what the testcases do.

Apparently I will never wrap my head around this nested stuff :D

>
> Paolo
> > >
> > >  	vcpu->arch.ia32_tsc_adjust_msr += offset - curr_offset;
> >
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
Re: [PATCH 1/2] X86/KVM: Properly update 'tsc_offset' to represent the running guest
On Fri, 2018-04-13 at 18:04 +0200, Paolo Bonzini wrote:
> On 13/04/2018 18:02, Jim Mattson wrote:
> > On Fri, Apr 13, 2018 at 4:23 AM, Paolo Bonzini wrote:
> > > From: KarimAllah Ahmed
> > >
> > > Update 'tsc_offset' on vmentry/vmexit of L2 guests to ensure that it always
> > > captures the TSC_OFFSET of the running guest whether it is the L1 or L2
> > > guest.
> > >
> > > Cc: Jim Mattson
> > > Cc: Paolo Bonzini
> > > Cc: Radim Krčmář
> > > Cc: k...@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > Suggested-by: Paolo Bonzini
> > > Signed-off-by: KarimAllah Ahmed
> > > [AMD changes, fix update_ia32_tsc_adjust_msr. - Paolo]
> > > Signed-off-by: Paolo Bonzini
> >
> > > @@ -11489,6 +11497,9 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
> > >  	if (enable_shadow_vmcs)
> > >  		copy_shadow_to_vmcs12(vmx);
> > >
> > > +	if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING)
> > > +		vcpu->arch.tsc_offset += vmcs12->tsc_offset;
> > > +
> >
> > This seems a little early, since we don't restore the L1 TSC offset on
> > the nested_vmx_failValid path.
>
> Now this can be a nice one to introduce the VMX API tests. :)  I'll try
> to do it on Monday as punishment for not noticing the bug.  In the
> meanwhile, Karim, can you post a fixed version?

done

>
> Paolo
Re: [PATCH 1/2] X86/KVM: Properly update 'tsc_offset' to represent the running guest
On 13/04/2018 18:02, Jim Mattson wrote:
> On Fri, Apr 13, 2018 at 4:23 AM, Paolo Bonzini wrote:
>> From: KarimAllah Ahmed
>>
>> Update 'tsc_offset' on vmentry/vmexit of L2 guests to ensure that it always
>> captures the TSC_OFFSET of the running guest whether it is the L1 or L2
>> guest.
>>
>> Cc: Jim Mattson
>> Cc: Paolo Bonzini
>> Cc: Radim Krčmář
>> Cc: k...@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Suggested-by: Paolo Bonzini
>> Signed-off-by: KarimAllah Ahmed
>> [AMD changes, fix update_ia32_tsc_adjust_msr. - Paolo]
>> Signed-off-by: Paolo Bonzini
>
>> @@ -11489,6 +11497,9 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
>>  	if (enable_shadow_vmcs)
>>  		copy_shadow_to_vmcs12(vmx);
>>
>> +	if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING)
>> +		vcpu->arch.tsc_offset += vmcs12->tsc_offset;
>> +
>
> This seems a little early, since we don't restore the L1 TSC offset on
> the nested_vmx_failValid path.
>

Now this can be a nice one to introduce the VMX API tests. :)  I'll try
to do it on Monday as punishment for not noticing the bug.  In the
meanwhile, Karim, can you post a fixed version?

Paolo
Re: [PATCH 1/2] X86/KVM: Properly update 'tsc_offset' to represent the running guest
On Fri, Apr 13, 2018 at 4:23 AM, Paolo Bonzini wrote:
> From: KarimAllah Ahmed
>
> Update 'tsc_offset' on vmentry/vmexit of L2 guests to ensure that it always
> captures the TSC_OFFSET of the running guest whether it is the L1 or L2
> guest.
>
> Cc: Jim Mattson
> Cc: Paolo Bonzini
> Cc: Radim Krčmář
> Cc: k...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Suggested-by: Paolo Bonzini
> Signed-off-by: KarimAllah Ahmed
> [AMD changes, fix update_ia32_tsc_adjust_msr. - Paolo]
> Signed-off-by: Paolo Bonzini

> @@ -11489,6 +11497,9 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
>  	if (enable_shadow_vmcs)
>  		copy_shadow_to_vmcs12(vmx);
>
> +	if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING)
> +		vcpu->arch.tsc_offset += vmcs12->tsc_offset;
> +

This seems a little early, since we don't restore the L1 TSC offset on
the nested_vmx_failValid path.
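To make the ordering concern concrete, here is a minimal user-space sketch in plain C. It is not the posted patch and not the fix that followed; every `toy_*` name is hypothetical, and `checks_pass` merely stands in for the vmcs12 validity checks that can end on the nested_vmx_failValid path. The idea it illustrates is that the combined offset should only be installed once entry can no longer fail (or else be undone on failure), and removed again on nested vmexit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the KVM structures. */
struct toy_vmcs12 {
	bool use_tsc_offsetting;	/* models CPU_BASED_USE_TSC_OFFSETING */
	uint64_t tsc_offset;		/* L2's offset relative to L1 */
};

struct toy_vcpu {
	bool guest_mode;		/* models is_guest_mode() */
	uint64_t tsc_offset;		/* offset of whichever guest is running */
};

/*
 * Attempt a nested entry. 'checks_pass' stands in for the checks that
 * may fail the entry; on failure the L1 offset must be left intact.
 */
static bool toy_nested_run(struct toy_vcpu *vcpu,
			   const struct toy_vmcs12 *vmcs12, bool checks_pass)
{
	assert(!vcpu->guest_mode);

	if (!checks_pass)
		return false;	/* failure path: tsc_offset is untouched */

	/* Entry can no longer fail, so switch to the combined offset. */
	if (vmcs12->use_tsc_offsetting)
		vcpu->tsc_offset += vmcs12->tsc_offset;
	vcpu->guest_mode = true;
	return true;
}

/* Leave L2: drop L2's contribution so tsc_offset is L1's again. */
static void toy_nested_vmexit(struct toy_vcpu *vcpu,
			      const struct toy_vmcs12 *vmcs12)
{
	assert(vcpu->guest_mode);

	if (vmcs12->use_tsc_offsetting)
		vcpu->tsc_offset -= vmcs12->tsc_offset;
	vcpu->guest_mode = false;
}
```

The offsets are kept as unsigned 64-bit values, so a negative L2 offset is simply its two's-complement encoding and the additions wrap as intended.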
Re: [PATCH 1/2] X86/KVM: Properly update 'tsc_offset' to represent the running guest
On 13/04/2018 14:40, Raslan, KarimAllah wrote:
>>
>>  static void update_ia32_tsc_adjust_msr(struct kvm_vcpu *vcpu, s64 offset)
>>  {
>> -	u64 curr_offset = vcpu->arch.tsc_offset;
>> +	u64 curr_offset = kvm_x86_ops->read_l1_tsc_offset(vcpu);
> I might be missing something, but is this strictly needed, or is it
> really a bug?
>
> I can see update_ia32_tsc_adjust_msr called from kvm_write_tsc only,
> which is called from a) vmx_set_msr or b) kvm_arch_vcpu_postcreate.
> The adjust_msr would only be called if !host_initiated. So only
> vmx_set_msr, which is coming from an L1 write (or a restore, but that
> would not be !host_initiated). So the only time that tsc_adjust is
> called is when !is_guest_mode.

It can also be called from guest mode if the MSR bitmap says there's no
L1 vmexit for that MSR; that's what the testcases do.

Paolo

>>  	vcpu->arch.ia32_tsc_adjust_msr += offset - curr_offset;
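A small stand-alone sketch of why the L1 offset matters here, assuming the semantics described in this thread: while L2 runs, 'tsc_offset' holds the combined L1+L2 value, so the TSC_ADJUST delta has to be computed against L1's offset alone. The `toy_*` names and the flattened struct are hypothetical simplifications, not KVM code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, flattened model of the vCPU state discussed above. */
struct toy_vcpu {
	bool guest_mode;		/* models is_guest_mode() */
	bool l2_uses_tsc_offsetting;	/* models CPU_BASED_USE_TSC_OFFSETING */
	uint64_t tsc_offset;		/* combined L1+L2 offset while L2 runs */
	uint64_t l2_tsc_offset;		/* models vmcs12->tsc_offset */
	int64_t ia32_tsc_adjust_msr;
};

/* Recover L1's offset by removing L2's contribution, if there is one. */
static uint64_t toy_read_l1_tsc_offset(const struct toy_vcpu *vcpu)
{
	if (vcpu->guest_mode && vcpu->l2_uses_tsc_offsetting)
		return vcpu->tsc_offset - vcpu->l2_tsc_offset;

	return vcpu->tsc_offset;
}

/* Adjust TSC_ADJUST by the change in L1's offset, not the combined one. */
static void toy_update_tsc_adjust(struct toy_vcpu *vcpu, int64_t new_l1_offset)
{
	uint64_t curr_offset = toy_read_l1_tsc_offset(vcpu);

	vcpu->ia32_tsc_adjust_msr +=
		(int64_t)((uint64_t)new_l1_offset - curr_offset);
}

/* The case from the thread: a TSC write arriving while L2 is running. */
static void toy_demo(void)
{
	/* L1 offset 1000, L2 offset -300, so the combined offset is 700. */
	struct toy_vcpu v = { true, true, 700, (uint64_t)-300, 0 };

	toy_update_tsc_adjust(&v, 1200);

	/* Delta is 1200 - 1000; using the combined 700 would give 500. */
	assert(v.ia32_tsc_adjust_msr == 200);
}
```

Had `curr_offset` been taken from the combined value, the adjustment in `toy_demo` would be off by exactly L2's offset, which is the error the one-line change in the quoted hunk avoids.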
Re: [PATCH 1/2] X86/KVM: Properly update 'tsc_offset' to represent the running guest
On Fri, 2018-04-13 at 13:23 +0200, Paolo Bonzini wrote:
> From: KarimAllah Ahmed
>
> Update 'tsc_offset' on vmentry/vmexit of L2 guests to ensure that it always
> captures the TSC_OFFSET of the running guest whether it is the L1 or L2
> guest.
>
> Cc: Jim Mattson
> Cc: Paolo Bonzini
> Cc: Radim Krčmář
> Cc: k...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Suggested-by: Paolo Bonzini
> Signed-off-by: KarimAllah Ahmed
> [AMD changes, fix update_ia32_tsc_adjust_msr. - Paolo]
> Signed-off-by: Paolo Bonzini
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/svm.c              | 17 -
>  arch/x86/kvm/vmx.c              | 25 -
>  arch/x86/kvm/x86.c              |  6 --
>  4 files changed, 41 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 949c977bc4c9..c25775fad4ed 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1013,6 +1013,7 @@ struct kvm_x86_ops {
>
>  	bool (*has_wbinvd_exit)(void);
>
> +	u64 (*read_l1_tsc_offset)(struct kvm_vcpu *vcpu);
>  	void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
>
>  	void (*get_exit_info)(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2);
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index b3ebc8ad6891..ea7c6d29aca5 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c

Thank you for adding the AMD bits. I did not have a machine to test them
on, so I left that part untouched :)

> @@ -1424,12 +1424,23 @@ static void init_sys_seg(struct vmcb_seg *seg, uint32_t type)
>  	seg->base = 0;
>  }
>
> +static u64 svm_read_l1_tsc_offset(struct kvm_vcpu *vcpu)
> +{
> +	struct vcpu_svm *svm = to_svm(vcpu);
> +
> +	if (is_guest_mode(vcpu))
> +		return svm->nested.hsave->control.tsc_offset;
> +
> +	return vcpu->arch.tsc_offset;
> +}
> +
>  static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
>  {
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  	u64 g_tsc_offset = 0;
>
>  	if (is_guest_mode(vcpu)) {
> +		/* Write L1's TSC offset. */
>  		g_tsc_offset = svm->vmcb->control.tsc_offset -
>  			       svm->nested.hsave->control.tsc_offset;
>  		svm->nested.hsave->control.tsc_offset = offset;
> @@ -3323,6 +3334,7 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
>  	/* Restore the original control entries */
>  	copy_vmcb_control_area(vmcb, hsave);
>
> +	vcpu->arch.tsc_offset = svm->vmcb->control.tsc_offset;
>  	kvm_clear_exception_queue(&svm->vcpu);
>  	kvm_clear_interrupt_queue(&svm->vcpu);
>
> @@ -3483,10 +3495,12 @@ static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
>  	/* We don't want to see VMMCALLs from a nested guest */
>  	clr_intercept(svm, INTERCEPT_VMMCALL);
>
> +	vcpu->arch.tsc_offset += nested_vmcb->control.tsc_offset;
> +	svm->vmcb->control.tsc_offset = vcpu->arch.tsc_offset;
> +
>  	svm->vmcb->control.virt_ext = nested_vmcb->control.virt_ext;
>  	svm->vmcb->control.int_vector = nested_vmcb->control.int_vector;
>  	svm->vmcb->control.int_state = nested_vmcb->control.int_state;
> -	svm->vmcb->control.tsc_offset += nested_vmcb->control.tsc_offset;
>  	svm->vmcb->control.event_inj = nested_vmcb->control.event_inj;
>  	svm->vmcb->control.event_inj_err = nested_vmcb->control.event_inj_err;
>
> @@ -7102,6 +7116,7 @@ static int svm_unregister_enc_region(struct kvm *kvm,
>
>  	.has_wbinvd_exit = svm_has_wbinvd_exit,
>
> +	.read_l1_tsc_offset = svm_read_l1_tsc_offset,
>  	.write_tsc_offset = svm_write_tsc_offset,
>
>  	.set_tdp_cr3 = set_tdp_cr3,
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index a13c603bdefb..6553419202ee 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2874,6 +2874,17 @@ static void setup_msrs(struct vcpu_vmx *vmx)
>  		vmx_update_msr_bitmap(&vmx->vcpu);
>  }
>
> +static u64 vmx_read_l1_tsc_offset(struct kvm_vcpu *vcpu)
> +{
> +	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> +
> +	if (is_guest_mode(vcpu) &&
> +	    (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING))
> +		return vcpu->arch.tsc_offset - vmcs12->tsc_offset;
> +
> +	return vcpu->arch.tsc_offset;
> +}
> +
>  /*
>   * reads and returns guest's timestamp counter "register"
>   * guest_tsc = (host_tsc * tsc multiplier) >> 48 + tsc_offset
> @@ -11175,11 +11186,8 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
>  		vmcs_write64(GUEST_IA32_PAT, vmx->vcpu.arch.pat);
>  	}
>
> -	if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING)
> -		vmcs_write64(TSC_OFFSET,
> -			vcpu->arch.tsc_offset + vmcs12->tsc_offset);
> -	else
> -		vmcs_write64(TSC_OFFSET, vcpu->arch.tsc_offset);
> +