Re: [PATCH v3 3/9] KVM: x86: Defer tick-based accounting 'til after IRQ handling

2021-04-20 Thread Sean Christopherson
On Wed, Apr 21, 2021, Frederic Weisbecker wrote:
> On Thu, Apr 15, 2021 at 03:21:00PM -0700, Sean Christopherson wrote:
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 16fb39503296..e4d475df1d4a 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -9230,6 +9230,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > local_irq_disable();
> > kvm_after_interrupt(vcpu);
> >  
> > +   /*
> > +* When using tick-based accounting, wait until after servicing IRQs to
> > +* account guest time so that any ticks that occurred while running the
> > +* guest are properly accounted to the guest.
> > +*/
> > +   if (!vtime_accounting_enabled_this_cpu())
> > +   vtime_account_guest_exit();
> 
> Can we rather have instead:
> 
> static inline void tick_account_guest_exit(void)
> {
> 	if (!vtime_accounting_enabled_this_cpu())
> 		current->flags &= ~PF_VCPU;
> }
> 
> It duplicates a bit of code but I think this will read less confusing.

Either way works for me.  I used vtime_account_guest_exit() to try to keep as
many details as possible inside vtime, e.g. in case the implementation is tweaked
in the future.  But I agree that pretending KVM isn't already deeply intertwined
with the details is a lie.
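
As a rough illustration of the split under discussion, the userspace sketch below models two exit-accounting call sites guarded by complementary checks, so that exactly one of them clears the flag per VM-Exit. Everything prefixed with toy_ or TOY_ is hypothetical and exists only for this sketch; the real vtime path does more than clear PF_VCPU, and only the flag handling is modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PF_VCPU; not the kernel's definition. */
#define TOY_PF_VCPU 0x1u

struct toy_task {
	unsigned int flags;        /* models current->flags */
	bool vtime_enabled;        /* models vtime_accounting_enabled_this_cpu() */
	int exit_accounting_calls; /* how often guest exit was accounted */
};

/* Early call site (IRQs still off): precise accounting only. */
static void toy_vtime_account_guest_exit(struct toy_task *t)
{
	if (t->vtime_enabled) {
		t->flags &= ~TOY_PF_VCPU;
		t->exit_accounting_calls++;
	}
}

/* Late call site (after IRQs were serviced): tick-based accounting only. */
static void toy_tick_account_guest_exit(struct toy_task *t)
{
	if (!t->vtime_enabled) {
		t->flags &= ~TOY_PF_VCPU;
		t->exit_accounting_calls++;
	}
}

/* Runs one modeled VM-Exit and returns how many sites did the accounting. */
static int toy_exit_sequence(bool vtime_enabled)
{
	struct toy_task t = {
		.flags = TOY_PF_VCPU,
		.vtime_enabled = vtime_enabled,
	};

	toy_vtime_account_guest_exit(&t);
	/* Pending IRQs would be serviced at this point. */
	toy_tick_account_guest_exit(&t);

	assert(!(t.flags & TOY_PF_VCPU)); /* flag is always cleared by the end */
	return t.exit_accounting_calls;
}
```

Calling toy_exit_sequence(true) and toy_exit_sequence(false) should each return 1: whichever mode is active, only one of the two sites acts, which is the property both spellings of the late check rely on.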


Re: [PATCH v3 3/9] KVM: x86: Defer tick-based accounting 'til after IRQ handling

2021-04-20 Thread Frederic Weisbecker
On Thu, Apr 15, 2021 at 03:21:00PM -0700, Sean Christopherson wrote:
> From: Wanpeng Li 
> 
> When using tick-based accounting, defer the call to account guest time
> until after servicing any IRQ(s) that happened in the guest or
> immediately after VM-Exit.  Tick-based accounting of vCPU time relies on
> PF_VCPU being set when the tick IRQ handler runs, and IRQs are blocked
> throughout {svm,vmx}_vcpu_enter_exit().
> 
> This fixes a bug[*] where reported guest time remains '0', even when
> running an infinite loop in the guest.
> 
> [*] https://bugzilla.kernel.org/show_bug.cgi?id=209831
> 
> Fixes: 87fa7f3e98a131 ("x86/kvm: Move context tracking where it belongs")
> Cc: Thomas Gleixner 
> Cc: Sean Christopherson 
> Cc: Michael Tokarev 
> Cc: sta...@vger.kernel.org#v5.9-rc1+
> Suggested-by: Thomas Gleixner 
> Signed-off-by: Wanpeng Li 
> Co-developed-by: Sean Christopherson 
> Signed-off-by: Sean Christopherson 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 16fb39503296..e4d475df1d4a 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9230,6 +9230,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>   local_irq_disable();
>   kvm_after_interrupt(vcpu);
>  
> + /*
> +  * When using tick-based accounting, wait until after servicing IRQs to
> +  * account guest time so that any ticks that occurred while running the
> +  * guest are properly accounted to the guest.
> +  */
> + if (!vtime_accounting_enabled_this_cpu())
> + vtime_account_guest_exit();

Can we rather have instead:

static inline void tick_account_guest_exit(void)
{
	if (!vtime_accounting_enabled_this_cpu())
		current->flags &= ~PF_VCPU;
}

It duplicates a bit of code but I think this will read less confusing.

Thanks.

> +
>   if (lapic_in_kernel(vcpu)) {
>   s64 delta = vcpu->arch.apic->lapic_timer.advance_expire_delta;
>   if (delta != S64_MIN) {
> -- 
> 2.31.1.368.gbe11c130af-goog
> 


[PATCH v3 3/9] KVM: x86: Defer tick-based accounting 'til after IRQ handling

2021-04-15 Thread Sean Christopherson
From: Wanpeng Li 

When using tick-based accounting, defer the call to account guest time
until after servicing any IRQ(s) that happened in the guest or
immediately after VM-Exit.  Tick-based accounting of vCPU time relies on
PF_VCPU being set when the tick IRQ handler runs, and IRQs are blocked
throughout {svm,vmx}_vcpu_enter_exit().

This fixes a bug[*] where reported guest time remains '0', even when
running an infinite loop in the guest.

[*] https://bugzilla.kernel.org/show_bug.cgi?id=209831

Fixes: 87fa7f3e98a131 ("x86/kvm: Move context tracking where it belongs")
Cc: Thomas Gleixner 
Cc: Sean Christopherson 
Cc: Michael Tokarev 
Cc: sta...@vger.kernel.org#v5.9-rc1+
Suggested-by: Thomas Gleixner 
Signed-off-by: Wanpeng Li 
Co-developed-by: Sean Christopherson 
Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/svm/svm.c | 13 ++++++++++---
 arch/x86/kvm/vmx/vmx.c | 13 ++++++++++---
 arch/x86/kvm/x86.c     |  8 ++++++++
 3 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 48b396f33bee..bb2aa0dde7c5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3750,17 +3750,24 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 * have them in state 'on' as recorded before entering guest mode.
 * Same as enter_from_user_mode().
 *
-* guest_exit_irqoff() restores host context and reinstates RCU if
-* enabled and required.
+* context_tracking_guest_exit_irqoff() restores host context and
+* reinstates RCU if enabled and required.
 *
 * This needs to be done before the below as native_read_msr()
 * contains a tracepoint and x86_spec_ctrl_restore_host() calls
 * into world and some more.
 */
lockdep_hardirqs_off(CALLER_ADDR0);
-   guest_exit_irqoff();
+   context_tracking_guest_exit_irqoff();
 
instrumentation_begin();
+   /*
+* Account guest time when precise accounting based on context tracking
+* is enabled.  Tick based accounting is deferred until after IRQs that
+* occurred within the VM-Enter/VM-Exit "window" are handled.
+*/
+   if (vtime_accounting_enabled_this_cpu())
+   vtime_account_guest_exit();
trace_hardirqs_off_finish();
instrumentation_end();
 }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c05e6e2854b5..5ae9dc197048 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6639,17 +6639,24 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 * have them in state 'on' as recorded before entering guest mode.
 * Same as enter_from_user_mode().
 *
-* guest_exit_irqoff() restores host context and reinstates RCU if
-* enabled and required.
+* context_tracking_guest_exit_irqoff() restores host context and
+* reinstates RCU if enabled and required.
 *
 * This needs to be done before the below as native_read_msr()
 * contains a tracepoint and x86_spec_ctrl_restore_host() calls
 * into world and some more.
 */
lockdep_hardirqs_off(CALLER_ADDR0);
-   guest_exit_irqoff();
+   context_tracking_guest_exit_irqoff();
 
instrumentation_begin();
+   /*
+* Account guest time when precise accounting based on context tracking
+* is enabled.  Tick based accounting is deferred until after IRQs that
+* occurred within the VM-Enter/VM-Exit "window" are handled.
+*/
+   if (vtime_accounting_enabled_this_cpu())
+   vtime_account_guest_exit();
trace_hardirqs_off_finish();
instrumentation_end();
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 16fb39503296..e4d475df1d4a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9230,6 +9230,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
local_irq_disable();
kvm_after_interrupt(vcpu);
 
+   /*
+* When using tick-based accounting, wait until after servicing IRQs to
+* account guest time so that any ticks that occurred while running the
+* guest are properly accounted to the guest.
+*/
+   if (!vtime_accounting_enabled_this_cpu())
+   vtime_account_guest_exit();
+
if (lapic_in_kernel(vcpu)) {
s64 delta = vcpu->arch.apic->lapic_timer.advance_expire_delta;
if (delta != S64_MIN) {
-- 
2.31.1.368.gbe11c130af-goog
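
The ordering problem described in the changelog can be modeled in userspace. The sketch below is a toy under stated assumptions, not kernel code: every toy_ and TOY_ name is hypothetical, a single tick is assumed to arrive while IRQs are blocked, and the tick handler is reduced to "charge guest time if the PF_VCPU stand-in is set, else system time".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PF_VCPU; not the kernel's definition. */
#define TOY_PF_VCPU 0x1u

struct toy_cpu {
	unsigned int task_flags;    /* models current->flags */
	bool tick_pending;          /* tick arrived while IRQs were blocked */
	unsigned long guest_ticks;  /* ticks charged to guest time */
	unsigned long system_ticks; /* ticks charged to host system time */
};

/* Models the tick IRQ handler: the flag decides who is charged. */
static void toy_tick_handler(struct toy_cpu *cpu)
{
	if (cpu->task_flags & TOY_PF_VCPU)
		cpu->guest_ticks++;
	else
		cpu->system_ticks++;
}

/* Models the IRQ-enabled window after VM-Exit. */
static void toy_service_pending_irqs(struct toy_cpu *cpu)
{
	if (cpu->tick_pending) {
		cpu->tick_pending = false;
		toy_tick_handler(cpu);
	}
}

/* One guest run during which a tick fires with IRQs blocked. */
static void toy_run_guest_once(struct toy_cpu *cpu, bool defer_flag_clear)
{
	cpu->task_flags |= TOY_PF_VCPU;  /* guest entry */
	cpu->tick_pending = true;        /* tick arrives, cannot be handled yet */

	if (!defer_flag_clear)
		cpu->task_flags &= ~TOY_PF_VCPU; /* ordering before this patch */

	toy_service_pending_irqs(cpu);

	if (defer_flag_clear)
		cpu->task_flags &= ~TOY_PF_VCPU; /* ordering with this patch */
}

/* Returns the guest ticks charged for one run under the given ordering. */
static unsigned long toy_guest_ticks(bool defer_flag_clear)
{
	struct toy_cpu cpu = {0};

	toy_run_guest_once(&cpu, defer_flag_clear);
	assert(cpu.guest_ticks + cpu.system_ticks == 1); /* one tick, charged once */
	return cpu.guest_ticks;
}
```

In this model toy_guest_ticks(false) returns 0 and toy_guest_ticks(true) returns 1, which mirrors the reported symptom of guest time staying at '0' when the flag is cleared before the tick IRQ is handled.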