RE: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in SVM

2018-03-15 Thread Moger, Babu


> -Original Message-
> From: Radim Krčmář <rkrc...@redhat.com>
> Sent: Wednesday, March 14, 2018 8:26 AM
> To: Moger, Babu <babu.mo...@amd.com>
> Cc: j...@8bytes.org; t...@linutronix.de; mi...@redhat.com;
> h...@zytor.com; x...@kernel.org; pbonz...@redhat.com;
> k...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in
> SVM
> 
> 2018-03-10 05:07+, Moger, Babu:
> > Radim,
> >  Thanks for the comments. Taken care of most of the comments.
> >  I have few questions/comments. Please see inline.
> >
> > > -Original Message-
> > > From: Radim Krčmář <rkrc...@redhat.com>
> > > Sent: Friday, March 9, 2018 12:13 PM
> > > To: Moger, Babu <babu.mo...@amd.com>
> > > Cc: j...@8bytes.org; t...@linutronix.de; mi...@redhat.com;
> > > h...@zytor.com; x...@kernel.org; pbonz...@redhat.com;
> > > k...@vger.kernel.org; linux-kernel@vger.kernel.org
> > > Subject: Re: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit
> logic in
> > > SVM
> > >
> > > 2018-03-02 11:17-0500, Babu Moger:
> > > > Bring the PLE(pause loop exit) logic to AMD svm driver.
> > > > We have noticed it help in situations where numerous pauses are
> > > generated
> > > > due to spinlock or other scenarios. Tested it with idle=poll and noticed
> > > > pause interceptions go down considerably.
> > > >
> > > > Signed-off-by: Babu Moger <babu.mo...@amd.com>
> > > > ---
> > > > @@ -1046,6 +1095,58 @@ static int avic_ga_log_notifier(u32 ga_tag)
> > > > return 0;
> > > >  }
> > > >
> > > > +static void grow_ple_window(struct kvm_vcpu *vcpu)
> > > > +{
> > > > +   struct vcpu_svm *svm = to_svm(vcpu);
> > > > +   struct vmcb_control_area *control = >vmcb->control;
> > > > +   int old = control->pause_filter_count;
> > > > +
> > > > +   control->pause_filter_count = __grow_ple_window(old,
> > > > +   
> > > > pause_filter_count,
> > > > +   ple_window_grow,
> > > > +
> > >   ple_window_actual_max);
> > > > +
> > > > +   if (control->pause_filter_count != old)
> > > > +   mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
> > > > +
> > > > +   trace_kvm_ple_window_grow(vcpu->vcpu_id,
> > > > + control->pause_filter_count, old);
> > >
> > > Please move the tracing into __shrink_ple_window to share the code.
> > > This probably belongs to patch [2/3].
> >
> > I will have to pass vcpu_id, and have to make few changes to display old
> and new values.
> > I am afraid it might add few more extra instructions.
> 
> Right, vcpu_id isn't available in that function.
> Keeping it like this is ok.
> 
> > >
> > > > +/*
> > > > + * ple_window_actual_max is computed to be one
> grow_ple_window()
> > > below
> > > > + * ple_window_max. (See __grow_ple_window for the reason.)
> > > > + * This prevents overflows, because ple_window_max is int.
> > > > + * ple_window_max effectively rounded down to a multiple of
> > > ple_window_grow in
> > > > + * this process.
> > > > + * ple_window_max is also prevented from setting control-
> > > >pause_filter_count <
> > > > + * pause_filter_count.
> > > > + */
> > > > +static void update_ple_window_actual_max(void)
> > > > +{
> > > > +   ple_window_actual_max =
> > > > +   __shrink_ple_window(max(ple_window_max,
> > > pause_filter_count),
> > >
> > > (I have no idea what I was thinking when I wrote that for VMX. :[
> > >  I'll write a patch to get rid of ple_window_actual_max, because its
> > >  benefits are really minuscule and the logic is complicated.)
> >
> > If you are thinking of just straight forward removal, I can take care of it.
> 
> And tweaking the overflow handling to account for that.  Go ahead if
> you'd like to.

Ok. Will add new patch to the series to handle this. Thanks.

> 
> > >
> > > > +   pause_filter_count,
> > > > +   ple_window_grow, SHRT_MIN);
> > > > +}
&g

RE: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in SVM

2018-03-15 Thread Moger, Babu


> -Original Message-
> From: Radim Krčmář 
> Sent: Wednesday, March 14, 2018 8:26 AM
> To: Moger, Babu 
> Cc: j...@8bytes.org; t...@linutronix.de; mi...@redhat.com;
> h...@zytor.com; x...@kernel.org; pbonz...@redhat.com;
> k...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in
> SVM
> 
> 2018-03-10 05:07+, Moger, Babu:
> > Radim,
> >  Thanks for the comments. Taken care of most of the comments.
> >  I have few questions/comments. Please see inline.
> >
> > > -Original Message-
> > > From: Radim Krčmář 
> > > Sent: Friday, March 9, 2018 12:13 PM
> > > To: Moger, Babu 
> > > Cc: j...@8bytes.org; t...@linutronix.de; mi...@redhat.com;
> > > h...@zytor.com; x...@kernel.org; pbonz...@redhat.com;
> > > k...@vger.kernel.org; linux-kernel@vger.kernel.org
> > > Subject: Re: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit
> logic in
> > > SVM
> > >
> > > 2018-03-02 11:17-0500, Babu Moger:
> > > > Bring the PLE(pause loop exit) logic to AMD svm driver.
> > > > We have noticed it help in situations where numerous pauses are
> > > generated
> > > > due to spinlock or other scenarios. Tested it with idle=poll and noticed
> > > > pause interceptions go down considerably.
> > > >
> > > > Signed-off-by: Babu Moger 
> > > > ---
> > > > @@ -1046,6 +1095,58 @@ static int avic_ga_log_notifier(u32 ga_tag)
> > > > return 0;
> > > >  }
> > > >
> > > > +static void grow_ple_window(struct kvm_vcpu *vcpu)
> > > > +{
> > > > +   struct vcpu_svm *svm = to_svm(vcpu);
> > > > +   struct vmcb_control_area *control = >vmcb->control;
> > > > +   int old = control->pause_filter_count;
> > > > +
> > > > +   control->pause_filter_count = __grow_ple_window(old,
> > > > +   
> > > > pause_filter_count,
> > > > +   ple_window_grow,
> > > > +
> > >   ple_window_actual_max);
> > > > +
> > > > +   if (control->pause_filter_count != old)
> > > > +   mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
> > > > +
> > > > +   trace_kvm_ple_window_grow(vcpu->vcpu_id,
> > > > + control->pause_filter_count, old);
> > >
> > > Please move the tracing into __shrink_ple_window to share the code.
> > > This probably belongs to patch [2/3].
> >
> > I will have to pass vcpu_id, and have to make few changes to display old
> and new values.
> > I am afraid it might add few more extra instructions.
> 
> Right, vcpu_id isn't available in that function.
> Keeping it like this is ok.
> 
> > >
> > > > +/*
> > > > + * ple_window_actual_max is computed to be one
> grow_ple_window()
> > > below
> > > > + * ple_window_max. (See __grow_ple_window for the reason.)
> > > > + * This prevents overflows, because ple_window_max is int.
> > > > + * ple_window_max effectively rounded down to a multiple of
> > > ple_window_grow in
> > > > + * this process.
> > > > + * ple_window_max is also prevented from setting control-
> > > >pause_filter_count <
> > > > + * pause_filter_count.
> > > > + */
> > > > +static void update_ple_window_actual_max(void)
> > > > +{
> > > > +   ple_window_actual_max =
> > > > +   __shrink_ple_window(max(ple_window_max,
> > > pause_filter_count),
> > >
> > > (I have no idea what I was thinking when I wrote that for VMX. :[
> > >  I'll write a patch to get rid of ple_window_actual_max, because its
> > >  benefits are really minuscule and the logic is complicated.)
> >
> > If you are thinking of just straight forward removal, I can take care of it.
> 
> And tweaking the overflow handling to account for that.  Go ahead if
> you'd like to.

Ok. Will add new patch to the series to handle this. Thanks.

> 
> > >
> > > > +   pause_filter_count,
> > > > +   ple_window_grow, SHRT_MIN);
> > > > +}
> > > >  static __init int svm_hardware_setup(void)
> > > >  {
> > > > int cpu;
> >

Re: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in SVM

2018-03-14 Thread Radim Krčmář
2018-03-10 05:07+, Moger, Babu:
> Radim,
>  Thanks for the comments. Taken care of most of the comments.
>  I have few questions/comments. Please see inline.
> 
> > -Original Message-
> > From: Radim Krčmář <rkrc...@redhat.com>
> > Sent: Friday, March 9, 2018 12:13 PM
> > To: Moger, Babu <babu.mo...@amd.com>
> > Cc: j...@8bytes.org; t...@linutronix.de; mi...@redhat.com;
> > h...@zytor.com; x...@kernel.org; pbonz...@redhat.com;
> > k...@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: Re: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in
> > SVM
> > 
> > 2018-03-02 11:17-0500, Babu Moger:
> > > Bring the PLE(pause loop exit) logic to AMD svm driver.
> > > We have noticed it help in situations where numerous pauses are
> > generated
> > > due to spinlock or other scenarios. Tested it with idle=poll and noticed
> > > pause interceptions go down considerably.
> > >
> > > Signed-off-by: Babu Moger <babu.mo...@amd.com>
> > > ---
> > > @@ -1046,6 +1095,58 @@ static int avic_ga_log_notifier(u32 ga_tag)
> > >   return 0;
> > >  }
> > >
> > > +static void grow_ple_window(struct kvm_vcpu *vcpu)
> > > +{
> > > + struct vcpu_svm *svm = to_svm(vcpu);
> > > + struct vmcb_control_area *control = >vmcb->control;
> > > + int old = control->pause_filter_count;
> > > +
> > > + control->pause_filter_count = __grow_ple_window(old,
> > > + pause_filter_count,
> > > + ple_window_grow,
> > > +
> > ple_window_actual_max);
> > > +
> > > + if (control->pause_filter_count != old)
> > > + mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
> > > +
> > > + trace_kvm_ple_window_grow(vcpu->vcpu_id,
> > > +   control->pause_filter_count, old);
> > 
> > Please move the tracing into __shrink_ple_window to share the code.
> > This probably belongs to patch [2/3].
> 
> I will have to pass vcpu_id, and have to make few changes to display old and 
> new values.
> I am afraid it might add few more extra instructions.

Right, vcpu_id isn't available in that function.
Keeping it like this is ok.

> > 
> > > +/*
> > > + * ple_window_actual_max is computed to be one grow_ple_window()
> > below
> > > + * ple_window_max. (See __grow_ple_window for the reason.)
> > > + * This prevents overflows, because ple_window_max is int.
> > > + * ple_window_max effectively rounded down to a multiple of
> > ple_window_grow in
> > > + * this process.
> > > + * ple_window_max is also prevented from setting control-
> > >pause_filter_count <
> > > + * pause_filter_count.
> > > + */
> > > +static void update_ple_window_actual_max(void)
> > > +{
> > > + ple_window_actual_max =
> > > + __shrink_ple_window(max(ple_window_max,
> > pause_filter_count),
> > 
> > (I have no idea what I was thinking when I wrote that for VMX. :[
> >  I'll write a patch to get rid of ple_window_actual_max, because its
> >  benefits are really minuscule and the logic is complicated.)
> 
> If you are thinking of just straight forward removal, I can take care of it.

And tweaking the overflow handling to account for that.  Go ahead if
you'd like to.

> > 
> > > + pause_filter_count,
> > > + ple_window_grow, SHRT_MIN);
> > > +}
> > >  static __init int svm_hardware_setup(void)
> > >  {
> > >   int cpu;
> > > @@ -1309,7 +1412,11 @@ static void init_vmcb(struct vcpu_svm *svm)
> > >   svm->vcpu.arch.hflags = 0;
> > >
> > >   if (boot_cpu_has(X86_FEATURE_PAUSEFILTER)) {
> > > - control->pause_filter_count = 3000;
> > > + control->pause_filter_count = pause_filter_count;
> > > + if (boot_cpu_has(X86_FEATURE_PFTHRESHOLD))
> > > + control->pause_filter_thresh = pause_filter_thresh;
> > > + else
> > > + pause_filter_thresh = 0;
> > 
> > Please move this to hardware_setup and also clear pause_filter_count if
> 
> Moving this to hardware_setup will be a problem.  We don't have access to svm 
> data structure in hardware_setup.

I mean just the pause_filter_thresh = 0 and pause_filter_count = 0 logic
based on boot_cpu_has (it's weird if the user-visible parameters are
corrected after starting a VM);  VMCB configuration stays,

thanks.


Re: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in SVM

2018-03-14 Thread Radim Krčmář
2018-03-10 05:07+, Moger, Babu:
> Radim,
>  Thanks for the comments. Taken care of most of the comments.
>  I have few questions/comments. Please see inline.
> 
> > -Original Message-
> > From: Radim Krčmář 
> > Sent: Friday, March 9, 2018 12:13 PM
> > To: Moger, Babu 
> > Cc: j...@8bytes.org; t...@linutronix.de; mi...@redhat.com;
> > h...@zytor.com; x...@kernel.org; pbonz...@redhat.com;
> > k...@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: Re: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in
> > SVM
> > 
> > 2018-03-02 11:17-0500, Babu Moger:
> > > Bring the PLE(pause loop exit) logic to AMD svm driver.
> > > We have noticed it help in situations where numerous pauses are
> > generated
> > > due to spinlock or other scenarios. Tested it with idle=poll and noticed
> > > pause interceptions go down considerably.
> > >
> > > Signed-off-by: Babu Moger 
> > > ---
> > > @@ -1046,6 +1095,58 @@ static int avic_ga_log_notifier(u32 ga_tag)
> > >   return 0;
> > >  }
> > >
> > > +static void grow_ple_window(struct kvm_vcpu *vcpu)
> > > +{
> > > + struct vcpu_svm *svm = to_svm(vcpu);
> > > + struct vmcb_control_area *control = >vmcb->control;
> > > + int old = control->pause_filter_count;
> > > +
> > > + control->pause_filter_count = __grow_ple_window(old,
> > > + pause_filter_count,
> > > + ple_window_grow,
> > > +
> > ple_window_actual_max);
> > > +
> > > + if (control->pause_filter_count != old)
> > > + mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
> > > +
> > > + trace_kvm_ple_window_grow(vcpu->vcpu_id,
> > > +   control->pause_filter_count, old);
> > 
> > Please move the tracing into __shrink_ple_window to share the code.
> > This probably belongs to patch [2/3].
> 
> I will have to pass vcpu_id, and have to make few changes to display old and 
> new values.
> I am afraid it might add few more extra instructions.

Right, vcpu_id isn't available in that function.
Keeping it like this is ok.

> > 
> > > +/*
> > > + * ple_window_actual_max is computed to be one grow_ple_window()
> > below
> > > + * ple_window_max. (See __grow_ple_window for the reason.)
> > > + * This prevents overflows, because ple_window_max is int.
> > > + * ple_window_max effectively rounded down to a multiple of
> > ple_window_grow in
> > > + * this process.
> > > + * ple_window_max is also prevented from setting control-
> > >pause_filter_count <
> > > + * pause_filter_count.
> > > + */
> > > +static void update_ple_window_actual_max(void)
> > > +{
> > > + ple_window_actual_max =
> > > + __shrink_ple_window(max(ple_window_max,
> > pause_filter_count),
> > 
> > (I have no idea what I was thinking when I wrote that for VMX. :[
> >  I'll write a patch to get rid of ple_window_actual_max, because its
> >  benefits are really minuscule and the logic is complicated.)
> 
> If you are thinking of just straight forward removal, I can take care of it.

And tweaking the overflow handling to account for that.  Go ahead if
you'd like to.

> > 
> > > + pause_filter_count,
> > > + ple_window_grow, SHRT_MIN);
> > > +}
> > >  static __init int svm_hardware_setup(void)
> > >  {
> > >   int cpu;
> > > @@ -1309,7 +1412,11 @@ static void init_vmcb(struct vcpu_svm *svm)
> > >   svm->vcpu.arch.hflags = 0;
> > >
> > >   if (boot_cpu_has(X86_FEATURE_PAUSEFILTER)) {
> > > - control->pause_filter_count = 3000;
> > > + control->pause_filter_count = pause_filter_count;
> > > + if (boot_cpu_has(X86_FEATURE_PFTHRESHOLD))
> > > + control->pause_filter_thresh = pause_filter_thresh;
> > > + else
> > > + pause_filter_thresh = 0;
> > 
> > Please move this to hardware_setup and also clear pause_filter_count if
> 
> Moving this to hardware_setup will be a problem.  We don't have access to svm 
> data structure in hardware_setup.

I mean just the pause_filter_thresh = 0 and pause_filter_count = 0 logic
based on boot_cpu_has (it's weird if the user-visible parameters are
corrected after starting a VM);  VMCB configuration stays,

thanks.


RE: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in SVM

2018-03-09 Thread Moger, Babu
Radim,
 Thanks for the comments. Taken care of most of the comments.
 I have few questions/comments. Please see inline.

> -Original Message-
> From: Radim Krčmář <rkrc...@redhat.com>
> Sent: Friday, March 9, 2018 12:13 PM
> To: Moger, Babu <babu.mo...@amd.com>
> Cc: j...@8bytes.org; t...@linutronix.de; mi...@redhat.com;
> h...@zytor.com; x...@kernel.org; pbonz...@redhat.com;
> k...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in
> SVM
> 
> 2018-03-02 11:17-0500, Babu Moger:
> > Bring the PLE(pause loop exit) logic to AMD svm driver.
> > We have noticed it help in situations where numerous pauses are
> generated
> > due to spinlock or other scenarios. Tested it with idle=poll and noticed
> > pause interceptions go down considerably.
> >
> > Signed-off-by: Babu Moger <babu.mo...@amd.com>
> > ---
> >  arch/x86/kvm/svm.c | 114
> -
> >  arch/x86/kvm/x86.h |   1 +
> >  2 files changed, 114 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > index 50a4e95..30bc851 100644
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -263,6 +263,55 @@ struct amd_svm_iommu_ir {
> >  static bool npt_enabled;
> >  #endif
> >
> > +/*
> > + * These 2 parameters are used to config the controls for Pause-Loop
> Exiting:
> > + * pause_filter_thresh: On processors that support Pause
> filtering(indicated
> > + * by CPUID Fn8000_000A_EDX), the VMCB provides a 16 bit pause filter
> > + * count value. On VMRUN this value is loaded into an internal counter.
> > + * Each time a pause instruction is executed, this counter is
> decremented
> > + * until it reaches zero at which time a #VMEXIT is generated if pause
> > + * intercept is enabled. Refer to  AMD APM Vol 2 Section 15.14.4 Pause
> > + * Intercept Filtering for more details.
> > + * This also indicate if ple logic enabled.
> > + *
> > + * pause_filter_count: In addition, some processor families support
> advanced
> 
> The comment has thresh/count flipped.

Good catch. Thanks

> 
> > + * pause filtering (indicated by CPUID Fn8000_000A_EDX) upper bound
> on
> > + * the amount of time a guest is allowed to execute in a pause loop.
> > + * In this mode, a 16-bit pause filter threshold field is added in the
> > + * VMCB. The threshold value is a cycle count that is used to reset the
> > + * pause counter. As with simple pause filtering, VMRUN loads the
> pause
> > + * count value from VMCB into an internal counter. Then, on each
> pause
> > + * instruction the hardware checks the elapsed number of cycles since
> > + * the most recent pause instruction against the pause filter threshold.
> > + * If the elapsed cycle count is greater than the pause filter threshold,
> > + * then the internal pause count is reloaded from the VMCB and
> execution
> > + * continues. If the elapsed cycle count is less than the pause filter
> > + * threshold, then the internal pause count is decremented. If the
> count
> > + * value is less than zero and PAUSE intercept is enabled, a #VMEXIT is
> > + * triggered. If advanced pause filtering is supported and pause filter
> > + * threshold field is set to zero, the filter will operate in the simpler,
> > + * count only mode.
> > + */
> > +
> > +static int pause_filter_thresh = KVM_DEFAULT_PLE_GAP;
> > +module_param(pause_filter_thresh, int, S_IRUGO);
> 
> I think it was a mistake to put signed values in VMX ...
> Please use unsigned variants and also properly sized.
> (The module param type would be "ushort" instead of "int".)

Sure. Will take care.
> 
> > +static int pause_filter_count = KVM_DEFAULT_PLE_WINDOW;
> > +module_param(pause_filter_count, int, S_IRUGO);
> 
> We are going to want a different default for pause_filter_count, because
> they have a different meaning.  On Intel, it's the number of cycles, on
> AMD, it's the number of PAUSE instructions.
> 
> The AMD's 3k is a bit high in comparison to Intel's 4k, but I'd keep 3k
> unless we have other benchmark results.

Ok. Testing with pause_filter_count = 3k for AMD. If everything goes fine, will 
make these changes.

> 
> > +static int ple_window_grow = KVM_DEFAULT_PLE_WINDOW_GROW;
> 
> The naming would be nicer with a consistent prefix.  We're growing
> pause_filter_count, so pause_filter_count_grow is easier to understand.
> (Albeit unwieldy.)

Sure. Will take care.

> 
> > +module_param(ple_wi

RE: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in SVM

2018-03-09 Thread Moger, Babu
Radim,
 Thanks for the comments. Taken care of most of the comments.
 I have few questions/comments. Please see inline.

> -Original Message-
> From: Radim Krčmář 
> Sent: Friday, March 9, 2018 12:13 PM
> To: Moger, Babu 
> Cc: j...@8bytes.org; t...@linutronix.de; mi...@redhat.com;
> h...@zytor.com; x...@kernel.org; pbonz...@redhat.com;
> k...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in
> SVM
> 
> 2018-03-02 11:17-0500, Babu Moger:
> > Bring the PLE(pause loop exit) logic to AMD svm driver.
> > We have noticed it help in situations where numerous pauses are
> generated
> > due to spinlock or other scenarios. Tested it with idle=poll and noticed
> > pause interceptions go down considerably.
> >
> > Signed-off-by: Babu Moger 
> > ---
> >  arch/x86/kvm/svm.c | 114
> -
> >  arch/x86/kvm/x86.h |   1 +
> >  2 files changed, 114 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > index 50a4e95..30bc851 100644
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -263,6 +263,55 @@ struct amd_svm_iommu_ir {
> >  static bool npt_enabled;
> >  #endif
> >
> > +/*
> > + * These 2 parameters are used to config the controls for Pause-Loop
> Exiting:
> > + * pause_filter_thresh: On processors that support Pause
> filtering(indicated
> > + * by CPUID Fn8000_000A_EDX), the VMCB provides a 16 bit pause filter
> > + * count value. On VMRUN this value is loaded into an internal counter.
> > + * Each time a pause instruction is executed, this counter is
> decremented
> > + * until it reaches zero at which time a #VMEXIT is generated if pause
> > + * intercept is enabled. Refer to  AMD APM Vol 2 Section 15.14.4 Pause
> > + * Intercept Filtering for more details.
> > + * This also indicate if ple logic enabled.
> > + *
> > + * pause_filter_count: In addition, some processor families support
> advanced
> 
> The comment has thresh/count flipped.

Good catch. Thanks

> 
> > + * pause filtering (indicated by CPUID Fn8000_000A_EDX) upper bound
> on
> > + * the amount of time a guest is allowed to execute in a pause loop.
> > + * In this mode, a 16-bit pause filter threshold field is added in the
> > + * VMCB. The threshold value is a cycle count that is used to reset the
> > + * pause counter. As with simple pause filtering, VMRUN loads the
> pause
> > + * count value from VMCB into an internal counter. Then, on each
> pause
> > + * instruction the hardware checks the elapsed number of cycles since
> > + * the most recent pause instruction against the pause filter threshold.
> > + * If the elapsed cycle count is greater than the pause filter threshold,
> > + * then the internal pause count is reloaded from the VMCB and
> execution
> > + * continues. If the elapsed cycle count is less than the pause filter
> > + * threshold, then the internal pause count is decremented. If the
> count
> > + * value is less than zero and PAUSE intercept is enabled, a #VMEXIT is
> > + * triggered. If advanced pause filtering is supported and pause filter
> > + * threshold field is set to zero, the filter will operate in the simpler,
> > + * count only mode.
> > + */
> > +
> > +static int pause_filter_thresh = KVM_DEFAULT_PLE_GAP;
> > +module_param(pause_filter_thresh, int, S_IRUGO);
> 
> I think it was a mistake to put signed values in VMX ...
> Please use unsigned variants and also properly sized.
> (The module param type would be "ushort" instead of "int".)

Sure. Will take care.
> 
> > +static int pause_filter_count = KVM_DEFAULT_PLE_WINDOW;
> > +module_param(pause_filter_count, int, S_IRUGO);
> 
> We are going to want a different default for pause_filter_count, because
> they have a different meaning.  On Intel, it's the number of cycles, on
> AMD, it's the number of PAUSE instructions.
> 
> The AMD's 3k is a bit high in comparison to Intel's 4k, but I'd keep 3k
> unless we have other benchmark results.

Ok. Testing with pause_filter_count = 3k for AMD. If everything goes fine, will 
make these changes.

> 
> > +static int ple_window_grow = KVM_DEFAULT_PLE_WINDOW_GROW;
> 
> The naming would be nicer with a consistent prefix.  We're growing
> pause_filter_count, so pause_filter_count_grow is easier to understand.
> (Albeit unwieldy.)

Sure. Will take care.

> 
> > +module_param(ple_window_grow, int, S_IRUGO);
> 
> (This is better as unsigned too ... VMX should have had t

Re: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in SVM

2018-03-09 Thread Radim Krčmář
2018-03-02 11:17-0500, Babu Moger:
> Bring the PLE(pause loop exit) logic to AMD svm driver.
> We have noticed it help in situations where numerous pauses are generated
> due to spinlock or other scenarios. Tested it with idle=poll and noticed
> pause interceptions go down considerably.
> 
> Signed-off-by: Babu Moger 
> ---
>  arch/x86/kvm/svm.c | 114 
> -
>  arch/x86/kvm/x86.h |   1 +
>  2 files changed, 114 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 50a4e95..30bc851 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -263,6 +263,55 @@ struct amd_svm_iommu_ir {
>  static bool npt_enabled;
>  #endif
>  
> +/*
> + * These 2 parameters are used to config the controls for Pause-Loop Exiting:
> + * pause_filter_thresh: On processors that support Pause filtering(indicated
> + *   by CPUID Fn8000_000A_EDX), the VMCB provides a 16 bit pause filter
> + *   count value. On VMRUN this value is loaded into an internal counter.
> + *   Each time a pause instruction is executed, this counter is decremented
> + *   until it reaches zero at which time a #VMEXIT is generated if pause
> + *   intercept is enabled. Refer to  AMD APM Vol 2 Section 15.14.4 Pause
> + *   Intercept Filtering for more details.
> + *   This also indicate if ple logic enabled.
> + *
> + * pause_filter_count: In addition, some processor families support advanced

The comment has thresh/count flipped.

> + *   pause filtering (indicated by CPUID Fn8000_000A_EDX) upper bound on
> + *   the amount of time a guest is allowed to execute in a pause loop.
> + *   In this mode, a 16-bit pause filter threshold field is added in the
> + *   VMCB. The threshold value is a cycle count that is used to reset the
> + *   pause counter. As with simple pause filtering, VMRUN loads the pause
> + *   count value from VMCB into an internal counter. Then, on each pause
> + *   instruction the hardware checks the elapsed number of cycles since
> + *   the most recent pause instruction against the pause filter threshold.
> + *   If the elapsed cycle count is greater than the pause filter threshold,
> + *   then the internal pause count is reloaded from the VMCB and execution
> + *   continues. If the elapsed cycle count is less than the pause filter
> + *   threshold, then the internal pause count is decremented. If the count
> + *   value is less than zero and PAUSE intercept is enabled, a #VMEXIT is
> + *   triggered. If advanced pause filtering is supported and pause filter
> + *   threshold field is set to zero, the filter will operate in the simpler,
> + *   count only mode.
> + */
> +
> +static int pause_filter_thresh = KVM_DEFAULT_PLE_GAP;
> +module_param(pause_filter_thresh, int, S_IRUGO);

I think it was a mistake to put signed values in VMX ...
Please use unsigned variants and also properly sized.
(The module param type would be "ushort" instead of "int".)

> +static int pause_filter_count = KVM_DEFAULT_PLE_WINDOW;
> +module_param(pause_filter_count, int, S_IRUGO);

We are going to want a different default for pause_filter_count, because
they have a different meaning.  On Intel, it's the number of cycles, on
AMD, it's the number of PAUSE instructions.

The AMD's 3k is a bit high in comparison to Intel's 4k, but I'd keep 3k
unless we have other benchmark results.

> +static int ple_window_grow = KVM_DEFAULT_PLE_WINDOW_GROW;

The naming would be nicer with a consistent prefix.  We're growing
pause_filter_count, so pause_filter_count_grow is easier to understand.
(Albeit unwieldy.)

> +module_param(ple_window_grow, int, S_IRUGO);

(This is better as unsigned too ... VMX should have had that.)

> @@ -1046,6 +1095,58 @@ static int avic_ga_log_notifier(u32 ga_tag)
>   return 0;
>  }
>  
> +static void grow_ple_window(struct kvm_vcpu *vcpu)
> +{
> + struct vcpu_svm *svm = to_svm(vcpu);
> + struct vmcb_control_area *control = >vmcb->control;
> + int old = control->pause_filter_count;
> +
> + control->pause_filter_count = __grow_ple_window(old,
> + pause_filter_count,
> + ple_window_grow,
> + ple_window_actual_max);
> +
> + if (control->pause_filter_count != old)
> + mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
> +
> + trace_kvm_ple_window_grow(vcpu->vcpu_id,
> +   control->pause_filter_count, old);

Please move the tracing into __shrink_ple_window to share the code.
This probably belongs to patch [2/3].

> +/*
> + * ple_window_actual_max is computed to be one grow_ple_window() below
> + * ple_window_max. (See __grow_ple_window for the reason.)
> + * This prevents overflows, because ple_window_max is int.
> + * ple_window_max effectively rounded down to a multiple of ple_window_grow 
> in
> + * this process.
> + * 

Re: [RFC 3/3] arch/x86/kvm: SVM: Introduce pause loop exit logic in SVM

2018-03-09 Thread Radim Krčmář
2018-03-02 11:17-0500, Babu Moger:
> Bring the PLE(pause loop exit) logic to AMD svm driver.
> We have noticed it help in situations where numerous pauses are generated
> due to spinlock or other scenarios. Tested it with idle=poll and noticed
> pause interceptions go down considerably.
> 
> Signed-off-by: Babu Moger 
> ---
>  arch/x86/kvm/svm.c | 114 
> -
>  arch/x86/kvm/x86.h |   1 +
>  2 files changed, 114 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 50a4e95..30bc851 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -263,6 +263,55 @@ struct amd_svm_iommu_ir {
>  static bool npt_enabled;
>  #endif
>  
> +/*
> + * These 2 parameters are used to config the controls for Pause-Loop Exiting:
> + * pause_filter_thresh: On processors that support Pause filtering(indicated
> + *   by CPUID Fn8000_000A_EDX), the VMCB provides a 16 bit pause filter
> + *   count value. On VMRUN this value is loaded into an internal counter.
> + *   Each time a pause instruction is executed, this counter is decremented
> + *   until it reaches zero at which time a #VMEXIT is generated if pause
> + *   intercept is enabled. Refer to  AMD APM Vol 2 Section 15.14.4 Pause
> + *   Intercept Filtering for more details.
> + *   This also indicate if ple logic enabled.
> + *
> + * pause_filter_count: In addition, some processor families support advanced

The comment has thresh/count flipped.

> + *   pause filtering (indicated by CPUID Fn8000_000A_EDX) upper bound on
> + *   the amount of time a guest is allowed to execute in a pause loop.
> + *   In this mode, a 16-bit pause filter threshold field is added in the
> + *   VMCB. The threshold value is a cycle count that is used to reset the
> + *   pause counter. As with simple pause filtering, VMRUN loads the pause
> + *   count value from VMCB into an internal counter. Then, on each pause
> + *   instruction the hardware checks the elapsed number of cycles since
> + *   the most recent pause instruction against the pause filter threshold.
> + *   If the elapsed cycle count is greater than the pause filter threshold,
> + *   then the internal pause count is reloaded from the VMCB and execution
> + *   continues. If the elapsed cycle count is less than the pause filter
> + *   threshold, then the internal pause count is decremented. If the count
> + *   value is less than zero and PAUSE intercept is enabled, a #VMEXIT is
> + *   triggered. If advanced pause filtering is supported and pause filter
> + *   threshold field is set to zero, the filter will operate in the simpler,
> + *   count only mode.
> + */
> +
> +static int pause_filter_thresh = KVM_DEFAULT_PLE_GAP;
> +module_param(pause_filter_thresh, int, S_IRUGO);

I think it was a mistake to put signed values in VMX ...
Please use unsigned variants and also properly sized.
(The module param type would be "ushort" instead of "int".)

> +static int pause_filter_count = KVM_DEFAULT_PLE_WINDOW;
> +module_param(pause_filter_count, int, S_IRUGO);

We are going to want a different default for pause_filter_count, because
they have a different meaning.  On Intel, it's the number of cycles, on
AMD, it's the number of PAUSE instructions.

The AMD's 3k is a bit high in comparison to Intel's 4k, but I'd keep 3k
unless we have other benchmark results.

> +static int ple_window_grow = KVM_DEFAULT_PLE_WINDOW_GROW;

The naming would be nicer with a consistent prefix.  We're growing
pause_filter_count, so pause_filter_count_grow is easier to understand.
(Albeit unwieldy.)

> +module_param(ple_window_grow, int, S_IRUGO);

(This is better as unsigned too ... VMX should have had that.)

> @@ -1046,6 +1095,58 @@ static int avic_ga_log_notifier(u32 ga_tag)
>   return 0;
>  }
>  
> +static void grow_ple_window(struct kvm_vcpu *vcpu)
> +{
> + struct vcpu_svm *svm = to_svm(vcpu);
> + struct vmcb_control_area *control = >vmcb->control;
> + int old = control->pause_filter_count;
> +
> + control->pause_filter_count = __grow_ple_window(old,
> + pause_filter_count,
> + ple_window_grow,
> + ple_window_actual_max);
> +
> + if (control->pause_filter_count != old)
> + mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
> +
> + trace_kvm_ple_window_grow(vcpu->vcpu_id,
> +   control->pause_filter_count, old);

Please move the tracing into __shrink_ple_window to share the code.
This probably belongs to patch [2/3].

> +/*
> + * ple_window_actual_max is computed to be one grow_ple_window() below
> + * ple_window_max. (See __grow_ple_window for the reason.)
> + * This prevents overflows, because ple_window_max is int.
> + * ple_window_max effectively rounded down to a multiple of ple_window_grow 
> in
> + * this process.
> + * ple_window_max is also