Re: [PATCH v3 0/7] Dynamic Pause Loop Exiting window.
2014-08-22 12:45+0800, Wanpeng Li:
> Hi Radim,
> On Thu, Aug 21, 2014 at 06:50:03PM +0200, Radim Krčmář wrote:
> > 2014-08-21 18:30+0200, Paolo Bonzini:
> > > On 21/08/2014 18:08, Radim Krčmář wrote:
> > > I'm not sure of the usefulness of patch 6, so I'm going to drop it.
> > > I'll keep it in my local junkyard branch in case it's going to be useful
> > > in some scenario I didn't think of.
> >
> > I've been using it to benchmark different values, because it is more
>
> Is there any benchmark data for this patchset?

Sorry, I already returned the testing machine and it wasn't quality
benchmarking, so I haven't kept the results ...

I used ebizzy and dbench, because ebizzy had a large difference between
PLE on/off and dbench a minimal one (without overcommit), so one was
looking for improvements while the other was checking for regressions.
(And they are easy to set up.)

From what I remember, this patch had roughly 5x better performance with
ebizzy on 60 VCPU guests and no obvious difference for dbench.
(And improvement under overcommit was visible for both.)

There was a significant reduction in %sys, which never rose much above
30%, as opposed to the original 90%+.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/7] Dynamic Pause Loop Exiting window.
On 21/08/2014 18:08, Radim Krčmář wrote:
> v2 -> v3:
>  * copypaste frenzy [v3 4/7] (split modify_ple_window)
>  * commented update_ple_window_actual_max [v3 4/7]
>  * renamed shrinker to modifier [v3 4/7]
>  * removed an extraneous max(new, ple_window) [v3 4/7] (should have been in v2)
>  * changed tracepoint argument type, printing and macro abstractions [v3 5/7]
>  * renamed ple_t to ple_int [v3 6/7] (visible in modinfo)
>  * intelligent updates of ple_window [v3 7/7]
>
> ---
> v1 -> v2:
>  * squashed [v1 4/9] and [v1 5/9] (clamping)
>  * dropped [v1 7/9] (CPP abstractions)
>  * merged core of [v1 9/9] into [v1 4/9] (automatic maximum)
>  * reworked kernel_param_ops: closer to pure int [v2 6/6]
>  * introduced ple_window_actual_max and reworked clamping [v2 4/6]
>  * added seqlock for parameter modifications [v2 6/6]
>
> ---
> PLE does not scale in its current form. When increasing VCPU count
> above 150, one can hit soft lockups because of runqueue lock contention.
> (Which says a lot about performance.)
>
> The main reason is that kvm_ple_loop cycles through all VCPUs.
> Replacing it with a scalable solution would be ideal, but it has already
> been well optimized for various workloads, so this series tries to
> alleviate one different major problem while minimizing a chance of
> regressions: we have too many useless PLE exits.
>
> Just increasing PLE window would help some cases, but it still spirals
> out of control. By increasing the window after every PLE exit, we can
> limit the amount of useless ones, so we don't reach the state where CPUs
> spend 99% of the time waiting for a lock.
>
> HP confirmed that this series prevents soft lockups and TSC sync errors
> on large guests.

Hi,

I'm not sure of the usefulness of patch 6, so I'm going to drop it.
I'll keep it in my local junkyard branch in case it's going to be useful
in some scenario I didn't think of.

Patch 7 can be easily rebased, so no need to repost (and I might even
squash it into patch 3, what do you think?).
Paolo
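For readers following the thread, the idea in the quoted cover letter (grow the window after every PLE exit, but keep it under a maximum) can be sketched roughly as below. This is only an illustration under assumptions: the struct, field and function names are hypothetical and are not taken from the patches, and precomputing the largest window that can still be grown is just one plausible reading of the "automatic maximum" mentioned in the changelog.

```c
#include <assert.h>

/* Hypothetical tunables; names and values are illustrative only. */
struct ple_params {
	int base;	/* window a VCPU starts with */
	int grow;	/* multiplicative factor applied on each PLE exit */
	int max;	/* upper bound for the window */
};

/*
 * Largest window that can still be multiplied by grow without
 * exceeding max (and therefore without overflowing an int).
 */
static int ple_actual_max(const struct ple_params *p)
{
	assert(p->grow >= 1);
	return p->max / p->grow;
}

/* Called after a PLE exit: clamp first, then grow. */
static int ple_grow(const struct ple_params *p, int window)
{
	int limit = ple_actual_max(p);

	if (window > limit)
		window = limit;
	return window * p->grow;
}
```

With grow = 2 and max = 10000, a window of 4096 grows to 8192 and then saturates at 10000 on later exits, so repeated exits cannot push the value past the bound.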
Re: [PATCH v3 0/7] Dynamic Pause Loop Exiting window.
2014-08-21 18:30+0200, Paolo Bonzini:
> On 21/08/2014 18:08, Radim Krčmář wrote:
> I'm not sure of the usefulness of patch 6, so I'm going to drop it.
> I'll keep it in my local junkyard branch in case it's going to be useful
> in some scenario I didn't think of.

I've been using it to benchmark different values, because it is more
convenient than reloading the module after shutting down guests.
(And easier to sell than writing to kernel memory.)

I don't think the additional code is worth it though.

> Patch 7 can be easily rebased, so no need to repost (and I might even
> squash it into patch 3, what do you think?).

Yeah, the core is already a huge patch, so it does look weird without
squashing. (No-one wants to rebase to that point anyway.)

Thanks.
Re: [PATCH v3 0/7] Dynamic Pause Loop Exiting window.
On 21/08/2014 18:50, Radim Krčmář wrote:
> 2014-08-21 18:30+0200, Paolo Bonzini:
> > On 21/08/2014 18:08, Radim Krčmář wrote:
> > I'm not sure of the usefulness of patch 6, so I'm going to drop it.
> > I'll keep it in my local junkyard branch in case it's going to be useful
> > in some scenario I didn't think of.
>
> I've been using it to benchmark different values, because it is more
> convenient than reloading the module after shutting down guests.
> (And easier to sell than writing to kernel memory.)
>
> I don't think the additional code is worth it though.
>
> > Patch 7 can be easily rebased, so no need to repost (and I might even
> > squash it into patch 3, what do you think?).
>
> Yeah, the core is already a huge patch, so it does look weird without
> squashing. (No-one wants to rebase to that point anyway.)

Ok, my queue is a bit large so I'll probably not push to git.kernel.org
until next week, but in any case this is what it will look like:

Radim Krčmář (5):
      KVM: add kvm_arch_sched_in
      KVM: x86: introduce sched_in to kvm_x86_ops
      KVM: VMX: make PLE window per-VCPU    (ple_window_dirty squashed here)
      KVM: VMX: dynamise PLE window
      KVM: trace kvm_ple_window grow/shrink

Paolo
Re: [PATCH v3 0/7] Dynamic Pause Loop Exiting window.
On 08/21/2014 10:00 PM, Paolo Bonzini wrote:
> On 21/08/2014 18:08, Radim Krčmář wrote:
> > v2 -> v3:
> >  * copypaste frenzy [v3 4/7] (split modify_ple_window)
> >  * commented update_ple_window_actual_max [v3 4/7]
> >  * renamed shrinker to modifier [v3 4/7]
> >  * removed an extraneous max(new, ple_window) [v3 4/7] (should have been in v2)
> >  * changed tracepoint argument type, printing and macro abstractions [v3 5/7]
> >  * renamed ple_t to ple_int [v3 6/7] (visible in modinfo)
> >  * intelligent updates of ple_window [v3 7/7]
> >
> > ---
> > v1 -> v2:
> >  * squashed [v1 4/9] and [v1 5/9] (clamping)
> >  * dropped [v1 7/9] (CPP abstractions)
> >  * merged core of [v1 9/9] into [v1 4/9] (automatic maximum)
> >  * reworked kernel_param_ops: closer to pure int [v2 6/6]
> >  * introduced ple_window_actual_max and reworked clamping [v2 4/6]
> >  * added seqlock for parameter modifications [v2 6/6]
> >
> > ---
> > PLE does not scale in its current form. When increasing VCPU count
> > above 150, one can hit soft lockups because of runqueue lock contention.
> > (Which says a lot about performance.)
> >
> > The main reason is that kvm_ple_loop cycles through all VCPUs.
> > Replacing it with a scalable solution would be ideal, but it has already
> > been well optimized for various workloads, so this series tries to
> > alleviate one different major problem while minimizing a chance of
> > regressions: we have too many useless PLE exits.
> >
> > Just increasing PLE window would help some cases, but it still spirals
> > out of control. By increasing the window after every PLE exit, we can
> > limit the amount of useless ones, so we don't reach the state where CPUs
> > spend 99% of the time waiting for a lock.
> >
> > HP confirmed that this series prevents soft lockups and TSC sync errors
> > on large guests.
>
> Hi,
>
> I'm not sure of the usefulness of patch 6, so I'm going to drop it.
> I'll keep it in my local junkyard branch in case it's going to be useful
> in some scenario I didn't think of.

I think the grow knob may be helpful to some extent, considering that
the number of vcpus can vary from a few to hundreds, which in turn helps
fast convergence of the ple_window value in non-overcommit scenarios.

I will try to experiment with the shrink knob. One argument favouring
the shrink knob may be the fact that we rudely reset vmx->ple_window back
to the default 4k. Of course, the danger on the other side is slow
convergence during overcommit or a sudden burst of load.
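The trade-off described above (an abrupt reset to the default versus a gradual shrink that converges more slowly) can be made concrete with a small sketch. Everything here is an assumption for illustration: the function names, the convention that a divisor below 2 means "reset to default", and the use of plain division are hypothetical and not taken from the series.

```c
#include <assert.h>

/*
 * Hypothetical shrink step. A divisor below 2 models the "reset to
 * default" behaviour; a divisor of 2 or more shrinks gradually.
 * The result never drops below base.
 */
static int ple_shrink(int window, int base, int divisor)
{
	assert(base > 0 && divisor >= 0);
	if (divisor < 2)
		return base;
	window /= divisor;
	return window < base ? base : window;
}

/* Count the shrink steps needed for window to fall back to base. */
static int ple_steps_to_base(int window, int base, int divisor)
{
	int steps = 0;

	while (window > base) {
		window = ple_shrink(window, base, divisor);
		steps++;
	}
	return steps;
}
```

Starting from a window of 65536 with a 4096 default, the reset variant is back at the default after one step, while halving takes four steps. That keeps a grown window around longer when spinning is still useful, at the price of reacting more slowly when load suddenly rises.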
Re: [PATCH v3 0/7] Dynamic Pause Loop Exiting window.
On 08/21/2014 09:38 PM, Radim Krčmář wrote:
> v2 -> v3:
>  * copypaste frenzy [v3 4/7] (split modify_ple_window)
>  * commented update_ple_window_actual_max [v3 4/7]
>  * renamed shrinker to modifier [v3 4/7]
>  * removed an extraneous max(new, ple_window) [v3 4/7] (should have been in v2)
>  * changed tracepoint argument type, printing and macro abstractions [v3 5/7]
>  * renamed ple_t to ple_int [v3 6/7] (visible in modinfo)
>  * intelligent updates of ple_window [v3 7/7]
>
> ---
> v1 -> v2:
>  * squashed [v1 4/9] and [v1 5/9] (clamping)
>  * dropped [v1 7/9] (CPP abstractions)
>  * merged core of [v1 9/9] into [v1 4/9] (automatic maximum)
>  * reworked kernel_param_ops: closer to pure int [v2 6/6]
>  * introduced ple_window_actual_max and reworked clamping [v2 4/6]
>  * added seqlock for parameter modifications [v2 6/6]
>
> ---

Was able to test both V1 and V2, and the trace showed good behaviour of
ple_window in undercommit and overcommit. Considering V3 does not have
any change w.r.t. functionality except the intelligent update with the
dirty field, please feel free to add

Tested-by: Raghavendra KT <raghavendra...@linux.vnet.ibm.com>

I do have some observations and comments though.
Re: [PATCH v3 0/7] Dynamic Pause Loop Exiting window.
Hi Radim,
On Thu, Aug 21, 2014 at 06:50:03PM +0200, Radim Krčmář wrote:
> 2014-08-21 18:30+0200, Paolo Bonzini:
> > On 21/08/2014 18:08, Radim Krčmář wrote:
> > I'm not sure of the usefulness of patch 6, so I'm going to drop it.
> > I'll keep it in my local junkyard branch in case it's going to be useful
> > in some scenario I didn't think of.
>
> I've been using it to benchmark different values, because it is more

Is there any benchmark data for this patchset?

Regards,
Wanpeng Li

> convenient than reloading the module after shutting down guests.
> (And easier to sell than writing to kernel memory.)
>
> I don't think the additional code is worth it though.
>
> > Patch 7 can be easily rebased, so no need to repost (and I might even
> > squash it into patch 3, what do you think?).
>
> Yeah, the core is already a huge patch, so it does look weird without
> squashing. (No-one wants to rebase to that point anyway.)
>
> Thanks.