Re: [PATCH v5 9/9] KVM-GST: KVM Steal time registration

2011-07-06 Thread Rik van Riel
offlining/onlining. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity CC: Anthony Liguori CC: Eric B Munson Acked-by: Rik van Riel -- All rights reversed -- To unsubscribe from this list: send the line "unsubscribe kvm" in the

Re: [PATCH v5 8/9] KVM-GST: adjust scheduler cpu power

2011-07-06 Thread Rik van Riel
ed in update_process_tick() would never reach us in update_rq_clock(). Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity CC: Anthony Liguori CC: Eric B Munson Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v5 7/9] KVM-GST: KVM Steal time accounting

2011-07-06 Thread Rik van Riel
halted. Accounting steal time from the core scheduler gives us the advantage of direct access to the runqueue data. In a later opportunity, it can be used to tweak cpu power and make the scheduler aware of the time it lost. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter

Re: [PATCH v5 6/9] add jump labels for ia64 paravirt

2011-07-06 Thread Rik van Riel
: Eddie Dong CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity CC: Anthony Liguori CC: Eric B Munson Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v5 5/9] KVM-GST: Add a pv_ops stub for steal time

2011-07-06 Thread Rik van Riel
bypassed when not in use. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity CC: Anthony Liguori CC: Eric B Munson Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v5 4/9] KVM-HV: KVM Steal time implementation

2011-07-06 Thread Rik van Riel
steal time This patch contains the hypervisor part of the steal time infrastructure, and can be backported independently of the guest portion. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity CC: Anthony Liguori CC: Eric B Munson Acked-by: Rik

Re: [PATCH v5 3/9] KVM-HDR: KVM Steal time implementation

2011-07-05 Thread Rik van Riel
the other way around. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity CC: Anthony Liguori CC: Eric B Munson Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v5 2/9] KVM-HDR Add constant to represent KVM MSRs enabled bit

2011-07-05 Thread Rik van Riel
On 07/04/2011 11:32 AM, Glauber Costa wrote: This patch is simple, put in a different commit so it can be more easily shared between guest and hypervisor. It just defines a named constant to indicate the enable bit for KVM-specific MSRs. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy
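The idea of a named enable bit shared by guest and hypervisor can be sketched in user-space C as follows. This is only an illustration: the macro name EXAMPLE_MSR_ENABLED, the choice of bit 0, and the helper functions are assumptions made for the example, not text taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name and value: bit 0 of the MSR payload means "enabled". */
#define EXAMPLE_MSR_ENABLED ((uint64_t)1)

/* Guest side: combine an aligned address with the enable bit. */
static uint64_t msr_pack(uint64_t aligned_addr, bool enable)
{
    /* The low bit must be free, otherwise it would collide with the flag. */
    assert((aligned_addr & EXAMPLE_MSR_ENABLED) == 0);
    return aligned_addr | (enable ? EXAMPLE_MSR_ENABLED : 0);
}

/* Hypervisor side: test the flag using the same named constant. */
static bool msr_is_enabled(uint64_t value)
{
    return (value & EXAMPLE_MSR_ENABLED) != 0;
}

/* Hypervisor side: recover the address by masking the flag off. */
static uint64_t msr_addr(uint64_t value)
{
    return value & ~EXAMPLE_MSR_ENABLED;
}
```

Because both sides refer to one constant, neither has to hard-code a bare `1`, which is the sharing benefit the patch description mentions.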

Re: [PATCH v5 1/9] introduce kvm_read_guest_cached

2011-07-05 Thread Rik van Riel
On 07/04/2011 11:32 AM, Glauber Costa wrote: From: Gleb Natapov Introduce kvm_read_guest_cached() function in addition to the write one we already have. [ by glauber: export function signature in kvm header ] Signed-off-by: Gleb Natapov Signed-off-by: Glauber Costa Acked-by: Rik van Riel

Re: [PATCH] mmu_notifier, kvm: Introduce dirty bit tracking in spte and mmu notifier to help KSM dirty bit tracking

2011-06-22 Thread Rik van Riel
On 06/22/2011 07:37 PM, Nai Xia wrote: On 2MB pages, I'd like to remind you and Rik that ksmd currently splits huge pages before their sub pages really get merged into the stable tree. Your proposal appears to add a condition that causes ksmd to skip doing that, which can cause the system to start

Re: [PATCH] mmu_notifier, kvm: Introduce dirty bit tracking in spte and mmu notifier to help KSM dirty bit tracking

2011-06-22 Thread Rik van Riel
On 06/22/2011 07:13 PM, Nai Xia wrote: On Wed, Jun 22, 2011 at 11:39 PM, Rik van Riel wrote: On 06/22/2011 07:19 AM, Izik Eidus wrote: So what we say here is: it is better to have a little junk in the unstable tree that gets flushed eventually anyway, instead of making the guest slower this

Re: [PATCH] mmu_notifier, kvm: Introduce dirty bit tracking in spte and mmu notifier to help KSM dirty bit tracking

2011-06-22 Thread Rik van Riel
On 06/22/2011 07:19 AM, Izik Eidus wrote: So what we say here is: it is better to have a little junk in the unstable tree that gets flushed eventually anyway, instead of making the guest slower this race is something that does not reflect accurate of ksm anyway due to the full memcmp that we will

Re: [PATCH 4/7] KVM-GST: Add a pv_ops stub for steal time

2011-06-14 Thread Rik van Riel
On 06/13/2011 07:31 PM, Glauber Costa wrote: This patch adds a function pointer in one of the many paravirt_ops structs, to allow guests to register a steal time function. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity CC: Anthony Liguori

Re: [PATCH 3/7] KVM-HV: KVM Steal time implementation

2011-06-14 Thread Rik van Riel
but not the hypervisor, or the other way around. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity CC: Anthony Liguori CC: Eric B Munson Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH 2/7] KVM-HDR: KVM Steal time implementation

2011-06-13 Thread Rik van Riel
the other way around. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity CC: Anthony Liguori CC: Eric B Munson Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH 1/7] KVM-HDR Add constant to represent KVM MSRs enabled bit

2011-06-13 Thread Rik van Riel
On 06/13/2011 07:31 PM, Glauber Costa wrote: This patch is simple, put in a different commit so it can be more easily shared between guest and hypervisor. It just defines a named constant to indicate the enable bit for KVM-specific MSRs. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy

Re: [PATCH v3 3/6] KVM-GST: KVM Steal time accounting

2011-02-15 Thread Rik van Riel
On 02/15/2011 10:17 AM, Avi Kivity wrote: Ah, so we're all set. Do you know if any user tools process this information? Top and vmstat have been displaying steal time for maybe 4 or 5 years now.

[PATCH -v8a 3/7] sched: use a buddy to implement yield_task_fair

2011-02-01 Thread Rik van Riel
the correct entity once we get to the right level. Signed-off-by: Rik van Riel diff --git a/kernel/sched.c b/kernel/sched.c index dc91a4d..7ff53e2 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -327,7 +327,7 @@ struct cfs_rq { * 'curr' points to currently running entity on t

[PATCH -v8a 7/7] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2011-02-01 Thread Rik van Riel
Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic slowdowns of certain workloads, we instead use yield_to to get another VCPU in the same KVM guest to run sooner. This seems to give a 10-15% speedup in certain workloads. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti

[PATCH -v8a 6/7] kvm: keep track of which task is running a KVM vcpu

2011-02-01 Thread Rik van Riel
f the vcpu. Signed-off-by: Rik van Riel diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index a055742..9d56ed5 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -81,6 +81,7 @@ struct kvm_vcpu { #endif int vcpu_id; struct mutex mutex; +

[PATCH -v8a 2/7] sched: limit the scope of clear_buddies

2011-02-01 Thread Rik van Riel
red or pointed elsewhere. Signed-off-by: Rik van Riel --- kernel/sched_fair.c | 30 +++--- 1 files changed, 23 insertions(+), 7 deletions(-) diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c index f4ee445..0321473 100644 --- a/kernel/sched_fair.c +++

[PATCH -v8a 0/7] directed yield for Pause Loop Exiting

2011-02-01 Thread Rik van Riel
When running SMP virtual machines, it is possible for one VCPU to be spinning on a spinlock, while the VCPU that holds the spinlock is not currently running, because the host scheduler preempted it to run something else. Both Intel and AMD CPUs have a feature that detects when a virtual CPU is spi

[PATCH -v8a 5/7] export pid symbols needed for kvm_vcpu_on_spin

2011-02-01 Thread Rik van Riel
Export the symbols required for a race-free kvm_vcpu_on_spin. Signed-off-by: Rik van Riel diff --git a/kernel/fork.c b/kernel/fork.c index 3b159c5..adc8f47 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -191,6 +191,7 @@ void __put_task_struct(struct task_struct *tsk) if

[PATCH -v8a 4/7] sched: Add yield_to(task, preempt) functionality

2011-02-01 Thread Rik van Riel
ler just ignoring the hint. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti Signed-off-by: Mike Galbraith diff --git a/include/linux/sched.h b/include/linux/sched.h index 2c79e92..6c43fc4 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1047,6 +1047,7 @@ struct sch

[PATCH -v8a 1/7] sched: check the right ->nr_running in yield_task_fair

2011-02-01 Thread Rik van Riel
With CONFIG_FAIR_GROUP_SCHED, each task_group has its own cfs_rq. Yielding to a task from another cfs_rq may be worthwhile, since a process calling yield typically cannot use the CPU right now. Therefore, we want to check the per-cpu nr_running, not the cgroup local one. Signed-off-by: Rik van
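The difference between the two counters can be illustrated with a small user-space model. The struct and function names below are hypothetical stand-ins, not the kernel's definitions; the sketch only assumes that yield is skipped when the counter being checked equals 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a per-group runqueue: counts tasks in one cgroup only. */
struct cfs_rq_model {
    unsigned int nr_running;
};

/* Hypothetical model of the per-CPU runqueue: counts every runnable task on the CPU. */
struct rq_model {
    unsigned int nr_running;
};

/* Group-local check: treats the caller as alone if its own group has one task. */
static bool yield_skipped_group_local(const struct cfs_rq_model *cfs_rq)
{
    assert(cfs_rq);
    return cfs_rq->nr_running == 1;
}

/* Per-CPU check: skips the yield only if nothing else can run on this CPU. */
static bool yield_skipped_per_cpu(const struct rq_model *rq)
{
    assert(rq);
    return rq->nr_running == 1;
}
```

With one task in the caller's group but three runnable tasks on the CPU, the group-local check would skip the yield even though other tasks could use the CPU, while the per-CPU check lets the yield proceed.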

Re: [PATCH -v8 0/7] directed yield for Pause Loop Exiting

2011-02-01 Thread Rik van Riel
On 02/01/2011 05:53 AM, Peter Zijlstra wrote: On Mon, 2011-01-31 at 16:40 -0500, Rik van Riel wrote: v8: - some more changes and cleanups suggested by Peter Did you, by accident, send out the -v7 patches again? I don't think I've spotted a difference.. Arghhh. Yeah, I did

[PATCH -v8 5/7] export pid symbols needed for kvm_vcpu_on_spin

2011-01-31 Thread Rik van Riel
Export the symbols required for a race-free kvm_vcpu_on_spin. Signed-off-by: Rik van Riel diff --git a/kernel/fork.c b/kernel/fork.c index 3b159c5..adc8f47 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -191,6 +191,7 @@ void __put_task_struct(struct task_struct *tsk) if

[PATCH -v8 1/7] sched: check the right ->nr_running in yield_task_fair

2011-01-31 Thread Rik van Riel
With CONFIG_FAIR_GROUP_SCHED, each task_group has its own cfs_rq. Yielding to a task from another cfs_rq may be worthwhile, since a process calling yield typically cannot use the CPU right now. Therefore, we want to check the per-cpu nr_running, not the cgroup local one. Signed-off-by: Rik van

[PATCH -v8 6/7] kvm: keep track of which task is running a KVM vcpu

2011-01-31 Thread Rik van Riel
f the vcpu. Signed-off-by: Rik van Riel diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index a055742..9d56ed5 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -81,6 +81,7 @@ struct kvm_vcpu { #endif int vcpu_id; struct mutex mutex; +

[PATCH -v8 7/7] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2011-01-31 Thread Rik van Riel
Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic slowdowns of certain workloads, we instead use yield_to to get another VCPU in the same KVM guest to run sooner. This seems to give a 10-15% speedup in certain workloads. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti

[PATCH -v8 4/7] sched: Add yield_to(task, preempt) functionality.

2011-01-31 Thread Rik van Riel
ler just ignoring the hint. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti Signed-off-by: Mike Galbraith diff --git a/include/linux/sched.h b/include/linux/sched.h index 2c79e92..6c43fc4 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1047,6 +1047,7 @@ struct sch

[PATCH -v8 2/7] sched: limit the scope of clear_buddies

2011-01-31 Thread Rik van Riel
red or pointed elsewhere. Signed-off-by: Rik van Riel --- kernel/sched_fair.c | 30 +++--- 1 files changed, 23 insertions(+), 7 deletions(-) diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c index f4ee445..0321473 100644 --- a/kernel/sched_fair.c +++

[PATCH -v8 3/7] sched: use a buddy to implement yield_task_fair

2011-01-31 Thread Rik van Riel
the correct entity once we get to the right level. Signed-off-by: Rik van Riel diff --git a/kernel/sched.c b/kernel/sched.c index dc91a4d..7ff53e2 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -327,7 +327,7 @@ struct cfs_rq { * 'curr' points to currently running entity on t

[PATCH -v8 0/7] directed yield for Pause Loop Exiting

2011-01-31 Thread Rik van Riel
When running SMP virtual machines, it is possible for one VCPU to be spinning on a spinlock, while the VCPU that holds the spinlock is not currently running, because the host scheduler preempted it to run something else. Both Intel and AMD CPUs have a feature that detects when a virtual CPU is spi

Re: [RFC -v7 PATCH 4/7] Add yield_to(task, preempt) functionality.

2011-01-31 Thread Rik van Riel
On 01/31/2011 06:49 AM, Peter Zijlstra wrote: On Wed, 2011-01-26 at 17:21 -0500, Rik van Riel wrote: + if (yielded) + yield(); + + return yielded; +} +EXPORT_SYMBOL_GPL(yield_to); yield() will again acquire rq->lock.. why not simply have ->yield_to_tas

Re: [RFC -v7 PATCH 3/7] sched: use a buddy to implement yield_task_fair

2011-01-31 Thread Rik van Riel
On 01/31/2011 06:47 AM, Peter Zijlstra wrote: On Wed, 2011-01-26 at 17:21 -0500, Rik van Riel wrote: +static struct sched_entity *__pick_second_entity(struct cfs_rq *cfs_rq) +{ + struct rb_node *left = cfs_rq->rb_leftmost; + struct rb_node *second; + + if (!l

Re: [PATCH v2 4/6] KVM-GST: KVM Steal time registration

2011-01-28 Thread Rik van Riel
On 01/28/2011 02:52 PM, Glauber Costa wrote: Register steal time within KVM. Every time we sample the steal time information, we update a local variable that tells what was the last time read. We then account the difference. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge
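The sample-and-account-the-difference scheme described in this snippet can be sketched in plain C. The struct and function names are hypothetical, and the sketch assumes the hypervisor exposes a cumulative steal counter in nanoseconds; it is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping: the last cumulative reading we saw. */
struct steal_clock {
    uint64_t last_steal_ns;
};

/*
 * Take a new cumulative reading, return the steal time accrued since
 * the previous sample, and remember the new reading for next time.
 */
static uint64_t steal_delta(struct steal_clock *sc, uint64_t now_steal_ns)
{
    uint64_t delta;

    assert(sc);
    /* Unsigned subtraction stays correct even if the counter wraps. */
    delta = now_steal_ns - sc->last_steal_ns;
    sc->last_steal_ns = now_steal_ns;
    return delta;
}
```

Each caller then accounts only the returned delta, so sampling twice without new steal time contributes nothing.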

Re: [PATCH v2 3/6] KVM-GST: KVM Steal time accounting

2011-01-28 Thread Rik van Riel
it. Since functions like account_idle_time() can be called from multiple places, not only account_process_tick(), steal time grabbing is repeated in each account function separately. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity Not

Re: [PATCH v2 2/6] KVM-HV: KVM Steal time implementation

2011-01-28 Thread Rik van Riel
but not the hypervisor, or the other way around. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity Acked-by: Rik van Riel -- All rights reversed

Re: [PATCH v2 1/6] KVM-HDR: KVM Steal time implementation

2011-01-28 Thread Rik van Riel
the other way around. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity Acked-by: Rik van Riel -- All rights reversed

[RFC -v7 PATCH 1/7] sched: check the right ->nr_running in yield_task_fair

2011-01-26 Thread Rik van Riel
With CONFIG_FAIR_GROUP_SCHED, each task_group has its own cfs_rq. Yielding to a task from another cfs_rq may be worthwhile, since a process calling yield typically cannot use the CPU right now. Therefore, we want to check the per-cpu nr_running, not the cgroup local one. Signed-off-by: Rik van

[RFC -v7 PATCH 2/7] sched: limit the scope of clear_buddies

2011-01-26 Thread Rik van Riel
red or pointed elsewhere. Signed-off-by: Rik van Riel --- kernel/sched_fair.c | 30 +++--- 1 files changed, 23 insertions(+), 7 deletions(-) diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c index f4ee445..0321473 100644 --- a/kernel/sched_fair.c +++

[RFC -v7 PATCH 7/7] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2011-01-26 Thread Rik van Riel
Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic slowdowns of certain workloads, we instead use yield_to to get another VCPU in the same KVM guest to run sooner. This seems to give a 10-15% speedup in certain workloads. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti

[RFC -v7 PATCH 0/7] directed yield for Pause Loop Exiting

2011-01-26 Thread Rik van Riel
When running SMP virtual machines, it is possible for one VCPU to be spinning on a spinlock, while the VCPU that holds the spinlock is not currently running, because the host scheduler preempted it to run something else. Both Intel and AMD CPUs have a feature that detects when a virtual CPU is spi

[RFC -v7 PATCH 4/7] Add yield_to(task, preempt) functionality.

2011-01-26 Thread Rik van Riel
ler just ignoring the hint. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti Signed-off-by: Mike Galbraith diff --git a/include/linux/sched.h b/include/linux/sched.h index 2c79e92..6c43fc4 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1047,6 +1047,7 @@ struct sch

[RFC -v7 PATCH 5/7] export pid symbols needed for kvm_vcpu_on_spin

2011-01-26 Thread Rik van Riel
Export the symbols required for a race-free kvm_vcpu_on_spin. Signed-off-by: Rik van Riel diff --git a/kernel/fork.c b/kernel/fork.c index 3b159c5..adc8f47 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -191,6 +191,7 @@ void __put_task_struct(struct task_struct *tsk) if

[RFC -v7 PATCH 6/7] kvm: keep track of which task is running a KVM vcpu

2011-01-26 Thread Rik van Riel
f the vcpu. Signed-off-by: Rik van Riel diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index a055742..9d56ed5 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -81,6 +81,7 @@ struct kvm_vcpu { #endif int vcpu_id; struct mutex mutex; +

[RFC -v7 PATCH 3/7] sched: use a buddy to implement yield_task_fair

2011-01-26 Thread Rik van Riel
the correct entity once we get to the right level. Signed-off-by: Rik van Riel diff --git a/kernel/sched.c b/kernel/sched.c index dc91a4d..7ff53e2 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -327,7 +327,7 @@ struct cfs_rq { * 'curr' points to currently running entity on t

Re: [RFC -v6 PATCH 7/8] kvm: keep track of which task is running a KVM vcpu

2011-01-26 Thread Rik van Riel
On 01/26/2011 08:01 AM, Avi Kivity wrote: Suggest moving the code to vcpu_load(), where it can execute under the protection of vcpu->mutex. I've made the suggested changes by you and Peter, and will re-post the patch series in a bit... -- All rights reversed

Re: [PATCH 14/16] KVM-GST: KVM Steal time registration

2011-01-24 Thread Rik van Riel
On 01/24/2011 08:25 PM, Glauber Costa wrote: On Mon, 2011-01-24 at 18:31 -0500, Rik van Riel wrote: On 01/24/2011 01:06 PM, Glauber Costa wrote: Register steal time within KVM. Every time we sample the steal time information, we update a local variable that tells what was the last time read. We

Re: [PATCH 15/16] KVM-GST: KVM Steal time accounting

2011-01-24 Thread Rik van Riel
it. Since functions like account_idle_time() can be called from multiple places, not only account_process_tick(), steal time grabbing is repeated in each account function separately. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity

Re: [PATCH 14/16] KVM-GST: KVM Steal time registration

2011-01-24 Thread Rik van Riel
On 01/24/2011 01:06 PM, Glauber Costa wrote: Register steal time within KVM. Every time we sample the steal time information, we update a local variable that tells what was the last time read. We then account the difference. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge

Re: [PATCH 14/16] KVM-GST: KVM Steal time registration

2011-01-24 Thread Rik van Riel
On 01/24/2011 01:06 PM, Glauber Costa wrote: Register steal time within KVM. Every time we sample the steal time information, we update a local variable that tells what was the last time read. We then account the difference. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge

Re: [PATCH 13/16] KVM-HV: KVM Steal time calculation

2011-01-24 Thread Rik van Riel
again. Whether or not this is accounted as steal time for the guest is the guest's decision. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity Reviewed-by: Rik van Riel

Re: [PATCH 12/16] KVM-HV: KVM Steal time implementation

2011-01-24 Thread Rik van Riel
patch contains the hypervisor part for it. I am keeping it separate from the headers to facilitate backports for people who want to backport the kernel part but not the hypervisor, or the other way around. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC

Re: [PATCH 11/16] KVM-HDR: KVM Steal time implementation

2011-01-24 Thread Rik van Riel
patch contains the headers for it. I am keeping it separate to facilitate backports for people who want to backport the kernel part but not the hypervisor, or the other way around. Signed-off-by: Glauber Costa CC: Rik van Riel CC: Jeremy Fitzhardinge CC: Peter Zijlstra CC: Avi Kivity Acked-by: Rik

Re: [RFC -v6 PATCH 4/8] sched: Add yield_to(task, preempt) functionality

2011-01-24 Thread Rik van Riel
On 01/24/2011 01:12 PM, Peter Zijlstra wrote: On Thu, 2011-01-20 at 16:34 -0500, Rik van Riel wrote: From: Mike Galbraith Currently only implemented for fair class tasks. Add a yield_to_task() method to the fair scheduling class, allowing the caller of yield_to() to accelerate another thread

Re: [RFC -v6 PATCH 3/8] sched: use a buddy to implement yield_task_fair

2011-01-24 Thread Rik van Riel
On 01/24/2011 01:04 PM, Peter Zijlstra wrote: diff --git a/kernel/sched.c b/kernel/sched.c index dc91a4d..e4e57ff 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -327,7 +327,7 @@ struct cfs_rq { * 'curr' points to currently running entity on this cfs_rq. * It is set to NULL

Re: [RFC -v6 PATCH 2/8] sched: limit the scope of clear_buddies

2011-01-24 Thread Rik van Riel
On 01/24/2011 12:57 PM, Peter Zijlstra wrote: On Thu, 2011-01-20 at 16:33 -0500, Rik van Riel wrote: The clear_buddies function does not seem to play well with the concept of hierarchical runqueues. In the following tree, task groups are represented by 'G', tasks by 'T',

Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-22 Thread Rik van Riel
On 01/22/2011 01:14 AM, Srivatsa Vaddagiri wrote: Also it may be possible for the pv-ticketlocks to track owning vcpu and make use of a yield-to interface as further optimization to avoid the "others-get-more-time" problem, but Peterz rightly pointed that PI would be a better solution there than

Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-21 Thread Rik van Riel
On 01/21/2011 09:02 AM, Srivatsa Vaddagiri wrote: On Thu, Jan 20, 2011 at 09:56:27AM -0800, Jeremy Fitzhardinge wrote: The key here is not to sleep when waiting for locks (as implemented by current patch-series, which can put other VMs at an advantage by giving them more time than they are ent

[RFC -v6 PATCH 6/8] export pid symbols needed for kvm_vcpu_on_spin

2011-01-20 Thread Rik van Riel
Export the symbols required for a race-free kvm_vcpu_on_spin. Signed-off-by: Rik van Riel diff --git a/kernel/fork.c b/kernel/fork.c index 3b159c5..adc8f47 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -191,6 +191,7 @@ void __put_task_struct(struct task_struct *tsk) if

[RFC -v6 PATCH 5/8] sched: drop superfluous tests from yield_to

2011-01-20 Thread Rik van Riel
Fairness is enforced by pick_next_entity, so we can drop some superfluous tests from yield_to. Signed-off-by: Rik van Riel --- kernel/sched.c |8 1 files changed, 0 insertions(+), 8 deletions(-) diff --git a/kernel/sched.c b/kernel/sched.c index 1f38ed2..398eedf 100644 --- a

[RFC -v6 PATCH 7/8] kvm: keep track of which task is running a KVM vcpu

2011-01-20 Thread Rik van Riel
f the vcpu. Signed-off-by: Rik van Riel diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index a055742..9d56ed5 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -81,6 +81,7 @@ struct kvm_vcpu { #endif int vcpu_id; struct mutex mutex; +

[RFC -v6 PATCH 0/8] directed yield for Pause Loop Exiting

2011-01-20 Thread Rik van Riel
When running SMP virtual machines, it is possible for one VCPU to be spinning on a spinlock, while the VCPU that holds the spinlock is not currently running, because the host scheduler preempted it to run something else. Both Intel and AMD CPUs have a feature that detects when a virtual CPU is spi

[RFC -v6 PATCH 8/8] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2011-01-20 Thread Rik van Riel
Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic slowdowns of certain workloads, we instead use yield_to to get another VCPU in the same KVM guest to run sooner. This seems to give a 10-15% speedup in certain workloads, versus not having PLE at all. Signed-off-by: Rik van Riel

[RFC -v6 PATCH 3/8] sched: use a buddy to implement yield_task_fair

2011-01-20 Thread Rik van Riel
the correct entity once we get to the right level. Signed-off-by: Rik van Riel diff --git a/kernel/sched.c b/kernel/sched.c index dc91a4d..e4e57ff 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -327,7 +327,7 @@ struct cfs_rq { * 'curr' points to currently running entity on t

[RFC -v6 PATCH 4/8] sched: Add yield_to(task, preempt) functionality

2011-01-20 Thread Rik van Riel
ler just ignoring the hint. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti Signed-off-by: Mike Galbraith diff --git a/include/linux/sched.h b/include/linux/sched.h index 2c79e92..6c43fc4 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1047,6 +1047,7 @@ struct sch

[RFC -v6 PATCH 1/8] sched: check the right ->nr_running in yield_task_fair

2011-01-20 Thread Rik van Riel
With CONFIG_FAIR_GROUP_SCHED, each task_group has its own cfs_rq. Yielding to a task from another cfs_rq may be worthwhile, since a process calling yield typically cannot use the CPU right now. Therefore, we want to check the per-cpu nr_running, not the cgroup local one. Signed-off-by: Rik van

[RFC -v6 PATCH 2/8] sched: limit the scope of clear_buddies

2011-01-20 Thread Rik van Riel
red or pointed elsewhere. Signed-off-by: Rik van Riel --- kernel/sched_fair.c | 30 +++--- 1 files changed, 23 insertions(+), 7 deletions(-) diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c index f4ee445..0321473 100644 --- a/kernel/sched_fair.c +++

Re: [RFC -v5 PATCH 0/4] directed yield for Pause Loop Exiting

2011-01-14 Thread Rik van Riel
On 01/14/2011 03:02 AM, Rik van Riel wrote: Benchmark "results": Two 4-CPU KVM guests are pinned to the same 4 physical CPUs. Unfortunately, it turned out I was running my benchmark on only two CPU cores, using two HT threads of each core. I have re-run the benchmark with the gu

Re: [RFC -v5 PATCH 2/4] sched: Add yield_to(task, preempt) functionality.

2011-01-14 Thread Rik van Riel
On 01/14/2011 12:47 PM, Srivatsa Vaddagiri wrote: If I recall correctly, one of the motivations for yield_to_task (rather than a simple yield) was to avoid leaking bandwidth to other guests i.e we don't want the remaining timeslice of spinning vcpu to be given away to other guests but rather don

Re: [RFC -v5 PATCH 0/4] directed yield for Pause Loop Exiting

2011-01-14 Thread Rik van Riel
On 01/14/2011 03:02 AM, Rik van Riel wrote: Benchmark "results": Two 4-CPU KVM guests are pinned to the same 4 physical CPUs. I just discovered that I had in fact pinned the 4-CPU KVM guests to 4 HT threads across 2 cores, and the scheduler has all kinds of special magic for deali

[RFC -v5 PATCH 3/4] export pid symbols needed for kvm_vcpu_on_spin

2011-01-14 Thread Rik van Riel
Export the symbols required for a race-free kvm_vcpu_on_spin. Signed-off-by: Rik van Riel diff --git a/kernel/fork.c b/kernel/fork.c index 3b159c5..adc8f47 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -191,6 +191,7 @@ void __put_task_struct(struct task_struct *tsk) if

[RFC -v5 PATCH 0/4] directed yield for Pause Loop Exiting

2011-01-14 Thread Rik van Riel
When running SMP virtual machines, it is possible for one VCPU to be spinning on a spinlock, while the VCPU that holds the spinlock is not currently running, because the host scheduler preempted it to run something else. Both Intel and AMD CPUs have a feature that detects when a virtual CPU is spi

[RFC -v5 PATCH 2/4] sched: Add yield_to(task, preempt) functionality.

2011-01-14 Thread Rik van Riel
ler just ignoring the hint. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti Signed-off-by: Mike Galbraith diff --git a/include/linux/sched.h b/include/linux/sched.h index 2c79e92..6c43fc4 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1047,6 +1047,7 @@ struct sch

[RFC -v5 PATCH 1/4] kvm: keep track of which task is running a KVM vcpu

2011-01-14 Thread Rik van Riel
f the vcpu. Signed-off-by: Rik van Riel diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index a055742..9d56ed5 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -81,6 +81,7 @@ struct kvm_vcpu { #endif int vcpu_id; struct mutex mutex; +

[RFC -v5 PATCH 4/4] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2011-01-14 Thread Rik van Riel
Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic slowdowns of certain workloads, we instead use yield_to to hand the rest of our timeslice to another vcpu in the same KVM guest. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti diff --git a/include/linux/kvm_host.h b

Re: [RFC -v4 PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2011-01-13 Thread Rik van Riel
On 01/13/2011 10:23 AM, Avi Kivity wrote: On 01/13/2011 05:06 PM, Rik van Riel wrote: I think the first patch needs some reference counting... I'd move it to the outermost KVM_RUN loop to reduce the performance impact. I don't see how refcounting from that other thread could pos

Re: [RFC -v4 PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2011-01-13 Thread Rik van Riel
On 01/13/2011 08:16 AM, Avi Kivity wrote: + for (pass = 0; pass < 2 && !yielded; pass++) { + kvm_for_each_vcpu(i, vcpu, kvm) { + struct task_struct *task = vcpu->task; + if (!pass && i < last_boosted_vcpu) { + i = last_boosted_vcpu; + continue; + } else if (pass && i > last_boosted_vcpu) + break; + if
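The quoted loop scans candidates in two passes: first the slots after the last boosted one, then wrapping around to the slots up to and including it. A simplified user-space sketch of that round-robin selection follows. The function name and the `eligible` array are assumptions for illustration; the real loop applies its own per-VCPU checks, which the snippet truncates.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return the index of the first eligible slot, looking at the slots
 * after `last` first and then wrapping around to slots 0..last.
 * Returns -1 if no slot is eligible.
 */
static int pick_next_candidate(const bool *eligible, int n, int last)
{
    assert(eligible && n > 0 && last >= 0 && last < n);

    for (int pass = 0; pass < 2; pass++) {
        /* Pass 0 covers (last, n); pass 1 covers [0, last]. */
        int start = (pass == 0) ? last + 1 : 0;
        int end = (pass == 0) ? n : last + 1;

        for (int i = start; i < end; i++) {
            if (eligible[i])
                return i;
        }
    }
    return -1;
}
```

Starting after the previously boosted slot spreads the boost across candidates instead of always favoring the lowest-numbered one.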

[RFC -v4 PATCH 1/3] kvm: keep track of which task is running a KVM vcpu

2011-01-12 Thread Rik van Riel
f the vcpu. Signed-off-by: Rik van Riel --- - move vcpu->task manipulation as suggested by Chris Wright include/linux/kvm_host.h |1 + virt/kvm/kvm_main.c |2 ++ 2 files changed, 3 insertions(+), 0 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h

[RFC -v4 PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2011-01-12 Thread Rik van Riel
Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic slowdowns of certain workloads, we instead use yield_to to hand the rest of our timeslice to another vcpu in the same KVM guest. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti diff --git a/include/linux/kvm_host.h b

[RFC -v4 PATCH 2/3] sched: Add yield_to(task, preempt) functionality.

2011-01-12 Thread Rik van Riel
ncourage the target being selected. We can rely on pick_next_entity to keep things fair, so no one can accelerate a thread that has already used its fair share of CPU time. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti Signed-off-by: Mike Galbraith diff --git a/include/linux/sched.h b

[RFC -v4 PATCH 0/3] directed yield for Pause Loop Exiting

2011-01-12 Thread Rik van Riel
When running SMP virtual machines, it is possible for one VCPU to be spinning on a spinlock, while the VCPU that holds the spinlock is not currently running, because the host scheduler preempted it to run something else. Both Intel and AMD CPUs have a feature that detects when a virtual CPU is spi

[PATCH -v2] vmx: increase ple_gap default to 64

2011-01-12 Thread Rik van Riel
ound-robin 36616 Increase the ple_gap to 128 to be on the safe side. Is this enough for a CPU with HT that has a busy sibling thread, or should it be even larger? On the X5670, loading up the sibling thread with an infinite loop does not seem to increase the required ple_gap. Signed-off-by: Ri

Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

2011-01-12 Thread Rik van Riel
On 01/12/2011 10:26 PM, Mike Galbraith wrote: On Wed, 2011-01-12 at 22:02 -0500, Rik van Riel wrote: Cgroups only makes the matter worse - libvirt places each KVM guest into its own cgroup, so a VCPU will generally always be alone on its own per-cgroup, per-cpu runqueue! That can lead to

Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

2011-01-12 Thread Rik van Riel
On 01/07/2011 12:29 AM, Mike Galbraith wrote: +#ifdef CONFIG_SMP + /* + * If this yield is important enough to want to preempt instead + * of only dropping a ->next hint, we're alone, and the target + * is not alone, pull the target to this cpu. + * + * N

Re: [GIT PULL] KVM updates for the 2.6.38 merge window

2011-01-12 Thread Rik van Riel
On 01/11/2011 04:25 AM, Avi Kivity wrote: On 01/10/2011 09:31 PM, Linus Torvalds wrote: Why wasn't I notified before-hand? Was Andrew cc'd? Andrew and linux-mm were copied. Rik was the only one who reviewed (and ack'ed) it. I guess I should have explicitly asked for Nick's review. Last tim

Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

2011-01-04 Thread Rik van Riel
On 01/04/2011 12:08 PM, Peter Zijlstra wrote: On Wed, 2011-01-05 at 00:51 +0800, Hillf Danton wrote: Where is the yield_to callback in the patch for RT schedule class? If @p is RT, what could you do? RT guests are a pipe dream, you first need to get the hypervisor (kvm in this case) to be RT,

Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

2011-01-04 Thread Rik van Riel
On 01/04/2011 11:51 AM, Hillf Danton wrote: Wouldn't that break for FIFO and RR tasks? There's a reason all the scheduler folks wanted a per-class yield_to_task function :) Where is the yield_to callback in the patch for RT schedule class? If @p is RT, what could you do? If the user choose

Re: [RFC -v3 PATCH 2/3] sched: add yield_to function

2011-01-04 Thread Rik van Riel
On 01/04/2011 11:41 AM, Hillf Danton wrote: /* !curr->sched_class->yield_to_task ||*/ + curr->sched_class != p->sched_class) { + goto out; + } + /* * ask scheduler to compute the next for successfully ki

Re: [PATCH] increase ple_gap default to 64

2011-01-04 Thread Rik van Riel
On 01/03/2011 10:21 PM, Zhai, Edwin wrote: Riel, Thanks for your patch. I have changed the ple_gap to 128 on xen side, but forget the patch for KVM:( A little bit big is no harm, but more perf data is better. So should I resend the patch with the ple_gap default changed to 128, or are you will

[RFC -v3 PATCH 0/3] directed yield for Pause Loop Exiting

2011-01-03 Thread Rik van Riel
When running SMP virtual machines, it is possible for one VCPU to be spinning on a spinlock, while the VCPU that holds the spinlock is not currently running, because the host scheduler preempted it to run something else. Both Intel and AMD CPUs have a feature that detects when a virtual CPU is sp

[RFC -v3 PATCH 3/3] Subject: kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2011-01-03 Thread Rik van Riel
Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic slowdowns of certain workloads, we instead use yield_to to hand the rest of our timeslice to another vcpu in the same KVM guest. Signed-off-by: Rik van Riel Signed-off-by: Marcelo Tosatti diff --git a/include/linux/kvm_host.h b

[RFC -v3 PATCH 2/3] sched: add yield_to function

2011-01-03 Thread Rik van Riel
van Riel Signed-off-by: Marcelo Tosatti Not-signed-off-by: Mike Galbraith --- Mike, want to change the above into a Signed-off-by: ? :) This code seems to work well. diff --git a/include/linux/sched.h b/include/linux/sched.h index c5f926c..0b8a3e6 100644 --- a/include/linux/sched.h +++ b/include

[RFC -v3 PATCH 1/3] kvm: keep track of which task is running a KVM vcpu

2011-01-03 Thread Rik van Riel
f the vcpu. Signed-off-by: Rik van Riel --- - move vcpu->task manipulation as suggested by Chris Wright include/linux/kvm_host.h |1 + virt/kvm/kvm_main.c |2 ++ 2 files changed, 3 insertions(+), 0 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h

[PATCH] increase ple_gap default to 64

2011-01-03 Thread Rik van Riel
ound-robin 36616 Increase the ple_gap to 64 to be on the safe side. Is this enough for a CPU with HT that has a busy sibling thread, or should it be even larger? On the X5670, loading up the sibling thread with an infinite loop does not seem to increase the required ple_gap. Signed-off-by: Ri

Re: [RFC -v2 PATCH 2/3] sched: add yield_to function

2010-12-28 Thread Rik van Riel
On 12/28/2010 12:54 AM, Mike Galbraith wrote: On Mon, 2010-12-20 at 17:04 +0100, Mike Galbraith wrote: On Mon, 2010-12-20 at 10:40 -0500, Rik van Riel wrote: On 12/17/2010 02:15 AM, Mike Galbraith wrote: BTW, with this vruntime donation thingy, what prevents a task from forking off

Re: [RFC -v2 PATCH 2/3] sched: add yield_to function

2010-12-20 Thread Rik van Riel
On 12/17/2010 02:15 AM, Mike Galbraith wrote: BTW, with this vruntime donation thingy, what prevents a task from forking off accomplices who do nothing but wait for a wakeup and yield_to(exploit)? Even swapping vruntimes in the same cfs_rq is dangerous as hell, because one party is going backwa

Re: [RFC -v2 PATCH 2/3] sched: add yield_to function

2010-12-18 Thread Rik van Riel
On 12/14/2010 07:22 AM, Peter Zijlstra wrote: ... fixed all the obvious stuff. No idea what the hell I was thinking while doing that "cleanup" - probably too busy looking at the tests that I was running on a previous codebase :( For the next version of the patches, I have switched to your vers
