Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-19 Thread Avi Kivity
On 09/18/2012 06:03 AM, Andrew Theurer wrote: > On Sun, 2012-09-16 at 11:55 +0300, Avi Kivity wrote: >> On 09/14/2012 12:30 AM, Andrew Theurer wrote: >> >> > The concern I have is that even though we have gone through changes to >> > help reduce the candidate vcpus we yield to, we still have a ver

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-17 Thread Andrew Theurer
On Sun, 2012-09-16 at 11:55 +0300, Avi Kivity wrote: > On 09/14/2012 12:30 AM, Andrew Theurer wrote: > > > The concern I have is that even though we have gone through changes to > > help reduce the candidate vcpus we yield to, we still have a very poor > > idea of which vcpu really needs to run.

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-17 Thread Andrew Jones
On Sat, Sep 15, 2012 at 09:38:54PM +0530, Raghavendra K T wrote: > On 09/14/2012 10:40 PM, Andrew Jones wrote: > >On Thu, Sep 13, 2012 at 04:30:58PM -0500, Andrew Theurer wrote: > >>On Thu, 2012-09-13 at 17:18 +0530, Raghavendra K T wrote: > >>>* Andrew Theurer [2012-09-11 13:27:41]: > >>> > [...]

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-17 Thread Andrew Jones
On Sun, Sep 16, 2012 at 11:55:28AM +0300, Avi Kivity wrote: > On 09/14/2012 12:30 AM, Andrew Theurer wrote: > > > The concern I have is that even though we have gone through changes to > > help reduce the candidate vcpus we yield to, we still have a very poor > > idea of which vcpu really needs to

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-17 Thread Andrew Jones
On Fri, Sep 14, 2012 at 04:34:24PM -0400, Konrad Rzeszutek Wilk wrote: > > The concern I have is that even though we have gone through changes to > > help reduce the candidate vcpus we yield to, we still have a very poor > > idea of which vcpu really needs to run. The result is high cpu usage in >

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-16 Thread Avi Kivity
On 09/14/2012 12:30 AM, Andrew Theurer wrote: > The concern I have is that even though we have gone through changes to > help reduce the candidate vcpus we yield to, we still have a very poor > idea of which vcpu really needs to run. The result is high cpu usage in > the get_pid_task and still so

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-15 Thread Raghavendra K T
On 09/14/2012 10:40 PM, Andrew Jones wrote: On Thu, Sep 13, 2012 at 04:30:58PM -0500, Andrew Theurer wrote: On Thu, 2012-09-13 at 17:18 +0530, Raghavendra K T wrote: * Andrew Theurer [2012-09-11 13:27:41]: [...] On picking a better vcpu to yield to: I really hesitate to rely on paravirt h

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-14 Thread Konrad Rzeszutek Wilk
> The concern I have is that even though we have gone through changes to > help reduce the candidate vcpus we yield to, we still have a very poor > idea of which vcpu really needs to run. The result is high cpu usage in > the get_pid_task and still some contention in the double runqueue lock. > To

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-14 Thread Andrew Jones
On Thu, Sep 13, 2012 at 04:30:58PM -0500, Andrew Theurer wrote: > On Thu, 2012-09-13 at 17:18 +0530, Raghavendra K T wrote: > > * Andrew Theurer [2012-09-11 13:27:41]: > > > > > On Tue, 2012-09-11 at 11:38 +0530, Raghavendra K T wrote: > > > > On 09/11/2012 01:42 AM, Andrew Theurer wrote: > > > >

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-13 Thread Andrew Theurer
On Thu, 2012-09-13 at 17:18 +0530, Raghavendra K T wrote: > * Andrew Theurer [2012-09-11 13:27:41]: > > > On Tue, 2012-09-11 at 11:38 +0530, Raghavendra K T wrote: > > > On 09/11/2012 01:42 AM, Andrew Theurer wrote: > > > > On Mon, 2012-09-10 at 19:12 +0200, Peter Zijlstra wrote: > > > >> On Mon,

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-13 Thread Avi Kivity
On 09/11/2012 09:27 PM, Andrew Theurer wrote: > > So, having both is probably not a good idea. However, I feel like > there's more work to be done. With no over-commit (10 VMs), total > throughput is 23427 +/- 2.76%. A 2x over-commit will no doubt have some > overhead, but a reduction to ~4500

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-13 Thread Raghavendra K T
* Andrew Theurer [2012-09-11 13:27:41]: > On Tue, 2012-09-11 at 11:38 +0530, Raghavendra K T wrote: > > On 09/11/2012 01:42 AM, Andrew Theurer wrote: > > > On Mon, 2012-09-10 at 19:12 +0200, Peter Zijlstra wrote: > > >> On Mon, 2012-09-10 at 22:26 +0530, Srikar Dronamraju wrote: > > +static

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-11 Thread Andrew Theurer
On Tue, 2012-09-11 at 11:38 +0530, Raghavendra K T wrote: > On 09/11/2012 01:42 AM, Andrew Theurer wrote: > > On Mon, 2012-09-10 at 19:12 +0200, Peter Zijlstra wrote: > >> On Mon, 2012-09-10 at 22:26 +0530, Srikar Dronamraju wrote: > +static bool __yield_to_candidate(struct task_struct *curr,

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-11 Thread Srikar Dronamraju
> > > @@ -4323,6 +4340,10 @@ bool __sched yield_to(struct task_struct *p, > > bool preempt) > > > rq = this_rq(); > > > > > > again: > > > + /* optimistic test to avoid taking locks */ > > > + if (!__yield_to_candidate(curr, p)) > > > + goto out_irq; > > > + > > So add

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-10 Thread Raghavendra K T
On 09/11/2012 01:42 AM, Andrew Theurer wrote: On Mon, 2012-09-10 at 19:12 +0200, Peter Zijlstra wrote: On Mon, 2012-09-10 at 22:26 +0530, Srikar Dronamraju wrote: +static bool __yield_to_candidate(struct task_struct *curr, struct task_struct *p) +{ + if (!curr->sched_class->yield_to_task)

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-10 Thread Rik van Riel
On 09/10/2012 04:19 PM, Peter Zijlstra wrote: On Mon, 2012-09-10 at 15:12 -0500, Andrew Theurer wrote: + /* +* if the target task is not running, then only yield if the +* current task is in guest mode +*/ + if (!(p_rq->curr->flags & PF_VCPU)) +

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-10 Thread Peter Zijlstra
On Mon, 2012-09-10 at 15:12 -0500, Andrew Theurer wrote: > + /* > +* if the target task is not running, then only yield if the > +* current task is in guest mode > +*/ > + if (!(p_rq->curr->flags & PF_VCPU)) > + goto out_irq; This would make yield

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-10 Thread Andrew Theurer
On Mon, 2012-09-10 at 19:12 +0200, Peter Zijlstra wrote: > On Mon, 2012-09-10 at 22:26 +0530, Srikar Dronamraju wrote: > > > +static bool __yield_to_candidate(struct task_struct *curr, struct > > > task_struct *p) > > > +{ > > > + if (!curr->sched_class->yield_to_task) > > > + retu

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-10 Thread Raghavendra K T
On 09/10/2012 10:42 PM, Peter Zijlstra wrote: On Mon, 2012-09-10 at 22:26 +0530, Srikar Dronamraju wrote: +static bool __yield_to_candidate(struct task_struct *curr, struct task_struct *p) +{ + if (!curr->sched_class->yield_to_task) + return false; + + if (curr->sched_class

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-10 Thread Peter Zijlstra
On Mon, 2012-09-10 at 22:26 +0530, Srikar Dronamraju wrote: > > +static bool __yield_to_candidate(struct task_struct *curr, struct > > task_struct *p) > > +{ > > + if (!curr->sched_class->yield_to_task) > > + return false; > > + > > + if (curr->sched_class != p->sched_class) >

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-10 Thread Srikar Dronamraju
* Peter Zijlstra [2012-09-10 18:03:55]: > On Mon, 2012-09-10 at 08:16 -0500, Andrew Theurer wrote: > > > > @@ -4856,8 +4859,6 @@ again: > > > > if (curr->sched_class != p->sched_class) > > > > goto out; > > > > > > > > - if (task_running(p_rq, p) || p->state) > > > > -

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-10 Thread Peter Zijlstra
On Mon, 2012-09-10 at 08:16 -0500, Andrew Theurer wrote: > > > @@ -4856,8 +4859,6 @@ again: > > > if (curr->sched_class != p->sched_class) > > > goto out; > > > > > > - if (task_running(p_rq, p) || p->state) > > > - goto out; > > > > Is it possible that by this time th

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-10 Thread Raghavendra K T
On 09/08/2012 01:12 AM, Andrew Theurer wrote: On Fri, 2012-09-07 at 23:36 +0530, Raghavendra K T wrote: CCing PeterZ also. On 09/07/2012 06:41 PM, Andrew Theurer wrote: I have noticed recently that PLE/yield_to() is still not that scalable for really large guests, sometimes even with no CPU ov

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-10 Thread Andrew Theurer
On Sat, 2012-09-08 at 14:13 +0530, Srikar Dronamraju wrote: > > > > signed-off-by: Andrew Theurer > > > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > > index fbf1fd0..c767915 100644 > > --- a/kernel/sched/core.c > > +++ b/kernel/sched/core.c > > @@ -4844,6 +4844,9 @@ bool __sched yi

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-08 Thread Srikar Dronamraju
> > signed-off-by: Andrew Theurer > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index fbf1fd0..c767915 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -4844,6 +4844,9 @@ bool __sched yield_to(struct task_struct *p, bool > preempt) > > again: > p_rq = ta

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-07 Thread Andrew Theurer
On Fri, 2012-09-07 at 23:36 +0530, Raghavendra K T wrote: > CCing PeterZ also. > > On 09/07/2012 06:41 PM, Andrew Theurer wrote: > > I have noticed recently that PLE/yield_to() is still not that scalable > > for really large guests, sometimes even with no CPU over-commit. I have > > a small chang

Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-07 Thread Raghavendra K T
CCing PeterZ also. On 09/07/2012 06:41 PM, Andrew Theurer wrote: I have noticed recently that PLE/yield_to() is still not that scalable for really large guests, sometimes even with no CPU over-commit. I have a small change that makes a very big difference. First, let me explain what I saw: Tim

[RFC][PATCH] Improving directed yield scalability for PLE handler

2012-09-07 Thread Andrew Theurer
I have noticed recently that PLE/yield_to() is still not that scalable for really large guests, sometimes even with no CPU over-commit. I have a small change that makes a very big difference. First, let me explain what I saw: Time to boot a 3.6-rc kernel in an 80-way VM on a 4 socket, 40 core, 80