Re: [RFC PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2010-12-10 Thread Avi Kivity

On 12/09/2010 07:07 PM, Rik van Riel wrote:

Right. May be clearer by using a for () loop instead of the goto.



And open coding kvm_for_each_vcpu ?

Somehow I suspect that won't add to clarity...


No, I meant having a for (pass = 0; pass  2; ++pass) and nesting 
kvm_for_each_vcpu() in it.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2010-12-09 Thread Avi Kivity

On 12/09/2010 12:38 AM, Rik van Riel wrote:



- /* Sleep for 100 us, and hope lock-holder got scheduled */
- expires = ktime_add_ns(ktime_get(), 10UL);
- schedule_hrtimeout(expires, HRTIMER_MODE_ABS);
+ if (first_round last_boosted_vcpu == kvm-last_boosted_vcpu) {
+ /* We have not found anyone yet. */
+ first_round = 0;
+ goto again;


Need to guarantee termination.


We do that by setting first_round to 0 :)

We at most walk N+1 VCPUs in a VM with N VCPUs, with
this patch.



Right.  May be clearer by using a for () loop instead of the goto.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2010-12-09 Thread Rik van Riel

On 12/09/2010 05:28 AM, Avi Kivity wrote:

On 12/09/2010 12:38 AM, Rik van Riel wrote:



- /* Sleep for 100 us, and hope lock-holder got scheduled */
- expires = ktime_add_ns(ktime_get(), 10UL);
- schedule_hrtimeout(expires, HRTIMER_MODE_ABS);
+ if (first_round last_boosted_vcpu == kvm-last_boosted_vcpu) {
+ /* We have not found anyone yet. */
+ first_round = 0;
+ goto again;


Need to guarantee termination.


We do that by setting first_round to 0 :)

We at most walk N+1 VCPUs in a VM with N VCPUs, with
this patch.



Right. May be clearer by using a for () loop instead of the goto.


And open coding kvm_for_each_vcpu ?

Somehow I suspect that won't add to clarity...

--
All rights reversed
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2010-12-08 Thread Rik van Riel

On 12/05/2010 07:56 AM, Avi Kivity wrote:


+ if (vcpu == me)
+ continue;
+ if (vcpu-spinning)
+ continue;


You may well want to wake up a spinner. Suppose

A takes a lock
B preempts A
B grabs a ticket, starts spinning, yields to A
A releases lock
A grabs ticket, starts spinning

at this point, we want A to yield to B, but it won't because of this check.


That's a good point.  I guess we'll have to benchmark both with
and without the vcpu-spinning logic.


+ if (!task)
+ continue;
+ if (waitqueue_active(vcpu-wq))
+ continue;
+ if (task-flags PF_VCPU)
+ continue;
+ kvm-last_boosted_vcpu = i;
+ yield_to(task);
+ break;
+ }


I think a random selection algorithm will be a better fit against
special guest behaviour.


Possibly, though I suspect we'd have to hit very heavy overcommit ratios
with very large VMs before round robin stops working.


- /* Sleep for 100 us, and hope lock-holder got scheduled */
- expires = ktime_add_ns(ktime_get(), 10UL);
- schedule_hrtimeout(expires, HRTIMER_MODE_ABS);
+ if (first_round last_boosted_vcpu == kvm-last_boosted_vcpu) {
+ /* We have not found anyone yet. */
+ first_round = 0;
+ goto again;


Need to guarantee termination.


We do that by setting first_round to 0 :)

We at most walk N+1 VCPUs in a VM with N VCPUs, with
this patch.

--
All rights reversed
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2010-12-05 Thread Avi Kivity

On 12/03/2010 04:24 AM, Chris Wright wrote:

* Rik van Riel (r...@redhat.com) wrote:
  --- a/virt/kvm/kvm_main.c
  +++ b/virt/kvm/kvm_main.c
  @@ -1880,18 +1880,53 @@ void kvm_resched(struct kvm_vcpu *vcpu)
   }
   EXPORT_SYMBOL_GPL(kvm_resched);

  -void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu)
  +void kvm_vcpu_on_spin(struct kvm_vcpu *me)
   {
  - ktime_t expires;
  - DEFINE_WAIT(wait);
  + struct kvm *kvm = me-kvm;
  + struct kvm_vcpu *vcpu;
  + int last_boosted_vcpu = me-kvm-last_boosted_vcpu;

s/me-//

  + int first_round = 1;
  + int i;

  - prepare_to_wait(vcpu-wq,wait, TASK_INTERRUPTIBLE);
  + me-spinning = 1;
  +
  + /*
  +  * We boost the priority of a VCPU that is runnable but not
  +  * currently running, because it got preempted by something
  +  * else and called schedule in __vcpu_run.  Hopefully that
  +  * VCPU is holding the lock that we need and will release it.
  +  * We approximate round-robin by starting at the last boosted VCPU.
  +  */
  + again:
  + kvm_for_each_vcpu(i, vcpu, kvm) {
  + struct task_struct *task = vcpu-task;
  + if (first_round  i  last_boosted_vcpu) {
  + i = last_boosted_vcpu;
  + continue;
  + } else if (!first_round  i  last_boosted_vcpu)
  + break;
  + if (vcpu == me)
  + continue;
  + if (vcpu-spinning)
  + continue;
  + if (!task)
  + continue;
  + if (waitqueue_active(vcpu-wq))
  + continue;
  + if (task-flags  PF_VCPU)
  + continue;

I wonder if you set vcpu-task in sched_out and then NULL it in sched_in
if you'd get what you want you could simplify the checks.  Basically
that would be only the preempted runnable vcpus.


They may be sleeping due to some other reason (HLT, major page fault).

Better check is that the task is runnable but not running.  Can we get 
this information from a task?


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 3/3] kvm: use yield_to instead of sleep in kvm_vcpu_on_spin

2010-12-02 Thread Chris Wright
* Rik van Riel (r...@redhat.com) wrote:
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -1880,18 +1880,53 @@ void kvm_resched(struct kvm_vcpu *vcpu)
  }
  EXPORT_SYMBOL_GPL(kvm_resched);
  
 -void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu)
 +void kvm_vcpu_on_spin(struct kvm_vcpu *me)
  {
 - ktime_t expires;
 - DEFINE_WAIT(wait);
 + struct kvm *kvm = me-kvm;
 + struct kvm_vcpu *vcpu;
 + int last_boosted_vcpu = me-kvm-last_boosted_vcpu;

s/me-//

 + int first_round = 1;
 + int i;
  
 - prepare_to_wait(vcpu-wq, wait, TASK_INTERRUPTIBLE);
 + me-spinning = 1;
 +
 + /*
 +  * We boost the priority of a VCPU that is runnable but not
 +  * currently running, because it got preempted by something
 +  * else and called schedule in __vcpu_run.  Hopefully that
 +  * VCPU is holding the lock that we need and will release it.
 +  * We approximate round-robin by starting at the last boosted VCPU.
 +  */
 + again:
 + kvm_for_each_vcpu(i, vcpu, kvm) {
 + struct task_struct *task = vcpu-task;
 + if (first_round  i  last_boosted_vcpu) {
 + i = last_boosted_vcpu;
 + continue;
 + } else if (!first_round  i  last_boosted_vcpu)
 + break;
 + if (vcpu == me)
 + continue;
 + if (vcpu-spinning)
 + continue;
 + if (!task)
 + continue;
 + if (waitqueue_active(vcpu-wq))
 + continue;
 + if (task-flags  PF_VCPU)
 + continue;

I wonder if you set vcpu-task in sched_out and then NULL it in sched_in
if you'd get what you want you could simplify the checks.  Basically
that would be only the preempted runnable vcpus.

 + kvm-last_boosted_vcpu = i;
 + yield_to(task);

Just trying to think of ways to be sure this doesn't become just
yield() (although I thnk PF_VCPU is enough to catch that).
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html