Re: [PART1 RFC v3 12/12] svm: Manage vcpu load/unload when enable AVIC

2016-04-06 Thread Radim Krčmář
2016-04-06 10:40+0700, Suravee Suthikulpanit:
> On 04/05/2016 09:56 PM, Radim Krčmář wrote:
>>I meant to change the place where we remember that is_running must not
>>be true.  Something like
>>
>>   svm_vcpu_blocking(struct kvm_vcpu *vcpu):
>>       vcpu->is_blocking = true;
>>       avic_set_running(vcpu, false);
>>
>>   avic_vcpu_load(struct kvm_vcpu *vcpu, bool is_load):
>>       avic_set_running(vcpu, is_load && !vcpu->is_blocking)
> 
> I assume that you also imply that we would also need:
> 
>   svm_vcpu_unblocking(struct kvm_vcpu *vcpu) {
>       avic_set_running(vcpu, false);
>       vcpu->is_blocking = false;
>   }

Yes, though the order should be flipped to avoid a suboptimal case when
preemption hits us after avic_set_running(), and unblocking has to set
is_running to true.

   static void svm_vcpu_unblocking(struct kvm_vcpu *vcpu) {
       vcpu->is_blocking = false;
       avic_set_running(vcpu, true);
   }

avic_set_running has barriers that prevent GCC from harmful reordering.
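
A minimal single-threaded model of the scheme discussed above, as a sketch
only. The struct, the model_* names and the bit position of is_running are
assumptions made for illustration; they are not the KVM types, and the model
ignores barriers and concurrency entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position for is_running; chosen for illustration only. */
#define MODEL_IS_RUNNING (1ULL << 62)

/* Hypothetical stand-in for the per-vCPU state; not the real KVM structs. */
struct model_vcpu {
	uint64_t entry;    /* models the vCPU's physical APIC ID table entry */
	bool is_blocking;  /* true between ...blocking and ...unblocking */
};

static void model_set_running(struct model_vcpu *v, bool run)
{
	if (run)
		v->entry |= MODEL_IS_RUNNING;
	else
		v->entry &= ~MODEL_IS_RUNNING;
}

static bool model_is_running(const struct model_vcpu *v)
{
	return (v->entry & MODEL_IS_RUNNING) != 0;
}

static void model_vcpu_blocking(struct model_vcpu *v)
{
	v->is_blocking = true;
	model_set_running(v, false);
}

/* Flag first, then the bit: a preemption in between reloads with run=true. */
static void model_vcpu_unblocking(struct model_vcpu *v)
{
	v->is_blocking = false;
	model_set_running(v, true);
}

/* Called on sched-in (is_load = true) and sched-out (is_load = false). */
static void model_vcpu_load(struct model_vcpu *v, bool is_load)
{
	model_set_running(v, is_load && !v->is_blocking);
}
```

In this model a vCPU that is scheduled out and back in while blocking keeps
is_running clear, and only model_vcpu_unblocking() sets it again.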


Re: [PART1 RFC v3 12/12] svm: Manage vcpu load/unload when enable AVIC

2016-04-05 Thread Suravee Suthikulpanit

Radim,

On 04/05/2016 09:56 PM, Radim Krčmář wrote:
> I meant to change the place where we remember that is_running must not
> be true.  Something like
>
>    svm_vcpu_blocking(struct kvm_vcpu *vcpu):
>        vcpu->is_blocking = true;
>        avic_set_running(vcpu, false);
>
>    avic_vcpu_load(struct kvm_vcpu *vcpu, bool is_load):
>        avic_set_running(vcpu, is_load && !vcpu->is_blocking)

I assume you imply that we would also need:

    svm_vcpu_unblocking(struct kvm_vcpu *vcpu) {
        avic_set_running(vcpu, false);
        vcpu->is_blocking = false;
    }

Thanks,
Suravee


Re: [PART1 RFC v3 12/12] svm: Manage vcpu load/unload when enable AVIC

2016-04-05 Thread Radim Krčmář
2016-04-05 17:07+0700, Suravee Suthikulpanit:
> On 3/31/16 21:19, Radim Krčmář wrote:
>>2016-03-31 15:52+0700, Suravee Suthikulpanit:
>>>On 03/19/2016 04:44 AM, Radim Krčmář wrote:
>>>>2016-03-18 01:09-0500, Suravee Suthikulpanit:
>>>>>+  } else {
>>>>>+  new_entry = READ_ONCE(*entry);
>>>>>+  /**
>>>>>+   * This handles the case when vcpu is scheduled out
>>>>>+   * and has not yet called blocking. We save the
>>>>>+   * AVIC running flag so that we can restore later.
>>>>>+   */
>>>>
>>>>is_running must be disabled in between ...blocking and ...unblocking,
>>>>because we don't want to miss interrupts and block forever.
>>>>I somehow don't get it from the comment. :)
>>>
>>>Not sure if I understand your concern.  However, the is_running bit
>>>setting/clearing should be handled in the avic_set_running below. This part
>>>only handles the case when the is_running bit is still set when calling
>>>vcpu_put (and later on loading some other vcpus). This way, when we are
>>>re-loading this vcpu, we can restore the is_running bit accordingly.
>>
>>I think that the comment is misleading.  The saved is_running flag only
>>matters after svm_vcpu_blocking, yet the comment says that it handles
>>the irrelevant case before.
> 
> Actually, my understanding is that if svm_vcpu_blocking() is called, the
> is_running bit would have been cleared. At this point, if the vcpu is
> unloaded, we should not need to worry about it. Is that not the case here?

svm_vcpu_blocking() clears is_running so we don't wait infinitely if an
interrupt arrives between kvm_vcpu_check_block() and schedule().
was_running ensures that preempt notifiers don't set is_running between
kvm_vcpu_check_block() and schedule() and it's the only place where we
need to worry about was_running causing a bug.

The comment would be better if it covered the case we actually care
about and I think that we can change was_running to make it clear even
without a comment.

>>Another minor bug is that was_running isn't initialized to 1, so we need
>>to halt before is_running gets set.
> 
> Just to make sure, you are referring to the point where is_running is
> not set the first time the vcpu is loaded?

Yes.

>>It would be clearer to toggle a flag in svm_vcpu_(un)blocking and set
>>is_running = !is_blocking.
> 
> Not sure what you meant here. We are already setting/unsetting the
> is_running bit when the vcpu is blocking/unblocking. Are you suggesting
> simply moving the current avic_set_running() into svm_vcpu_blocking() and
> svm_vcpu_unblocking()?

No, that would be buggy.  (The code needs to force is_running to true on
svm_vcpu_unblocking().)

I meant to change the place where we remember that is_running must not
be true.  Something like

  svm_vcpu_blocking(struct kvm_vcpu *vcpu):
      vcpu->is_blocking = true;
      avic_set_running(vcpu, false);

  avic_vcpu_load(struct kvm_vcpu *vcpu, bool is_load):
      avic_set_running(vcpu, is_load && !vcpu->is_blocking)

>>Doing so will also be immeasurably faster,
>>because avic_vcpu_load is called far more than svm_vcpu_(un)blocking.
> 
> Actually, this is not the same as handling normal vcpu blocking and
> unblocking, where we are already setting/un-setting the is_running bit in
> avic_set_running().

There is no practical difference after fixing the bug where was_running
starts as 0.

> The was_running should only be set to 1 if the vcpu
> is unloaded but has not yet called halt.

Yes.  was_running must be 0 inside of svm_vcpu_blocking and
svm_vcpu_unblocking and should be 1 outside.
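
To make the was_running initialization issue concrete, here is a small
single-threaded model of the save/restore logic in the RFC patch. It is a
sketch under assumptions: the wr_* names and the struct are hypothetical, and
the two booleans only stand in for the is_running bit and
svm->avic_was_running.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; not the kernel data structures. */
struct wr_model {
	bool is_running;   /* models the is_running bit in the table entry */
	bool was_running;  /* models svm->avic_was_running */
};

/* Load path: is_running is restored from the saved flag. */
static void wr_load(struct wr_model *m)
{
	m->is_running = m->was_running;
}

/* Put path: remember whether the vCPU was running, then clear the bit. */
static void wr_put(struct wr_model *m)
{
	m->was_running = m->is_running;
	m->is_running = false;
}

/* Halt and unhalt toggle the bit directly. */
static void wr_blocking(struct wr_model *m)
{
	m->is_running = false;
}

static void wr_unblocking(struct wr_model *m)
{
	m->is_running = true;
}
```

With was_running starting as false, the first wr_load() leaves is_running
clear until a halt/unhalt cycle happens; starting it as true avoids that,
and a put/load pair in the middle of blocking still keeps is_running clear.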

> Am I missing your points somehow?

I'm not sure ...


Re: [PART1 RFC v3 12/12] svm: Manage vcpu load/unload when enable AVIC

2016-04-05 Thread Suravee Suthikulpanit

Hi Radim,

On 3/31/16 21:19, Radim Krčmář wrote:
> 2016-03-31 15:52+0700, Suravee Suthikulpanit:
>> On 03/19/2016 04:44 AM, Radim Krčmář wrote:
>>> 2016-03-18 01:09-0500, Suravee Suthikulpanit:
>>>> +   } else {
>>>> +   new_entry = READ_ONCE(*entry);
>>>> +   /**
>>>> +* This handles the case when vcpu is scheduled out
>>>> +* and has not yet called blocking. We save the
>>>> +* AVIC running flag so that we can restore later.
>>>> +*/
>>>
>>> is_running must be disabled in between ...blocking and ...unblocking,
>>> because we don't want to miss interrupts and block forever.
>>> I somehow don't get it from the comment. :)
>>
>> Not sure if I understand your concern.  However, the is_running bit
>> setting/clearing should be handled in the avic_set_running below. This part
>> only handles the case when the is_running bit is still set when calling
>> vcpu_put (and later on loading some other vcpus). This way, when we are
>> re-loading this vcpu, we can restore the is_running bit accordingly.
>
> I think that the comment is misleading.  The saved is_running flag only
> matters after svm_vcpu_blocking, yet the comment says that it handles
> the irrelevant case before.

Actually, my understanding is that if svm_vcpu_blocking() is called, the
is_running bit would have been cleared. At this point, if the vcpu is
unloaded, we should not need to worry about it. Is that not the case here?

> Another minor bug is that was_running isn't initialized to 1, so we need
> to halt before is_running gets set.

Just to make sure, you are referring to the point where is_running is
not set the first time the vcpu is loaded?  If so, I agree. Thanks
for the good catch.

> It would be clearer to toggle a flag in svm_vcpu_(un)blocking and set
> is_running = !is_blocking.

Not sure what you meant here. We are already setting/unsetting the
is_running bit when the vcpu is blocking/unblocking. Are you suggesting
simply moving the current avic_set_running() into svm_vcpu_blocking()
and svm_vcpu_unblocking()?

> Doing so will also be immeasurably faster,
> because avic_vcpu_load is called far more than svm_vcpu_(un)blocking.

Actually, this is not the same as handling normal vcpu blocking and
unblocking, where we are already setting/un-setting the is_running bit
in avic_set_running(). The was_running should only be set to 1 if
the vcpu is unloaded but has not yet called halt.

Am I missing your points somehow?

Thanks,
Suravee


Re: [PART1 RFC v3 12/12] svm: Manage vcpu load/unload when enable AVIC

2016-03-31 Thread Radim Krčmář
2016-03-31 15:52+0700, Suravee Suthikulpanit:
> On 03/19/2016 04:44 AM, Radim Krčmář wrote:
>>2016-03-18 01:09-0500, Suravee Suthikulpanit:
>>>+} else {
>>>+new_entry = READ_ONCE(*entry);
>>>+/**
>>>+ * This handles the case when vcpu is scheduled out
>>>+ * and has not yet called blocking. We save the
>>>+ * AVIC running flag so that we can restore later.
>>>+ */
>>
>>is_running must be disabled in between ...blocking and ...unblocking,
>>because we don't want to miss interrupts and block forever.
>>I somehow don't get it from the comment. :)
> 
> Not sure if I understand your concern.  However, the is_running bit
> setting/clearing should be handled in the avic_set_running below. This part
> only handles the case when the is_running bit is still set when calling
> vcpu_put (and later on loading some other vcpus). This way, when we are
> re-loading this vcpu, we can restore the is_running bit accordingly.

I think that the comment is misleading.  The saved is_running flag only
matters after svm_vcpu_blocking, yet the comment says that it handles
the irrelevant case before.

Another minor bug is that was_running isn't initialized to 1, so we need
to halt before is_running gets set.
It would be clearer to toggle a flag in svm_vcpu_(un)blocking and set
is_running = !is_blocking.  Doing so will also be immeasurably faster,
because avic_vcpu_load is called far more than svm_vcpu_(un)blocking.


Re: [PART1 RFC v3 12/12] svm: Manage vcpu load/unload when enable AVIC

2016-03-31 Thread Suravee Suthikulpanit

Hi Radim,

On 03/19/2016 04:44 AM, Radim Krčmář wrote:
> 2016-03-18 01:09-0500, Suravee Suthikulpanit:
>> From: Suravee Suthikulpanit
>> +
>> +   WRITE_ONCE(*entry, new_entry);
>
> This will translate to two writes in 32 bit mode and we need to write
> physical ID first to avoid spurious doorbells ...
> is the order guaranteed?

Hm, I'm not sure about that.

>> +   } else {
>> +   new_entry = READ_ONCE(*entry);
>> +   /**
>> +* This handles the case when vcpu is scheduled out
>> +* and has not yet called blocking. We save the
>> +* AVIC running flag so that we can restore later.
>> +*/
>
> is_running must be disabled in between ...blocking and ...unblocking,
> because we don't want to miss interrupts and block forever.
> I somehow don't get it from the comment. :)

Not sure if I understand your concern.  However, the is_running bit
setting/clearing should be handled in the avic_set_running below. This
part only handles the case when the is_running bit is still set when
calling vcpu_put (and later on loading some other vcpus). This way, when
we are re-loading this vcpu, we can restore the is_running bit accordingly.


Thanks,
Suravee



[PART1 RFC v3 12/12] svm: Manage vcpu load/unload when enable AVIC

2016-03-19 Thread Suravee Suthikulpanit
From: Suravee Suthikulpanit 

When a vcpu is loaded/unloaded to a physical core, we need to update
host physical APIC ID information in the Physical APIC-ID table
accordingly.

Also, when a vCPU is blocking/un-blocking (due to the halt instruction),
we need to make sure that the is-running bit is set accordingly in the
physical APIC-ID table.

Signed-off-by: Suravee Suthikulpanit 
---
 arch/x86/kvm/svm.c | 121 +
 1 file changed, 121 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d5418c3..c5e8100 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -1334,6 +1335,110 @@ free_avic:
return err;
 }
 
+static int avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu, bool is_load)
+{
+   int h_phy_apic_id;
+   u64 *entry, new_entry;
+   struct vcpu_svm *svm = to_svm(vcpu);
+   int ret = 0;
+
+   if (!svm_vcpu_avic_enabled(svm))
+   return 0;
+
+   if (!svm)
+   return -EINVAL;
+
+   /* Note: APIC ID = 0xff is used for broadcast.
+*   APIC ID > 0xff is reserved.
+*/
+   h_phy_apic_id = __default_cpu_present_to_apicid(cpu);
+
+   if (h_phy_apic_id >= AVIC_PHY_APIC_ID_MAX)
+   return -EINVAL;
+
+   entry = svm->avic_phy_apic_id_cache;
+   if (!entry)
+   return -EINVAL;
+
+   if (is_load) {
+   new_entry = READ_ONCE(*entry);
+
+   BUG_ON(new_entry & AVIC_PHY_APIC_ID__IS_RUN_MSK);
+
+   new_entry &= ~AVIC_PHY_APIC_ID__HOST_PHY_APIC_ID_MSK;
+   new_entry |= (h_phy_apic_id & AVIC_PHY_APIC_ID__HOST_PHY_APIC_ID_MSK);
+
+   /**
+* Restore AVIC running flag if it was set during
+* vcpu unload.
+*/
+   if (svm->avic_was_running)
+   new_entry |= AVIC_PHY_APIC_ID__IS_RUN_MSK;
+   else
+   new_entry &= ~AVIC_PHY_APIC_ID__IS_RUN_MSK;
+
+   WRITE_ONCE(*entry, new_entry);
+
+   } else {
+   new_entry = READ_ONCE(*entry);
+
+   /**
+* This handles the case when vcpu is scheduled out
+* and has not yet called blocking. We save the
+* AVIC running flag so that we can restore later.
+*/
+   if (new_entry & AVIC_PHY_APIC_ID__IS_RUN_MSK) {
+   svm->avic_was_running = true;
+   new_entry &= ~AVIC_PHY_APIC_ID__IS_RUN_MSK;
+   WRITE_ONCE(*entry, new_entry);
+   } else {
+   svm->avic_was_running = false;
+   }
+   }
+
+   return ret;
+}
+
+/**
+ * This function is called during VCPU halt/unhalt.
+ */
+static int avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
+{
+   int ret = 0;
+   int h_phy_apic_id;
+   u64 *entry, new_entry;
+   struct vcpu_svm *svm = to_svm(vcpu);
+
+   if (!svm_vcpu_avic_enabled(svm))
+   return 0;
+
+   /* Note: APIC ID = 0xff is used for broadcast.
+*   APIC ID > 0xff is reserved.
+*/
+   h_phy_apic_id = __default_cpu_present_to_apicid(vcpu->cpu);
+
+   if (h_phy_apic_id >= AVIC_PHY_APIC_ID_MAX)
+   return -EINVAL;
+
+   entry = svm->avic_phy_apic_id_cache;
+   if (!entry)
+   return -EINVAL;
+
+   if (is_run) {
+   /* Handle vcpu unblocking after HLT */
+   new_entry = READ_ONCE(*entry);
+   new_entry |= AVIC_PHY_APIC_ID__IS_RUN_MSK;
+   WRITE_ONCE(*entry, new_entry);
+   } else {
+   /* Handle vcpu blocking due to HLT */
+   new_entry = READ_ONCE(*entry);
+   new_entry &= ~AVIC_PHY_APIC_ID__IS_RUN_MSK;
+   WRITE_ONCE(*entry, new_entry);
+   }
+
+   return ret;
+}
+
 static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 {
struct vcpu_svm *svm = to_svm(vcpu);
@@ -1476,6 +1581,8 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
/* This assumes that the kernel never uses MSR_TSC_AUX */
if (static_cpu_has(X86_FEATURE_RDTSCP))
wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
+
+   avic_vcpu_load(vcpu, cpu, true);
 }
 
 static void svm_vcpu_put(struct kvm_vcpu *vcpu)
@@ -1483,6 +1590,8 @@ static void svm_vcpu_put(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
int i;
 
+   avic_vcpu_load(vcpu, 0, false);
+
++vcpu->stat.host_state_reload;
kvm_load_ldt(svm->host.ldt);
 #ifdef CONFIG_X86_64
@@ -1498,6 +1607,16 @@ static void svm_vcpu_put(struct kvm_vcpu *vcpu)
wrmsrl(host_save_user_msrs[i], svm->host_user_msrs[i]);
 }
 
+static void svm_vcpu_blocking(struct kvm_vcpu *vcpu)
+{
+   avic_set_running(vcpu, false);
+}
+

Re: [PART1 RFC v3 12/12] svm: Manage vcpu load/unload when enable AVIC

2016-03-19 Thread Radim Krčmář
2016-03-18 01:09-0500, Suravee Suthikulpanit:
> From: Suravee Suthikulpanit 
> 
> When a vcpu is loaded/unloaded to a physical core, we need to update
> host physical APIC ID information in the Physical APIC-ID table
> accordingly.
> 
> Also, when a vCPU is blocking/un-blocking (due to the halt instruction),
> we need to make sure that the is-running bit is set accordingly in the
> physical APIC-ID table.
> 
> Signed-off-by: Suravee Suthikulpanit 
> ---
>  arch/x86/kvm/svm.c | 121 +
>  1 file changed, 121 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> +static int avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu, bool is_load)
> +{
> + int h_phy_apic_id;

(Paolo said a lot about those names.)

> + u64 *entry, new_entry;
> + struct vcpu_svm *svm = to_svm(vcpu);
> + int ret = 0;
> + if (!svm_vcpu_avic_enabled(svm))
> + return 0;

(The check for NULL below feels weird when it has already been used.)

> +
> + if (!svm)
> + return -EINVAL;

!svm means that KVM completely blew up ... don't check for it.
(See implementation of to_svm.)

> +
> + /* Note: APIC ID = 0xff is used for broadcast.
> +  *   APIC ID > 0xff is reserved.
> +  */
> + h_phy_apic_id = __default_cpu_present_to_apicid(cpu);
> +
> + if (h_phy_apic_id >= AVIC_PHY_APIC_ID_MAX)
> + return -EINVAL;
> +
> + entry = svm->avic_phy_apic_id_cache;

The naming is confusing ... can avic_phy_apic_id_cache change during
execution of this function?
If yes, then add READ_ONCE and distinguish the pointer name.
If not, then use svm->avic_phy_apic_id_cache directly.

entry would be an OK name for the current new_entry.

> + if (!entry)
> + return -EINVAL;
> +
> + if (is_load) {
> + new_entry = READ_ONCE(*entry);

Please move this before the if.

> + BUG_ON(new_entry & AVIC_PHY_APIC_ID__IS_RUN_MSK);
> +
> + new_entry &= ~AVIC_PHY_APIC_ID__HOST_PHY_APIC_ID_MSK;
> + new_entry |= (h_phy_apic_id & AVIC_PHY_APIC_ID__HOST_PHY_APIC_ID_MSK);
> +
> + /**
> +  * Restore AVIC running flag if it was set during
> +  * vcpu unload.
> +  */
> + if (svm->avic_was_running)
> + new_entry |= AVIC_PHY_APIC_ID__IS_RUN_MSK;
> + else
> + new_entry &= ~AVIC_PHY_APIC_ID__IS_RUN_MSK;

You even BUG_ON when AVIC_PHY_APIC_ID__IS_RUN_MSK is set, so there is no
reason to clear it.

(Also, don't BUG.)

> +
> + WRITE_ONCE(*entry, new_entry);

This will translate to two writes in 32-bit mode, and we need to write
the physical ID first to avoid spurious doorbells ...
is the order guaranteed?
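
If it is not, the order could be made explicit.  A minimal sketch under
stated assumptions: the helper avic_write_entry_split() is made up, the
entry is viewed as two little-endian 32-bit halves, and is_running is
assumed to sit at bit 62 (bit 30 of the high half).  On x86, ordinary
write-back stores are not reordered with other stores, so a compiler
barrier between the halves should be enough there.

```c
#include <stdint.h>

#define IS_RUN_MSK_HI (1U << 30)	/* bit 62 of the entry, as seen in the high half (assumed) */

/* Compiler barrier, same idea as the kernel's barrier(). */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Hypothetical helper: publish a 64-bit entry as two ordered 32-bit stores.
 * Assumes little-endian layout: half[0] holds bits 31:0, half[1] bits 63:32.
 */
static void avic_write_entry_split(volatile uint32_t half[2], uint64_t val)
{
	half[0] = (uint32_t)val;		/* physical ID lives in the low half */
	compiler_barrier();
	half[1] = (uint32_t)(val >> 32);	/* is_running lives in the high half */
}
```

With this order, nothing can observe is_running set together with a
stale physical ID.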

> + } else {
> + new_entry = READ_ONCE(*entry);
> + /**
> +  * This handles the case when vcpu is scheduled out
> +  * and has not yet not called blocking. We save the
> +  * AVIC running flag so that we can restore later.
> +  */

is_running must be disabled in between ...blocking and ...unblocking,
because we don't want to miss interrupts and block forever.
I somehow don't get it from the comment. :)
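
To spell out the invariant, here is a small user-space model.  It is
only a sketch: struct vcpu_model, its fields and the model_*() helpers
are invented for illustration and are not the real KVM structures.  The
point is that load must never turn is_running back on while the vcpu
sits between blocking and unblocking.

```c
#include <stdbool.h>

/* Made-up model of the state involved; not the real struct kvm_vcpu. */
struct vcpu_model {
	bool is_blocking;	/* between ...blocking and ...unblocking */
	bool is_running;	/* models the is_running bit in the table entry */
};

/* Entering the blocked state: remember it and clear is_running. */
static void model_blocking(struct vcpu_model *v)
{
	v->is_blocking = true;
	v->is_running = false;
}

/* Leaving the blocked state: clear the flag first, then set is_running. */
static void model_unblocking(struct vcpu_model *v)
{
	v->is_blocking = false;
	v->is_running = true;
}

/* Load/unload (sched in/out) must never set is_running while blocking. */
static void model_load(struct vcpu_model *v, bool is_load)
{
	v->is_running = is_load && !v->is_blocking;
}
```

A preemption followed by a reload in the middle of blocking then leaves
is_running clear, so an interrupt still goes through KVM and can wake
the vcpu.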

> + if (new_entry & AVIC_PHY_APIC_ID__IS_RUN_MSK) {
> + svm->avic_was_running = true;
> + new_entry &= ~AVIC_PHY_APIC_ID__IS_RUN_MSK;
> + WRITE_ONCE(*entry, new_entry);
> + } else {
> + svm->avic_was_running = false;
> + }

(This can be shorter by setting avic_was_running first.)
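
Roughly like this, as an untested user-space sketch.  The helper name
avic_entry_for_unload() and the mask value are assumptions for
illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define IS_RUN_MSK (1ULL << 62)	/* assumed position of the is_running bit */

/*
 * Hypothetical helper for the unload path: remember the flag first,
 * then clear it unconditionally.  Returns the entry to write back.
 */
static uint64_t avic_entry_for_unload(uint64_t entry, bool *was_running)
{
	*was_running = entry & IS_RUN_MSK;
	return entry & ~IS_RUN_MSK;
}
```

The write-back becomes unconditional, which costs one redundant store
when the bit was already clear.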

> + }
> +
> + return ret;

(return 0;)

> +}
> +
> +/**
> + * This function is called during VCPU halt/unhalt.
> + */
> +static int avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
> +{
> + int ret = 0;
> + int h_phy_apic_id;
> + u64 *entry, new_entry;
> + struct vcpu_svm *svm = to_svm(vcpu);
> +
> + if (!svm_vcpu_avic_enabled(svm))
> + return 0;
> +
> + /* Note: APIC ID = 0xff is used for broadcast.
> +  *   APIC ID > 0xff is reserved.
> +  */
> + h_phy_apic_id = __default_cpu_present_to_apicid(vcpu->cpu);
> +
> + if (h_phy_apic_id >= AVIC_PHY_APIC_ID_MAX)
> + return -EINVAL;

The cache should be valid only if this condition is true.
We can get rid of it in both functions.

> +
> + entry = svm->avic_phy_apic_id_cache;
> + if (!entry)
> + return -EINVAL;
> +
> + if (is_run) {

Both READ_ONCE and WRITE_ONCE belong outside of the if.
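
Something like the following, as a user-space sketch only.  READ_ONCE
and WRITE_ONCE here are simple volatile stand-ins for the kernel macros,
and avic_set_running_sketch() and the mask value are assumptions made up
for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define IS_RUN_MSK (1ULL << 62)	/* assumed position of the is_running bit */

/* User-space stand-ins for the kernel macros of the same name. */
#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical, simplified avic_set_running() operating on a bare entry. */
static void avic_set_running_sketch(uint64_t *entry_ptr, bool is_run)
{
	uint64_t entry = READ_ONCE(*entry_ptr);

	if (is_run)
		entry |= IS_RUN_MSK;	/* vcpu unblocking after HLT */
	else
		entry &= ~IS_RUN_MSK;	/* vcpu blocking due to HLT */

	WRITE_ONCE(*entry_ptr, entry);
}
```

Only the bit manipulation differs between the two branches, so one read
and one write cover both.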

> + /* Handle vcpu unblocking after HLT */
> + new_entry = READ_ONCE(*entry);
> + new_entry |= AVIC_PHY_APIC_ID__IS_RUN_MSK;
> + WRITE_ONCE(*entry, new_entry);
> + } else {
> + /* Handle vcpu blocking due to HLT */
> + new_entry = READ_ONCE(*entry);
> + new_entry &=