Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-06 Thread Gleb Natapov
On Tue, Dec 06, 2011 at 02:54:06PM +0800, Liu ping fan wrote:
 On Mon, Dec 5, 2011 at 4:41 PM, Gleb Natapov g...@redhat.com wrote:
  On Mon, Dec 05, 2011 at 01:39:37PM +0800, Liu ping fan wrote:
  On Sun, Dec 4, 2011 at 8:10 PM, Gleb Natapov g...@redhat.com wrote:
   On Sun, Dec 04, 2011 at 07:53:37PM +0800, Liu ping fan wrote:
   On Sat, Dec 3, 2011 at 2:26 AM, Jan Kiszka jan.kis...@siemens.com 
   wrote:
On 2011-12-02 07:26, Liu Ping Fan wrote:
From: Liu Ping Fan pingf...@linux.vnet.ibm.com
   
Currently, a vcpu can be destroyed only when the kvm instance is destroyed.
Change this so that a vcpu's destruction takes place when its refcnt drops to
zero, and a vcpu MUST and CAN then be destroyed before the kvm instance's destruction.
   
I'm still lacking the big picture (it would be good to have it in the change
log - at least I'm too lazy to read the code):

What increments the refcnt, and what decrements it again? IOW, how does user
space control the life-cycle of a vcpu after your changes?
   
   In local APIC mode, when delivering an IPI to the target APIC, the target's
   refcnt is incremented, and decremented when finished. At other times, RCU is used to
   Why is this needed?
  
  Suppose the following scene:
 
  #define kvm_for_each_vcpu(idx, vcpup, kvm) \
          for (idx = 0; \
               idx < atomic_read(&kvm->online_vcpus) && \
               (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
               idx++)
 
  <---- Here kvm_vcpu's destruction is called
                vcpup->vcpu_id ...  // oops!
 
 
  And this is exactly how your code looks, i.e. you do not increment the
  reference count in most of the loops; you only increment it twice
  (in pic_unlock() and kvm_irq_delivery_to_apic()) because you are using the
  vcpu outside of the rcu_read_lock() protected section, and I do not see why
  not just extend the protected section to include kvm_vcpu_kick(). As far as
  I can see this function does not sleep.
 
 :-), I just wanted to minimize the RCU critical area, but as you say, we
 can extend the protected section to include kvm_vcpu_kick().
 
What's the point of trying to minimize it? The vcpu will not be freed any quicker.

  What should protect the vcpu from disappearing in your example above is RCU
  itself, if you are using it right. But since I do not see any calls to
  rcu_assign_pointer()/rcu_dereference(), I doubt you are actually using it
  right.
 
 Sorry, but I thought it would not be needed. Please help me check my thoughts:
 
 struct kvm_vcpu *kvm_vcpu_get(struct kvm_vcpu *vcpu)
 {
 	if (vcpu == NULL)
 		return NULL;
 	if (atomic_add_unless(&vcpu->refcount, 1, 0))	/* <-- increment */
 		return vcpu;
 	return NULL;
 }
 
 void kvm_vcpu_put(struct kvm_vcpu *vcpu)
 {
 	struct kvm *kvm;
 	if (atomic_dec_and_test(&vcpu->refcount)) {	/* <-- decrement */
 		kvm = vcpu->kvm;
 		mutex_lock(&kvm->lock);
 		kvm->vcpus[vcpu->vcpu_id] = NULL;
 		atomic_dec(&kvm->online_vcpus);
 		mutex_unlock(&kvm->lock);
 		call_rcu(&vcpu->head, kvm_vcpu_zap);
 	}
 }
 
 The atomicity of the decrement and the increment is guaranteed by the cache
 coherence protocol. So once we hold a valid kvm_vcpu pointer obtained through
 kvm_vcpu_get(), it stays valid until we release it with kvm_vcpu_put(); only
 then may the destruction happen.
 
My point is that you do not need those atomics at all, not that they are
incorrect. You either protect vcpus with reference counters or with RCU, but
not both. The point of RCU is that you do not need any locking on read access
to the data structure, so if you add locking (by means of reference counting)
anyway, just use a rwlock around accesses to the vcpus array and be done
with it.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-05 Thread Gleb Natapov
On Mon, Dec 05, 2011 at 01:39:37PM +0800, Liu ping fan wrote:
 On Sun, Dec 4, 2011 at 8:10 PM, Gleb Natapov g...@redhat.com wrote:
  On Sun, Dec 04, 2011 at 07:53:37PM +0800, Liu ping fan wrote:
  On Sat, Dec 3, 2011 at 2:26 AM, Jan Kiszka jan.kis...@siemens.com wrote:
   On 2011-12-02 07:26, Liu Ping Fan wrote:
   From: Liu Ping Fan pingf...@linux.vnet.ibm.com
  
   Currently, a vcpu can be destroyed only when the kvm instance is destroyed.
   Change this so that a vcpu's destruction takes place when its refcnt drops
   to zero, and a vcpu MUST and CAN then be destroyed before the kvm instance's destruction.
  
   I'm still lacking the big picture (it would be good to have it in the change
   log - at least I'm too lazy to read the code):
  
   What increments the refcnt, and what decrements it again? IOW, how does user
   space control the life-cycle of a vcpu after your changes?
  
  In local APIC mode, when delivering an IPI to the target APIC, the target's
  refcnt is incremented, and decremented when finished. At other times, RCU is used to
  Why is this needed?
 
 Suppose the following scene:
 
 #define kvm_for_each_vcpu(idx, vcpup, kvm) \
         for (idx = 0; \
              idx < atomic_read(&kvm->online_vcpus) && \
              (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
              idx++)
 
 <---- Here kvm_vcpu's destruction is called
   vcpup->vcpu_id ...  // oops!
 
 
And this is exactly how your code looks, i.e. you do not increment the
reference count in most of the loops; you only increment it twice
(in pic_unlock() and kvm_irq_delivery_to_apic()) because you are using the
vcpu outside of the rcu_read_lock() protected section, and I do not see why
not just extend the protected section to include kvm_vcpu_kick(). As far as
I can see this function does not sleep.

What should protect the vcpu from disappearing in your example above is RCU
itself, if you are using it right. But since I do not see any calls to
rcu_assign_pointer()/rcu_dereference(), I doubt you are actually using it
right.

--
Gleb.


Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-05 Thread Avi Kivity
On 12/05/2011 07:29 AM, Liu ping fan wrote:
 like this,
 #define kvm_for_each_vcpu(idx, cnt, vcpup, kvm) \
   for (idx = 0, cnt = 0, vcpup = kvm_get_vcpu(kvm, idx); \
	cnt < atomic_read(&kvm->online_vcpus) && \
	idx < KVM_MAX_VCPUS; \
	idx++, (vcpup == NULL) ?: cnt++, vcpup = kvm_get_vcpu(kvm, idx)) \
	if (vcpup == NULL) \
		continue; \
	else


 A little ugly, but I have not thought of a better way :-)


#define kvm_for_each_vcpu(vcpu, it) for (vcpu = kvm_fev_init(it); vcpu;
vcpu = kvm_fev_next(it, vcpu))

Though that doesn't give a good place for rcu_read_unlock().



-- 
error compiling committee.c: too many arguments to function



Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-05 Thread Gleb Natapov
On Mon, Dec 05, 2011 at 11:30:51AM +0200, Avi Kivity wrote:
 On 12/05/2011 07:29 AM, Liu ping fan wrote:
  like this,
  #define kvm_for_each_vcpu(idx, cnt, vcpup, kvm) \
  for (idx = 0, cnt = 0, vcpup = kvm_get_vcpu(kvm, idx); \
       cnt < atomic_read(&kvm->online_vcpus) && \
       idx < KVM_MAX_VCPUS; \
       idx++, (vcpup == NULL) ?: cnt++, vcpup = kvm_get_vcpu(kvm, idx)) \
       if (vcpup == NULL) \
		continue; \
       else
 
 
  A little ugly, but I have not thought of a better way :-)
 
 
 #define kvm_for_each_vcpu(vcpu, it) for (vcpu = kvm_fev_init(it); vcpu;
 vcpu = kvm_fev_next(it, vcpu))
 
 Though that doesn't give a good place for rcu_read_unlock().
 
 
Why not use rculist to store vcpus and use list_for_each_entry_rcu()?

--
Gleb.


Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-05 Thread Avi Kivity
On 12/05/2011 11:42 AM, Gleb Natapov wrote:
 On Mon, Dec 05, 2011 at 11:30:51AM +0200, Avi Kivity wrote:
  On 12/05/2011 07:29 AM, Liu ping fan wrote:
   like this,
   #define kvm_for_each_vcpu(idx, cnt, vcpup, kvm) \
 for (idx = 0, cnt = 0, vcpup = kvm_get_vcpu(kvm, idx); \
      cnt < atomic_read(&kvm->online_vcpus) && \
      idx < KVM_MAX_VCPUS; \
      idx++, (vcpup == NULL) ?: cnt++, vcpup = kvm_get_vcpu(kvm, idx)) \
      if (vcpup == NULL) \
	      continue; \
      else
  
  
   A little ugly, but I have not thought of a better way :-)
  
  
  #define kvm_for_each_vcpu(vcpu, it) for (vcpu = kvm_fev_init(it); vcpu;
  vcpu = kvm_fev_next(it, vcpu))
  
  Though that doesn't give a good place for rcu_read_unlock().
  
  
 Why not use rculist to store vcpus and use list_for_each_entry_rcu()?

We can, but that's a bigger change.

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-05 Thread Gleb Natapov
On Mon, Dec 05, 2011 at 11:58:56AM +0200, Avi Kivity wrote:
 On 12/05/2011 11:42 AM, Gleb Natapov wrote:
  On Mon, Dec 05, 2011 at 11:30:51AM +0200, Avi Kivity wrote:
   On 12/05/2011 07:29 AM, Liu ping fan wrote:
like this,
#define kvm_for_each_vcpu(idx, cnt, vcpup, kvm) \
for (idx = 0, cnt = 0, vcpup = kvm_get_vcpu(kvm, idx); \
     cnt < atomic_read(&kvm->online_vcpus) && \
     idx < KVM_MAX_VCPUS; \
     idx++, (vcpup == NULL) ?: cnt++, vcpup = kvm_get_vcpu(kvm, idx)) \
     if (vcpup == NULL) \
	     continue; \
     else
   
   
A little ugly, but I have not thought of a better way :-)
   
   
   #define kvm_for_each_vcpu(vcpu, it) for (vcpu = kvm_fev_init(it); vcpu;
   vcpu = kvm_fev_next(it, vcpu))
   
   Though that doesn't give a good place for rcu_read_unlock().
   
   
  Why not use rculist to store vcpus and use list_for_each_entry_rcu()?
 
 We can, but that's a bigger change.
 
Is it? I do not see a lot of accesses to the vcpu array except in those loops.

--
Gleb.


Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-05 Thread Avi Kivity
On 12/05/2011 12:18 PM, Gleb Natapov wrote:
  
  We can, but that's a bigger change.
  
 Is it? I do not see a lot of accesses to vcpu array except those loops.


Well, actually some of those loops have to go away and be replaced by a
hash lookup with the apic id as the key.

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-05 Thread Gleb Natapov
On Mon, Dec 05, 2011 at 12:22:53PM +0200, Avi Kivity wrote:
 On 12/05/2011 12:18 PM, Gleb Natapov wrote:
   
   We can, but that's a bigger change.
   
  Is it? I do not see a lot of accesses to vcpu array except those loops.
 
 
 Well actually some of those loops have to go away and be replaced by a
 hash lookup with apic id as key.
 
Yes, but apic ids are guest controllable, so there should be a separate hash
that maps the guest-configured apic ids to vcpus. That shouldn't prevent us
from moving to rculist now.

--
Gleb.


Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-05 Thread Liu ping fan
On Mon, Dec 5, 2011 at 4:41 PM, Gleb Natapov g...@redhat.com wrote:
 On Mon, Dec 05, 2011 at 01:39:37PM +0800, Liu ping fan wrote:
 On Sun, Dec 4, 2011 at 8:10 PM, Gleb Natapov g...@redhat.com wrote:
  On Sun, Dec 04, 2011 at 07:53:37PM +0800, Liu ping fan wrote:
  On Sat, Dec 3, 2011 at 2:26 AM, Jan Kiszka jan.kis...@siemens.com wrote:
   On 2011-12-02 07:26, Liu Ping Fan wrote:
   From: Liu Ping Fan pingf...@linux.vnet.ibm.com
  
   Currently, a vcpu can be destroyed only when the kvm instance is destroyed.
   Change this so that a vcpu's destruction takes place when its refcnt drops
   to zero, and a vcpu MUST and CAN then be destroyed before the kvm instance's destruction.
  
   I'm still lacking the big picture (it would be good to have it in the
   change log - at least I'm too lazy to read the code):
  
   What increments the refcnt, and what decrements it again? IOW, how does
   user space control the life-cycle of a vcpu after your changes?
  
  In local APIC mode, when delivering an IPI to the target APIC, the target's
  refcnt is incremented, and decremented when finished. At other times, RCU is used to
  Why is this needed?
 
 Suppose the following scene:

 #define kvm_for_each_vcpu(idx, vcpup, kvm) \
         for (idx = 0; \
              idx < atomic_read(&kvm->online_vcpus) && \
              (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
              idx++)

 <---- Here kvm_vcpu's destruction is called
               vcpup->vcpu_id ...  // oops!


 And this is exactly how your code looks, i.e. you do not increment the
 reference count in most of the loops; you only increment it twice
 (in pic_unlock() and kvm_irq_delivery_to_apic()) because you are using the
 vcpu outside of the rcu_read_lock() protected section, and I do not see why
 not just extend the protected section to include kvm_vcpu_kick(). As far as
 I can see this function does not sleep.

:-), I just wanted to minimize the RCU critical area, but as you say, we
can extend the protected section to include kvm_vcpu_kick().

 What should protect the vcpu from disappearing in your example above is RCU
 itself, if you are using it right. But since I do not see any calls to
 rcu_assign_pointer()/rcu_dereference(), I doubt you are actually using it
 right.

Sorry, but I thought it would not be needed. Please help me check my thoughts:

struct kvm_vcpu *kvm_vcpu_get(struct kvm_vcpu *vcpu)
{
	if (vcpu == NULL)
		return NULL;
	if (atomic_add_unless(&vcpu->refcount, 1, 0))	/* <-- increment */
		return vcpu;
	return NULL;
}

void kvm_vcpu_put(struct kvm_vcpu *vcpu)
{
	struct kvm *kvm;
	if (atomic_dec_and_test(&vcpu->refcount)) {	/* <-- decrement */
		kvm = vcpu->kvm;
		mutex_lock(&kvm->lock);
		kvm->vcpus[vcpu->vcpu_id] = NULL;
		atomic_dec(&kvm->online_vcpus);
		mutex_unlock(&kvm->lock);
		call_rcu(&vcpu->head, kvm_vcpu_zap);
	}
}

The atomicity of the decrement and the increment is guaranteed by the cache
coherence protocol. So once we hold a valid kvm_vcpu pointer obtained through
kvm_vcpu_get(), it stays valid until we release it with kvm_vcpu_put(); only
then may the destruction happen.

Thanks and regards,
ping fan

 --
                        Gleb.


Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-04 Thread Avi Kivity
On 12/02/2011 08:26 AM, Liu Ping Fan wrote:
 From: Liu Ping Fan pingf...@linux.vnet.ibm.com

 Currently, a vcpu can be destroyed only when the kvm instance is destroyed.
 Change this so that a vcpu's destruction takes place when its refcnt drops to
 zero, and a vcpu MUST and CAN then be destroyed before the kvm instance's destruction.

  
 @@ -315,9 +315,17 @@ static void pit_do_work(struct work_struct *work)
 	 * LVT0 to NMI delivery. Other PIC interrupts are just sent to
 	 * VCPU0, and only if its LVT0 is in EXTINT mode.
 	 */
 -	if (kvm->arch.vapics_in_nmi_mode > 0)
 -		kvm_for_each_vcpu(i, vcpu, kvm)
 +	if (kvm->arch.vapics_in_nmi_mode > 0) {
 +		rcu_read_lock();
 +		kvm_for_each_vcpu(i, cnt, vcpu, kvm) {
 +			vcpu = kvm_get_vcpu(kvm, i);
 +			if (vcpu == NULL)
 +				continue;
 +			cnt++;
  			kvm_apic_nmi_wd_deliver(vcpu);
 +		}
 +		rcu_read_unlock();
 +	}
  	}
  }

This pattern keeps repeating, please fold it into kvm_for_each_vcpu().

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-04 Thread Liu ping fan
On Sat, Dec 3, 2011 at 2:26 AM, Jan Kiszka jan.kis...@siemens.com wrote:
 On 2011-12-02 07:26, Liu Ping Fan wrote:
 From: Liu Ping Fan pingf...@linux.vnet.ibm.com

 Currently, a vcpu can be destroyed only when the kvm instance is destroyed.
 Change this so that a vcpu's destruction takes place when its refcnt drops to
 zero, and a vcpu MUST and CAN then be destroyed before the kvm instance's destruction.

 I'm still lacking the big picture (it would be good to have it in the change
 log - at least I'm too lazy to read the code):

 What increments the refcnt, and what decrements it again? IOW, how does user
 space control the life-cycle of a vcpu after your changes?

In local APIC mode, when delivering an IPI to the target APIC, the target's
refcnt is incremented, and decremented when finished. At other times, RCU is
used to protect the vcpu's reference from its destruction.

If a kvm_vcpu is no longer needed by the guest, user space can close the
kvm_vcpu's file descriptor, and then, once the kvm_vcpu has left the window
in which local APIC mode holds a reference to it, it will be destroyed.

Regards,
ping fan

 Thanks,
 Jan

 --
 Siemens AG, Corporate Technology, CT T DE IT 1
 Corporate Competence Center Embedded Linux



Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-04 Thread Gleb Natapov
On Sun, Dec 04, 2011 at 07:53:37PM +0800, Liu ping fan wrote:
 On Sat, Dec 3, 2011 at 2:26 AM, Jan Kiszka jan.kis...@siemens.com wrote:
  On 2011-12-02 07:26, Liu Ping Fan wrote:
  From: Liu Ping Fan pingf...@linux.vnet.ibm.com
 
  Currently, a vcpu can be destroyed only when the kvm instance is destroyed.
  Change this so that a vcpu's destruction takes place when its refcnt drops
  to zero, and a vcpu MUST and CAN then be destroyed before the kvm instance's destruction.
 
  I'm still lacking the big picture (it would be good to have it in the change
  log - at least I'm too lazy to read the code):
 
  What increments the refcnt, and what decrements it again? IOW, how does user
  space control the life-cycle of a vcpu after your changes?
 
 In local APIC mode, when delivering an IPI to the target APIC, the target's
 refcnt is incremented, and decremented when finished. At other times, RCU is used to
Why is this needed?

 protect the vcpu's reference from its destruction.
 
 If a kvm_vcpu is no longer needed by the guest, user space can close the
 kvm_vcpu's file descriptor, and then, once the kvm_vcpu has left the window
 in which local APIC mode holds a reference to it, it will be destroyed.
 
 Regards,
 ping fan
 
  Thanks,
  Jan
 
  --
  Siemens AG, Corporate Technology, CT T DE IT 1
  Corporate Competence Center Embedded Linux

--
Gleb.


Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-04 Thread Liu ping fan
On Sun, Dec 4, 2011 at 6:23 PM, Avi Kivity a...@redhat.com wrote:
 On 12/02/2011 08:26 AM, Liu Ping Fan wrote:
 From: Liu Ping Fan pingf...@linux.vnet.ibm.com

 Currently, a vcpu can be destroyed only when the kvm instance is destroyed.
 Change this so that a vcpu's destruction takes place when its refcnt drops to
 zero, and a vcpu MUST and CAN then be destroyed before the kvm instance's destruction.


 @@ -315,9 +315,17 @@ static void pit_do_work(struct work_struct *work)
                * LVT0 to NMI delivery. Other PIC interrupts are just sent to
                * VCPU0, and only if its LVT0 is in EXTINT mode.
                */
 -             if (kvm->arch.vapics_in_nmi_mode > 0)
 -                     kvm_for_each_vcpu(i, vcpu, kvm)
 +             if (kvm->arch.vapics_in_nmi_mode > 0) {
 +                     rcu_read_lock();
 +                     kvm_for_each_vcpu(i, cnt, vcpu, kvm) {
 +                             vcpu = kvm_get_vcpu(kvm, i);
 +                             if (vcpu == NULL)
 +                                     continue;
 +                             cnt++;
                               kvm_apic_nmi_wd_deliver(vcpu);
 +                     }
 +                     rcu_read_unlock();
 +             }
       }
  }

 This pattern keeps repeating, please fold it into kvm_for_each_vcpu().

What about folding

	kvm_for_each_vcpu(i, cnt, vcpu, kvm) {
		vcpu = kvm_get_vcpu(kvm, i);
		if (vcpu == NULL)
			continue;
		cnt++;

into the macro itself, like this:
#define kvm_for_each_vcpu(idx, cnt, vcpup, kvm) \
for (idx = 0, cnt = 0, vcpup = kvm_get_vcpu(kvm, idx); \
     cnt < atomic_read(&kvm->online_vcpus) && \
     idx < KVM_MAX_VCPUS; \
     idx++, (vcpup == NULL) ?: cnt++, vcpup = kvm_get_vcpu(kvm, idx)) \
     if (vcpup == NULL) \
	     continue; \
     else


A little ugly, but I have not thought of a better way :-)

Thanks,
ping fan
 --
 error compiling committee.c: too many arguments to function



Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-04 Thread Liu ping fan
On Sun, Dec 4, 2011 at 8:10 PM, Gleb Natapov g...@redhat.com wrote:
 On Sun, Dec 04, 2011 at 07:53:37PM +0800, Liu ping fan wrote:
 On Sat, Dec 3, 2011 at 2:26 AM, Jan Kiszka jan.kis...@siemens.com wrote:
  On 2011-12-02 07:26, Liu Ping Fan wrote:
  From: Liu Ping Fan pingf...@linux.vnet.ibm.com
 
  Currently, a vcpu can be destroyed only when the kvm instance is destroyed.
  Change this so that a vcpu's destruction takes place when its refcnt drops
  to zero, and a vcpu MUST and CAN then be destroyed before the kvm instance's destruction.
 
  I'm still lacking the big picture (it would be good to have it in the change
  log - at least I'm too lazy to read the code):
 
  What increments the refcnt, and what decrements it again? IOW, how does user
  space control the life-cycle of a vcpu after your changes?
 
 In local APIC mode, when delivering an IPI to the target APIC, the target's
 refcnt is incremented, and decremented when finished. At other times, RCU is used to
 Why is this needed?

Suppose the following scene:

#define kvm_for_each_vcpu(idx, vcpup, kvm) \
        for (idx = 0; \
             idx < atomic_read(&kvm->online_vcpus) && \
             (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
             idx++)

<---- Here kvm_vcpu's destruction is called
  vcpup->vcpu_id ...  // oops!


Regards,
ping fan



 protect the vcpu's reference from its destruction.

 If a kvm_vcpu is no longer needed by the guest, user space can close the
 kvm_vcpu's file descriptor, and then, once the kvm_vcpu has left the window
 in which local APIC mode holds a reference to it, it will be destroyed.

 Regards,
 ping fan

  Thanks,
  Jan
 
  --
  Siemens AG, Corporate Technology, CT T DE IT 1
  Corporate Competence Center Embedded Linux

 --
                        Gleb.


Re: [PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-02 Thread Jan Kiszka
On 2011-12-02 07:26, Liu Ping Fan wrote:
 From: Liu Ping Fan pingf...@linux.vnet.ibm.com
 
 Currently, a vcpu can be destroyed only when the kvm instance is destroyed.
 Change this so that a vcpu's destruction takes place when its refcnt drops to
 zero, and a vcpu MUST and CAN then be destroyed before the kvm instance's destruction.

I'm still lacking the big picture (it would be good to have it in the change
log - at least I'm too lazy to read the code):

What increments the refcnt, and what decrements it again? IOW, how does user
space control the life-cycle of a vcpu after your changes?

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


[PATCH] kvm: make vcpu life cycle separated from kvm instance

2011-12-01 Thread Liu Ping Fan
From: Liu Ping Fan pingf...@linux.vnet.ibm.com

Currently, a vcpu can be destroyed only when the kvm instance is destroyed.
Change this so that a vcpu's destruction takes place when its refcnt drops to
zero, and a vcpu MUST and CAN then be destroyed before the kvm instance's destruction.

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 arch/x86/kvm/i8254.c |   14 +-
 arch/x86/kvm/i8259.c |   14 +-
 arch/x86/kvm/mmu.c   |   12 -
 arch/x86/kvm/x86.c   |   66 +++
 include/linux/kvm_host.h |   24 +--
 virt/kvm/irq_comm.c  |   16 ++--
 virt/kvm/kvm_main.c  |   98 --
 7 files changed, 188 insertions(+), 56 deletions(-)

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 76e3f1c..36e9943 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -289,7 +289,7 @@ static void pit_do_work(struct work_struct *work)
 	struct kvm_pit *pit = container_of(work, struct kvm_pit, expired);
 	struct kvm *kvm = pit->kvm;
 	struct kvm_vcpu *vcpu;
-	int i;
+	int i, cnt;
 	struct kvm_kpit_state *ps = &pit->pit_state;
 	int inject = 0;
 
@@ -315,9 +315,17 @@ static void pit_do_work(struct work_struct *work)
 	 * LVT0 to NMI delivery. Other PIC interrupts are just sent to
 	 * VCPU0, and only if its LVT0 is in EXTINT mode.
 	 */
-	if (kvm->arch.vapics_in_nmi_mode > 0)
-		kvm_for_each_vcpu(i, vcpu, kvm)
+	if (kvm->arch.vapics_in_nmi_mode > 0) {
+		rcu_read_lock();
+		kvm_for_each_vcpu(i, cnt, vcpu, kvm) {
+			vcpu = kvm_get_vcpu(kvm, i);
+			if (vcpu == NULL)
+				continue;
+			cnt++;
 			kvm_apic_nmi_wd_deliver(vcpu);
+		}
+		rcu_read_unlock();
+	}
 	}
 }
 
diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index cac4746..529057c 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -50,25 +50,33 @@ static void pic_unlock(struct kvm_pic *s)
 {
 	bool wakeup = s->wakeup_needed;
 	struct kvm_vcpu *vcpu, *found = NULL;
-	int i;
+	struct kvm *kvm = s->kvm;
+	int i, cnt;
 
 	s->wakeup_needed = false;
 
 	spin_unlock(&s->lock);
 
 	if (wakeup) {
-		kvm_for_each_vcpu(i, vcpu, s->kvm) {
+		rcu_read_lock();
+		kvm_for_each_vcpu(i, cnt, vcpu, kvm) {
+			vcpu = kvm_get_vcpu(kvm, i);
+			if (vcpu == NULL)
+				continue;
+			cnt++;
 			if (kvm_apic_accept_pic_intr(vcpu)) {
 				found = vcpu;
 				break;
 			}
 		}
-
+		found = kvm_vcpu_get(found);
+		rcu_read_unlock();
 		if (!found)
 			return;
 
 		kvm_make_request(KVM_REQ_EVENT, found);
 		kvm_vcpu_kick(found);
+		kvm_vcpu_put(found);
 	}
 }
 
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index f1b36cf..b9c3a01 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1833,11 +1833,17 @@ static void kvm_mmu_put_page(struct kvm_mmu_page *sp, u64 *parent_pte)
 
 static void kvm_mmu_reset_last_pte_updated(struct kvm *kvm)
 {
-	int i;
+	int i, cnt;
 	struct kvm_vcpu *vcpu;
-
-	kvm_for_each_vcpu(i, vcpu, kvm)
+	rcu_read_lock();
+	kvm_for_each_vcpu(i, cnt, vcpu, kvm) {
+		vcpu = kvm_get_vcpu(kvm, i);
+		if (vcpu == NULL)
+			continue;
+		cnt++;
 		vcpu->arch.last_pte_updated = NULL;
+	}
+	rcu_read_unlock();
 }
 
 static void kvm_mmu_unlink_parents(struct kvm *kvm, struct kvm_mmu_page *sp)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c38efd7..5bd8b95 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1830,11 +1830,19 @@ static int get_msr_hyperv(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 
 	switch (msr) {
 	case HV_X64_MSR_VP_INDEX: {
-		int r;
+		int r, cnt;
 		struct kvm_vcpu *v;
-		kvm_for_each_vcpu(r, v, vcpu->kvm)
+		struct kvm *kvm = vcpu->kvm;
+		rcu_read_lock();
+		kvm_for_each_vcpu(r, cnt, v, kvm) {
+			v = kvm_get_vcpu(kvm, r);
+			if (v == NULL)
+				continue;
+			cnt++;
 			if (v == vcpu)
 				data = r;
+		}
+		rcu_read_unlock();
 		break;
 	}
 	case HV_X64_MSR_EOI:
@@ -4966,7 +4974,7 @@