RE: KVM: VMX: disable SMEP feature when guest is in non-paging mode

2013-02-03 Thread Xu, Dongxiao
 -Original Message-
 From: Gleb Natapov [mailto:g...@redhat.com]
 Sent: Sunday, February 03, 2013 9:57 PM
 To: Xu, Dongxiao
 Cc: kvm@vger.kernel.org
 Subject: Re: KVM: VMX: disable SMEP feature when guest is in non-paging
 mode
 
 On Fri, Feb 01, 2013 at 08:30:07AM -, Xu wrote:
  SMEP is disabled if CPU is in non-paging mode in hardware.
  However KVM always uses paging mode to emulate guest non-paging
  mode with HAP.
 Not always, only if unrestricted mode is disabled, since vm86 mode, that
 is used otherwise, requires paging.
 
 To emulate this behavior, SMEP needs to be manually
  disabled when guest switches to non-paging mode.
 
  We met an issue that, SMP Linux guest with recent kernel (enable
  SMEP support, for example, 3.5.3) would crash with triple fault if
  setting unrestricted_guest=0. This is because KVM uses an identity
  mapping page table to emulate the non-paging mode, where the page
  table is set with USER flag. If SMEP is still enabled in this case,
  guest will meet unhandlable page fault and then crash.
 
  Signed-off-by: Dongxiao Xu dongxiao...@intel.com
  Signed-off-by: Xiantao Zhang xiantao.zh...@intel.com
 Reviewed-by: Gleb Natapov g...@redhat.com
 
 But HAP is XEN terminology AFAIK. KVM speak is tdp (two dimensional
 paging). It would be nice to change it in the commit message and the
 comment before committing.

Thanks for the reminder. I've sent out a v2 patch to address this issue.

Thanks,
Dongxiao

 
 
  ---
  arch/x86/kvm/vmx.c |8 
   1 files changed, 8 insertions(+), 0 deletions(-)
 
  diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
  index 9120ae1..e82f20d 100644
  --- a/arch/x86/kvm/vmx.c
  +++ b/arch/x86/kvm/vmx.c
  @@ -3155,6 +3155,14 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu,
 unsigned long cr4)
   if (!is_paging(vcpu)) {
   hw_cr4 &= ~X86_CR4_PAE;
   hw_cr4 |= X86_CR4_PSE;
   +   /*
   +* SMEP is disabled if CPU is in non-paging mode in
   +* hardware. However KVM always uses paging mode to
   +* emulate guest non-paging mode with HAP.
   +* To emulate this behavior, SMEP needs to be manually
   +* disabled when guest switches to non-paging mode.
   +*/
   +   hw_cr4 &= ~X86_CR4_SMEP;
   } else if (!(cr4 & X86_CR4_PAE)) {
   hw_cr4 &= ~X86_CR4_PAE;
   }
 
 --
   Gleb.


RE: [PATCH v2 4/4] nested vmx: use a list to store the launched vmcs12 for L1 VMM

2012-11-28 Thread Xu, Dongxiao
 -Original Message-
 From: Gleb Natapov [mailto:g...@redhat.com]
 Sent: Wednesday, November 28, 2012 7:28 PM
 To: Marcelo Tosatti
 Cc: Xu, Dongxiao; kvm@vger.kernel.org
 Subject: Re: [PATCH v2 4/4] nested vmx: use a list to store the launched 
 vmcs12
 for L1 VMM
 
 On Tue, Nov 27, 2012 at 10:29:08PM -0200, Marcelo Tosatti wrote:
  On Thu, Nov 22, 2012 at 12:51:59PM +0800, Dongxiao Xu wrote:
   The launch state is not a member in the VMCS area, use a separate
   variable (list) to store it instead.
  
   Signed-off-by: Dongxiao Xu dongxiao...@intel.com
 
  1. What is the problem with keeping launched state in the VMCS?
  Assuming there is a positive answer to the above:
 
  2. Don't you have to change VMCS ID?
 
  3. Can't it be kept somewhere else other than a list? Current scheme
  allows guest to allocate unlimited amounts of host memory.
 
  4. What is the state of migration / nested vmx again? If vmcs12 is
  migrated, this means launched state is not migrated anymore.
 
  Patches 1-3 seem fine.
 According to Dongxiao they are slowing down nested guest by 4%.

For this version, yes, it introduces a certain performance degradation.

Actually, in my new patch I have simplified the vmcs12_read() and vmcs12_write()
functions, and there is no obvious performance degradation anymore.

Thanks,
Dongxiao

 
 --
   Gleb.


RE: [PATCH v2 4/4] nested vmx: use a list to store the launched vmcs12 for L1 VMM

2012-11-28 Thread Xu, Dongxiao
 -Original Message-
 From: Orit Wasserman [mailto:owass...@redhat.com]
 Sent: Wednesday, November 28, 2012 8:30 PM
 To: Marcelo Tosatti
 Cc: Xu, Dongxiao; kvm@vger.kernel.org; g...@redhat.com
 Subject: Re: [PATCH v2 4/4] nested vmx: use a list to store the launched 
 vmcs12
 for L1 VMM
 
 On 11/28/2012 02:29 AM, Marcelo Tosatti wrote:
  On Thu, Nov 22, 2012 at 12:51:59PM +0800, Dongxiao Xu wrote:
  The launch state is not a member in the VMCS area, use a separate
  variable (list) to store it instead.
 
  Signed-off-by: Dongxiao Xu dongxiao...@intel.com
 
  1. What is the problem with keeping launched state in the VMCS?
  Assuming there is a positive answer to the above:
 
  2. Don't you have to change VMCS ID?
 
  3. Can't it be kept somewhere else other than a list? Current scheme
  allows guest to allocate unlimited amounts of host memory.
 I agree with Marcelo: you have to limit the number of VMCSs in the list,
 otherwise it will be easy to attack a host with nested :)

Yes, that is a good point. I will add a limit on the number of VMCSs tracked for the guest VMM.
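For illustration, a rough, untested sketch of the kind of cap I have in mind (the VMCS12_POOL_MAX constant and the pool_cnt counter are made up for this example and are not in the posted patch):

#define VMCS12_POOL_MAX 64	/* illustrative cap, not from the patch */

static int set_vmcs12_launched(struct list_head *vmcs12_pool, int *pool_cnt,
			       unsigned long vmcs12_pa)
{
	struct vmcs12_list *vmcs12;

	if (vmcs12_launched(vmcs12_pool, vmcs12_pa))
		return 0;

	/* Refuse to track more launched vmcs12s than the cap allows,
	 * so L1 cannot make L0 allocate unbounded amounts of memory. */
	if (*pool_cnt >= VMCS12_POOL_MAX)
		return -ENOMEM;

	vmcs12 = kzalloc(sizeof(struct vmcs12_list), GFP_KERNEL);
	if (!vmcs12)
		return -ENOMEM;

	vmcs12->vmcs12_pa = vmcs12_pa;
	list_add(&vmcs12->node, vmcs12_pool);
	(*pool_cnt)++;

	return 0;
}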

Thanks,
Dongxiao

 
  4. What is the state of migration / nested vmx again? If vmcs12 is
  migrated, this means launched state is not migrated anymore.
 
  Patches 1-3 seem fine.
 
 



RE: [PATCH 4/4] nested vmx: use a list to store the launched vmcs12 for L1 VMM

2012-11-22 Thread Xu, Dongxiao
 -Original Message-
 From: Gleb Natapov [mailto:g...@redhat.com]
 Sent: Wednesday, November 21, 2012 10:15 PM
 To: Xu, Dongxiao
 Cc: kvm@vger.kernel.org; mtosa...@redhat.com
 Subject: Re: [PATCH 4/4] nested vmx: use a list to store the launched vmcs12
 for L1 VMM
 
 On Wed, Nov 21, 2012 at 05:04:37PM +0800, Dongxiao Xu wrote:
  The launch state is not a member in the VMCS area, use a separate
  variable (list) to store it instead.
 
 Why? Guest shouldn't be aware of the format of VMCS area.

Yes, I agree. The guest (L1) VMM shouldn't be aware of the VMCS format.

The root (L0) VMM, however, needs to track the launch state of each vmcs12 in
order to correctly emulate the VMLAUNCH and VMRESUME instructions.
Originally we stored the launch state in the VMCS area, but in fact there is
no launch state field in the VMCS. This patch moves it out and uses a
separate list to store it.
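To illustrate what the list is used for, here is a rough, untested sketch of how the nested-entry path could consult it when emulating VMLAUNCH/VMRESUME (the helper name is made up for this example; the VMXERR_* codes are the architectural ones from the SDM):

static int check_vmentry_launch_state(struct vcpu_vmx *vmx,
				      unsigned long vmcs12_pa, bool launch)
{
	int launched = vmcs12_launched(&vmx->nested.vmcs12_pool, vmcs12_pa);

	if (launch && launched)
		/* VMLAUNCH requires a clear (not yet launched) VMCS. */
		return VMXERR_VMLAUNCH_NONCLEAR_VMCS;
	if (!launch && !launched)
		/* VMRESUME requires an already launched VMCS. */
		return VMXERR_VMRESUME_NONLAUNCHED_VMCS;
	return 0;
}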

Thanks,
Dongxiao


 
  Signed-off-by: Dongxiao Xu dongxiao...@intel.com
  ---
   arch/x86/kvm/vmx.c |   86
 +---
   1 files changed, 81 insertions(+), 5 deletions(-)
 
  diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index
  6687fb6..d03ab4e 100644
  --- a/arch/x86/kvm/vmx.c
  +++ b/arch/x86/kvm/vmx.c
  @@ -177,8 +177,7 @@ struct __packed vmcs12 {
  u32 revision_id;
  u32 abort;
 
  -   u32 launch_state; /* set to 0 by VMCLEAR, to 1 by VMLAUNCH */
  -   u32 padding[7]; /* room for future expansion */
  +   u32 padding[8]; /* room for future expansion */
 
  u64 io_bitmap_a;
  u64 io_bitmap_b;
  @@ -339,6 +338,11 @@ struct vmcs02_list {
  struct loaded_vmcs vmcs02;
   };
 
  +struct vmcs12_list {
  +   unsigned long vmcs12_pa;
  +   struct list_head node;
  +};
  +
   /*
* The nested_vmx structure is part of vcpu_vmx, and holds information we
 need
* for correct emulation of VMX (i.e., nested VMX) on this vcpu.
  @@ -364,6 +368,8 @@ struct nested_vmx {
   * we must keep them pinned while L2 runs.
   */
  struct page *apic_access_page;
  +   /* vmcs12_pool contains the launched vmcs12. */
  +   struct list_head vmcs12_pool;
   };
 
   struct vcpu_vmx {
  @@ -614,6 +620,58 @@ static void nested_release_page_clean(struct page
 *page)
  kvm_release_page_clean(page);
   }
 
  +static int vmcs12_launched(struct list_head *vmcs12_pool,
  +  unsigned long vmcs12_pa)
  +{
  +   struct vmcs12_list *iter;
  +   struct list_head *pos;
  +   int launched = 0;
  +
  +   list_for_each(pos, vmcs12_pool) {
  +   iter = list_entry(pos, struct vmcs12_list, node);
   +   if (vmcs12_pa == iter->vmcs12_pa) {
  +   launched = 1;
  +   break;
  +   }
  +   }
  +
  +   return launched;
  +}
  +
  +static int set_vmcs12_launched(struct list_head *vmcs12_pool,
  +  unsigned long vmcs12_pa)
  +{
  +   struct vmcs12_list *vmcs12;
  +
  +   if (vmcs12_launched(vmcs12_pool, vmcs12_pa))
  +   return 0;
  +
  +   vmcs12 = kzalloc(sizeof(struct vmcs12_list), GFP_KERNEL);
  +   if (!vmcs12)
  +   return -ENOMEM;
  +
   +   vmcs12->vmcs12_pa = vmcs12_pa;
   +   list_add(&vmcs12->node, vmcs12_pool);
  +
  +   return 0;
  +}
  +
  +static void clear_vmcs12_launched(struct list_head *vmcs12_pool,
  +  unsigned long vmcs12_pa)
  +{
  +   struct vmcs12_list *iter;
  +   struct list_head *pos;
  +
  +   list_for_each(pos, vmcs12_pool) {
  +   iter = list_entry(pos, struct vmcs12_list, node);
   +   if (vmcs12_pa == iter->vmcs12_pa) {
   +   list_del(&iter->node);
  +   kfree(iter);
  +   break;
  +   }
  +   }
  +}
  +
   static u64 construct_eptp(unsigned long root_hpa);  static void
  kvm_cpu_vmxon(u64 addr);  static void kvm_cpu_vmxoff(void); @@ -5111,6
  +5169,18 @@ static void nested_free_all_saved_vmcss(struct vcpu_vmx
  *vmx)  }
 
   /*
  + * Free the vmcs12 list.
  + */
  +static void nested_free_vmcs12_list(struct vcpu_vmx *vmx) {
  +   struct vmcs12_list *item, *n;
   +   list_for_each_entry_safe(item, n, &vmx->nested.vmcs12_pool, node) {
   +   list_del(&item->node);
  +   kfree(item);
  +   }
  +}
  +
  +/*
* Emulate the VMXON instruction.
* Currently, we just remember that VMX is active, and do not save or even
* inspect the argument to VMXON (the so-called VMXON pointer)
  because we @@ -5207,6 +5277,7 @@ static void free_nested(struct
 vcpu_vmx *vmx)
  }
 
  nested_free_all_saved_vmcss(vmx);
  +   nested_free_vmcs12_list(vmx);
   }
 
   /* Emulate the VMXOFF instruction */
  @@ -5359,7 +5430,7 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
  return 1;
  }
  vmcs12 = kmap(page);
   -   vmcs12->launch_state = 0;
   +   clear_vmcs12_launched(&vmx->nested.vmcs12_pool, __pa(vmcs12));
  kunmap(page);
  nested_release_page(page);
 
  @@ -6467,6 +6538,7 @@ static struct kvm_vcpu *vmx_create_vcpu(struct
  kvm

RE: [PATCH 1/4] nested vmx: clean up for vmcs12 read and write

2012-11-22 Thread Xu, Dongxiao


 -Original Message-
 From: Gleb Natapov [mailto:g...@redhat.com]
 Sent: Wednesday, November 21, 2012 9:27 PM
 To: Xu, Dongxiao
 Cc: kvm@vger.kernel.org; mtosa...@redhat.com
 Subject: Re: [PATCH 1/4] nested vmx: clean up for vmcs12 read and write
 
 On Wed, Nov 21, 2012 at 05:04:34PM +0800, Dongxiao Xu wrote:
  abstract vmcs12_read and vmcs12_write functions to do the vmcs12
  read/write operations.
 
  Signed-off-by: Dongxiao Xu dongxiao...@intel.com
  ---
   arch/x86/kvm/vmx.c |   86
 +++-
   1 files changed, 45 insertions(+), 41 deletions(-)
 
  diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index
  f858159..d8670e4 100644
  --- a/arch/x86/kvm/vmx.c
  +++ b/arch/x86/kvm/vmx.c
  @@ -5407,32 +5407,67 @@ static inline int vmcs_field_readonly(unsigned
 long field)
* some of the bits we return here (e.g., on 32-bit guests, only 32 bits of
* 64-bit fields are to be returned).
*/
  -static inline bool vmcs12_read_any(struct kvm_vcpu *vcpu,
  -   unsigned long field, u64 *ret)
  +static inline u64 vmcs12_read(struct kvm_vcpu *vcpu, unsigned long
  +field)
   {
  short offset = vmcs_field_to_offset(field);
  char *p;
 
   -   if (offset < 0)
   +   if (offset < 0) {
  +   nested_vmx_failValid(vcpu,
 VMXERR_UNSUPPORTED_VMCS_COMPONENT);
  +   skip_emulated_instruction(vcpu);
  return 0;
  +   }
 
  p = ((char *)(get_vmcs12(vcpu))) + offset;
 
  switch (vmcs_field_type(field)) {
  case VMCS_FIELD_TYPE_NATURAL_WIDTH:
  -   *ret = *((natural_width *)p);
  +   return *((natural_width *)p);
  +   case VMCS_FIELD_TYPE_U16:
  +   return *((u16 *)p);
  +   case VMCS_FIELD_TYPE_U32:
  +   return *((u32 *)p);
  +   case VMCS_FIELD_TYPE_U64:
  +   return *((u64 *)p);
  +   default:
  +   nested_vmx_failValid(vcpu,
 VMXERR_UNSUPPORTED_VMCS_COMPONENT);
  +   skip_emulated_instruction(vcpu);
  +   return 0; /* can never happen. */
  +   }
  +}
  +
  +static inline int vmcs12_write(struct kvm_vcpu *vcpu,
  +   unsigned long field,
  +   u64 value)
  +{
  +   short offset = vmcs_field_to_offset(field);
  +   char *p;
  +
   +   if (offset < 0) {
  +   nested_vmx_failValid(vcpu,
 VMXERR_UNSUPPORTED_VMCS_COMPONENT);
  +   skip_emulated_instruction(vcpu);
  +   return 0;
  +   }
  +
  +   p = ((char *)(get_vmcs12(vcpu))) + offset;
  +
  +   switch (vmcs_field_type(field)) {
  +   case VMCS_FIELD_TYPE_NATURAL_WIDTH:
  +   *(natural_width *)p = value;
  return 1;
  case VMCS_FIELD_TYPE_U16:
  -   *ret = *((u16 *)p);
  +   *(u16 *)p = value;
  return 1;
  case VMCS_FIELD_TYPE_U32:
  -   *ret = *((u32 *)p);
  +   *(u32 *)p = value;
  return 1;
  case VMCS_FIELD_TYPE_U64:
  -   *ret = *((u64 *)p);
  +   *(u64 *)p = value;
  return 1;
  default:
  -   return 0; /* can never happen. */
  +   nested_vmx_failValid(vcpu,
 VMXERR_UNSUPPORTED_VMCS_COMPONENT);
  +   skip_emulated_instruction(vcpu);
  +   return 0;
  }
   }
 
  @@ -5466,11 +5501,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
  /* Decode instruction info and find the field to read */
   field = kvm_register_read(vcpu, (((vmx_instruction_info) >> 28) & 0xf));
  /* Read the field, zero-extended to a u64 field_value */
   -   if (!vmcs12_read_any(vcpu, field, &field_value)) {
  -   nested_vmx_failValid(vcpu,
 VMXERR_UNSUPPORTED_VMCS_COMPONENT);
  -   skip_emulated_instruction(vcpu);
  -   return 1;
  -   }
  +   field_value = vmcs12_read(vcpu, field);
 You do not handle failure here and always write back field_value even if
 vmcs12_read() failed. Actually now it is impossible to detect a failure. Call 
 to
 nested_vmx_failValid() in vmcs12_read() will be overwritten by call to
 nested_vmx_succeed() at the end of
 handle_vmread() and skip_emulated_instruction() will be called twice.

Thanks, Gleb and Orit, for raising this issue.

What about moving the offset check out of the vmcs12_read() and vmcs12_write()
functions and putting it directly in handle_vmread() and handle_vmwrite()?
I think we only need to do the offset error check in handle_vmread() and
handle_vmwrite(), since they emulate the architectural behavior for the guest
VMM. For example, if the guest VMM reads a field that is not valid, or writes
a field that is read-only, then the emulation code in handle_vmread() and
handle_vmwrite() needs to raise an error to the guest VMM.
For the other callers of vmcs12_read() and vmcs12_write() inside the KVM
hypervisor (see PATCH 3/4), the caller needs to ensure that the field is
valid; otherwise it is a bug in the KVM hypervisor itself.
Does that make sense? If so, I will revise this patch accordingly.
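For illustration, an untested sketch of the direction I mean (the helper name is made up for this example): the emulation paths would do the validity check once, and vmcs12_read()/vmcs12_write() would become plain accessors for known-good fields.

static int nested_vmx_check_vmcs_field(struct kvm_vcpu *vcpu,
				       unsigned long field)
{
	if (vmcs_field_to_offset(field) >= 0)
		return 1;

	/* Invalid field: report the architectural error to L1 and skip
	 * the instruction, as handle_vmread()/handle_vmwrite() do today. */
	nested_vmx_failValid(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);
	skip_emulated_instruction(vcpu);
	return 0;
}

handle_vmread() and handle_vmwrite() would then return early when this helper returns 0, before ever calling vmcs12_read() or vmcs12_write().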

Thanks,
Dongxiao

RE: [PATCH 1/4] nested vmx: clean up for vmcs12 read and write

2012-11-22 Thread Xu, Dongxiao


 -Original Message-
 From: Gleb Natapov [mailto:g...@redhat.com]
 Sent: Wednesday, November 21, 2012 9:04 PM
 To: Xu, Dongxiao
 Cc: kvm@vger.kernel.org; mtosa...@redhat.com
 Subject: Re: [PATCH 1/4] nested vmx: clean up for vmcs12 read and write
 
 On Wed, Nov 21, 2012 at 05:04:34PM +0800, Dongxiao Xu wrote:
  abstract vmcs12_read and vmcs12_write functions to do the vmcs12
  read/write operations.
 
  Signed-off-by: Dongxiao Xu dongxiao...@intel.com
  ---
   arch/x86/kvm/vmx.c |   86
 +++-
   1 files changed, 45 insertions(+), 41 deletions(-)
 
  diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index
  f858159..d8670e4 100644
  --- a/arch/x86/kvm/vmx.c
  +++ b/arch/x86/kvm/vmx.c
  @@ -5407,32 +5407,67 @@ static inline int vmcs_field_readonly(unsigned
 long field)
* some of the bits we return here (e.g., on 32-bit guests, only 32 bits of
* 64-bit fields are to be returned).
*/
  -static inline bool vmcs12_read_any(struct kvm_vcpu *vcpu,
  -   unsigned long field, u64 *ret)
  +static inline u64 vmcs12_read(struct kvm_vcpu *vcpu, unsigned long
  +field)
   {
  short offset = vmcs_field_to_offset(field);
  char *p;
 
   -   if (offset < 0)
   +   if (offset < 0) {
  +   nested_vmx_failValid(vcpu,
 VMXERR_UNSUPPORTED_VMCS_COMPONENT);
  +   skip_emulated_instruction(vcpu);
  return 0;
  +   }
 
  p = ((char *)(get_vmcs12(vcpu))) + offset;
 
  switch (vmcs_field_type(field)) {
  case VMCS_FIELD_TYPE_NATURAL_WIDTH:
  -   *ret = *((natural_width *)p);
  +   return *((natural_width *)p);
  +   case VMCS_FIELD_TYPE_U16:
  +   return *((u16 *)p);
  +   case VMCS_FIELD_TYPE_U32:
  +   return *((u32 *)p);
  +   case VMCS_FIELD_TYPE_U64:
  +   return *((u64 *)p);
  +   default:
  +   nested_vmx_failValid(vcpu,
 VMXERR_UNSUPPORTED_VMCS_COMPONENT);
  +   skip_emulated_instruction(vcpu);
  +   return 0; /* can never happen. */
  +   }
  +}
  +
  +static inline int vmcs12_write(struct kvm_vcpu *vcpu,
  +   unsigned long field,
  +   u64 value)
  +{
  +   short offset = vmcs_field_to_offset(field);
  +   char *p;
  +
   +   if (offset < 0) {
  +   nested_vmx_failValid(vcpu,
 VMXERR_UNSUPPORTED_VMCS_COMPONENT);
  +   skip_emulated_instruction(vcpu);
  +   return 0;
  +   }
  +
 Shouldn't vmcs_field_readonly() check be in vmcs12_write() instead of a 
 caller?

Well, as you can see from the later patch, we construct vmcs12 through the
vmcs12_write() function. Some fields, such as the exit reason and the exit
qualification, need to be written into the vmcs12 area by KVM itself.
Therefore we cannot put the read-only check into the vmcs12_write() function.
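In other words, only the guest-visible VMWRITE path rejects read-only fields; KVM-internal users (for example the code that fills in the exit reason and exit qualification on a nested exit) call vmcs12_write() directly. Roughly, the check in handle_vmwrite() stays along these lines (a sketch only, not the exact posted code):

	if (vmcs_field_readonly(field)) {
		/* L1 tried to VMWRITE a read-only field: fail the
		 * instruction with the architectural error code. */
		nested_vmx_failValid(vcpu,
			VMXERR_VMWRITE_READ_ONLY_VMCS_COMPONENT);
		skip_emulated_instruction(vcpu);
		return 1;
	}
	vmcs12_write(vcpu, field, field_value);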

Thanks,
Dongxiao

 
  +   p = ((char *)(get_vmcs12(vcpu))) + offset;
  +
  +   switch (vmcs_field_type(field)) {
  +   case VMCS_FIELD_TYPE_NATURAL_WIDTH:
  +   *(natural_width *)p = value;
  return 1;
  case VMCS_FIELD_TYPE_U16:
  -   *ret = *((u16 *)p);
  +   *(u16 *)p = value;
  return 1;
  case VMCS_FIELD_TYPE_U32:
  -   *ret = *((u32 *)p);
  +   *(u32 *)p = value;
  return 1;
  case VMCS_FIELD_TYPE_U64:
  -   *ret = *((u64 *)p);
  +   *(u64 *)p = value;
  return 1;
  default:
  -   return 0; /* can never happen. */
  +   nested_vmx_failValid(vcpu,
 VMXERR_UNSUPPORTED_VMCS_COMPONENT);
  +   skip_emulated_instruction(vcpu);
  +   return 0;
  }
   }
 
  @@ -5466,11 +5501,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
  /* Decode instruction info and find the field to read */
   field = kvm_register_read(vcpu, (((vmx_instruction_info) >> 28) & 0xf));
  /* Read the field, zero-extended to a u64 field_value */
   -   if (!vmcs12_read_any(vcpu, field, &field_value)) {
  -   nested_vmx_failValid(vcpu,
 VMXERR_UNSUPPORTED_VMCS_COMPONENT);
  -   skip_emulated_instruction(vcpu);
  -   return 1;
  -   }
  +   field_value = vmcs12_read(vcpu, field);
  /*
   * Now copy part of this value to register or memory, as requested.
   * Note that the number of bits actually copied is 32 or 64
  depending @@ -5500,8 +5531,6 @@ static int handle_vmwrite(struct
 kvm_vcpu *vcpu)
  gva_t gva;
  unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
  u32 vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
  -   char *p;
  -   short offset;
  /* The value to write might be 32 or 64 bits, depending on L1's long
   * mode, and eventually we need to write that into a field of several
   * possible lengths. The code below first zero-extends the value to
  64 @@ -5537,33 +5566,8 @@ static int handle_vmwrite(struct kvm_vcpu
 *vcpu)
  skip_emulated_instruction(vcpu);
  return 1

RE: [PATCH 4/4 v4] KVM: VMX: VMXON/VMXOFF usage changes.

2010-05-12 Thread Xu, Dongxiao
Marcelo Tosatti wrote:
 On Tue, May 11, 2010 at 06:29:48PM +0800, Xu, Dongxiao wrote:
 From: Dongxiao Xu dongxiao...@intel.com
 
 SDM suggests VMXON should be called before VMPTRLD, and VMXOFF
 should be called after doing VMCLEAR.
 
 Therefore in vmm coexistence case, we should firstly call VMXON
 before any VMCS operation, and then call VMXOFF after the
 operation is done.
 
 Signed-off-by: Dongxiao Xu dongxiao...@intel.com
 ---
  arch/x86/kvm/vmx.c |   38 +++---
  1 files changed, 31 insertions(+), 7 deletions(-)
 
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index c536b9d..dbd47a7 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -168,6 +168,8 @@ static inline struct vcpu_vmx *to_vmx(struct
 kvm_vcpu *vcpu) 
 
  static int init_rmode(struct kvm *kvm);
  static u64 construct_eptp(unsigned long root_hpa);
 +static void kvm_cpu_vmxon(u64 addr);
 +static void kvm_cpu_vmxoff(void);
 
  static DEFINE_PER_CPU(struct vmcs *, vmxarea);
  static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
 @@ -786,8 +788,11 @@ static void vmx_vcpu_load(struct kvm_vcpu
  *vcpu, int cpu)  { struct vcpu_vmx *vmx = to_vmx(vcpu);
  u64 tsc_this, delta, new_offset;
 +u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
 
  -if (vmm_exclusive && vcpu->cpu != cpu)
  +if (!vmm_exclusive)
  +kvm_cpu_vmxon(phys_addr);
  +else if (vcpu->cpu != cpu)
  vcpu_clear(vmx);
 
   if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
 @@ -833,8 +838,10 @@ static void vmx_vcpu_load(struct kvm_vcpu
  *vcpu, int cpu) static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
  {
  __vmx_load_host_state(to_vmx(vcpu));
 -if (!vmm_exclusive)
 +if (!vmm_exclusive) {
  __vcpu_clear(to_vmx(vcpu));
 +kvm_cpu_vmxoff();
 +}
  }
 
  static void vmx_fpu_activate(struct kvm_vcpu *vcpu)
 @@ -1257,9 +1264,11 @@ static int hardware_enable(void *garbage)
 FEATURE_CONTROL_LOCKED |
 FEATURE_CONTROL_VMXON_ENABLED);
  write_cr4(read_cr4() | X86_CR4_VMXE); /* FIXME: not cpu hotplug
 safe */ -kvm_cpu_vmxon(phys_addr); 
 
 -ept_sync_global();
 +if (vmm_exclusive) {
 +kvm_cpu_vmxon(phys_addr);
 +ept_sync_global();
 +}
 
  return 0;
 
 The documentation recommends usage of INVEPT all-context after
 execution of VMXON and prior to execution of VMXOFF. Is it not
 necessary? 

After this patch, when a vCPU is scheduled onto a CPU it requests a TLB flush
(KVM_REQ_TLB_FLUSH), which invalidates the EPT and VPID cached translations
for that vCPU before the next VM entry. Therefore correctness for KVM is
guaranteed.
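For reference, the VMX flush callback of that era looked roughly like the following (quoted from memory, so treat the exact helper names as approximate rather than authoritative); it drops both the VPID-tagged mappings and, with EPT enabled, the guest-physical mappings for the vCPU's current EPT context:

static void vmx_flush_tlb(struct kvm_vcpu *vcpu)
{
	/* Invalidate VPID-tagged linear mappings for this vCPU ... */
	vpid_sync_vcpu_all(to_vmx(vcpu));
	/* ... and, when EPT is in use, the guest-physical mappings too. */
	if (enable_ept)
		ept_sync_context(construct_eptp(vcpu->arch.mmu.root_hpa));
}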

Thanks,
Dongxiao


RE: [PATCH 4/4 v3] KVM: VMX: VMXON/VMXOFF usage changes.

2010-05-11 Thread Xu, Dongxiao
Avi Kivity wrote:
 On 05/07/2010 05:43 AM, Xu, Dongxiao wrote:
 From: Dongxiao Xudongxiao...@intel.com
 
 SDM suggests VMXON should be called before VMPTRLD, and VMXOFF
 should be called after doing VMCLEAR.
 
 Therefore in vmm coexistence case, we should firstly call VMXON
 before any VMCS operation, and then call VMXOFF after the
 operation is done.
 
 
 @@ -4118,7 +4143,10 @@ static void vmx_cpuid_update(struct kvm_vcpu
  *vcpu) struct kvm_cpuid_entry2 *best;
  struct vcpu_vmx *vmx = to_vmx(vcpu);
  u32 exec_control;
 +int cpu;
 
 +cpu = get_cpu();
  +vmx_vcpu_load(&vmx->vcpu, cpu);
   vmx->rdtscp_enabled = false;
  if (vmx_rdtscp_supported()) {
  exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
 @@ -4133,6 +4161,9 @@ static void vmx_cpuid_update(struct kvm_vcpu
  *vcpu)  } }
  }
  +vmx_vcpu_put(&vmx->vcpu);
 +put_cpu();
 +
   }
 
   static void vmx_set_supported_cpuid(u32 func, struct
 kvm_cpuid_entry2 *entry) 
 
 
 I'm not sure why this is needed.  vmx_cpuid_update() is called from a
 vcpu ioctl which should have called vcpu_load() before.
 
 Apart from that, everything looks good for merging.

vcpu_load() and vcpu_put() are not called in that ioctl. I will add them and
send out another version of the patchset.

Thanks,
Dongxiao



RE: [PATCH 4/4 v3] KVM: VMX: VMXON/VMXOFF usage changes.

2010-05-11 Thread Xu, Dongxiao
Avi Kivity wrote:
 On 05/11/2010 12:38 PM, Xu, Dongxiao wrote:
 
 I'm not sure why this is needed.  vmx_cpuid_update() is called from
 a vcpu ioctl which should have called vcpu_load() before.
 
 Apart from that, everything looks good for merging.
 
 Vcpu_load() and vcpu_put() is not called in that ioctl. I will add
 that and send out another version of patchset.
 
 
 That's a serious bug even before this patchset, since vmx_cpuid_update
 accesses the vmcs.  Please send the fix as a separate patch (which
 will be backported to -stable).

That's OK. 

Thanks
Dongxiao


[PATCH] KVM: x86: Call vcpu_load and vcpu_put in cpuid_update.

2010-05-11 Thread Xu, Dongxiao
From: Dongxiao Xu dongxiao...@intel.com

cpuid_update may operate VMCS, so vcpu_load() and vcpu_put()
should be called to ensure correctness.

Signed-off-by: Dongxiao Xu dongxiao...@intel.com
---
 arch/x86/kvm/x86.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6b2ce1d..08edfc8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1773,6 +1773,7 @@ static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
	if (copy_from_user(cpuid_entries, entries,
			   cpuid->nent * sizeof(struct kvm_cpuid_entry)))
		goto out_free;
+	vcpu_load(vcpu);
	for (i = 0; i < cpuid->nent; i++) {
		vcpu->arch.cpuid_entries[i].function =
					cpuid_entries[i].function;
		vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax;
@@ -1790,6 +1791,7 @@ static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
r = 0;
kvm_apic_set_version(vcpu);
	kvm_x86_ops->cpuid_update(vcpu);
+   vcpu_put(vcpu);
 
 out_free:
vfree(cpuid_entries);
@@ -1810,9 +1812,11 @@ static int kvm_vcpu_ioctl_set_cpuid2(struct kvm_vcpu 
*vcpu,
	if (copy_from_user(&vcpu->arch.cpuid_entries, entries,
			   cpuid->nent * sizeof(struct kvm_cpuid_entry2)))
		goto out;
+	vcpu_load(vcpu);
	vcpu->arch.cpuid_nent = cpuid->nent;
	kvm_apic_set_version(vcpu);
	kvm_x86_ops->cpuid_update(vcpu);
+   vcpu_put(vcpu);
return 0;
 
 out:
-- 
1.6.3


[PATCH 0/4 v4] KVM: VMX: Support hosted VMM coexistence.

2010-05-11 Thread Xu, Dongxiao
Hi all,

This is hosted VMM coexistence support v4.

Main changes from v3:
Remove the change to vmx_cpuid_update() since vcpu_load() and
vcpu_put() will be added around cpuid_update by another patch.

Main changes from v2:
1) Change vmm_coexistence to vmm_exclusive.
2) Some code structure changes. Split the original 3 patches to 4.
3) Address some comments from Avi.

Main changes from v1:
1) Add a module option vmm_coexistence to decide whether to
enable this feature. Currently it is off by default.
2) Each time a KVM vcpu is scheduled in, it will invalidate the EPT
and VPID TLBs to avoid conflicts between different VMMs.

VMX: Support for coexistence of KVM and other hosted VMMs. 

The following NOTE is picked up from Intel SDM 3B 27.3 chapter, 
MANAGING VMCS REGIONS AND POINTERS.

--
NOTE
As noted in Section 21.1, the processor may optimize VMX operation
by maintaining the state of an active VMCS (one for which VMPTRLD
has been executed) on the processor. Before relinquishing control to
other system software that may, without informing the VMM, remove
power from the processor (e.g., for transitions to S3 or S4) or leave
VMX operation, a VMM must VMCLEAR all active VMCSs. This ensures
that all VMCS data cached by the processor are flushed to memory
and that no other software can corrupt the current VMM's VMCS data.
It is also recommended that the VMM execute VMXOFF after such
executions of VMCLEAR.
--

Currently, VMCLEAR is called at VCPU migration. To support hosted
VMM coexistence, this patch modifies the VMCLEAR/VMPTRLD and
VMXON/VMXOFF usages. VMCLEAR will be called when VCPU is
scheduled out of a physical CPU, while VMPTRLD is called when VCPU
is scheduled onto a physical CPU. This approach also eliminates
the IPI mechanism used for the original VMCLEAR. As suggested by the SDM,
VMXOFF will be called after VMCLEAR, and VMXON will be called
before VMPTRLD.
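Schematically, the per-vCPU sequence in the non-exclusive (coexistence) case is the following; this is only an illustrative condensation of patches 3/4 and 4/4, and the two wrapper names here are made up for clarity (kvm_cpu_vmxon(), kvm_cpu_vmxoff(), vmcs_load() and vmcs_clear() are the real helpers):

static void coexist_vcpu_sched_in(struct vcpu_vmx *vmx, u64 vmxon_region_pa)
{
	kvm_cpu_vmxon(vmxon_region_pa);	/* VMXON first ...   */
	vmcs_load(vmx->vmcs);		/* ... then VMPTRLD  */
}

static void coexist_vcpu_sched_out(struct vcpu_vmx *vmx)
{
	vmcs_clear(vmx->vmcs);		/* VMCLEAR first ... */
	kvm_cpu_vmxoff();		/* ... then VMXOFF   */
}

This way the VMCS is never left active across a switch to another hosted VMM, and VMX operation is only entered while KVM actually owns the physical CPU.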

With this patchset, KVM and VMware Workstation 7 can launch
separate guests and they work well with each other.


[PATCH 1/4 v4] KVM: VMX: Define new functions to wrapper direct call of asm code.

2010-05-11 Thread Xu, Dongxiao
From: Dongxiao Xu dongxiao...@intel.com

Define vmcs_load() and kvm_cpu_vmxon() to avoid direct call of asm
code. Also move VMXE bit operation out of kvm_cpu_vmxoff().

Signed-off-by: Dongxiao Xu dongxiao...@intel.com
---
 arch/x86/kvm/vmx.c |   36 +++-
 1 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 875b785..e77da89 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -445,6 +445,19 @@ static void vmcs_clear(struct vmcs *vmcs)
   vmcs, phys_addr);
 }
 
+static void vmcs_load(struct vmcs *vmcs)
+{
+   u64 phys_addr = __pa(vmcs);
+   u8 error;
+
+	asm volatile (__ex(ASM_VMX_VMPTRLD_RAX) "; setna %0"
+			: "=g"(error) : "a"(&phys_addr), "m"(phys_addr)
+			: "cc", "memory");
+	if (error)
+		printk(KERN_ERR "kvm: vmptrld %p/%llx fail\n",
+  vmcs, phys_addr);
+}
+
 static void __vcpu_clear(void *arg)
 {
struct vcpu_vmx *vmx = arg;
@@ -769,7 +782,6 @@ static void vmx_load_host_state(struct vcpu_vmx *vmx)
 static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
-	u64 phys_addr = __pa(vmx->vmcs);
u64 tsc_this, delta, new_offset;
 
	if (vcpu->cpu != cpu) {
@@ -783,15 +795,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
}
 
	if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
-		u8 error;
-
		per_cpu(current_vmcs, cpu) = vmx->vmcs;
-		asm volatile (__ex(ASM_VMX_VMPTRLD_RAX) "; setna %0"
-			      : "=g"(error) : "a"(&phys_addr), "m"(phys_addr)
-			      : "cc");
-		if (error)
-			printk(KERN_ERR "kvm: vmptrld %p/%llx fail\n",
-			       vmx->vmcs, phys_addr);
+		vmcs_load(vmx->vmcs);
}
 
	if (vcpu->cpu != cpu) {
@@ -1220,6 +1225,13 @@ static __init int vmx_disabled_by_bios(void)
/* locked but not enabled */
 }
 
+static void kvm_cpu_vmxon(u64 addr)
+{
+   asm volatile (ASM_VMX_VMXON_RAX
+			: : "a"(&addr), "m"(addr)
+			: "memory", "cc");
+}
+
 static int hardware_enable(void *garbage)
 {
int cpu = raw_smp_processor_id();
@@ -1240,9 +1252,7 @@ static int hardware_enable(void *garbage)
   FEATURE_CONTROL_LOCKED |
   FEATURE_CONTROL_VMXON_ENABLED);
write_cr4(read_cr4() | X86_CR4_VMXE); /* FIXME: not cpu hotplug safe */
-   asm volatile (ASM_VMX_VMXON_RAX
-		      : : "a"(&phys_addr), "m"(phys_addr)
-		      : "memory", "cc");
+   kvm_cpu_vmxon(phys_addr);
 
ept_sync_global();
 
@@ -1266,13 +1276,13 @@ static void vmclear_local_vcpus(void)
 static void kvm_cpu_vmxoff(void)
 {
	asm volatile (__ex(ASM_VMX_VMXOFF) : : : "cc");
-	write_cr4(read_cr4() & ~X86_CR4_VMXE);
 }
 
 static void hardware_disable(void *garbage)
 {
vmclear_local_vcpus();
kvm_cpu_vmxoff();
+	write_cr4(read_cr4() & ~X86_CR4_VMXE);
 }
 
 static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
-- 
1.6.3


[PATCH 3/4 v4] KVM: VMX: VMCLEAR/VMPTRLD usage changes.

2010-05-11 Thread Xu, Dongxiao
From: Dongxiao Xu dongxiao...@intel.com

Originally VMCLEAR/VMPTRLD is called on vcpu migration. To
support hosted VMM coexistence, VMCLEAR is executed on vcpu
schedule out, and VMPTRLD is executed on vcpu schedule in.
This could also eliminate the IPI when doing VMCLEAR.

Signed-off-by: Dongxiao Xu dongxiao...@intel.com
---
 arch/x86/kvm/vmx.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 49b0850..c536b9d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -62,6 +62,9 @@ module_param_named(unrestricted_guest,
 static int __read_mostly emulate_invalid_guest_state = 0;
 module_param(emulate_invalid_guest_state, bool, S_IRUGO);
 
+static int __read_mostly vmm_exclusive = 1;
+module_param(vmm_exclusive, bool, S_IRUGO);
+
 #define KVM_GUEST_CR0_MASK_UNRESTRICTED_GUEST  \
(X86_CR0_WP | X86_CR0_NE | X86_CR0_NW | X86_CR0_CD)
 #define KVM_GUEST_CR0_MASK \
@@ -784,7 +787,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
u64 tsc_this, delta, new_offset;
 
-	if (vcpu->cpu != cpu)
+	if (vmm_exclusive && vcpu->cpu != cpu)
		vcpu_clear(vmx);
 
	if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
@@ -830,6 +833,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
 {
__vmx_load_host_state(to_vmx(vcpu));
+   if (!vmm_exclusive)
+   __vcpu_clear(to_vmx(vcpu));
 }
 
 static void vmx_fpu_activate(struct kvm_vcpu *vcpu)
-- 
1.6.3


[PATCH 4/4 v4] KVM: VMX: VMXON/VMXOFF usage changes.

2010-05-11 Thread Xu, Dongxiao
From: Dongxiao Xu dongxiao...@intel.com

SDM suggests VMXON should be called before VMPTRLD, and VMXOFF
should be called after doing VMCLEAR.

Therefore in vmm coexistence case, we should firstly call VMXON
before any VMCS operation, and then call VMXOFF after the
operation is done.

Signed-off-by: Dongxiao Xu dongxiao...@intel.com
---
 arch/x86/kvm/vmx.c |   38 +++---
 1 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c536b9d..dbd47a7 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -168,6 +168,8 @@ static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu)
 
 static int init_rmode(struct kvm *kvm);
 static u64 construct_eptp(unsigned long root_hpa);
+static void kvm_cpu_vmxon(u64 addr);
+static void kvm_cpu_vmxoff(void);
 
 static DEFINE_PER_CPU(struct vmcs *, vmxarea);
 static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -786,8 +788,11 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
u64 tsc_this, delta, new_offset;
+   u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
 
-	if (vmm_exclusive && vcpu->cpu != cpu)
+	if (!vmm_exclusive)
+		kvm_cpu_vmxon(phys_addr);
+	else if (vcpu->cpu != cpu)
		vcpu_clear(vmx);
 
	if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
@@ -833,8 +838,10 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
 {
__vmx_load_host_state(to_vmx(vcpu));
-   if (!vmm_exclusive)
+   if (!vmm_exclusive) {
__vcpu_clear(to_vmx(vcpu));
+   kvm_cpu_vmxoff();
+   }
 }
 
 static void vmx_fpu_activate(struct kvm_vcpu *vcpu)
@@ -1257,9 +1264,11 @@ static int hardware_enable(void *garbage)
   FEATURE_CONTROL_LOCKED |
   FEATURE_CONTROL_VMXON_ENABLED);
write_cr4(read_cr4() | X86_CR4_VMXE); /* FIXME: not cpu hotplug safe */
-   kvm_cpu_vmxon(phys_addr);
 
-   ept_sync_global();
+   if (vmm_exclusive) {
+   kvm_cpu_vmxon(phys_addr);
+   ept_sync_global();
+   }
 
return 0;
 }
@@ -1285,8 +1294,10 @@ static void kvm_cpu_vmxoff(void)
 
 static void hardware_disable(void *garbage)
 {
-   vmclear_local_vcpus();
-   kvm_cpu_vmxoff();
+   if (vmm_exclusive) {
+   vmclear_local_vcpus();
+   kvm_cpu_vmxoff();
+   }
	write_cr4(read_cr4() & ~X86_CR4_VMXE);
 }
 
@@ -3949,6 +3960,19 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
kmem_cache_free(kvm_vcpu_cache, vmx);
 }
 
+static inline void vmcs_init(struct vmcs *vmcs)
+{
+   u64 phys_addr = __pa(per_cpu(vmxarea, raw_smp_processor_id()));
+
+   if (!vmm_exclusive)
+   kvm_cpu_vmxon(phys_addr);
+
+   vmcs_clear(vmcs);
+
+   if (!vmm_exclusive)
+   kvm_cpu_vmxoff();
+}
+
 static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 {
int err;
@@ -3974,7 +3998,7 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, 
unsigned int id)
	if (!vmx->vmcs)
		goto free_msrs;
 
-	vmcs_clear(vmx->vmcs);
+	vmcs_init(vmx->vmcs);
 
	cpu = get_cpu();
	vmx_vcpu_load(&vmx->vcpu, cpu);
-- 
1.6.3


[PATCH 2/4 v4] KVM: VMX: Some minor changes to code structure.

2010-05-11 Thread Xu, Dongxiao
From: Dongxiao Xu dongxiao...@intel.com

Do some preparations for vmm coexistence support.

Signed-off-by: Dongxiao Xu dongxiao...@intel.com
---
 arch/x86/kvm/vmx.c |   16 
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e77da89..49b0850 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -784,15 +784,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
u64 tsc_this, delta, new_offset;
 
-	if (vcpu->cpu != cpu) {
+	if (vcpu->cpu != cpu)
		vcpu_clear(vmx);
-		kvm_migrate_timers(vcpu);
-		set_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests);
-		local_irq_disable();
-		list_add(&vmx->local_vcpus_link,
-			 &per_cpu(vcpus_on_cpu, cpu));
-		local_irq_enable();
-	}
 
	if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
		per_cpu(current_vmcs, cpu) = vmx->vmcs;
@@ -803,6 +796,13 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
struct desc_ptr dt;
unsigned long sysenter_esp;
 
+   kvm_migrate_timers(vcpu);
+		set_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests);
+		local_irq_disable();
+		list_add(&vmx->local_vcpus_link,
+			 &per_cpu(vcpus_on_cpu, cpu));
+		local_irq_enable();
+
		vcpu->cpu = cpu;
/*
 * Linux uses per-cpu TSS and GDT, so set these when switching
-- 
1.6.3


RE: [PATCH 0/4 v3] KVM: VMX: Support hosted VMM coexistence.

2010-05-10 Thread Xu, Dongxiao
Hi Avi,

Do you have any other comments on this version of the patchset?

Thanks,
Dongxiao

Xu, Dongxiao wrote:
 Hi all,
 
 This is hosted VMM coexistence support v3.
 
 Main changes from v2:
 1) Change vmm_coexistence to vmm_exclusive.
 2) Some code structure changes. Split the original 3 patches to 4.
 3) Address some comments from Avi.
 
 Main changes from v1:
  1) Add a module option vmm_coexistence to decide whether to
  enable this feature. Currently it is off by default.
  2) Each time a KVM vcpu is scheduled in, it will invalidate the EPT
  and VPID TLBs to avoid conflicts between different VMMs.
 
 VMX: Support for coexistence of KVM and other hosted VMMs.
 
 The following NOTE is picked up from Intel SDM 3B 27.3 chapter,
 MANAGING VMCS REGIONS AND POINTERS.
 
 --
 NOTE
 As noted in Section 21.1, the processor may optimize VMX operation
 by maintaining the state of an active VMCS (one for which VMPTRLD
 has been executed) on the processor. Before relinquishing control to
 other system software that may, without informing the VMM, remove
 power from the processor (e.g., for transitions to S3 or S4) or leave
 VMX operation, a VMM must VMCLEAR all active VMCSs. This ensures
 that all VMCS data cached by the processor are flushed to memory
 and that no other software can corrupt the current VMM's VMCS data.
 It is also recommended that the VMM execute VMXOFF after such
 executions of VMCLEAR.
 --
 
 Currently, VMCLEAR is called at VCPU migration. To support hosted
 VMM coexistence, this patch modifies the VMCLEAR/VMPTRLD and
 VMXON/VMXOFF usages. VMCLEAR will be called when VCPU is
 scheduled out of a physical CPU, while VMPTRLD is called when VCPU
  is scheduled onto a physical CPU. This approach also eliminates
  the IPI mechanism used for the original VMCLEAR. As suggested by the SDM,
 VMXOFF will be called after VMCLEAR, and VMXON will be called
 before VMPTRLD.
 
  With this patchset, KVM and VMware Workstation 7 can launch
  separate guests and they work well with each other.



[PATCH 0/3 v2] KVM: VMX: Support hosted VMM coexistence.

2010-05-06 Thread Xu, Dongxiao
Main changes from v1:
1) Add a module option vmm_coexistence to decide whether to
enable this feature. Currently it is off by default.
2) Each time a KVM vcpu is scheduled in, it will invalidate the EPT
and VPID TLBs to avoid conflicts between different VMMs.

VMX: Support for coexistence of KVM and other hosted VMMs. 

The following NOTE is picked up from Intel SDM 3B 27.3 chapter, 
MANAGING VMCS REGIONS AND POINTERS.

--
NOTE
As noted in Section 21.1, the processor may optimize VMX operation
by maintaining the state of an active VMCS (one for which VMPTRLD
has been executed) on the processor. Before relinquishing control to
other system software that may, without informing the VMM, remove
power from the processor (e.g., for transitions to S3 or S4) or leave
VMX operation, a VMM must VMCLEAR all active VMCSs. This ensures
that all VMCS data cached by the processor are flushed to memory
and that no other software can corrupt the current VMM's VMCS data.
It is also recommended that the VMM execute VMXOFF after such
executions of VMCLEAR.
--

Currently, VMCLEAR is called at VCPU migration. To support hosted
VMM coexistence, this patch modifies the VMCLEAR/VMPTRLD and
VMXON/VMXOFF usages. VMCLEAR will be called when VCPU is
scheduled out of a physical CPU, while VMPTRLD is called when VCPU
is scheduled onto a physical CPU. This approach also eliminates
the IPI mechanism used for the original VMCLEAR. As suggested by the SDM,
VMXOFF will be called after VMCLEAR, and VMXON will be called
before VMPTRLD.

With this patchset, KVM and VMware Workstation 7 can launch
separate guests and they work well with each other.


RE: [PATCH 2/3 v2] KVM: VMX: VMCLEAR/VMPTRLD usage changes.

2010-05-06 Thread Xu, Dongxiao
Avi Kivity wrote:
 On 05/06/2010 11:45 AM, Xu, Dongxiao wrote:
 From: Dongxiao Xudongxiao...@intel.com
 
 Originally VMCLEAR/VMPTRLD is called on vcpu migration. To
  support hosted VMM coexistence, VMCLEAR is executed on vcpu
 schedule out, and VMPTRLD is executed on vcpu schedule in.
 This could also eliminate the IPI when doing VMCLEAR.
 
 
 
 +static int __read_mostly vmm_coexistence;
 +module_param(vmm_coexistence, bool, S_IRUGO);
 
 
 I think we can call it 'exclusive' defaulting to true.

I will change it in next version.

 
 +
   #define KVM_GUEST_CR0_MASK_UNRESTRICTED_GUEST  
 \
  (X86_CR0_WP | X86_CR0_NE | X86_CR0_NW | X86_CR0_CD)
   #define KVM_GUEST_CR0_MASK \
 @@ -222,6 +225,7 @@ static u64 host_efer;
 
   static void ept_save_pdptrs(struct kvm_vcpu *vcpu);
 
 +static void vmx_flush_tlb(struct kvm_vcpu *vcpu);
   /*
* Keep MSR_K6_STAR at the end, as setup_msrs() will try to
 optimize it 
* away by decrementing the array size.
 @@ -784,25 +788,31 @@ static void vmx_vcpu_load(struct kvm_vcpu
  *vcpu, int cpu) struct vcpu_vmx *vmx = to_vmx(vcpu);
  u64 tsc_this, delta, new_offset;
 
  -if (vcpu->cpu != cpu) {
 -vcpu_clear(vmx);
 -kvm_migrate_timers(vcpu);
 +if (vmm_coexistence) {
  +vmcs_load(vmx->vmcs);
   set_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests);
  -local_irq_disable();
  -list_add(&vmx->local_vcpus_link,
  -&per_cpu(vcpus_on_cpu, cpu));
 -local_irq_enable();
 -}
 +} else {
  +if (vcpu->cpu != cpu) {
  +vcpu_clear(vmx);
  +set_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests);
  +local_irq_disable();
  +list_add(&vmx->local_vcpus_link,
  +&per_cpu(vcpus_on_cpu, cpu));
 +local_irq_enable();
 +}
 
  -if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
  -per_cpu(current_vmcs, cpu) = vmx->vmcs;
  -vmcs_load(vmx->vmcs);
  +if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
  +per_cpu(current_vmcs, cpu) = vmx->vmcs;
  +vmcs_load(vmx->vmcs);
 +}
  }
 
 
 Please keep the exclusive use code as it is, and just add !exclusive
 cases.  For example. the current_vmcs test will always fail since
 current_vmcs will be set to NULL on vmx_vcpu_put, so we can have just
 one vmcs_load().

Thanks for the suggestion.

 
 
 @@ -829,7 +839,14 @@ static void vmx_vcpu_load(struct kvm_vcpu
 *vcpu, int cpu) 
 
   static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
   {
 +struct vcpu_vmx *vmx = to_vmx(vcpu);
 +
  __vmx_load_host_state(to_vmx(vcpu));
 +if (vmm_coexistence) {
  +vmcs_clear(vmx->vmcs);
  +rdtscll(vmx->vcpu.arch.host_tsc);
  +vmx->launched = 0;
 
 
 This code is also duplicated.  Please refactor to avoid duplication.

Thanks.

 
 +}
   }
 
   static void vmx_fpu_activate(struct kvm_vcpu *vcpu)
 @@ -1280,7 +1297,8 @@ static void kvm_cpu_vmxoff(void)
 
   static void hardware_disable(void *garbage)
   {
 -vmclear_local_vcpus();
 +if (!vmm_coexistence)
 +vmclear_local_vcpus();
  kvm_cpu_vmxoff();
   write_cr4(read_cr4() & ~X86_CR4_VMXE);
   }
 @@ -3927,7 +3945,8 @@ static void vmx_free_vmcs(struct kvm_vcpu
  *vcpu) struct vcpu_vmx *vmx = to_vmx(vcpu);
 
   if (vmx->vmcs) {
 -vcpu_clear(vmx);
 +if (!vmm_coexistence)
 +vcpu_clear(vmx);
 
 
 What's wrong with doing this unconditionally?

The change should have been put in PATCH 3/3.
This check is there to avoid calling vcpu_clear() after kvm_cpu_vmxoff().
However, it will not appear in the next version, since I will set vcpu->cpu = -1
in vmx_vcpu_put() in the !vmm_exclusive case, so that vcpu_clear() will not
actually be executed.
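Roughly, what I have in mind for the next version is something like this untested sketch (the surrounding code is from patch 4/4; the comment explains the intent):

static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
{
	__vmx_load_host_state(to_vmx(vcpu));
	if (!vmm_exclusive) {
		__vcpu_clear(to_vmx(vcpu));
		/* Mark the vCPU as loaded nowhere, so that a later
		 * vcpu_clear() (e.g. from vmx_free_vmcs()) becomes a
		 * no-op and no VMX instruction runs after VMXOFF. */
		vcpu->cpu = -1;
		kvm_cpu_vmxoff();
	}
}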

 
   free_vmcs(vmx->vmcs);
   vmx->vmcs = NULL;
  }
 @@ -3969,13 +3988,20 @@ static struct kvm_vcpu
  *vmx_create_vcpu(struct kvm *kvm, unsigned int id)  if
 (!vmx->vmcs) goto free_msrs; 
 
  -vmcs_clear(vmx->vmcs);
  -
  -cpu = get_cpu();
  -vmx_vcpu_load(&vmx->vcpu, cpu);
  -err = vmx_vcpu_setup(vmx);
  -vmx_vcpu_put(&vmx->vcpu);
  -put_cpu();
  +if (vmm_coexistence) {
  +preempt_disable();
  +vmcs_load(vmx->vmcs);
  +err = vmx_vcpu_setup(vmx);
  +vmcs_clear(vmx->vmcs);
  +preempt_enable();
 
 
 Why won't the normal sequence work?

Yes you are right.

 
 +} else {
  +vmcs_clear(vmx->vmcs);
  +cpu = get_cpu();
  +vmx_vcpu_load(&vmx->vcpu, cpu);
  +err = vmx_vcpu_setup(vmx);
  +vmx_vcpu_put(&vmx->vcpu);
 +put_cpu();
 +}
  if (err)
  goto free_vmcs;
  if (vm_need_virtualize_apic_accesses(kvm))
 @@ -4116,6 +4142,8 @@ static void vmx_cpuid_update

[PATCH 1/4 v3] KVM: VMX: Define new functions to wrapper direct call of asm code.

2010-05-06 Thread Xu, Dongxiao
From: Dongxiao Xu dongxiao...@intel.com

Define vmcs_load() and kvm_cpu_vmxon() to avoid direct call of asm
code. Also move VMXE bit operation out of kvm_cpu_vmxoff().

Signed-off-by: Dongxiao Xu dongxiao...@intel.com
---
 arch/x86/kvm/vmx.c |   36 +++-
 1 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 875b785..e77da89 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -445,6 +445,19 @@ static void vmcs_clear(struct vmcs *vmcs)
   vmcs, phys_addr);
 }
 
+static void vmcs_load(struct vmcs *vmcs)
+{
+   u64 phys_addr = __pa(vmcs);
+   u8 error;
+
+	asm volatile (__ex(ASM_VMX_VMPTRLD_RAX) "; setna %0"
+			: "=g"(error) : "a"(&phys_addr), "m"(phys_addr)
+			: "cc", "memory");
+	if (error)
+		printk(KERN_ERR "kvm: vmptrld %p/%llx fail\n",
+  vmcs, phys_addr);
+}
+
 static void __vcpu_clear(void *arg)
 {
struct vcpu_vmx *vmx = arg;
@@ -769,7 +782,6 @@ static void vmx_load_host_state(struct vcpu_vmx *vmx)
 static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
-	u64 phys_addr = __pa(vmx->vmcs);
u64 tsc_this, delta, new_offset;
 
	if (vcpu->cpu != cpu) {
@@ -783,15 +795,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
}
 
	if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
-		u8 error;
-
		per_cpu(current_vmcs, cpu) = vmx->vmcs;
-		asm volatile (__ex(ASM_VMX_VMPTRLD_RAX) "; setna %0"
-			      : "=g"(error) : "a"(&phys_addr), "m"(phys_addr)
-			      : "cc");
-		if (error)
-			printk(KERN_ERR "kvm: vmptrld %p/%llx fail\n",
-			       vmx->vmcs, phys_addr);
+		vmcs_load(vmx->vmcs);
}
 
	if (vcpu->cpu != cpu) {
@@ -1220,6 +1225,13 @@ static __init int vmx_disabled_by_bios(void)
/* locked but not enabled */
 }
 
+static void kvm_cpu_vmxon(u64 addr)
+{
+   asm volatile (ASM_VMX_VMXON_RAX
+			: : "a"(&addr), "m"(addr)
+			: "memory", "cc");
+}
+
 static int hardware_enable(void *garbage)
 {
int cpu = raw_smp_processor_id();
@@ -1240,9 +1252,7 @@ static int hardware_enable(void *garbage)
   FEATURE_CONTROL_LOCKED |
   FEATURE_CONTROL_VMXON_ENABLED);
write_cr4(read_cr4() | X86_CR4_VMXE); /* FIXME: not cpu hotplug safe */
-   asm volatile (ASM_VMX_VMXON_RAX
-		      : : "a"(&phys_addr), "m"(phys_addr)
-		      : "memory", "cc");
+   kvm_cpu_vmxon(phys_addr);
 
ept_sync_global();
 
@@ -1266,13 +1276,13 @@ static void vmclear_local_vcpus(void)
 static void kvm_cpu_vmxoff(void)
 {
	asm volatile (__ex(ASM_VMX_VMXOFF) : : : "cc");
-	write_cr4(read_cr4() & ~X86_CR4_VMXE);
 }
 
 static void hardware_disable(void *garbage)
 {
vmclear_local_vcpus();
kvm_cpu_vmxoff();
+	write_cr4(read_cr4() & ~X86_CR4_VMXE);
 }
 
 static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
-- 
1.6.3


[PATCH 2/4 v3] KVM: VMX: Some minor changes to code structure.

2010-05-06 Thread Xu, Dongxiao
From: Dongxiao Xu dongxiao...@intel.com

Do some preparations for vmm coexistence support.

Signed-off-by: Dongxiao Xu dongxiao...@intel.com
---
 arch/x86/kvm/vmx.c |   16 
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e77da89..49b0850 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -784,15 +784,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
u64 tsc_this, delta, new_offset;
 
-	if (vcpu->cpu != cpu) {
+	if (vcpu->cpu != cpu)
		vcpu_clear(vmx);
-		kvm_migrate_timers(vcpu);
-		set_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests);
-		local_irq_disable();
-		list_add(&vmx->local_vcpus_link,
-			 &per_cpu(vcpus_on_cpu, cpu));
-		local_irq_enable();
-	}
 
	if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
		per_cpu(current_vmcs, cpu) = vmx->vmcs;
@@ -803,6 +796,13 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
struct desc_ptr dt;
unsigned long sysenter_esp;
 
+   kvm_migrate_timers(vcpu);
+		set_bit(KVM_REQ_TLB_FLUSH, &vcpu->requests);
+		local_irq_disable();
+		list_add(&vmx->local_vcpus_link,
+			 &per_cpu(vcpus_on_cpu, cpu));
+		local_irq_enable();
+
		vcpu->cpu = cpu;
/*
 * Linux uses per-cpu TSS and GDT, so set these when switching
-- 
1.6.3


[PATCH 0/4 v3] KVM: VMX: Support hosted VMM coexistence.

2010-05-06 Thread Xu, Dongxiao
Hi all,

This is hosted VMM coexistence support v3.

Main changes from v2:
1) Change vmm_coexistence to vmm_exclusive.
2) Some code structure changes. Split the original 3 patches to 4.
3) Address some comments from Avi.

Main changes from v1:
1) Add a module option vmm_coexistence to decide whether to
enable this feature. Currently it is off by default.
2) Each time a KVM vcpu is scheduled in, it will invalidate the EPT
and VPID TLBs to avoid conflicts between different VMMs.

VMX: Support for coexistence of KVM and other hosted VMMs. 

The following NOTE is picked up from Intel SDM 3B 27.3 chapter, 
MANAGING VMCS REGIONS AND POINTERS.

--
NOTE
As noted in Section 21.1, the processor may optimize VMX operation
by maintaining the state of an active VMCS (one for which VMPTRLD
has been executed) on the processor. Before relinquishing control to
other system software that may, without informing the VMM, remove
power from the processor (e.g., for transitions to S3 or S4) or leave
VMX operation, a VMM must VMCLEAR all active VMCSs. This ensures
that all VMCS data cached by the processor are flushed to memory
and that no other software can corrupt the current VMM's VMCS data.
It is also recommended that the VMM execute VMXOFF after such
executions of VMCLEAR.
--

Currently, VMCLEAR is called at VCPU migration. To support hosted
VMM coexistence, this patch modifies the VMCLEAR/VMPTRLD and
VMXON/VMXOFF usages. VMCLEAR will be called when VCPU is
scheduled out of a physical CPU, while VMPTRLD is called when VCPU
is scheduled onto a physical CPU. This approach also eliminates
the IPI mechanism used for the original VMCLEAR. As suggested by the SDM,
VMXOFF will be called after VMCLEAR, and VMXON will be called
before VMPTRLD.

With this patchset, KVM and VMware Workstation 7 can launch
separate guests and they work well with each other.


[PATCH 4/4 v3] KVM: VMX: VMXON/VMXOFF usage changes.

2010-05-06 Thread Xu, Dongxiao
From: Dongxiao Xu dongxiao...@intel.com

SDM suggests VMXON should be called before VMPTRLD, and VMXOFF
should be called after doing VMCLEAR.

Therefore in vmm coexistence case, we should firstly call VMXON
before any VMCS operation, and then call VMXOFF after the
operation is done.

Signed-off-by: Dongxiao Xu dongxiao...@intel.com
---
 arch/x86/kvm/vmx.c |   45 ++---
 1 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c536b9d..59d7443 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -168,6 +168,8 @@ static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu)
 
 static int init_rmode(struct kvm *kvm);
 static u64 construct_eptp(unsigned long root_hpa);
+static void kvm_cpu_vmxon(u64 addr);
+static void kvm_cpu_vmxoff(void);
 
 static DEFINE_PER_CPU(struct vmcs *, vmxarea);
 static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -786,8 +788,11 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
u64 tsc_this, delta, new_offset;
+   u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
 
-	if (vmm_exclusive && vcpu->cpu != cpu)
+	if (!vmm_exclusive)
+		kvm_cpu_vmxon(phys_addr);
+	else if (vcpu->cpu != cpu)
		vcpu_clear(vmx);
 
	if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
@@ -833,8 +838,10 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
 {
__vmx_load_host_state(to_vmx(vcpu));
-   if (!vmm_exclusive)
+   if (!vmm_exclusive) {
__vcpu_clear(to_vmx(vcpu));
+   kvm_cpu_vmxoff();
+   }
 }
 
 static void vmx_fpu_activate(struct kvm_vcpu *vcpu)
@@ -1257,9 +1264,11 @@ static int hardware_enable(void *garbage)
   FEATURE_CONTROL_LOCKED |
   FEATURE_CONTROL_VMXON_ENABLED);
write_cr4(read_cr4() | X86_CR4_VMXE); /* FIXME: not cpu hotplug safe */
-   kvm_cpu_vmxon(phys_addr);
 
-   ept_sync_global();
+   if (vmm_exclusive) {
+   kvm_cpu_vmxon(phys_addr);
+   ept_sync_global();
+   }
 
return 0;
 }
@@ -1285,8 +1294,10 @@ static void kvm_cpu_vmxoff(void)
 
 static void hardware_disable(void *garbage)
 {
-   vmclear_local_vcpus();
-   kvm_cpu_vmxoff();
+   if (vmm_exclusive) {
+   vmclear_local_vcpus();
+   kvm_cpu_vmxoff();
+   }
	write_cr4(read_cr4() & ~X86_CR4_VMXE);
 }
 
@@ -3949,6 +3960,19 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
kmem_cache_free(kvm_vcpu_cache, vmx);
 }
 
+static inline void vmcs_init(struct vmcs *vmcs)
+{
+   u64 phys_addr = __pa(per_cpu(vmxarea, raw_smp_processor_id()));
+
+   if (!vmm_exclusive)
+   kvm_cpu_vmxon(phys_addr);
+
+   vmcs_clear(vmcs);
+
+   if (!vmm_exclusive)
+   kvm_cpu_vmxoff();
+}
+
 static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 {
int err;
@@ -3974,13 +3998,14 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm 
*kvm, unsigned int id)
	if (!vmx->vmcs)
		goto free_msrs;
 
-	vmcs_clear(vmx->vmcs);
+	vmcs_init(vmx->vmcs);
 
	cpu = get_cpu();
	vmx_vcpu_load(&vmx->vcpu, cpu);
	err = vmx_vcpu_setup(vmx);
	vmx_vcpu_put(&vmx->vcpu);
put_cpu();
+
if (err)
goto free_vmcs;
if (vm_need_virtualize_apic_accesses(kvm))
@@ -4118,7 +4143,10 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
struct kvm_cpuid_entry2 *best;
struct vcpu_vmx *vmx = to_vmx(vcpu);
u32 exec_control;
+   int cpu;
 
+   cpu = get_cpu();
+	vmx_vcpu_load(&vmx->vcpu, cpu);
	vmx->rdtscp_enabled = false;
if (vmx_rdtscp_supported()) {
exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
@@ -4133,6 +4161,9 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
}
}
}
+   vmx_vcpu_put(&vmx->vcpu);
+   put_cpu();
+
 }
 
 static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
-- 
1.6.3
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4 v3] KVM: VMX: VMCLEAR/VMPTRLD usage changes.

2010-05-06 Thread Xu, Dongxiao
From: Dongxiao Xu dongxiao...@intel.com

Originally VMCLEAR/VMPTRLD are called on vcpu migration. To
support hosted VMM coexistence, VMCLEAR is executed on vcpu
schedule out, and VMPTRLD is executed on vcpu schedule in.
This could also eliminate the IPI when doing VMCLEAR.

vmm_exclusive is introduced as a module parameter to indicate
whether the VMM coexistence feature is enabled or not.
Currently the feature is disabled by default.

Signed-off-by: Dongxiao Xu dongxiao...@intel.com
---
 arch/x86/kvm/vmx.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)
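
In effect, the schedule-out path becomes roughly the following (a sketch of
the resulting function; the hunk below shows the actual change):

static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
{
	__vmx_load_host_state(to_vmx(vcpu));
	if (!vmm_exclusive)
		/*
		 * VMCLEAR on every schedule out, so no cross-CPU IPI
		 * VMCLEAR is needed later when the vcpu migrates.
		 */
		__vcpu_clear(to_vmx(vcpu));
}

With the default vmm_exclusive=1, the old behaviour (VMCLEAR only at vcpu
migration) is kept.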

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 49b0850..c536b9d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -62,6 +62,9 @@ module_param_named(unrestricted_guest,
 static int __read_mostly emulate_invalid_guest_state = 0;
 module_param(emulate_invalid_guest_state, bool, S_IRUGO);
 
+static int __read_mostly vmm_exclusive = 1;
+module_param(vmm_exclusive, bool, S_IRUGO);
+
 #define KVM_GUEST_CR0_MASK_UNRESTRICTED_GUEST  \
(X86_CR0_WP | X86_CR0_NE | X86_CR0_NW | X86_CR0_CD)
 #define KVM_GUEST_CR0_MASK \
@@ -784,7 +787,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
u64 tsc_this, delta, new_offset;
 
-   if (vcpu->cpu != cpu)
+   if (vmm_exclusive && vcpu->cpu != cpu)
	vcpu_clear(vmx);
 
	if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
@@ -830,6 +833,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
 {
__vmx_load_host_state(to_vmx(vcpu));
+   if (!vmm_exclusive)
+   __vcpu_clear(to_vmx(vcpu));
 }
 
 static void vmx_fpu_activate(struct kvm_vcpu *vcpu)
-- 
1.6.3
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 0/3] KVM: VMX: Support hosted VMM coexistence.

2010-03-23 Thread Xu, Dongxiao
Avi Kivity wrote:
 On 03/23/2010 06:01 AM, Xu, Dongxiao wrote:
 Avi Kivity wrote:
 
 On 03/18/2010 11:49 AM, Xu, Dongxiao wrote:
 
 VMX: Support for coexistence of KVM and other hosted VMMs.
 
 The following NOTE is picked up from Intel SDM 3B 27.3 chapter,
 MANAGING VMCS REGIONS AND POINTERS.
 
 --
 NOTE
 As noted in Section 21.1, the processor may optimize VMX operation
 by maintaining the state of an active VMCS (one for which VMPTRLD
 has been executed) on the processor. Before relinquishing control
 to other system software that may, without informing the VMM,
 remove power from the processor (e.g., for transitions to S3 or
 S4) or leave VMX operation, a VMM must VMCLEAR all active VMCSs.
 This ensures that all VMCS data cached by the processor are
 flushed to memory 
 and that no other software can corrupt the current VMM's VMCS data.
 It is also recommended that the VMM execute VMXOFF after such
 executions of VMCLEAR. --
 
 Currently, VMCLEAR is called at VCPU migration. To support hosted
 VMM coexistence, this patch modifies the VMCLEAR/VMPTRLD and
 VMXON/VMXOFF usages. VMCLEAR will be called when VCPU is
 scheduled out of a physical CPU, while VMPTRLD is called when VCPU
  is scheduled in a physical CPU. Also this approach could eliminate
 the IPI mechanism for original VMCLEAR. As suggested by SDM,
 VMXOFF will be called after VMCLEAR, and VMXON will be called
 before VMPTRLD.
 
 
 My worry is that newer processors will cache more and more VMCS
 contents on-chip, so the VMCLEAR/VMXOFF will cause a greater loss
 with newer processors. 
 
  Based on our internal testing, we saw less than 1% performance
  difference even on such processors.
 
 
 Did you measure workloads that exit to userspace very often?
 
 Also, what about future processors?  My understanding is that the
 manual recommends keeping things cached, the above description is for
 sleep states. 

I measured the performance using a kernel build in a guest. I launched 6
guests; 5 of them and the host are running while(1) loops, and the remaining
guest is doing a kernel build. The CPU overcommitment is 7:1, and the vcpu
schedule frequency is about 15k/sec. I tested this with the new Intel
processors at hand, and the performance difference is small.

 
 With this patchset, KVM and VMware Workstation 7 could launch
 serapate guests and they can work well with each other. Besides, I
 measured the performance for this patch, there is no visable
 performance loss according to the test results.
 
 
 Is that the only motivation?  It seems like an odd use-case.  If
 there was no performance impact (current or future), I wouldn't
 mind, but the design of VMPTRLD/VMCLEAR/VMXON/VMXOFF seems to
 indicate that we want to keep a VMCS loaded as much as possible on
 the processor. 
 
 I just used KVM and VMware Workstation 7 for testing this patchset.
 
 Through this new usage of VMPTRLD/VMCLEAR/VMXON/VMXOFF,
  we could make hosted VMMs work separately without impacting each
  other. 
 
 
 What I am questioning is whether a significant number of users want to
 run kvm in parallel with another hypervisor.

At least this approach gives users an option to run VMMs in parallel without
significant performance loss. Think of this scenario: a server has already
deployed VMware software, but some new customers want to use KVM;
this patch could help them meet their requirements.

I tested this case: if a KVM guest is already running and the user launches a
VMware guest, the KVM guest will die while the VMware guest still runs well.
This patchset can solve the problem.

Thanks!
Dongxiao
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 0/3] KVM: VMX: Support hosted VMM coexistence.

2010-03-22 Thread Xu, Dongxiao
Avi Kivity wrote:
 On 03/18/2010 11:49 AM, Xu, Dongxiao wrote:
 VMX: Support for coexistence of KVM and other hosted VMMs.
 
 The following NOTE is picked up from Intel SDM 3B 27.3 chapter,
 MANAGING VMCS REGIONS AND POINTERS.
 
 --
 NOTE
 As noted in Section 21.1, the processor may optimize VMX operation
 by maintaining the state of an active VMCS (one for which VMPTRLD
 has been executed) on the processor. Before relinquishing control to
 other system software that may, without informing the VMM, remove
 power from the processor (e.g., for transitions to S3 or S4) or leave
 VMX operation, a VMM must VMCLEAR all active VMCSs. This ensures
 that all VMCS data cached by the processor are flushed to memory
 and that no other software can corrupt the current VMM's VMCS data.
 It is also recommended that the VMM execute VMXOFF after such
 executions of VMCLEAR. --
 
 Currently, VMCLEAR is called at VCPU migration. To support hosted
 VMM coexistence, this patch modifies the VMCLEAR/VMPTRLD and
 VMXON/VMXOFF usages. VMCLEAR will be called when VCPU is
 scheduled out of a physical CPU, while VMPTRLD is called when VCPU
 is scheduled in a physical CPU. Also this approach could eliminate
 the IPI mechanism for original VMCLEAR. As suggested by SDM,
 VMXOFF will be called after VMCLEAR, and VMXON will be called
 before VMPTRLD.
 
 
 My worry is that newer processors will cache more and more VMCS
 contents on-chip, so the VMCLEAR/VMXOFF will cause a greater loss
 with newer processors.

Based on our internal testing, we saw less than 1% performance
difference even on such processors.

 
 With this patchset, KVM and VMware Workstation 7 could launch
 separate guests and they can work well with each other. Besides, I
 measured the performance for this patch; there is no visible
 performance loss according to the test results.
 
 
 Is that the only motivation?  It seems like an odd use-case.  If there
 was no performance impact (current or future), I wouldn't mind, but
 the design of VMPTRLD/VMCLEAR/VMXON/VMXOFF seems to indicate that we
 want to keep a VMCS loaded as much as possible on the processor.

I just used KVM and VMware Workstation 7 for testing this patchset. 

Through this new usage of VMPTRLD/VMCLEAR/VMXON/VMXOFF,
we could make hosted VMMs work separately without impacting each
other.

Thanks!
Dongxiao
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] KVM: VMX code cleanup and preparation.

2010-03-18 Thread Xu, Dongxiao
From: Dongxiao Xu dongxiao...@intel.com

Signed-off-by: Dongxiao Xu dongxiao...@intel.com
---
 arch/x86/kvm/vmx.c |   36 +++-
 1 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ae3217d..10e2c3c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -442,6 +442,19 @@ static void vmcs_clear(struct vmcs *vmcs)
   vmcs, phys_addr);
 }
 
+static void vmcs_load(struct vmcs *vmcs)
+{
+   u64 phys_addr = __pa(vmcs);
+   u8 error;
+
+   asm volatile (__ex(ASM_VMX_VMPTRLD_RAX) "; setna %0"
+   : "=g"(error) : "a"(phys_addr), "m"(phys_addr)
+   : "cc", "memory");
+   if (error)
+   printk(KERN_ERR "kvm: vmptrld %p/%llx fail\n",
+  vmcs, phys_addr);
+}
+
 static void __vcpu_clear(void *arg)
 {
struct vcpu_vmx *vmx = arg;
@@ -766,7 +779,6 @@ static void vmx_load_host_state(struct vcpu_vmx *vmx)
 static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
-   u64 phys_addr = __pa(vmx->vmcs);
u64 tsc_this, delta, new_offset;
 
	if (vcpu->cpu != cpu) {
@@ -780,15 +792,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
}
 
	if (per_cpu(current_vmcs, cpu) != vmx->vmcs) {
-   u8 error;
-
	per_cpu(current_vmcs, cpu) = vmx->vmcs;
-   asm volatile (__ex(ASM_VMX_VMPTRLD_RAX) "; setna %0"
- : "=g"(error) : "a"(phys_addr), "m"(phys_addr)
- : "cc");
-   if (error)
-   printk(KERN_ERR "kvm: vmptrld %p/%llx fail\n",
-  vmx->vmcs, phys_addr);
+   vmcs_load(vmx->vmcs);
}
 
	if (vcpu->cpu != cpu) {
@@ -1211,6 +1216,13 @@ static __init int vmx_disabled_by_bios(void)
/* locked but not enabled */
 }
 
+static void kvm_cpu_vmxon(u64 addr)
+{
+   asm volatile (ASM_VMX_VMXON_RAX
+   : : "a"(addr), "m"(addr)
+   : "memory", "cc");
+}
+
 static int hardware_enable(void *garbage)
 {
int cpu = raw_smp_processor_id();
@@ -1231,9 +1243,7 @@ static int hardware_enable(void *garbage)
   FEATURE_CONTROL_LOCKED |
   FEATURE_CONTROL_VMXON_ENABLED);
write_cr4(read_cr4() | X86_CR4_VMXE); /* FIXME: not cpu hotplug safe */
-   asm volatile (ASM_VMX_VMXON_RAX
- : : a(phys_addr), m(phys_addr)
- : memory, cc);
+   kvm_cpu_vmxon(phys_addr);
 
ept_sync_global();
 
@@ -1257,13 +1267,13 @@ static void vmclear_local_vcpus(void)
 static void kvm_cpu_vmxoff(void)
 {
	asm volatile (__ex(ASM_VMX_VMXOFF) : : : "cc");
-   write_cr4(read_cr4() & ~X86_CR4_VMXE);
 }
 
 static void hardware_disable(void *garbage)
 {
vmclear_local_vcpus();
kvm_cpu_vmxoff();
+   write_cr4(read_cr4() & ~X86_CR4_VMXE);
 }
 
 static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
-- 
1.6.3
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] KVM: VMX: VMXON/VMXOFF usage changes.

2010-03-18 Thread Xu, Dongxiao
From: Dongxiao Xu dongxiao...@intel.com

Intel SDM also suggests that VMXOFF should be called after doing
VMCLEAR, and that VMXON should be called before VMPTRLD.

Signed-off-by: Dongxiao Xu dongxiao...@intel.com
---
 arch/x86/kvm/vmx.c |   21 -
 1 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index aaf5b52..dd3a7fa 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -161,8 +161,11 @@ static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu)
 
 static int init_rmode(struct kvm *kvm);
 static u64 construct_eptp(unsigned long root_hpa);
+static void kvm_cpu_vmxon(u64 addr);
+static void kvm_cpu_vmxoff(void);
 
 static DEFINE_PER_CPU(struct vmcs *, vmxarea);
+static DEFINE_PER_CPU(bool, vmx_mode);
 
 static unsigned long *vmx_io_bitmap_a;
 static unsigned long *vmx_io_bitmap_b;
@@ -755,7 +758,9 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
u64 tsc_this, delta, new_offset;
+   u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
 
+   kvm_cpu_vmxon(phys_addr);
	vmcs_load(vmx->vmcs);
 
	if (vcpu->cpu != cpu) {
@@ -796,6 +801,7 @@ static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
	vmcs_clear(vmx->vmcs);
	rdtscll(vmx->vcpu.arch.host_tsc);
	vmx->launched = 0;
+   kvm_cpu_vmxoff();
 }
 
 static void vmx_fpu_activate(struct kvm_vcpu *vcpu)
@@ -1187,9 +1193,11 @@ static __init int vmx_disabled_by_bios(void)
 
 static void kvm_cpu_vmxon(u64 addr)
 {
+   int cpu = raw_smp_processor_id();
asm volatile (ASM_VMX_VMXON_RAX
	: : "a"(addr), "m"(addr)
	: "memory", "cc");
+   per_cpu(vmx_mode, cpu) = 1;
 }
 
 static int hardware_enable(void *garbage)
@@ -1223,12 +1231,16 @@ static int hardware_enable(void *garbage)
  */
 static void kvm_cpu_vmxoff(void)
 {
+   int cpu = raw_smp_processor_id();
	asm volatile (__ex(ASM_VMX_VMXOFF) : : : "cc");
+   per_cpu(vmx_mode, cpu) = 0;
 }
 
 static void hardware_disable(void *garbage)
 {
-   kvm_cpu_vmxoff();
+   int cpu = raw_smp_processor_id();
+   if (per_cpu(vmx_mode, cpu))
+   kvm_cpu_vmxoff();
	write_cr4(read_cr4() & ~X86_CR4_VMXE);
 }
 
@@ -3930,6 +3942,7 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 {
int err;
struct vcpu_vmx *vmx = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL);
+   u64 phys_addr = __pa(per_cpu(vmxarea, smp_processor_id()));
 
if (!vmx)
return ERR_PTR(-ENOMEM);
@@ -3951,9 +3964,11 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
goto free_msrs;
 
preempt_disable();
+   kvm_cpu_vmxon(phys_addr);
	vmcs_load(vmx->vmcs);
	err = vmx_vcpu_setup(vmx);
	vmcs_clear(vmx->vmcs);
+   kvm_cpu_vmxoff();
preempt_enable();
 
if (err)
@@ -4091,10 +4106,13 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 {
struct kvm_cpuid_entry2 *best;
struct vcpu_vmx *vmx = to_vmx(vcpu);
+   int cpu = raw_smp_processor_id();
+   u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
u32 exec_control;
 
	vmx->rdtscp_enabled = false;
if (vmx_rdtscp_supported()) {
+   kvm_cpu_vmxon(phys_addr);
		vmcs_load(vmx->vmcs);
		exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
		if (exec_control & SECONDARY_EXEC_RDTSCP) {
@@ -4108,6 +4126,7 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
}
}
		vmcs_clear(vmx->vmcs);
+   kvm_cpu_vmxoff();
}
 }
 
-- 
1.6.3
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3] KVM: VMX: Support hosted VMM coexistence.

2010-03-18 Thread Xu, Dongxiao
VMX: Support for coexistence of KVM and other hosted VMMs. 

The following NOTE is picked up from Intel SDM 3B 27.3 chapter, 
MANAGING VMCS REGIONS AND POINTERS.

--
NOTE
As noted in Section 21.1, the processor may optimize VMX operation
by maintaining the state of an active VMCS (one for which VMPTRLD
has been executed) on the processor. Before relinquishing control to
other system software that may, without informing the VMM, remove
power from the processor (e.g., for transitions to S3 or S4) or leave
VMX operation, a VMM must VMCLEAR all active VMCSs. This ensures
that all VMCS data cached by the processor are flushed to memory
and that no other software can corrupt the current VMM's VMCS data.
It is also recommended that the VMM execute VMXOFF after such
executions of VMCLEAR.
--

Currently, VMCLEAR is called at VCPU migration. To support hosted
VMM coexistence, this patch modifies the VMCLEAR/VMPTRLD and
VMXON/VMXOFF usages. VMCLEAR will be called when VCPU is
scheduled out of a physical CPU, while VMPTRLD is called when VCPU
is scheduled in a physical CPU. Also this approach could eliminate
the IPI mechanism for original VMCLEAR. As suggested by SDM,
VMXOFF will be called after VMCLEAR, and VMXON will be called
before VMPTRLD.
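
Roughly, the per-vcpu scheduling hooks end up as the sketch below (a
simplified sketch distilled from patches 1-3; host state and TSC handling
are omitted, so this is not the literal code):

static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
	struct vcpu_vmx *vmx = to_vmx(vcpu);
	u64 phys_addr = __pa(per_cpu(vmxarea, cpu));

	kvm_cpu_vmxon(phys_addr);	/* VMXON when the vcpu is scheduled in */
	vmcs_load(vmx->vmcs);		/* VMPTRLD after VMXON */
	/* ... */
}

static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
{
	struct vcpu_vmx *vmx = to_vmx(vcpu);

	vmcs_clear(vmx->vmcs);		/* VMCLEAR when the vcpu is scheduled out */
	vmx->launched = 0;
	kvm_cpu_vmxoff();		/* VMXOFF after VMCLEAR */
}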

With this patchset, KVM and VMware Workstation 7 could launch
separate guests and they can work well with each other. Besides, I
measured the performance for this patch; there is no visible
performance loss according to the test results.

The following performance results were obtained from a host with 8 cores.
 
1. vConsolidate benchmarks on KVM
  
Test Round   Patch       WebBench   SPECjbb     SysBench   LoadSim   GEOMEAN
1            W/O patch   2,614.72   28,053.09   1,108.41   16.30     1,072.95
1            W/  patch   2,691.55   28,145.71   1,128.41   16.47     1,089.28
2            W/O patch   2,642.39   28,104.79   1,096.99   17.79     1,097.19
2            W/  patch   2,699.25   28,092.62   1,116.10   15.54     1,070.98
3            W/O patch   2,571.58   28,131.17   1,108.43   16.39     1,070.70
3            W/  patch   2,627.89   28,090.19   1,110.94   17.00     1,086.57

Average      W/O patch   2,609.56   28,096.35   1,104.61   16.83     1,080.28
Average      W/  patch   2,672.90   28,109.51   1,118.48   16.34     1,082.28

2. CPU overcommitment tests for KVM

A) Run 8 while(1) loops in the host, each pinned to one of the 8 cores.
B) Launch 6 guests, each with 8 VCPUs; pin each VCPU to one core.
C) Among the 6 guests, 5 of them are running 8*while(1).
D) The remaining guest is doing a kernel build (make -j9) on a ramdisk.

In this case, the overcommitment ratio for each core is 7:1.
The total VCPU schedule frequency across all cores is ~15k/sec.
I record the kernel build time.
 
When computing the average, the first-round data is treated as invalid
and is not counted in the final average result.
 
Kernel Build Time (seconds) 
Round   w/o patch   w/ patch 
1   541 501 
2   488 490 
3   488 492 
4   492 493 
5   489 491 
6   494 487 
7   497 494 
8   492 492 
9   493 496 
10  492 495 
11  490 496 
12  489 494 
13  489 490 
14  490 491 
15  494 497 
16  495 496 
17  496 496 
18  493 492 
19  493 500 
20  490 499 

Average 491.79  493.74
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Handle INIT before SIPI.

2008-10-20 Thread Xu, Dongxiao
Hi, Gleb,
It seems that this patch is x86 specific, and it breaks the IA-64 
compilation. Do you have any good ideas for solving this? Thanks!

Best Regards,
-- Dongxiao

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gleb Natapov
Sent: October 16, 2008 23:21
To: [EMAIL PROTECTED]
Cc: kvm@vger.kernel.org
Subject: Handle INIT before SIPI.


Handle INIT before SIPI.

If a SIPI was issued before the CPU handled the INIT signal, the CPU is not reset.

Signed-off-by: Gleb Natapov [EMAIL PROTECTED]

diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
index 577f2b2..dae05b0 100644
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -349,13 +349,20 @@ static void update_regs_for_sipi(CPUState *env)
 {
 kvm_arch_update_regs_for_sipi(env);
 vcpu_info[env->cpu_index].sipi_needed = 0;
-vcpu_info[env->cpu_index].init = 0;
 }

 static void update_regs_for_init(CPUState *env)
 {
+SegmentCache cs = env->segs[R_CS];
+
 cpu_reset(env);
+
+/* restore SIPI vector */
+if(vcpu_info[env->cpu_index].sipi_needed)
+env->segs[R_CS] = cs;
+
 kvm_arch_load_regs(env);
+vcpu_info[env->cpu_index].init = 0;
 }

 static void setup_kernel_sigmask(CPUState *env)
@@ -411,10 +418,12 @@ static int kvm_main_loop_cpu(CPUState *env)
kvm_main_loop_wait(env, 1000);
	if (env->interrupt_request & (CPU_INTERRUPT_HARD | CPU_INTERRUPT_NMI))
		env->halted = 0;
-   if (!kvm_irqchip_in_kernel(kvm_context) && info->sipi_needed)
-   update_regs_for_sipi(env);
-   if (!kvm_irqchip_in_kernel(kvm_context) && info->init)
-   update_regs_for_init(env);
+if (!kvm_irqchip_in_kernel(kvm_context)) {
+   if (info->init)
+   update_regs_for_init(env);
+   if (info->sipi_needed)
+   update_regs_for_sipi(env);
+}
	if (!env->halted && !info->init)
		kvm_cpu_exec(env);
	env->interrupt_request &= ~CPU_INTERRUPT_EXIT;
--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html