Re: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Gleb Natapov
On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote:
 Zhang, Yang Z wrote on 2012-12-24:
  Gleb Natapov wrote on 2012-12-20:
  On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
  basically to benefit from apicv, we need clear MSR bitmap for
  corresponding x2apic MSRs:
  0x800 - 0x8ff: no read intercept for apicv register virtualization
  TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery
  We do not set Virtualize x2APIC mode bit in secondary execution
  control. If I read the spec correctly without that those MSR read/writes
  will go straight to physical local APIC.
  Right. Now it cannot get benefit, but we may enable it in future and then 
  we can
  benefit from it.
Without enabling it you cannot disable MSR intercept for x2apic MSRs.

 how about to add the following check:
 if (apicv_enabled  virtual_x2apic_enabled)
   clear_msr();
 
I do not understand what do you mean here.

 
  Signed-off-by: Yang Zhang yang.z.zh...@intel.com
  Signed-off-by: Kevin Tian kevin.t...@intel.com
  ---
   arch/x86/kvm/vmx.c |   62
   ++-- 1 files changed,
   55 insertions(+), 7 deletions(-)
  diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
  index be66c3e..9b5e7a2 100644
  --- a/arch/x86/kvm/vmx.c
  +++ b/arch/x86/kvm/vmx.c
  @@ -3773,7 +3773,10 @@ static void free_vpid(struct vcpu_vmx *vmx)
spin_unlock(vmx_vpid_lock);
   }
  -static void __vmx_disable_intercept_for_msr(unsigned long
  *msr_bitmap, u32 msr) +#define MSR_TYPE_R 1 +#define MSR_TYPE_W   2
  +static void __vmx_disable_intercept_for_msr(unsigned long
  *msr_bitmap, +u32 msr, int type)
   {
int f = sizeof(unsigned long);
  @@ -3786,20 +3789,52 @@ static void
  __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr)
 * We can control MSRs 0x-0x1fff and
   0xc000-0xc0001fff.*/ if (msr = 0x1fff) {
  - __clear_bit(msr, msr_bitmap + 0x000 / f); /* read-low */
  - __clear_bit(msr, msr_bitmap + 0x800 / f); /* write-low */
  + if (type  MSR_TYPE_R)
  + /* read-low */
  + __clear_bit(msr, msr_bitmap + 0x000 / f);
  +
  + if (type  MSR_TYPE_W)
  + /* write-low */
  + __clear_bit(msr, msr_bitmap + 0x800 / f);
  +
} else if ((msr = 0xc000)  (msr = 0xc0001fff)) {
msr = 0x1fff;
  - __clear_bit(msr, msr_bitmap + 0x400 / f); /* read-high */
  - __clear_bit(msr, msr_bitmap + 0xc00 / f); /* write-high */
  + if (type  MSR_TYPE_R)
  + /* read-high */
  + __clear_bit(msr, msr_bitmap + 0x400 / f);
  +
  + if (type  MSR_TYPE_W)
  + /* write-high */
  + __clear_bit(msr, msr_bitmap + 0xc00 / f);
  +
}
   }
   
   static void vmx_disable_intercept_for_msr(u32 msr, bool longmode_only)
   {
if (!longmode_only)
  - __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, msr);
  - __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, msr);
  + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, +
  msr,
  MSR_TYPE_R | MSR_TYPE_W);
  + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, +  
  msr,
  MSR_TYPE_R | MSR_TYPE_W); +} + +static void
  vmx_disable_intercept_for_msr_read(u32 msr, bool longmode_only) +{ +
if (!longmode_only)
  + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, +
  msr,
  MSR_TYPE_R); +
__vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, +  
  msr,
  MSR_TYPE_R); +} + +static void vmx_disable_intercept_for_msr_write(u32
  msr, bool longmode_only) +{ + if (!longmode_only)
  + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, +
  msr,
  MSR_TYPE_W); +
__vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, +  
  msr,
  MSR_TYPE_W);
   }
   
   /* @@ -7633,6 +7668,19 @@ static int __init vmx_init(void)
vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
  + if (enable_apicv_reg_vid) {
  + int msr;
  + for (msr = 0x800; msr = 0x8ff; msr++)
  + vmx_disable_intercept_for_msr_read(msr, false);
  +
  + /* TPR */
  + vmx_disable_intercept_for_msr_write(0x808, false);
  + /* EOI */
  + vmx_disable_intercept_for_msr_write(0x80b, false);
  + /* SELF-IPI */
  + vmx_disable_intercept_for_msr_write(0x83f, false);
  + }
  +
if (enable_ept) {
kvm_mmu_set_mask_ptes(0ull,
(enable_ept_ad_bits) ? VMX_EPT_ACCESS_BIT : 0ull,
  --
  1.7.1
  
  --
 Gleb.
  
  
  Best regards,
  Yang
 
 
 
 

Re: [PATCH 1/7] KVM: Write protect the updated slot only when we start dirty logging

2012-12-24 Thread Gleb Natapov
On Tue, Dec 18, 2012 at 04:26:47PM +0900, Takuya Yoshikawa wrote:
 This is needed to make kvm_mmu_slot_remove_write_access() rmap based:
 otherwise we may end up using invalid rmap's.
 
 Signed-off-by: Takuya Yoshikawa yoshikawa_takuya...@lab.ntt.co.jp
 ---
  arch/x86/kvm/x86.c  |9 -
  virt/kvm/kvm_main.c |1 -
  2 files changed, 8 insertions(+), 2 deletions(-)
 
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 1c9c834..9451efa 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -6897,7 +6897,14 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
   spin_lock(kvm-mmu_lock);
   if (nr_mmu_pages)
   kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages);
 - kvm_mmu_slot_remove_write_access(kvm, mem-slot);
 + /*
 +  * Write protect all pages for dirty logging.
 +  * Existing largepage mappings are destroyed here and new ones will
 +  * not be created until the end of the logging.
 +  */
 + if ((mem-flags  KVM_MEM_LOG_DIRTY_PAGES) 
 + !(old.flags  KVM_MEM_LOG_DIRTY_PAGES))
 + kvm_mmu_slot_remove_write_access(kvm, mem-slot);
We should not check old slot flags here or at least check that
old.npages is not zero. Userspace may delete a slot using old flags,
then, if new memslot is created with dirty log enabled, it will not be
protected.

   spin_unlock(kvm-mmu_lock);
   /*
* If memory slot is created, or moved, we need to clear all
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index bd31096..0ef5daa 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -805,7 +805,6 @@ int __kvm_set_memory_region(struct kvm *kvm,
   if ((new.flags  KVM_MEM_LOG_DIRTY_PAGES)  !new.dirty_bitmap) {
   if (kvm_create_dirty_bitmap(new)  0)
   goto out_free;
 - /* destroy any largepage mappings for dirty tracking */
   }
  
   if (!npages || base_gfn != old.base_gfn) {
 -- 
 1.7.5.4

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] qemu-kvm/pci-assign: 64 bits bar emulation

2012-12-24 Thread Alex Williamson
On Sun, 2012-12-23 at 10:06 +0200, Gleb Natapov wrote:
 On Thu, Dec 20, 2012 at 11:07:23AM +0800, Xudong Hao wrote:
  Enable 64 bits bar emulation.
  
  v3 changes from v2:
  - Leave original error string and drop the leading 016.
  
  v2 changes from v1:
  - Change 0lx% to 0x%016 when print a 64 bit variable.
  
  Test pass with the current seabios which already support 64bit pci bars.
  
  Signed-off-by: Xudong Hao xudong@intel.com
 Alex, is this OK with you now?

Yep

Reviewed-by: Alex Williamson alex.william...@redhat.com

  ---
   hw/kvm/pci-assign.c |   14 ++
   1 files changed, 10 insertions(+), 4 deletions(-)
  
  diff --git a/hw/kvm/pci-assign.c b/hw/kvm/pci-assign.c
  index 7a0998c..2271a2e 100644
  --- a/hw/kvm/pci-assign.c
  +++ b/hw/kvm/pci-assign.c
  @@ -46,6 +46,7 @@
   #define IORESOURCE_IRQ  0x0400
   #define IORESOURCE_DMA  0x0800
   #define IORESOURCE_PREFETCH 0x2000  /* No side effects */
  +#define IORESOURCE_MEM_64   0x0010
   
   //#define DEVICE_ASSIGNMENT_DEBUG
   
  @@ -442,9 +443,13 @@ static int assigned_dev_register_regions(PCIRegion 
  *io_regions,
   
   /* handle memory io regions */
   if (cur_region-type  IORESOURCE_MEM) {
  -int t = cur_region-type  IORESOURCE_PREFETCH
  -? PCI_BASE_ADDRESS_MEM_PREFETCH
  -: PCI_BASE_ADDRESS_SPACE_MEMORY;
  +int t = PCI_BASE_ADDRESS_SPACE_MEMORY;
  +if (cur_region-type  IORESOURCE_PREFETCH) {
  +t |= PCI_BASE_ADDRESS_MEM_PREFETCH;
  +}
  +if (cur_region-type  IORESOURCE_MEM_64) {
  +t |= PCI_BASE_ADDRESS_MEM_TYPE_64;
  +}
   
   /* map physical memory */
   pci_dev-v_addrs[i].u.r_virtbase = mmap(NULL, cur_region-size,
  @@ -632,7 +637,8 @@ again:
   rp-valid = 0;
   rp-resource_fd = -1;
   size = end - start + 1;
  -flags = IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH;
  +flags = IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH
  + | IORESOURCE_MEM_64;
   if (size == 0 || (flags  ~IORESOURCE_PREFETCH) == 0) {
   continue;
   }
  -- 
  1.5.5
 
 --
   Gleb.



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: Fix memory slot generation updates

2012-12-24 Thread Alex Williamson
On Sun, 2012-12-23 at 10:55 +0200, Gleb Natapov wrote:
 On Fri, Dec 21, 2012 at 08:20:16AM -0700, Alex Williamson wrote:
  Previous patch kvm: Minor memory slot optimization overlooked the
  generation field of the memory slots.  Re-using the original memory
  slots left us with with two slightly different memory slots with the
  same generation.  To fix this, make update_memslots() take a new
  parameter to specify the last generation.  This also makes generation
  management more explicit to avoid such problems in the future.
  
  Reported-by: Takuya Yoshikawa yoshikawa_takuya...@lab.ntt.co.jp
  Signed-off-by: Alex Williamson alex.william...@redhat.com
 Applied, thanks. What about this small cleanup on to of the patch:

Nice cleanup

Reviewed-by: Alex Williamson alex.william...@redhat.com

 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index 14cbae8..e45c20c 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -702,6 +702,17 @@ static int check_memory_region_flags(struct 
 kvm_userspace_memory_region *mem)
   return 0;
  }
  
 +static struct kvm_memslots *install_new_memslots(struct kvm *kvm,
 + struct kvm_memslots *slots, struct kvm_memory_slot *new)
 +{
 + struct kvm_memslots *old_memslots = kvm-memslots;
 +
 + update_memslots(slots, new, kvm-memslots-generation);
 + rcu_assign_pointer(kvm-memslots, slots);
 + synchronize_srcu_expedited(kvm-srcu);
 + return old_memslots; 
 +}
 +
  /*
   * Allocate some memory and give it an address in the guest physical address
   * space.
 @@ -820,11 +831,8 @@ int __kvm_set_memory_region(struct kvm *kvm,
   slot = id_to_memslot(slots, mem-slot);
   slot-flags |= KVM_MEMSLOT_INVALID;
  
 - update_memslots(slots, NULL, kvm-memslots-generation);
 + old_memslots = install_new_memslots(kvm, slots, NULL);
  
 - old_memslots = kvm-memslots;
 - rcu_assign_pointer(kvm-memslots, slots);
 - synchronize_srcu_expedited(kvm-srcu);
   /* slot was deleted or moved, clear iommu mapping */
   kvm_iommu_unmap_pages(kvm, old);
   /* From this point no new shadow pages pointing to a deleted,
 @@ -868,10 +876,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
   memset(new.arch, 0, sizeof(new.arch));
   }
  
 - update_memslots(slots, new, kvm-memslots-generation);
 - old_memslots = kvm-memslots;
 - rcu_assign_pointer(kvm-memslots, slots);
 - synchronize_srcu_expedited(kvm-srcu);
 + old_memslots = install_new_memslots(kvm, slots, new);
  
   kvm_arch_commit_memory_region(kvm, mem, old, user_alloc);
  
 --
   Gleb.



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-24:
 On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote:
 Zhang, Yang Z wrote on 2012-12-24:
 Gleb Natapov wrote on 2012-12-20:
 On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
 basically to benefit from apicv, we need clear MSR bitmap for
 corresponding x2apic MSRs:
 0x800 - 0x8ff: no read intercept for apicv register virtualization
 TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery
 We do not set Virtualize x2APIC mode bit in secondary execution
 control. If I read the spec correctly without that those MSR read/writes
 will go straight to physical local APIC.
 Right. Now it cannot get benefit, but we may enable it in future and
 then we can benefit from it.
 Without enabling it you cannot disable MSR intercept for x2apic MSRs.
 
 how about to add the following check:
 if (apicv_enabled  virtual_x2apic_enabled)
  clear_msr();
 
 I do not understand what do you mean here.
In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. As 
you said, since kvm doesn't set virtualize x2apic mode, APIC register 
virtualization never take effect. So we need to clear MSR bitmap only when 
apicv enabled and virtualize x2apic mode set.

 
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 Signed-off-by: Kevin Tian kevin.t...@intel.com
 ---
  arch/x86/kvm/vmx.c |   62
  ++-- 1 files
  changed, 55 insertions(+), 7 deletions(-)
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index be66c3e..9b5e7a2 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -3773,7 +3773,10 @@ static void free_vpid(struct vcpu_vmx *vmx)
   spin_unlock(vmx_vpid_lock);
  }
 -static void __vmx_disable_intercept_for_msr(unsigned long
 *msr_bitmap, u32 msr) +#define MSR_TYPE_R 1 +#define MSR_TYPE_W   2
 +static void __vmx_disable_intercept_for_msr(unsigned long
 *msr_bitmap, +u32 msr, int type)
  {
   int f = sizeof(unsigned long);
 @@ -3786,20 +3789,52 @@ static void
 __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr)
* We can control MSRs 0x-0x1fff and
  0xc000-0xc0001fff.*/ if (msr = 0x1fff) {
 - __clear_bit(msr, msr_bitmap + 0x000 / f); /* read-low */
 - __clear_bit(msr, msr_bitmap + 0x800 / f); /* write-low */
 + if (type  MSR_TYPE_R)
 + /* read-low */
 + __clear_bit(msr, msr_bitmap + 0x000 / f);
 +
 + if (type  MSR_TYPE_W)
 + /* write-low */
 + __clear_bit(msr, msr_bitmap + 0x800 / f);
 +
   } else if ((msr = 0xc000)  (msr = 0xc0001fff)) {
   msr = 0x1fff;
 - __clear_bit(msr, msr_bitmap + 0x400 / f); /* read-high */
 - __clear_bit(msr, msr_bitmap + 0xc00 / f); /* write-high */
 + if (type  MSR_TYPE_R)
 + /* read-high */
 + __clear_bit(msr, msr_bitmap + 0x400 / f);
 +
 + if (type  MSR_TYPE_W)
 + /* write-high */
 + __clear_bit(msr, msr_bitmap + 0xc00 / f);
 +
   }
  }
  
  static void vmx_disable_intercept_for_msr(u32 msr, bool
  longmode_only) { if (!longmode_only)
 - __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, msr);
 - __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, msr);
 + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, +
   msr, MSR_TYPE_R | MSR_TYPE_W);
 + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, +
   msr, MSR_TYPE_R | MSR_TYPE_W); +} + 
 +static void
 vmx_disable_intercept_for_msr_read(u32 msr, bool longmode_only) +{ +
   if (!longmode_only)
 + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, +
   msr, MSR_TYPE_R); +
   __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, +
   msr, MSR_TYPE_R); +} + +static void
 vmx_disable_intercept_for_msr_write(u32 msr, bool longmode_only) +{
 + if (!longmode_only)
 + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, +
   msr, MSR_TYPE_W); +
   __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, +
   msr, MSR_TYPE_W);
  }
  
  /* @@ -7633,6 +7668,19 @@ static int __init vmx_init(void)
   vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
   vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
 + if (enable_apicv_reg_vid) {
 + int msr;
 + for (msr = 0x800; msr = 0x8ff; msr++)
 + vmx_disable_intercept_for_msr_read(msr, false);
 +
 + /* TPR */
 + vmx_disable_intercept_for_msr_write(0x808, false);
 + /* EOI */
 + vmx_disable_intercept_for_msr_write(0x80b, false);
 + /* SELF-IPI */
 + vmx_disable_intercept_for_msr_write(0x83f, false);
 + }
 +
   if (enable_ept) {
   

Re: [PATCH 1/7] KVM: Write protect the updated slot only when we start dirty logging

2012-12-24 Thread Takuya Yoshikawa
On Mon, 24 Dec 2012 15:27:17 +0200
Gleb Natapov g...@redhat.com wrote:

  @@ -6897,7 +6897,14 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
  spin_lock(kvm-mmu_lock);
  if (nr_mmu_pages)
  kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages);
  -   kvm_mmu_slot_remove_write_access(kvm, mem-slot);
  +   /*
  +* Write protect all pages for dirty logging.
  +* Existing largepage mappings are destroyed here and new ones will
  +* not be created until the end of the logging.
  +*/
  +   if ((mem-flags  KVM_MEM_LOG_DIRTY_PAGES) 
  +   !(old.flags  KVM_MEM_LOG_DIRTY_PAGES))
  +   kvm_mmu_slot_remove_write_access(kvm, mem-slot);
 We should not check old slot flags here or at least check that
 old.npages is not zero. Userspace may delete a slot using old flags,
 then, if new memslot is created with dirty log enabled, it will not be
 protected.

The flag, KVM_MEM_LOG_DIRTY_PAGES, is explicitely cleared when we
delete a slot to free dirty_bitmap:

if (!npages)
mem-flags = ~KVM_MEM_LOG_DIRTY_PAGES;

So when old.npages is not zero, the second condition must be true.

Other parts are doing if (!slot-dirty_bitmap) to see if the slot
is in dirty logging mode.  If you prefer, we can do the same here.

Thanks,
Takuya
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/7] KVM: Write protect the updated slot only when we start dirty logging

2012-12-24 Thread Gleb Natapov
On Tue, Dec 25, 2012 at 01:08:40PM +0900, Takuya Yoshikawa wrote:
 On Mon, 24 Dec 2012 15:27:17 +0200
 Gleb Natapov g...@redhat.com wrote:
 
   @@ -6897,7 +6897,14 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 spin_lock(kvm-mmu_lock);
 if (nr_mmu_pages)
 kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages);
   - kvm_mmu_slot_remove_write_access(kvm, mem-slot);
   + /*
   +  * Write protect all pages for dirty logging.
   +  * Existing largepage mappings are destroyed here and new ones will
   +  * not be created until the end of the logging.
   +  */
   + if ((mem-flags  KVM_MEM_LOG_DIRTY_PAGES) 
   + !(old.flags  KVM_MEM_LOG_DIRTY_PAGES))
   + kvm_mmu_slot_remove_write_access(kvm, mem-slot);
  We should not check old slot flags here or at least check that
  old.npages is not zero. Userspace may delete a slot using old flags,
  then, if new memslot is created with dirty log enabled, it will not be
  protected.
 
 The flag, KVM_MEM_LOG_DIRTY_PAGES, is explicitely cleared when we
 delete a slot to free dirty_bitmap:
 
   if (!npages)
   mem-flags = ~KVM_MEM_LOG_DIRTY_PAGES;
 
 So when old.npages is not zero, the second condition must be true.
 
Right. I was looking for where it is cleared and missed that.

 Other parts are doing if (!slot-dirty_bitmap) to see if the slot
 is in dirty logging mode.  If you prefer, we can do the same here.
 
Does not matter to me. If you think it will be more consistent do it.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/7] KVM: Write protect the updated slot only when we start dirty logging

2012-12-24 Thread Takuya Yoshikawa
On Tue, 25 Dec 2012 07:05:55 +0200
Gleb Natapov g...@redhat.com wrote:

  Other parts are doing if (!slot-dirty_bitmap) to see if the slot
  is in dirty logging mode.  If you prefer, we can do the same here.
  
 Does not matter to me. If you think it will be more consistent do it.

No preference at the moment.  But since .dirty_bitmap and .rmap may be
changed to control free_physmem(new, old) behaviour, I'll keep the code
as is.

I'll send some patches to clean up the current mem, new, old, slot
games in __kvm_set_memory_region() later.  I was confused many times
by these.

In addition, passing old slot by value to kvm_arch_commit_memory_region()
should be fixed as well.  The structure is too big to copy.

Thanks,
Takuya
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Gleb Natapov
On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-24:
  On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote:
  Zhang, Yang Z wrote on 2012-12-24:
  Gleb Natapov wrote on 2012-12-20:
  On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
  basically to benefit from apicv, we need clear MSR bitmap for
  corresponding x2apic MSRs:
  0x800 - 0x8ff: no read intercept for apicv register virtualization
  TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery
  We do not set Virtualize x2APIC mode bit in secondary execution
  control. If I read the spec correctly without that those MSR read/writes
  will go straight to physical local APIC.
  Right. Now it cannot get benefit, but we may enable it in future and
  then we can benefit from it.
  Without enabling it you cannot disable MSR intercept for x2apic MSRs.
  
  how about to add the following check:
  if (apicv_enabled  virtual_x2apic_enabled)
 clear_msr();
  
  I do not understand what do you mean here.
 In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. As 
 you said, since kvm doesn't set virtualize x2apic mode, APIC register 
 virtualization never take effect. So we need to clear MSR bitmap only when 
 apicv enabled and virtualize x2apic mode set.
 
But currently it is never set.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-25:
 On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-24:
 On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote:
 Zhang, Yang Z wrote on 2012-12-24:
 Gleb Natapov wrote on 2012-12-20:
 On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
 basically to benefit from apicv, we need clear MSR bitmap for
 corresponding x2apic MSRs:
 0x800 - 0x8ff: no read intercept for apicv register virtualization
 TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery
 We do not set Virtualize x2APIC mode bit in secondary execution
 control. If I read the spec correctly without that those MSR read/writes
 will go straight to physical local APIC.
 Right. Now it cannot get benefit, but we may enable it in future and
 then we can benefit from it.
 Without enabling it you cannot disable MSR intercept for x2apic MSRs.
 
 how about to add the following check:
 if (apicv_enabled  virtual_x2apic_enabled)
clear_msr();
 
 I do not understand what do you mean here.
 In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. As 
 you
 said, since kvm doesn't set virtualize x2apic mode, APIC register 
 virtualization
 never take effect. So we need to clear MSR bitmap only when apicv enabled and
 virtualize x2apic mode set.
 
 But currently it is never set.
So you think the third patch is not necessary currently? Unless we enabled 
virtualize x2apic mode.

Best regards,
Yang


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Gleb Natapov
On Tue, Dec 25, 2012 at 06:42:59AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-25:
  On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote:
  Gleb Natapov wrote on 2012-12-24:
  On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote:
  Zhang, Yang Z wrote on 2012-12-24:
  Gleb Natapov wrote on 2012-12-20:
  On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
  basically to benefit from apicv, we need clear MSR bitmap for
  corresponding x2apic MSRs:
  0x800 - 0x8ff: no read intercept for apicv register virtualization
  TPR,EOI,SELF-IPI: no write intercept for virtual interrupt 
  delivery
  We do not set Virtualize x2APIC mode bit in secondary execution
  control. If I read the spec correctly without that those MSR 
  read/writes
  will go straight to physical local APIC.
  Right. Now it cannot get benefit, but we may enable it in future and
  then we can benefit from it.
  Without enabling it you cannot disable MSR intercept for x2apic MSRs.
  
  how about to add the following check:
  if (apicv_enabled  virtual_x2apic_enabled)
   clear_msr();
  
  I do not understand what do you mean here.
  In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. 
  As you
  said, since kvm doesn't set virtualize x2apic mode, APIC register 
  virtualization
  never take effect. So we need to clear MSR bitmap only when apicv enabled 
  and
  virtualize x2apic mode set.
  
  But currently it is never set.
 So you think the third patch is not necessary currently? Unless we enabled 
 virtualize x2apic mode.
 
Without third patch vid will not work properly if a guest is in x2apic
mode. Actually second and third patches need to be reordered to not have
a windows where x2apic is broken. The problem is that this patch itself
is buggy since it does not set virtualize x2apic mode flag. It should
set the flag if vid is enabled and if the flag cannot be set vid should
be forced off.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-25:
 On Tue, Dec 25, 2012 at 06:42:59AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-25:
 On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-24:
 On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote:
 Zhang, Yang Z wrote on 2012-12-24:
 Gleb Natapov wrote on 2012-12-20:
 On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
 basically to benefit from apicv, we need clear MSR bitmap for
 corresponding x2apic MSRs:
 0x800 - 0x8ff: no read intercept for apicv register virtualization
 TPR,EOI,SELF-IPI: no write intercept for virtual interrupt 
 delivery
 We do not set Virtualize x2APIC mode bit in secondary execution
 control. If I read the spec correctly without that those MSR 
 read/writes
 will go straight to physical local APIC.
 Right. Now it cannot get benefit, but we may enable it in future and
 then we can benefit from it.
 Without enabling it you cannot disable MSR intercept for x2apic MSRs.
 
 how about to add the following check:
 if (apicv_enabled  virtual_x2apic_enabled)
  clear_msr();
 
 I do not understand what do you mean here.
 In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. 
 As
 you
 said, since kvm doesn't set virtualize x2apic mode, APIC register
 virtualization never take effect. So we need to clear MSR bitmap only
 when apicv enabled and virtualize x2apic mode set.
 
 But currently it is never set.
 So you think the third patch is not necessary currently? Unless we
 enabled virtualize x2apic mode.
 
 Without third patch vid will not work properly if a guest is in x2apic
 mode. Actually second and third patches need to be reordered to not have
 a windows where x2apic is broken. The problem is that this patch itself
 is buggy since it does not set virtualize x2apic mode flag. It should
 set the flag if vid is enabled and if the flag cannot be set vid should
 be forced off.
In what conditions this flag cannot be set? I think the only case is that KVM 
doesn't expose the x2apic capability to guest, if this is true, the guest will 
never use x2apic and we still can use vid.

Best regards,
Yang


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Gleb Natapov
On Tue, Dec 25, 2012 at 07:25:15AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-25:
  On Tue, Dec 25, 2012 at 06:42:59AM +, Zhang, Yang Z wrote:
  Gleb Natapov wrote on 2012-12-25:
  On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote:
  Gleb Natapov wrote on 2012-12-24:
  On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote:
  Zhang, Yang Z wrote on 2012-12-24:
  Gleb Natapov wrote on 2012-12-20:
  On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
  basically to benefit from apicv, we need clear MSR bitmap for
  corresponding x2apic MSRs:
  0x800 - 0x8ff: no read intercept for apicv register 
  virtualization
  TPR,EOI,SELF-IPI: no write intercept for virtual interrupt 
  delivery
  We do not set Virtualize x2APIC mode bit in secondary execution
  control. If I read the spec correctly without that those MSR 
  read/writes
  will go straight to physical local APIC.
  Right. Now it cannot get benefit, but we may enable it in future and
  then we can benefit from it.
  Without enabling it you cannot disable MSR intercept for x2apic MSRs.
  
  how about to add the following check:
  if (apicv_enabled  virtual_x2apic_enabled)
 clear_msr();
  
  I do not understand what do you mean here.
  In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv 
  enabled. As
  you
  said, since kvm doesn't set virtualize x2apic mode, APIC register
  virtualization never take effect. So we need to clear MSR bitmap only
  when apicv enabled and virtualize x2apic mode set.
  
  But currently it is never set.
  So you think the third patch is not necessary currently? Unless we
  enabled virtualize x2apic mode.
  
  Without third patch vid will not work properly if a guest is in x2apic
  mode. Actually second and third patches need to be reordered to not have
  a windows where x2apic is broken. The problem is that this patch itself
  is buggy since it does not set virtualize x2apic mode flag. It should
  set the flag if vid is enabled and if the flag cannot be set vid should
  be forced off.
 In what conditions this flag cannot be set? I think the only case is that KVM 
 doesn't expose the x2apic capability to guest, if this is true, the guest 
 will never use x2apic and we still can use vid.
 
We can indeed set virtualize x2apic mode unconditionally since it does
not take any effect if x2apic MSRs are intercepted.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-25:
 On Tue, Dec 25, 2012 at 07:25:15AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-25:
 On Tue, Dec 25, 2012 at 06:42:59AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-25:
 On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-24:
 On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote:
 Zhang, Yang Z wrote on 2012-12-24:
 Gleb Natapov wrote on 2012-12-20:
 On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
 basically to benefit from apicv, we need clear MSR bitmap for
 corresponding x2apic MSRs:
 0x800 - 0x8ff: no read intercept for apicv register 
 virtualization
 TPR,EOI,SELF-IPI: no write intercept for virtual interrupt
 delivery
 We do not set Virtualize x2APIC mode bit in secondary
 execution control. If I read the spec correctly without that
 those MSR read/writes will go straight to physical local APIC.
 Right. Now it cannot get benefit, but we may enable it in future and
 then we can benefit from it.
 Without enabling it you cannot disable MSR intercept for x2apic MSRs.
 
 how about to add the following check:
 if (apicv_enabled  virtual_x2apic_enabled)
clear_msr();
 
 I do not understand what do you mean here.
 In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled.
 As
 you
 said, since kvm doesn't set virtualize x2apic mode, APIC register
 virtualization never take effect. So we need to clear MSR bitmap only
 when apicv enabled and virtualize x2apic mode set.
 
 But currently it is never set.
 So you think the third patch is not necessary currently? Unless we
 enabled virtualize x2apic mode.
 
 Without third patch vid will not work properly if a guest is in x2apic
 mode. Actually second and third patches need to be reordered to not have
 a windows where x2apic is broken. The problem is that this patch itself
 is buggy since it does not set virtualize x2apic mode flag. It should
 set the flag if vid is enabled and if the flag cannot be set vid should
 be forced off.
 In what conditions this flag cannot be set? I think the only case is that KVM
 doesn't expose the x2apic capability to guest, if this is true, the guest 
 will never
 use x2apic and we still can use vid.
 
 We can indeed set virtualize x2apic mode unconditionally since it does
 not take any effect if x2apic MSRs are intercepted.
No. Since Virtual APIC access must be cleared if virtualize x2apic mode is 
set, and if guest still use xAPIC, then there should be lots of ept violations 
for apic access emulation. This will hurt performance.
We should only set virtualize x2apic mode when guest really uses x2apic(guest 
set bit 11 of APIC_BASE_MSR).

Best regards,
Yang


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-24 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2012-12-21:
 On Thu, Dec 20, 2012 at 03:12:32PM +0200, Gleb Natapov wrote:
 On Thu, Dec 20, 2012 at 10:53:16AM -0200, Marcelo Tosatti wrote:
 On Thu, Dec 20, 2012 at 08:42:06AM +0200, Gleb Natapov wrote:
 On Wed, Dec 19, 2012 at 10:59:36PM -0200, Marcelo Tosatti wrote:
 On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
 Virtual interrupt delivery avoids KVM to inject vAPIC interrupts
 manually, which is fully taken care of by the hardware. This needs
 some special awareness into existing interrupr injection path:
 
 - for pending interrupt, instead of direct injection, we may need
   update architecture specific indicators before resuming to guest.
 - A pending interrupt, which is masked by ISR, should be also
   considered in above update action, since hardware will decide
   when to inject it at right time. Current has_interrupt and
   get_interrupt only returns a valid vector from injection p.o.v.
 Signed-off-by: Kevin Tian kevin.t...@intel.com
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 
 
 Resuming previous discussion:
 
 How about to recaculate irr_pending according the VIRR on each
 vmexit?
 
 No need really. Since HW can only clear VIRR the only situation that
 may
 happen is that irr_pending will be true but VIRR is empty and
 apic_find_highest_irr() will return correct result in this case.
 
 Self-IPI does cause VIRR to be set, see 29.1.5 Self-IPI
 Virtualization.
 
 True. But as I said later in that discussion once irr_pending is set
 to true it never becomes false, so the optimization is effectively
 disable. We can set it to true doing apic initialization to make it
 explicit.
 
 Its just confusing, to have a variable which has different meanings
 in different configurations. I would rather have it explicit that
 its not used rather than check every time the i read the code.
 
 if (apic_vid() == 0  !apic-irr_pending)
 return -1;
 
 I'd prefer to avoid this additional if() especially as its sole purpose
 is documentation.  We can add comment instead. Note that irr_pending
 is just a hint anyway.  It can be true when no interrupt is pending in
 irr. We can even rename it to irr_pending_hint or something.
 
 Works for me (documentation).
 
 Not sure if you can skip it, its probably necessary to calculate it
 before HW does so (say migration etc).
 kvm_apic_has_interrupt() is not called during migration and
 kvm_apic_post_state_restore() calls apic_update_ppr() explicitly.
 I am not sure it is needed though since migrated value should be already
 correct anyway.
 
 Ok, best force isr_count to 1 if apic vintr enabled (and add a comment,
 please).

Sorry for the later reply. I will address those problems according your 
comments.

Best regards,
Yang


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Gleb Natapov
On Tue, Dec 25, 2012 at 07:46:53AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-25:
  On Tue, Dec 25, 2012 at 07:25:15AM +, Zhang, Yang Z wrote:
  Gleb Natapov wrote on 2012-12-25:
  On Tue, Dec 25, 2012 at 06:42:59AM +, Zhang, Yang Z wrote:
  Gleb Natapov wrote on 2012-12-25:
  On Mon, Dec 24, 2012 at 11:53:37PM +, Zhang, Yang Z wrote:
  Gleb Natapov wrote on 2012-12-24:
  On Mon, Dec 24, 2012 at 02:35:35AM +, Zhang, Yang Z wrote:
  Zhang, Yang Z wrote on 2012-12-24:
  Gleb Natapov wrote on 2012-12-20:
  On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
  basically to benefit from apicv, we need clear MSR bitmap for
  corresponding x2apic MSRs:
  0x800 - 0x8ff: no read intercept for apicv register 
  virtualization
  TPR,EOI,SELF-IPI: no write intercept for virtual interrupt
  delivery
  We do not set Virtualize x2APIC mode bit in secondary
  execution control. If I read the spec correctly without that
  those MSR read/writes will go straight to physical local APIC.
  Right. Now it cannot get benefit, but we may enable it in future and
  then we can benefit from it.
  Without enabling it you cannot disable MSR intercept for x2apic MSRs.
  
  how about to add the following check:
  if (apicv_enabled  virtual_x2apic_enabled)
   clear_msr();
  
  I do not understand what do you mean here.
  In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv 
  enabled.
  As
  you
  said, since kvm doesn't set virtualize x2apic mode, APIC register
  virtualization never take effect. So we need to clear MSR bitmap only
  when apicv enabled and virtualize x2apic mode set.
  
  But currently it is never set.
  So you think the third patch is not necessary currently? Unless we
  enabled virtualize x2apic mode.
  
  Without third patch vid will not work properly if a guest is in x2apic
  mode. Actually second and third patches need to be reordered to not have
  a windows where x2apic is broken. The problem is that this patch itself
  is buggy since it does not set virtualize x2apic mode flag. It should
  set the flag if vid is enabled and if the flag cannot be set vid should
  be forced off.
  In what conditions this flag cannot be set? I think the only case is that 
  KVM
  doesn't expose the x2apic capability to guest, if this is true, the guest 
  will never
  use x2apic and we still can use vid.
  
  We can indeed set virtualize x2apic mode unconditionally since it does
  not take any effect if x2apic MSRs are intercepted.
 No. Since Virtual APIC access must be cleared if virtualize x2apic mode 
 is set, and if guest still use xAPIC, then there should be lots of ept 
 violations for apic access emulation. This will hurt performance.
Stupid HW, why this pointless limitation? Can you point me where SDM says that?

 We should only set virtualize x2apic mode when guest really uses 
 x2apic(guest set bit 11 of APIC_BASE_MSR).
 
Looks like SDM force us to.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html