Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2013-01-01 Thread Marcelo Tosatti
On Thu, Dec 27, 2012 at 06:25:25AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-27:
  On Thu, Dec 27, 2012 at 02:24:04AM +, Zhang, Yang Z wrote:
  Gleb Natapov wrote on 2012-12-21:
  On Fri, Dec 21, 2012 at 09:39:20AM -0200, Marcelo Tosatti wrote:
  On Fri, Dec 21, 2012 at 09:51:40AM +0200, Gleb Natapov wrote:
  On Thu, Dec 20, 2012 at 08:59:11PM -0200, Marcelo Tosatti wrote:
  On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
  From: Yang Zhang yang.z.zh...@intel.com
  
Virtual interrupt delivery avoids the need for KVM to inject vAPIC
interrupts manually; that is fully taken care of by the hardware. This
needs some special awareness in the existing interrupt injection path:

- For a pending interrupt, instead of direct injection, we may need to
  update architecture-specific indicators before resuming the guest.
- A pending interrupt that is masked by the ISR should also be
  considered in the above update action, since hardware will decide when
  to inject it at the right time. The current has_interrupt and
  get_interrupt only return a valid vector from the injection p.o.v.
  Signed-off-by: Kevin Tian kevin.t...@intel.com
  Signed-off-by: Yang Zhang yang.z.zh...@intel.com
  ---
 arch/ia64/kvm/lapic.h           |    6 ++
 arch/x86/include/asm/kvm_host.h |    6 ++
 arch/x86/include/asm/vmx.h      |   11 +++
 arch/x86/kvm/irq.c              |   56 +-
 arch/x86/kvm/lapic.c            |   65 ++---
 arch/x86/kvm/lapic.h            |   28 ++-
 arch/x86/kvm/svm.c              |   24 ++
 arch/x86/kvm/vmx.c              |  154 ++-
 arch/x86/kvm/x86.c              |   11 ++-
 include/linux/kvm_host.h        |    2 +
 virt/kvm/ioapic.c               |   36 +
 virt/kvm/ioapic.h               |    1 +
 virt/kvm/irq_comm.c             |   20 +
 13 files changed, 379 insertions(+), 41 deletions(-)
  diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
  index c5f92a9..cb59eb4 100644
  --- a/arch/ia64/kvm/lapic.h
  +++ b/arch/ia64/kvm/lapic.h
  @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu,
  struct
  kvm_lapic_irq *irq);
   #define kvm_apic_present(x) (true)
   #define kvm_lapic_enabled(x) (true)
  +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
  + struct kvm_lapic_irq *irq)
  +{
  + /* IA64 has no apicv supporting, do nothing here */
  +}
  +
   #endif
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c431b33..b63a144 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -697,6 +697,11 @@ struct kvm_x86_ops {
 	void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
 	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
 	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
+	int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
+	void (*update_apic_irq)(struct kvm_vcpu *vcpu, int max_irr);
+	void (*update_eoi_exitmap)(struct kvm *kvm, struct kvm_lapic_irq *irq);
+	void (*reset_eoi_exitmap)(struct kvm_vcpu *vcpu);
+	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
 	int (*get_tdp_level)(void);
 	u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
  
EOI exit bitmap is problematic (it's racy). Please do this:

1. Make a synchronous (1) KVM_REQ_EOIBITMAP request on IOAPIC register
   modifications which require EOI exit bitmap updates.
2. On VM-entry, during KVM_REQ_EOIBITMAP processing, each vcpu checks the
   IOAPIC map and adjusts its own EOI exit bitmap VMCS registers.

1) that is, a request that waits until remotely executing VCPUs have
acknowledged it, using make_all_cpus_request (see virt/kvm/kvm_main.c),
similarly to remote TLB flushes.

What is the problem now: there is no control over _when_ a VCPU updates
its EOI exit bitmap VMCS register from the (remotely updated) master EOI
exit bitmap. The VCPU can be processing a KVM_REQ_EOIBITMAP relative to a
preceding IOAPIC register write while the current IOAPIC register write
is updating the EOI exit bitmap. There is no way to fix that without
locking (which can be avoided if the IOAPIC-to-EOI-exit-bitmap
synchronization is vcpu local).
  
The race is benign. We have a similar one for interrupt injection, and
the same race exists on real HW. The two cases that can happen due to
the race are:

1. exitbitmap bit X is changed from 1 to 0:
   No problem. It is harmless to do an exit; on the next entry the
   exitbitmap will be fixed.
2. exitbitmap bit X is changed from 0 to 1:
   If the vcpu is serving X at the time this happens, it was delivered
   as edge, so there is no need to exit. The exitbitmap will be updated
   after the next vmexit, which will happen due to KVM_REQ_EOIBITMAP
   processing.
  
  1. Missed the case where bitmap is being reset (where 

RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-26 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-21:
 On Fri, Dec 21, 2012 at 09:39:20AM -0200, Marcelo Tosatti wrote:
 On Fri, Dec 21, 2012 at 09:51:40AM +0200, Gleb Natapov wrote:
 On Thu, Dec 20, 2012 at 08:59:11PM -0200, Marcelo Tosatti wrote:
 On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
Virtual interrupt delivery avoids the need for KVM to inject vAPIC
interrupts manually; that is fully taken care of by the hardware. This
needs some special awareness in the existing interrupt injection path:

- For a pending interrupt, instead of direct injection, we may need to
  update architecture-specific indicators before resuming the guest.
- A pending interrupt that is masked by the ISR should also be
  considered in the above update action, since hardware will decide when
  to inject it at the right time. The current has_interrupt and
  get_interrupt only return a valid vector from the injection p.o.v.
 Signed-off-by: Kevin Tian kevin.t...@intel.com
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 ---
 arch/ia64/kvm/lapic.h           |    6 ++
 arch/x86/include/asm/kvm_host.h |    6 ++
 arch/x86/include/asm/vmx.h      |   11 +++
 arch/x86/kvm/irq.c              |   56 +-
 arch/x86/kvm/lapic.c            |   65 ++---
 arch/x86/kvm/lapic.h            |   28 ++-
 arch/x86/kvm/svm.c              |   24 ++
 arch/x86/kvm/vmx.c              |  154 ++-
 arch/x86/kvm/x86.c              |   11 ++-
 include/linux/kvm_host.h        |    2 +
 virt/kvm/ioapic.c               |   36 +
 virt/kvm/ioapic.h               |    1 +
 virt/kvm/irq_comm.c             |   20 +
 13 files changed, 379 insertions(+), 41 deletions(-)
 diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
 index c5f92a9..cb59eb4 100644
 --- a/arch/ia64/kvm/lapic.h
 +++ b/arch/ia64/kvm/lapic.h
 @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct
 kvm_lapic_irq *irq);
  #define kvm_apic_present(x) (true)
  #define kvm_lapic_enabled(x) (true)
 +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
 + struct kvm_lapic_irq *irq)
 +{
 + /* IA64 has no apicv supporting, do nothing here */
 +}
 +
  #endif
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c431b33..b63a144 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -697,6 +697,11 @@ struct kvm_x86_ops {
 	void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
 	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
 	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
+	int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
+	void (*update_apic_irq)(struct kvm_vcpu *vcpu, int max_irr);
+	void (*update_eoi_exitmap)(struct kvm *kvm, struct kvm_lapic_irq *irq);
+	void (*reset_eoi_exitmap)(struct kvm_vcpu *vcpu);
+	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
 	int (*get_tdp_level)(void);
 	u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
 
EOI exit bitmap is problematic (it's racy). Please do this:

1. Make a synchronous (1) KVM_REQ_EOIBITMAP request on IOAPIC register
   modifications which require EOI exit bitmap updates.
2. On VM-entry, during KVM_REQ_EOIBITMAP processing, each vcpu checks the
   IOAPIC map and adjusts its own EOI exit bitmap VMCS registers.

1) that is, a request that waits until remotely executing VCPUs have
acknowledged it, using make_all_cpus_request (see virt/kvm/kvm_main.c),
similarly to remote TLB flushes.

What is the problem now: there is no control over _when_ a VCPU updates
its EOI exit bitmap VMCS register from the (remotely updated) master EOI
exit bitmap. The VCPU can be processing a KVM_REQ_EOIBITMAP relative to a
preceding IOAPIC register write while the current IOAPIC register write
is updating the EOI exit bitmap. There is no way to fix that without
locking (which can be avoided if the IOAPIC-to-EOI-exit-bitmap
synchronization is vcpu local).
 
The race is benign. We have a similar one for interrupt injection, and
the same race exists on real HW. The two cases that can happen due to
the race are:

1. exitbitmap bit X is changed from 1 to 0:
   No problem. It is harmless to do an exit; on the next entry the
   exitbitmap will be fixed.
2. exitbitmap bit X is changed from 0 to 1:
   If the vcpu is serving X at the time this happens, it was delivered
   as edge, so there is no need to exit. The exitbitmap will be updated
   after the next vmexit, which will happen due to KVM_REQ_EOIBITMAP
   processing.
 
 1. Missed the case where bitmap is being reset (where EOI exit bitmaps
 are zeroed). In this case vcpu enters guest with incorrect EOI exit
 bitmap.
 
Right, the bitmap reset is problematic indeed, since it does not
represent a real vector configuration change.
 
 2. Missed the case where current code allows vcpu to enter guest
 with EOI exit bitmap 

RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-26 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2012-12-21:
 On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
 @@ -3925,6 +3942,15 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
  vmx_secondary_exec_control(vmx));
  }
 +if (enable_apicv_reg_vid) {
 +vmcs_write64(EOI_EXIT_BITMAP0, 0);
 +vmcs_write64(EOI_EXIT_BITMAP1, 0);
 +vmcs_write64(EOI_EXIT_BITMAP2, 0);
 +vmcs_write64(EOI_EXIT_BITMAP3, 0);
 +
 +vmcs_write16(GUEST_INTR_STATUS, 0);
 +}
 
 AFAICS SVI should be regenerated on migration. Consider:
 
 1. vintr delivery, sets SVI = vector = RVI.
 2. clears RVI.
 3. migration.
 4. RVI properly set from VIRR on entry.
 5. SVI = 0.
 6. EOI - EOI virtualization with SVI = 0.
 
 Could hook into kvm_apic_post_state_restore() to do that (set highest
 index of bit set in VISR).
Ok. How about making a request (KVM_REQ_UPDATE_SVI) and handling it on
vmentry to set SVI to the highest index of a bit set in VISR?

Best regards,
Yang


--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-26 Thread Gleb Natapov
On Thu, Dec 27, 2012 at 03:32:47AM +, Zhang, Yang Z wrote:
 Marcelo Tosatti wrote on 2012-12-21:
  On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
  From: Yang Zhang yang.z.zh...@intel.com
  
  @@ -3925,6 +3942,15 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 vmx_secondary_exec_control(vmx));
 }
  +  if (enable_apicv_reg_vid) {
  +  vmcs_write64(EOI_EXIT_BITMAP0, 0);
  +  vmcs_write64(EOI_EXIT_BITMAP1, 0);
  +  vmcs_write64(EOI_EXIT_BITMAP2, 0);
  +  vmcs_write64(EOI_EXIT_BITMAP3, 0);
  +
  +  vmcs_write16(GUEST_INTR_STATUS, 0);
  +  }
  
  AFAICS SVI should be regenerated on migration. Consider:
  
  1. vintr delivery, sets SVI = vector = RVI.
  2. clears RVI.
  3. migration.
  4. RVI properly set from VIRR on entry.
  5. SVI = 0.
  6. EOI - EOI virtualization with SVI = 0.
  
  Could hook into kvm_apic_post_state_restore() to do that (set highest
  index of bit set in VISR).
Ok. How about making a request (KVM_REQ_UPDATE_SVI) and handling it on
vmentry to set SVI to the highest index of a bit set in VISR?
 
Just do it in kvm_apic_post_state_restore() directly, no need to do
a request.

--
Gleb.


Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-26 Thread Gleb Natapov
On Thu, Dec 27, 2012 at 02:24:04AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-21:
  On Fri, Dec 21, 2012 at 09:39:20AM -0200, Marcelo Tosatti wrote:
  On Fri, Dec 21, 2012 at 09:51:40AM +0200, Gleb Natapov wrote:
  On Thu, Dec 20, 2012 at 08:59:11PM -0200, Marcelo Tosatti wrote:
  On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
  From: Yang Zhang yang.z.zh...@intel.com
  
Virtual interrupt delivery avoids the need for KVM to inject vAPIC
interrupts manually; that is fully taken care of by the hardware. This
needs some special awareness in the existing interrupt injection path:

- For a pending interrupt, instead of direct injection, we may need to
  update architecture-specific indicators before resuming the guest.
- A pending interrupt that is masked by the ISR should also be
  considered in the above update action, since hardware will decide when
  to inject it at the right time. The current has_interrupt and
  get_interrupt only return a valid vector from the injection p.o.v.
  Signed-off-by: Kevin Tian kevin.t...@intel.com
  Signed-off-by: Yang Zhang yang.z.zh...@intel.com
  ---
 arch/ia64/kvm/lapic.h           |    6 ++
 arch/x86/include/asm/kvm_host.h |    6 ++
 arch/x86/include/asm/vmx.h      |   11 +++
 arch/x86/kvm/irq.c              |   56 +-
 arch/x86/kvm/lapic.c            |   65 ++---
 arch/x86/kvm/lapic.h            |   28 ++-
 arch/x86/kvm/svm.c              |   24 ++
 arch/x86/kvm/vmx.c              |  154 ++-
 arch/x86/kvm/x86.c              |   11 ++-
 include/linux/kvm_host.h        |    2 +
 virt/kvm/ioapic.c               |   36 +
 virt/kvm/ioapic.h               |    1 +
 virt/kvm/irq_comm.c             |   20 +
 13 files changed, 379 insertions(+), 41 deletions(-)
  diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
  index c5f92a9..cb59eb4 100644
  --- a/arch/ia64/kvm/lapic.h
  +++ b/arch/ia64/kvm/lapic.h
  @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct
  kvm_lapic_irq *irq);
   #define kvm_apic_present(x) (true)
   #define kvm_lapic_enabled(x) (true)
  +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
  +   struct kvm_lapic_irq *irq)
  +{
  +   /* IA64 has no apicv supporting, do nothing here */
  +}
  +
   #endif
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c431b33..b63a144 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -697,6 +697,11 @@ struct kvm_x86_ops {
 	void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
 	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
 	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
+	int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
+	void (*update_apic_irq)(struct kvm_vcpu *vcpu, int max_irr);
+	void (*update_eoi_exitmap)(struct kvm *kvm, struct kvm_lapic_irq *irq);
+	void (*reset_eoi_exitmap)(struct kvm_vcpu *vcpu);
+	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
 	int (*get_tdp_level)(void);
 	u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
  
EOI exit bitmap is problematic (it's racy). Please do this:

1. Make a synchronous (1) KVM_REQ_EOIBITMAP request on IOAPIC register
   modifications which require EOI exit bitmap updates.
2. On VM-entry, during KVM_REQ_EOIBITMAP processing, each vcpu checks the
   IOAPIC map and adjusts its own EOI exit bitmap VMCS registers.

1) that is, a request that waits until remotely executing VCPUs have
acknowledged it, using make_all_cpus_request (see virt/kvm/kvm_main.c),
similarly to remote TLB flushes.

What is the problem now: there is no control over _when_ a VCPU updates
its EOI exit bitmap VMCS register from the (remotely updated) master EOI
exit bitmap. The VCPU can be processing a KVM_REQ_EOIBITMAP relative to a
preceding IOAPIC register write while the current IOAPIC register write
is updating the EOI exit bitmap. There is no way to fix that without
locking (which can be avoided if the IOAPIC-to-EOI-exit-bitmap
synchronization is vcpu local).
  
The race is benign. We have a similar one for interrupt injection, and
the same race exists on real HW. The two cases that can happen due to
the race are:

1. exitbitmap bit X is changed from 1 to 0:
   No problem. It is harmless to do an exit; on the next entry the
   exitbitmap will be fixed.
2. exitbitmap bit X is changed from 0 to 1:
   If the vcpu is serving X at the time this happens, it was delivered
   as edge, so there is no need to exit. The exitbitmap will be updated
   after the next vmexit, which will happen due to KVM_REQ_EOIBITMAP
   processing.
  
  1. Missed the case where bitmap is being reset (where EOI exit bitmaps
  are zeroed). In this case vcpu enters guest with 

RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-26 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-27:
 On Thu, Dec 27, 2012 at 02:24:04AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-21:
 On Fri, Dec 21, 2012 at 09:39:20AM -0200, Marcelo Tosatti wrote:
 On Fri, Dec 21, 2012 at 09:51:40AM +0200, Gleb Natapov wrote:
 On Thu, Dec 20, 2012 at 08:59:11PM -0200, Marcelo Tosatti wrote:
 On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
Virtual interrupt delivery avoids the need for KVM to inject vAPIC
interrupts manually; that is fully taken care of by the hardware. This
needs some special awareness in the existing interrupt injection path:

- For a pending interrupt, instead of direct injection, we may need to
  update architecture-specific indicators before resuming the guest.
- A pending interrupt that is masked by the ISR should also be
  considered in the above update action, since hardware will decide when
  to inject it at the right time. The current has_interrupt and
  get_interrupt only return a valid vector from the injection p.o.v.
 Signed-off-by: Kevin Tian kevin.t...@intel.com
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 ---
 arch/ia64/kvm/lapic.h           |    6 ++
 arch/x86/include/asm/kvm_host.h |    6 ++
 arch/x86/include/asm/vmx.h      |   11 +++
 arch/x86/kvm/irq.c              |   56 +-
 arch/x86/kvm/lapic.c            |   65 ++---
 arch/x86/kvm/lapic.h            |   28 ++-
 arch/x86/kvm/svm.c              |   24 ++
 arch/x86/kvm/vmx.c              |  154 ++-
 arch/x86/kvm/x86.c              |   11 ++-
 include/linux/kvm_host.h        |    2 +
 virt/kvm/ioapic.c               |   36 +
 virt/kvm/ioapic.h               |    1 +
 virt/kvm/irq_comm.c             |   20 +
 13 files changed, 379 insertions(+), 41 deletions(-)
 diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
 index c5f92a9..cb59eb4 100644
 --- a/arch/ia64/kvm/lapic.h
 +++ b/arch/ia64/kvm/lapic.h
 @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu,
 struct
 kvm_lapic_irq *irq);
  #define kvm_apic_present(x) (true)
  #define kvm_lapic_enabled(x) (true)
 +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
 +   struct kvm_lapic_irq *irq)
 +{
 +   /* IA64 has no apicv supporting, do nothing here */
 +}
 +
  #endif
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c431b33..b63a144 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -697,6 +697,11 @@ struct kvm_x86_ops {
 	void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
 	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
 	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
+	int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
+	void (*update_apic_irq)(struct kvm_vcpu *vcpu, int max_irr);
+	void (*update_eoi_exitmap)(struct kvm *kvm, struct kvm_lapic_irq *irq);
+	void (*reset_eoi_exitmap)(struct kvm_vcpu *vcpu);
+	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
 	int (*get_tdp_level)(void);
 	u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
 
EOI exit bitmap is problematic (it's racy). Please do this:

1. Make a synchronous (1) KVM_REQ_EOIBITMAP request on IOAPIC register
   modifications which require EOI exit bitmap updates.
2. On VM-entry, during KVM_REQ_EOIBITMAP processing, each vcpu checks the
   IOAPIC map and adjusts its own EOI exit bitmap VMCS registers.

1) that is, a request that waits until remotely executing VCPUs have
acknowledged it, using make_all_cpus_request (see virt/kvm/kvm_main.c),
similarly to remote TLB flushes.

What is the problem now: there is no control over _when_ a VCPU updates
its EOI exit bitmap VMCS register from the (remotely updated) master EOI
exit bitmap. The VCPU can be processing a KVM_REQ_EOIBITMAP relative to a
preceding IOAPIC register write while the current IOAPIC register write
is updating the EOI exit bitmap. There is no way to fix that without
locking (which can be avoided if the IOAPIC-to-EOI-exit-bitmap
synchronization is vcpu local).
 
The race is benign. We have a similar one for interrupt injection, and
the same race exists on real HW. The two cases that can happen due to
the race are:

1. exitbitmap bit X is changed from 1 to 0:
   No problem. It is harmless to do an exit; on the next entry the
   exitbitmap will be fixed.
2. exitbitmap bit X is changed from 0 to 1:
   If the vcpu is serving X at the time this happens, it was delivered
   as edge, so there is no need to exit. The exitbitmap will be updated
   after the next vmexit, which will happen due to KVM_REQ_EOIBITMAP
   processing.
 
 1. Missed the case where bitmap is being reset (where EOI exit bitmaps
 are zeroed). In this case vcpu enters guest with incorrect EOI exit
 bitmap.
 
Right, the bitmap reset is problematic indeed 

RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-26 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-27:
 On Thu, Dec 27, 2012 at 03:32:47AM +, Zhang, Yang Z wrote:
 Marcelo Tosatti wrote on 2012-12-21:
 On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
 @@ -3925,6 +3942,15 @@ static int vmx_vcpu_setup(struct vcpu_vmx
 *vmx)
vmx_secondary_exec_control(vmx));
}
 +  if (enable_apicv_reg_vid) {
 +  vmcs_write64(EOI_EXIT_BITMAP0, 0);
 +  vmcs_write64(EOI_EXIT_BITMAP1, 0);
 +  vmcs_write64(EOI_EXIT_BITMAP2, 0);
 +  vmcs_write64(EOI_EXIT_BITMAP3, 0);
 +
 +  vmcs_write16(GUEST_INTR_STATUS, 0);
 +  }
 
 AFAICS SVI should be regenerated on migration. Consider:
 
 1. vintr delivery, sets SVI = vector = RVI.
 2. clears RVI.
 3. migration.
 4. RVI properly set from VIRR on entry.
 5. SVI = 0.
 6. EOI - EOI virtualization with SVI = 0.
 
 Could hook into kvm_apic_post_state_restore() to do that (set highest
 index of bit set in VISR).
Ok. How about making a request (KVM_REQ_UPDATE_SVI) and handling it on
vmentry to set SVI to the highest index of a bit set in VISR?
 
 Just do it in kvm_apic_post_state_restore() directly, no need to do
 a request.
What do you mean by do it directly? Since we are not in the target
vcpu's context, we cannot access the vmcs at this point. We still need a
request or variable to indicate that migration happened.

Best regards,
Yang




Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-26 Thread Gleb Natapov
On Thu, Dec 27, 2012 at 06:34:37AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-27:
  On Thu, Dec 27, 2012 at 03:32:47AM +, Zhang, Yang Z wrote:
  Marcelo Tosatti wrote on 2012-12-21:
  On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
  From: Yang Zhang yang.z.zh...@intel.com
  
  @@ -3925,6 +3942,15 @@ static int vmx_vcpu_setup(struct vcpu_vmx
  *vmx)
   vmx_secondary_exec_control(vmx));
   }
  +if (enable_apicv_reg_vid) {
  +vmcs_write64(EOI_EXIT_BITMAP0, 0);
  +vmcs_write64(EOI_EXIT_BITMAP1, 0);
  +vmcs_write64(EOI_EXIT_BITMAP2, 0);
  +vmcs_write64(EOI_EXIT_BITMAP3, 0);
  +
  +vmcs_write16(GUEST_INTR_STATUS, 0);
  +}
  
  AFAICS SVI should be regenerated on migration. Consider:
  
  1. vintr delivery, sets SVI = vector = RVI.
  2. clears RVI.
  3. migration.
  4. RVI properly set from VIRR on entry.
  5. SVI = 0.
  6. EOI - EOI virtualization with SVI = 0.
  
  Could hook into kvm_apic_post_state_restore() to do that (set highest
  index of bit set in VISR).
Ok. How about making a request (KVM_REQ_UPDATE_SVI) and handling it on
vmentry to set SVI to the highest index of a bit set in VISR?
  
  Just do it in kvm_apic_post_state_restore() directly, no need to do
  a request.
What do you mean by do it directly? Since we are not in the target
vcpu's context, we cannot access the vmcs at this point. We still need a
request or variable to indicate that migration happened.
 
We are in the target vcpu's context.

--
Gleb.


RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-26 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-27:
 On Thu, Dec 27, 2012 at 06:34:37AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2012-12-27:
 On Thu, Dec 27, 2012 at 03:32:47AM +, Zhang, Yang Z wrote:
 Marcelo Tosatti wrote on 2012-12-21:
 On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
 @@ -3925,6 +3942,15 @@ static int vmx_vcpu_setup(struct vcpu_vmx
 *vmx)
  vmx_secondary_exec_control(vmx));
  }
 +if (enable_apicv_reg_vid) {
 +vmcs_write64(EOI_EXIT_BITMAP0, 0);
 +vmcs_write64(EOI_EXIT_BITMAP1, 0);
 +vmcs_write64(EOI_EXIT_BITMAP2, 0);
 +vmcs_write64(EOI_EXIT_BITMAP3, 0);
 +
 +vmcs_write16(GUEST_INTR_STATUS, 0);
 +}
 
 AFAICS SVI should be regenerated on migration. Consider:
 
 1. vintr delivery, sets SVI = vector = RVI.
 2. clears RVI.
 3. migration.
 4. RVI properly set from VIRR on entry.
 5. SVI = 0.
 6. EOI - EOI virtualization with SVI = 0.
 
 Could hook into kvm_apic_post_state_restore() to do that (set highest
 index of bit set in VISR).
Ok. How about making a request (KVM_REQ_UPDATE_SVI) and handling it on
vmentry to set SVI to the highest index of a bit set in VISR?
 
 Just do it in kvm_apic_post_state_restore() directly, no need to do
 a request.
What do you mean by do it directly? Since we are not in the target
vcpu's context, we cannot access the vmcs at this point. We still need a
request or variable to indicate that migration happened.
 
We are in the target vcpu's context.
Ok, I misread the code.

Best regards,
Yang




RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-24 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2012-12-21:
 On Thu, Dec 20, 2012 at 03:12:32PM +0200, Gleb Natapov wrote:
 On Thu, Dec 20, 2012 at 10:53:16AM -0200, Marcelo Tosatti wrote:
 On Thu, Dec 20, 2012 at 08:42:06AM +0200, Gleb Natapov wrote:
 On Wed, Dec 19, 2012 at 10:59:36PM -0200, Marcelo Tosatti wrote:
 On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
Virtual interrupt delivery avoids the need for KVM to inject vAPIC
interrupts manually; that is fully taken care of by the hardware. This
needs some special awareness in the existing interrupt injection path:

- For a pending interrupt, instead of direct injection, we may need to
  update architecture-specific indicators before resuming the guest.
- A pending interrupt that is masked by the ISR should also be
  considered in the above update action, since hardware will decide when
  to inject it at the right time. The current has_interrupt and
  get_interrupt only return a valid vector from the injection p.o.v.
 Signed-off-by: Kevin Tian kevin.t...@intel.com
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 
 
 Resuming previous discussion:
 
How about recalculating irr_pending according to the VIRR on each
vmexit?

No need, really. Since HW can only clear VIRR, the only situation that
may happen is that irr_pending will be true but VIRR is empty, and
apic_find_highest_irr() will return the correct result in this case.
 
 Self-IPI does cause VIRR to be set, see 29.1.5 Self-IPI
 Virtualization.
 
True. But as I said later in that discussion, once irr_pending is set
to true it never becomes false, so the optimization is effectively
disabled. We can set it to true during apic initialization to make it
explicit.
 
It's just confusing to have a variable which has different meanings in
different configurations. I would rather have it explicit that it's not
used rather than check every time I read the code.

if (apic_vid() == 0 && !apic->irr_pending)
	return -1;

I'd prefer to avoid this additional if(), especially as its sole purpose
is documentation. We can add a comment instead. Note that irr_pending
is just a hint anyway. It can be true when no interrupt is pending in
the irr. We can even rename it to irr_pending_hint or something.
 
 Works for me (documentation).
 
Not sure if you can skip it; it's probably necessary to calculate it
before HW does so (say, migration etc).

kvm_apic_has_interrupt() is not called during migration, and
kvm_apic_post_state_restore() calls apic_update_ppr() explicitly. I am
not sure it is needed though, since the migrated value should already be
correct anyway.
 
 Ok, best force isr_count to 1 if apic vintr enabled (and add a comment,
 please).

Sorry for the late reply. I will address those problems according to
your comments.

Best regards,
Yang




Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-21 Thread Marcelo Tosatti
On Fri, Dec 21, 2012 at 09:51:40AM +0200, Gleb Natapov wrote:
 On Thu, Dec 20, 2012 at 08:59:11PM -0200, Marcelo Tosatti wrote:
  On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
   From: Yang Zhang yang.z.zh...@intel.com
   
Virtual interrupt delivery avoids the need for KVM to inject vAPIC
interrupts manually; that is fully taken care of by the hardware. This
needs some special awareness in the existing interrupt injection path:

- For a pending interrupt, instead of direct injection, we may need to
  update architecture-specific indicators before resuming the guest.

- A pending interrupt that is masked by the ISR should also be
  considered in the above update action, since hardware will decide when
  to inject it at the right time. The current has_interrupt and
  get_interrupt only return a valid vector from the injection p.o.v.
   
   Signed-off-by: Kevin Tian kevin.t...@intel.com
   Signed-off-by: Yang Zhang yang.z.zh...@intel.com
   ---
arch/ia64/kvm/lapic.h   |6 ++
arch/x86/include/asm/kvm_host.h |6 ++
arch/x86/include/asm/vmx.h  |   11 +++
arch/x86/kvm/irq.c  |   56 +-
arch/x86/kvm/lapic.c|   65 ++---
arch/x86/kvm/lapic.h|   28 ++-
arch/x86/kvm/svm.c  |   24 ++
arch/x86/kvm/vmx.c  |  154 
   ++-
arch/x86/kvm/x86.c  |   11 ++-
include/linux/kvm_host.h|2 +
virt/kvm/ioapic.c   |   36 +
virt/kvm/ioapic.h   |1 +
virt/kvm/irq_comm.c |   20 +
13 files changed, 379 insertions(+), 41 deletions(-)
   
   diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
   index c5f92a9..cb59eb4 100644
   --- a/arch/ia64/kvm/lapic.h
   +++ b/arch/ia64/kvm/lapic.h
   @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct 
   kvm_lapic_irq *irq);
#define kvm_apic_present(x) (true)
#define kvm_lapic_enabled(x) (true)

   +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
   + struct kvm_lapic_irq *irq)
   +{
   + /* IA64 has no apicv supporting, do nothing here */
   +}
   +
#endif
   diff --git a/arch/x86/include/asm/kvm_host.h 
   b/arch/x86/include/asm/kvm_host.h
   index c431b33..b63a144 100644
   --- a/arch/x86/include/asm/kvm_host.h
   +++ b/arch/x86/include/asm/kvm_host.h
   @@ -697,6 +697,11 @@ struct kvm_x86_ops {
 void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
 void (*enable_irq_window)(struct kvm_vcpu *vcpu);
 void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
   + int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
   + void (*update_apic_irq)(struct kvm_vcpu *vcpu, int max_irr);
   + void (*update_eoi_exitmap)(struct kvm *kvm, struct kvm_lapic_irq *irq);
   + void (*reset_eoi_exitmap)(struct kvm_vcpu *vcpu);
   + void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
 int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
 int (*get_tdp_level)(void);
 u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
  
  EOI exit bitmap is problematic (it's racy). Please do this:
  
  1. Make a synchronous (1) KVM_REQ_EOIBITMAP request on IOAPIC register
  modifications which require EOI exit bitmap updates.
  2. On VM-entry, during KVM_REQ_EOIBITMAP processing, each VCPU checks
  the IOAPIC map and adjusts its own EOI exit bitmap VMCS registers.
  
  1) that is, a request that waits until remotely executing VCPUs have
  acknowledged it, using make_all_cpus_request (see virt/kvm/kvm_main.c),
  similarly to remote TLB flushes.
  
  What is the problem now: there is no control over _when_ a VCPU updates
  its EOI exit bitmap VMCS register from the (remotely updated) master EOI
  exit bitmap. The VCPU can be processing a KVM_REQ_EOIBITMAP relative to
  a preceding IOAPIC register write while the current IOAPIC register
  write is updating the EOI exit bitmap. There is no way to fix that
  without locking (which can be avoided if the IOAPIC->EOI exit bitmap
  synchronization is vcpu-local).
  
 The race is benign. We have a similar one for interrupt injection, and
 the same race exists on real HW. The two cases that can happen due to
 the race are:
 
 1. exitbitmap bit X is changed from 1 to 0
   No problem. It is harmless to do an exit; on the next entry the
   exitbitmap will be fixed.
 2. exitbitmap bit X is changed from 0 to 1
   If the vcpu is serving X at the time this happens, it was delivered
   as edge-triggered, so there is no need to exit. The exitbitmap will
   be updated after the next vmexit, which will happen due to
   KVM_REQ_EOIBITMAP processing.

1. This misses the case where the bitmap is being reset (where EOI exit
bitmaps are zeroed). In this case the vcpu enters the guest with an
incorrect EOI exit bitmap.

2. It also misses the case where the current code allows a vcpu to enter
the guest with its EOI exit bitmap unsynchronized relative to the IOAPIC
registers (see one KVM_REQ 

Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-21 Thread Gleb Natapov
On Fri, Dec 21, 2012 at 09:39:20AM -0200, Marcelo Tosatti wrote:
 On Fri, Dec 21, 2012 at 09:51:40AM +0200, Gleb Natapov wrote:
  On Thu, Dec 20, 2012 at 08:59:11PM -0200, Marcelo Tosatti wrote:
   On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
From: Yang Zhang yang.z.zh...@intel.com

Virtual interrupt delivery removes the need for KVM to inject vAPIC
interrupts manually; injection is fully taken care of by the hardware.
This requires some special awareness in the existing interrupt
injection path:

- For a pending interrupt, instead of injecting it directly, we may
  need to update architecture-specific indicators before resuming the
  guest.

- A pending interrupt that is masked by the ISR must also be considered
  in the update above, since the hardware decides when to inject it at
  the right time. The current has_interrupt and get_interrupt only
  return a valid vector from the injection point of view.

Signed-off-by: Kevin Tian kevin.t...@intel.com
Signed-off-by: Yang Zhang yang.z.zh...@intel.com
---
 arch/ia64/kvm/lapic.h   |6 ++
 arch/x86/include/asm/kvm_host.h |6 ++
 arch/x86/include/asm/vmx.h  |   11 +++
 arch/x86/kvm/irq.c  |   56 +-
 arch/x86/kvm/lapic.c|   65 ++---
 arch/x86/kvm/lapic.h|   28 ++-
 arch/x86/kvm/svm.c  |   24 ++
 arch/x86/kvm/vmx.c  |  154 
++-
 arch/x86/kvm/x86.c  |   11 ++-
 include/linux/kvm_host.h|2 +
 virt/kvm/ioapic.c   |   36 +
 virt/kvm/ioapic.h   |1 +
 virt/kvm/irq_comm.c |   20 +
 13 files changed, 379 insertions(+), 41 deletions(-)

diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
index c5f92a9..cb59eb4 100644
--- a/arch/ia64/kvm/lapic.h
+++ b/arch/ia64/kvm/lapic.h
@@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct 
kvm_lapic_irq *irq);
 #define kvm_apic_present(x) (true)
 #define kvm_lapic_enabled(x) (true)
 
+static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
+   struct kvm_lapic_irq *irq)
+{
 +   /* IA64 has no APICv support; do nothing here */
+}
+
 #endif
diff --git a/arch/x86/include/asm/kvm_host.h 
b/arch/x86/include/asm/kvm_host.h
index c431b33..b63a144 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -697,6 +697,11 @@ struct kvm_x86_ops {
void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
void (*enable_irq_window)(struct kvm_vcpu *vcpu);
void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, 
int irr);
+   int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
+   void (*update_apic_irq)(struct kvm_vcpu *vcpu, int max_irr);
+   void (*update_eoi_exitmap)(struct kvm *kvm, struct 
kvm_lapic_irq *irq);
+   void (*reset_eoi_exitmap)(struct kvm_vcpu *vcpu);
+   void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
int (*get_tdp_level)(void);
u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool 
is_mmio);
   
   EOI exit bitmap is problematic (it's racy). Please do this:
   
   1. Make a synchronous (1) KVM_REQ_EOIBITMAP request on IOAPIC register
   modifications which require EOI exit bitmap updates.
   2. On VM-entry, during KVM_REQ_EOIBITMAP processing, each VCPU checks
   the IOAPIC map and adjusts its own EOI exit bitmap VMCS registers.
   
   1) that is, a request that waits until remotely executing VCPUs have
   acknowledged it, using make_all_cpus_request (see virt/kvm/kvm_main.c),
   similarly to remote TLB flushes.
   
   What is the problem now: there is no control over _when_ a VCPU updates
   its EOI exit bitmap VMCS register from the (remotely updated) master EOI
   exit bitmap. The VCPU can be processing a KVM_REQ_EOIBITMAP relative to
   a preceding IOAPIC register write while the current IOAPIC register
   write is updating the EOI exit bitmap. There is no way to fix that
   without locking (which can be avoided if the IOAPIC->EOI exit bitmap
   synchronization is vcpu-local).
   
  The race is benign. We have a similar one for interrupt injection, and
  the same race exists on real HW. The two cases that can happen due to
  the race are:
  
  1. exitbitmap bit X is changed from 1 to 0
    No problem. It is harmless to do an exit; on the next entry the
    exitbitmap will be fixed.
  2. exitbitmap bit X is changed from 0 to 1
    If the vcpu is serving X at the time this happens, it was delivered
    as edge-triggered, so there is no need to exit. The exitbitmap will
    be updated after the next vmexit, which will happen due to
    KVM_REQ_EOIBITMAP processing.
 
 1. Missed the case where bitmap is 

Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-20 Thread Marcelo Tosatti
On Thu, Dec 20, 2012 at 08:42:06AM +0200, Gleb Natapov wrote:
 On Wed, Dec 19, 2012 at 10:59:36PM -0200, Marcelo Tosatti wrote:
  On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
   From: Yang Zhang yang.z.zh...@intel.com
   
   Virtual interrupt delivery removes the need for KVM to inject vAPIC
   interrupts manually; injection is fully taken care of by the hardware.
   This requires some special awareness in the existing interrupt
   injection path:
   
   - For a pending interrupt, instead of injecting it directly, we may
     need to update architecture-specific indicators before resuming the
     guest.
   
   - A pending interrupt that is masked by the ISR must also be
     considered in the update above, since the hardware decides when to
     inject it at the right time. The current has_interrupt and
     get_interrupt only return a valid vector from the injection point
     of view.
   
   Signed-off-by: Kevin Tian kevin.t...@intel.com
   Signed-off-by: Yang Zhang yang.z.zh...@intel.com
  
  
  Resuming previous discussion:
  
How about recalculating irr_pending according to the VIRR on each
vmexit?

   No need, really. Since HW can only clear VIRR, the only situation
   that may happen is that irr_pending will be true but VIRR is empty,
   and apic_find_highest_irr() will return the correct result in this
   case.
  
  Self-IPI does cause VIRR to be set, see 29.1.5 Self-IPI 
  Virtualization.
 
 True. But as I said later in that discussion, once irr_pending is set
 to true it never becomes false, so the optimization is effectively
 disabled. We can set it to true during apic initialization to make
 that explicit.

It's just confusing to have a variable which has different meanings in
different configurations. I would rather have it explicit that it's not
used than check it every time I read the code.

if (apic_vid() == 0 && !apic->irr_pending)
	return -1;

Also, an example of a problem with the ISR caching optimization (which
was written with ISR/IRR management entirely in software):

The isr_count variable is never incremented (because HW sets ISR):
kvm_cpu_get_interrupt does not call into kvm_get_apic_interrupt if
virtual interrupt delivery is enabled.
  
  Therefore apic_find_highest_isr can return -1 even though VISR
  is not zero.
  
  In that case (VISR non-zero but apic_find_highest_isr() == -1), 
  apic_update_ppr does not correctly set
  
  PPR = max(ISRV, TPR)
  
  Which can result in kvm_arch_vcpu_runnable returning 1 when it should not.
  
  Please disable usage of isr_count if virtual interrupt delivery enabled.
  Given that self-IPI writes to VIRR, also please disable usage of
  highest_isr_cache and irr_pending.
  
 Good catch about isr_count. We can set it to 1 during apic
 initialization too, or skip the call to apic_update_ppr() in
 kvm_apic_has_interrupt() if vid is enabled.

Not sure you can skip it; it's probably necessary to calculate it
before HW does so (say, migration etc.).
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-20 Thread Marcelo Tosatti
On Thu, Dec 20, 2012 at 08:51:32AM +0200, Gleb Natapov wrote:
 On Wed, Dec 19, 2012 at 11:26:30PM -0200, Marcelo Tosatti wrote:
  On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
   From: Yang Zhang yang.z.zh...@intel.com
   
   Virtual interrupt delivery removes the need for KVM to inject vAPIC
   interrupts manually; injection is fully taken care of by the hardware.
   This requires some special awareness in the existing interrupt
   injection path:
   
   - For a pending interrupt, instead of injecting it directly, we may
     need to update architecture-specific indicators before resuming the
     guest.
   
   - A pending interrupt that is masked by the ISR must also be
     considered in the update above, since the hardware decides when to
     inject it at the right time. The current has_interrupt and
     get_interrupt only return a valid vector from the injection point
     of view.
   
   Signed-off-by: Kevin Tian kevin.t...@intel.com
   Signed-off-by: Yang Zhang yang.z.zh...@intel.com
  
   + if (enable_apicv_reg_vid)
   +         kvm_x86_ops->update_cr8_intercept = NULL;
   + else
   +         kvm_x86_ops->update_apic_irq = NULL;
  
Loading the module with enable_apicv=0, then enable_apicv=1, fails?
 It is not changeable after the module is loaded. This is true for all
 kvm-intel.ko parameters.

Are you sure? Works just fine here.



Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-20 Thread Marcelo Tosatti
On Thu, Dec 20, 2012 at 11:01:42AM -0200, Marcelo Tosatti wrote:
 On Thu, Dec 20, 2012 at 08:51:32AM +0200, Gleb Natapov wrote:
  On Wed, Dec 19, 2012 at 11:26:30PM -0200, Marcelo Tosatti wrote:
   On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
From: Yang Zhang yang.z.zh...@intel.com

Virtual interrupt delivery removes the need for KVM to inject vAPIC
interrupts manually; injection is fully taken care of by the hardware.
This requires some special awareness in the existing interrupt
injection path:

- For a pending interrupt, instead of injecting it directly, we may
  need to update architecture-specific indicators before resuming the
  guest.

- A pending interrupt that is masked by the ISR must also be considered
  in the update above, since the hardware decides when to inject it at
  the right time. The current has_interrupt and get_interrupt only
  return a valid vector from the injection point of view.

Signed-off-by: Kevin Tian kevin.t...@intel.com
Signed-off-by: Yang Zhang yang.z.zh...@intel.com
   
+   if (enable_apicv_reg_vid)
+           kvm_x86_ops->update_cr8_intercept = NULL;
+   else
+           kvm_x86_ops->update_apic_irq = NULL;
   
   Loading the module with enable_apicv=0, then enable_apicv=1, fails?
  It is not changeable after the module is loaded. This is true for all
  kvm-intel.ko parameters.
 
 Are you sure? Works just fine here.

Its OK, nevermind.



Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-20 Thread Gleb Natapov
On Thu, Dec 20, 2012 at 10:53:16AM -0200, Marcelo Tosatti wrote:
 On Thu, Dec 20, 2012 at 08:42:06AM +0200, Gleb Natapov wrote:
  On Wed, Dec 19, 2012 at 10:59:36PM -0200, Marcelo Tosatti wrote:
   On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
From: Yang Zhang yang.z.zh...@intel.com

Virtual interrupt delivery removes the need for KVM to inject vAPIC
interrupts manually; injection is fully taken care of by the hardware.
This requires some special awareness in the existing interrupt
injection path:

- For a pending interrupt, instead of injecting it directly, we may
  need to update architecture-specific indicators before resuming the
  guest.

- A pending interrupt that is masked by the ISR must also be considered
  in the update above, since the hardware decides when to inject it at
  the right time. The current has_interrupt and get_interrupt only
  return a valid vector from the injection point of view.

Signed-off-by: Kevin Tian kevin.t...@intel.com
Signed-off-by: Yang Zhang yang.z.zh...@intel.com
   
   
   Resuming previous discussion:
   
  How about recalculating irr_pending according to the VIRR on each
  vmexit?
 
No need, really. Since HW can only clear VIRR, the only situation that
may happen is that irr_pending will be true but VIRR is empty, and
apic_find_highest_irr() will return the correct result in this case.
   
   Self-IPI does cause VIRR to be set, see 29.1.5 Self-IPI 
   Virtualization.
  
  True. But as I said later in that discussion, once irr_pending is set
  to true it never becomes false, so the optimization is effectively
  disabled. We can set it to true during apic initialization to make
  that explicit.
 
 It's just confusing to have a variable which has different meanings in
 different configurations. I would rather have it explicit that it's
 not used than check it every time I read the code.
 
 if (apic_vid() == 0 && !apic->irr_pending)
	return -1;
 
I'd prefer to avoid this additional if(), especially as its sole purpose
is documentation. We can add a comment instead. Note that irr_pending
is just a hint anyway: it can be true when no interrupt is pending in
the irr. We can even rename it to irr_pending_hint or something.

   Also, an example of a problem with the ISR caching optimization
   (which was written with ISR/IRR management entirely in software):
   
   The isr_count variable is never incremented (because HW sets ISR):
   kvm_cpu_get_interrupt does not call into kvm_get_apic_interrupt if
   virtual interrupt delivery is enabled.
   
   Therefore apic_find_highest_isr can return -1 even though VISR
   is not zero.
   
   In that case (VISR non-zero but apic_find_highest_isr() == -1), 
   apic_update_ppr does not correctly set
   
   PPR = max(ISRV, TPR)
   
   Which can result in kvm_arch_vcpu_runnable returning 1 when it should not.
   
   Please disable usage of isr_count if virtual interrupt delivery enabled.
   Given that self-IPI writes to VIRR, also please disable usage of
   highest_isr_cache and irr_pending.
   
  Good catch about isr_count. We can set it to 1 during apic
  initialization too, or skip the call to apic_update_ppr() in
  kvm_apic_has_interrupt() if vid is enabled.
 
 Not sure you can skip it; it's probably necessary to calculate it
 before HW does so (say, migration etc.).
kvm_apic_has_interrupt() is not called during migration, and
kvm_apic_post_state_restore() calls apic_update_ppr() explicitly. I am
not sure it is needed, though, since the migrated value should already
be correct anyway.

--
Gleb.


Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-20 Thread Marcelo Tosatti
On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
 Virtual interrupt delivery removes the need for KVM to inject vAPIC
 interrupts manually; injection is fully taken care of by the hardware.
 This requires some special awareness in the existing interrupt
 injection path:
 
 - For a pending interrupt, instead of injecting it directly, we may
   need to update architecture-specific indicators before resuming the
   guest.
 
 - A pending interrupt that is masked by the ISR must also be
   considered in the update above, since the hardware decides when to
   inject it at the right time. The current has_interrupt and
   get_interrupt only return a valid vector from the injection point
   of view.
 
 Signed-off-by: Kevin Tian kevin.t...@intel.com
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 ---
  arch/ia64/kvm/lapic.h   |6 ++
  arch/x86/include/asm/kvm_host.h |6 ++
  arch/x86/include/asm/vmx.h  |   11 +++
  arch/x86/kvm/irq.c  |   56 +-
  arch/x86/kvm/lapic.c|   65 ++---
  arch/x86/kvm/lapic.h|   28 ++-
  arch/x86/kvm/svm.c  |   24 ++
  arch/x86/kvm/vmx.c  |  154 
 ++-
  arch/x86/kvm/x86.c  |   11 ++-
  include/linux/kvm_host.h|2 +
  virt/kvm/ioapic.c   |   36 +
  virt/kvm/ioapic.h   |1 +
  virt/kvm/irq_comm.c |   20 +
  13 files changed, 379 insertions(+), 41 deletions(-)
 
 diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
 index c5f92a9..cb59eb4 100644
 --- a/arch/ia64/kvm/lapic.h
 +++ b/arch/ia64/kvm/lapic.h
 @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct 
 kvm_lapic_irq *irq);
  #define kvm_apic_present(x) (true)
  #define kvm_lapic_enabled(x) (true)
  
 +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
 + struct kvm_lapic_irq *irq)
 +{
 + /* IA64 has no apicv supporting, do nothing here */
 +}
 +
  #endif
 diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
 index c431b33..b63a144 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -697,6 +697,11 @@ struct kvm_x86_ops {
   void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
   void (*enable_irq_window)(struct kvm_vcpu *vcpu);
   void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
 + int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
 + void (*update_apic_irq)(struct kvm_vcpu *vcpu, int max_irr);
 + void (*update_eoi_exitmap)(struct kvm *kvm, struct kvm_lapic_irq *irq);
 + void (*reset_eoi_exitmap)(struct kvm_vcpu *vcpu);
 + void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
   int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
   int (*get_tdp_level)(void);
   u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
 @@ -991,6 +996,7 @@ int kvm_age_hva(struct kvm *kvm, unsigned long hva);
  int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
  void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
  int cpuid_maxphyaddr(struct kvm_vcpu *vcpu);
 +int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v);
  int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
  int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu);
  int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
 diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
 index 44c3f7e..d1ab331 100644
 --- a/arch/x86/include/asm/vmx.h
 +++ b/arch/x86/include/asm/vmx.h
 @@ -62,6 +62,7 @@
  #define EXIT_REASON_MCE_DURING_VMENTRY  41
  #define EXIT_REASON_TPR_BELOW_THRESHOLD 43
  #define EXIT_REASON_APIC_ACCESS 44
 +#define EXIT_REASON_EOI_INDUCED 45
  #define EXIT_REASON_EPT_VIOLATION   48
  #define EXIT_REASON_EPT_MISCONFIG   49
  #define EXIT_REASON_WBINVD  54
 @@ -143,6 +144,7 @@
  #define SECONDARY_EXEC_WBINVD_EXITING		0x00000040
  #define SECONDARY_EXEC_UNRESTRICTED_GUEST	0x00000080
  #define SECONDARY_EXEC_APIC_REGISTER_VIRT	0x00000100
 +#define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY	0x00000200
  #define SECONDARY_EXEC_PAUSE_LOOP_EXITING	0x00000400
  #define SECONDARY_EXEC_ENABLE_INVPCID		0x00001000
  
 @@ -180,6 +182,7 @@ enum vmcs_field {
   GUEST_GS_SELECTOR   = 0x080a,
   GUEST_LDTR_SELECTOR = 0x080c,
   GUEST_TR_SELECTOR   = 0x080e,
 + GUEST_INTR_STATUS   = 0x0810,
   HOST_ES_SELECTOR= 0x0c00,
   HOST_CS_SELECTOR= 0x0c02,
   HOST_SS_SELECTOR= 0x0c04,
 @@ -207,6 +210,14 @@ enum vmcs_field {
   APIC_ACCESS_ADDR_HIGH   = 0x2015,
   EPT_POINTER = 0x201a,
   EPT_POINTER_HIGH= 0x201b,
 + EOI_EXIT_BITMAP0			= 0x201c,
 + 

Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-20 Thread Marcelo Tosatti
On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
 Virtual interrupt delivery removes the need for KVM to inject vAPIC
 interrupts manually; injection is fully taken care of by the hardware.
 This requires some special awareness in the existing interrupt
 injection path:
 
 - For a pending interrupt, instead of injecting it directly, we may
   need to update architecture-specific indicators before resuming the
   guest.
 
 - A pending interrupt that is masked by the ISR must also be
   considered in the update above, since the hardware decides when to
   inject it at the right time. The current has_interrupt and
   get_interrupt only return a valid vector from the injection point
   of view.
 
 Signed-off-by: Kevin Tian kevin.t...@intel.com
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 ---
  arch/ia64/kvm/lapic.h   |6 ++
  arch/x86/include/asm/kvm_host.h |6 ++
  arch/x86/include/asm/vmx.h  |   11 +++
  arch/x86/kvm/irq.c  |   56 +-
  arch/x86/kvm/lapic.c|   65 ++---
  arch/x86/kvm/lapic.h|   28 ++-
  arch/x86/kvm/svm.c  |   24 ++
  arch/x86/kvm/vmx.c  |  154 
 ++-
  arch/x86/kvm/x86.c  |   11 ++-
  include/linux/kvm_host.h|2 +
  virt/kvm/ioapic.c   |   36 +
  virt/kvm/ioapic.h   |1 +
  virt/kvm/irq_comm.c |   20 +
  13 files changed, 379 insertions(+), 41 deletions(-)
 
 diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
 index c5f92a9..cb59eb4 100644
 --- a/arch/ia64/kvm/lapic.h
 +++ b/arch/ia64/kvm/lapic.h
 @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct 
 kvm_lapic_irq *irq);
  #define kvm_apic_present(x) (true)
  #define kvm_lapic_enabled(x) (true)
  
 +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
 + struct kvm_lapic_irq *irq)
 +{
 + /* IA64 has no APICv support; do nothing here */
 +}
 +
  #endif
 diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
 index c431b33..b63a144 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -697,6 +697,11 @@ struct kvm_x86_ops {
   void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
   void (*enable_irq_window)(struct kvm_vcpu *vcpu);
   void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
 + int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
 + void (*update_apic_irq)(struct kvm_vcpu *vcpu, int max_irr);
 + void (*update_eoi_exitmap)(struct kvm *kvm, struct kvm_lapic_irq *irq);
 + void (*reset_eoi_exitmap)(struct kvm_vcpu *vcpu);
 + void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
   int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
   int (*get_tdp_level)(void);
   u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);

EOI exit bitmap is problematic (it's racy). Please do this:

1. Make a synchronous (1) KVM_REQ_EOIBITMAP request on IOAPIC register
modifications which require EOI exit bitmap updates.
2. On VM-entry, during KVM_REQ_EOIBITMAP processing, each VCPU checks
the IOAPIC map and adjusts its own EOI exit bitmap VMCS registers.

1) that is, a request that waits until remotely executing VCPUs have
acknowledged it, using make_all_cpus_request (see virt/kvm/kvm_main.c),
similarly to remote TLB flushes.

What is the problem now: there is no control over _when_ a VCPU updates
its EOI exit bitmap VMCS register from the (remotely updated) master EOI
exit bitmap. The VCPU can be processing a KVM_REQ_EOIBITMAP relative to
a preceding IOAPIC register write while the current IOAPIC register
write is updating the EOI exit bitmap. There is no way to fix that
without locking (which can be avoided if the IOAPIC->EOI exit bitmap
synchronization is vcpu-local).

Questions below.

 @@ -236,12 +225,14 @@ static inline void kvm_apic_set_id(struct kvm_lapic 
 *apic, u8 id)
  {
  apic_set_reg(apic, APIC_ID, id << 24);
  recalculate_apic_map(apic->vcpu->kvm);
+ ioapic_update_eoi_exitmap(apic->vcpu->kvm);
  }

Why is it necessary to update EOI exit bitmap here?

  static inline void kvm_apic_set_ldr(struct kvm_lapic *apic, u32 id)
  {
   apic_set_reg(apic, APIC_LDR, id);
  recalculate_apic_map(apic->vcpu->kvm);
+ ioapic_update_eoi_exitmap(apic->vcpu->kvm);
  }

And here?

 +/*
 + * this interface assumes a trap-like exit, which has already finished
 + * desired side effect including vISR and vPPR update.
 + */
 +void kvm_apic_set_eoi_accelerated(struct kvm_vcpu *vcpu, int vector)
 +{
 + struct kvm_lapic *apic = vcpu->arch.apic;
 +
 + trace_kvm_eoi(apic, vector);
 +
 + kvm_ioapic_send_eoi(apic, vector);
 + kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
 +}

Why is it necessary to set KVM_REQ_EVENT here?

  apic_set_reg(apic, APIC_DFR, val | 0x0FFFFFFF);
   

Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-20 Thread Marcelo Tosatti
On Thu, Dec 20, 2012 at 03:12:32PM +0200, Gleb Natapov wrote:
 On Thu, Dec 20, 2012 at 10:53:16AM -0200, Marcelo Tosatti wrote:
  On Thu, Dec 20, 2012 at 08:42:06AM +0200, Gleb Natapov wrote:
   On Wed, Dec 19, 2012 at 10:59:36PM -0200, Marcelo Tosatti wrote:
On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
 Virtual interrupt delivery removes the need for KVM to inject vAPIC
 interrupts manually; injection is fully taken care of by the hardware.
 This requires some special awareness in the existing interrupt
 injection path:
 
 - For a pending interrupt, instead of injecting it directly, we may
   need to update architecture-specific indicators before resuming the
   guest.
 
 - A pending interrupt that is masked by the ISR must also be
   considered in the update above, since the hardware decides when to
   inject it at the right time. The current has_interrupt and
   get_interrupt only return a valid vector from the injection point
   of view.
 
 Signed-off-by: Kevin Tian kevin.t...@intel.com
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com


Resuming previous discussion:

   How about recalculating irr_pending according to the VIRR on each
   vmexit?
  
  No need, really. Since HW can only clear VIRR, the only situation
  that may happen is that irr_pending will be true but VIRR is empty,
  and apic_find_highest_irr() will return the correct result in this
  case.

Self-IPI does cause VIRR to be set, see 29.1.5 Self-IPI 
Virtualization.
   
    True. But as I said later in that discussion, once irr_pending is
    set to true it never becomes false, so the optimization is
    effectively disabled. We can set it to true during apic
    initialization to make that explicit.
  
   It's just confusing to have a variable which has different meanings
   in different configurations. I would rather have it explicit that
   it's not used than check it every time I read the code.
  
   if (apic_vid() == 0 && !apic->irr_pending)
   	return -1;
  
 I'd prefer to avoid this additional if(), especially as its sole
 purpose is documentation. We can add a comment instead. Note that
 irr_pending is just a hint anyway: it can be true when no interrupt is
 pending in the irr. We can even rename it to irr_pending_hint or
 something.

Works for me (documentation).

  Not sure you can skip it; it's probably necessary to calculate it
  before HW does so (say, migration etc.).
 kvm_apic_has_interrupt() is not called during migration, and
 kvm_apic_post_state_restore() calls apic_update_ppr() explicitly.
 I am not sure it is needed, though, since the migrated value should
 already be correct anyway.

OK, best to force isr_count to 1 if APIC virtual interrupt delivery is
enabled (and add a comment, please).



Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-20 Thread Gleb Natapov
On Thu, Dec 20, 2012 at 08:59:11PM -0200, Marcelo Tosatti wrote:
 On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
  From: Yang Zhang yang.z.zh...@intel.com
  
  Virtual interrupt delivery removes the need for KVM to inject vAPIC
  interrupts manually; injection is fully taken care of by the hardware.
  This requires some special awareness in the existing interrupt
  injection path:
  
  - For a pending interrupt, instead of injecting it directly, we may
    need to update architecture-specific indicators before resuming the
    guest.
  
  - A pending interrupt that is masked by the ISR must also be
    considered in the update above, since the hardware decides when to
    inject it at the right time. The current has_interrupt and
    get_interrupt only return a valid vector from the injection point
    of view.
  
  Signed-off-by: Kevin Tian kevin.t...@intel.com
  Signed-off-by: Yang Zhang yang.z.zh...@intel.com
  ---
   arch/ia64/kvm/lapic.h   |6 ++
   arch/x86/include/asm/kvm_host.h |6 ++
   arch/x86/include/asm/vmx.h  |   11 +++
   arch/x86/kvm/irq.c  |   56 +-
   arch/x86/kvm/lapic.c|   65 ++---
   arch/x86/kvm/lapic.h|   28 ++-
   arch/x86/kvm/svm.c  |   24 ++
   arch/x86/kvm/vmx.c  |  154 
  ++-
   arch/x86/kvm/x86.c  |   11 ++-
   include/linux/kvm_host.h|2 +
   virt/kvm/ioapic.c   |   36 +
   virt/kvm/ioapic.h   |1 +
   virt/kvm/irq_comm.c |   20 +
   13 files changed, 379 insertions(+), 41 deletions(-)
  
  diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
  index c5f92a9..cb59eb4 100644
  --- a/arch/ia64/kvm/lapic.h
  +++ b/arch/ia64/kvm/lapic.h
  @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct 
  kvm_lapic_irq *irq);
   #define kvm_apic_present(x) (true)
   #define kvm_lapic_enabled(x) (true)
   
  +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
  +   struct kvm_lapic_irq *irq)
  +{
  +   /* IA64 has no APICv support; do nothing here */
  +}
  +
   #endif
  diff --git a/arch/x86/include/asm/kvm_host.h 
  b/arch/x86/include/asm/kvm_host.h
  index c431b33..b63a144 100644
  --- a/arch/x86/include/asm/kvm_host.h
  +++ b/arch/x86/include/asm/kvm_host.h
  @@ -697,6 +697,11 @@ struct kvm_x86_ops {
  void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
  void (*enable_irq_window)(struct kvm_vcpu *vcpu);
  void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
  +   int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
  +   void (*update_apic_irq)(struct kvm_vcpu *vcpu, int max_irr);
  +   void (*update_eoi_exitmap)(struct kvm *kvm, struct kvm_lapic_irq *irq);
  +   void (*reset_eoi_exitmap)(struct kvm_vcpu *vcpu);
  +   void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
  int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
  int (*get_tdp_level)(void);
  u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
 
 EOI exit bitmap is problematic (it's racy). Please do this:
 
 1. Make a synchronous (1) KVM_REQ_EOIBITMAP request on IOAPIC register
 modifications which require EOI exit bitmap updates.
 2. On VM-entry, during KVM_REQ_EOIBITMAP processing, each vcpu checks the IOAPIC map
 and adjusts its own EOI exit bitmap VMCS registers.
 
 1) that is, one that waits until remotely executing VCPUs have acknowledged the request,
 using make_all_cpus_request (see virt/kvm/kvm_main.c), similarly to
 remote TLB flushes.
 
 What is the problem now: there is no control over _when_ a VCPU updates
 its EOI exit bitmap VMCS register from the (remotely updated) master EOI
 exit bitmap. The VCPU can be processing a KVM_REQ_EOIBITMAP relative to
 a preceding IOAPIC register write while the current IOAPIC register
 write is updating the EOI exit bitmap. There is no way to fix that
 without locking (which can be avoided if the IOAPIC-EOI exit bitmap
 synchronization is vcpu local).
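
The proposed vcpu-local scheme could be sketched roughly like this (a
hypothetical userspace model, not the actual KVM code; all structure and
function names here are illustrative, and the real kernel would use
make_all_cpus_request() rather than a plain flag):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_IOAPIC_PINS 24

struct ioapic_state {
    uint8_t vector[NR_IOAPIC_PINS];   /* redirection table: pin -> vector */
    int     level[NR_IOAPIC_PINS];    /* 1 = level-triggered */
};

struct vcpu_state {
    int      req_eoibitmap;           /* stands in for KVM_REQ_EOIBITMAP */
    uint64_t eoi_exitmap[4];          /* 256 vectors, one bit each */
};

/* IOAPIC write path: update the table, then raise the request on every
 * vcpu; no vcpu's VMCS field is touched remotely. */
static void ioapic_write(struct ioapic_state *io, struct vcpu_state *vcpus,
                         int nvcpus, int pin, uint8_t vec, int level)
{
    io->vector[pin] = vec;
    io->level[pin] = level;
    for (int i = 0; i < nvcpus; i++)
        vcpus[i].req_eoibitmap = 1;
}

/* VM-entry path: each vcpu rebuilds its own bitmap from a single
 * consistent snapshot of the IOAPIC state, so a remote writer can never
 * race with a half-finished local update. */
static void vcpu_load_eoi_exitmap(struct vcpu_state *v,
                                  const struct ioapic_state *io)
{
    if (!v->req_eoibitmap)
        return;
    v->req_eoibitmap = 0;
    memset(v->eoi_exitmap, 0, sizeof(v->eoi_exitmap));
    for (int pin = 0; pin < NR_IOAPIC_PINS; pin++)
        if (io->level[pin]) {
            uint8_t vec = io->vector[pin];
            v->eoi_exitmap[vec / 64] |= 1ull << (vec % 64);
        }
}
```

The point of the sketch is only the ordering: the writer never touches
another vcpu's bitmap, it only invalidates it.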
 
The race is benign. We have a similar one for interrupt injection, and the
same race exists on real HW. The two cases that can happen due to the race are:

1. exitbitmap bit X is changed from 1 to 0
  No problem. It is harmless to do an exit; on the next entry the exitbitmap
  will be fixed.
2. exitbitmap bit X is changed from 0 to 1
  If the vcpu is servicing X at the time this happens, it was delivered as
  edge-triggered, so there is no need to exit. The exitbitmap will be updated
  after the next vmexit, which will happen due to KVM_REQ_EOIBITMAP processing.

But software really should take care not to change the interrupt vector
configuration while an interrupt with the same vector is in flight.

 Questions below.
 
  @@ -236,12 +225,14 @@ static inline void kvm_apic_set_id(struct kvm_lapic *apic, u8 id)
   {
   apic_set_reg(apic, APIC_ID, id << 24);
   recalculate_apic_map(apic->vcpu->kvm);
  +   ioapic_update_eoi_exitmap(apic->vcpu->kvm);
   }
 
 Why is it necessary to update 

Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-19 Thread Marcelo Tosatti
On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
 Virtual interrupt delivery avoids the need for KVM to inject vAPIC
 interrupts manually; this is fully taken care of by the hardware. It
 requires some special awareness in the existing interrupt injection path:
 
 - for a pending interrupt, instead of direct injection, we may need to
   update architecture-specific indicators before resuming the guest.
 
 - A pending interrupt that is masked by the ISR should also be
   considered in the above update action, since hardware will decide
   when to inject it at the right time. The current has_interrupt and
   get_interrupt only return a valid vector from an injection p.o.v.
 
 Signed-off-by: Kevin Tian kevin.t...@intel.com
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com

 + if (enable_apicv_reg_vid)
 + kvm_x86_ops->update_cr8_intercept = NULL;
 + else
 + kvm_x86_ops->update_apic_irq = NULL;

Loading the module with enable_apicv=0, then enable_apicv=1, 
fails?

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-19 Thread Marcelo Tosatti
On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
 Virtual interrupt delivery avoids the need for KVM to inject vAPIC
 interrupts manually; this is fully taken care of by the hardware. It
 requires some special awareness in the existing interrupt injection path:
 
 - for a pending interrupt, instead of direct injection, we may need to
   update architecture-specific indicators before resuming the guest.
 
 - A pending interrupt that is masked by the ISR should also be
   considered in the above update action, since hardware will decide
   when to inject it at the right time. The current has_interrupt and
   get_interrupt only return a valid vector from an injection p.o.v.
 
 Signed-off-by: Kevin Tian kevin.t...@intel.com
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com


Resuming previous discussion:

  How about recalculating irr_pending according to the VIRR on each
  vmexit?
  
 No need really. Since HW can only clear the VIRR, the only situation
 that may happen is that irr_pending will be true while the VIRR is
 empty, and apic_find_highest_irr() will return the correct result in
 this case.

Self-IPI does cause VIRR to be set, see 29.1.5 Self-IPI 
Virtualization.

Also, an example of a problem with the ISR caching optimization (which was
written with ISR/IRR management entirely in software):

The isr_count variable is never incremented (because HW sets the ISR):
kvm_cpu_get_interrupt does not call into kvm_get_apic_interrupt
if virtual interrupt delivery is enabled.

Therefore apic_find_highest_isr can return -1 even though VISR
is not zero.

In that case (VISR non-zero but apic_find_highest_isr() == -1), 
apic_update_ppr does not correctly set

PPR = max(ISRV, TPR)

Which can result in kvm_arch_vcpu_runnable returning 1 when it should not.

Please disable usage of isr_count if virtual interrupt delivery is enabled.
Given that self-IPI writes to the VIRR, please also disable usage of
highest_isr_cache and irr_pending.
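
For reference, the PPR rule cited above can be modelled as a tiny
userspace function (an illustrative sketch only, not KVM's
apic_update_ppr; the priority-class semantics follow the SDM: if the TPR
class is at least the ISRV class, PPR = TPR, otherwise PPR = ISRV with
the low nibble cleared). It shows why a stale "no ISR bit set" cache
result (ISRV read as 0) makes PPR come out too low:

```c
#include <assert.h>
#include <stdint.h>

/* tpr:  Task Priority Register value
 * isrv: highest in-service vector, or 0 if none is in service */
static uint8_t apic_compute_ppr(uint8_t tpr, uint8_t isrv)
{
    if ((tpr & 0xf0) >= (isrv & 0xf0))
        return tpr;           /* TPR class dominates: PPR = TPR */
    return isrv & 0xf0;       /* else: PPR = ISRV class, low nibble 0 */
}
```

With a stale cache reporting ISRV = 0 while VISR actually holds a high
vector, the function degenerates to PPR = TPR, which is what makes
kvm_arch_vcpu_runnable misjudge runnability.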




Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-19 Thread Marcelo Tosatti
On Wed, Dec 19, 2012 at 10:59:36PM -0200, Marcelo Tosatti wrote:
 On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
  From: Yang Zhang yang.z.zh...@intel.com
  
  Virtual interrupt delivery avoids the need for KVM to inject vAPIC
  interrupts manually; this is fully taken care of by the hardware. It
  requires some special awareness in the existing interrupt injection path:
  
  - for a pending interrupt, instead of direct injection, we may need to
    update architecture-specific indicators before resuming the guest.
  
  - A pending interrupt that is masked by the ISR should also be
    considered in the above update action, since hardware will decide
    when to inject it at the right time. The current has_interrupt and
    get_interrupt only return a valid vector from an injection p.o.v.
  
  Signed-off-by: Kevin Tian kevin.t...@intel.com
  Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 
 
 Resuming previous discussion:
 
   How about recalculating irr_pending according to the VIRR on each
   vmexit?
   
  No need really. Since HW can only clear the VIRR, the only situation
  that may happen is that irr_pending will be true while the VIRR is
  empty, and apic_find_highest_irr() will return the correct result in
  this case.
 
 Self-IPI does cause VIRR to be set, see 29.1.5 Self-IPI 
 Virtualization.
 
 Also, an example of a problem with the ISR caching optimization (which was
 written with ISR/IRR management entirely in software):
 
 The isr_count variable is never incremented (because HW sets the ISR):
 kvm_cpu_get_interrupt does not call into kvm_get_apic_interrupt
 if virtual interrupt delivery is enabled.
 
 Therefore apic_find_highest_isr can return -1 even though VISR
 is not zero.
 
 In that case (VISR non-zero but apic_find_highest_isr() == -1), 
 apic_update_ppr does not correctly set
 
 PPR = max(ISRV, TPR)
 
 Which can result in kvm_arch_vcpu_runnable returning 1 when it should not.
 
 Please disable usage of isr_count if virtual interrupt delivery is enabled.
 Given that self-IPI writes to the VIRR, please also disable usage of
 highest_isr_cache and irr_pending.

BTW, no need to update irr_pending or the other caching variables on
VM-exit. Just skip them if virtual interrupt delivery is enabled.


Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-19 Thread Gleb Natapov
On Wed, Dec 19, 2012 at 10:59:36PM -0200, Marcelo Tosatti wrote:
 On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
  From: Yang Zhang yang.z.zh...@intel.com
  
  Virtual interrupt delivery avoids the need for KVM to inject vAPIC
  interrupts manually; this is fully taken care of by the hardware. It
  requires some special awareness in the existing interrupt injection path:
  
  - for a pending interrupt, instead of direct injection, we may need to
    update architecture-specific indicators before resuming the guest.
  
  - A pending interrupt that is masked by the ISR should also be
    considered in the above update action, since hardware will decide
    when to inject it at the right time. The current has_interrupt and
    get_interrupt only return a valid vector from an injection p.o.v.
  
  Signed-off-by: Kevin Tian kevin.t...@intel.com
  Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 
 
 Resuming previous discussion:
 
   How about recalculating irr_pending according to the VIRR on each
   vmexit?
   
  No need really. Since HW can only clear the VIRR, the only situation
  that may happen is that irr_pending will be true while the VIRR is
  empty, and apic_find_highest_irr() will return the correct result in
  this case.
 
 Self-IPI does cause VIRR to be set, see 29.1.5 Self-IPI 
 Virtualization.

True. But as I said later in that discussion, once irr_pending is set
to true it never becomes false, so the optimization is effectively
disabled. We can set it to true during apic initialization to make that
explicit.

 Also, an example of a problem with the ISR caching optimization (which was
 written with ISR/IRR management entirely in software):
 
 The isr_count variable is never incremented (because HW sets the ISR):
 kvm_cpu_get_interrupt does not call into kvm_get_apic_interrupt
 if virtual interrupt delivery is enabled.
 
 Therefore apic_find_highest_isr can return -1 even though VISR
 is not zero.
 
 In that case (VISR non-zero but apic_find_highest_isr() == -1), 
 apic_update_ppr does not correctly set
 
 PPR = max(ISRV, TPR)
 
 Which can result in kvm_arch_vcpu_runnable returning 1 when it should not.
 
 Please disable usage of isr_count if virtual interrupt delivery is enabled.
 Given that self-IPI writes to the VIRR, please also disable usage of
 highest_isr_cache and irr_pending.
 
Good catch about isr_count. We can set it to 1 during apic
initialization too, or skip the call to apic_update_ppr() in
kvm_apic_has_interrupt() if vid is enabled.
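
The safe fallback discussed here is to stop trusting the software caches
and scan the virtualized register directly, since hardware updates
VIRR/VISR behind KVM's back. A minimal sketch of such a scan (an
illustrative userspace model: the real APIC page lays the 256-bit IRR
out as eight 32-bit words at 16-byte strides, modelled here as a plain
array; `__builtin_clz` is a GCC/Clang builtin):

```c
#include <assert.h>
#include <stdint.h>

/* Scan a 256-bit IRR/ISR-style register, highest word first, and
 * return the highest set vector, or -1 if the register is empty. */
static int find_highest_vector(const uint32_t reg[8])
{
    for (int word = 7; word >= 0; word--)
        if (reg[word])
            return word * 32 + 31 - __builtin_clz(reg[word]);
    return -1;   /* empty: no pending vector */
}
```

With this, an always-true irr_pending (or a bypassed isr_count) costs
one such scan per query rather than risking a stale cached answer.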

--
Gleb.


Re: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-19 Thread Gleb Natapov
On Wed, Dec 19, 2012 at 11:26:30PM -0200, Marcelo Tosatti wrote:
 On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
  From: Yang Zhang yang.z.zh...@intel.com
  
  Virtual interrupt delivery avoids the need for KVM to inject vAPIC
  interrupts manually; this is fully taken care of by the hardware. It
  requires some special awareness in the existing interrupt injection path:
  
  - for a pending interrupt, instead of direct injection, we may need to
    update architecture-specific indicators before resuming the guest.
  
  - A pending interrupt that is masked by the ISR should also be
    considered in the above update action, since hardware will decide
    when to inject it at the right time. The current has_interrupt and
    get_interrupt only return a valid vector from an injection p.o.v.
  
  Signed-off-by: Kevin Tian kevin.t...@intel.com
  Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 
  +   if (enable_apicv_reg_vid)
  +   kvm_x86_ops->update_cr8_intercept = NULL;
  +   else
  +   kvm_x86_ops->update_apic_irq = NULL;
 
 Loading the module with enable_apicv=0, then enable_apicv=1, 
 fails?
It is not changeable after the module is loaded. This is true for all
kvm-intel.ko parameters.

--
Gleb.