RE: [PATCH v5 2/5] KVM: VMX: Register a new IPI for posted interrupt

2013-03-08 Thread Zhang, Yang Z
Ingo Molnar wrote on 2013-03-08:
> 
> * Gleb Natapov  wrote:
> 
>> On Fri, Mar 08, 2013 at 02:26:25PM +0100, Ingo Molnar wrote:
>>> 
>>> * Yang Zhang  wrote:
>>> 
>>>> diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
>>>> index 6e03b0d..2329a54 100644
>>>> --- a/arch/x86/kernel/irqinit.c
>>>> +++ b/arch/x86/kernel/irqinit.c
>>>> @@ -205,6 +205,10 @@ static void __init apic_intr_init(void)
>>>> 
>>>>  /* IPI for X86 platform specific use */
>>>>  alloc_intr_gate(X86_PLATFORM_IPI_VECTOR, x86_platform_ipi);
>>>> +#ifdef CONFIG_HAVE_KVM
>>>> +/* IPI for KVM to deliver posted interrupt */
>>>> +alloc_intr_gate(POSTED_INTR_VECTOR, kvm_posted_intr_ipi);
>>>> +#endif
>>> 
>>> Please avoid wasting an IDT entry by reusing x86_platform_ipi.
>>> 
>>> A KVM guest is in essence one type of 'x86 platform', and this callback is
>>> used by hardware platforms, so collision is not an issue AFAICS.
>> 
>> This is IPI send by a host though.
> 
> But received on the guest side, right?

Yes, both guest and host will receive it.

Best regards,
Yang

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v5 4/5] KVM: VMX: Add the algorithm of deliver posted interrupt

2013-03-08 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2013-03-09:
> On Fri, Mar 08, 2013 at 09:23:20AM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Only deliver the posted interrupt when target vcpu is running
>> and there is no previous interrupt pending in pir.
>> 
>> Signed-off-by: Yang Zhang 
>> 
>> +static bool vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>> +{
>> +struct vcpu_vmx *vmx = to_vmx(vcpu);
>> +
>> +if (!vmx_vm_has_apicv(vcpu->kvm))
>> +return false;
>> +
>> +if (pi_test_and_set_pir(vector, &vmx->pi_desc))
>> +return true;
>> +
>> +kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +if ((vcpu->mode == IN_GUEST_MODE)) {
>> +if (!pi_test_and_set_on(&vmx->pi_desc))
>> +apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
>> +POSTED_INTR_VECTOR);
>> +} else
>> +kvm_vcpu_kick(vcpu);
>> +
>> +return true;
>> +}
> 
> Meaning of return value is unclear.
Yes, maybe the function name is confusing. How about calling it
hwapic_deliver_interrupt()?

Or, roll back to the old way:
if (vm_has_apicv)
    deliver_posted_interrupt()
else
    test_and_set_virr()
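
For readers following the thread, the delivery decision under review can be modelled in user space. This is only a sketch, not the patch's code: the struct, enum and function names are hypothetical, C11 atomics stand in for the kernel's bit operations, and the apicv-disabled fallback and the KVM_REQ_EVENT request are left out. It returns an explicit enum instead of a bool, which is one possible way to make the meaning of the return value clear.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a posted-interrupt descriptor: 256 request bits
 * (PIR) plus one outstanding-notification (ON) bit. */
struct pi_desc_model {
    _Atomic uint32_t pir[8];
    atomic_bool on;
};

/* What the caller has to do after posting a vector. */
enum pi_action {
    PI_ALREADY_PENDING, /* vector was already set in PIR, nothing to do */
    PI_SEND_IPI,        /* vcpu is in guest mode, send the notification IPI */
    PI_IPI_IN_FLIGHT,   /* ON was already set, an earlier IPI covers this one */
    PI_KICK             /* vcpu is not in guest mode, kick it instead */
};

static void pi_model_init(struct pi_desc_model *d)
{
    for (int i = 0; i < 8; i++)
        atomic_init(&d->pir[i], 0);
    atomic_init(&d->on, false);
}

/* Atomically set the PIR bit; returns true if it was already set. */
static bool pi_model_test_and_set_pir(struct pi_desc_model *d, unsigned vector)
{
    assert(vector < 256);
    uint32_t mask = UINT32_C(1) << (vector & 31);
    uint32_t old = atomic_fetch_or(&d->pir[vector >> 5], mask);
    return (old & mask) != 0;
}

static enum pi_action pi_model_deliver(struct pi_desc_model *d,
                                       unsigned vector, bool in_guest_mode)
{
    if (pi_model_test_and_set_pir(d, vector))
        return PI_ALREADY_PENDING;
    if (!in_guest_mode)
        return PI_KICK;
    /* Only the caller that flips ON from 0 to 1 sends the IPI. */
    if (atomic_exchange(&d->on, true))
        return PI_IPI_IN_FLIGHT;
    return PI_SEND_IPI;
}
```

In this model, posting vector 0x30 to a running vcpu asks for an IPI, and posting 0x31 right afterwards finds ON already set, so no second IPI is requested.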

>> +
>> +static bool vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu, bool sync)
>> +{
>> +struct vcpu_vmx *vmx = to_vmx(vcpu);
>> +
>> +if (!vmx_vm_has_apicv(vcpu->kvm))
>> +return false;
>> +
>> +if (bitmap_empty((unsigned long *)vmx->pi_desc.pir, 256))
>> +return false;
>> +
>> +if (sync)
>> +kvm_apic_update_irr(vcpu, vmx->pi_desc.pir);
>> +return true;
>> +}
> 
> Please split in two kvm_x86_ops functions: one to query whether PIR is empty,
> the other to perform the sync.
> 
> Perhaps "->hwapic_has_interrupt" is a good name.
Sure.
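
The requested split can be illustrated with a small single-threaded model. The names below are hypothetical and plain arrays stand in for the descriptor and the virtual APIC page. Clearing PIR during the sync is an assumption of this sketch, and real code would need an atomic exchange because PIR can be written concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HWAPIC_WORDS 8 /* 256 vectors / 32 bits per word */

/* Query only: is any posted request pending? Does not modify state. */
static bool hwapic_model_has_interrupt(const uint32_t pir[HWAPIC_WORDS])
{
    for (int i = 0; i < HWAPIC_WORDS; i++)
        if (pir[i])
            return true;
    return false;
}

/* Sync: merge every pending PIR bit into IRR and clear PIR.
 * Single-threaded model; see the note above about atomicity. */
static void hwapic_model_sync_pir_to_irr(uint32_t pir[HWAPIC_WORDS],
                                         uint32_t irr[HWAPIC_WORDS])
{
    assert(pir != irr);
    for (int i = 0; i < HWAPIC_WORDS; i++) {
        irr[i] |= pir[i];
        pir[i] = 0;
    }
}
```

With the two operations separated, a caller that only wants to know whether an interrupt is pending never has to pass a flag that disables the sync.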

Best regards,
Yang



RE: [PATCH v5 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt

2013-03-08 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2013-03-09:
> On Fri, Mar 08, 2013 at 09:23:21AM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> If posted interrupt is available, then use it to inject virtual
>> interrupt to guest.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/irq.c   |2 +-
>>  arch/x86/kvm/lapic.c |   16 +---
>>  arch/x86/kvm/lapic.h |1 +
>>  arch/x86/kvm/x86.c   |4 
>>  4 files changed, 19 insertions(+), 4 deletions(-)
>> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
>> index 484bc87..93b1fd0 100644
>> --- a/arch/x86/kvm/irq.c
>> +++ b/arch/x86/kvm/irq.c
>> @@ -81,7 +81,7 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
>>  if (kvm_cpu_has_extint(v))
>>  return 1;
>> -return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
>> +return (kvm_apic_has_interrupt(v) != -1) || kvm_hwapic_has_interrupt(v);
>>  }
> 
> Please move kvm_apic_has_interrupt() check to a new separate if line.
Sure.

Best regards,
Yang




RE: [PATCH v5 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt

2013-03-11 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-11:
> On Fri, Mar 08, 2013 at 09:23:21AM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> If posted interrupt is available, then use it to inject virtual
>> interrupt to guest.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/irq.c   |2 +-
>>  arch/x86/kvm/lapic.c |   16 +---
>>  arch/x86/kvm/lapic.h |1 +
>>  arch/x86/kvm/x86.c   |4 
>>  4 files changed, 19 insertions(+), 4 deletions(-)
>> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
>> index 484bc87..93b1fd0 100644
>> --- a/arch/x86/kvm/irq.c
>> +++ b/arch/x86/kvm/irq.c
>> @@ -81,7 +81,7 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
>>  if (kvm_cpu_has_extint(v))
>>  return 1;
>> -return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
>> +return (kvm_apic_has_interrupt(v) != -1) || kvm_hwapic_has_interrupt(v);
>>  }
>>  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index b3ea50e..76c8df4 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -713,7 +713,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>  } else
>>  apic_clear_vector(vector, apic->regs + APIC_TMR);
> So we still touch apic page to update APIC_TMR register while vcpu is in
> non-root mode. SDM seems to prohibit it. Can we get clarification about
> how it is supposed to work? May it cause any problems in practice?
Currently, no real hardware touches the APIC_TMR register, so it should be OK
for now. But I am not sure whether hardware will touch it in the future. I will
consult more people to see whether this is a real problem going forward.

>> -result = !apic_test_and_set_irr(vector, apic);
>> +result = 1;
>> +if (!kvm_x86_ops->deliver_posted_interrupt(vcpu, vector))
>> +result = !apic_test_and_set_irr(vector, apic);
>> +
>>  trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>>trig_mode, vector, !result);
>>  if (!result) {
>> @@ -723,8 +726,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>  break;
>>  }
>> -kvm_make_request(KVM_REQ_EVENT, vcpu);
>> -kvm_vcpu_kick(vcpu);
>> +if (!kvm_x86_ops->vm_has_apicv(vcpu->kvm)) {
>> +kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +kvm_vcpu_kick(vcpu);
>> +}
>>  break;
>>  
>>  case APIC_DM_REMRD:
>> @@ -1604,6 +1609,11 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
>>  return highest_irr;
>>  }
>> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu)
>> +{
>> +return kvm_x86_ops->sync_pir_to_irr(vcpu, false);
>> +}
>> +
>>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
>>  {
>>  u32 lvt0 = kvm_apic_get_reg(vcpu->arch.apic, APIC_LVT0);
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index e5327be..c6abc63 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -37,6 +37,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu);
>>  void kvm_free_lapic(struct kvm_vcpu *vcpu);
>>  
>>  int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
>> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu);
>>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
>>  int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
>>  void kvm_lapic_reset(struct kvm_vcpu *vcpu);
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 0baa90d..57f8570 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2679,6 +2679,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
>>  struct kvm_lapic_state *s)
>>  {
>> +kvm_x86_ops->sync_pir_to_irr(vcpu, true);
>>  memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
>>  
>>  return 0;
>> @@ -5699,6 +5700,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>  }
>>  
>>  if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>> +kvm_x86_ops->sync_pir_to_irr(vcpu, true);
>>  inject_pending_event(vcpu);
>>  
>>  /* enable NMI/IRQ window open exits if needed */
>> @@ -5741,6 +5743,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>> 
>>  local_irq_disable();
>> +kvm_x86_ops->posted_intr_clear_on(vcpu);
>> +
>>  if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
>>  || need_resched() || signal_pending(current)) {
>>  vcpu->mode = OUTSIDE_GUEST_MODE;
>> --
>> 1.7.1
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH v5 2/5] KVM: VMX: Register a new IPI for posted interrupt

2013-03-14 Thread Zhang, Yang Z
Ingo Molnar wrote on 2013-03-08:
> 
> * Gleb Natapov  wrote:
> 
>> On Fri, Mar 08, 2013 at 03:05:45PM +0100, Ingo Molnar wrote:
>>> 
>>> * Gleb Natapov  wrote:
>>> 
>>>> On Fri, Mar 08, 2013 at 02:26:25PM +0100, Ingo Molnar wrote:
>>>>> 
>>>>> * Yang Zhang  wrote:
>>>>> 
>>>>>> diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
>>>>>> index 6e03b0d..2329a54 100644
>>>>>> --- a/arch/x86/kernel/irqinit.c
>>>>>> +++ b/arch/x86/kernel/irqinit.c
>>>>>> @@ -205,6 +205,10 @@ static void __init apic_intr_init(void)
>>>>>> 
>>>>>>  /* IPI for X86 platform specific use */
>>>>>>  alloc_intr_gate(X86_PLATFORM_IPI_VECTOR, x86_platform_ipi);
>>>>>> +#ifdef CONFIG_HAVE_KVM
>>>>>> +/* IPI for KVM to deliver posted interrupt */
>>>>>> +alloc_intr_gate(POSTED_INTR_VECTOR, kvm_posted_intr_ipi);
>>>>>> +#endif
>>>>> 
>>>>> Please avoid wasting an IDT entry by reusing x86_platform_ipi.
>>>>> 
>>>>> A KVM guest is in essence one type of 'x86 platform', and this
>>>>> callback is used by hardware platforms, so collision is not an issue
>>>>> AFAICS.
>>>> 
>>>> This is IPI send by a host though.
>>> 
>>> But received on the guest side, right?
>> 
>> Not directly. If CPU that receives it happens to run in a guest mode it
>> makes VMX to re-evaluate pending interrupt and inject one if possible
>> without vmexit.  If CPU is not in a guest mode the handler for the IPI
>> is called in a host mode and does nothing. Guest code is unaware of the
>> existence of that IPI.
> 
> Ok, I guess a separate IPI is fine (and better) in this case then.
Is it OK to add your name to the 'Acked-by' list?

Best regards,
Yang




RE: [PATCH 5/5] KVM: use eoi to track RTC interrupt delivery status

2013-03-17 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-17:
> On Fri, Mar 15, 2013 at 04:05:00PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> The current interrupt coalescing logic, which is only used by RTC, conflicts
>> with Posted Interrupt.
>> This patch introduces a new mechanism that uses EOI to track interrupts:
>> When delivering an interrupt to a vcpu, need_eoi is set to the number of
>> vcpus that received the interrupt, and it is decremented each time a vcpu
>> writes EOI. No subsequent RTC interrupt can be delivered to a vcpu until all
>> vcpus write EOI.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |    2 +-
>>  arch/x86/kvm/lapic.h |    1 +
>>  virt/kvm/ioapic.c    |  105 ++
>>  3 files changed, 107 insertions(+), 1 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index ad97f1f..bf2d208 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -89,7 +89,7 @@ static inline int apic_test_and_clear_vector(int vec, void *bitmap)
>>  return test_and_clear_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>  }
>> -static inline int apic_test_vector(int vec, void *bitmap)
>> +int apic_test_vector(int vec, void *bitmap)
>>  {
>>  return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>  }
> That's too low level to call from IOAPIC. Put kvm_apic_pending_eoi()
> here instead.
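
A rough user-space model of the kind of helper suggested here, for illustration only. kvm_apic_pending_eoi() is just the name proposed in the review; the signature, the model_ prefixed names and the byte-array APIC page are assumptions of this sketch. The register offsets (ISR at 0x100, IRR at 0x200) and the VEC_POS/REG_POS arithmetic follow the usual local APIC layout of one 32-bit word every 16 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_APIC_ISR 0x100 /* in-service register base */
#define MODEL_APIC_IRR 0x200 /* interrupt request register base */

#define VEC_POS(v) ((v) & (32 - 1))   /* bit inside the 32-bit word */
#define REG_POS(v) (((v) >> 5) << 4)  /* byte offset of that word */

static bool model_test_vector(int vec, const uint8_t *bitmap)
{
    uint32_t word;

    assert(vec >= 0 && vec < 256);
    memcpy(&word, bitmap + REG_POS(vec), sizeof(word));
    return (word >> VEC_POS(vec)) & 1u;
}

static void model_set_vector(int vec, uint8_t *bitmap)
{
    uint32_t word;

    assert(vec >= 0 && vec < 256);
    memcpy(&word, bitmap + REG_POS(vec), sizeof(word));
    word |= UINT32_C(1) << VEC_POS(vec);
    memcpy(bitmap + REG_POS(vec), &word, sizeof(word));
}

/* Hypothetical LAPIC-level helper: the IOAPIC only asks whether the vector
 * is still awaiting an EOI and never touches the bitmaps itself. */
static bool model_apic_pending_eoi(const uint8_t *apic_page, int vector)
{
    return model_test_vector(vector, apic_page + MODEL_APIC_ISR) ||
           model_test_vector(vector, apic_page + MODEL_APIC_IRR);
}
```

The point of such a wrapper is that the bit layout stays private to the LAPIC code.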
> 
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index 3a0f9d8..02da8b8 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -84,6 +84,7 @@ static inline bool kvm_hv_vapic_assist_page_enabled(struct kvm_vcpu *vcpu)
>> 
>>  int kvm_lapic_enable_pv_eoi(struct kvm_vcpu *vcpu, u64 data);
>>  void kvm_lapic_init(void);
>> +int apic_test_vector(int vec, void *bitmap);
>> 
>>  static inline u32 kvm_apic_get_reg(struct kvm_lapic *apic, int reg_off)
>>  {
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index 2c6235c..46cb8ed 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -103,12 +103,110 @@ static void rtc_register_notifier(struct kvm_ioapic *ioapic)
>>  kvm_register_irq_ack_notifier(ioapic->kvm,
>>  &ioapic->rtc_status.irq_ack_notifier);
>>  }
>> +
>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>> +{
>> +ioapic->rtc_status.need_eoi = 0;
>> +bitmap_zero(ioapic->rtc_status.vcpu_map, KVM_MAX_VCPUS);
>> +}
>> +
>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>> +{
>> +struct kvm_vcpu *vcpu;
>> +struct kvm_lapic *apic;
>> +int vector, i, need_eoi = 0, rtc_pin = 8;
>> +
>> +vector = ioapic->redirtbl[rtc_pin].fields.vector;
>> +kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
>> +apic = vcpu->arch.apic;
>> +if (apic_test_vector(vector, apic->regs + APIC_ISR) ||
>> +apic_test_vector(vector, apic->regs + APIC_IRR)) {
>> +need_eoi++;
>> +set_bit(vcpu->vcpu_id, ioapic->rtc_status.vcpu_map);
>> +}
>> +}
>> +ioapic->rtc_status.need_eoi = need_eoi;
>> +}
>> +
>> +static void rtc_irq_update(struct kvm_ioapic *ioapic,
>> +struct kvm_lapic_irq *irqe, int irq)
>> +{
>> +int weight;
>> +
>> +if (irq != 8)
>> +return;
>> +
>> +rtc_irq_reset(ioapic);
>> +
>> +kvm_get_dest_vcpu(ioapic->kvm, irqe, ioapic->rtc_status.vcpu_map);
>> +if (likely(!bitmap_empty(ioapic->rtc_status.vcpu_map, KVM_MAX_VCPUS))) {
>> +if (irqe->delivery_mode == APIC_DM_LOWEST)
>> +ioapic->rtc_status.need_eoi = 1;
>> +else {
>> +weight = bitmap_weight(ioapic->rtc_status.vcpu_map,
>> +sizeof(ioapic->rtc_status.vcpu_map));
>> +ioapic->rtc_status.need_eoi = weight;
>> +}
>> +}
>> +}
>> +
>> +static void rtc_irq_ack_eoi(struct kvm_vcpu *vcpu,
>> +struct rtc_status *rtc_status, int irq)
>> +{
>> +if (irq != 8)
>> +return;
>> +
>> +if (test_and_clear_bit(vcpu->vcpu_id, rtc_status->vcpu_map)) {
>> +if (!(--rtc_status->need_eoi))
> WARN_ON(need_eoi < 0)?
> 
>> +/* Clear irr to accept subsequent RTC interrupt */
>> +vcpu->kvm->arch.vioapic->irr &= ~(1 << 8);
> This is not needed if you do not set irr if irq is coalesced.
> 
>> +}
>> +}
>> +
>> +static bool rtc_irq_check(struct kvm_ioapic *ioapic, int irq)
>> +{
>> +if (irq != 8)
>> +return false;
>> +
>> +if (ioapic->rtc_status.need_eoi > 0)
>> +return true; /* coalesced */
>> +
>> +return false;
>> +}
>> +
>>  #else
>>  
>>  static void rtc_register_notifier(struct kvm_ioapic *ioapic)
>>  {
>>  return;
>>  }
>> +
>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>> +{
>> +return;
>> +}
>> +
>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>> +{
>> +return;
>> +}
>> +
>> +static void rtc_irq_update(struct kvm_ioapic *ioapic,
>> +struct kvm_lap

RE: [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting

2013-03-17 Thread Zhang, Yang Z
Zhang, Yang Z wrote on 2013-03-15:
> From: Yang Zhang 
> 
> The following patches add Posted Interrupt support to KVM:
> The first patch enables the feature 'acknowledge interrupt on vmexit'. Since
> it is required by Posted Interrupt, we need to enable it first.
> 
> The subsequent patches add the posted interrupt support:
> Posted Interrupt allows APIC interrupts to be injected into the guest
> directly without any vmexit.
> 
> - When delivering an interrupt to the guest, if the target vcpu is running,
>   update the Posted-interrupt requests bitmap and send a notification event
>   to the vcpu. The vcpu then handles this interrupt automatically,
>   without any software involvement.
> - If the target vcpu is not running or there is already a notification event
>   pending in the vcpu, do nothing. The interrupt will be handled by the
>   next vm entry.
> NOTE: We don't turn on Posted Interrupt until the coalescing issue is
> solved.
> 
> Changes from v5 to v6:
> * Split sync_pir_to_irr into two functions one to query whether PIR is empty
>   and the other to perform the sync.
> * Add comments to explain how vmx_sync_pir_to_irr() works.
> * Rebase on top of KVM upstream.
> 
> Changes from v4 to v5:
> * Add per cpu count for posted IPI handler.
> * Don't check irr when delivering an interrupt. Since we cannot get interrupt
>   coalescing info with Posted Interrupt, there is no need to check the irr.
>   Another patch will change the current interrupt coalescing logic. Before
>   it, we will not turn on Posted Interrupt.
> * Clear the outstanding notification bit after calling local_irq_disable,
>   but before checking requests. As Marcelo suggested, this ensures the IPI
>   is not lost.
> * Remove the spinlock. Same as item 2: if there is no need to get coalescing
>   info, then there is no need for the lock.
> * Rebase on top of KVM upstream.
> 
> Yang Zhang (5):
>   KVM: VMX: Enable acknowledge interupt on vmexit
>   KVM: VMX: Register a new IPI for posted interrupt
>   KVM: VMX: Check the posted interrupt capability
>   KVM: VMX: Add the algorithm of deliver posted interrupt
>   KVM : VMX: Use posted interrupt to deliver virtual interrupt
>  arch/x86/include/asm/entry_arch.h  |    4 +
>  arch/x86/include/asm/hardirq.h     |    3 +
>  arch/x86/include/asm/hw_irq.h      |    1 +
>  arch/x86/include/asm/irq_vectors.h |    5 +
>  arch/x86/include/asm/kvm_host.h    |    5 +
>  arch/x86/include/asm/vmx.h         |    4 +
>  arch/x86/kernel/entry_64.S         |    5 +
>  arch/x86/kernel/irq.c              |   22
>  arch/x86/kernel/irqinit.c          |    4 +
>  arch/x86/kvm/irq.c                 |    3 +-
>  arch/x86/kvm/lapic.c               |   29 -
>  arch/x86/kvm/lapic.h               |    2 +
>  arch/x86/kvm/svm.c                 |   30 +
>  arch/x86/kvm/vmx.c                 |  223
>  arch/x86/kvm/x86.c                 |    8 +-
>  virt/kvm/kvm_main.c                |    1 +
>  16 files changed, 320 insertions(+), 29 deletions(-)
Are there any other comments on this patch series?

As for the TMR issue: since it has nothing to do with APICv, if we really need
to handle it later, we may need a separate patch to fix it. For now, we should
focus on APICv only.

Best regards,
Yang



RE: [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting

2013-03-18 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-18:
> On Mon, Mar 18, 2013 at 02:49:25AM +0000, Zhang, Yang Z wrote:
>> Zhang, Yang Z wrote on 2013-03-15:
>>> From: Yang Zhang 
>>> 
>>> The following patches add Posted Interrupt support to KVM:
>>> The first patch enables the feature 'acknowledge interrupt on vmexit'. Since
>>> it is required by Posted Interrupt, we need to enable it first.
>>> 
>>> The subsequent patches add the posted interrupt support:
>>> Posted Interrupt allows APIC interrupts to be injected into the guest
>>> directly without any vmexit.
>>> 
>>> - When delivering an interrupt to the guest, if the target vcpu is running,
>>>   update the Posted-interrupt requests bitmap and send a notification
>>>   event to the vcpu. The vcpu then handles this interrupt
>>>   automatically, without any software involvement.
>>> - If the target vcpu is not running or there is already a notification
>>>   event pending in the vcpu, do nothing. The interrupt will be handled by
>>>   the next vm entry.
>>> NOTE: We don't turn on Posted Interrupt until the coalescing issue is
>>> solved.
>>> 
>>> Changes from v5 to v6:
>>> * Split sync_pir_to_irr into two functions one to query whether PIR is empty
>>>   and the other to perform the sync.
>>> * Add comments to explain how vmx_sync_pir_to_irr() works.
>>> * Rebase on top of KVM upstream.
>>> 
>>> Changes from v4 to v5:
>>> * Add per cpu count for posted IPI handler.
>>> * Don't check irr when delivering an interrupt. Since we cannot get
>>>   interrupt coalescing info with Posted Interrupt, there is no need to
>>>   check the irr. Another patch will change the current interrupt coalescing
>>>   logic. Before it, we will not turn on Posted Interrupt.
>>> * Clear the outstanding notification bit after calling local_irq_disable,
>>>   but before checking requests. As Marcelo suggested, this ensures the IPI
>>>   is not lost.
>>> * Remove the spinlock. Same as item 2: if there is no need to get
>>>   coalescing info, then there is no need for the lock.
>>> * Rebase on top of KVM upstream.
>>> 
>>> Yang Zhang (5):
>>>   KVM: VMX: Enable acknowledge interupt on vmexit
>>>   KVM: VMX: Register a new IPI for posted interrupt
>>>   KVM: VMX: Check the posted interrupt capability
>>>   KVM: VMX: Add the algorithm of deliver posted interrupt
>>>   KVM : VMX: Use posted interrupt to deliver virtual interrupt
>>>  arch/x86/include/asm/entry_arch.h  |    4 +
>>>  arch/x86/include/asm/hardirq.h     |    3 +
>>>  arch/x86/include/asm/hw_irq.h      |    1 +
>>>  arch/x86/include/asm/irq_vectors.h |    5 +
>>>  arch/x86/include/asm/kvm_host.h    |    5 +
>>>  arch/x86/include/asm/vmx.h         |    4 +
>>>  arch/x86/kernel/entry_64.S         |    5 +
>>>  arch/x86/kernel/irq.c              |   22
>>>  arch/x86/kernel/irqinit.c          |    4 +
>>>  arch/x86/kvm/irq.c                 |    3 +-
>>>  arch/x86/kvm/lapic.c               |   29 -
>>>  arch/x86/kvm/lapic.h               |    2 +
>>>  arch/x86/kvm/svm.c                 |   30 +
>>>  arch/x86/kvm/vmx.c                 |  223
>>>  arch/x86/kvm/x86.c                 |    8 +-
>>>  virt/kvm/kvm_main.c                |    1 +
>>>  16 files changed, 320 insertions(+), 29 deletions(-)
>> Are there any other comments with this patch?
>> 
> Haven't reviewed it yet, will do ASAP. "Use eoi to track RTC interrupt
> delivery status" series is a prerequisite for this one.
> 
>> For TMR issue, since it has nothing to do with APICv, if we really need to
>> handle it later, then we may need a separate patch to fix it. But currently,
>> we may focus on APICv only.
>> 
> What do you mean by "TMR has nothing to do with APICv"?
Just ignore it. I will send out the fix together with the APICv patch.

Best regards,
Yang




RE: [PATCH v2 4/8] KVM: Introduce struct rtc_status

2013-03-18 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-18:
> On Mon, Mar 18, 2013 at 03:24:35PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  virt/kvm/ioapic.h |9 +
>>  1 files changed, 9 insertions(+), 0 deletions(-)
>> diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
>> index 2001b61..4904ca3 100644
>> --- a/virt/kvm/ioapic.h
>> +++ b/virt/kvm/ioapic.h
>> @@ -34,6 +34,12 @@ struct kvm_vcpu;
>>  #define IOAPIC_INIT 0x5
>>  #define IOAPIC_EXTINT   0x7
>> +struct rtc_status {
>> +int need_eoi;
>> +DECLARE_BITMAP(vcpu_map, KVM_MAX_VCPUS);
>> +struct kvm_irq_ack_notifier irq_ack_notifier;
> If we do not register ack notifier any more why do you need this here?
> Also give the structure more kvmish name.
You are right. It's a mistake to leave it here.

>> +};
>> +
>>  struct kvm_ioapic {
>>  u64 base_address;
>>  u32 ioregsel;
>> @@ -47,6 +53,9 @@ struct kvm_ioapic {
>>  void (*ack_notifier)(void *opaque, int irq);
>>  spinlock_t lock;
>>  DECLARE_BITMAP(handled_vectors, 256);
>> +#ifdef CONFIG_X86
>> +struct rtc_status rtc_status;
>> +#endif
>>  };
>>  
>>  #ifdef DEBUG
>> --
>> 1.7.1
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH v2 5/8] KVM: Recalculate destination vcpu map

2013-03-18 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-18:
> On Mon, Mar 18, 2013 at 03:24:36PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Update destination vcpu map when ioapic entry or apic(id, ldr, dfr) is
>> changed
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  virt/kvm/ioapic.c |   38 --
>>  1 files changed, 36 insertions(+), 2 deletions(-)
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index 4296116..659511d 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -87,6 +87,36 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>  return result;
>>  }
>> +#ifdef CONFIG_X86
>> +static void rtc_irq_get_dest_vcpu(struct kvm_ioapic *ioapic, int irq)
>> +{
>> +union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
>> +struct kvm_lapic_irq irqe;
>> +
> You cannot access ioapic without taking ioapic lock.
Right.
 
>> +if (irq != 8 || entry->fields.mask)
>> +return;
>> +
>> +irqe.dest_id = entry->fields.dest_id;
>> +irqe.vector = entry->fields.vector;
>> +irqe.dest_mode = entry->fields.dest_mode;
>> +irqe.trig_mode = entry->fields.trig_mode;
>> +irqe.delivery_mode = entry->fields.delivery_mode << 8;
>> +irqe.level = 1;
>> +irqe.shorthand = 0;
>> +
>> +bitmap_zero(ioapic->rtc_status.vcpu_map, KVM_MAX_VCPUS);
>> +
>> +kvm_get_dest_vcpu(ioapic->kvm, &irqe, ioapic->rtc_status.vcpu_map);
>> +}
>> +
>> +#else
>> +
>> +static void rtc_irq_get_dest_vcpu(struct kvm_ioapic *ioapic, int irq)
>> +{
>> +return;
>> +}
>> +#endif
>> +
>>  static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
>>  {
>>  union kvm_ioapic_redirect_entry *pent;
>> @@ -147,9 +177,13 @@ void kvm_scan_ioapic_entry(struct kvm *kvm)
>>  {
>>  struct kvm_ioapic *ioapic = kvm->arch.vioapic;
>> -if (!kvm_apic_vid_enabled(kvm) || !ioapic)
>> +if (!ioapic)
>>  return;
>> -kvm_make_update_eoibitmap_request(kvm);
>> +
>> +rtc_irq_get_dest_vcpu(ioapic, 8);
>> +
>> +if (kvm_apic_vid_enabled(kvm))
>> +kvm_make_update_eoibitmap_request(kvm);
>>  }
>>  
>>  static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
>> --
>> 1.7.1
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH v2 8/8] KVM: Use eoi to track RTC interrupt delivery status

2013-03-18 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-18:
> On Mon, Mar 18, 2013 at 03:24:39PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> The current interrupt coalescing logic, which is only used by RTC, conflicts
>> with Posted Interrupt.
>> This patch introduces a new mechanism that uses EOI to track interrupts:
>> When delivering an interrupt to a vcpu, need_eoi is set to the number of
>> vcpus that received the interrupt, and it is decremented each time a vcpu
>> writes EOI. No subsequent RTC interrupt can be delivered to a vcpu until all
>> vcpus write EOI.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  virt/kvm/ioapic.c |   67 +
>>  1 files changed, 67 insertions(+), 0 deletions(-)
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index 7e47da8..8d498e5 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -130,6 +130,48 @@ static void rtc_irq_get_dest_vcpu(struct kvm_ioapic *ioapic, int irq)
>>  kvm_get_dest_vcpu(ioapic->kvm, &irqe, ioapic->rtc_status.vcpu_map);
>>  }
>> +static void rtc_irq_set_eoi(struct kvm_ioapic *ioapic, int irq)
>> +{
>> +union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
>> +
>> +if (irq != 8)
>> +return;
>> +
>> +if (likely(!bitmap_empty(ioapic->rtc_status.vcpu_map, KVM_MAX_VCPUS))) {
>> +if (entry->fields.delivery_mode == APIC_DM_LOWEST)
>> +ioapic->rtc_status.need_eoi = 1;
>> +else {
>> +int weight;
>> +weight = bitmap_weight(ioapic->rtc_status.vcpu_map,
>> +sizeof(ioapic->rtc_status.vcpu_map));
>> +ioapic->rtc_status.need_eoi = weight;
>> +}
>> +}
>> +}
>> +
>> +static void rtc_irq_ack_eoi(struct kvm_vcpu *vcpu,
>> +struct rtc_status *rtc_status, int irq)
>> +{
>> +if (irq != 8)
>> +return;
>> +
>> +if (test_bit(vcpu->vcpu_id, rtc_status->vcpu_map))
> If you do not use test_and_clear_bit() here the WARN_ON() below can
> be triggered by a malicious guest. Let's define rtc_status->expected_eoi
> bitmap and copy vcpu_map into expected_eoi on each RTC irq.
Sure.
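
To make the suggestion concrete, here is a small user-space model of the bookkeeping. It is a sketch under assumptions rather than the eventual patch: the struct and function names are hypothetical, a single 64-bit word stands in for the vcpu bitmaps, and the lowest-priority delivery case (where only one EOI is expected) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_VCPUS 64

struct rtc_status_model {
    int pending_eoi;       /* EOIs still outstanding for the last RTC irq */
    uint64_t dest_map;     /* vcpus the RTC irq is routed to */
    uint64_t expected_eoi; /* vcpus that still owe an EOI */
};

static int model_popcount64(uint64_t x)
{
    int n = 0;

    while (x) {
        x &= x - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* On each RTC irq: copy the destination map into expected_eoi. */
static void rtc_model_deliver(struct rtc_status_model *s)
{
    s->expected_eoi = s->dest_map;
    s->pending_eoi = model_popcount64(s->dest_map);
}

/* On EOI: test-and-clear, so a repeated or unexpected EOI from the guest
 * cannot drive the counter below zero. Returns true if the EOI counted. */
static bool rtc_model_ack_eoi(struct rtc_status_model *s, unsigned vcpu_id)
{
    uint64_t bit;

    assert(vcpu_id < MODEL_MAX_VCPUS);
    bit = UINT64_C(1) << vcpu_id;
    if (!(s->expected_eoi & bit))
        return false;
    s->expected_eoi &= ~bit;
    s->pending_eoi--;
    assert(s->pending_eoi >= 0);
    return true;
}

/* A new RTC irq is reported as coalesced while EOIs are outstanding. */
static bool rtc_model_coalesced(const struct rtc_status_model *s)
{
    return s->pending_eoi > 0;
}
```

Because the bit is cleared on the first EOI, a guest that writes EOI twice for the same vector only decrements the counter once.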
 
>> +--rtc_status->need_eoi;
>> +
>> +WARN_ON(rtc_status->need_eoi < 0);
>> +}
>> +
>> +static bool rtc_irq_check(struct kvm_ioapic *ioapic, int irq)
>> +{
>> +if (irq != 8)
>> +return false;
>> +
>> +if (ioapic->rtc_status.need_eoi > 0)
>> +return true; /* coalesced */
>> +
>> +return false;
>> +}
>> +
>>  #else
>>  
>>  static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>> @@ -146,6 +188,22 @@ static void rtc_irq_get_dest_vcpu(struct kvm_ioapic *ioapic, int irq)
>>  {
>>  return;
>>  }
>> +
>> +static void rtc_irq_set_eoi(struct kvm_ioapic *ioapic, int irq)
>> +{
>> +return;
>> +}
>> +
>> +static void rtc_irq_ack_eoi(struct kvm_vcpu *vcpu,
>> +struct rtc_status *rtc_status, int irq)
>> +{
>> +return;
>> +}
>> +
>> +static bool rtc_irq_check(struct kvm_ioapic *ioapic, int irq)
>> +{
>> +return false;
>> +}
>>  #endif
>>  
>>  static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
>> @@ -282,6 +340,8 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
>>  irqe.level = 1;
>>  irqe.shorthand = 0;
>> +rtc_irq_set_eoi(ioapic, irq);
>> +
>>  return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe);
>>  }
>> @@ -306,6 +366,11 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
>>  ret = 1;
>>  } else {
>>  int edge = (entry.fields.trig_mode == IOAPIC_EDGE_TRIG);
>> +
>> +if (rtc_irq_check(ioapic, irq)) {
>> +ret = 0; /* coalesced */
>> +goto out;
>> +}
>>  ioapic->irr |= mask;
>>  if ((edge && old_irr != ioapic->irr) ||
>>  (!edge && !entry.fields.remote_irr))
>> @@ -313,6 +378,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
>>  else
>>  ret = 0; /* report coalesced interrupt */
>>  }
>> +out:
>>  trace_kvm_ioapic_set_irq(entry.bits, irq, ret == 0);
>>  spin_unlock(&ioapic->lock);
>> @@ -340,6 +406,7 @@ static void __kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu,
>>  if (ent->fields.vector != vector)
>>  continue;
>> +rtc_irq_ack_eoi(vcpu, &ioapic->rtc_status, i);
>>  /*
>>   * We are dropping lock while calling ack notifiers because ack
>>   * notifier callbacks for assigned devices call into IOAPIC
>> --
>> 1.7.1
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting

2013-03-18 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-18:
> On Mon, Mar 18, 2013 at 10:43:16AM +0000, Zhang, Yang Z wrote:
>>>> For TMR issue, since it has nothing to do with APICv, if we really need to
>>>> handle it later, then we may need a separate patch to fix it. But
>>>> currently, we may focused on APICv only.
>>>> 
>>> What do you mean by "TMR has nothing to do with APICv"?
>> Just ignore it. I will send out the fixing with APICv patch.
>> 
> Does this mean we cannot get away with updating TMR like we do now?
> Because if we can I prefer to not complicate the code further.
No, we still need to update TMR but only in vcpu context.
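
One way to picture "only in vcpu context" is a deferred update: the path that programs the ioapic entry stages the new TMR value and raises a request, and the vcpu thread writes it into its APIC page before the next guest entry. The sketch below is a user-space model under that assumption. All names are hypothetical, and only the TMR offset (0x180) and the 0x10 register stride reflect the real APIC page layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_APIC_TMR 0x180 /* TMR base offset in the APIC register page */
#define MODEL_TMR_WORDS 8    /* 256 vectors / 32 bits per register */

struct vcpu_tmr_model {
    uint8_t apic_page[4096];
    uint32_t staged_tmr[MODEL_TMR_WORDS]; /* computed when the entry is programmed */
    bool tmr_update_requested;
};

/* May run in any context: stage the value, do not touch the APIC page. */
static void tmr_model_request_update(struct vcpu_tmr_model *v,
                                     const uint32_t tmr[MODEL_TMR_WORDS])
{
    memcpy(v->staged_tmr, tmr, sizeof(v->staged_tmr));
    v->tmr_update_requested = true;
}

/* Runs on the vcpu thread before guest entry: apply the staged value. */
static void tmr_model_apply_before_entry(struct vcpu_tmr_model *v)
{
    if (!v->tmr_update_requested)
        return;
    for (int i = 0; i < MODEL_TMR_WORDS; i++)
        memcpy(v->apic_page + MODEL_APIC_TMR + 0x10 * i,
               &v->staged_tmr[i], sizeof(uint32_t));
    v->tmr_update_requested = false;
}

static uint32_t tmr_model_read(const struct vcpu_tmr_model *v, int word)
{
    uint32_t val;

    assert(word >= 0 && word < MODEL_TMR_WORDS);
    memcpy(&val, v->apic_page + MODEL_APIC_TMR + 0x10 * word, sizeof(val));
    return val;
}
```

In this model the APIC page is never written while the vcpu could be running in non-root mode, which is the property the SDM concern is about.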

Best regards,
Yang




RE: [PATCH] KVM: Set TMR when programming ioapic entry

2013-03-18 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-18:
> On Mon, Mar 18, 2013 at 07:42:22PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> We already know the trigger mode of a given interrupt when programming
>> the ioapic entry. So it's not necessary to set it on each interrupt
>> delivery.
>> 
> What is this patch supposed to go on top of?
Sorry, I forgot to mention it.
This is based on the RTC patch (Use eoi to track RTC interrupt delivery status)
and the Posted Interrupt patch (KVM: VMX: Add Posted Interrupt supporting).

>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |   26 ++
>>  arch/x86/kvm/lapic.h |5 ++---
>>  arch/x86/kvm/vmx.c   |2 ++
>>  arch/x86/kvm/x86.c   |   12 
>>  include/linux/kvm_host.h |4 ++--
>>  virt/kvm/ioapic.c|   17 +
>>  virt/kvm/ioapic.h|5 ++---
>>  virt/kvm/kvm_main.c  |4 ++--
>>  8 files changed, 41 insertions(+), 34 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index d0b553b..0d2bcde 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -194,18 +194,17 @@ out:
>>  rcu_read_unlock();
>>  }
>> -void kvm_calculate_eoi_exitmap(struct kvm_vcpu *vcpu,
>> -struct kvm_lapic_irq *irq,
>> -u64 *eoi_exit_bitmap)
>> +bool kvm_vcpu_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq)
>>  {
>>  DECLARE_BITMAP(vcpu_map, KVM_MAX_VCPUS);
>>  
>>  memset(vcpu_map, 0, sizeof(vcpu_map));
>>  
>>  kvm_get_dest_vcpu(vcpu->kvm, irq, vcpu_map);
>> -if (test_bit(vcpu->vcpu_id, vcpu_map) ||
>> -bitmap_empty(vcpu_map, sizeof(vcpu_map)))
>> -__set_bit(irq->vector, (unsigned long *)eoi_exit_bitmap);
>> +if (test_bit(vcpu->vcpu_id, vcpu_map))
>> +return true;
>> +
>> +return false;
>>  }
>>  
>>  static void recalculate_apic_map(struct kvm *kvm)
>> @@ -534,6 +533,15 @@ static inline int apic_find_highest_isr(struct kvm_lapic *apic)
>>  return result;
>>  }
>> +void kvm_apic_update_tmr(struct kvm_vcpu *vcpu, u32 *tmr)
>> +{
>> +struct kvm_lapic *apic = vcpu->arch.apic;
>> +int i;
>> +
>> +for (i = 0; i < 8; i++)
>> +apic_set_reg(apic, APIC_TMR + 0x10 * i, tmr[i]);
>> +}
>> +
>>  static void apic_update_ppr(struct kvm_lapic *apic)
>>  {
>>  u32 tpr, isrv, ppr, old_ppr;
>> @@ -723,12 +731,6 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>  if (unlikely(!apic_enabled(apic)))
>>  break;
>> -if (trig_mode) {
>> -apic_debug("level trig mode for vector %d", vector);
>> -apic_set_vector(vector, apic->regs + APIC_TMR);
>> -} else
>> -apic_clear_vector(vector, apic->regs + APIC_TMR);
>> -
>>  result = 1;
>>  if (!kvm_x86_ops->deliver_posted_interrupt(vcpu, vector))
>>  result = !apic_test_and_set_irr(vector, apic);
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index 4d38836..7e8d35f 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -48,6 +48,7 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value);
>>  u64 kvm_lapic_get_base(struct kvm_vcpu *vcpu);
>>  void kvm_apic_set_version(struct kvm_vcpu *vcpu);
>> +void kvm_apic_update_tmr(struct kvm_vcpu *vcpu, u32 *tmr);
>>  int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest);
>>  int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda);
>>  int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
>> @@ -155,9 +156,7 @@ static inline u16 apic_logical_id(struct kvm_apic_map *map, u32 ldr)
>>  return ldr & map->lid_mask;
>>  }
>> -void kvm_calculate_eoi_exitmap(struct kvm_vcpu *vcpu,
>> -struct kvm_lapic_irq *irq,
>> -u64 *eoi_bitmap);
>> +bool kvm_vcpu_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
>>  void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir);
>>  
>>  void kvm_get_dest_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 1985849..1571df2 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -6486,6 +6486,8 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
>> 
>>  static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
>>  {
>> +if (!vmx_vm_has_apicv(vcpu->kvm))
>> +return;
>>  vmcs_write64(EOI_EXIT_BITMAP0, eoi_exit_bitmap[0]);
>>  vmcs_write64(EOI_EXIT_BITMAP1, eoi_exit_bitmap[1]);
>>  vmcs_write64(EOI_EXIT_BITMAP2, eoi_exit_bitmap[2]);
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 1f54987..5b88c2c 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5632,13 +5632,17 @@ static void kvm_gen_update_masterclock(struct

RE: [PATCH] KVM: Set TMR when programming ioapic entry

2013-03-18 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-18:
> On Mon, Mar 18, 2013 at 12:32:51PM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-18:
>>> On Mon, Mar 18, 2013 at 07:42:22PM +0800, Yang Zhang wrote:
>>>> From: Yang Zhang 
>>>> 
>>>> We already know the trigger mode of a given interrupt when programming
>>>> the ioapic entry. So it's not necessary to set it in each interrupt
>>>> delivery.
>>>> 
>>> What is this patch supposed to go on top of?
>> Sorry, I forgot to mention it. This is based on the RTC patch (Use eoi to
>> track RTC interrupt delivery status) and the Posted Interrupt patch (KVM:
>> VMX: Add Posted Interrupt supporting).
>> 
> Since it touches the code added by RTC patch series that will be changed
> after my latest comments it is hard to review that.
I have sent out the modified RTC patch. Please review it.

Best regards,
Yang




RE: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt

2013-03-19 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-19:
> On Fri, Mar 15, 2013 at 09:31:11PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> If posted interrupt is available, then use it to inject virtual
>> interrupt to guest.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/irq.c   |3 ++-
>>  arch/x86/kvm/lapic.c |   16 +---
>>  arch/x86/kvm/lapic.h |1 +
>>  arch/x86/kvm/vmx.c   |   11 +++
>>  arch/x86/kvm/x86.c   |4 
>>  5 files changed, 31 insertions(+), 4 deletions(-)
>> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
>> index 484bc87..5179988 100644
>> --- a/arch/x86/kvm/irq.c
>> +++ b/arch/x86/kvm/irq.c
>> @@ -81,7 +81,8 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
>>  if (kvm_cpu_has_extint(v))
>>  return 1;
>> -return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
>> +return (kvm_apic_has_interrupt(v) != -1) ||
>> +kvm_hwapic_has_interrupt(v);
> That's incorrect. kvm_cpu_has_interrupt() should return true only if
> there is IRR suitable to be injected, not just any IRR.
> kvm_apic_has_interrupt() should call kvm_apic_update_irr().
You are right.

>>  }
>>  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index b3ea50e..46c7310 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -713,7 +713,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>  } else
>>  apic_clear_vector(vector, apic->regs + APIC_TMR);
>> -result = !apic_test_and_set_irr(vector, apic);
>> +result = 1;
>> +if (!kvm_x86_ops->deliver_posted_interrupt(vcpu, vector))
>> +result = !apic_test_and_set_irr(vector, apic);
>> +
>>  trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>>trig_mode, vector, !result);
>>  if (!result) {
>> @@ -723,8 +726,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>  break;
>>  }
>> -kvm_make_request(KVM_REQ_EVENT, vcpu);
>> -kvm_vcpu_kick(vcpu);
>> +if (!kvm_x86_ops->vm_has_apicv(vcpu->kvm)) {
>> +kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +kvm_vcpu_kick(vcpu);
>> +}
>>  break;
> apicv code and non apicv code are completely different. What's the point
> checking for apicv twice here?
> Just do:
> 
> if (kvm_x86_ops->deliver_posted_interrupt)
>   kvm_x86_ops->deliver_posted_interrupt(vcpu, vector)
> else {
>   result = !apic_test_and_set_irr(vector, apic);
>   kvm_make_request(KVM_REQ_EVENT, vcpu);
>   kvm_vcpu_kick(vcpu);
> }
> 
> And set kvm_x86_ops->deliver_posted_interrupt only if apicv is enabled.
> 
> Also rearrange patches so that APIC_TMR handling goes before posted
> interrupt series.
Sure. 

>> 
>>  case APIC_DM_REMRD:
>> @@ -1604,6 +1609,11 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
>>  return highest_irr;
>>  }
>> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu)
>> +{
>> +return kvm_x86_ops->hwapic_has_interrupt(vcpu);
>> +}
>> +
>>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
>>  {
>>  u32 lvt0 = kvm_apic_get_reg(vcpu->arch.apic, APIC_LVT0);
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index e5327be..c6abc63 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -37,6 +37,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu);
>>  void kvm_free_lapic(struct kvm_vcpu *vcpu);
>>  
>>  int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
>> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu);
>>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
>>  int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
>>  void kvm_lapic_reset(struct kvm_vcpu *vcpu);
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 0b5a8ae..48a2239 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3932,6 +3932,17 @@ static void vmx_posted_intr_clear_on(struct kvm_vcpu *vcpu)
>>  clear_bit(POSTED_INTR_ON, (unsigned long *)&vmx->pi_desc.u.control);
>>  }
>> +/*
>> + * Send interrupt to vcpu via posted interrupt way.
>> + * Return false if posted interrupt is not supported and the caller will
>> + * roll back to old way(via set vIRR).
>> + * Return true if posted interrupt is available, the interrupt is set
>> + * in pir(posted interrupt requests):
>> + * 1. If target vcpu is running(non-root mode), send posted interrupt
>> + * notification to vcpu and hardware will sync pir to vIRR atomically.
>> + * 2. If target vcpu isn't running(root mode), kick it to pick up the
>> + * interrupt from pir in next vmentry.
>> + */
> The comment should go into previous patch. Also I prefer to not check
> for posted interrupt inside the callback, but set it to NULL instead.
> This way we avoid calling a callback on a hot path needlessly.
It'

RE: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt

2013-03-19 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-19:
> On Tue, Mar 19, 2013 at 12:11:47PM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-19:
>>> On Fri, Mar 15, 2013 at 09:31:11PM +0800, Yang Zhang wrote:
>>>> From: Yang Zhang 
>>>> 
>>>> If posted interrupt is available, then use it to inject virtual
>>>> interrupt to guest.
>>>> 
>>>> Signed-off-by: Yang Zhang 
>>>> ---
>>>>  arch/x86/kvm/irq.c   |3 ++-
>>>>  arch/x86/kvm/lapic.c |   16 +---
>>>>  arch/x86/kvm/lapic.h |1 +
>>>>  arch/x86/kvm/vmx.c   |   11 +++
>>>>  arch/x86/kvm/x86.c   |4 
>>>>  5 files changed, 31 insertions(+), 4 deletions(-)
>>>> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
>>>> index 484bc87..5179988 100644
>>>> --- a/arch/x86/kvm/irq.c
>>>> +++ b/arch/x86/kvm/irq.c
>>>> @@ -81,7 +81,8 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
>>>>if (kvm_cpu_has_extint(v))
>>>>return 1;
>>>> -  return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
>>>> +  return (kvm_apic_has_interrupt(v) != -1) ||
>>>> +  kvm_hwapic_has_interrupt(v);
>>> That's incorrect. kvm_cpu_has_interrupt() should return true only if
>>> there is IRR suitable to be injected, not just any IRR.
>>> kvm_apic_has_interrupt() should call kvm_apic_update_irr().
>> You are right.
>> 
>>>>  }
>>>>  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>> index b3ea50e..46c7310 100644
>>>> --- a/arch/x86/kvm/lapic.c
>>>> +++ b/arch/x86/kvm/lapic.c
>>>> @@ -713,7 +713,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>>>} else
>>>>apic_clear_vector(vector, apic->regs + APIC_TMR);
>>>> -  result = !apic_test_and_set_irr(vector, apic);
>>>> +  result = 1;
>>>> +  if (!kvm_x86_ops->deliver_posted_interrupt(vcpu, vector))
>>>> +  result = !apic_test_and_set_irr(vector, apic);
>>>> +
>>>>trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>>>>  trig_mode, vector, !result);
>>>>if (!result) {
>>>> @@ -723,8 +726,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>>>break;
>>>>}
>>>> -  kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>> -  kvm_vcpu_kick(vcpu);
>>>> +  if (!kvm_x86_ops->vm_has_apicv(vcpu->kvm)) {
>>>> +  kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>> +  kvm_vcpu_kick(vcpu);
>>>> +  }
>>>>break;
>>> apicv code and non apicv code are completely different. What's the point
>>> checking for apicv twice here?
>>> Just do:
>>> 
>>> if (kvm_x86_ops->deliver_posted_interrupt)
>>> kvm_x86_ops->deliver_posted_interrupt(vcpu, vector)
>>> else {
>>> result = !apic_test_and_set_irr(vector, apic);
>>> kvm_make_request(KVM_REQ_EVENT, vcpu);
>>> kvm_vcpu_kick(vcpu);
>>> }
>>> 
>>> And set kvm_x86_ops->deliver_posted_interrupt only if apicv is enabled.
>>> 
>>> Also rearrange patches so that APIC_TMR handling goes before posted
>>> interrupt series.
>> Sure.
>> 
>>>> 
>>>>  case APIC_DM_REMRD:
>>>> @@ -1604,6 +1609,11 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
>>>>  return highest_irr;
>>>>  }
>>>> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +  return kvm_x86_ops->hwapic_has_interrupt(vcpu);
>>>> +}
>>>> +
>>>>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
>>>>  {
>>>>u32 lvt0 = kvm_apic_get_reg(vcpu->arch.apic, APIC_LVT0);
>>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>>> index e5327be..c6abc63 100644
>>>> --- a/arch/x86/kvm/lapic.h
>>>> +++ b/arch/x86/kvm/lapic.h
>>>> @@ -37,6 +37,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu);
>>>>  void kvm_free_lapic(struct kvm_vcpu *vcpu);
>>>>  
>>>>  int kvm_apic_has_interrupt(s

RE: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt

2013-03-19 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-19:
> On Tue, Mar 19, 2013 at 12:42:01PM +0000, Zhang, Yang Z wrote:
>>>>>>  local_irq_disable();
>>>>>> +kvm_x86_ops->posted_intr_clear_on(vcpu);
>>>>>> +
>>>>> Why is this separate from pir_to_irr syncing?
>>>> This is the result of discussion with Marcelo. It is more reasonable to
>>>> put it here to avoid unnecessary posted interrupt between:
>>>> 
>>>> vcpu->mode = IN_GUEST_MODE;
>>>> 
>>>> <--interrupt may arrive here and this is unnecessary.
>>>> 
>>>> local_irq_disable();
>>>> 
>>> 
>>> But this still can happen as far as I see:
>>> 
>>> vcpu0                              vcpu1:
>>> pi_test_and_set_pir()
>>> kvm_make_request(KVM_REQ_EVENT)
>>>                                    if (KVM_REQ_EVENT)
>>>                                       sync_pir_to_irr()
>>>                                    vcpu->mode = IN_GUEST_MODE;
>>> if (vcpu->mode == IN_GUEST_MODE)
>>>   if (!pi_test_and_set_on())
>>>     apic->send_IPI_mask()
>>>                                    --> IPI arrives here
>>>                                    local_irq_disable();
>>>                                    posted_intr_clear_on()
>> The current solution tries to block other Posted Interrupts from other VCPUs
>> at the same time. It only mitigates the problem but cannot solve it. The case
>> you mentioned still exists but it should be rare.
>> 
> I am not sure I follow. What scenario exactly are you talking about. I
> looked over past discussion about it and saw that Marcelo gives an
> example how IPI can be lost, but I think that's because we set "on" bit
> after KVM_REQ_EVENT:
The IPI will not be lost in his example (he misread the patch).

> cpu0                        cpu1                 vcpu0
> test_and_set_bit(PIR-A)
> set KVM_REQ_EVENT
>                                                  process REQ_EVENT
>                                                  PIR-A->IRR
> 
>                                                  vcpu->mode=IN_GUEST
> 
> if (vcpu0->guest_mode)
>   if (!t_a_s_bit(PIR notif))
>     send IPI
>                                                  linux_pir_handler
> 
>                             t_a_s_b(PIR-B)=1
>                             no PIR IPI sent
> 
> But what if on delivery we do:
> pi_test_and_set_pir()
> r = pi_test_and_set_on()
> kvm_make_request(KVM_REQ_EVENT)
> if (!r)
>    send_IPI_mask()
> else
>    kvm_vcpu_kick()
> And on vcpu entry we do:
> if (kvm_check_request(KVM_REQ_EVENT)
>  if (test_and_clear_bit(on))
>kvm_apic_update_irr()
> What are the downsides? Can we lose interrupts this way?
We need to check guest mode before sending the IPI. Otherwise the hypervisor
may receive the IPI.
I think the current logic is OK. The only open question is when to clear the
Outstanding Notification (ON) bit. Actually, I prefer your suggestion to clear
it before sync_pir_irr, but Marcelo prefers to clear the ON bit after disabling IRQs.
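
For reference, the sender-side ordering discussed above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the patch itself: struct model_vcpu, the MODEL_* constants and the ipis_sent/kicks counters are hypothetical stand-ins for the real KVM state, and C11 atomics stand in for the kernel bit operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the vcpu->mode values. */
enum { MODEL_OUTSIDE_GUEST_MODE = 0, MODEL_IN_GUEST_MODE = 1 };

/* Hypothetical model of the per-vcpu posted-interrupt state. */
struct model_vcpu {
    atomic_uint pir[8];      /* 256 posted-interrupt request bits */
    atomic_bool on;          /* Outstanding Notification (ON) bit */
    atomic_int mode;         /* models vcpu->mode */
    atomic_bool req_event;   /* models KVM_REQ_EVENT */
    int ipis_sent;           /* model only: notification IPIs sent */
    int kicks;               /* model only: kick-style wakeups */
};

static void model_vcpu_init(struct model_vcpu *v)
{
    for (int i = 0; i < 8; i++)
        atomic_init(&v->pir[i], 0u);
    atomic_init(&v->on, false);
    atomic_init(&v->mode, MODEL_OUTSIDE_GUEST_MODE);
    atomic_init(&v->req_event, false);
    v->ipis_sent = 0;
    v->kicks = 0;
}

/*
 * Post one vector: set the PIR bit, raise the event request, then
 * notify. The IPI is sent only when the target is in guest mode and
 * no notification is already outstanding; otherwise the target is
 * kicked so it picks the vector up on its next entry.
 */
static bool model_deliver_posted_interrupt(struct model_vcpu *v, int vector)
{
    assert(vector >= 0 && vector < 256);

    unsigned int mask = 1u << (vector % 32);
    unsigned int old = atomic_fetch_or(&v->pir[vector / 32], mask);

    if (old & mask)
        return true;         /* already pending in PIR */

    atomic_store(&v->req_event, true);

    if (atomic_load(&v->mode) == MODEL_IN_GUEST_MODE) {
        if (!atomic_exchange(&v->on, true))
            v->ipis_sent++;  /* first notification since ON was clear */
    } else {
        v->kicks++;          /* target is in root mode: no IPI */
    }
    return true;
}
```

The guest-mode check comes before the ON test on purpose: a target that is in root mode never gets the notification vector, which is the point made above about the hypervisor receiving the IPI.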
 
>>> May be move vcpu->mode = IN_GUEST_MODE after local_irq_disable()?
>> Yes, this will solve it. But I am not sure whether it will introduce
>> any regressions. Is there any check relies on this sequence?
>> 
> Do not think so.
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH v2 6/8] KVM: Add reset/restore rtc_status support

2013-03-19 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2013-03-20:
> On Mon, Mar 18, 2013 at 03:24:37PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> reset/restore rtc_status when ioapic reset/restore.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |8 
>>  arch/x86/kvm/lapic.h |1 +
>>  virt/kvm/ioapic.c|   33 +
>>  3 files changed, 42 insertions(+), 0 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 6fb22e3..a223170 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
>>  return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>  }
>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>> +{
>> +struct kvm_lapic *apic = vcpu->arch.apic;
>> +
>> +return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>> +apic_test_vector(vector, apic->regs + APIC_IRR);
>> +}
>> +
> 
> Should hook into kvm_lapic_reset and kvm_vcpu_ioctl_set_lapic to
> generate updates.
rtc_irq_restore will be called after the lapic is restored. What is the problem
if we do not hook into those two functions?

>>  static inline void apic_set_vector(int vec, void *bitmap)
>>  {
>>  set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index 3a0f9d8..e2a03d1 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -160,5 +160,6 @@ void kvm_calculate_eoi_exitmap(struct kvm_vcpu *vcpu,
>> 
>>  void kvm_get_dest_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
>>  unsigned long *vcpu_map);
>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>> 
>>  #endif
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index 659511d..6266d1f 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -88,6 +88,27 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>  }
>>  
>>  #ifdef CONFIG_X86
>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>> +{
>> +ioapic->rtc_status.need_eoi = 0;
>> +bitmap_zero(ioapic->rtc_status.vcpu_map, KVM_MAX_VCPUS);
>> +}
>> +
>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>> +{
>> +struct kvm_vcpu *vcpu;
>> +int vector, i, need_eoi = 0, rtc_pin = 8;
>> +
>> +vector = ioapic->redirtbl[rtc_pin].fields.vector;
>> +kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
>> +if (kvm_apic_pending_eoi(vcpu, vector)) {
>> +need_eoi++;
>> +set_bit(vcpu->vcpu_id, ioapic->rtc_status.vcpu_map);
> 
> Why set bit on vcpu_map here?
We set need_eoi here. If the target vcpu is not in vcpu_map, then need_eoi
will not be updated on EOI.
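
To make the dependency between need_eoi and vcpu_map concrete, here is a small user-space sketch of the restore step. It is a simplified model, not the patch code: the model_* names are hypothetical, the vcpu map is a single 32-bit word, and (unlike the quoted patch) the model clears the map before rebuilding it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_VCPUS 32

/* Hypothetical model of the two LAPIC registers the restore looks at. */
struct model_lapic {
    uint32_t isr[8];
    uint32_t irr[8];
};

/* Hypothetical model of ioapic->rtc_status. */
struct model_rtc_status {
    int need_eoi;        /* EOIs still expected for the RTC vector */
    uint32_t vcpu_map;   /* one bit per vcpu that still owes an EOI */
};

static bool model_vec_test(const uint32_t reg[8], int vector)
{
    return (reg[vector / 32] >> (vector % 32)) & 1u;
}

/* A vcpu still owes an EOI if the vector is in service or requested. */
static bool model_pending_eoi(const struct model_lapic *l, int vector)
{
    return model_vec_test(l->isr, vector) || model_vec_test(l->irr, vector);
}

/* Rebuild both the counter and the map from restored LAPIC state. */
static void model_rtc_restore(struct model_rtc_status *s,
                              const struct model_lapic *lapics,
                              int nr_vcpus, int vector)
{
    assert(nr_vcpus >= 0 && nr_vcpus <= MODEL_MAX_VCPUS);
    assert(vector >= 0 && vector < 256);

    s->need_eoi = 0;
    s->vcpu_map = 0;     /* assumption: start from an empty map */
    for (int i = 0; i < nr_vcpus; i++) {
        if (model_pending_eoi(&lapics[i], vector)) {
            s->need_eoi++;
            s->vcpu_map |= 1u << i;
        }
    }
}

/* EOI path: only a vcpu recorded in the map may decrement the counter. */
static void model_rtc_ack_eoi(struct model_rtc_status *s, int vcpu_id)
{
    assert(vcpu_id >= 0 && vcpu_id < MODEL_MAX_VCPUS);
    if (s->vcpu_map & (1u << vcpu_id)) {
        s->vcpu_map &= ~(1u << vcpu_id);
        s->need_eoi--;
    }
}
```

If the restore incremented need_eoi without setting the map bit, the EOI path above would ignore that vcpu and the counter would never return to zero.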

Best regards,
Yang



RE: [PATCH v2 6/8] KVM: Add reset/restore rtc_status support

2013-03-19 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2013-03-20:
> On Mon, Mar 18, 2013 at 03:24:37PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> reset/restore rtc_status when ioapic reset/restore.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |8 
>>  arch/x86/kvm/lapic.h |1 +
>>  virt/kvm/ioapic.c|   33 +
>>  3 files changed, 42 insertions(+), 0 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 6fb22e3..a223170 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
>>  return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>  }
>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>> +{
>> +struct kvm_lapic *apic = vcpu->arch.apic;
>> +
>> +return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>> +apic_test_vector(vector, apic->regs + APIC_IRR);
>> +}
>> +
>>  static inline void apic_set_vector(int vec, void *bitmap)
>>  {
>>  set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index 3a0f9d8..e2a03d1 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -160,5 +160,6 @@ void kvm_calculate_eoi_exitmap(struct kvm_vcpu *vcpu,
>> 
>>  void kvm_get_dest_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
>>  unsigned long *vcpu_map);
>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>> 
>>  #endif
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index 659511d..6266d1f 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -88,6 +88,27 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>  }
>>  
>>  #ifdef CONFIG_X86
>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>> +{
>> +ioapic->rtc_status.need_eoi = 0;
>> +bitmap_zero(ioapic->rtc_status.vcpu_map, KVM_MAX_VCPUS);
>> +}
>> +
>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>> +{
>> +struct kvm_vcpu *vcpu;
>> +int vector, i, need_eoi = 0, rtc_pin = 8;
>> +
>> +vector = ioapic->redirtbl[rtc_pin].fields.vector;
>> +kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
>> +if (kvm_apic_pending_eoi(vcpu, vector)) {
>> +need_eoi++;
>> +set_bit(vcpu->vcpu_id, ioapic->rtc_status.vcpu_map);
>> +}
>> +}
>> +ioapic->rtc_status.need_eoi = need_eoi;
>> +}
>> +
>>  static void rtc_irq_get_dest_vcpu(struct kvm_ioapic *ioapic, int irq)
>>  {
>>  union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
>> @@ -111,6 +132,16 @@ static void rtc_irq_get_dest_vcpu(struct kvm_ioapic *ioapic, int irq)
>> 
>>  #else
>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>> +{
>> +return;
>> +}
>> +
>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>> +{
>> +return;
>> +}
>> +
>>  static void rtc_irq_get_dest_vcpu(struct kvm_ioapic *ioapic, int irq)
>>  {
>>  return;
>> @@ -462,6 +493,7 @@ void kvm_ioapic_reset(struct kvm_ioapic *ioapic)
>>  ioapic->ioregsel = 0;
>>  ioapic->irr = 0;
>>  ioapic->id = 0;
>> +rtc_irq_reset(ioapic);
>>  update_handled_vectors(ioapic);
>>  }
> 
> Should also zero the counter if the OS resets the IOAPIC (think reboot
> via triple-fault with unacked RTC interrupt in ISR/IRR).
rtc_irq_reset() already does that.

Best regards,
Yang




RE: [PATCH v2 8/8] KVM: Use eoi to track RTC interrupt delivery status

2013-03-19 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2013-03-20:
> On Mon, Mar 18, 2013 at 03:24:39PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> The current interrupt coalescing logic, which is only used by RTC, conflicts
>> with Posted Interrupt.
>> This patch introduces a new mechanism to use eoi to track interrupt:
>> When delivering an interrupt to vcpu, the need_eoi set to number of
>> vcpu that received the interrupt. And decrease it when each vcpu writing
>> eoi. No subsequent RTC interrupt can deliver to vcpu until all vcpus
>> write eoi.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  virt/kvm/ioapic.c |   67 +
>>  1 files changed, 67 insertions(+), 0 deletions(-)
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index 7e47da8..8d498e5 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -130,6 +130,48 @@ static void rtc_irq_get_dest_vcpu(struct kvm_ioapic *ioapic, int irq)
>>  kvm_get_dest_vcpu(ioapic->kvm, &irqe, ioapic->rtc_status.vcpu_map);
>>  }
>> +static void rtc_irq_set_eoi(struct kvm_ioapic *ioapic, int irq)
>> +{
>> +union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
>> +
>> +if (irq != 8)
>> +return;
>> +
>> +if (likely(!bitmap_empty(ioapic->rtc_status.vcpu_map, KVM_MAX_VCPUS))) {
>> +if (entry->fields.delivery_mode == APIC_DM_LOWEST)
>> +ioapic->rtc_status.need_eoi = 1;
>> +else {
>> +int weight;
>> +weight = bitmap_weight(ioapic->rtc_status.vcpu_map,
>> +sizeof(ioapic->rtc_status.vcpu_map));
>> +ioapic->rtc_status.need_eoi = weight;
>> +}
>> +}
>> +}
> 
> Why two bitmaps are necessary? One should be enough.
On EOI, the bitmap is cleared. So we need two bitmaps: one that is updated only
when the RTC destination vcpus change, and a copy of it that is used for the EOI check.
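
A minimal sketch of that two-bitmap idea, as a user-space model. The names (struct model_rtc_track, dest_map, pending_map and the model_* helpers) are hypothetical, and a 32-bit word stands in for the kernel bitmap; it only illustrates why the EOI path needs its own copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the RTC tracking state with two bitmaps. */
struct model_rtc_track {
    uint32_t dest_map;     /* changes only when the RTC routing changes */
    uint32_t pending_map;  /* per-delivery copy, consumed by EOIs */
    int need_eoi;          /* EOIs still outstanding */
};

/* Count set bits without relying on compiler builtins. */
static int model_weight(uint32_t map)
{
    int n = 0;

    while (map) {
        map &= map - 1;    /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Routing change (ioapic entry or apic id/ldr/dfr update). */
static void model_rtc_set_dest(struct model_rtc_track *t, uint32_t dest)
{
    t->dest_map = dest;
}

/*
 * Deliver one RTC interrupt. Returns false (coalesced) while EOIs from
 * the previous delivery are still outstanding, or if nobody is targeted.
 */
static bool model_rtc_deliver(struct model_rtc_track *t)
{
    if (t->need_eoi > 0)
        return false;
    t->pending_map = t->dest_map;   /* fresh copy for this delivery */
    t->need_eoi = model_weight(t->dest_map);
    return t->need_eoi > 0;
}

/* EOI from one vcpu: clears only the per-delivery copy. */
static void model_rtc_eoi(struct model_rtc_track *t, int vcpu_id)
{
    assert(vcpu_id >= 0 && vcpu_id < 32);
    if (t->pending_map & (1u << vcpu_id)) {
        t->pending_map &= ~(1u << vcpu_id);
        t->need_eoi--;
    }
}
```

With a single bitmap, the EOI path would erase the destination set, and it would have to be recomputed before every delivery.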


Best regards,
Yang




RE: [PATCH v3 5/8] KVM: Recalculate destination vcpu map

2013-03-19 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-19:
> On Mon, Mar 18, 2013 at 08:47:19PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Update destination vcpu map when ioapic entry or apic(id, ldr, dfr) is 
>> changed
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  virt/kvm/ioapic.c |   40 ++--
>>  1 files changed, 38 insertions(+), 2 deletions(-)
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index 4296116..329efe1 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -87,6 +87,38 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>  return result;
>>  }
>> +#ifdef CONFIG_X86
>> +static void rtc_irq_get_dest_vcpu(struct kvm_ioapic *ioapic, int irq)
>> +{
>> +union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
>> +struct kvm_lapic_irq irqe;
>> +
>> +if (irq != 8 || entry->fields.mask)
>> +return;
>> +
>> +spin_lock(&ioapic->lock);
> How does this not deadlock? This is called from kvm_scan_ioapic_entry()
> and kvm_scan_ioapic_entry() is called from ioapic_write_indirect() with
I dropped the lock before calling kvm_scan_ioapic_entry() in
ioapic_write_indirect().

> the lock already taken. You should handle that the same way we handle
> eoibitmap recalculation: signal vcpu and calculate there.
Sure.
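
The "signal vcpu and calculate there" approach can be sketched in user space as follows. This is a single-threaded model under stated assumptions: struct model_ioapic, the scan_requested flag and the model_* helpers are hypothetical, and the lock is a toy non-recursive lock that asserts where a real spinlock would deadlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy non-recursive lock: a nested acquire trips the assert. */
struct model_lock {
    atomic_flag held;
};

static void model_lock_acquire(struct model_lock *l)
{
    bool was_held = atomic_flag_test_and_set(&l->held);

    assert(!was_held);   /* a real spinlock would deadlock here */
    (void)was_held;
}

static void model_lock_release(struct model_lock *l)
{
    atomic_flag_clear(&l->held);
}

/* Hypothetical model of the ioapic state touched by the rescan. */
struct model_ioapic {
    struct model_lock lock;
    int rtc_vector;              /* models redirtbl[8].fields.vector */
    atomic_bool scan_requested;  /* hypothetical per-VM request flag */
    int scans_done;              /* model only: rescans performed */
};

static void model_ioapic_init(struct model_ioapic *io)
{
    atomic_flag_clear(&io->lock.held);
    io->rtc_vector = 0;
    atomic_init(&io->scan_requested, false);
    io->scans_done = 0;
}

/* Write path: runs under the lock and only records that work is due. */
static void model_write_entry(struct model_ioapic *io, int vector)
{
    model_lock_acquire(&io->lock);
    io->rtc_vector = vector;
    atomic_store(&io->scan_requested, true);  /* no nested locking */
    model_lock_release(&io->lock);
}

/* vcpu path: consumes the request later and takes the lock itself. */
static bool model_vcpu_process_requests(struct model_ioapic *io, int *seen)
{
    if (!atomic_exchange(&io->scan_requested, false))
        return false;
    model_lock_acquire(&io->lock);
    *seen = io->rtc_vector;
    io->scans_done++;
    model_lock_release(&io->lock);
    return true;
}
```

Several writes before the vcpu runs collapse into one rescan that sees the latest entry, which is the same property the eoibitmap recalculation relies on.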

>> +irqe.dest_id = entry->fields.dest_id;
>> +irqe.vector = entry->fields.vector;
>> +irqe.dest_mode = entry->fields.dest_mode;
>> +irqe.trig_mode = entry->fields.trig_mode;
>> +irqe.delivery_mode = entry->fields.delivery_mode << 8;
>> +irqe.level = 1;
>> +irqe.shorthand = 0;
>> +
>> +bitmap_zero(ioapic->rtc_status.vcpu_map, KVM_MAX_VCPUS);
>> +
>> +kvm_get_dest_vcpu(ioapic->kvm, &irqe, ioapic->rtc_status.vcpu_map);
>> +spin_unlock(&ioapic->lock);
>> +}
>> +
>> +#else
>> +
>> +static void rtc_irq_get_dest_vcpu(struct kvm_ioapic *ioapic, int irq)
>> +{
>> +return;
>> +}
>> +#endif
>> +
>>  static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
>>  static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
>>  {
>>  union kvm_ioapic_redirect_entry *pent;
>> @@ -147,9 +179,13 @@ void kvm_scan_ioapic_entry(struct kvm *kvm)
>>  {
>>  struct kvm_ioapic *ioapic = kvm->arch.vioapic;
>> -if (!kvm_apic_vid_enabled(kvm) || !ioapic)
>> +if (!ioapic)
>>  return;
>> -kvm_make_update_eoibitmap_request(kvm);
>> +
>> +rtc_irq_get_dest_vcpu(ioapic, 8);
>> +
>> +if (kvm_apic_vid_enabled(kvm))
>> +kvm_make_update_eoibitmap_request(kvm);
>>  }
>>  
>>  static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
>> --
>> 1.7.1
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH v2 6/8] KVM: Add reset/restore rtc_status support

2013-03-20 Thread Zhang, Yang Z
I have sent out the latest patch (v4). Please comment on the latest one,
because some of the issues you pointed out may no longer exist there. If any
still exist, please point them out again.

Zhang, Yang Z wrote on 2013-03-20:
> Marcelo Tosatti wrote on 2013-03-20:
>> On Mon, Mar 18, 2013 at 03:24:37PM +0800, Yang Zhang wrote:
>>> From: Yang Zhang 
>>> 
>>> reset/restore rtc_status when ioapic reset/restore.
>>> 
>>> Signed-off-by: Yang Zhang 
>>> ---
>>>  arch/x86/kvm/lapic.c |8 
>>>  arch/x86/kvm/lapic.h |1 +
>>>  virt/kvm/ioapic.c|   33 +
>>>  3 files changed, 42 insertions(+), 0 deletions(-)
>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>> index 6fb22e3..a223170 100644
>>> --- a/arch/x86/kvm/lapic.c
>>> +++ b/arch/x86/kvm/lapic.c
>>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
>>> return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>  }
>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>>> +{
>>> +   struct kvm_lapic *apic = vcpu->arch.apic;
>>> +
>>> +   return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>>> +   apic_test_vector(vector, apic->regs + APIC_IRR);
>>> +}
>>> +
>> 
>> Should hook into kvm_lapic_reset and kvm_vcpu_ioctl_set_lapic to
>> generate updates.
> rtc_irq_restore will be called after the lapic is restored. What is the
> problem if we do not hook into those two functions?
> 
>>>  static inline void apic_set_vector(int vec, void *bitmap)
>>>  {
>>> set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>> index 3a0f9d8..e2a03d1 100644
>>> --- a/arch/x86/kvm/lapic.h
>>> +++ b/arch/x86/kvm/lapic.h
>>> @@ -160,5 +160,6 @@ void kvm_calculate_eoi_exitmap(struct kvm_vcpu *vcpu,
>>> 
>>>  void kvm_get_dest_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
>>> unsigned long *vcpu_map);
>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>>> 
>>>  #endif
>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>> index 659511d..6266d1f 100644
>>> --- a/virt/kvm/ioapic.c
>>> +++ b/virt/kvm/ioapic.c
>>> @@ -88,6 +88,27 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>>  }
>>>  
>>>  #ifdef CONFIG_X86
>>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>>> +{
>>> +   ioapic->rtc_status.need_eoi = 0;
>>> +   bitmap_zero(ioapic->rtc_status.vcpu_map, KVM_MAX_VCPUS);
>>> +}
>>> +
>>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>>> +{
>>> +   struct kvm_vcpu *vcpu;
>>> +   int vector, i, need_eoi = 0, rtc_pin = 8;
>>> +
>>> +   vector = ioapic->redirtbl[rtc_pin].fields.vector;
>>> +   kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
>>> +   if (kvm_apic_pending_eoi(vcpu, vector)) {
>>> +   need_eoi++;
>>> +   set_bit(vcpu->vcpu_id, ioapic->rtc_status.vcpu_map);
>> 
>> Why set bit on vcpu_map here?
> We set need_eoi here. If the target vcpu is not in vcpu_map, then need_eoi
> will not be updated on EOI.
> 
> Best regards,
> Yang


Best regards,
Yang



RE: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt

2013-03-20 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-20:
> On Tue, Mar 19, 2013 at 12:27:38PM -0300, Marcelo Tosatti wrote:
>> On Tue, Mar 19, 2013 at 12:19:55PM -0300, Marcelo Tosatti wrote:
>>> See the previous argument: should never enter guest mode with PIR ON bit
>>> set. With logic above:
>>> 
>>> context1context2  context3
>>> set_bit(PIR-1)
>>> r = pi_test_and_set_on()
>>> set_bit(PIR-40)
>>> set_bit(KVM_REQ_EVENT)
>>> if (kvm_check_request(KVM_REQ_EVENT)
>>>  if (test_and_clear_bit(on))
>>>    kvm_apic_update_irr()    r = pi_test_and_set_on()
>>> 
>>> guest entry with PIR ON=1
>>> 
>>> 
>>> Thats the reason for unconditional clearing on guest entry: it is easy
>>> to verify its correct. I understand and agree the callback (and VMWRITE)
>>> is not nice.
>> 
>> Re: KVM_REQ_EVENT setting after set_bit(KVM_REQ_EVENT) assures no guest
>> entry with PIR ON=1.
>> 
>> Might be, would have to verify. Its trickier though. Maybe add a FIXME:
>> to the callback and remove it later.
> We have time still. RTC series is not ready yet. I'll think hard and try
> to poke holes in the logic in this patch and you do the same for what I
> propose.
Any thoughts? As far as I can see, both solutions are OK, and it is hard to say
which is better. But clearing the ON bit in sync_pir_irr should be clearer and
closer to the hardware's behavior.
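
A minimal user-space sketch of that ordering, using C11 atomics. The struct layout and the names pi_post() and sync_pir_to_irr() are simplified assumptions for illustration, not the code from the patch: ON is cleared first, then each PIR word is drained into the IRR.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PIR_WORDS 8   /* 256 vectors / 32 bits per word */

/* Hypothetical, simplified posted-interrupt descriptor. */
struct pi_desc {
	_Atomic uint32_t pir[PIR_WORDS]; /* posted interrupt requests */
	atomic_bool on;                  /* outstanding notification */
};

/* Sender side: post a vector, then set ON; returns the previous ON value. */
static bool pi_post(struct pi_desc *pi, unsigned int vector)
{
	assert(vector < PIR_WORDS * 32);
	atomic_fetch_or(&pi->pir[vector / 32], UINT32_C(1) << (vector % 32));
	return atomic_exchange(&pi->on, true);
}

/* Receiver side: clear ON first, then drain PIR into the IRR copy. */
static bool sync_pir_to_irr(struct pi_desc *pi, uint32_t irr[PIR_WORDS])
{
	if (!atomic_exchange(&pi->on, false))
		return false;            /* nothing was posted */
	for (int i = 0; i < PIR_WORDS; i++)
		irr[i] |= atomic_exchange(&pi->pir[i], 0);
	return true;
}
```

Because ON is cleared before the drain, a vector posted concurrently is either picked up by the drain or sets ON again, so the next sync sees it.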

Best regards,
Yang




RE: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt

2013-03-20 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-20:
> On Wed, Mar 20, 2013 at 11:47:49AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-20:
>>> On Tue, Mar 19, 2013 at 12:27:38PM -0300, Marcelo Tosatti wrote:
>>>> On Tue, Mar 19, 2013 at 12:19:55PM -0300, Marcelo Tosatti wrote:
>>>>> See the previous argument: should never enter guest mode with PIR ON
>>>>> bit set. With logic above:
>>>>> 
>>>>> context1  context2  context3
>>>>>   set_bit(PIR-1)
>>>>>   r = pi_test_and_set_on()
>>>>> set_bit(PIR-40)
>>>>>   set_bit(KVM_REQ_EVENT)
>>>>> if (kvm_check_request(KVM_REQ_EVENT)
>>>>>  if (test_and_clear_bit(on))
>>>>>    kvm_apic_update_irr()  r = pi_test_and_set_on()
>>>>> 
>>>>> guest entry with PIR ON=1
>>>>> 
>>>>> 
>>>>> Thats the reason for unconditional clearing on guest entry: it is easy
>>>>> to verify its correct. I understand and agree the callback (and VMWRITE)
>>>>> is not nice.
>>>> 
>>>> Re: KVM_REQ_EVENT setting after set_bit(KVM_REQ_EVENT) assures no
>>>> guest entry with PIR ON=1.
>>>> 
>>>> Might be, would have to verify. Its trickier though. Maybe add a FIXME:
>>>> to the callback and remove it later.
>>> We have time still. RTC series is not ready yet. I'll think hard and try
>>> to poke holes in the logic in this patch and you do the same for what I
>>> propose.
>> Any thought? As far as I see, the two solutions are ok. It's hard to say
>> which is better. But clear ON bit when sync_pir_irr should be more clear and
>> close to hardware's behavior.
>>
> Lets go with it unless we see why it will not work.
Sure.

Best regards,
Yang




RE: [PATCH v4 5/7] KVM: Recalculate destination vcpu map

2013-03-20 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-20:
> On Wed, Mar 20, 2013 at 07:36:17PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Update RTC interrupt's destination vcpu map when ioapic entry of RTC
>> or apic register (id, ldr, dfr) is changed.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  virt/kvm/ioapic.c |9 +++--
>>  1 files changed, 7 insertions(+), 2 deletions(-)
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index ddf9414..91b4c08 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -121,6 +121,7 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
>>  {
>>  struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
>>  union kvm_ioapic_redirect_entry *e;
>> + unsigned long *rtc_map = ioapic->rtc_status.vcpu_map;
>>  struct kvm_lapic_irq irqe;
>>  int index;
>> @@ -130,15 +131,19 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
>>  if (!e->fields.mask &&
>>  (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
>>   kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
>> - index))) {
>> + index) || index == 8)) {
>>  irqe.dest_id = e->fields.dest_id;
>>  irqe.vector = e->fields.vector;
>>  irqe.dest_mode = e->fields.dest_mode;
>>  irqe.shorthand = 0;
>>  
>>  if (kvm_apic_match_dest(vcpu, NULL, irqe.shorthand,
>> -irqe.dest_id, irqe.dest_mode))
>> +irqe.dest_id, irqe.dest_mode)) {
>>  __set_bit(irqe.vector, eoi_exit_bitmap);
>> +if (index == 8)
>> +__set_bit(vcpu->vcpu_id, rtc_map);
>> +} else if (index == 8)
>> +__clear_bit(vcpu->vcpu_id, rtc_map);
> rtc_map bitmap is accessed from different vcpus simultaneously so access
> has to be atomic. We also have a race:
> 
> vcpu0   iothread
> ioapic config changes
> request scan ioapic
>  inject rtc interrupt
>  use old vcpu mask
> scan_ioapic()
> recalculate vcpu mask
> 
> So this approach (suggested by me :() will not work.
> 
> Need to think about it some more. May be your idea of building a bitmap
> while injecting the interrupt is the way to go indeed: pass a pointer to
> a bitmap to kvm_irq_delivery_to_apic() and build it there. Pass NULL
> pointer if caller does not need to track vcpus.
Or we can block RTC interrupt injection while the vcpu map is being recalculated:

if (need_eoi > 0 && in_recalculating)
	return coalesced

Best regards,
Yang




RE: [PATCH v4 5/7] KVM: Recalculate destination vcpu map

2013-03-20 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-21:
> On Thu, Mar 21, 2013 at 03:42:46AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-20:
>>> On Wed, Mar 20, 2013 at 07:36:17PM +0800, Yang Zhang wrote:
>>>> From: Yang Zhang 
>>>> 
>>>> Update RTC interrupt's destination vcpu map when ioapic entry of RTC
>>>> or apic register (id, ldr, dfr) is changed.
>>>> 
>>>> Signed-off-by: Yang Zhang 
>>>> ---
>>>>  virt/kvm/ioapic.c |9 +++--
>>>>  1 files changed, 7 insertions(+), 2 deletions(-)
>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>> index ddf9414..91b4c08 100644
>>>> --- a/virt/kvm/ioapic.c
>>>> +++ b/virt/kvm/ioapic.c
>>>> @@ -121,6 +121,7 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
>>>>  {
>>>>  struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
>>>>  union kvm_ioapic_redirect_entry *e;
>>>> + unsigned long *rtc_map = ioapic->rtc_status.vcpu_map;
>>>>  struct kvm_lapic_irq irqe;
>>>>  int index;
>>>> @@ -130,15 +131,19 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
>>>>if (!e->fields.mask &&
>>>>(e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
>>>> kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
>>>> -   index))) {
>>>> +   index) || index == 8)) {
>>>>irqe.dest_id = e->fields.dest_id;
>>>>irqe.vector = e->fields.vector;
>>>>irqe.dest_mode = e->fields.dest_mode;
>>>>irqe.shorthand = 0;
>>>>  
>>>>if (kvm_apic_match_dest(vcpu, NULL, irqe.shorthand,
>>>> -  irqe.dest_id, irqe.dest_mode))
>>>> +  irqe.dest_id, irqe.dest_mode)) {
>>>>__set_bit(irqe.vector, eoi_exit_bitmap);
>>>> +  if (index == 8)
>>>> +  __set_bit(vcpu->vcpu_id, rtc_map);
>>>> +  } else if (index == 8)
>>>> +  __clear_bit(vcpu->vcpu_id, rtc_map);
>>> rtc_map bitmap is accessed from different vcpus simultaneously so access
>>> has to be atomic. We also have a race:
>>> 
>>> vcpu0   iothread
>>> ioapic config changes
>>> request scan ioapic
>>>  inject rtc interrupt
>>>  use old vcpu mask
>>> scan_ioapic()
>>> recalculate vcpu mask
>>> 
>>> So this approach (suggested by me :() will not work.
>>> 
>>> Need to think about it some more. May be your idea of building a bitmap
>>> while injecting the interrupt is the way to go indeed: pass a pointer to
>>> a bitmap to kvm_irq_delivery_to_apic() and build it there. Pass NULL
>>> pointer if caller does not need to track vcpus.
>> Or, we can block inject rtc interrupt during recalculate vcpu map.
>> 
>> if(need_eoi > 0 && in_recalculating)
>> return coalesced
>> 
> This should be ||. Then you need to maintain in_recalculating and
> recalculations requests may overlap. Too complex and fragile.
It should not be too complex. How about the following logic?

When making the scan ioapic request:
kvm_vcpu_scan_ioapic()
{
	kvm_for_each_vcpu()
		in_recalculating++;
}

Then in each vcpu's request handler:
vcpu_scan_ioapic()
{
	in_recalculating--;
}

And when delivering the RTC interrupt:
if (need_eoi > 0 || in_recalculating)
	return coalesced


> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH v4 5/7] KVM: Recalculate destination vcpu map

2013-03-20 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-21:
> On Thu, Mar 21, 2013 at 05:30:32AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-21:
>>> On Thu, Mar 21, 2013 at 03:42:46AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2013-03-20:
>>>>> On Wed, Mar 20, 2013 at 07:36:17PM +0800, Yang Zhang wrote:
>>>>>> From: Yang Zhang 
>>>>>> 
>>>>>> Update RTC interrupt's destination vcpu map when ioapic entry of RTC
>>>>>> or apic register (id, ldr, dfr) is changed.
>>>>>> 
>>>>>> Signed-off-by: Yang Zhang 
>>>>>> ---
>>>>>>  virt/kvm/ioapic.c |9 +++--
>>>>>>  1 files changed, 7 insertions(+), 2 deletions(-)
>>>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>>>> index ddf9414..91b4c08 100644
>>>>>> --- a/virt/kvm/ioapic.c
>>>>>> +++ b/virt/kvm/ioapic.c
>>>>>> @@ -121,6 +121,7 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
>>>>>>  {
>>>>>>  struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
>>>>>>  union kvm_ioapic_redirect_entry *e;
>>>>>> + unsigned long *rtc_map = ioapic->rtc_status.vcpu_map;
>>>>>>  struct kvm_lapic_irq irqe;
>>>>>>  int index;
>>>>>> @@ -130,15 +131,19 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
>>>>>>  if (!e->fields.mask &&
>>>>>>  (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
>>>>>>   kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
>>>>>> - index))) {
>>>>>> + index) || index == 8)) {
>>>>>>  irqe.dest_id = e->fields.dest_id;
>>>>>>  irqe.vector = e->fields.vector;
>>>>>>  irqe.dest_mode = e->fields.dest_mode;
>>>>>>  irqe.shorthand = 0;
>>>>>>  
>>>>>>  if (kvm_apic_match_dest(vcpu, NULL, irqe.shorthand,
>>>>>> -irqe.dest_id, irqe.dest_mode))
>>>>>> +irqe.dest_id, irqe.dest_mode)) {
>>>>>>  __set_bit(irqe.vector, eoi_exit_bitmap);
>>>>>> +if (index == 8)
>>>>>> +__set_bit(vcpu->vcpu_id, rtc_map);
>>>>>> +} else if (index == 8)
>>>>>> +__clear_bit(vcpu->vcpu_id, rtc_map);
>>>>> rtc_map bitmap is accessed from different vcpus simultaneously so access
>>>>> has to be atomic. We also have a race:
>>>>> 
>>>>> vcpu0   iothread
>>>>> ioapic config changes
>>>>> request scan ioapic
>>>>>  inject rtc interrupt
>>>>>  use old vcpu mask
>>>>> scan_ioapic()
>>>>> recalculate vcpu mask
>>>>> 
>>>>> So this approach (suggested by me :() will not work.
>>>>> 
>>>>> Need to think about it some more. May be your idea of building a bitmap
>>>>> while injecting the interrupt is the way to go indeed: pass a pointer to
>>>>> a bitmap to kvm_irq_delivery_to_apic() and build it there. Pass NULL
>>>>> pointer if caller does not need to track vcpus.
>>>> Or, we can block inject rtc interrupt during recalculate vcpu map.
>>>> 
>>>> if(need_eoi > 0 && in_recalculating)
>>>> return coalesced
>>>> 
>>> This should be ||. Then you need to maintain in_recalculating and
>>> recalculations requests may overlap. Too complex and fragile.
>> It should not be too complex. How about the following logic?
>> 
>> when make scan ioapic request:
>> kvm_vcpu_scan_ioapic()
>> {
>> kvm_for_each_vcpu()
>>  in_recalculating++;
>> }
>> 
>> Then on each vcpu's request handler:
>> vcpu_scan_ioapic()
>> {
>> in_recalculating--;
>> }
>> 
> kvm_vcpu_scan_ioapic() can be called more often than vcpu_scan_ioapic()
OK, I see your point. Maybe we need to roll back to the old idea.

Can you pick up the first two patches? If we roll back to the old way, that
code will not be touched.
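
For the record, a small sketch of why the counter goes wrong. Everything here is a stand-in: scan_requested[] models a per-vcpu request bit, and request_scan_all() / handle_scan() are hypothetical names for the requester and the per-vcpu handler. Two requests issued back to back merge into one pending bit per vcpu, so the counter is bumped twice but dropped only once.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_VCPUS 2                    /* hypothetical VM size */

static bool scan_requested[NR_VCPUS]; /* stands in for a per-vcpu request bit */
static int in_recalculating;          /* the proposed counter */

/* Requester: flag every vcpu and bump the counter once per vcpu. */
static void request_scan_all(void)
{
	for (int i = 0; i < NR_VCPUS; i++) {
		scan_requested[i] = true; /* setting an already-set bit is a no-op */
		in_recalculating++;
	}
}

/* Per-vcpu handler: runs once however many requests were merged. */
static void handle_scan(int vcpu)
{
	assert(vcpu >= 0 && vcpu < NR_VCPUS);
	if (!scan_requested[vcpu])
		return;
	scan_requested[vcpu] = false;
	in_recalculating--;
}

/* Proposed gate: treat the RTC interrupt as coalesced while busy. */
static bool rtc_irq_coalesced(int need_eoi)
{
	return need_eoi > 0 || in_recalculating;
}
```

Calling request_scan_all() twice and then handle_scan() once per vcpu leaves in_recalculating at NR_VCPUS, so in this model the gate would report every later RTC interrupt as coalesced.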

> 
>> And when delivering RTC interrupt:
>> if(need_eoi > 0 || in_recalculating)
>>  return coalesced
>> 
>> 
>>> 
>>> --
>>> Gleb.
>> 
>> 
>> Best regards,
>> Yang
>> 
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH v4 5/7] KVM: Recalculate destination vcpu map

2013-03-21 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-21:
> On Thu, Mar 21, 2013 at 05:39:46AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-21:
>>> On Thu, Mar 21, 2013 at 05:30:32AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2013-03-21:
>>>>> On Thu, Mar 21, 2013 at 03:42:46AM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2013-03-20:
>>>>>>> On Wed, Mar 20, 2013 at 07:36:17PM +0800, Yang Zhang wrote:
>>>>>>>> From: Yang Zhang 
>>>>>>>> 
>>>>>>>> Update RTC interrupt's destination vcpu map when ioapic entry of RTC
>>>>>>>> or apic register (id, ldr, dfr) is changed.
>>>>>>>> 
>>>>>>>> Signed-off-by: Yang Zhang 
>>>>>>>> ---
>>>>>>>>  virt/kvm/ioapic.c |9 +++--
>>>>>>>>  1 files changed, 7 insertions(+), 2 deletions(-)
>>>>>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>>>>>> index ddf9414..91b4c08 100644
>>>>>>>> --- a/virt/kvm/ioapic.c
>>>>>>>> +++ b/virt/kvm/ioapic.c
>>>>>>>> @@ -121,6 +121,7 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
>>>>>>>>  {
>>>>>>>>  struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
>>>>>>>>  union kvm_ioapic_redirect_entry *e;
>>>>>>>> + unsigned long *rtc_map = ioapic->rtc_status.vcpu_map;
>>>>>>>>  struct kvm_lapic_irq irqe;
>>>>>>>>  int index;
>>>>>>>> @@ -130,15 +131,19 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
>>>>>>>>  if (!e->fields.mask &&
>>>>>>>>  (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
>>>>>>>>   kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
>>>>>>>> - index))) {
>>>>>>>> + index) || index == 8)) {
>>>>>>>>  irqe.dest_id = e->fields.dest_id;
>>>>>>>>  irqe.vector = e->fields.vector;
>>>>>>>>  irqe.dest_mode = e->fields.dest_mode;
>>>>>>>>  irqe.shorthand = 0;
>>>>>>>>  
>>>>>>>>  if (kvm_apic_match_dest(vcpu, NULL, irqe.shorthand,
>>>>>>>> -irqe.dest_id, irqe.dest_mode))
>>>>>>>> +irqe.dest_id, irqe.dest_mode)) {
>>>>>>>>  __set_bit(irqe.vector, eoi_exit_bitmap);
>>>>>>>> +if (index == 8)
>>>>>>>> +__set_bit(vcpu->vcpu_id, rtc_map);
>>>>>>>> +} else if (index == 8)
>>>>>>>> +__clear_bit(vcpu->vcpu_id, rtc_map);
>>>>>>> rtc_map bitmap is accessed from different vcpus simultaneously so
>>>>>>> access has to be atomic. We also have a race:
>>>>>>> 
>>>>>>> vcpu0   iothread
>>>>>>> ioapic config changes
>>>>>>> request scan ioapic
>>>>>>>  inject rtc interrupt
>>>>>>>  use old vcpu mask
>>>>>>> scan_ioapic()
>>>>>>> recalculate vcpu mask
>>>>>>> 
>>>>>>> So this approach (suggested by me :() will not work.
>>>>>>> 
>>>>>>> Need to think about it some more. May be your idea of building a
>>>>>>> bitmap while injecting the interrupt is the way to go indeed: pass
>>>>>>> a pointer to a bitmap to kvm_irq_delivery_to_apic() and build it
>>>>>>> there. Pass NULL pointer if caller does not need to track vcpus.
>>>>>> Or, we can block inject rtc interrupt during recalculate vcpu map.
>>>>>> 
>>>>>> if(need_eoi > 0 && in_recalculating)
>>>>>> return coalesced
>>>>>> 
>>>>> This should be ||. Then you need to maintain in_recalculating and
>>>>> recalculations requests may overlap. Too complex and fragile.
>>>> It should not be too complex. How about the following logic?
>>>> 
>>>> when make scan ioapic request:
>>>> kvm_vcpu_scan_ioapic()
>>>> {
>>>> kvm_for_each_vcpu()
>>>>in_recalculating++;
>>>> }
>>>> 
>>>> Then on each vcpu's request handler:
>>>> vcpu_scan_ioapic()
>>>> {
>>>> in_recalculating--;
>>>> }
>>>> 
>>> kvm_vcpu_scan_ioapic() can be called more often than vcpu_scan_ioapic()
>> Ok. I see your point. Maybe we need to rollback to old idea.
>> 
>> Can you pick the first two patches? If rollback to old way, it will not
>> touch those code.
>> 
> First patch is great, but drop no longer needed irqe there. I do not see
> the point of the second patch if the map will be built during injection.
Sure. I will resend the first patch.
We also need to rebuild the TMR when an ioapic entry changes, so the second
patch will be needed at that point. But it is OK to send it with the APICv patch.

Best regards,
Yang




RE: [PATCH v4 5/7] KVM: Recalculate destination vcpu map

2013-03-21 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-20:
> On Wed, Mar 20, 2013 at 07:36:17PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Update RTC interrupt's destination vcpu map when ioapic entry of RTC
>> or apic register (id, ldr, dfr) is changed.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  virt/kvm/ioapic.c |9 +++--
>>  1 files changed, 7 insertions(+), 2 deletions(-)
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index ddf9414..91b4c08 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -121,6 +121,7 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
>>  {
>>  struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
>>  union kvm_ioapic_redirect_entry *e;
>> + unsigned long *rtc_map = ioapic->rtc_status.vcpu_map;
>>  struct kvm_lapic_irq irqe;
>>  int index;
>> @@ -130,15 +131,19 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
>>  if (!e->fields.mask &&
>>  (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
>>   kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
>> - index))) {
>> + index) || index == 8)) {
>>  irqe.dest_id = e->fields.dest_id;
>>  irqe.vector = e->fields.vector;
>>  irqe.dest_mode = e->fields.dest_mode;
>>  irqe.shorthand = 0;
>>  
>>  if (kvm_apic_match_dest(vcpu, NULL, irqe.shorthand,
>> -irqe.dest_id, irqe.dest_mode))
>> +irqe.dest_id, irqe.dest_mode)) {
>>  __set_bit(irqe.vector, eoi_exit_bitmap);
>> +if (index == 8)
>> +__set_bit(vcpu->vcpu_id, rtc_map);
>> +} else if (index == 8)
>> +__clear_bit(vcpu->vcpu_id, rtc_map);
> rtc_map bitmap is accessed from different vcpus simultaneously so access
> has to be atomic. We also have a race:
> 
> vcpu0   iothread
> ioapic config changes
> request scan ioapic
>  inject rtc interrupt
>  use old vcpu mask
> scan_ioapic()
> recalculate vcpu mask
> 
> So this approach (suggested by me :() will not work.
> 
> Need to think about it some more. May be your idea of building a bitmap
> while injecting the interrupt is the way to go indeed: pass a pointer to
> a bitmap to kvm_irq_delivery_to_apic() and build it there. Pass NULL
> pointer if caller does not need to track vcpus.
How about building it in kvm_apic_set_irq()? That should be more straightforward.

Best regards,
Yang



RE: [PATCH v5 3/6] KVM : Calculate destination vcpu on interrupt injection

2013-03-21 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-21:
> On Thu, Mar 21, 2013 at 06:49:21PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Add a new parameter to know vcpus who received the interrupt.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |   21 -
>>  arch/x86/kvm/lapic.h |5 +++--
>>  virt/kvm/ioapic.c|2 +-
>>  virt/kvm/ioapic.h|2 +-
>>  virt/kvm/irq_comm.c  |   12 ++--
>>  5 files changed, 27 insertions(+), 15 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index d3e322a..5f6b1d0 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -433,10 +433,21 @@ int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu)
>>  static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>   int vector, int level, int trig_mode);
>> -int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq)
>> +static void kvm_set_irq_dest_map(struct kvm_vcpu *vcpu, unsigned long *dest_map)
>> +{
>> +if (!kvm_lapic_enabled(vcpu))
>> +return;
> Why this check here?
A vcpu that has not enabled its apic should not be counted as a destination
vcpu. Without this check, a broadcast interrupt would mark every vcpu as a
destination, but only those with the apic enabled will actually receive the
interrupt.
There is the same check in __apic_accept_irq():
if (unlikely(!apic_enabled(apic)))
	break;
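
Roughly what I have in mind, as a stand-alone sketch. The struct vcpu, apic_set_irq() and deliver_broadcast() here are simplified stand-ins and not the kernel functions: the map is optional (NULL when the caller does not track vcpus) and only vcpus with the apic enabled are recorded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_VCPUS 4                   /* hypothetical VM size */

struct vcpu {
	unsigned int id;
	bool apic_enabled;
};

/* Deliver to one vcpu; returns 1 if accepted, 0 if the apic is disabled. */
static int apic_set_irq(const struct vcpu *v, uint64_t *dest_map)
{
	assert(v->id < 64);
	if (!v->apic_enabled)
		return 0;            /* not a real destination */
	if (dest_map)                /* NULL: caller does not track vcpus */
		*dest_map |= UINT64_C(1) << v->id;
	return 1;
}

/* Broadcast: every vcpu is addressed, only enabled ones are recorded. */
static int deliver_broadcast(const struct vcpu *vcpus, size_t n,
			     uint64_t *dest_map)
{
	int accepted = 0;

	for (size_t i = 0; i < n; i++)
		accepted += apic_set_irq(&vcpus[i], dest_map);
	return accepted;
}
```

With vcpus 0 and 2 enabled out of four, a broadcast records only bits 0 and 2, which matches the number of EOIs that can actually arrive.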

>> +__set_bit(vcpu->vcpu_id, dest_map);
>> +}
>> +
>> +int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
>> +unsigned long *dest_map)
>>  {
>>  struct kvm_lapic *apic = vcpu->arch.apic;
>> +if (dest_map)
>> +kvm_set_irq_dest_map(vcpu, dest_map);
>> +
>>  return __apic_accept_irq(apic, irq->delivery_mode, irq->vector,
>>  irq->level, irq->trig_mode);
>>  }
>> @@ -611,7 +622,7 @@ int kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
>>  }
>>  
>>  bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
>> -struct kvm_lapic_irq *irq, int *r)
>> +struct kvm_lapic_irq *irq, int *r, unsigned long *dest_map)
>>  {
>>  struct kvm_apic_map *map;
>>  unsigned long bitmap = 1;
>> @@ -622,7 +633,7 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
>>  *r = -1;
>>  
>>  if (irq->shorthand == APIC_DEST_SELF) {
>> -*r = kvm_apic_set_irq(src->vcpu, irq);
>> +*r = kvm_apic_set_irq(src->vcpu, irq, dest_map);
>>  return true;
>>  }
>> @@ -667,7 +678,7 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
>>  continue;
>>  if (*r < 0)
>>  *r = 0;
>> -*r += kvm_apic_set_irq(dst[i]->vcpu, irq);
>> +*r += kvm_apic_set_irq(dst[i]->vcpu, irq, dest_map);
>>  }
>>  
>>  ret = true;
>> @@ -852,7 +863,7 @@ static void apic_send_ipi(struct kvm_lapic *apic)
>>  irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode,
>>  irq.vector);
>> -kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq);
>> +kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
>>  }
>>  
>>  static u32 apic_get_tmcct(struct kvm_lapic *apic)
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index 2c721b9..967519c 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -55,11 +55,12 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu);
>> 
>>  int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest);
>>  int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda);
>> -int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
>> +int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
>> +unsigned long *dest_map);
>>  int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type);
>>  
>>  bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
>> -struct kvm_lapic_irq *irq, int *r);
>> +struct kvm_lapic_irq *irq, int *r, unsigned long *dest_map);
>> 
>>  u64 kvm_get_apic_base(struct kvm_vcpu *vcpu);
>>  void kvm_set_apic_base(struct kvm_vcpu *vcpu, u64 data);
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index ed6f111..4767fa6 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -217,7 +217,7 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
>>  irqe.level = 1;
>>  irqe.shorthand = 0;
>> -return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe);
>> +return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>>  }
>>  
>>  int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
>> diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
>> index 6e5c88f..14e5289 100644
>> --- a/virt/kvm/ioapic.h
>> +++ b/virt/kvm/ioapic.h
>> @@ -88,7 +88,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapi

RE: [PATCH v5 3/6] KVM : Calculate destination vcpu on interrupt injection

2013-03-21 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-21:
> On Thu, Mar 21, 2013 at 11:56:05AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-21:
>>> On Thu, Mar 21, 2013 at 06:49:21PM +0800, Yang Zhang wrote:
>>>> From: Yang Zhang 
>>>> 
>>>> Add a new parameter to know vcpus who received the interrupt.
>>>> 
>>>> Signed-off-by: Yang Zhang 
>>>> ---
>>>>  arch/x86/kvm/lapic.c |   21 -
>>>>  arch/x86/kvm/lapic.h |5 +++--
>>>>  virt/kvm/ioapic.c|2 +-
>>>>  virt/kvm/ioapic.h|2 +-
>>>>  virt/kvm/irq_comm.c  |   12 ++--
>>>>  5 files changed, 27 insertions(+), 15 deletions(-)
>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>> index d3e322a..5f6b1d0 100644
>>>> --- a/arch/x86/kvm/lapic.c
>>>> +++ b/arch/x86/kvm/lapic.c
>>>> @@ -433,10 +433,21 @@ int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu)
>>>>  static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>>>   int vector, int level, int trig_mode);
>>>> -int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq)
>>>> +static void kvm_set_irq_dest_map(struct kvm_vcpu *vcpu, unsigned long *dest_map)
>>>> +{
>>>> +if (!kvm_lapic_enabled(vcpu))
>>>> +return;
>>> Why this check here?
>> The vcpu who didn't enable apic should not account as destination vcpu.
>> Without this check, if broadcast interrupt, all cpus will treat as
>> destination vcpu, but only those who enabled apic will receive the
>> interrupt. There are same check in __apic_accept_irq(): if
>> (unlikely(!apic_enabled(apic)))
>>  break;
> I see, but you use more strict check that also checks that apic is
> emulated by the kernel and we wouldn't be here if it wasn't. Anyway lets
Do you mean the check added here will block a "userspace apic"? Shouldn't only
the in-kernel apic get here?

> move bitmap update into __apic_accept_irq().
Sure.

> 
>> 
>>>> +  __set_bit(vcpu->vcpu_id, dest_map);
>>>> +}
>>>> +
>>>> +int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
>>>> +  unsigned long *dest_map)
>>>>  {
>>>>struct kvm_lapic *apic = vcpu->arch.apic;
>>>> +  if (dest_map)
>>>> +  kvm_set_irq_dest_map(vcpu, dest_map);
>>>> +
>>>>return __apic_accept_irq(apic, irq->delivery_mode, irq->vector,
>>>>irq->level, irq->trig_mode);
>>>>  }
>>>> @@ -611,7 +622,7 @@ int kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
>>>>  }
>>>>  
>>>>  bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
>>>> -  struct kvm_lapic_irq *irq, int *r)
>>>> +  struct kvm_lapic_irq *irq, int *r, unsigned long *dest_map)
>>>>  {
>>>>struct kvm_apic_map *map;
>>>>unsigned long bitmap = 1;
>>>> @@ -622,7 +633,7 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
>>>>*r = -1;
>>>>  
>>>>if (irq->shorthand == APIC_DEST_SELF) {
>>>> -  *r = kvm_apic_set_irq(src->vcpu, irq);
>>>> +  *r = kvm_apic_set_irq(src->vcpu, irq, dest_map);
>>>>return true;
>>>>}
>>>> @@ -667,7 +678,7 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
>>>>continue;
>>>>if (*r < 0)
>>>>*r = 0;
>>>> -  *r += kvm_apic_set_irq(dst[i]->vcpu, irq);
>>>> +  *r += kvm_apic_set_irq(dst[i]->vcpu, irq, dest_map);
>>>>}
>>>>  
>>>>ret = true;
>>>> @@ -852,7 +863,7 @@ static void apic_send_ipi(struct kvm_lapic *apic)
>>>>  irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode,
>>>>  irq.vector);
>>>> -  kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq);
>>>> +  kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
>>>>  }
>>>>  
>>>>  static u32 apic_get_tmcct(struct kvm_lapic *apic)
>>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>>> index 2c721b9..967519c 10

RE: [PATCH v5 2/6] KVM: Introduce struct rtc_status

2013-03-21 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-21:
> On Thu, Mar 21, 2013 at 06:49:20PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  virt/kvm/ioapic.h |8 
>>  1 files changed, 8 insertions(+), 0 deletions(-)
>> diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
>> index 2fc61a5..6e5c88f 100644
>> --- a/virt/kvm/ioapic.h
>> +++ b/virt/kvm/ioapic.h
>> @@ -34,6 +34,11 @@ struct kvm_vcpu;
>>  #define IOAPIC_INIT 0x5
>>  #define IOAPIC_EXTINT   0x7
>> +struct rtc_status {
>> +int pending_eoi;
>> +DECLARE_BITMAP(dest_map, KVM_MAX_VCPUS);
>> +};
>> +
>>  struct kvm_ioapic {
>>  u64 base_address;
>>  u32 ioregsel;
>> @@ -47,6 +52,9 @@ struct kvm_ioapic {
>>  void (*ack_notifier)(void *opaque, int irq);
>>  spinlock_t lock;
>>  DECLARE_BITMAP(handled_vectors, 256);
>> +#ifdef CONFIG_X86
>> +struct rtc_status rtc_status;
>> +#endif
> IA64 KVM is almost dead, but we still add CONFIG_X86 everywhere in these
> patches. Lets drop all CONFIG_X86 throughout the patches and instead leave
> only one:
> #ifdef CONFIG_X86
> #define RTC_GSI 8
> #else
> #define RTC_GSI 255
> #endif
> 
> Then use RTC_GSI instead of 8 everywhere and the code will be effectively
> disabled on IA64.
Nice idea!
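
Just to check I understand it, a small sketch. IOAPIC_NUM_PINS is assumed to be 24 here and is_rtc_irq() is a made-up helper name; the idea is that 255 is larger than any valid pin, so every RTC_GSI comparison is false when CONFIG_X86 is not defined.

```c
#include <assert.h>
#include <stdbool.h>

#define IOAPIC_NUM_PINS 24   /* assumed pin count for this sketch */

#ifdef CONFIG_X86
#define RTC_GSI 8
#else
#define RTC_GSI 255          /* larger than any pin: never matches */
#endif

/* True only when irq is a valid pin and is the RTC line. */
static bool is_rtc_irq(int irq)
{
	assert(irq >= 0);
	return irq < IOAPIC_NUM_PINS && irq == RTC_GSI;
}
```

So the RTC paths can stay unconditional in the source and are simply never taken on IA64.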

>>  };
>>  
>>  #ifdef DEBUG
>> --
>> 1.7.1
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH v5 3/6] KVM : Calculate destination vcpu on interrupt injection

2013-03-21 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-21:
> On Thu, Mar 21, 2013 at 12:12:06PM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-21:
>>> On Thu, Mar 21, 2013 at 11:56:05AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2013-03-21:
>>>>> On Thu, Mar 21, 2013 at 06:49:21PM +0800, Yang Zhang wrote:
>>>>>> From: Yang Zhang 
>>>>>> 
>>>>>> Add a new parameter to know vcpus who received the interrupt.
>>>>>> 
>>>>>> Signed-off-by: Yang Zhang 
>>>>>> ---
>>>>>>  arch/x86/kvm/lapic.c |   21 -
>>>>>>  arch/x86/kvm/lapic.h |5 +++--
>>>>>>  virt/kvm/ioapic.c|2 +-
>>>>>>  virt/kvm/ioapic.h|2 +-
>>>>>>  virt/kvm/irq_comm.c  |   12 ++--
>>>>>>  5 files changed, 27 insertions(+), 15 deletions(-)
>>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>>>> index d3e322a..5f6b1d0 100644
>>>>>> --- a/arch/x86/kvm/lapic.c
>>>>>> +++ b/arch/x86/kvm/lapic.c
>>>>>> @@ -433,10 +433,21 @@ int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu)
>>>>>>  static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>>>>>   int vector, int level, int trig_mode);
>>>>>> -int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq)
>>>>>> +static void kvm_set_irq_dest_map(struct kvm_vcpu *vcpu, unsigned long *dest_map)
>>>>>> +{
>>>>>> +if (!kvm_lapic_enabled(vcpu))
>>>>>> +return;
>>>>> Why this check here?
>>>> The vcpu who didn't enable apic should not account as destination vcpu.
>>>> Without this check, if broadcast interrupt, all cpus will treat as
>>>> destination vcpu, but only those who enabled apic will receive the
>>>> interrupt. There are same check in __apic_accept_irq(): if
>>>> (unlikely(!apic_enabled(apic)))
>>>>  break;
>>> I see, but you use more strict check that also checks that apic is
>>> emulated by the kernel and we wouldn't be here if it wasn't. Anyway lets
>> Do you mean the check add in here will block "userspace apic"?
>> Shouldn't only in-kernel apic will get here?
>> 
> No, it will not block. It checks for in kernel apic needlessly. Since we
> patch all those checks out anyway using jump labels it is not really
> affects performance, but I prefer to make only necessary checks for
> consistency.
Makes sense.
 
>>> move bitmap update into __apic_accept_irq().
>> Sure.
>> 
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH] KVM: x86: Avoid busy loops over uninjectable pending APIC timers

2013-03-21 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-21:
> On Thu, Mar 21, 2013 at 11:02:24AM -0300, Marcelo Tosatti wrote:
>> On Thu, Mar 21, 2013 at 06:54:46AM +0200, Gleb Natapov wrote:
>>> On Wed, Mar 20, 2013 at 08:19:13PM -0300, Marcelo Tosatti wrote:
>>>> On Wed, Mar 20, 2013 at 11:32:38PM +0200, Gleb Natapov wrote:
>>>>> On Wed, Mar 20, 2013 at 05:03:19PM -0300, Marcelo Tosatti wrote:
>>>>>> On Wed, Mar 20, 2013 at 04:30:33PM -0300, Marcelo Tosatti wrote:
>>>>>>> On Sun, Mar 17, 2013 at 12:47:17PM +0200, Gleb Natapov wrote:
>>>>>>>> On Sun, Mar 17, 2013 at 11:45:34AM +0100, Jan Kiszka wrote:
>>>>>>>>> On 2013-03-17 09:47, Gleb Natapov wrote:
>>>>>>>>>> On Sat, Mar 16, 2013 at 09:49:07PM +0100, Jan Kiszka wrote:
>>>>>>>>>>> From: Jan Kiszka 
>>>>>>>>>>> 
>>>>>>>>>>> If the guest didn't take the last APIC timer interrupt yet and
>>>>>>>>>>> generates another one on top, e.g. via periodic mode, we do
>>>>>>>>>>> not block the VCPU even if the guest state is halted. The
>>>>>>>>>>> reason is that apic_has_pending_timer continues to return a
>>>>>>>>>>> non-zero value.
>>>>>>>>>>> 
>>>>>>>>>>> Fix this busy loop by taking the IRR content for the LVT vector in
>>>>>>>>>>> apic_has_pending_timer into account.
>>>>>>>>>>> 
>>>>>>>>>> Just drop coalescing tracking for lapic interrupt. After posted
>>>>>>>>>> interrupt will be merged __apic_accept_irq() will no longer
>>>>>>>>>> return coalescing information, so the code will be dead anyway.
>>>>>>>>> 
>>>>>>>>> That requires the RTC decoalescing series to go first to avoid a
>>>>>>>>> regression, no? Then let's postpone this topic for now.
>>>>>>>>> 
>>>>>>>> Yes, but decoalescing will work only for RTC :(
>>>>>>> 
>>>>>>> Are you proposing to drop LAPIC interrupt reinjection?
>>>>>> 
>>>>>> Since timer handling and injection is VCPU-local for LAPIC,
>>>>>> __apic_accept_irq can (and must) return coalesced information (cannot
>>>>>> drop LAPIC interrupt reinjection).
>>>>>> 
>>>>> Why can't we drop LAPIC interrupt reinjection? Proposed posted
>>>>> interrupt patches do not properly check for interrupt coalescing
>>>>> even for VCPU-local injection.
>>>>> 
>>>>> --
>>>>>   Gleb.
>>>> 
>>>> Because older Linux guests depend on reinjection for proper timekeeping.
>>> Which versions? Those without kvmclock? Can we make them use PIT
>>> instead? Posted interrupts going to break them.
>> 
>> There is no reason to break them if its OK to receive reinjection info
>> from LAPIC... its a matter of returning the information from
>> apic_accept_irq, no big deal.
>> 
> But current PI patches do break them, thats my point. So we either
> need to revise them again, or drop LAPIC timer reinjection. Making
> apic_accept_irq semantics "it returns coalescing info, but only sometimes"
> is dubious though.
We could go back to the initial idea: test both irr and pir to get the
coalescing info. In that case, the LAPIC timer is always injected in vcpu
context, so apic_accept_irq() will return the right coalescing info.
We also need to add a comment telling callers that apic_accept_irq() can
guarantee a correct return value only when the caller is in the target vcpu's
context.

Best regards,
Yang




RE: [PATCH] KVM: x86: Avoid busy loops over uninjectable pending APIC timers

2013-03-21 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2013-03-22:
> On Thu, Mar 21, 2013 at 11:13:39PM +0200, Gleb Natapov wrote:
>> On Thu, Mar 21, 2013 at 05:51:50PM -0300, Marcelo Tosatti wrote:
>>>>>> But current PI patches do break them, thats my point. So we either
>>>>>> need to revise them again, or drop LAPIC timer reinjection. Making
>>>>>> apic_accept_irq semantics "it returns coalescing info, but only
>>>>>> sometimes" is dubious though.
>>>>> We may rollback to the initial idea: test both irr and pir to get
>>>>> coalescing info. In this case, inject LAPIC timer always in vcpu
>>>>> context. So apic_accept_irq() will return right coalescing info.
>>>>> Also, we need to add comments to tell caller, apic_accept_irq() can
>>>>> ensure the return value is correct only when caller is in target
>>>>> vcpu context.
>>>>> 
>>>> We cannot touch irr while vcpu is in non-root operation, so we will have
>>>> to pass flag to apic_accept_irq() to let it know that it is called
>>>> synchronously. While all this is possible I want to know which guests
>>>> exactly will we break if we will not track interrupt coalescing for
>>>> lapic timer. If only 2.0 smp kernels will break we can probably drop it.
>>> 
>>> RHEL4 / RHEL5 guests.
>> RHEL5 has kvmclock no? We should not break RHEL4 though.
> 
> kvmclock provides no timer interrupt... either LAPIC or PIT must be used
> with kvmclock.
OK, here is the conclusion:
-- According to Marcelo's comments, RHEL4/RHEL5 rely on precise LAPIC timer
injection, so the LAPIC timer injection logic is necessary.
-- LAPIC timer injection always occurs in vcpu context, so it is safe to touch
irr and pir for LAPIC timer injection.
-- We cannot touch the virtual apic page while the vcpu is in non-root
operation, so the best solution is to pass a flag to apic_accept_irq() and use
it to check whether it is safe to touch vIRR.

Right?
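
The flag idea in the last point can be sketched in plain C. This is only a toy
model under stated assumptions: toy_apic, toy_accept_irq() and the in_vcpu_ctx
parameter are hypothetical names, the bitmaps are ordinary arrays rather than
the virtual apic page or the posted-interrupt descriptor, and the updates are
not atomic as they would have to be in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_VECTORS 256
#define TOY_WORDS (TOY_NR_VECTORS / 64)

/* Toy model of one vcpu's interrupt state; not the KVM data structures. */
struct toy_apic {
	uint64_t irr[TOY_WORDS]; /* stands in for vIRR in the virtual apic page */
	uint64_t pir[TOY_WORDS]; /* stands in for the posted-interrupt requests */
};

static bool toy_test_vec(const uint64_t *map, int vec)
{
	return (map[vec / 64] >> (vec % 64)) & 1;
}

static void toy_set_vec(uint64_t *map, int vec)
{
	map[vec / 64] |= (uint64_t)1 << (vec % 64);
}

/*
 * Returns 1 if the vector was newly queued and 0 if it was coalesced.
 * in_vcpu_ctx is the hypothetical flag: irr is read or written only when
 * it is true; otherwise only pir is consulted and updated.
 */
static int toy_accept_irq(struct toy_apic *apic, int vec, bool in_vcpu_ctx)
{
	assert(vec >= 0 && vec < TOY_NR_VECTORS);

	if (in_vcpu_ctx) {
		/* Synchronous caller: both bitmaps are safe to inspect. */
		if (toy_test_vec(apic->irr, vec) || toy_test_vec(apic->pir, vec))
			return 0;
		toy_set_vec(apic->irr, vec);
		return 1;
	}

	/* Asynchronous caller: irr is off limits, so the answer is best effort. */
	if (toy_test_vec(apic->pir, vec))
		return 0;
	toy_set_vec(apic->pir, vec);
	return 1;
}
```

In this model a caller in vcpu context (the LAPIC timer case) gets exact
coalescing info, while an asynchronous caller can miss a vector that is
already pending in irr, which is why the return value would need the comment
mentioned earlier in the thread.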

Best regards,
Yang



RE: [PATCH v6 3/6] KVM : Return destination vcpu on interrupt injection

2013-03-22 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-22:
> On Fri, Mar 22, 2013 at 01:24:02PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Add a new parameter to know vcpus who received the interrupt.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |   25 -
>>  arch/x86/kvm/lapic.h |5 +++--
>>  virt/kvm/ioapic.c|2 +-
>>  virt/kvm/ioapic.h|2 +-
>>  virt/kvm/irq_comm.c  |   12 ++--
>>  5 files changed, 27 insertions(+), 19 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index d3e322a..d7915a1 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -431,14 +431,16 @@ int kvm_lapic_find_highest_irr(struct kvm_vcpu
> *vcpu)
>>  }
>>  
>>  static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>> - int vector, int level, int trig_mode);
>> + int vector, int level, int trig_mode,
>> + unsigned long *dest_map);
>> 
>> -int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq)
>> +int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
>> +unsigned long *dest_map)
>>  {
>>  struct kvm_lapic *apic = vcpu->arch.apic;
>>  
>>  return __apic_accept_irq(apic, irq->delivery_mode, irq->vector,
>> -irq->level, irq->trig_mode);
>> +irq->level, irq->trig_mode, dest_map);
>>  }
>>  
>>  static int pv_eoi_put_user(struct kvm_vcpu *vcpu, u8 val)
>> @@ -611,7 +613,7 @@ int kvm_apic_match_dest(struct kvm_vcpu *vcpu,
> struct kvm_lapic *source,
>>  }
>>  
>>  bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
>> -struct kvm_lapic_irq *irq, int *r)
>> +struct kvm_lapic_irq *irq, int *r, unsigned long *dest_map)
>>  {
>>  struct kvm_apic_map *map;
>>  unsigned long bitmap = 1;
>> @@ -622,7 +624,7 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm,
> struct kvm_lapic *src,
>>  *r = -1;
>>  
>>  if (irq->shorthand == APIC_DEST_SELF) {
>> -*r = kvm_apic_set_irq(src->vcpu, irq);
>> +*r = kvm_apic_set_irq(src->vcpu, irq, dest_map);
>>  return true;
>>  }
>> @@ -667,7 +669,7 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm,
> struct kvm_lapic *src,
>>  continue;
>>  if (*r < 0)
>>  *r = 0;
>> -*r += kvm_apic_set_irq(dst[i]->vcpu, irq);
>> +*r += kvm_apic_set_irq(dst[i]->vcpu, irq, dest_map);
>>  }
>>  
>>  ret = true;
>> @@ -681,7 +683,8 @@ out:
>>   * Return 1 if successfully added and 0 if discarded.
>>   */
>>  static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>> - int vector, int level, int trig_mode)
>> + int vector, int level, int trig_mode,
>> + unsigned long *dest_map)
>>  {
>>  int result = 0;
>>  struct kvm_vcpu *vcpu = apic->vcpu;
>> @@ -694,6 +697,9 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int
> delivery_mode,
>>  if (unlikely(!apic_enabled(apic)))
>>  break;
>> +if (dest_map)
>> +set_bit(vcpu->vcpu_id, dest_map);
>> +
> __set_bit()
No, __apic_accept_irq() may be called to deliver both IOAPIC and LAPIC
interrupts.
Though dest_map is used only by the RTC interrupt now, it may be used by LAPIC
interrupts in the future, so it is better to use set_bit() rather than __set_bit().
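
For readers unfamiliar with the distinction: in the kernel, set_bit() is the
atomic variant and __set_bit() the non-atomic one. The following is a
userspace analogue using C11 atomics, not the kernel implementation; the
toy_ names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Non-atomic: a plain read-modify-write. Two threads setting different
 * bits in the same word can lose one of the updates unless the caller
 * serialises them, e.g. by holding a lock.
 */
static void toy_set_bit_nonatomic(unsigned int nr, unsigned long *map)
{
	map[nr / TOY_BITS_PER_WORD] |= 1UL << (nr % TOY_BITS_PER_WORD);
}

/*
 * Atomic: a single fetch-or, so concurrent setters of other bits in the
 * same word cannot overwrite each other.
 */
static void toy_set_bit_atomic(unsigned int nr, atomic_ulong *map)
{
	atomic_fetch_or(&map[nr / TOY_BITS_PER_WORD],
			1UL << (nr % TOY_BITS_PER_WORD));
}
```

The non-atomic form is enough when every writer of the bitmap already holds
the same lock; the atomic form costs a little more but stays correct if a
future caller updates the bitmap without that lock.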

>>  if (trig_mode) {
>>  apic_debug("level trig mode for vector %d", vector);
>>  apic_set_vector(vector, apic->regs + APIC_TMR);
>> @@ -852,7 +858,7 @@ static void apic_send_ipi(struct kvm_lapic *apic)
>>  irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode,
>>  irq.vector);
>> -kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq);
>> +kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
>>  }
>>  
>>  static u32 apic_get_tmcct(struct kvm_lapic *apic)
>> @@ -1488,7 +1494,8 @@ int kvm_apic_local_deliver(struct kvm_lapic *apic, int
> lvt_type)
>>  vector = reg & APIC_VECTOR_MASK;
>>  mode = reg & APIC_MODE_MASK;
>>  trig_mode = reg & APIC_LVT_LEVEL_TRIGGER;
>> -return __apic_accept_irq(apic, mode, vector, 1, trig_mode);
>> +return __apic_accept_irq(apic, mode, vector, 1, trig_mode,
>> +NULL);
>>  }
>>  return 0;
>>  }
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index 2c721b9..967519c 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -55,11 +55,12 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu);
>> 
>>  int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest);
>>  int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda);
>> -int kvm_a

RE: [PATCH v6 6/6] KVM: Use eoi to track RTC interrupt delivery status

2013-03-22 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-22:
> On Fri, Mar 22, 2013 at 01:24:05PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Current interrupt coalescing logic which only used by RTC has conflict
>> with Posted Interrupt.
>> This patch introduces a new mechanism to use eoi to track interrupt:
>> When delivering an interrupt to vcpu, the pending_eoi set to number of
>> vcpu that received the interrupt. And decrease it when each vcpu writing
>> eoi. No subsequent RTC interrupt can deliver to vcpu until all vcpus
>> write eoi.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  virt/kvm/ioapic.c |   40 +++-
>>  1 files changed, 39 insertions(+), 1 deletions(-)
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index c991e58..df16daf 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -114,6 +114,29 @@ static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>>  ioapic->rtc_status.pending_eoi = pending_eoi;
>>  }
>> +static void rtc_irq_ack_eoi(struct kvm_vcpu *vcpu,
>> +struct rtc_status *rtc_status, int irq)
>> +{
>> +if (irq != RTC_GSI)
>> +return;
>> +
>> +if (test_and_clear_bit(vcpu->vcpu_id, rtc_status->dest_map))
>> +--rtc_status->pending_eoi;
>> +
>> +WARN_ON(rtc_status->pending_eoi < 0);
>> +}
>> +
>> +static bool rtc_irq_check(struct kvm_ioapic *ioapic, int irq)
>> +{
>> +if (irq != RTC_GSI)
>> +return false;
>> +
>> +if (ioapic->rtc_status.pending_eoi > 0)
>> +return true; /* coalesced */
>> +
>> +return false;
>> +}
>> +
>>  static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
>>  {
>>  union kvm_ioapic_redirect_entry *pent;
>> @@ -229,6 +252,7 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int
> irq)
>>  {
>>  union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
>>  struct kvm_lapic_irq irqe;
>> +int ret;
>> 
>>  ioapic_debug("dest=%x dest_mode=%x delivery_mode=%x "
>>   "vector=%x trig_mode=%x\n",
>> @@ -244,7 +268,14 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int
> irq)
>>  irqe.level = 1;
>>  irqe.shorthand = 0;
>> -return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>> +if (irq == RTC_GSI) {
>> +ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe,
>> +ioapic->rtc_status.dest_map);
>> +ioapic->rtc_status.pending_eoi = ret;
> We should track status only if IRQ_STATUS ioctl was used to inject an
> interrupt.
We already know RTC will use IRQ_STATUS ioctl. Why check it again?

>> +} else
>> +ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>> +
>> +return ret;
>>  }
>>  
>>  int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int 
>> irq_source_id,
>> @@ -268,6 +299,11 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int
> irq, int irq_source_id,
>>  ret = 1;
>>  } else {
>>  int edge = (entry.fields.trig_mode == IOAPIC_EDGE_TRIG);
>> +
>> +if (rtc_irq_check(ioapic, irq)) {
>> +ret = 0; /* coalesced */
>> +goto out;
>> +}
>>  ioapic->irr |= mask;
>>  if ((edge && old_irr != ioapic->irr) ||
>>  (!edge && !entry.fields.remote_irr))
>> @@ -275,6 +311,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
>>  else
>>  ret = 0; /* report coalesced interrupt */
>>  }
>> +out:
>>  trace_kvm_ioapic_set_irq(entry.bits, irq, ret == 0);
>>  spin_unlock(&ioapic->lock);
>> @@ -302,6 +339,7 @@ static void __kvm_ioapic_update_eoi(struct kvm_vcpu
> *vcpu,
>>  if (ent->fields.vector != vector)
>>  continue;
>> +rtc_irq_ack_eoi(vcpu, &ioapic->rtc_status, i);
>>  /*
>>   * We are dropping lock while calling ack notifiers because ack
>>   * notifier callbacks for assigned devices call into IOAPIC
>> --
>> 1.7.1
> 
> --
>   Gleb.
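
The EOI-based tracking described in the changelog can be modelled in a few
lines of userspace C. This is a sketch under assumptions, not the patch
itself: the toy_ names are hypothetical, the destination set is a single
64-bit mask instead of a kernel bitmap, and there is no locking.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_VCPUS 64

struct toy_rtc_status {
	int pending_eoi;   /* vcpus that still owe an EOI */
	uint64_t dest_map; /* one bit per vcpu that received the interrupt */
};

/*
 * Deliver to every vcpu in targets unless an earlier delivery is still
 * waiting for EOIs. Returns the number of vcpus delivered to; 0 means
 * the interrupt was coalesced.
 */
static int toy_rtc_deliver(struct toy_rtc_status *s, uint64_t targets)
{
	int n = 0;

	if (s->pending_eoi > 0)
		return 0;

	for (int i = 0; i < TOY_MAX_VCPUS; i++)
		if (targets & ((uint64_t)1 << i))
			n++;

	s->dest_map = targets;
	s->pending_eoi = n;
	return n;
}

/* Called when vcpu_id writes EOI for the RTC vector. */
static void toy_rtc_ack_eoi(struct toy_rtc_status *s, int vcpu_id)
{
	uint64_t bit;

	assert(vcpu_id >= 0 && vcpu_id < TOY_MAX_VCPUS);
	bit = (uint64_t)1 << vcpu_id;

	/* Only a vcpu that actually received the interrupt counts. */
	if (s->dest_map & bit) {
		s->dest_map &= ~bit;
		s->pending_eoi--;
	}
	assert(s->pending_eoi >= 0);
}
```

The per-vcpu bit is what keeps a stray or repeated EOI from one vcpu from
decrementing the counter on behalf of another.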


Best regards,
Yang



RE: [PATCH v6 6/6] KVM: Use eoi to track RTC interrupt delivery status

2013-03-22 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-22:
> On Fri, Mar 22, 2013 at 08:05:27AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-22:
>>> On Fri, Mar 22, 2013 at 01:24:05PM +0800, Yang Zhang wrote:
>>>> From: Yang Zhang 
>>>> 
>>>> Current interrupt coalescing logic which only used by RTC has conflict
>>>> with Posted Interrupt.
>>>> This patch introduces a new mechanism to use eoi to track interrupt:
>>>> When delivering an interrupt to vcpu, the pending_eoi set to number of
>>>> vcpu that received the interrupt. And decrease it when each vcpu writing
>>>> eoi. No subsequent RTC interrupt can deliver to vcpu until all vcpus
>>>> write eoi.
>>>> 
>>>> Signed-off-by: Yang Zhang 
>>>> ---
>>>>  virt/kvm/ioapic.c |   40 +++-
>>>>  1 files changed, 39 insertions(+), 1 deletions(-)
>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>> index c991e58..df16daf 100644
>>>> --- a/virt/kvm/ioapic.c
>>>> +++ b/virt/kvm/ioapic.c
>>>> @@ -114,6 +114,29 @@ static void rtc_irq_restore(struct kvm_ioapic
> *ioapic)
>>>>ioapic->rtc_status.pending_eoi = pending_eoi;
>>>>  }
>>>> +static void rtc_irq_ack_eoi(struct kvm_vcpu *vcpu,
>>>> +  struct rtc_status *rtc_status, int irq)
>>>> +{
>>>> +  if (irq != RTC_GSI)
>>>> +  return;
>>>> +
>>>> +  if (test_and_clear_bit(vcpu->vcpu_id, rtc_status->dest_map))
>>>> +  --rtc_status->pending_eoi;
>>>> +
>>>> +  WARN_ON(rtc_status->pending_eoi < 0);
>>>> +}
>>>> +
>>>> +static bool rtc_irq_check(struct kvm_ioapic *ioapic, int irq)
>>>> +{
>>>> +  if (irq != RTC_GSI)
>>>> +  return false;
>>>> +
>>>> +  if (ioapic->rtc_status.pending_eoi > 0)
>>>> +  return true; /* coalesced */
>>>> +
>>>> +  return false;
>>>> +}
>>>> +
>>>>  static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
>>>>  {
>>>>union kvm_ioapic_redirect_entry *pent;
>>>> @@ -229,6 +252,7 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, 
>>>> int
>>> irq)
>>>>  {
>>>>union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
>>>>struct kvm_lapic_irq irqe;
>>>> +  int ret;
>>>> 
>>>>ioapic_debug("dest=%x dest_mode=%x delivery_mode=%x "
>>>> "vector=%x trig_mode=%x\n",
>>>> @@ -244,7 +268,14 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic,
> int
>>> irq)
>>>>irqe.level = 1;
>>>>irqe.shorthand = 0;
>>>> -  return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>>>> +  if (irq == RTC_GSI) {
>>>> +  ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe,
>>>> +  ioapic->rtc_status.dest_map);
>>>> +  ioapic->rtc_status.pending_eoi = ret;
>>> We should track status only if IRQ_STATUS ioctl was used to inject an
>>> interrupt.
>> We already know RTC will use IRQ_STATUS ioctl. Why check it again?
>> 
> QEMU does. QEMU is not the only userspace.
And this will break other userspace.

Best regards,
Yang




RE: [PATCH v6 6/6] KVM: Use eoi to track RTC interrupt delivery status

2013-03-22 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-22:
> On Fri, Mar 22, 2013 at 08:25:21AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-22:
>>> On Fri, Mar 22, 2013 at 08:05:27AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2013-03-22:
>>>>> On Fri, Mar 22, 2013 at 01:24:05PM +0800, Yang Zhang wrote:
>>>>>> From: Yang Zhang 
>>>>>> 
>>>>>> Current interrupt coalescing logic which only used by RTC has conflict
>>>>>> with Posted Interrupt.
>>>>>> This patch introduces a new mechanism to use eoi to track interrupt:
>>>>>> When delivering an interrupt to vcpu, the pending_eoi set to number of
>>>>>> vcpu that received the interrupt. And decrease it when each vcpu writing
>>>>>> eoi. No subsequent RTC interrupt can deliver to vcpu until all vcpus
>>>>>> write eoi.
>>>>>> 
>>>>>> Signed-off-by: Yang Zhang 
>>>>>> ---
>>>>>>  virt/kvm/ioapic.c |   40 +++-
>>>>>>  1 files changed, 39 insertions(+), 1 deletions(-)
>>>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>>>> index c991e58..df16daf 100644
>>>>>> --- a/virt/kvm/ioapic.c
>>>>>> +++ b/virt/kvm/ioapic.c
>>>>>> @@ -114,6 +114,29 @@ static void rtc_irq_restore(struct kvm_ioapic
>>> *ioapic)
>>>>>>  ioapic->rtc_status.pending_eoi = pending_eoi;
>>>>>>  }
>>>>>> +static void rtc_irq_ack_eoi(struct kvm_vcpu *vcpu,
>>>>>> +struct rtc_status *rtc_status, int irq)
>>>>>> +{
>>>>>> +if (irq != RTC_GSI)
>>>>>> +return;
>>>>>> +
>>>>>> +if (test_and_clear_bit(vcpu->vcpu_id, rtc_status->dest_map))
>>>>>> +--rtc_status->pending_eoi;
>>>>>> +
>>>>>> +WARN_ON(rtc_status->pending_eoi < 0);
>>>>>> +}
>>>>>> +
>>>>>> +static bool rtc_irq_check(struct kvm_ioapic *ioapic, int irq)
>>>>>> +{
>>>>>> +if (irq != RTC_GSI)
>>>>>> +return false;
>>>>>> +
>>>>>> +if (ioapic->rtc_status.pending_eoi > 0)
>>>>>> +return true; /* coalesced */
>>>>>> +
>>>>>> +return false;
>>>>>> +}
>>>>>> +
>>>>>>  static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
>>>>>>  {
>>>>>>  union kvm_ioapic_redirect_entry *pent;
>>>>>> @@ -229,6 +252,7 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic,
> int
>>>>> irq)
>>>>>>  {
>>>>>>  union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
>>>>>>  struct kvm_lapic_irq irqe;
>>>>>> +int ret;
>>>>>> 
>>>>>>  ioapic_debug("dest=%x dest_mode=%x delivery_mode=%x "
>>>>>>   "vector=%x trig_mode=%x\n",
>>>>>> @@ -244,7 +268,14 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic,
>>> int
>>>>> irq)
>>>>>>  irqe.level = 1;
>>>>>>  irqe.shorthand = 0;
>>>>>> -return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>>>>>> +if (irq == RTC_GSI) {
>>>>>> +ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe,
>>>>>> +ioapic->rtc_status.dest_map);
>>>>>> +ioapic->rtc_status.pending_eoi = ret;
>>>>> We should track status only if IRQ_STATUS ioctl was used to inject an
>>>>> interrupt.
>>>> We already know RTC will use IRQ_STATUS ioctl. Why check it again?
>>>> 
>>> QEMU does. QEMU is not the only userspace.
>> And this will break other userspace.
>> 
> How?
If other userspace has reinjection logic for the RTC but does not use
IRQ_STATUS, then it cannot get the right coalescing info. If it also uses
IRQ_STATUS to get coalescing info, then we don't need the IRQ_STATUS check.

Best regards,
Yang




RE: [PATCH v6 6/6] KVM: Use eoi to track RTC interrupt delivery status

2013-03-22 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-22:
> On Fri, Mar 22, 2013 at 08:37:22AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-22:
>>> On Fri, Mar 22, 2013 at 08:25:21AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2013-03-22:
>>>>> On Fri, Mar 22, 2013 at 08:05:27AM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2013-03-22:
>>>>>>> On Fri, Mar 22, 2013 at 01:24:05PM +0800, Yang Zhang wrote:
>>>>>>>> From: Yang Zhang 
>>>>>>>> 
>>>>>>>> Current interrupt coalescing logic which only used by RTC has
>>>>>>>> conflict with Posted Interrupt. This patch introduces a new
>>>>>>>> mechanism to use eoi to track interrupt: When delivering an
>>>>>>>> interrupt to vcpu, the pending_eoi set to number of vcpu that
>>>>>>>> received the interrupt. And decrease it when each vcpu writing
>>>>>>>> eoi. No subsequent RTC interrupt can deliver to vcpu until all
>>>>>>>> vcpus write eoi.
>>>>>>>> 
>>>>>>>> Signed-off-by: Yang Zhang 
>>>>>>>> ---
>>>>>>>>  virt/kvm/ioapic.c |   40 +++-
>>>>>>>>  1 files changed, 39 insertions(+), 1 deletions(-)
>>>>>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>>>>>> index c991e58..df16daf 100644
>>>>>>>> --- a/virt/kvm/ioapic.c
>>>>>>>> +++ b/virt/kvm/ioapic.c
>>>>>>>> @@ -114,6 +114,29 @@ static void rtc_irq_restore(struct kvm_ioapic
>>>>> *ioapic)
>>>>>>>>ioapic->rtc_status.pending_eoi = pending_eoi;
>>>>>>>>  }
>>>>>>>> +static void rtc_irq_ack_eoi(struct kvm_vcpu *vcpu,
>>>>>>>> +  struct rtc_status *rtc_status, int irq)
>>>>>>>> +{
>>>>>>>> +  if (irq != RTC_GSI)
>>>>>>>> +  return;
>>>>>>>> +
>>>>>>>> +  if (test_and_clear_bit(vcpu->vcpu_id, rtc_status->dest_map))
>>>>>>>> +  --rtc_status->pending_eoi;
>>>>>>>> +
>>>>>>>> +  WARN_ON(rtc_status->pending_eoi < 0);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static bool rtc_irq_check(struct kvm_ioapic *ioapic, int irq)
>>>>>>>> +{
>>>>>>>> +  if (irq != RTC_GSI)
>>>>>>>> +  return false;
>>>>>>>> +
>>>>>>>> +  if (ioapic->rtc_status.pending_eoi > 0)
>>>>>>>> +  return true; /* coalesced */
>>>>>>>> +
>>>>>>>> +  return false;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
>>>>>>>>  {
>>>>>>>>union kvm_ioapic_redirect_entry *pent;
>>>>>>>> @@ -229,6 +252,7 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
>>>>>>>>  {
>>>>>>>>union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
>>>>>>>>struct kvm_lapic_irq irqe;
>>>>>>>> +  int ret;
>>>>>>>> 
>>>>>>>>ioapic_debug("dest=%x dest_mode=%x delivery_mode=%x "
>>>>>>>> "vector=%x trig_mode=%x\n",
>>>>>>>> @@ -244,7 +268,14 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
>>>>>>>>irqe.level = 1;
>>>>>>>>irqe.shorthand = 0;
>>>>>>>> -  return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>>>>>>>> +  if (irq == RTC_GSI) {
>>>>>>>> +  ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe,
>>>>>>>> +  ioapic->rtc_status.dest_map);
>>>>>>>> +  ioapic->rtc_status.pending_eoi = ret;
>>>>>>> We should track status only if IRQ_STATUS ioctl was used to inject an
>>>>>>> interrupt.
>>>>>> We already know RTC will use IRQ_STATUS ioctl. Why check it again?
>>>>>> 
>>>>> QEMU does. QEMU is not the only userspace.
>>>> And this will break other userspace.
>>>> 
>>> How?
>> If other userspace has reinjection logic for the RTC but does not use
>> IRQ_STATUS, then it cannot get the right coalescing info. If it also uses
>> IRQ_STATUS to get coalescing info, then we don't need the IRQ_STATUS check.
>> 
> If userspace does not care about irq status it does not use IRQ_STATUS
> ioctl and we should not go extra mile to provide one. Not everyone cares
> about running Windows as a guest.
I see your point. But if no Windows guest is running, the RTC is hardly used by
other guests and the overhead can be ignored.

Best regards,
Yang




RE: [PATCH v5 4/6] KVM: Add reset/restore rtc_status support

2013-03-22 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2013-03-22:
> On Thu, Mar 21, 2013 at 06:49:22PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |8 
>>  arch/x86/kvm/lapic.h |2 ++
>>  virt/kvm/ioapic.c|   35 +++
>>  3 files changed, 45 insertions(+), 0 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 5f6b1d0..158e0a3 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
>>  return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>  }
>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>> +{
>> +struct kvm_lapic *apic = vcpu->arch.apic;
>> +
>> +return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>> +apic_test_vector(vector, apic->regs + APIC_IRR);
>> +}
>> +
>>  static inline void apic_set_vector(int vec, void *bitmap)
>>  {
>>  set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index 967519c..004d2ad 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -170,4 +170,6 @@ static inline bool kvm_apic_has_events(struct
> kvm_vcpu *vcpu)
>>  return vcpu->arch.apic->pending_events;
>>  }
>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>> +
>>  #endif
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index 4767fa6..8f9c62b 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -87,6 +87,39 @@ static unsigned long ioapic_read_indirect(struct
> kvm_ioapic *ioapic,
>>  return result;
>>  }
>> +#ifdef CONFIG_X86
>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>> +{
>> +ioapic->rtc_status.pending_eoi = 0;
>> +bitmap_zero(ioapic->rtc_status.dest_map, KVM_MAX_VCPUS);
>> +}
>> +
>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>> +{
>> +struct kvm_vcpu *vcpu;
>> +int vector, i, pending_eoi = 0, rtc_pin = 8;
>> +
>> +vector = ioapic->redirtbl[rtc_pin].fields.vector;
>> +kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
>> +if (kvm_apic_pending_eoi(vcpu, vector)) {
>> +pending_eoi++;
>> +set_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
>> +}
>> +}
>> +ioapic->rtc_status.pending_eoi = pending_eoi;
>> +}
> 
> Userspace can load the IOAPIC before loading VCPUS LAPIC via
> KVM_SET_LAPIC. So kvm_lapic_reset / kvm_apic_post_state_restore should
> also update ioapic->rtc_status.dest_map (checking whether ioapic is
> initialized, of course).
Yes, this does happen.
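
One way to picture the ordering problem is a restore helper that rebuilds the
RTC state from scratch and is simply called again after each vcpu's LAPIC
state is loaded. The sketch below is a userspace model with hypothetical toy_
names; clearing dest_map before recounting is a choice made for this sketch,
not something the posted patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_VECTORS 256

/* Per-vcpu stand-in for the LAPIC IRR and ISR registers. */
struct toy_vcpu {
	bool irr[TOY_NR_VECTORS];
	bool isr[TOY_NR_VECTORS];
};

struct toy_rtc_status {
	int pending_eoi;
	uint64_t dest_map;
};

/* A vcpu still owes an EOI if the vector is pending or in service. */
static bool toy_apic_pending_eoi(const struct toy_vcpu *v, int vector)
{
	return v->isr[vector] || v->irr[vector];
}

/*
 * Recompute the RTC tracking state from the current vcpu state. Safe to
 * call repeatedly, so it can run after the IOAPIC is loaded and again
 * after each LAPIC is loaded.
 */
static void toy_rtc_irq_restore(struct toy_rtc_status *s,
				const struct toy_vcpu *vcpus, int nr_vcpus,
				int vector)
{
	assert(nr_vcpus >= 0 && nr_vcpus <= 64);
	assert(vector >= 0 && vector < TOY_NR_VECTORS);

	s->pending_eoi = 0;
	s->dest_map = 0;
	for (int i = 0; i < nr_vcpus; i++) {
		if (toy_apic_pending_eoi(&vcpus[i], vector)) {
			s->pending_eoi++;
			s->dest_map |= (uint64_t)1 << i;
		}
	}
}
```

If the IOAPIC is restored first, the first call sees empty LAPICs and records
nothing; the later calls pick up the pending vectors once they exist.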

>
> Please add a comment explaining why dest_map is necessary (or explain it
> in the changelog).
Sure.

Best regards,
Yang




RE: [PATCH v6 6/6] KVM: Use eoi to track RTC interrupt delivery status

2013-03-22 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-22:
> On Fri, Mar 22, 2013 at 08:51:47AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-22:
>>> On Fri, Mar 22, 2013 at 08:37:22AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2013-03-22:
>>>>> On Fri, Mar 22, 2013 at 08:25:21AM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2013-03-22:
>>>>>>> On Fri, Mar 22, 2013 at 08:05:27AM +0000, Zhang, Yang Z wrote:
>>>>>>>> Gleb Natapov wrote on 2013-03-22:
>>>>>>>>> On Fri, Mar 22, 2013 at 01:24:05PM +0800, Yang Zhang wrote:
>>>>>>>>>> From: Yang Zhang 
>>>>>>>>>> 
>>>>>>>>>> Current interrupt coalescing logic which only used by RTC has
>>>>>>>>>> conflict with Posted Interrupt. This patch introduces a new
>>>>>>>>>> mechanism to use eoi to track interrupt: When delivering an
>>>>>>>>>> interrupt to vcpu, the pending_eoi set to number of vcpu that
>>>>>>>>>> received the interrupt. And decrease it when each vcpu writing
>>>>>>>>>> eoi. No subsequent RTC interrupt can deliver to vcpu until all
>>>>>>>>>> vcpus write eoi.
>>>>>>>>>> 
>>>>>>>>>> Signed-off-by: Yang Zhang 
>>>>>>>>>> ---
>>>>>>>>>>  virt/kvm/ioapic.c |   40 +++-
>>>>>>>>>>  1 files changed, 39 insertions(+), 1 deletions(-)
>>>>>>>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>>>>>>>> index c991e58..df16daf 100644
>>>>>>>>>> --- a/virt/kvm/ioapic.c
>>>>>>>>>> +++ b/virt/kvm/ioapic.c
>>>>>>>>>> @@ -114,6 +114,29 @@ static void rtc_irq_restore(struct kvm_ioapic
>>>>>>> *ioapic)
>>>>>>>>>>  ioapic->rtc_status.pending_eoi = pending_eoi;
>>>>>>>>>>  }
>>>>>>>>>> +static void rtc_irq_ack_eoi(struct kvm_vcpu *vcpu,
>>>>>>>>>> +struct rtc_status *rtc_status, int irq)
>>>>>>>>>> +{
>>>>>>>>>> +if (irq != RTC_GSI)
>>>>>>>>>> +return;
>>>>>>>>>> +
>>>>>>>>>> +if (test_and_clear_bit(vcpu->vcpu_id, rtc_status->dest_map))
>>>>>>>>>> +--rtc_status->pending_eoi;
>>>>>>>>>> +
>>>>>>>>>> +WARN_ON(rtc_status->pending_eoi < 0);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +static bool rtc_irq_check(struct kvm_ioapic *ioapic, int irq)
>>>>>>>>>> +{
>>>>>>>>>> +if (irq != RTC_GSI)
>>>>>>>>>> +return false;
>>>>>>>>>> +
>>>>>>>>>> +if (ioapic->rtc_status.pending_eoi > 0)
>>>>>>>>>> +return true; /* coalesced */
>>>>>>>>>> +
>>>>>>>>>> +return false;
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>>  static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
>>>>>>>>>>  {
>>>>>>>>>>  union kvm_ioapic_redirect_entry *pent;
>>>>>>>>>> @@ -229,6 +252,7 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
>>>>>>>>>>  {
>>>>>>>>>>  union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
>>>>>>>>>>  struct kvm_lapic_irq irqe;
>>>>>>>>>> +int ret;
>>>>>>>>>> 
>>>>>>>>>>  ioapic_debug("dest=%x dest_mode=%x delivery_mode=%x "
>>>>>>>>>>   "vector=%x trig_mode=%x\n",
>>>>>>>>>> @@ -244,7 +268,14 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
>>>>>>>>>>  irqe.level = 1;
>>>>>>>>>>  irqe.shorthand = 0;
>>>>>>>>>> -return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>>>>>>>>>> +if (irq == RTC_GSI) {
>>>>>>>>>> +ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe,
>>>>>>>>>> +ioapic->rtc_status.dest_map);
>>>>>>>>>> +ioapic->rtc_status.pending_eoi = ret;
>>>>>>>>> We should track status only if IRQ_STATUS ioctl was used to inject an
>>>>>>>>> interrupt.
>>>>>>>> We already know RTC will use IRQ_STATUS ioctl. Why check it again?
>>>>>>>> 
>>>>>>> QEMU does. QEMU is not the only userspace.
>>>>>> And this will break other userspace.
>>>>>> 
>>>>> How?
>>>> If other userspace has reinjection logic for the RTC but does not use
>>>> IRQ_STATUS, then it cannot get the right coalescing info. If it also uses
>>>> IRQ_STATUS to get coalescing info, then we don't need the IRQ_STATUS check.
>>>> 
>>> If userspace does not care about irq status it does not use IRQ_STATUS
>>> ioctl and we should not go extra mile to provide one. Not everyone cares
>>> about running Windows as a guest.
>> I see your point. But if no Windows guest is running, the RTC is hardly used
>> by other guests and the overhead can be ignored.
>> 
> Anyone can use the RTC in a Linux guest. Don't know about others.
I see.
Since passing IRQ_STATUS down to the ioapic would require changing many
functions, how about adding a variable to rtc_status:
struct rtc_status {
  bool IRQ_STATUS;
};

And setting it in kvm_vm_ioctl():
case KVM_IRQ_LINE_STATUS:
if (irq == RTC_GSI && ioapic)
ioapic->rtc_status.IRQ_STATUS = true;
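
To make the intent concrete, here is a small userspace model of how such a
flag could gate the tracking. Everything in it is an assumption for
illustration: the toy_ names, the want_status parameter standing in for the
KVM_IRQ_LINE_STATUS path, and the value 8 for the RTC GSI.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_RTC_GSI 8 /* assumed RTC pin for this sketch */

struct toy_rtc_status {
	bool irq_status; /* set once userspace has asked for delivery status */
	int pending_eoi; /* vcpus that still owe an EOI */
};

/* Models the ioctl entry point; want_status is true for the status variant. */
static void toy_irq_line(struct toy_rtc_status *s, int irq, bool want_status)
{
	if (irq == TOY_RTC_GSI && want_status)
		s->irq_status = true;
}

/*
 * Returns the number of vcpus the interrupt was delivered to, or 0 if it
 * was coalesced. Tracking happens only for the RTC and only after the
 * flag has been set.
 */
static int toy_ioapic_deliver(struct toy_rtc_status *s, int irq, int nr_dest)
{
	assert(nr_dest >= 0);

	if (irq != TOY_RTC_GSI || !s->irq_status)
		return nr_dest; /* untracked path */

	if (s->pending_eoi > 0)
		return 0; /* previous RTC interrupt not fully acked */

	s->pending_eoi = nr_dest;
	return nr_dest;
}
```

With this shape, userspace that never uses the status ioctl sees the untracked
path only, which is the behaviour Gleb asked for.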

Best regards,
Yang




RE: [PATCH] KVM: x86: Avoid busy loops over uninjectable pending APIC timers

2013-03-24 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-03-22:
> On Fri, Mar 22, 2013 at 07:43:03AM -0300, Marcelo Tosatti wrote:
>> On Fri, Mar 22, 2013 at 08:53:15AM +0200, Gleb Natapov wrote:
>>> On Thu, Mar 21, 2013 at 08:06:41PM -0300, Marcelo Tosatti wrote:
 On Thu, Mar 21, 2013 at 11:13:39PM +0200, Gleb Natapov wrote:
> On Thu, Mar 21, 2013 at 05:51:50PM -0300, Marcelo Tosatti wrote:
> But current PI patches do break them, thats my point. So we
> either need to revise them again, or drop LAPIC timer
> reinjection. Making apic_accept_irq semantics "it returns
> coalescing info, but only sometimes" is dubious though.
 We may rollback to the initial idea: test both irr and pir to get
> coalescing info. In this case, inject LAPIC timer always in vcpu context. So
> apic_accept_irq() will return right coalescing info.
 Also, we need to add comments to tell caller, apic_accept_irq()
 can ensure the return value is correct only when caller is in
 target vcpu context.
 
>>> We cannot touch irr while vcpu is in non-root operation, so we
>>> will have to pass flag to apic_accept_irq() to let it know that it
>>> is called synchronously. While all this is possible I want to know
>>> which guests exactly will we break if we will not track interrupt
>>> coalescing for lapic timer. If only 2.0 smp kernels will break we
>>> can probably drop it.
>> 
>> RHEL4 / RHEL5 guests.
> RHEL5 has kvmclock no? We should not break RHEL4 though.
 
 kvmclock provides no timer interrupt... either LAPIC or PIT must be used
 with kvmclock.
>>> I am confused now. If LAPIC is not used for wallclock time keeping, but
>>> only for scheduling the reinjection is actually harmful. Reinjecting the
>>> interrupt will cause needless task rescheduling. So the question is if
>>> there is a Linux kernel that uses LAPIC for wallclock time keeping and
>>> relies on accurate number of injected interrupts to not time drift.
>> 
>> See 4acd47cfea9c18134e0cbf915780892ef0ff433a on RHEL5, RHEL5 kernels
>> before that commit did not reinject.  Which means that all non-RHEL
>> Linux guests based on that upstream code also suffer from the same
>> problem.
>> 
> The commit actually fixes guest, not host. The existence of the commit
> also means that LAPIC timer reinjection does not solve the problem and
> all guests without this commit will suffer from the bug regardless of
> what we will decide to do here. Without LAPIC timer reinjection the
> effect of the bug will be much more visible and long lasting though.
> 
>> Also any other algorithm which uses LAPIC timers and compare that with
>> other clocks (such as NMI watchdog) are potentially vulnerable.
> They are with or without timer reinjection as commit you pointed to
> shows.
> 
>> 
>> Can drop it, and then wait until someone complains (if so).
>> 
> Yes, tough decision to make. All the complaints will be guest bugs which
> can be hit without reinjection too, but with less probability. Why we so
> keen on keeping RTC reinject is that the guests that depends on it
> cannot be fixed.
> 
>>> Knowing that Linux tend to disable interrupt it is likely that it tries
>>> to detect and compensate for missing interrupt.
>> 
>> As said above, any algorithm which compares LAPIC timer interrupt with
>> another clock is vulnerable.
Any conclusion? 

Best regards,
Yang



RE: KVM EPT implementation

2013-03-28 Thread Zhang, Yang Z
Tony Roberts wrote on 2013-03-29:
> Hello list,
> 
> (Apologies if this appears twice!)
> 
> I'm currently doing some research into guest memory allocation,
> specifically trying to determine when guests write data into certain
> memory locations, and I'm trying to get my head around how KVM updates
> the extended page tables, and where within the KVM code the actual
> updates occur.  I'm working on an Intel box with VT extensions, and
> Debian 3.6.6 kernel.
> 
> After going through the code, I can see that a lot of the existing
> shadow page table code is reused, however I'm a little confused over
> how exactly that is.
> 
> As an example, I can see the function vmx_set_cr3 (vmx.c) being
> called, which is setting the host CR3 to the base of the PML4 table.
> 
> Then from that address, the EPTP is created, essentially setting the
> bottom 12 bits to various flags.
> 
> Then, handle_ept_violation is called which contains the GPA that
> generated the page fault.  I've looked into the function
> kvm_mmu_page_fault which contains the value in the CR2, I'm assuming
> this to be the guest's CR2 value, which I think is the guest physical
> address that caused the page fault.
> 
> However this is where I lose the chase slightly.  I know from studying
> the Intel developers manuals that the top level of the 4 level
> hierarchy for the EPTs is the PML4 table, which can contain a maximum
> of 512 64-bit entries, with each entry in turn pointing to the base
> address of a PDPT.
> 
> The first address that the function pte_list_add sees is the base
> address of the PML4 table, so I was expecting to be able to read 512
> 64-bit entries from that base address and see at least one 64-bit
> entry written into that page.  However, after a number of different
> attempts, I'm unable to determine the function that is actually
> responsible for updating the EPTs.
Are you trying to dump the guest's PML4 table or the EPT PML4 table? For the
EPT, just look up the EPTP (root_hpa in vcpu->arch.mmu.root_hpa). For the
guest's table, you first need to translate the GPA to an HPA.
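
For orientation, here is a minimal standalone sketch (not KVM code) of how a
guest physical address splits into the four table indices during an EPT walk.
It assumes the 4-level layout with 512 entries per table that you describe and
4 KiB pages. The sketch_ names are hypothetical helpers made up for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* 4 KiB pages */
#define SKETCH_LEVEL_BITS 9	/* 512 entries per table */

/* The low 12 bits of the EPTP hold flags; the rest is the PML4 base. */
static uint64_t sketch_ept_root(uint64_t eptp)
{
	return eptp & ~((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1);
}

/* Table index used at a given level: 4 = PML4, 3 = PDPT, 2 = PD, 1 = PT. */
static unsigned int sketch_ept_index(uint64_t gpa, int level)
{
	int shift;

	assert(level >= 1 && level <= 4);
	shift = SKETCH_PAGE_SHIFT + (level - 1) * SKETCH_LEVEL_BITS;
	return (unsigned int)((gpa >> shift) & ((1u << SKETCH_LEVEL_BITS) - 1));
}
```

For example, GPA 0x8080604123 gives indices 1, 2, 3 and 4 for the PML4, PDPT,
PD and PT levels, with page offset 0x123. Only the PML4 base is page aligned;
every other address seen during a walk points at an entry inside some table
page.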

> 
> I was hoping somebody might be able to point me to the correct location
> within the KVM source code to track when EPT entries are actually
> written to the various tables in the 4 level hierarchy.  The function
> pte_list_add seems to do nothing more than change the value of a
> pointer, but only the first address passed to it is page aligned (the
> PML4 base) and the rest of the addresses appear to be pointers into
> existing pages, often seeming to be outside of the PML4 page range.
> 
> I might be completely misunderstanding something, but any advice on how
> to effectively monitor EPT entries within KVM would be greatly
> appreciated.
You may start with mmu_alloc_direct_roots(). The EPTP value is assigned in this
function.

Best regards,
Yang



RE: [PATCH v6 4/6] KVM: Add reset/restore rtc_status support

2013-03-28 Thread Zhang, Yang Z
Paolo Bonzini wrote on 2013-03-26:
> Il 22/03/2013 06:24, Yang Zhang ha scritto:
>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>> +{
>> +struct kvm_vcpu *vcpu;
>> +int vector, i, pending_eoi = 0;
>> +
>> +if (RTC_GSI != 8)
> 
> Please set it to -1U if not x86, and do
> 
>if (RTC_GSI >= IOAPIC_NUM_PINS)
>return;
> here.
Sure, that is more reasonable.
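
To spell out why the single comparison is enough, a small standalone sketch.
It assumes IOAPIC_NUM_PINS is 24 (as far as I know, KVM's x86 value) and uses
hypothetical SKETCH_/sketch_ names rather than the real macros. Since -1U is
the largest unsigned int, it can never pass as a valid pin.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_IOAPIC_NUM_PINS 24u	/* assumed value */
#define SKETCH_RTC_GSI_X86 8u		/* RTC_GSI on x86 */
#define SKETCH_RTC_GSI_NONE (-1U)	/* "no RTC GSI" on other architectures */

/* One unsigned comparison rejects both out-of-range pins and the -1U marker. */
static bool sketch_rtc_gsi_usable(unsigned int rtc_gsi)
{
	return rtc_gsi < SKETCH_IOAPIC_NUM_PINS;
}

/* Redirection-table index; only legal once the guard above has passed. */
static unsigned int sketch_rtc_pin(unsigned int rtc_gsi)
{
	assert(sketch_rtc_gsi_usable(rtc_gsi));
	return rtc_gsi;
}
```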

Best regards,
Yang




RE: [PATCH v6 4/6] KVM: Add reset/restore rtc_status support

2013-03-28 Thread Zhang, Yang Z
Paolo Bonzini wrote on 2013-03-26:
> Il 22/03/2013 06:24, Yang Zhang ha scritto:
>> +vector = ioapic->redirtbl[RTC_GSI].fields.vector;
>> +kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
>> +if (kvm_apic_pending_eoi(vcpu, vector)) {
>> +pending_eoi++;
>> +set_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
> 
> Also, __set_bit.  If I understand correctly, dest_map is protected by
> the ioapic spinlock.
Yes, I already noticed it and changed it in version 7.

Best regards,
Yang




RE: [PATCH v6 6/6] KVM: Use eoi to track RTC interrupt delivery status

2013-03-28 Thread Zhang, Yang Z
Paolo Bonzini wrote on 2013-03-26:
> Il 22/03/2013 06:24, Yang Zhang ha scritto:
>> +static void rtc_irq_ack_eoi(struct kvm_vcpu *vcpu,
>> +struct rtc_status *rtc_status, int irq)
>> +{
>> +if (irq != RTC_GSI)
>> +return;
>> +
>> +if (test_and_clear_bit(vcpu->vcpu_id, rtc_status->dest_map))
>> +--rtc_status->pending_eoi;
>> +
>> +WARN_ON(rtc_status->pending_eoi < 0);
>> +}
> 
> This is the only case where you're passing the struct rtc_status instead
> of the struct kvm_ioapic.  Please use the latter, and make it the first
> argument.
>
>> @@ -244,7 +268,14 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
>>  irqe.level = 1;
>>  irqe.shorthand = 0;
>> -return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>> +if (irq == RTC_GSI) {
>> +ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe,
>> +ioapic->rtc_status.dest_map);
>> +ioapic->rtc_status.pending_eoi = ret;
> 
> I think you should either add a
> 
> BUG_ON(ioapic->rtc_status.pending_eoi != 0);
> or use "ioapic->rtc_status.pending_eoi += ret" (or both).
> 
A malicious guest may write EOI more than once, and pending_eoi will then
become negative. But that should not be treated as a bug; a WARN_ON is enough.
We already do that in ack_eoi, so there is no need to duplicate it here.

Best regards,
Yang




RE: [PATCH v6 6/6] KVM: Use eoi to track RTC interrupt delivery status

2013-03-29 Thread Zhang, Yang Z
Paolo Bonzini wrote on 2013-03-29:
> Il 29/03/2013 04:25, Zhang, Yang Z ha scritto:
>> Paolo Bonzini wrote on 2013-03-26:
>>> Il 22/03/2013 06:24, Yang Zhang ha scritto:
>>>> +static void rtc_irq_ack_eoi(struct kvm_vcpu *vcpu,
>>>> +  struct rtc_status *rtc_status, int irq)
>>>> +{
>>>> +  if (irq != RTC_GSI)
>>>> +  return;
>>>> +
>>>> +  if (test_and_clear_bit(vcpu->vcpu_id, rtc_status->dest_map))
>>>> +  --rtc_status->pending_eoi;
>>>> +
>>>> +  WARN_ON(rtc_status->pending_eoi < 0);
>>>> +}
>>> 
>>> This is the only case where you're passing the struct rtc_status instead
>>> of the struct kvm_ioapic.  Please use the latter, and make it the first
>>> argument.
>>> 
>>>> @@ -244,7 +268,14 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
>>>>irqe.level = 1;
>>>>irqe.shorthand = 0;
>>>> -  return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>>>> +  if (irq == RTC_GSI) {
>>>> +  ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe,
>>>> +  ioapic->rtc_status.dest_map);
>>>> +  ioapic->rtc_status.pending_eoi = ret;
>>> 
>>> I think you should either add a
>>> 
>>> BUG_ON(ioapic->rtc_status.pending_eoi != 0);
>>> or use "ioapic->rtc_status.pending_eoi += ret" (or both).
>>> 
>> A malicious guest may write EOI more than once, and pending_eoi will then
>> become negative. But that should not be treated as a bug; a WARN_ON is
>> enough. We already do that in ack_eoi, so there is no need to duplicate it
>> here.
> 
> Even WARN_ON is too much if it is guest-triggerable.  But then it is
> better to make it "+=", I think.
No. If the above case happens, you will always hit the WARN_ON with "+=".

Best regards,
Yang




RE: [PATCH] KVM: Call kvm_apic_match_dest() to check destination vcpu

2013-03-31 Thread Zhang, Yang Z
Zhang, Yang Z wrote on 2013-03-21:
> From: Yang Zhang 
> 
> For a given vcpu, kvm_apic_match_dest() quickly tells you whether the vcpu
> is in the destination list. Drop kvm_calculate_eoi_exitmap() and use
> kvm_apic_match_dest() instead.
> 
> Signed-off-by: Yang Zhang 
> ---
>  arch/x86/kvm/lapic.c |   47 ---
>  arch/x86/kvm/lapic.h |4 
>  virt/kvm/ioapic.c|9 -
>  3 files changed, 4 insertions(+), 56 deletions(-)
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index a8e9369..e227474 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -145,53 +145,6 @@ static inline int kvm_apic_id(struct kvm_lapic *apic)
>   return (kvm_apic_get_reg(apic, APIC_ID) >> 24) & 0xff;
>  }
> -void kvm_calculate_eoi_exitmap(struct kvm_vcpu *vcpu,
> - struct kvm_lapic_irq *irq,
> - u64 *eoi_exit_bitmap)
> -{
> - struct kvm_lapic **dst;
> - struct kvm_apic_map *map;
> - unsigned long bitmap = 1;
> - int i;
> -
> - rcu_read_lock();
> - map = rcu_dereference(vcpu->kvm->arch.apic_map);
> -
> - if (unlikely(!map)) {
> - __set_bit(irq->vector, (unsigned long *)eoi_exit_bitmap);
> - goto out;
> - }
> -
> - if (irq->dest_mode == 0) { /* physical mode */
> - if (irq->delivery_mode == APIC_DM_LOWEST ||
> - irq->dest_id == 0xff) {
> - __set_bit(irq->vector,
> -   (unsigned long *)eoi_exit_bitmap);
> - goto out;
> - }
> - dst = &map->phys_map[irq->dest_id & 0xff];
> - } else {
> - u32 mda = irq->dest_id << (32 - map->ldr_bits);
> -
> - dst = map->logical_map[apic_cluster_id(map, mda)];
> -
> - bitmap = apic_logical_id(map, mda);
> - }
> -
> - for_each_set_bit(i, &bitmap, 16) {
> - if (!dst[i])
> - continue;
> - if (dst[i]->vcpu == vcpu) {
> - __set_bit(irq->vector,
> -   (unsigned long *)eoi_exit_bitmap);
> - break;
> - }
> - }
> -
> -out:
> - rcu_read_unlock();
> -}
> -
>  static void recalculate_apic_map(struct kvm *kvm)
>  {
>   struct kvm_apic_map *new, *old = NULL;
> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> index 2c721b9..baa20cf 100644
> --- a/arch/x86/kvm/lapic.h
> +++ b/arch/x86/kvm/lapic.h
> @@ -160,10 +160,6 @@ static inline u16 apic_logical_id(struct kvm_apic_map *map, u32 ldr)
>   return ldr & map->lid_mask;
>  }
> -void kvm_calculate_eoi_exitmap(struct kvm_vcpu *vcpu,
> - struct kvm_lapic_irq *irq,
> - u64 *eoi_bitmap);
> -
>  static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
>  {
>   return vcpu->arch.apic->pending_events;
> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> index ce82b94..b54ddfa 100644
> --- a/virt/kvm/ioapic.c
> +++ b/virt/kvm/ioapic.c
> @@ -132,11 +132,10 @@ void kvm_ioapic_calculate_eoi_exitmap(struct kvm_vcpu *vcpu,
>   (e->fields.trig_mode == IOAPIC_LEVEL_TRIG ||
>kvm_irq_has_notifier(ioapic->kvm, KVM_IRQCHIP_IOAPIC,
>index))) {
> - irqe.dest_id = e->fields.dest_id;
> - irqe.vector = e->fields.vector;
> - irqe.dest_mode = e->fields.dest_mode;
> - irqe.delivery_mode = e->fields.delivery_mode << 8;
> - kvm_calculate_eoi_exitmap(vcpu, &irqe, eoi_exit_bitmap);
> + if (kvm_apic_match_dest(vcpu, NULL, 0,
> +   e->fields.dest_id, e->fields.dest_mode))
> +   __set_bit(irqe.vector,
> +     (unsigned long *)eoi_exit_bitmap);
>   }
>   }
>   spin_unlock(&ioapic->lock);
> --
> 1.7.1

Any comments?

Best regards,
Yang



RE: [PATCH v6 6/6] KVM: Use eoi to track RTC interrupt delivery status

2013-04-02 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-02:
> On Fri, Mar 29, 2013 at 03:25:16AM +0000, Zhang, Yang Z wrote:
>> Paolo Bonzini wrote on 2013-03-26:
>>> Il 22/03/2013 06:24, Yang Zhang ha scritto:
>>>> +static void rtc_irq_ack_eoi(struct kvm_vcpu *vcpu,
>>>> +  struct rtc_status *rtc_status, int irq)
>>>> +{
>>>> +  if (irq != RTC_GSI)
>>>> +  return;
>>>> +
>>>> +  if (test_and_clear_bit(vcpu->vcpu_id, rtc_status->dest_map))
>>>> +  --rtc_status->pending_eoi;
>>>> +
>>>> +  WARN_ON(rtc_status->pending_eoi < 0);
>>>> +}
>>> 
>>> This is the only case where you're passing the struct rtc_status instead
>>> of the struct kvm_ioapic.  Please use the latter, and make it the first
>>> argument.
>>> 
>>>> @@ -244,7 +268,14 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
>>>>irqe.level = 1;
>>>>irqe.shorthand = 0;
>>>> -  return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>>>> +  if (irq == RTC_GSI) {
>>>> +  ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe,
>>>> +  ioapic->rtc_status.dest_map);
>>>> +  ioapic->rtc_status.pending_eoi = ret;
>>> 
>>> I think you should either add a
>>> 
>>> BUG_ON(ioapic->rtc_status.pending_eoi != 0);
>>> or use "ioapic->rtc_status.pending_eoi += ret" (or both).
>>> 
>> A malicious guest may write EOI more than once, and pending_eoi will then
>> become negative. But that should not be treated as a bug; a WARN_ON is
>> enough. We already do that in ack_eoi, so there is no need to duplicate it
>> here.
>> 
> Since we track vcpus that already called EOI and decrement pending_eoi
> only once for each vcpu malicious guest cannot trigger it, but we
> already do WARN_ON() in rtc_irq_ack_eoi(), so I am not sure we need
> another one here. += will be correct (since pending_eoi == 0 here), but
> confusing since it makes an impression that pending_eoi may not be zero.
Yes, I had the wrong impression too.
With the previous implementation, pending_eoi could end up non-zero: the
destination vcpus were calculated by parsing the IOAPIC entry and, in
lowest-priority delivery mode, every possible vcpu was set in dest_map even if
it never received the interrupt. At the same time, a malicious guest could send
an IPI with the same vector as the RTC to vcpus that were in dest_map but had
no RTC interrupt, and pending_eoi would then go negative.
Now we set dest_map only for the vcpus that really received the interrupt, so
the above case cannot happen. As you and Paolo suggested, it is better to use
+=.
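
For the record, a standalone sketch of the bookkeeping as I now understand it
(plain C, not the patch itself; every sketch_ name is made up, and a 64-bit
word stands in for the dest_map bitmap). The point is that a vcpu can decrement
pending_eoi at most once per delivery, because its bit is cleared on the first
EOI.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_VCPUS 64

struct sketch_rtc_status {
	int pending_eoi;
	unsigned long long dest_map;	/* one bit per vcpu id */
};

/* Non-atomic test-and-clear; the real code runs under the ioapic lock. */
static bool sketch_test_and_clear_bit(int nr, unsigned long long *map)
{
	unsigned long long mask = 1ULL << nr;
	bool was_set = (*map & mask) != 0;

	*map &= ~mask;
	return was_set;
}

/* Delivery: record exactly the vcpus that received the interrupt. */
static void sketch_rtc_deliver(struct sketch_rtc_status *s,
			       unsigned long long received)
{
	int id;

	for (id = 0; id < SKETCH_MAX_VCPUS; id++) {
		if (received & (1ULL << id)) {
			s->dest_map |= 1ULL << id;
			s->pending_eoi += 1;
		}
	}
}

/* EOI: only the first EOI from a vcpu in dest_map is counted. */
static void sketch_rtc_ack_eoi(struct sketch_rtc_status *s, int vcpu_id)
{
	if (sketch_test_and_clear_bit(vcpu_id, &s->dest_map))
		--s->pending_eoi;
	assert(s->pending_eoi >= 0);	/* plays the role of the WARN_ON */
}
```

A second EOI from the same vcpu, or an EOI from a vcpu that never received the
RTC interrupt, leaves pending_eoi untouched.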

Best regards,
Yang



RE: [PATCH v7 4/7] KVM: Add reset/restore rtc_status support

2013-04-06 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-04:
> On Mon, Apr 01, 2013 at 08:40:13AM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |    9 +++++++++
>>  arch/x86/kvm/lapic.h |    2 ++
>>  virt/kvm/ioapic.c    |   43 +++++++++++++++++++++++++++++++++++++++++++
>>  virt/kvm/ioapic.h    |    1 +
>>  4 files changed, 55 insertions(+), 0 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 96ab160..9c041fa 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
>>  return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>  }
>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>> +{
>> +struct kvm_lapic *apic = vcpu->arch.apic;
>> +
>> +return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>> +apic_test_vector(vector, apic->regs + APIC_IRR);
>> +}
>> +
>>  static inline void apic_set_vector(int vec, void *bitmap)
>>  {
>>  set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>> @@ -1665,6 +1673,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
>>  apic->highest_isr_cache = -1;
>>  kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
>>  kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +kvm_rtc_irq_restore(vcpu);
>>  }
>>  
>>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index 967519c..004d2ad 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -170,4 +170,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
>>  return vcpu->arch.apic->pending_events;
>>  }
>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>> +
>>  #endif
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index 8664812..0b12b17 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -90,6 +90,47 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>  return result;
>>  }
>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>> +{
>> +ioapic->rtc_status.pending_eoi = 0;
>> +bitmap_zero(ioapic->rtc_status.dest_map, KVM_MAX_VCPUS);
>> +}
>> +
>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>> +{
>> +struct kvm_vcpu *vcpu;
>> +int vector, i, pending_eoi = 0;
>> +
>> +if (RTC_GSI >= IOAPIC_NUM_PINS)
>> +return;
>> +
>> +vector = ioapic->redirtbl[RTC_GSI].fields.vector;
>> +kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
>> +if (kvm_apic_pending_eoi(vcpu, vector)) {
>> +pending_eoi++;
>> +__set_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
> You should clear dest_map at the beginning to get rid of stale bits.
I thought kvm_set_ioapic() was called only after save/restore or migration, and
that the ioapic would have been reset before the call, so dest_map would be
empty before rtc_irq_restore() is called.
But it is possible for kvm_set_ioapic() to be called outside of save/restore or
migration, right?

> 
>> +}
>> +}
>> +ioapic->rtc_status.pending_eoi = pending_eoi;
>> +}
>> +
>> +void kvm_rtc_irq_restore(struct kvm_vcpu *vcpu)
>> +{
>> +struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
>> +int vector;
>> +
>> +if (!ioapic)
>> +return;
>> +
> Can this be called if ioapic == NULL?
Yes. IIRC, the unit tests exercise lapic functions without an ioapic.

> Should check for if (RTC_GSI >= IOAPIC_NUM_PINS) here too.
Not necessary. kvm_rtc_irq_restore() is called from "arch/x86/", where we have
the definition:
#ifdef CONFIG_X86
#define RTC_GSI 8

So the check would always be false. With the logic you suggest below, this
check is needed in _all(), not in _one().

> 
>> +spin_lock(&ioapic->lock);
>> +vector = ioapic->redirtbl[RTC_GSI].fields.vector;
>> +if (kvm_apic_pending_eoi(vcpu, vector)) {
>> +__set_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
>> +ioapic->rtc_status.pending_eoi++;
> The bit may have been set already. The logic should be:
Right.

>
> 
> new_val = kvm_apic_pending_eoi(vcpu, vector)
> old_val = set_bit(vcpu_id, dest_map)
> 
> if (new_val == old_val)
>   return;
> 
> if (new_val) {
>   __set_bit(vcpu_id, dest_map);
>   pending_eoi++;
> } else {
>   __clear_bit(vcpu_id, dest_map);
>   pending_eoi--;
> }
> 
> The naming of above two functions are not good either. Call
> them something like kvm_rtc_eoi_tracking_restore_all() and
> kvm_rtc_eoi_tracking_restore_one().  And _all should call _one() for
> each vcpu. Make __rtc_irq_eoi_tracking_restore_one() that does not
> take ioapic lock and call it from kvm_rtc_eoi_tracking_restore_one()
> surrounded by locks.
OK. Just to confirm that I understand correctly:

kvm_rtc_eoi_tracking_restore_all():
{
for_each_vcpu:
kvm_rtc_eoi_tracking_restore_one():
}

kvm_rtc_eoi_tracking_restore_one():
{
lock();
__rtc_irq_e

RE: [PATCH v7 4/7] KVM: Add reset/restore rtc_status support

2013-04-07 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-07:
> On Sun, Apr 07, 2013 at 02:30:15AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-04:
>>> On Mon, Apr 01, 2013 at 08:40:13AM +0800, Yang Zhang wrote:
>>>> From: Yang Zhang 
>>>> 
>>>> Signed-off-by: Yang Zhang 
>>>> ---
>>>>  arch/x86/kvm/lapic.c |    9 +++++++++
>>>>  arch/x86/kvm/lapic.h |    2 ++
>>>>  virt/kvm/ioapic.c    |   43 +++++++++++++++++++++++++++++++++++++++++++
>>>>  virt/kvm/ioapic.h    |    1 +
>>>>  4 files changed, 55 insertions(+), 0 deletions(-)
>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>> index 96ab160..9c041fa 100644
>>>> --- a/arch/x86/kvm/lapic.c
>>>> +++ b/arch/x86/kvm/lapic.c
>>>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
>>>>return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>>  }
>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>>>> +{
>>>> +  struct kvm_lapic *apic = vcpu->arch.apic;
>>>> +
>>>> +  return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>>>> +  apic_test_vector(vector, apic->regs + APIC_IRR);
>>>> +}
>>>> +
>>>>  static inline void apic_set_vector(int vec, void *bitmap)
>>>>  {
>>>>set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>> @@ -1665,6 +1673,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
>>>>apic->highest_isr_cache = -1;
>>>>kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
>>>>kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>> +  kvm_rtc_irq_restore(vcpu);
>>>>  }
>>>>  
>>>>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
>>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>>> index 967519c..004d2ad 100644
>>>> --- a/arch/x86/kvm/lapic.h
>>>> +++ b/arch/x86/kvm/lapic.h
>>>> @@ -170,4 +170,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
>>>>return vcpu->arch.apic->pending_events;
>>>>  }
>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>>>> +
>>>>  #endif
>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>> index 8664812..0b12b17 100644
>>>> --- a/virt/kvm/ioapic.c
>>>> +++ b/virt/kvm/ioapic.c
>>>> @@ -90,6 +90,47 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>>>return result;
>>>>  }
>>>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>>>> +{
>>>> +  ioapic->rtc_status.pending_eoi = 0;
>>>> +  bitmap_zero(ioapic->rtc_status.dest_map, KVM_MAX_VCPUS);
>>>> +}
>>>> +
>>>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>>>> +{
>>>> +  struct kvm_vcpu *vcpu;
>>>> +  int vector, i, pending_eoi = 0;
>>>> +
>>>> +  if (RTC_GSI >= IOAPIC_NUM_PINS)
>>>> +  return;
>>>> +
>>>> +  vector = ioapic->redirtbl[RTC_GSI].fields.vector;
>>>> +  kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
>>>> +  if (kvm_apic_pending_eoi(vcpu, vector)) {
>>>> +  pending_eoi++;
>>>> +  __set_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
>>> You should clear dest_map at the beginning to get rid of stale bits.
>> I thought kvm_set_ioapic() was called only after save/restore or migration,
>> and that the ioapic would have been reset before the call, so dest_map would
>> be empty before rtc_irq_restore() is called.
>> But it is possible for kvm_set_ioapic() to be called outside of save/restore
>> or migration, right?
>> 
> First of all userspace should not care when it calls kvm_set_ioapic()
> the kernel need to do the right thing. Second, believe it or not,
> kvm_ioapic_reset() is not called during system reset. Instead userspace
> reset it by calling kvm_set_ioapic() with ioapic state after reset.
OK, I see. With the logic you suggested, the dest_map bit is cleared when the
vcpu has no pending EOI, so we don't need to clear it again.
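
Here is how I read the suggested per-vcpu logic, as a standalone sketch (plain
C, not the patch; all sketch_ names are made up, a 64-bit word stands in for
dest_map, and I assume the old value is read with a test rather than a set).
The lapic_pending argument stands for the result of kvm_apic_pending_eoi().

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_rtc_track {
	int pending_eoi;
	unsigned long long dest_map;	/* one bit per vcpu id */
};

/* Bring the tracking state of one vcpu in line with its LAPIC state. */
static void sketch_restore_one(struct sketch_rtc_track *s, int vcpu_id,
			       bool lapic_pending)
{
	unsigned long long mask;
	bool old_val;

	assert(vcpu_id >= 0 && vcpu_id < 64);
	mask = 1ULL << vcpu_id;
	old_val = (s->dest_map & mask) != 0;

	if (lapic_pending == old_val)
		return;			/* already consistent */

	if (lapic_pending) {
		s->dest_map |= mask;
		s->pending_eoi++;
	} else {
		s->dest_map &= ~mask;	/* drops a stale bit */
		s->pending_eoi--;
	}
}
```

A stale bit is dropped only for a vcpu that this function is actually called
for.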

> 
>>> 
>>>> +  }
>>>> +  }
>>>> +  ioapic->rtc_status.pending_eoi = pending_eoi;
>>>> +}
>>>> +
>>>> +void kvm_rtc_irq_restore(struct kvm_vcpu *vcpu

RE: [PATCH v7 4/7] KVM: Add reset/restore rtc_status support

2013-04-07 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-07:
> On Sun, Apr 07, 2013 at 12:39:32PM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-07:
>>> On Sun, Apr 07, 2013 at 02:30:15AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2013-04-04:
>>>>> On Mon, Apr 01, 2013 at 08:40:13AM +0800, Yang Zhang wrote:
>>>>>> From: Yang Zhang 
>>>>>> 
>>>>>> Signed-off-by: Yang Zhang 
>>>>>> ---
>>>>>>  arch/x86/kvm/lapic.c |    9 +++++++++
>>>>>>  arch/x86/kvm/lapic.h |    2 ++
>>>>>>  virt/kvm/ioapic.c    |   43 +++++++++++++++++++++++++++++++++++++++++++
>>>>>>  virt/kvm/ioapic.h    |    1 +
>>>>>>  4 files changed, 55 insertions(+), 0 deletions(-)
>>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>>>> index 96ab160..9c041fa 100644
>>>>>> --- a/arch/x86/kvm/lapic.c
>>>>>> +++ b/arch/x86/kvm/lapic.c
>>>>>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
>>>>>>  return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>>>>  }
>>>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>>>>>> +{
>>>>>> +struct kvm_lapic *apic = vcpu->arch.apic;
>>>>>> +
>>>>>> +return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>>>>>> +apic_test_vector(vector, apic->regs + APIC_IRR);
>>>>>> +}
>>>>>> +
>>>>>>  static inline void apic_set_vector(int vec, void *bitmap)
>>>>>>  {
>>>>>>  set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>>>> @@ -1665,6 +1673,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
>>>>>>  apic->highest_isr_cache = -1;
>>>>>>  kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
>>>>>>  kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>>>> +kvm_rtc_irq_restore(vcpu);
>>>>>>  }
>>>>>>  
>>>>>>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
>>>>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>>>>> index 967519c..004d2ad 100644
>>>>>> --- a/arch/x86/kvm/lapic.h
>>>>>> +++ b/arch/x86/kvm/lapic.h
>>>>>> @@ -170,4 +170,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
>>>>>>  return vcpu->arch.apic->pending_events;
>>>>>>  }
>>>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>>>>>> +
>>>>>>  #endif
>>>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>>>> index 8664812..0b12b17 100644
>>>>>> --- a/virt/kvm/ioapic.c
>>>>>> +++ b/virt/kvm/ioapic.c
>>>>>> @@ -90,6 +90,47 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>>>>>  return result;
>>>>>>  }
>>>>>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>>>>>> +{
>>>>>> +ioapic->rtc_status.pending_eoi = 0;
>>>>>> +bitmap_zero(ioapic->rtc_status.dest_map, KVM_MAX_VCPUS);
>>>>>> +}
>>>>>> +
>>>>>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>>>>>> +{
>>>>>> +struct kvm_vcpu *vcpu;
>>>>>> +int vector, i, pending_eoi = 0;
>>>>>> +
>>>>>> +if (RTC_GSI >= IOAPIC_NUM_PINS)
>>>>>> +return;
>>>>>> +
>>>>>> +vector = ioapic->redirtbl[RTC_GSI].fields.vector;
>>>>>> +kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
>>>>>> +if (kvm_apic_pending_eoi(vcpu, vector)) {
>>>>>> +pending_eoi++;
>>>>>> +__set_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
>>>>> You should clear dest_map at the beginning to get rid of stale bits.
>>>> I thought kvm_set_ioapic() was called only after save/restore or migration,
>>>> and that the ioapic would have been reset before the call, so dest_map
>>>> would be empty before rtc_irq_restore() is called.
>>>> But it is possible for kvm_set_ioapic() to be called outside of
>>>> save/restore or migration, right?
>>>> 
>>> First of all userspace should not care when it calls kvm_set_ioapic()
>>> the kernel need to do the right thing. Second, believe it or not,
>>> kvm_ioapic_reset() is not called during system reset. Instead userspace
>>> reset it by calling kvm_set_ioapic() with ioapic state after reset.
>> OK, I see. With the logic you suggested, the dest_map bit is cleared when
>> the vcpu has no pending EOI, so we don't need to clear it again.
>> 
> You again rely on userspace doing things in a certain manner. What if
> set_lapic() is never called? Kernel internal state has to be correct
> after each ioctl call.
Sorry, I cannot figure out what the problem is if we don't clear dest_map. Can
you elaborate?

Best regards,
Yang




RE: [PATCH v7 4/7] KVM: Add reset/restore rtc_status support

2013-04-07 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-07:
> On Sun, Apr 07, 2013 at 01:05:02PM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-07:
>>> On Sun, Apr 07, 2013 at 12:39:32PM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2013-04-07:
>>>>> On Sun, Apr 07, 2013 at 02:30:15AM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2013-04-04:
>>>>>>> On Mon, Apr 01, 2013 at 08:40:13AM +0800, Yang Zhang wrote:
>>>>>>>> From: Yang Zhang 
>>>>>>>> 
>>>>>>>> Signed-off-by: Yang Zhang 
>>>>>>>> ---
>>>>>>>>  arch/x86/kvm/lapic.c |    9 +++++++++
>>>>>>>>  arch/x86/kvm/lapic.h |    2 ++
>>>>>>>>  virt/kvm/ioapic.c    |   43 +++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>  virt/kvm/ioapic.h    |    1 +
>>>>>>>>  4 files changed, 55 insertions(+), 0 deletions(-)
>>>>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>>>>>> index 96ab160..9c041fa 100644
>>>>>>>> --- a/arch/x86/kvm/lapic.c
>>>>>>>> +++ b/arch/x86/kvm/lapic.c
>>>>>>>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void
>>> *bitmap)
>>>>>>>>return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>>>>>>  }
>>>>>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>>>>>>>> +{
>>>>>>>> +  struct kvm_lapic *apic = vcpu->arch.apic;
>>>>>>>> +
>>>>>>>> +  return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>>>>>>>> +  apic_test_vector(vector, apic->regs + APIC_IRR);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  static inline void apic_set_vector(int vec, void *bitmap)
>>>>>>>>  {
>>>>>>>>set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>>>>>> @@ -1665,6 +1673,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
>>>>>>>>apic->highest_isr_cache = -1;
>>>>>>>>kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
>>>>>>>>kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>>>>>> +  kvm_rtc_irq_restore(vcpu);
>>>>>>>>  }
>>>>>>>>  
>>>>>>>>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
>>>>>>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>>>>>>> index 967519c..004d2ad 100644
>>>>>>>> --- a/arch/x86/kvm/lapic.h
>>>>>>>> +++ b/arch/x86/kvm/lapic.h
>>>>>>>> @@ -170,4 +170,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
>>>>>>>>return vcpu->arch.apic->pending_events;
>>>>>>>>  }
>>>>>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>>>>>>>> +
>>>>>>>>  #endif
>>>>>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>>>>>> index 8664812..0b12b17 100644
>>>>>>>> --- a/virt/kvm/ioapic.c
>>>>>>>> +++ b/virt/kvm/ioapic.c
>>>>>>>> @@ -90,6 +90,47 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>>>>>>>return result;
>>>>>>>>  }
>>>>>>>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>>>>>>>> +{
>>>>>>>> +  ioapic->rtc_status.pending_eoi = 0;
>>>>>>>> +  bitmap_zero(ioapic->rtc_status.dest_map, KVM_MAX_VCPUS);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
>>>>>>>> +{
>>>>>>>> +  struct kvm_vcpu *vcpu;
>>>>>>>> +  int vector, i, pending_eoi = 0;
>>>>>>>> +
>>>>>>>> +  if (RTC_GSI >= IOAPIC_NUM_PINS)
>>>>>>>> +  return;
>>>>>>>> +
>>>>>>

RE: [PATCH v7 4/7] KVM: Add reset/restore rtc_status support

2013-04-07 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-07:
> On Sun, Apr 07, 2013 at 01:16:51PM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-07:
>>> On Sun, Apr 07, 2013 at 01:05:02PM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2013-04-07:
>>>>> On Sun, Apr 07, 2013 at 12:39:32PM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2013-04-07:
>>>>>>> On Sun, Apr 07, 2013 at 02:30:15AM +0000, Zhang, Yang Z wrote:
>>>>>>>> Gleb Natapov wrote on 2013-04-04:
>>>>>>>>> On Mon, Apr 01, 2013 at 08:40:13AM +0800, Yang Zhang wrote:
>>>>>>>>>> From: Yang Zhang 
>>>>>>>>>> 
>>>>>>>>>> Signed-off-by: Yang Zhang 
>>>>>>>>>> ---
>>>>>>>>>>  arch/x86/kvm/lapic.c |    9 +
>>>>>>>>>>  arch/x86/kvm/lapic.h |    2 ++
>>>>>>>>>>  virt/kvm/ioapic.c    |   43 +++
>>>>>>>>>>  virt/kvm/ioapic.h    |    1 +
>>>>>>>>>>  4 files changed, 55 insertions(+), 0 deletions(-)
>>>>>>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>>>>>>>> index 96ab160..9c041fa 100644
>>>>>>>>>> --- a/arch/x86/kvm/lapic.c
>>>>>>>>>> +++ b/arch/x86/kvm/lapic.c
>>>>>>>>>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
>>>>>>>>>>  return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>>>>>>>>  }
>>>>>>>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>>>>>>>>>> +{
>>>>>>>>>> +struct kvm_lapic *apic = vcpu->arch.apic;
>>>>>>>>>> +
>>>>>>>>>> +return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>>>>>>>>>> +apic_test_vector(vector, apic->regs + APIC_IRR);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>>  static inline void apic_set_vector(int vec, void *bitmap)
>>>>>>>>>>  {
>>>>>>>>>>  set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>>>>>>>> @@ -1665,6 +1673,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
>>>>>>>>>>  apic->highest_isr_cache = -1;
>>>>>>>>>>  kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
>>>>>>>>>>  kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>>>>>>>> +kvm_rtc_irq_restore(vcpu);
>>>>>>>>>>  }
>>>>>>>>>>  
>>>>>>>>>>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
>>>>>>>>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>>>>>>>>> index 967519c..004d2ad 100644
>>>>>>>>>> --- a/arch/x86/kvm/lapic.h
>>>>>>>>>> +++ b/arch/x86/kvm/lapic.h
>>>>>>>>>> @@ -170,4 +170,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
>>>>>>>>>>  return vcpu->arch.apic->pending_events;
>>>>>>>>>>  }
>>>>>>>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>>>>>>>>>> +
>>>>>>>>>>  #endif
>>>>>>>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>>>>>>>> index 8664812..0b12b17 100644
>>>>>>>>>> --- a/virt/kvm/ioapic.c
>>>>>>>>>> +++ b/virt/kvm/ioapic.c
>>>>>>>>>> @@ -90,6 +90,47 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>>>>>>>>>  return result;
>>>>>>>>>>  }
>>>>>>>>>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>>>>>>>>>> +{
>>>>>>>>>> +ioapic->rtc_status.pending_eoi = 0;
>>>>>>>>>>

RE: [PATCH v7 4/7] KVM: Call common update function when ioapic entry changed.

2013-04-07 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-07:
> On Mon, Apr 01, 2013 at 11:32:32AM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Both TMR and EOI exit bitmap need to be updated when the ioapic changes
>> or the vcpu's id/ldr/dfr changes. So use a common function instead of the
>> EOI-exit-bitmap-specific function.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/ia64/kvm/lapic.h|6 --
>>  arch/x86/kvm/lapic.c |2 +-
>>  arch/x86/kvm/vmx.c   |3 +++
>>  arch/x86/kvm/x86.c   |   11 +++
>>  include/linux/kvm_host.h |4 ++--
>>  virt/kvm/ioapic.c|   26 +++---
>>  virt/kvm/ioapic.h|7 +++
>>  virt/kvm/irq_comm.c  |4 ++--
>>  virt/kvm/kvm_main.c  |4 ++--
>>  9 files changed, 35 insertions(+), 32 deletions(-)
>> diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
>> index c3e2935..c5f92a9 100644
>> --- a/arch/ia64/kvm/lapic.h
>> +++ b/arch/ia64/kvm/lapic.h
>> @@ -27,10 +27,4 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
>>  #define kvm_apic_present(x) (true)
>>  #define kvm_lapic_enabled(x) (true)
>> -static inline bool kvm_apic_vid_enabled(void)
>> -{
>> -/* IA64 has no apicv supporting, do nothing here */
>> -return false;
>> -}
>> -
>>  #endif
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index e227474..ce8d6f6 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -209,7 +209,7 @@ out:
>>  if (old)
>>  kfree_rcu(old, rcu);
>> -kvm_ioapic_make_eoibitmap_request(kvm);
>> +kvm_vcpu_scan_ioapic(kvm);
>>  }
>>  
>>  static inline void kvm_apic_set_id(struct kvm_lapic *apic, u8 id)
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index b2e95bc..edfc87a 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -6420,6 +6420,9 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
>> 
>>  static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
>>  {
>> +if (!vmx_vm_has_apicv(vcpu->kvm))
>> +return;
>> +
>>  vmcs_write64(EOI_EXIT_BITMAP0, eoi_exit_bitmap[0]);
>>  vmcs_write64(EOI_EXIT_BITMAP1, eoi_exit_bitmap[1]);
>>  vmcs_write64(EOI_EXIT_BITMAP2, eoi_exit_bitmap[2]);
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 4d42fe1..64241b6 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5647,13 +5647,16 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
>>  #endif
>>  }
>> -static void update_eoi_exitmap(struct kvm_vcpu *vcpu)
>> +static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>>  {
>>  u64 eoi_exit_bitmap[4];
>> +if (!kvm_lapic_enabled(vcpu))
>> +return;
>> +
> Why is this needed here?
We don't need to calculate eoi_exit_bitmap and TMR if lapic is not enabled. 
Also, ioapic is meaningless for the vcpu that doesn't enable the lapic.

> 
>>  memset(eoi_exit_bitmap, 0, 32);
>> -kvm_ioapic_calculate_eoi_exitmap(vcpu, eoi_exit_bitmap);
>> +kvm_ioapic_scan_entry(vcpu, (unsigned long *)eoi_exit_bitmap);
>>  kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
>>  }
>> @@ -5710,8 +5713,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>  kvm_handle_pmu_event(vcpu);
>>  if (kvm_check_request(KVM_REQ_PMI, vcpu))
>>  kvm_deliver_pmi(vcpu);
>> -if (kvm_check_request(KVM_REQ_EOIBITMAP, vcpu))
>> -update_eoi_exitmap(vcpu);
>> +if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
>> +vcpu_scan_ioapic(vcpu);
>>  }
>>  
>>  if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 1c0be23..ef1b3e3 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -126,7 +126,7 @@ static inline bool is_error_page(struct page *page)
>>  #define KVM_REQ_MASTERCLOCK_UPDATE 19
>>  #define KVM_REQ_MCLOCK_INPROGRESS 20
>>  #define KVM_REQ_EPR_EXIT  21
>> -#define KVM_REQ_EOIBITMAP 22
>> +#define KVM_REQ_SCAN_IOAPIC   22
>> 
>>  #define KVM_USERSPACE_IRQ_SOURCE_ID 0
>>  #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1
>> @@ -571,7 +571,7 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
>>  void kvm_flush_remote_tlbs(struct kvm *kvm);
>>  void kvm_reload_remote_mmus(struct kvm *kvm);
>>  void kvm_make_mclock_inprogress_request(struct kvm *kvm);
>> -void kvm_make_update_eoibitmap_request(struct kvm *kvm);
>> +void kvm_make_scan_ioapic_request(struct kvm *kvm);
>> 
>>  long kvm_arch_dev_ioctl(struct file *filp,
>>  unsigned int ioctl, unsigned long arg);
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index fbd0556..f37c889 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -119,8 +119,8 @@ static void update_handled_vectors(struct kvm_ioapic *ioapic)
>>  smp_wmb();
>>  }
>> -void kvm_ioapic_calculate_eo

RE: [PATCH v7 3/7] KVM: VMX: Check the posted interrupt capability

2013-04-07 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-07:
> On Mon, Apr 01, 2013 at 11:32:31AM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Detect the posted interrupt feature. If it exists, then set it in 
>> vmcs_config.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/include/asm/vmx.h |    4 ++
>>  arch/x86/kvm/vmx.c         |   87 ++--
>>  2 files changed, 71 insertions(+), 20 deletions(-)
>> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
>> index fc1c313..6f07f19 100644
>> --- a/arch/x86/include/asm/vmx.h
>> +++ b/arch/x86/include/asm/vmx.h
>> @@ -71,6 +71,7 @@
>>  #define PIN_BASED_NMI_EXITING   0x0008
>>  #define PIN_BASED_VIRTUAL_NMIS  0x0020
>>  #define PIN_BASED_VMX_PREEMPTION_TIMER  0x0040
>> +#define PIN_BASED_POSTED_INTR   0x0080
>> 
>>  #define PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR 0x0016
>> @@ -102,6 +103,7 @@
>>  /* VMCS Encodings */
>>  enum vmcs_field {
>>  VIRTUAL_PROCESSOR_ID            = 0x0000,
>> +POSTED_INTR_NV                  = 0x0002,
>>  GUEST_ES_SELECTOR               = 0x0800,
>>  GUEST_CS_SELECTOR               = 0x0802,
>>  GUEST_SS_SELECTOR               = 0x0804,
>> @@ -136,6 +138,8 @@ enum vmcs_field {
>>  VIRTUAL_APIC_PAGE_ADDR_HIGH     = 0x2013,
>>  APIC_ACCESS_ADDR                = 0x2014,
>>  APIC_ACCESS_ADDR_HIGH           = 0x2015,
>> +POSTED_INTR_DESC_ADDR   = 0x2016,
>> +POSTED_INTR_DESC_ADDR_HIGH  = 0x2017,
>>  EPT_POINTER = 0x201a,
>>  EPT_POINTER_HIGH= 0x201b,
>>  EOI_EXIT_BITMAP0= 0x201c,
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 7408d93..b2e95bc 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -84,7 +84,8 @@ module_param(vmm_exclusive, bool, S_IRUGO);
>>  static bool __read_mostly fasteoi = 1;
>>  module_param(fasteoi, bool, S_IRUGO);
>> -static bool __read_mostly enable_apicv_reg_vid;
>> +static bool __read_mostly enable_apicv;
>> +module_param(enable_apicv, bool, S_IRUGO);
>> 
>>  /*
>>   * If nested=1, nested virtualization is supported, i.e., guests may use
>> @@ -366,6 +367,19 @@ struct nested_vmx {
>>  struct page *apic_access_page;
>>  };
>> +#define POSTED_INTR_ON  0
>> +/* Posted-Interrupt Descriptor */
>> +struct pi_desc {
>> +u32 pir[8]; /* Posted interrupt requested */
>> +union {
>> +struct {
>> +u8  on:1,
> Do you actually use the 'on' member of the bit field? As far as I can
'on' is just an indicator to make code review easier. I don't use it.

> tell the paths always access control with (set|clear)_bit(). And C does not
> guarantee the layout of the bit field, so 'on' may not point to what you think
> it points to.
You are right. I will remove it and add a comment instead.

Best regards,
Yang
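
As an illustration of the point about bit-field layout, here is a minimal userspace sketch that keeps the control word as a plain integer and touches the ON bit only through explicit masks, so the code never depends on how a compiler lays out a bit field. The names (pi_desc_model and the pi_model_* helpers) are hypothetical, C11 atomics stand in for the kernel's test_and_set_bit()/test_and_clear_bit(), and the 64-byte alignment of the real descriptor is left out for brevity.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the control word: outstanding notification (POSTED_INTR_ON in the patch). */
#define PI_ON_BIT 0u

/* Hypothetical userspace model of the descriptor; not the kernel's struct pi_desc. */
struct pi_desc_model {
	_Atomic uint32_t pir[8];   /* 256 posted-interrupt request bits */
	_Atomic uint32_t control;  /* bit 0 = ON; the other bits are unused in this model */
};

/* Atomically set ON and report whether it was already set. */
static bool pi_model_test_and_set_on(struct pi_desc_model *d)
{
	uint32_t old = atomic_fetch_or(&d->control, 1u << PI_ON_BIT);
	return old & (1u << PI_ON_BIT);
}

/* Atomically clear ON and report whether it was set. */
static bool pi_model_test_and_clear_on(struct pi_desc_model *d)
{
	uint32_t old = atomic_fetch_and(&d->control, ~(1u << PI_ON_BIT));
	return old & (1u << PI_ON_BIT);
}

/* Atomically set the PIR bit for a vector (0..255) and report its old value. */
static bool pi_model_test_and_set_pir(struct pi_desc_model *d, unsigned vector)
{
	uint32_t mask = 1u << (vector % 32);
	return atomic_fetch_or(&d->pir[vector / 32], mask) & mask;
}
```

Because every access goes through a named bit number and a mask, the position of ON is fixed by the code itself, which is the property the review comment asks for.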




RE: [PATCH v7 4/7] KVM: Call common update function when ioapic entry changed.

2013-04-07 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-07:
> On Sun, Apr 07, 2013 at 02:00:04PM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-07:
>>> On Mon, Apr 01, 2013 at 11:32:32AM +0800, Yang Zhang wrote:
>>>> From: Yang Zhang 
>>>> 
>>>> Both TMR and EOI exit bitmap need to be updated when the ioapic changes
>>>> or the vcpu's id/ldr/dfr changes. So use a common function instead of the
>>>> EOI-exit-bitmap-specific function.
>>>> 
>>>> Signed-off-by: Yang Zhang 
>>>> ---
>>>>  arch/ia64/kvm/lapic.h|6 --
>>>>  arch/x86/kvm/lapic.c |2 +-
>>>>  arch/x86/kvm/vmx.c   |3 +++
>>>>  arch/x86/kvm/x86.c   |   11 +++
>>>>  include/linux/kvm_host.h |4 ++--
>>>>  virt/kvm/ioapic.c|   26 +++---
>>>>  virt/kvm/ioapic.h|7 +++
>>>>  virt/kvm/irq_comm.c  |4 ++--
>>>>  virt/kvm/kvm_main.c  |4 ++--
>>>>  9 files changed, 35 insertions(+), 32 deletions(-)
>>>> diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
>>>> index c3e2935..c5f92a9 100644
>>>> --- a/arch/ia64/kvm/lapic.h
>>>> +++ b/arch/ia64/kvm/lapic.h
>>>> @@ -27,10 +27,4 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
>>>>  #define kvm_apic_present(x) (true)
>>>>  #define kvm_lapic_enabled(x) (true)
>>>> -static inline bool kvm_apic_vid_enabled(void)
>>>> -{
>>>> -  /* IA64 has no apicv supporting, do nothing here */
>>>> -  return false;
>>>> -}
>>>> -
>>>>  #endif
>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>> index e227474..ce8d6f6 100644
>>>> --- a/arch/x86/kvm/lapic.c
>>>> +++ b/arch/x86/kvm/lapic.c
>>>> @@ -209,7 +209,7 @@ out:
>>>>if (old)
>>>>kfree_rcu(old, rcu);
>>>> -  kvm_ioapic_make_eoibitmap_request(kvm);
>>>> +  kvm_vcpu_scan_ioapic(kvm);
>>>>  }
>>>>  
>>>>  static inline void kvm_apic_set_id(struct kvm_lapic *apic, u8 id)
>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>> index b2e95bc..edfc87a 100644
>>>> --- a/arch/x86/kvm/vmx.c
>>>> +++ b/arch/x86/kvm/vmx.c
>>>> @@ -6420,6 +6420,9 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
>>>> 
>>>>  static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
>>>>  {
>>>> +  if (!vmx_vm_has_apicv(vcpu->kvm))
>>>> +  return;
>>>> +
>>>>vmcs_write64(EOI_EXIT_BITMAP0, eoi_exit_bitmap[0]);
>>>>vmcs_write64(EOI_EXIT_BITMAP1, eoi_exit_bitmap[1]);
>>>>vmcs_write64(EOI_EXIT_BITMAP2, eoi_exit_bitmap[2]);
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index 4d42fe1..64241b6 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -5647,13 +5647,16 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
>>>>  #endif
>>>>  }
>>>> -static void update_eoi_exitmap(struct kvm_vcpu *vcpu)
>>>> +static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>>>>  {
>>>>u64 eoi_exit_bitmap[4];
>>>> +  if (!kvm_lapic_enabled(vcpu))
>>>> +  return;
>>>> +
>>> Why is this needed here?
>> We don't need to calculate eoi_exit_bitmap and TMR if lapic is not
>> enabled. Also, ioapic is meaningless for the vcpu that doesn't enable
>> the lapic.
>> 
> OK, but then let's use apic_enabled() since kvm_lapic_enabled() also
> checks for in kernel apic and we should not be here if apic is not
Sure.

> emulated in kernel. Also please make sure that we rescan ioapic on all
> apic state changes.
Yes. recalculate_apic_map() is called on all apic state changes, so requesting
an ioapic scan from recalculate_apic_map() is enough.

Best regards,
Yang
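
The flow agreed on above (request a scan on every apic state change, then let each vcpu act on it before guest entry and skip the work when its apic is disabled) can be sketched in userspace as follows. This is only a model under stated assumptions: vcpu_model and the model_* functions are hypothetical stand-ins, C11 atomics replace the kernel's request-bit helpers, and only the bit number 22 is taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_REQ_SCAN_IOAPIC 22  /* same bit number as KVM_REQ_SCAN_IOAPIC in the patch */

/* Hypothetical per-vcpu state for the model. */
struct vcpu_model {
	_Atomic unsigned long requests;  /* one bit per pending request */
	bool apic_enabled;
	unsigned scans;                  /* how many times the scan actually ran */
};

/* Post a request to one vcpu. */
static void model_make_request(struct vcpu_model *v, int req)
{
	atomic_fetch_or(&v->requests, 1UL << req);
}

/* Test and clear a request in one atomic step; true if it was pending. */
static bool model_check_request(struct vcpu_model *v, int req)
{
	unsigned long mask = 1UL << req;
	return atomic_fetch_and(&v->requests, ~mask) & mask;
}

/* Stand-in for the apic-state-change path: ask every vcpu to rescan. */
static void model_apic_state_changed(struct vcpu_model *vcpus, int n)
{
	for (int i = 0; i < n; i++)
		model_make_request(&vcpus[i], MODEL_REQ_SCAN_IOAPIC);
}

/* Stand-in for the request check done before entering the guest. */
static void model_enter_guest(struct vcpu_model *v)
{
	if (model_check_request(v, MODEL_REQ_SCAN_IOAPIC)) {
		if (!v->apic_enabled)
			return;      /* nothing to compute for a disabled apic */
		v->scans++;          /* the real code would rebuild the bitmaps here */
	}
}
```

In this model a vcpu with a disabled apic still consumes the request, so the bit does not stay pending, but it performs no scan.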




RE: [PATCH v7 6/7] KVM: VMX: Add the algorithm of deliver posted interrupt

2013-04-07 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-07:
> On Mon, Apr 01, 2013 at 11:32:34AM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Only deliver the posted interrupt when target vcpu is running
>> and there is no previous interrupt pending in pir.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/include/asm/kvm_host.h |    2 +
>>  arch/x86/kvm/lapic.c            |   13
>>  arch/x86/kvm/lapic.h            |    1 +
>>  arch/x86/kvm/svm.c              |    6
>>  arch/x86/kvm/vmx.c              |   60 ++-
>>  virt/kvm/kvm_main.c             |    1 +
>>  6 files changed, 82 insertions(+), 1 deletions(-)
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 8e95512..842ea5a 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -704,6 +704,8 @@ struct kvm_x86_ops {
>>  void (*hwapic_isr_update)(struct kvm *kvm, int isr);
>>  void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
>>  void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
>> +void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
>> +void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
>>  int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>>  int (*get_tdp_level)(void);
>>  u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 686afee..95e8f4a 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -310,6 +310,19 @@ static u8 count_vectors(void *bitmap)
>>  return count;
>>  }
>> +void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir)
>> +{
>> +u32 i, pir_val;
>> +struct kvm_lapic *apic = vcpu->arch.apic;
>> +
>> +for (i = 0; i <= 7; i++) {
>> +pir_val = xchg(&pir[i], 0);
>> +if (pir_val)
>> +*((u32 *)(apic->regs + APIC_IRR + i * 0x10)) |= pir_val;
>> +}
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
>> +
>>  static inline int apic_test_and_set_irr(int vec, struct kvm_lapic *apic)
>>  {
>>  apic->irr_pending = true;
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index 599076e..16c3949 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -54,6 +54,7 @@ u64 kvm_lapic_get_base(struct kvm_vcpu *vcpu);
>>  void kvm_apic_set_version(struct kvm_vcpu *vcpu);
>>  
>>  void kvm_apic_update_tmr(struct kvm_vcpu *vcpu, u32 *tmr);
>> +void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir);
>>  int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u16 dest);
>>  int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda);
>>  int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index 2f8fe3f..d6713e1 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -3577,6 +3577,11 @@ static void svm_hwapic_isr_update(struct kvm *kvm, int isr)
>>  return;
>>  }
>> +static void svm_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>> +{
>> +return;
>> +}
>> +
>>  static int svm_nmi_allowed(struct kvm_vcpu *vcpu)
>>  {
>>  struct vcpu_svm *svm = to_svm(vcpu);
>> @@ -4305,6 +4310,7 @@ static struct kvm_x86_ops svm_x86_ops = {
>>  .vm_has_apicv = svm_vm_has_apicv,
>>  .load_eoi_exitmap = svm_load_eoi_exitmap,
>>  .hwapic_isr_update = svm_hwapic_isr_update,
>> +.sync_pir_to_irr = svm_sync_pir_to_irr,
>> 
>>  .set_tss_addr = svm_set_tss_addr,
>>  .get_tdp_level = get_npt_level,
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index edfc87a..690734c 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -380,6 +380,23 @@ struct pi_desc {
>>  } u;
>>  } __aligned(64);
>> +static bool pi_test_and_set_on(struct pi_desc *pi_desc)
>> +{
>> +return test_and_set_bit(POSTED_INTR_ON,
>> +(unsigned long *)&pi_desc->u.control);
>> +}
>> +
>> +static bool pi_test_and_clear_on(struct pi_desc *pi_desc)
>> +{
>> +return test_and_clear_bit(POSTED_INTR_ON,
>> +(unsigned long *)&pi_desc->u.control);
>> +}
>> +
>> +static int pi_test_and_set_pir(int vector, struct pi_desc *pi_desc)
>> +{
>> +return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
>> +}
>> +
>>  struct vcpu_vmx {
>>  struct kvm_vcpu   vcpu;
>>  unsigned long host_rsp;
>> @@ -2851,8 +2868,10 @@ static __init int hardware_setup(void)
>> 
>>  if (enable_apicv)
>>  kvm_x86_ops->update_cr8_intercept = NULL;
>> -else
>> +else {
>>  kvm_x86_ops->hwapic_irr_update = NULL;
>> +kvm_x86_ops->deliver_posted_interrupt = NULL;
>> +}
>> 
>>  if (nested)
>>  nested_vmx_setup_ctls_msrs();
>> @@ -3914,6 +3933,43 @@ static int vmx_vm_has_apicv(struct kvm *kvm)
>>  }
>>  
>>  /*
>> + * Send interrupt to vcpu via posted interrupt way.
>> + * 1. If target vcpu is 
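
The PIR-to-IRR merge done by kvm_apic_update_irr() in the quoted hunk can be tried out in userspace with the sketch below. It is a model, not the kernel code: the register page is represented as an array of 32-bit words, C11 atomic_exchange() stands in for xchg(), and model_update_irr()/model_irr_test() are hypothetical names. The IRR base offset 0x200 and the 0x10 stride between 32-bit IRR registers follow the local APIC register layout.

```c
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_APIC_IRR 0x200u  /* byte offset of the first IRR register in the APIC page */

/*
 * Move every posted bit from the 256-bit PIR into the IRR.
 * Each PIR word is fetched and zeroed in one atomic step, so a bit posted
 * concurrently is either picked up now or left for the next sync; it is
 * never lost. regs_words is the APIC page viewed as 32-bit words.
 */
static void model_update_irr(_Atomic uint32_t *pir, uint32_t *regs_words)
{
	for (unsigned i = 0; i < 8; i++) {
		uint32_t pir_val = atomic_exchange(&pir[i], 0);
		if (pir_val)
			regs_words[(MODEL_APIC_IRR + i * 0x10) / 4] |= pir_val;
	}
}

/* Return 1 if the given vector (0..255) is set in the modelled IRR. */
static int model_irr_test(const uint32_t *regs_words, unsigned vec)
{
	uint32_t word = regs_words[(MODEL_APIC_IRR + (vec / 32) * 0x10) / 4];
	return (word >> (vec % 32)) & 1u;
}
```

Posting vectors 35 and 255 into the PIR and then calling model_update_irr() leaves both bits set in the modelled IRR and the PIR words cleared.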

RE: [PATCH v7 4/7] KVM: Add reset/restore rtc_status support

2013-04-08 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-07:
> On Sun, Apr 07, 2013 at 01:16:51PM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-07:
>>> On Sun, Apr 07, 2013 at 01:05:02PM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2013-04-07:
>>>>> On Sun, Apr 07, 2013 at 12:39:32PM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2013-04-07:
>>>>>>> On Sun, Apr 07, 2013 at 02:30:15AM +0000, Zhang, Yang Z wrote:
>>>>>>>> Gleb Natapov wrote on 2013-04-04:
>>>>>>>>> On Mon, Apr 01, 2013 at 08:40:13AM +0800, Yang Zhang wrote:
>>>>>>>>>> From: Yang Zhang 
>>>>>>>>>> 
>>>>>>>>>> Signed-off-by: Yang Zhang 
>>>>>>>>>> ---
>>>>>>>>>>  arch/x86/kvm/lapic.c |    9 +
>>>>>>>>>>  arch/x86/kvm/lapic.h |    2 ++
>>>>>>>>>>  virt/kvm/ioapic.c    |   43 +++
>>>>>>>>>>  virt/kvm/ioapic.h    |    1 +
>>>>>>>>>>  4 files changed, 55 insertions(+), 0 deletions(-)
>>>>>>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>>>>>>>> index 96ab160..9c041fa 100644
>>>>>>>>>> --- a/arch/x86/kvm/lapic.c
>>>>>>>>>> +++ b/arch/x86/kvm/lapic.c
>>>>>>>>>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
>>>>>>>>>>  return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>>>>>>>>  }
>>>>>>>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>>>>>>>>>> +{
>>>>>>>>>> +struct kvm_lapic *apic = vcpu->arch.apic;
>>>>>>>>>> +
>>>>>>>>>> +return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>>>>>>>>>> +apic_test_vector(vector, apic->regs + APIC_IRR);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>>  static inline void apic_set_vector(int vec, void *bitmap)
>>>>>>>>>>  {
>>>>>>>>>>  set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>>>>>>>> @@ -1665,6 +1673,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
>>>>>>>>>>  apic->highest_isr_cache = -1;
>>>>>>>>>>  kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
>>>>>>>>>>  kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>>>>>>>> +kvm_rtc_irq_restore(vcpu);
>>>>>>>>>>  }
>>>>>>>>>>  
>>>>>>>>>>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
>>>>>>>>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>>>>>>>>> index 967519c..004d2ad 100644
>>>>>>>>>> --- a/arch/x86/kvm/lapic.h
>>>>>>>>>> +++ b/arch/x86/kvm/lapic.h
>>>>>>>>>> @@ -170,4 +170,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
>>>>>>>>>>  return vcpu->arch.apic->pending_events;
>>>>>>>>>>  }
>>>>>>>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>>>>>>>>>> +
>>>>>>>>>>  #endif
>>>>>>>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>>>>>>>> index 8664812..0b12b17 100644
>>>>>>>>>> --- a/virt/kvm/ioapic.c
>>>>>>>>>> +++ b/virt/kvm/ioapic.c
>>>>>>>>>> @@ -90,6 +90,47 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>>>>>>>>>  return result;
>>>>>>>>>>  }
>>>>>>>>>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
>>>>>>>>>> +{
>>>>>>>>>> +ioapic->rtc_status.pending_eoi = 0;
>>>>>>>>>>

RE: [PATCH v7 4/7] KVM: Add reset/restore rtc_status support

2013-04-08 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-08:
> On Mon, Apr 08, 2013 at 11:21:34AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-07:
>>> On Sun, Apr 07, 2013 at 01:16:51PM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2013-04-07:
>>>>> On Sun, Apr 07, 2013 at 01:05:02PM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2013-04-07:
>>>>>>> On Sun, Apr 07, 2013 at 12:39:32PM +0000, Zhang, Yang Z wrote:
>>>>>>>> Gleb Natapov wrote on 2013-04-07:
>>>>>>>>> On Sun, Apr 07, 2013 at 02:30:15AM +0000, Zhang, Yang Z wrote:
>>>>>>>>>> Gleb Natapov wrote on 2013-04-04:
>>>>>>>>>>> On Mon, Apr 01, 2013 at 08:40:13AM +0800, Yang Zhang wrote:
>>>>>>>>>>>> From: Yang Zhang 
>>>>>>>>>>>> 
>>>>>>>>>>>> Signed-off-by: Yang Zhang 
>>>>>>>>>>>> ---
>>>>>>>>>>>>  arch/x86/kvm/lapic.c |    9 +
>>>>>>>>>>>>  arch/x86/kvm/lapic.h |    2 ++
>>>>>>>>>>>>  virt/kvm/ioapic.c    |   43 +++
>>>>>>>>>>>>  virt/kvm/ioapic.h    |    1 +
>>>>>>>>>>>>  4 files changed, 55 insertions(+), 0 deletions(-)
>>>>>>>>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>>>>>>>>>> index 96ab160..9c041fa 100644
>>>>>>>>>>>> --- a/arch/x86/kvm/lapic.c
>>>>>>>>>>>> +++ b/arch/x86/kvm/lapic.c
>>>>>>>>>>>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
>>>>>>>>>>>>return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>>>>>>>>>>  }
>>>>>>>>>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +  struct kvm_lapic *apic = vcpu->arch.apic;
>>>>>>>>>>>> +
>>>>>>>>>>>> +  return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>>>>>>>>>>>> +  apic_test_vector(vector, apic->regs + APIC_IRR);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>>  static inline void apic_set_vector(int vec, void *bitmap)
>>>>>>>>>>>>  {
>>>>>>>>>>>>set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>>>>>>>>>>> @@ -1665,6 +1673,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
>>>>>>>>>>>>apic->highest_isr_cache = -1;
>>>>>>>>>>>>kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
>>>>>>>>>>>>kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>>>>>>>>>> +  kvm_rtc_irq_restore(vcpu);
>>>>>>>>>>>>  }
>>>>>>>>>>>>  
>>>>>>>>>>>>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
>>>>>>>>>>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>>>>>>>>>>> index 967519c..004d2ad 100644
>>>>>>>>>>>> --- a/arch/x86/kvm/lapic.h
>>>>>>>>>>>> +++ b/arch/x86/kvm/lapic.h
>>>>>>>>>>>> @@ -170,4 +170,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
>>>>>>>>>>>>return vcpu->arch.apic->pending_events;
>>>>>>>>>>>>  }
>>>>>>>>>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>>>>>>>>>>>> +
>>>>>>>>>>>>  #endif
>>>>>>>>>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>>>>>>>>>> index 8664812..0b12b17 10064

RE: [PATCH v8 4/7] KVM: Add reset/restore rtc_status support

2013-04-09 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-09:
> On Mon, Apr 08, 2013 at 10:17:46PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |    9 +++
>>  arch/x86/kvm/lapic.h |    2 +
>>  virt/kvm/ioapic.c    |   60 ++
>>  virt/kvm/ioapic.h    |    1 +
>>  4 files changed, 72 insertions(+), 0 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 0b73402..6796218 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
>>  return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>>  }
>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
>> +{
>> +struct kvm_lapic *apic = vcpu->arch.apic;
>> +
>> +return apic_test_vector(vector, apic->regs + APIC_ISR) ||
>> +apic_test_vector(vector, apic->regs + APIC_IRR);
>> +}
>> +
>>  static inline void apic_set_vector(int vec, void *bitmap)
>>  {
>>  set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
>> @@ -1618,6 +1626,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
>>  apic->highest_isr_cache = -1;
>>  kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
>>  kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +kvm_rtc_eoi_tracking_restore_one(vcpu);
>>  }
>>  
>>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index 3e5a431..16304b1 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -166,4 +166,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
>>  return vcpu->arch.apic->pending_events;
>>  }
>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
>> +
>>  #endif
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index 27ae8dd..4699180 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -90,6 +90,64 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
>>  return result;
>>  }
>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
> rtc_irq_eoi_tracking_reset()
Sure. 

> 
>> +{
>> +ioapic->rtc_status.pending_eoi = 0;
>> +bitmap_zero(ioapic->rtc_status.dest_map, KVM_MAX_VCPUS);
>> +}
>> +
>> +static void __rtc_irq_eoi_tracking_restore_one(struct kvm_vcpu *vcpu,
>> +int vector)
>> +{
>> +bool new_val, old_val;
>> +struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
>> +union kvm_ioapic_redirect_entry *e;
>> +
>> +e = &ioapic->redirtbl[RTC_GSI];
>> +if (!kvm_apic_match_dest(vcpu, NULL, 0, e->fields.dest_id,
>> +e->fields.dest_mode))
>> +return;
>> +
>> +new_val = kvm_apic_pending_eoi(vcpu, vector);
>> +old_val = test_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
>> +
>> +if (new_val == old_val)
>> +return;
>> +
>> +if (new_val) {
>> +__set_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
>> +ioapic->rtc_status.pending_eoi++;
>> +} else {
>> +__clear_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
>> +ioapic->rtc_status.pending_eoi--;
>> +}
> WARN_ON(ioapic->rtc_status.pending_eoi < 0);
Sure.

> 
>> +}
>> +
>> +void kvm_rtc_eoi_tracking_restore_one(struct kvm_vcpu *vcpu)
>> +{
>> +struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
>> +int vector;
>> +
>> +vector = ioapic->redirtbl[RTC_GSI].fields.vector;
> Do not access ioapic outside of the lock. Also since you access
> ioapic->redirtbl[RTC_GSI] in __rtc_irq_eoi_tracking_restore_one()
> anyway what's the point passing vector to it?
Right. 

> 
>> +spin_lock(&ioapic->lock);
>> +__rtc_irq_eoi_tracking_restore_one(vcpu, vector);
>> +spin_unlock(&ioapic->lock);
>> +}
>> +
>> +static void kvm_rtc_eoi_tracking_restore_all(struct kvm_ioapic *ioapic)
>> +{
>> +struct kvm_vcpu *vcpu;
>> +int i, vector;
>> +
>> +if (RTC_GSI >= IOAPIC_NUM_PINS)
>> +return;
>> +
>> +rtc_irq_reset(ioapic);
>> +vector = ioapic->redirtbl[RTC_GSI].fields.vector;
>> +kvm_for_each_vcpu(i, vcpu, ioapic->kvm)
>> +__rtc_irq_eoi_tracking_restore_one(vcpu, vector);
>> +}
>> +
>>  static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx)
>>  {
>>  union kvm_ioapic_redirect_entry *pent;
>> @@ -428,6 +486,7 @@ void kvm_ioapic_reset(struct kvm_ioapic *ioapic)
>>  ioapic->ioregsel = 0;
>>  ioapic->irr = 0;
>>  ioapic->id = 0;
>> +rtc_irq_reset(ioapic);
>>  update_handled_vectors(ioapic);
>>  }
>> @@ -494,6 +553,7 @@ int kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state)
>>  memcpy(ioapic, state, sizeof(struct kvm_ioapic_state));
>>  update_handled_vectors(ioapic);
>>  kvm_ioapic_make_eoibitmap_request(kvm);
>> +kvm_rtc_eoi_tracking_restore_all(ioapic);
>>  spin_unlock(&ioapic->lock);
>>  retur

RE: [PATCH v8 7/7] KVM: VMX: Use posted interrupt to deliver virtual interrupt

2013-04-09 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-09:
> On Mon, Apr 08, 2013 at 10:23:22PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> If posted interrupt is available, use it to inject virtual
>> interrupts into the guest.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |   29 ++---
>>  arch/x86/kvm/vmx.c   |2 +-
>>  arch/x86/kvm/x86.c   |1 +
>>  3 files changed, 20 insertions(+), 12 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 8948979..46a4cca 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -353,6 +353,7 @@ static inline int apic_find_highest_irr(struct kvm_lapic
> *apic)
>>  if (!apic->irr_pending)
>>  return -1;
>> +kvm_x86_ops->sync_pir_to_irr(apic->vcpu);
>>  result = apic_search_irr(apic);
>>  ASSERT(result == -1 || result >= 16);
>> @@ -683,18 +684,24 @@ static int __apic_accept_irq(struct kvm_lapic *apic, 
>> int
> delivery_mode,
>>  if (dest_map)
>>  __set_bit(vcpu->vcpu_id, dest_map);
>> -result = !apic_test_and_set_irr(vector, apic);
>> -trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>> -  trig_mode, vector, !result);
>> -if (!result) {
>> -if (trig_mode)
>> -apic_debug("level trig mode repeatedly for "
>> -"vector %d", vector);
>> -break;
>> -}
>> +if (kvm_x86_ops->deliver_posted_interrupt) {
>> +result = 1;
>> +kvm_x86_ops->deliver_posted_interrupt(vcpu, vector);
>> +} else {
>> +result = !apic_test_and_set_irr(vector, apic);
>> +
>> +trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>> +trig_mode, vector, !result);
> Missed that in previous review. Do no drop tracing for PI case.
Hmm. I remember adding the tracing for the PI case; I don't know why it is
missing from this patch. Anyway, I will add it again.

> 
>> +if (!result) {
>> +if (trig_mode)
>> +apic_debug("level trig mode repeatedly "
>> +"for vector %d", vector);
>> +break;
>> +}
>> 
>> -kvm_make_request(KVM_REQ_EVENT, vcpu);
>> -kvm_vcpu_kick(vcpu);
>> +kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +kvm_vcpu_kick(vcpu);
>> +}
>>  break;
>>  
>>  case APIC_DM_REMRD:
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 3de2d7f..cd1c6ff 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -84,7 +84,7 @@ module_param(vmm_exclusive, bool, S_IRUGO);
>>  static bool __read_mostly fasteoi = 1;
>>  module_param(fasteoi, bool, S_IRUGO);
>> -static bool __read_mostly enable_apicv;
>> +static bool __read_mostly enable_apicv = 1;
>>  module_param(enable_apicv, bool, S_IRUGO);
>>  
>>  /*
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 72be079..486f627 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2685,6 +2685,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
>>  struct kvm_lapic_state *s)
>>  {
>> +kvm_x86_ops->sync_pir_to_irr(vcpu);
>>  memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
>>  
>>  return 0;
>> --
>> 1.7.1
> 
> --
>   Gleb.
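
For readers following the thread: the patch calls sync_pir_to_irr()
before searching the IRR and before copying the LAPIC state to
userspace. A minimal user-space sketch of what such a sync step could
look like is below. It is only a model under my own assumptions: the
names model_sync_pir_to_irr and model_find_highest_irr and the flat
8-word arrays are hypothetical, not the kernel's data layout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PI_WORDS 8	/* 8 x 32 bits = 256 interrupt vectors */

/*
 * Hypothetical model: move every bit posted in the PIR into the IRR.
 * Each PIR word is swapped with 0 atomically so that a bit posted
 * concurrently is either picked up now or left for the next sync.
 * Returns true if at least one bit was moved.
 */
static bool model_sync_pir_to_irr(_Atomic uint32_t pir[PI_WORDS],
				  uint32_t irr[PI_WORDS])
{
	bool moved = false;

	assert(pir && irr);
	for (int i = 0; i < PI_WORDS; i++) {
		uint32_t pending = atomic_exchange(&pir[i], 0);

		if (pending) {
			irr[i] |= pending;
			moved = true;
		}
	}
	return moved;
}

/* Return the highest vector set in the IRR model, or -1 if none is set. */
static int model_find_highest_irr(const uint32_t irr[PI_WORDS])
{
	for (int w = PI_WORDS - 1; w >= 0; w--) {
		if (!irr[w])
			continue;
		for (int b = 31; b >= 0; b--)
			if (irr[w] & (UINT32_C(1) << b))
				return w * 32 + b;
	}
	return -1;
}
```

The point of the ordering is that a search of the IRR done without the
sync would miss vectors that were posted but not yet merged.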


Best regards,
Yang




RE: [PATCH v8 6/7] KVM: VMX: Add the algorithm of deliver posted interrupt

2013-04-09 Thread Zhang, Yang Z
Paolo Bonzini wrote on 2013-04-10:
> Il 08/04/2013 16:23, Yang Zhang ha scritto:
>> + * interrupt from PIR in next vmentry.
>> + */
>> +static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>> +{
>> +struct vcpu_vmx *vmx = to_vmx(vcpu);
>> +int r;
>> +
>> +if (pi_test_and_set_pir(vector, &vmx->pi_desc))
>> +return;
>> +
>> +r = pi_test_and_set_on(&vmx->pi_desc);
>> +kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +if (!r && (vcpu->mode == IN_GUEST_MODE))
>> +apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
>> +POSTED_INTR_VECTOR);
>> +else
>> +kvm_vcpu_kick(vcpu);
>> +
>> +return;
>> +}
> 
> No need for this return.
Right. Will remove it in next version.
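
For reference, the decision made by the quoted function can be modelled
in user space roughly as follows. This is a sketch under my own
assumptions, not the kernel code: struct pi_desc_model, enum pi_action
and the pi_model_* helpers are hypothetical names, and the function
only reports which notification would be chosen instead of sending one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of a posted-interrupt descriptor. */
struct pi_desc_model {
	_Atomic uint32_t pir[8];	/* 256 posted-interrupt request bits */
	_Atomic uint32_t on;		/* bit 0: outstanding notification */
};

enum pi_action { PI_ALREADY_PENDING, PI_SEND_IPI, PI_KICK };

/* Atomically set the vector's PIR bit and return its previous value. */
static bool pi_model_test_and_set_pir(struct pi_desc_model *pi, int vector)
{
	uint32_t mask = UINT32_C(1) << (vector % 32);
	uint32_t old = atomic_fetch_or(&pi->pir[vector / 32], mask);

	return (old & mask) != 0;
}

/* Atomically set the ON bit and return its previous value. */
static bool pi_model_test_and_set_on(struct pi_desc_model *pi)
{
	return (atomic_fetch_or(&pi->on, UINT32_C(1)) & 1u) != 0;
}

/*
 * Mirror of the quoted control flow: nothing to do if the vector is
 * already posted; otherwise set ON, and choose the notification IPI
 * only if ON was previously clear and the vcpu is in guest mode.
 */
static enum pi_action pi_model_deliver(struct pi_desc_model *pi, int vector,
				       bool in_guest_mode)
{
	bool on_was_set;

	assert(vector >= 0 && vector < 256);
	if (pi_model_test_and_set_pir(pi, vector))
		return PI_ALREADY_PENDING;

	on_was_set = pi_model_test_and_set_on(pi);
	if (!on_was_set && in_guest_mode)
		return PI_SEND_IPI;
	return PI_KICK;
}
```

Note that ON is set unconditionally before the mode check, as in the
patch, so a second vector posted while a notification is outstanding
falls back to the kick path in this model.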

Best regards,
Yang




RE: [PATCH v9 7/7] KVM: VMX: Use posted interrupt to deliver virtual interrupt

2013-04-10 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-10:
> On Wed, Apr 10, 2013 at 09:22:57PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> If posted interrupt is available, use it to inject virtual
>> interrupts into the guest.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |   32 +---
>>  arch/x86/kvm/vmx.c   |2 +-
>>  arch/x86/kvm/x86.c   |1 +
>>  3 files changed, 23 insertions(+), 12 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 42a87ac..4fdb984 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -349,6 +349,7 @@ static inline int apic_find_highest_irr(struct kvm_lapic
> *apic)
>>  if (!apic->irr_pending)
>>  return -1;
>> +kvm_x86_ops->sync_pir_to_irr(apic->vcpu);
>>  result = apic_search_irr(apic);
>>  ASSERT(result == -1 || result >= 16);
>> @@ -679,18 +680,27 @@ static int __apic_accept_irq(struct kvm_lapic *apic, 
>> int
> delivery_mode,
>>  if (dest_map)
>>  __set_bit(vcpu->vcpu_id, dest_map);
>> -result = !apic_test_and_set_irr(vector, apic);
>> -trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>> -  trig_mode, vector, !result);
>> -if (!result) {
>> -if (trig_mode)
>> -apic_debug("level trig mode repeatedly for "
>> -"vector %d", vector);
>> -break;
>> -}
>> +if (kvm_x86_ops->deliver_posted_interrupt) {
>> +result = 1;
>> +kvm_x86_ops->deliver_posted_interrupt(vcpu, vector);
>> +} else {
>> +result = !apic_test_and_set_irr(vector, apic);
>> +
>> +trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>> +trig_mode, vector, !result);
>> +if (!result) {
>> +if (trig_mode)
>> +apic_debug("level trig mode repeatedly "
>> +"for vector %d", vector);
>> +goto out;
>> +}
>> 
>> -kvm_make_request(KVM_REQ_EVENT, vcpu);
>> -kvm_vcpu_kick(vcpu);
>> +kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +kvm_vcpu_kick(vcpu);
>> +}
>> +out:
>> +trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>> +trig_mode, vector, !result);
> Sigh, now you trace it twice.
Yes. I forgot to remove the old one. :(
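
To make the intended shape concrete, here is a small stand-alone model
of the control flow with exactly one trace call on every path. It is a
sketch only: struct accept_model, model_trace and model_accept_irq are
hypothetical stand-ins, and the counters exist purely so the behaviour
can be checked outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state: one IRR bit plus counters for observable effects. */
struct accept_model {
	bool irr_bit;		/* models the vector's bit in the IRR */
	int trace_calls;	/* how many times the trace point fired */
	int kicks;		/* how many times the vcpu was kicked */
	int posted;		/* how many posted-interrupt deliveries */
};

static void model_trace(struct accept_model *m, bool coalesced)
{
	(void)coalesced;
	m->trace_calls++;
}

/*
 * Single exit label: both the posted-interrupt branch and the legacy
 * branch (including its early bail-out) fall through to one trace call.
 */
static int model_accept_irq(struct accept_model *m, bool have_posted_interrupt)
{
	int result;

	assert(m);
	if (have_posted_interrupt) {
		result = 1;
		m->posted++;
	} else {
		result = !m->irr_bit;	/* 0 means the bit was already set */
		m->irr_bit = true;
		if (!result)
			goto out;
		m->kicks++;
	}
out:
	model_trace(m, !result);
	return result;
}
```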

> 
>>  break;
>>  
>>  case APIC_DM_REMRD:
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 314b2ed..52b21da 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -84,7 +84,7 @@ module_param(vmm_exclusive, bool, S_IRUGO);
>>  static bool __read_mostly fasteoi = 1;
>>  module_param(fasteoi, bool, S_IRUGO);
>> -static bool __read_mostly enable_apicv;
>> +static bool __read_mostly enable_apicv = 1;
>>  module_param(enable_apicv, bool, S_IRUGO);
>>  
>>  /*
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 72be079..486f627 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2685,6 +2685,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
>>  struct kvm_lapic_state *s)
>>  {
>> +kvm_x86_ops->sync_pir_to_irr(vcpu);
>>  memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
>>  
>>  return 0;
>> --
>> 1.7.1
> 
> --
>   Gleb.


Best regards,
Yang



RE: [PATCH v9 0/7] KVM: VMX: Add Posted Interrupt supporting

2013-04-10 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-10:
> On Wed, Apr 10, 2013 at 09:22:50PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> The following patches add Posted Interrupt support to KVM:
>> The first patch enables the feature 'acknowledge interrupt on vmexit'. Since
>> it is required by Posted Interrupt, we need to enable it first.
>> 
>> The subsequent patches add the posted interrupt support:
>> Posted Interrupt allows APIC interrupts to be injected into the guest
>> directly, without any vmexit.
>> 
>> - When delivering an interrupt to the guest, if the target vcpu is running,
>>   update the Posted-interrupt requests bitmap and send a notification event
>>   to the vcpu. The vcpu then handles this interrupt automatically,
>>   without any software involvement.
>> - If the target vcpu is not running or there is already a notification
>>   event pending in the vcpu, do nothing. The interrupt will be handled at
>>   the next vm entry.
>> Changes from v8 to v9:
>> * Add tracing in PI case when deliver interrupt.
>> * Scan ioapic when updating SPIV register.
> Do not see it at the patch series. Have I missed it?
The change is in the fourth patch:

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 6796218..4ccdc94 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -134,11 +134,7 @@ static inline void apic_set_spiv(struct kvm_lapic *apic, 
u32 val)
static_key_slow_inc(&apic_sw_disabled.key);
}
apic_set_reg(apic, APIC_SPIV, val);
-}
-
-static inline int apic_enabled(struct kvm_lapic *apic)
-{
-   return kvm_apic_sw_enabled(apic) && kvm_apic_hw_enabled(apic);
+   kvm_make_request(KVM_REQ_SCAN_IOAPIC, apic->vcpu);
 }

As you mentioned, vcpu_scan_ioapic calls apic_enabled() to check whether the
apic is enabled, so we must rescan the ioapic whenever the apic state changes.
I also found that recalculate_apic_map() doesn't track enabling/disabling of
the apic by software, so the make_scan_ioapic_request in
recalculate_apic_map() is not enough.
We should also force an ioapic rescan when the apic state is changed by
software (an update to the SPIV register).

> 
>> * Rebase on top of KVM upstream + RTC eoi tracking patch.
>> 
>> Changes from v7 to v8:
>> * Remove unused member 'on' from struct pi_desc.
>> * Register a dummy function to sync_pir_to_irr if apicv is disabled.
>> * Minor fixup.
>> * Rebase on top of KVM upstream + RTC eoi tracking patch.
>> 
>> Yang Zhang (7):
>>   KVM: VMX: Enable acknowledge interupt on vmexit
>>   KVM: VMX: Register a new IPI for posted interrupt
>>   KVM: VMX: Check the posted interrupt capability
>>   KVM: Call common update function when ioapic entry changed.
>>   KVM: Set TMR when programming ioapic entry
>>   KVM: VMX: Add the algorithm of deliver posted interrupt
>>   KVM: VMX: Use posted interrupt to deliver virtual interrupt
>>  arch/ia64/kvm/lapic.h              |    6 -
>>  arch/x86/include/asm/entry_arch.h  |    4 +
>>  arch/x86/include/asm/hardirq.h     |    3 +
>>  arch/x86/include/asm/hw_irq.h      |    1 +
>>  arch/x86/include/asm/irq_vectors.h |    5 +
>>  arch/x86/include/asm/kvm_host.h    |    3 +
>>  arch/x86/include/asm/vmx.h         |    4 +
>>  arch/x86/kernel/entry_64.S         |    5 +
>>  arch/x86/kernel/irq.c              |   22
>>  arch/x86/kernel/irqinit.c          |    4 +
>>  arch/x86/kvm/lapic.c               |   66
>>  arch/x86/kvm/lapic.h               |    7 ++
>>  arch/x86/kvm/svm.c                 |   12 ++
>>  arch/x86/kvm/vmx.c                 |  207 +++-
>>  arch/x86/kvm/x86.c                 |   19 +++-
>>  include/linux/kvm_host.h           |    4 +-
>>  virt/kvm/ioapic.c                  |   32 --
>>  virt/kvm/ioapic.h                  |    7 +-
>>  virt/kvm/irq_comm.c                |    4 +-
>>  virt/kvm/kvm_main.c                |    5 +-
>>  20 files changed, 341 insertions(+), 79 deletions(-)
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH v9 0/7] KVM: VMX: Add Posted Interrupt supporting

2013-04-10 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-11:
> On Thu, Apr 11, 2013 at 01:03:30AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-10:
>>> On Wed, Apr 10, 2013 at 09:22:50PM +0800, Yang Zhang wrote:
>>>> From: Yang Zhang 
>>>> 
>>>> The following patches add Posted Interrupt support to KVM:
>>>> The first patch enables the feature 'acknowledge interrupt on vmexit'. Since
>>>> it is required by Posted Interrupt, we need to enable it first.
>>>> 
>>>> The subsequent patches add the posted interrupt support:
>>>> Posted Interrupt allows APIC interrupts to be injected into the guest
>>>> directly, without any vmexit.
>>>> 
>>>> - When delivering an interrupt to the guest, if the target vcpu is
>>>>   running, update the Posted-interrupt requests bitmap and send a
>>>>   notification event to the vcpu. The vcpu then handles this interrupt
>>>>   automatically, without any software involvement.
>>>> - If the target vcpu is not running or there is already a notification
>>>>   event pending in the vcpu, do nothing. The interrupt will be handled
>>>>   at the next vm entry.
>>>> Changes from v8 to v9:
>>>> * Add tracing in PI case when deliver interrupt.
>>>> * Scan ioapic when updating SPIV register.
>>> Do not see it at the patch series. Have I missed it?
>> The change is in the fourth patch:
>> 
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 6796218..4ccdc94 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -134,11 +134,7 @@ static inline void apic_set_spiv(struct kvm_lapic *apic,
> u32 val)
>>  static_key_slow_inc(&apic_sw_disabled.key);
>>  }
>>  apic_set_reg(apic, APIC_SPIV, val);
>> -}
>> -
>> -static inline int apic_enabled(struct kvm_lapic *apic)
>> -{
>> -return kvm_apic_sw_enabled(apic) && kvm_apic_hw_enabled(apic);
>> +kvm_make_request(KVM_REQ_SCAN_IOAPIC, apic->vcpu);
>>  }
> OK, see it now. Thanks.
> 
>> As you mentioned, since it will call apic_enabled() to check whether apic is
> enabled in vcpu_scan_ioapic. So we must ensure rescan ioapic when apic state
> changed.
>> And I found recalculate_apic_map() doesn't track the enable/disable apic by
> software approach. So make_scan_ioapic_request in recalculate_apic_map() is
> not enough.
>> We also should force rescan ioapic when apic state is changed via
>> software approach(update spiv reg).
>> 
> 10.4.7.2 Local APIC State After It Has Been Software Disabled says:
> 
>   Pending interrupts in the IRR and ISR registers are held and require
>   masking or handling by the CPU.
> My understanding is that we should treat software disabled APIC as a
> valid target for an interrupt. vcpu_scan_ioapic() should check
> kvm_apic_hw_enabled() only.
Indeed. kvm_apic_hw_enabled() is the right one.
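
To spell out the distinction, a minimal sketch follows. It assumes the
usual bit positions (global enable is bit 11 of the APIC base MSR,
software enable is bit 8 of the spurious-interrupt vector register);
struct lapic_model and the model_* helpers are hypothetical names, not
the KVM functions themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_APICBASE_ENABLE	(UINT64_C(1) << 11)	/* APIC base MSR, global enable */
#define MODEL_SPIV_APIC_ENABLED	(UINT32_C(1) << 8)	/* SPIV, software enable */

/* Hypothetical, reduced view of a local APIC's enable state. */
struct lapic_model {
	uint64_t apic_base;	/* value of the APIC base MSR */
	uint32_t spiv;		/* spurious-interrupt vector register */
};

static bool model_apic_hw_enabled(const struct lapic_model *a)
{
	assert(a);
	return (a->apic_base & MODEL_APICBASE_ENABLE) != 0;
}

static bool model_apic_sw_enabled(const struct lapic_model *a)
{
	assert(a);
	return (a->spiv & MODEL_SPIV_APIC_ENABLED) != 0;
}

/* Both enables together, like the apic_enabled() helper in the diff. */
static bool model_apic_enabled(const struct lapic_model *a)
{
	return model_apic_sw_enabled(a) && model_apic_hw_enabled(a);
}

/*
 * Per the conclusion above: a software-disabled APIC still holds
 * pending IRR/ISR state, so only the hardware enable decides whether
 * the vcpu is considered when scanning the ioapic.
 */
static bool model_is_scan_target(const struct lapic_model *a)
{
	return model_apic_hw_enabled(a);
}
```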

Best regards,
Yang




RE: [PATCH v9 7/7] KVM: Use eoi to track RTC interrupt delivery status

2013-04-11 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-11:
> On Wed, Apr 10, 2013 at 09:22:20PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> The current interrupt coalescing logic, which is only used by RTC,
>> conflicts with Posted Interrupt.
>> This patch introduces a new mechanism that uses eoi to track interrupts:
>> When delivering an interrupt to vcpus, pending_eoi is set to the number
>> of vcpus that received the interrupt, and it is decreased each time a
>> vcpu writes eoi. No subsequent RTC interrupt can be delivered to a vcpu
>> until all vcpus have written eoi.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  virt/kvm/ioapic.c |   39 ++-
>>  1 files changed, 38 insertions(+), 1 deletions(-)
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index a49fcd5..aeac154 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -147,6 +147,26 @@ static void kvm_rtc_eoi_tracking_restore_all(struct
> kvm_ioapic *ioapic)
>>  __rtc_irq_eoi_tracking_restore_one(vcpu);
>>  }
>> +static void rtc_irq_eoi(struct kvm_ioapic *ioapic, struct kvm_vcpu *vcpu)
>> +{
>> +if (test_and_clear_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map))
>> +--ioapic->rtc_status.pending_eoi;
>> +
>> +WARN_ON(ioapic->rtc_status.pending_eoi < 0);
>> +}
>> +
>> +static bool rtc_irq_check_coalesced(struct kvm_ioapic *ioapic, int irq,
>> +bool line_status)
>> +{
>> +if (irq != RTC_GSI || !line_status)
>> +return false;
> Please move the check from rtc_irq_check_coalesced() to
> kvm_ioapic_set_irq() like this: if (irq == RTC_GSI && line_status &&
> rtc_irq_check_coalesced(ioapic, irq, line_status)) 
> 
> I was going to fix it myself while applying, but since there will be
> new posted interrupt series anyway you can as well fix this one too.
Do you mean I should fix it and send it out with the posted interrupt series?
Or should I just rebase the posted interrupt series on top of this fix,
without sending it out?

> 
>> +
>> +if (ioapic->rtc_status.pending_eoi > 0)
>> +return true; /* coalesced */
>> +
>> +return false;
>> +}
>> +
>>  static int ioapic_service(struct kvm_ioapic *ioapic, unsigned int idx,
>>  bool line_status)
>>  {
>> @@ -260,6 +280,7 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int 
>> irq,
> bool line_status)
>>  {
>>  union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
>>  struct kvm_lapic_irq irqe;
>> +int ret;
>> 
>>  ioapic_debug("dest=%x dest_mode=%x delivery_mode=%x "
>>   "vector=%x trig_mode=%x\n",
>> @@ -275,7 +296,15 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int
> irq, bool line_status)
>>  irqe.level = 1;
>>  irqe.shorthand = 0;
>> -return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>> +if (irq == RTC_GSI && line_status) {
>> +BUG_ON(ioapic->rtc_status.pending_eoi != 0);
>> +ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe,
>> +ioapic->rtc_status.dest_map);
>> +ioapic->rtc_status.pending_eoi = ret;
>> +} else
>> +ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>> +
>> +return ret;
>>  }
>>  
>>  int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int 
>> irq_source_id,
>> @@ -299,6 +328,11 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int
> irq, int irq_source_id,
>>  ret = 1;
>>  } else {
>>  int edge = (entry.fields.trig_mode == IOAPIC_EDGE_TRIG);
>> +
>> +if (rtc_irq_check_coalesced(ioapic, irq, line_status)) {
>> +ret = 0; /* coalesced */
>> +goto out;
>> +}
>>  ioapic->irr |= mask;
>>  if ((edge && old_irr != ioapic->irr) ||
>>  (!edge && !entry.fields.remote_irr))
>> @@ -306,6 +340,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
>>  else
>>  ret = 0; /* report coalesced interrupt */
>>  }
>> +out:
>>  trace_kvm_ioapic_set_irq(entry.bits, irq, ret == 0);
>>  spin_unlock(&ioapic->lock);
>> @@ -333,6 +368,8 @@ static void __kvm_ioapic_update_eoi(struct kvm_vcpu
> *vcpu,
>>  if (ent->fields.vector != vector)
>>  continue;
>> +if (i == RTC_GSI)
>> +rtc_irq_eoi(ioapic, vcpu);
>>  /*
>>   * We are dropping lock while calling ack notifiers because ack
>>   * notifier callbacks for assigned devices call into IOAPIC
>> --
>> 1.7.1
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH v9 7/7] KVM: Use eoi to track RTC interrupt delivery status

2013-04-11 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-04-11:
> On Thu, Apr 11, 2013 at 07:54:01AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-04-11:
>>> On Wed, Apr 10, 2013 at 09:22:20PM +0800, Yang Zhang wrote:
>>>> From: Yang Zhang 
>>>> 
>>>> The current interrupt coalescing logic, which is only used by RTC,
>>>> conflicts with Posted Interrupt.
>>>> This patch introduces a new mechanism that uses eoi to track interrupts:
>>>> When delivering an interrupt to vcpus, pending_eoi is set to the number
>>>> of vcpus that received the interrupt, and it is decreased each time a
>>>> vcpu writes eoi. No subsequent RTC interrupt can be delivered to a vcpu
>>>> until all vcpus have written eoi.
>>>> 
>>>> Signed-off-by: Yang Zhang 
>>>> ---
>>>>  virt/kvm/ioapic.c |   39 ++-
>>>>  1 files changed, 38 insertions(+), 1 deletions(-)
>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>> index a49fcd5..aeac154 100644
>>>> --- a/virt/kvm/ioapic.c
>>>> +++ b/virt/kvm/ioapic.c
>>>> @@ -147,6 +147,26 @@ static void kvm_rtc_eoi_tracking_restore_all(struct
>>> kvm_ioapic *ioapic)
>>>>__rtc_irq_eoi_tracking_restore_one(vcpu);
>>>>  }
>>>> +static void rtc_irq_eoi(struct kvm_ioapic *ioapic, struct kvm_vcpu *vcpu)
>>>> +{
>>>> +  if (test_and_clear_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map))
>>>> +  --ioapic->rtc_status.pending_eoi;
>>>> +
>>>> +  WARN_ON(ioapic->rtc_status.pending_eoi < 0);
>>>> +}
>>>> +
>>>> +static bool rtc_irq_check_coalesced(struct kvm_ioapic *ioapic, int irq,
>>>> +  bool line_status)
>>>> +{
>>>> +  if (irq != RTC_GSI || !line_status)
>>>> +  return false;
>>> Please move the check from rtc_irq_check_coalesced() to
>>> kvm_ioapic_set_irq() like this: if (irq == RTC_GSI && line_status &&
>>> rtc_irq_check_coalesced(ioapic, irq, line_status)) 
>>> 
>>> I was going to fix it myself while applying, but since there will be
>>> new posted interrupt series anyway you can as well fix this one too.
>> You mean fix it and send out it with posted interrupt series? Or just
>> rebase the posted interrupt series on the top of this fix, but needn't
>> to send out it?
>> 
> Send both series. RTC one with this change.
Sure.
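
As a summary of the scheme under discussion, here is a stand-alone
model of the EOI-based tracking. It is a sketch under my own
assumptions: struct rtc_status_model and the model_rtc_* helpers are
hypothetical, the destination map is a single 64-bit mask (so at most
64 vcpus), and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the RTC tracking state kept in the ioapic. */
struct rtc_status_model {
	int pending_eoi;	/* vcpus that still owe an EOI */
	uint64_t dest_map;	/* bit n set: vcpu n received the interrupt */
};

static int model_count_bits(uint64_t v)
{
	int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* A new RTC interrupt is coalesced while any EOI is outstanding. */
static bool model_rtc_check_coalesced(const struct rtc_status_model *s)
{
	return s->pending_eoi > 0;
}

/*
 * Deliver to the vcpus in target_vcpus. Returns the number of vcpus
 * that received the interrupt, or 0 if it was coalesced.
 */
static int model_rtc_deliver(struct rtc_status_model *s, uint64_t target_vcpus)
{
	if (model_rtc_check_coalesced(s))
		return 0;
	s->dest_map = target_vcpus;
	s->pending_eoi = model_count_bits(target_vcpus);
	return s->pending_eoi;
}

/* A vcpu writes EOI: only vcpus recorded in dest_map are counted. */
static void model_rtc_eoi(struct rtc_status_model *s, int vcpu_id)
{
	uint64_t bit;

	assert(vcpu_id >= 0 && vcpu_id < 64);
	bit = UINT64_C(1) << vcpu_id;
	if (s->dest_map & bit) {
		s->dest_map &= ~bit;
		s->pending_eoi--;
	}
	assert(s->pending_eoi >= 0);
}
```

With the check hoisted into the caller as requested above, the helper
no longer needs the irq and line_status arguments in this model.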

Best regards,
Yang




RE: [patch] x86, kvm: fix build failure with CONFIG_SMP disabled

2013-04-17 Thread Zhang, Yang Z
David Rientjes wrote on 2013-04-18:
> On Wed, 17 Apr 2013, Randy Dunlap wrote:
> 
>> On 04/17/13 16:12, David Rientjes wrote:
>>> The build fails when CONFIG_SMP is disabled:
>>> 
>>> arch/x86/kvm/vmx.c: In function 'vmx_deliver_posted_interrupt':
>>> arch/x86/kvm/vmx.c:3950:3: error: 'apic' undeclared (first use in
>>> this function)
>>> 
>>> Fix it by including the necessary header.
>> 
>> Sorry, i386 build still fails with the same error message plus this one:
>> 
>> ERROR: "apic" [arch/x86/kvm/kvm-intel.ko] undefined!
>> 
> 
> Ahh, that's because you don't have CONFIG_X86_LOCAL_APIC as you already
> mentioned.  So it looks like this error can manifest in two different ways
> and we got different reports.
> 
> This failure came from "KVM: VMX: Add the deliver posted interrupt
> algorithm", so adding Yang to the cc to specify the dependency this has on
> apic and how it can be protected without CONFIG_X86_LOCAL_APIC on i386.
How about the following patch?

commit a49dd819f502c1029c5a857e87201ef25ec06ce6
Author: Yang Zhang 
Date:   Wed Apr 17 05:34:07 2013 -0400

KVM: x86: Don't sending posted interrupt if not config CONFIG_SMP

In UP, posted interrupt logic will not work. So we should not send
posted interrupt and let vcpu to pick the pending interrupt before
vmentry.

Signed-off-by: Yang Zhang 

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 52b21da..d5c6b95 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3946,10 +3946,12 @@ static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)

r = pi_test_and_set_on(&vmx->pi_desc);
kvm_make_request(KVM_REQ_EVENT, vcpu);
+#ifdef CONFIG_SMP
if (!r && (vcpu->mode == IN_GUEST_MODE))
apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
POSTED_INTR_VECTOR);
else
+#endif
kvm_vcpu_kick(vcpu);
 }
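
For anyone unfamiliar with the construct in the hunk above: the
"else" sits inside the #ifdef so that, when the option is off, the
whole if/else collapses to the single fallback statement. A small
stand-alone model of the same pattern is below. The macro
MODEL_CONFIG_SMP and the names enum notify / model_notify are
hypothetical and only illustrate the preprocessor trick.

```c
#include <assert.h>
#include <stdbool.h>

enum notify { NOTIFY_IPI, NOTIFY_KICK };

/*
 * With MODEL_CONFIG_SMP defined this reads:
 *     if (...) return NOTIFY_IPI; else return NOTIFY_KICK;
 * Without it, only the final statement remains and is unconditional.
 */
static enum notify model_notify(bool on_was_set, bool in_guest_mode)
{
	/* Silence unused-parameter warnings in the non-SMP build. */
	(void)on_was_set;
	(void)in_guest_mode;

#ifdef MODEL_CONFIG_SMP
	if (!on_was_set && in_guest_mode)
		return NOTIFY_IPI;
	else
#endif
		return NOTIFY_KICK;
}
```

Building once with -DMODEL_CONFIG_SMP and once without shows the two
behaviours; the indentation of the fallback line is kept deliberately
so the SMP reading stays obvious.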

Best regards,
Yang




RE: [patch] x86, kvm: fix build failure with CONFIG_SMP disabled

2013-04-17 Thread Zhang, Yang Z
Randy Dunlap wrote on 2013-04-18:
> On 04/17/13 17:35, Zhang, Yang Z wrote:
>> David Rientjes wrote on 2013-04-18:
>>> On Wed, 17 Apr 2013, Randy Dunlap wrote:
>>> 
>>>> On 04/17/13 16:12, David Rientjes wrote:
>>>>> The build fails when CONFIG_SMP is disabled:
>>>>> 
>>>>>   arch/x86/kvm/vmx.c: In function 'vmx_deliver_posted_interrupt':
>>>>>   arch/x86/kvm/vmx.c:3950:3: error: 'apic' undeclared (first use in
>>>>> this function)
>>>>> 
>>>>> Fix it by including the necessary header.
>>>> 
>>>> Sorry, i386 build still fails with the same error message plus this one:
>>>> 
>>>> ERROR: "apic" [arch/x86/kvm/kvm-intel.ko] undefined!
>>>> 
>>> 
>>> Ahh, that's because you don't have CONFIG_X86_LOCAL_APIC as you already
>>> mentioned.  So it looks like this error can manifest in two different ways
>>> and we got different reports.
>>> 
>>> This failure came from "KVM: VMX: Add the deliver posted interrupt
>>> algorithm", so adding Yang to the cc to specify the dependency this has on
>>> apic and how it can be protected without CONFIG_X86_LOCAL_APIC on i386.
>> How about the follow patch?
>> 
>> commit a49dd819f502c1029c5a857e87201ef25ec06ce6
>> Author: Yang Zhang 
>> Date:   Wed Apr 17 05:34:07 2013 -0400
>> 
>> KVM: x86: Don't sending posted interrupt if not config CONFIG_SMP
>> 
>> In UP, posted interrupt logic will not work. So we should not send
>> posted interrupt and let vcpu to pick the pending interrupt before
>> vmentry.
>> 
>> Signed-off-by: Yang Zhang 
> 
> Missing Reported-by: and the patch does not apply cleanly (looks like
> lots of spaces instead of tabs in it)... but it does build now after
> massaging the patch.
Thanks.
I just pasted it here for quick testing. I will resend a formal patch.

> Thanks.
> 
> Acked-by: Randy Dunlap 
> 
> 
>> 
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 52b21da..d5c6b95 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3946,10 +3946,12 @@ static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>> 
>> r = pi_test_and_set_on(&vmx->pi_desc);
>> kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +#ifdef CONFIG_SMP
>> if (!r && (vcpu->mode == IN_GUEST_MODE))
>> apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
>> POSTED_INTR_VECTOR);
>> else
>> +#endif
>> kvm_vcpu_kick(vcpu);
>>  }
>> Best regards,
>> Yang
>> 
>> 
> 
> 
> --
> ~Randy


Best regards,
Yang




RE: [PATCH v10 7/7] KVM: VMX: Use posted interrupt to deliver virtual interrupt

2013-04-25 Thread Zhang, Yang Z
Yangminqiang wrote on 2013-04-26:
> Hi Yang Zhang,
> 
> Could you please let me know your CPU model, or which CPU models
> support the apic-v features your patch requires, so that I can try your
> patches?
> 
>   Intel Software Developer's Manual, Volume 3C,
>   System Programming Guide, Part 3, Ch. 29,
>   APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS
> Or how can I know whether my hardware supports the features listed in the
> manual above?
Ivytown or newer platforms support it.
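
On the second question, the features are advertised in the VMX
capability MSRs, so they can be decoded from raw MSR values. The
sketch below only does the decoding. It assumes, from my reading of
the SDM (please verify against the manual), that the allowed-1
settings are in the high 32 bits, that IA32_VMX_PINBASED_CTLS is MSR
0x481 and IA32_VMX_PROCBASED_CTLS2 is MSR 0x48B, and the bit positions
named in the macros. struct apicv_caps and the function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; check them against the SDM before relying on them. */
#define MODEL_PIN_POSTED_INTR		(UINT32_C(1) << 7)	/* pin-based controls */
#define MODEL_SEC_VIRT_X2APIC_MODE	(UINT32_C(1) << 4)	/* secondary controls */
#define MODEL_SEC_APIC_REGISTER_VIRT	(UINT32_C(1) << 8)
#define MODEL_SEC_VIRT_INTR_DELIVERY	(UINT32_C(1) << 9)

struct apicv_caps {
	bool posted_interrupts;
	bool virt_x2apic_mode;
	bool apic_register_virt;
	bool virtual_intr_delivery;
};

/* A control may be set to 1 only if its bit is set in the high half. */
static uint32_t model_vmx_allowed1(uint64_t msr_value)
{
	return (uint32_t)(msr_value >> 32);
}

/*
 * pinbased_msr:   raw value read from MSR 0x481 (assumed index)
 * procbased2_msr: raw value read from MSR 0x48B (assumed index)
 */
static struct apicv_caps model_decode_apicv_caps(uint64_t pinbased_msr,
						 uint64_t procbased2_msr)
{
	uint32_t pin = model_vmx_allowed1(pinbased_msr);
	uint32_t sec = model_vmx_allowed1(procbased2_msr);
	struct apicv_caps caps;

	caps.posted_interrupts = (pin & MODEL_PIN_POSTED_INTR) != 0;
	caps.virt_x2apic_mode = (sec & MODEL_SEC_VIRT_X2APIC_MODE) != 0;
	caps.apic_register_virt = (sec & MODEL_SEC_APIC_REGISTER_VIRT) != 0;
	caps.virtual_intr_delivery = (sec & MODEL_SEC_VIRT_INTR_DELIVERY) != 0;
	return caps;
}
```

How the raw values are obtained (for example with an MSR reading tool
as root) is left out on purpose, since that depends on the system.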

> Thanks,
> Steven
> 
> kvm-ow...@vger.kernel.org wrote on 2013-04-11:
>> Subject: [PATCH v10 7/7] KVM: VMX: Use posted interrupt to deliver virtual
>> interrupt
>> 
>> From: Yang Zhang 
>> 
>> If posted interrupt is available, use it to inject virtual
>> interrupts into the guest.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/lapic.c |   30 +++---
>>  arch/x86/kvm/vmx.c   |2 +-
>>  arch/x86/kvm/x86.c   |1 +
>>  3 files changed, 21 insertions(+), 12 deletions(-)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index dbf74c9..e29883c 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -353,6 +353,7 @@ static inline int apic_find_highest_irr(struct kvm_lapic
>> *apic)
>>  if (!apic->irr_pending)
>>  return -1;
>> +kvm_x86_ops->sync_pir_to_irr(apic->vcpu);
>>  result = apic_search_irr(apic);
>>  ASSERT(result == -1 || result >= 16);
>> @@ -683,18 +684,25 @@ static int __apic_accept_irq(struct kvm_lapic *apic,
>> int delivery_mode,
>>  if (dest_map)
>>  __set_bit(vcpu->vcpu_id, dest_map);
>> -result = !apic_test_and_set_irr(vector, apic);
>> -trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>> -  trig_mode, vector, !result);
>> -if (!result) {
>> -if (trig_mode)
>> -apic_debug("level trig mode repeatedly for "
>> -"vector %d", vector);
>> -break;
>> -}
>> +if (kvm_x86_ops->deliver_posted_interrupt) {
>> +result = 1;
>> +kvm_x86_ops->deliver_posted_interrupt(vcpu, vector);
>> +} else {
>> +result = !apic_test_and_set_irr(vector, apic);
>> 
>> -kvm_make_request(KVM_REQ_EVENT, vcpu);
>> -kvm_vcpu_kick(vcpu);
>> +if (!result) {
>> +if (trig_mode)
>> +apic_debug("level trig mode repeatedly "
>> +"for vector %d", vector);
>> +goto out;
>> +}
>> +
>> +kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +kvm_vcpu_kick(vcpu);
>> +}
>> +out:
>> +trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>> +trig_mode, vector, !result);
>>  break;
>>  
>>  case APIC_DM_REMRD:
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 314b2ed..52b21da 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -84,7 +84,7 @@ module_param(vmm_exclusive, bool, S_IRUGO);
>>  static bool __read_mostly fasteoi = 1;
>>  module_param(fasteoi, bool, S_IRUGO);
>> -static bool __read_mostly enable_apicv;
>> +static bool __read_mostly enable_apicv = 1;
>>  module_param(enable_apicv, bool, S_IRUGO);
>>  
>>  /*
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 6147d24..628582f 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2685,6 +2685,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
>>  struct kvm_lapic_state *s)
>>  {
>> +kvm_x86_ops->sync_pir_to_irr(vcpu);
>>  memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
>>  
>>  return 0;
>> --
>> 1.7.1
>> 


Best regards,
Yang




RE: [PATCH v10 7/7] KVM: VMX: Use posted interrupt to deliver virtual interrupt

2013-05-06 Thread Zhang, Yang Z
Yangminqiang wrote on 2013-05-03:
> Nakajima, Jun wrote on 2013-04-26:
>> Subject: Re: [PATCH v10 7/7] KVM: VMX: Use posted interrupt to deliver 
>> virtual
>> interrupt
>> 
>> On Fri, Apr 26, 2013 at 2:29 AM, Yangminqiang 
>> wrote:
>> 
>>>> Ivytown or newer platform supported it.
>>> 
>>> Ivytown? Do you mean Ivy Bridge?
>>> 
>> 
>> Ivy Town is the codename of "Ivy Bridge-based servers".
> 
> One more question, what is the relationship between x2APIC and APIC
> virtualization? APIC-v requires x2APIC or APIC-v includes x2APIC?
If you use x2APIC mode (MSR-based access) inside the guest and want to benefit
from APIC virtualization, then you should set the "virtualize x2APIC mode" bit
in the Secondary Processor-Based VM-Execution Controls.
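
To illustrate what "MSR-based access" means here: in x2APIC mode a
register that xAPIC exposes at a 16-byte-aligned MMIO offset is reached
through an MSR instead, at 0x800 plus the offset divided by 16. A
minimal sketch of that mapping follows; the function name is
hypothetical, and registers without a one-to-one mapping (for example
the upper half of the ICR) are not handled.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_X2APIC_MSR_BASE	UINT32_C(0x800)

/*
 * Map an xAPIC MMIO register offset (e.g. 0xB0 for EOI) to the
 * corresponding x2APIC MSR index. Offsets must be 16-byte aligned and
 * lie inside the register page range covered by the MSR window.
 */
static uint32_t model_x2apic_msr_for_offset(uint32_t mmio_offset)
{
	assert((mmio_offset & 0xF) == 0);
	assert(mmio_offset < 0x400);
	return MODEL_X2APIC_MSR_BASE + (mmio_offset >> 4);
}
```

For example, the TPR at offset 0x80 maps to MSR 0x808 and the EOI
register at offset 0xB0 maps to MSR 0x80B.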

Best regards,
Yang



RE: [PATCH v3 3/4] x86, apicv: add virtual interrupt delivery support

2012-12-05 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-06:
> On Thu, Dec 06, 2012 at 02:55:16AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-05:
>>> On Wed, Dec 05, 2012 at 01:51:36PM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2012-12-05:
>>>>> On Wed, Dec 05, 2012 at 06:02:59AM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2012-12-05:
>>>>>>> On Wed, Dec 05, 2012 at 01:55:17AM +0000, Zhang, Yang Z wrote:
>>>>>>>> Gleb Natapov wrote on 2012-12-04:
>>>>>>>>> On Tue, Dec 04, 2012 at 06:39:50AM +0000, Zhang, Yang Z wrote:
>>>>>>>>>> Gleb Natapov wrote on 2012-12-03:
>>>>>>>>>>> On Mon, Dec 03, 2012 at 03:01:03PM +0800, Yang Zhang wrote:
>>>>>>>>>>>> Virtual interrupt delivery means KVM no longer has to inject
>>>>>>>>>>>> vAPIC interrupts manually; this is fully taken care of by the
>>>>>>>>>>>> hardware. It needs some special awareness in the existing
>>>>>>>>>>>> interrupt injection path:
>>>>>>>>>>>> 
>>>>>>>>>>>> - For a pending interrupt, instead of direct injection, we may
>>>>>>>>>>>>   need to update architecture-specific indicators before
>>>>>>>>>>>>   resuming the guest.
>>>>>>>>>>>> - A pending interrupt that is masked by ISR should also be
>>>>>>>>>>>>   considered in the above update action, since hardware will
>>>>>>>>>>>>   decide when to inject it at the right time. The current
>>>>>>>>>>>>   has_interrupt and get_interrupt only return a valid vector
>>>>>>>>>>>>   from the injection p.o.v.
>>>>>>>>>>> Most of my previous comments still apply.
>>>>>>>>>>> 
>>>>>>>>>>>> +void kvm_set_eoi_exitmap(struct kvm_vcpu *vcpu, int vector,
>>>>>>>>>>>> +  int trig_mode, int always_set)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +  if (kvm_x86_ops->set_eoi_exitmap)
>>>>>>>>>>>> +  kvm_x86_ops->set_eoi_exitmap(vcpu, vector,
>>>>>>>>>>>> +  trig_mode, always_set);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>>  /*
>>>>>>>>>>>>   * Add a pending IRQ into lapic.
>>>>>>>>>>>>   * Return 1 if successfully added and 0 if discarded.
>>>>>>>>>>>> @@ -661,6 +669,7 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>>>>>>>>>>>if (unlikely(!apic_enabled(apic)))
>>>>>>>>>>>>break;
>>>>>>>>>>>> +  kvm_set_eoi_exitmap(vcpu, vector, trig_mode, 0);
>>>>>>>>>>> As I said in the last review rebuild the bitmap when ioapic or irq
>>>>>>>>>>> notifier configuration changes, user request bit to notify vcpus to
>>>>>>>>>>> reload the bitmap.
>>>>>>>>>> It is too complicated. When program ioapic entry, we cannot get the
>>>>>>>>>> target vcpu easily. We need to read destination format register and
>>>>>>>>>> logical destination register to find out target vcpu if using logical
>>>>>>>>>> mode. Also, we must trap every modification to the two registers to
>>>>>>>>>> update eoi bitmap.
>>>>>>>>> No need to check target vcpu. Enable exit on all vcpus for the vector
>>>>>>>> This is wrong. As we known, modern OS uses per VCPU vector. We cannot
>>>>>>>> ensure all vectors have same trigger mode. And what's worse, the
>>>>>>>> vector in another vcpu is used to handle high frequency
>>>>>>>> interrupts(like 10G NIC), then it will hurt performance.
>>>>>>>> 
>>>>>>> I never saw OSes reuse vector used by ioapic, as far as I see this

RE: [PATCH v3 3/4] x86, apicv: add virtual interrupt delivery support

2012-12-05 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-06:
> On Thu, Dec 06, 2012 at 07:16:07AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-06:
>>> On Thu, Dec 06, 2012 at 02:55:16AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2012-12-05:
>>>>> On Wed, Dec 05, 2012 at 01:51:36PM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2012-12-05:
>>>>>>> On Wed, Dec 05, 2012 at 06:02:59AM +0000, Zhang, Yang Z wrote:
>>>>>>>> Gleb Natapov wrote on 2012-12-05:
>>>>>>>>> On Wed, Dec 05, 2012 at 01:55:17AM +0000, Zhang, Yang Z wrote:
>>>>>>>>>> Gleb Natapov wrote on 2012-12-04:
>>>>>>>>>>> On Tue, Dec 04, 2012 at 06:39:50AM +0000, Zhang, Yang Z wrote:
>>>>>>>>>>>> Gleb Natapov wrote on 2012-12-03:
>>>>>>>>>>>>> On Mon, Dec 03, 2012 at 03:01:03PM +0800, Yang Zhang wrote:
>>>>>>>>>>>>>> Virtual interrupt delivery avoids KVM to inject vAPIC
>>>>>>>>>>>>>> interrupts manually, which is fully taken care of by the
>>>>>>>>>>>>>> hardware. This needs some special awareness into existing
>>>>>>>>>>>>>> interrupr injection path:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - for pending interrupt, instead of direct injection, we may need
>>>>>>>>>>>>>>   update architecture specific indicators before resuming to guest.
>>>>>>>>>>>>>> - A pending interrupt, which is masked by ISR, should be also
>>>>>>>>>>>>>>   considered in above update action, since hardware will decide
>>>>>>>>>>>>>>   when to inject it at right time. Current has_interrupt and
>>>>>>>>>>>>>>   get_interrupt only returns a valid vector from injection p.o.v.
>>>>>>>>>>>>> Most of my previous comments still apply.
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> +void kvm_set_eoi_exitmap(struct kvm_vcpu *vcpu, int vector,
>>>>>>>>>>>>>> +  int trig_mode, int always_set)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +  if (kvm_x86_ops->set_eoi_exitmap)
>>>>>>>>>>>>>> +  kvm_x86_ops->set_eoi_exitmap(vcpu, vector,
>>>>>>>>>>>>>> +  trig_mode, always_set);
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>  /*
>>>>>>>>>>>>>>   * Add a pending IRQ into lapic.
>>>>>>>>>>>>>>   * Return 1 if successfully added and 0 if discarded.
>>>>>>>>>>>>>> @@ -661,6 +669,7 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>>>>>>>>>>>>>  if (unlikely(!apic_enabled(apic)))
>>>>>>>>>>>>>>  break;
>>>>>>>>>>>>>> +kvm_set_eoi_exitmap(vcpu, vector, trig_mode, 0);
>>>>>>>>>>>>> As I said in the last review rebuild the bitmap when ioapic
>>>>>>>>>>>>> or irq notifier configuration changes, user request bit to
>>>>>>>>>>>>> notify vcpus to reload the bitmap.
>>>>>>>>>>>> It is too complicated. When program ioapic entry, we cannot get the
>>>>>>>>>>>> target vcpu easily. We need to read destination format register and
>>>>>>>>>>>> logical destination register to find out target vcpu if using
>>>>>>>>>>>> logical mode. Also, we must trap every modification to the two
>>>>>>>>>>>> registers to update eoi bitmap.
>>>>>>>>>>> No need to check target vcpu.
>>>

RE: [PATCH v3 3/4] x86, apicv: add virtual interrupt delivery support

2012-12-06 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-06:
> On Wed, Dec 05, 2012 at 08:38:59PM -0200, Marcelo Tosatti wrote:
>> On Wed, Dec 05, 2012 at 01:14:38PM +0200, Gleb Natapov wrote:
>>> On Wed, Dec 05, 2012 at 03:43:41AM +0000, Zhang, Yang Z wrote:
>>>>>> @@ -5657,12 +5673,20 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>>>>  }
>>>>>>  
>>>>>>  if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>>>>>> +/* update archtecture specific hints for APIC
>>>>>> + * virtual interrupt delivery */
>>>>>> +if (kvm_x86_ops->update_irq)
>>>>>> +kvm_x86_ops->update_irq(vcpu);
>>>>>> +
>>>>>>  inject_pending_event(vcpu);
>>>>>>  
>>>>>>  /* enable NMI/IRQ window open exits if needed */
>>>>>>  if (vcpu->arch.nmi_pending)
>>>>>>  kvm_x86_ops->enable_nmi_window(vcpu);
>>>>>> -else if (kvm_cpu_has_interrupt(vcpu) || req_int_win)
>>>>>> +else if (kvm_apic_vid_enabled(vcpu)) {
>>>>>> +if (kvm_cpu_has_extint(vcpu))
>>>>>> +kvm_x86_ops->enable_irq_window(vcpu);
>>>>> 
>>>>> If RVI is non-zero, then interrupt window should not be enabled,
>>>>> accordingly to 29.2.2:
>>>>> 
>>>>> "If a virtual interrupt has been recognized (see Section 29.2.1), it will
>>>>> be delivered at an instruction boundary when the following conditions all
>>>>> hold: (1) RFLAGS.IF = 1; (2) there is no blocking by STI; (3) there is no
>>>>> blocking by MOV SS or by POP SS; and (4) the "interrupt-window exiting"
>>>>> VM-execution control is 0."
>>>> Right. Must check RVI here.
>>>> 
>>> Why? We request interrupt window here because there is ExtINT interrupt
>>> pending. ExtINT interrupt has a precedence over APIC interrupts (our
>>> current code is incorrect!), so we want vmexit as soon as interrupts are
>>> allowed to inject ExtINT and we do not want virtual interrupt to be
>>> delivered. I think the (4) there is exactly for this situation.
>>> 
>>> --
>>> Gleb.
>> 
>> Right. BTW, delivery of ExtINT has no EOI, so there is no evaluation
>> of pending virtual interrupts. Therefore, shouldnt interrupt window be
>> enabled when injecting ExtINT so that evaluation of pending virtual
>> interrupts is performed on next vm-entry?
>> 
> Good question and I think, luckily for us, the answer is no. Spec uses
> two different terms when it talks about virtual interrupts "Evaluation
> of Pending Virtual Interrupts" and "Virtual-Interrupt Delivery". As far
> as my reading of the spec goes they are not necessary happen at the same
> time. So during ExtINT injection "evaluation" will happen (due to vmentry)
> and virtual interrupt will be recognized, but not "delivered". It will
> be delivered when condition described in section 29.2.2 will be met i.e
> when interrupts will be enabled.
> 
> Yang, can you confirm this?
Right. 
VM entry causes evaluation of pending virtual interrupts even during ExtINT
injection. If RVI[7:4] > VPPR[7:4], the logical processor recognizes a pending
virtual interrupt. It is then delivered once the conditions in Section 29.2.2
are met.
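
The split between "evaluation" and "delivery" discussed above can be written
down as a small sketch. This is only a model of the two checks as quoted in
this thread (the RVI/VPPR priority-class comparison and the four conditions
of Section 29.2.2); the struct and function names are invented for the
example and are not KVM or hardware interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the guest state that gates delivery. */
struct toy_guest_state {
    bool rflags_if;             /* RFLAGS.IF = 1 */
    bool blocked_by_sti;        /* blocking by STI */
    bool blocked_by_mov_ss;     /* blocking by MOV SS or POP SS */
    bool intr_window_exiting;   /* "interrupt-window exiting" control */
};

/* Evaluation: compare the priority classes (bits 7:4) of RVI and VPPR. */
static bool toy_virtual_intr_recognized(uint8_t rvi, uint8_t vppr)
{
    return (rvi >> 4) > (vppr >> 4);
}

/* Delivery: a recognized interrupt is delivered only when all four hold. */
static bool toy_virtual_intr_deliverable(bool recognized,
                                         const struct toy_guest_state *s)
{
    assert(s != NULL);
    return recognized &&
           s->rflags_if &&
           !s->blocked_by_sti &&
           !s->blocked_by_mov_ss &&
           !s->intr_window_exiting;
}
```

In the ExtINT case above, the interrupt window is requested, so a recognized
virtual interrupt stays pending until that control is cleared again.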

Best regards,
Yang



RE: [PATCH v3 3/4] x86, apicv: add virtual interrupt delivery support

2012-12-06 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2012-12-07:
> On Thu, Dec 06, 2012 at 08:36:52AM +0200, Gleb Natapov wrote:
>> On Thu, Dec 06, 2012 at 05:02:15AM +0000, Zhang, Yang Z wrote:
>>> Zhang, Yang Z wrote on 2012-12-06:
>>>> Marcelo Tosatti wrote on 2012-12-06:
>>>>> On Mon, Dec 03, 2012 at 03:01:03PM +0800, Yang Zhang wrote:
>>>>>> Virtual interrupt delivery avoids KVM to inject vAPIC interrupts
>>>>>> manually, which is fully taken care of by the hardware. This needs
>>>>>> some special awareness into existing interrupr injection path:
>>>>>> 
>>>>>> - for pending interrupt, instead of direct injection, we may need
>>>>>>   update architecture specific indicators before resuming to guest.
>>>>>> - A pending interrupt, which is masked by ISR, should be also
>>>>>>   considered in above update action, since hardware will decide
>>>>>>   when to inject it at right time. Current has_interrupt and
>>>>>>   get_interrupt only returns a valid vector from injection p.o.v.
>>>>>> Signed-off-by: Yang Zhang 
>>>>>> Signed-off-by: Kevin Tian 
>>>>>> ---
>>>>>>  arch/x86/include/asm/kvm_host.h |    4 +
>>>>>>  arch/x86/include/asm/vmx.h      |   11 +++
>>>>>>  arch/x86/kvm/irq.c              |   53 ++-
>>>>>>  arch/x86/kvm/lapic.c            |   56 +---
>>>>>>  arch/x86/kvm/lapic.h            |    6 ++
>>>>>>  arch/x86/kvm/svm.c              |   19 +
>>>>>>  arch/x86/kvm/vmx.c              |  140 ++-
>>>>>>  arch/x86/kvm/x86.c              |   34 --
>>>>>>  virt/kvm/ioapic.c               |    1 +
>>>>>>  9 files changed, 291 insertions(+), 33 deletions(-)
>>>>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>>>>> index dc87b65..e5352c8 100644
>>>>>> --- a/arch/x86/include/asm/kvm_host.h
>>>>>> +++ b/arch/x86/include/asm/kvm_host.h
>>>>>> @@ -697,6 +697,10 @@ struct kvm_x86_ops {
>>>>>>  void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
>>>>>>  void (*enable_irq_window)(struct kvm_vcpu *vcpu);
>>>>>>  void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
>>>>>> +int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
>>>>>> +void (*update_irq)(struct kvm_vcpu *vcpu);
>>>>>> +void (*set_eoi_exitmap)(struct kvm_vcpu *vcpu, int vector,
>>>>>> +int trig_mode, int always_set);
>>>>>>  int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>>>>>>  int (*get_tdp_level)(void);
>>>>>>  u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
>>>>>> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
>>>>>> index 21101b6..1003341 100644
>>>>>> --- a/arch/x86/include/asm/vmx.h
>>>>>> +++ b/arch/x86/include/asm/vmx.h
>>>>>> @@ -62,6 +62,7 @@
>>>>>>  #define EXIT_REASON_MCE_DURING_VMENTRY  41
>>>>>>  #define EXIT_REASON_TPR_BELOW_THRESHOLD 43
>>>>>>  #define EXIT_REASON_APIC_ACCESS         44
>>>>>> +#define EXIT_REASON_EOI_INDUCED         45
>>>>>>  #define EXIT_REASON_EPT_VIOLATION       48
>>>>>>  #define EXIT_REASON_EPT_MISCONFIG       49
>>>>>>  #define EXIT_REASON_WBINVD              54
>>>>>> @@ -143,6 +144,7 @@
>>>>>>  #define SECONDARY_EXEC_WBINVD_EXITING         0x0040
>>>>>>  #define SECONDARY_EXEC_UNRESTRICTED_GUEST     0x0080
>>>>>>  #define SECONDARY_EXEC_APIC_REGISTER_VIRT     0x0100
>>>>>> +#define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY  0x0200
>>>>>>  #define SECONDARY_EXEC_PAUSE_LOOP_EXITING     0x0400
>>>>>>  #define SECONDARY_EXEC_ENABLE_INVPCID         0x1000
>>>>>> @@ -180,6 +182,7 @@ enum vmcs_field {
>>>>>>  GUEST_GS_SELECTOR               = 0x080a,
>>>>>>  GUEST_LDTR_SELECTOR             = 0x080c,
>>>>>>  GUEST_TR_SELECTOR               = 0x080e,
>>>>>> +GUEST_INTR_STATUS               = 0x0810,
>>>>

RE: [RESEND PATCH v4 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-09 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-09:
> On Sat, Dec 08, 2012 at 08:04:30PM +0800, Yang Zhang wrote:
>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>> index cfb7e4d..081225a 100644
>> --- a/virt/kvm/ioapic.c
>> +++ b/virt/kvm/ioapic.c
>> @@ -115,6 +115,40 @@ static void update_handled_vectors(struct kvm_ioapic *ioapic)
>>  smp_wmb();
>>  }
>> +void _ioapic_update_eoi_exitmap(struct kvm_ioapic *ioapic, int pin)
>> +{
> Bette make is ioapic_update_eoi_exitmap_one() or something. Underscore
> is undescriptive.
> 
>> +union kvm_ioapic_redirect_entry *e;
>> +
>> +e = &ioapic->redirtbl[pin];
>> +
>> +/* PIT is a special case: which is edge trig but have EOI hook.
>> + * Always set the eoi exit bitmap for PIT interrupt*/
> No hacks please. Check that ack notifier is register for gsi.
Do you mean doing this in kvm_register_irq_ack_notifier()? The problem is that
we cannot get the vector when this function is called, because it is called
during device initialization, when the guest has not started yet.

Best regards,
Yang



RE: [RESEND PATCH v4 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-09 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-10:
> On Mon, Dec 10, 2012 at 01:34:02AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-09:
>>> On Sat, Dec 08, 2012 at 08:04:30PM +0800, Yang Zhang wrote:
>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
>>>> index cfb7e4d..081225a 100644
>>>> --- a/virt/kvm/ioapic.c
>>>> +++ b/virt/kvm/ioapic.c
>>>> @@ -115,6 +115,40 @@ static void update_handled_vectors(struct kvm_ioapic *ioapic)
>>>>smp_wmb();
>>>>  }
>>>> +void _ioapic_update_eoi_exitmap(struct kvm_ioapic *ioapic, int pin)
>>>> +{
>>> Bette make is ioapic_update_eoi_exitmap_one() or something. Underscore
>>> is undescriptive.
>>> 
>>>> +  union kvm_ioapic_redirect_entry *e;
>>>> +
>>>> +  e = &ioapic->redirtbl[pin];
>>>> +
>>>> +  /* PIT is a special case: which is edge trig but have EOI hook.
>>>> +   * Always set the eoi exit bitmap for PIT interrupt*/
>>> No hacks please. Check that ack notifier is register for gsi.
>> Do you mean do this in kvm_register_irq_ack_notifier()? The problem is that
>> we cannot get the vector when calling this function. Because this function is
>> called during device initializing, and guest is not starting at that time.
>> 
> Call ioapic_update_eoi_exitmap() in kvm_register_irq_ack_notifier(),
> check that gsi (pin) has notifier registered in ioapic_update_eoi_exitmap().
> 
Ok.
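
A toy model of the approach agreed on above, for illustration only. The
types and names below (toy_ioapic, toy_rebuild_eoi_exitmap and so on) are
invented and are not the KVM data structures; the sketch just assumes that
the whole bitmap is rebuilt from the redirection table whenever an entry or
the notifier list changes, and that a vector needs an EOI exit if its pin is
level triggered or has an ack notifier registered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_IOAPIC_PINS   24
#define TOY_EXITMAP_WORDS 4   /* 256 vectors, 64 bits per word */

struct toy_redir_entry {
    uint8_t vector;
    bool    level_triggered;
};

struct toy_ioapic {
    struct toy_redir_entry redirtbl[TOY_IOAPIC_PINS];
    bool has_ack_notifier[TOY_IOAPIC_PINS];  /* per gsi (pin) */
};

/* Rebuild the whole EOI exit bitmap from the current ioapic state. */
static void toy_rebuild_eoi_exitmap(const struct toy_ioapic *io,
                                    uint64_t map[TOY_EXITMAP_WORDS])
{
    assert(io != NULL && map != NULL);
    memset(map, 0, TOY_EXITMAP_WORDS * sizeof(map[0]));

    for (int pin = 0; pin < TOY_IOAPIC_PINS; pin++) {
        const struct toy_redir_entry *e = &io->redirtbl[pin];

        /* No special case per device: the notifier check covers it. */
        if (e->level_triggered || io->has_ack_notifier[pin])
            map[e->vector / 64] |= (uint64_t)1 << (e->vector % 64);
    }
}

static bool toy_exitmap_test(const uint64_t map[TOY_EXITMAP_WORDS],
                             uint8_t vector)
{
    return (map[vector / 64] >> (vector % 64)) & 1;
}
```

With this shape, an edge-triggered pin that has an ack notifier gets its
vector into the bitmap without being special-cased, which is the point of
the review comment.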

Best regards,
Yang



RE: [PATCH v5 0/3] x86, apicv: Add APIC virtualization support

2012-12-11 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-11:
> Are you going to send the patch that removes non working PIT interrupt
> redirection hack, or should I do it?
Sure. I will do it.

> On Mon, Dec 10, 2012 at 03:20:37PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> APIC virtualization is a new feature which can eliminate most of VM exit
>> when vcpu handle a interrupt:
>> 
>> APIC register virtualization:
>> APIC read access doesn't cause APIC-access VM exits.
>> APIC write becomes trap-like.
>> Virtual interrupt delivery:
>> Virtual interrupt delivery avoids KVM to inject vAPIC interrupts
>> manually, which is fully taken care of by the hardware.
>> Please refer to Intel SDM volume 3, chapter 29 for more details.
>> 
>> Changes v4 to v5:
>>  * Set eoi exit bitmap when an interrupt has notifier registered.
>>  * Use request to track ioapic entry's modification.
>>  * Rebased on top of KVM upstream.
>> Changes v3 to v4:
>>  * use one option to control both register virtualization and virtual
>>    interrupt delivery.
>>  * Update eoi exit bitmap when programing ioapic or programing apic's
>>    id/dfr/ldr.
>>  * Rebased on top of KVM upstream.
>> Changes v2 to v3:
>>  * Drop Posted Interrupt patch from v3.
>>    According Gleb's suggestion, we will use global vector for all VCPUs
>>    as notification event vector. So we will rewrite the Posted
>>    Interrupt patch. And resend it later.
>>  * Use TMR to set the eoi exiting bitmap. We only want to set eoi exiting
>>    bitmap for those interrupt which is level trigger or has notifier in
>>    EOI write path. So TMR is enough to distinguish the interrupt trigger
>>    mode.
>>  * Simplify some code according Gleb's comments.
>>  * rebased on top of KVM upstream.
>> Changes v1 to v2:
>>  * Add Posted Interrupt support in this series patch.
>>  * Since there is a notifer hook in vAPIC EOI for PIT interrupt. So always
>>    Set PIT interrupt in eoi exit bitmap to force vmexit when EOI to
>>    interrupt.
>>  * Rebased on top of KVM upstream
>> Yang Zhang (3):
>>   x86, apicv: add APICv register virtualization support
>>   x86, apicv: add virtual interrupt delivery support
>>   x86, apicv: add virtual x2apic support
>>  arch/ia64/kvm/lapic.h           |    6 ++
>>  arch/x86/include/asm/kvm_host.h |    5 +
>>  arch/x86/include/asm/vmx.h      |   13 +++
>>  arch/x86/kvm/irq.c              |   76
>>  arch/x86/kvm/lapic.c            |  114 +--
>>  arch/x86/kvm/lapic.h            |   11 +++
>>  arch/x86/kvm/svm.c              |   25 +
>>  arch/x86/kvm/vmx.c              |  191 +--
>>  arch/x86/kvm/x86.c              |   18 +++--
>>  include/linux/kvm_host.h        |    2 +
>>  virt/kvm/ioapic.c               |   35 +++
>>  virt/kvm/ioapic.h               |    1 +
>>  virt/kvm/irq_comm.c             |   18
>>  13 files changed, 474 insertions(+), 41 deletions(-)
> 
> --
>   Gleb.


Best regards,
Yang



RE: [PATCH v5 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-11 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-11:
> Looks very good overall. Are you testing this with vid disabled with
> Linux/Windows guests? Small comments below.
Yes. I tested rhel6u3, rhel5u4, winxp and win7. All of them work well with and 
without vid enabled.

> On Mon, Dec 10, 2012 at 03:20:39PM +0800, Yang Zhang wrote:
>> Virtual interrupt delivery avoids KVM to inject vAPIC interrupts
>> manually, which is fully taken care of by the hardware. This needs
>> some special awareness into existing interrupr injection path:
>> 
>> - for pending interrupt, instead of direct injection, we may need
>>   update architecture specific indicators before resuming to guest.
>> - A pending interrupt, which is masked by ISR, should be also
>>   considered in above update action, since hardware will decide
>>   when to inject it at right time. Current has_interrupt and
>>   get_interrupt only returns a valid vector from injection p.o.v.
>> Signed-off-by: Kevin Tian 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/ia64/kvm/lapic.h           |    6 ++
>>  arch/x86/include/asm/kvm_host.h |    5 ++
>>  arch/x86/include/asm/vmx.h      |   11
>>  arch/x86/kvm/irq.c              |   76 ++--
>>  arch/x86/kvm/lapic.c            |   99 +---
>>  arch/x86/kvm/lapic.h            |    9 +++
>>  arch/x86/kvm/svm.c              |   25 +
>>  arch/x86/kvm/vmx.c              |  104 +--
>>  arch/x86/kvm/x86.c              |   18 ---
>>  include/linux/kvm_host.h        |    2 +
>>  virt/kvm/ioapic.c               |   35 +
>>  virt/kvm/ioapic.h               |    1 +
>>  virt/kvm/irq_comm.c             |   18 +++
>>  13 files changed, 372 insertions(+), 37 deletions(-)
>> diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
>> index c5f92a9..cb59eb4 100644
>> --- a/arch/ia64/kvm/lapic.h
>> +++ b/arch/ia64/kvm/lapic.h
>> @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
>>  #define kvm_apic_present(x) (true)
>>  #define kvm_lapic_enabled(x) (true)
>> +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
>> +struct kvm_lapic_irq *irq)
>> +{
>> +/* IA64 has no apicv supporting, do nothing here */
>> +}
>> +
>>  #endif
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index dc87b65..d797ade 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -697,6 +697,10 @@ struct kvm_x86_ops {
>>  void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
>>  void (*enable_irq_window)(struct kvm_vcpu *vcpu);
>>  void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
>> +int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
>> +void (*update_irq)(struct kvm_vcpu *vcpu, int max_irr);
> Lets call it update_apic_irq since this is what is does.
Ok.

>> +/*
>>   * Read pending interrupt vector and intack.
>>   */
>>  int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
>>  {
>> -struct kvm_pic *s;
>>  int vector;
>>  
>>  if (!irqchip_in_kernel(v->kvm))
>>  return v->arch.interrupt.nr;
>> -vector = kvm_get_apic_interrupt(v); /* APIC */
>> -if (vector == -1) {
>> -if (kvm_apic_accept_pic_intr(v)) {
>> -s = pic_irqchip(v->kvm);
>> -s->output = 0;  /* PIC */
>> -vector = kvm_pic_read_irq(v->kvm);
>> -}
>> +if (kvm_apic_vid_enabled(v))
>> +vector = kvm_cpu_get_extint(v); /* non-APIC */
>> +else {
>> +vector = kvm_get_apic_interrupt(v); /* APIC */
>> +if (vector == -1)
>> +vector = kvm_cpu_get_extint(v); /* non-APIC */
>>  }
> I've send the patch to fix ExtINT handling. Can you review it and rebase on
> top of it?
Sorry. I missed it.
From a performance point of view, I think this is not friendly. As we know,
ExtINT interrupts are rarely used (they may only exist in virtual wire mode).
Once the guest has booted, it is in APIC mode, so most of the time interrupts
go to the APIC, not the PIC, and checking ExtINT first seems unnecessary. In
my view, if there is no correctness issue, virtualization is not forced to
follow the hardware's behavior.

>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 3bdaf29..060f36b 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5534,12 +5534,10 @@ static void inject_pending_event(struct kvm_vcpu *vcpu)
>>  vcpu->arch.nmi_injected = true;
>>  kvm_x86_ops->set_nmi(vcpu);
>>  }
>> -} else if (kvm_cpu_has_interrupt(vcpu)) {
>> -if (kvm_x86_ops->interrupt_allowed(vcpu)) {
>> -kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu),
>> -false);
>> -kvm_x86_ops->set_irq(vcpu);
>> -

RE: [PATCH] KVM: inject ExtINT interrupt before APIC interrupts

2012-12-11 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-10:
> According to Intel SDM Volume 3 Section 10.8.1 "Interrupt Handling with
> the Pentium 4 and Intel Xeon Processors" and Section 10.8.2 "Interrupt
> Handling with the P6 Family and Pentium Processors" ExtINT interrupts are
> sent directly to the processor core for handling. Currently KVM checks
> APIC before it considers ExtINT interrupts for injection which is
> backwards from the spec. Make code behave according to the SDM.
Ack.

> Signed-off-by: Gleb Natapov 
> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
> index 848206d..cc31f7c 100644
> --- a/arch/x86/kvm/i8259.c
> +++ b/arch/x86/kvm/i8259.c
> @@ -241,6 +241,8 @@ int kvm_pic_read_irq(struct kvm *kvm)
>   int irq, irq2, intno;
>   struct kvm_pic *s = pic_irqchip(kvm);
> + s->output = 0;
> +
>   pic_lock(s);
>   irq = pic_get_irq(&s->pics[0]);
>   if (irq >= 0) {
> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
> index 7e06ba1..ebd98d0 100644
> --- a/arch/x86/kvm/irq.c
> +++ b/arch/x86/kvm/irq.c
> @@ -48,14 +48,10 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
>   if (!irqchip_in_kernel(v->kvm))
>   return v->arch.interrupt.pending;
> - if (kvm_apic_has_interrupt(v) == -1) {  /* LAPIC */
> - if (kvm_apic_accept_pic_intr(v)) {
> - s = pic_irqchip(v->kvm);/* PIC */
> - return s->output;
> - } else
> - return 0;
> - }
> - return 1;
> + if (kvm_apic_accept_pic_intr(v) && pic_irqchip(v->kvm)->output)
> + return pic_irqchip(v->kvm)->output; /* PIC */
> +
> + return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
>  }
>  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
> @@ -65,20 +61,14 @@ EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
>  int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
>  {
>   struct kvm_pic *s;
> - int vector;
> 
>   if (!irqchip_in_kernel(v->kvm))
>   return v->arch.interrupt.nr;
> - vector = kvm_get_apic_interrupt(v); /* APIC */
> - if (vector == -1) {
> - if (kvm_apic_accept_pic_intr(v)) {
> - s = pic_irqchip(v->kvm);
> - s->output = 0;  /* PIC */
> - vector = kvm_pic_read_irq(v->kvm);
> - }
> - }
> - return vector;
> + if (kvm_apic_accept_pic_intr(v) && pic_irqchip(v->kvm)->output)
> + return kvm_pic_read_irq(v->kvm);/* PIC */
> +
> + return kvm_get_apic_interrupt(v);   /* APIC */
>  }
>  EXPORT_SYMBOL_GPL(kvm_cpu_get_interrupt);
> --
>   Gleb.


Best regards,
Yang



RE: [PATCH v5 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-11 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-11:
> On Tue, Dec 11, 2012 at 12:05:39PM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-11:
>>> Looks very good overall. Are you testing this with vid disabled with
>>> Linux/Windows guests? Small comments below.
>> Yes. I tested rhel6u3, rhel5u4, winxp and win7. All of them work well
>> with and without vid enabled.
>> 
>>> On Mon, Dec 10, 2012 at 03:20:39PM +0800, Yang Zhang wrote:
>>>> Virtual interrupt delivery avoids KVM to inject vAPIC interrupts
>>>> manually, which is fully taken care of by the hardware. This needs
>>>> some special awareness into existing interrupr injection path:
>>>> 
>>>> - for pending interrupt, instead of direct injection, we may need
>>>>   update architecture specific indicators before resuming to guest.
>>>> - A pending interrupt, which is masked by ISR, should be also
>>>>   considered in above update action, since hardware will decide
>>>>   when to inject it at right time. Current has_interrupt and
>>>>   get_interrupt only returns a valid vector from injection p.o.v.
>>>> Signed-off-by: Kevin Tian 
>>>> Signed-off-by: Yang Zhang 
>>>> ---
>>>>  arch/ia64/kvm/lapic.h           |    6 ++
>>>>  arch/x86/include/asm/kvm_host.h |    5 ++
>>>>  arch/x86/include/asm/vmx.h      |   11
>>>>  arch/x86/kvm/irq.c              |   76 ++--
>>>>  arch/x86/kvm/lapic.c            |   99 +---
>>>>  arch/x86/kvm/lapic.h            |    9 +++
>>>>  arch/x86/kvm/svm.c              |   25 +
>>>>  arch/x86/kvm/vmx.c              |  104 +--
>>>>  arch/x86/kvm/x86.c              |   18 ---
>>>>  include/linux/kvm_host.h        |    2 +
>>>>  virt/kvm/ioapic.c               |   35 +
>>>>  virt/kvm/ioapic.h               |    1 +
>>>>  virt/kvm/irq_comm.c             |   18 +++
>>>>  13 files changed, 372 insertions(+), 37 deletions(-)
>>>> diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
>>>> index c5f92a9..cb59eb4 100644
>>>> --- a/arch/ia64/kvm/lapic.h
>>>> +++ b/arch/ia64/kvm/lapic.h
>>>> @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
>>>>  #define kvm_apic_present(x) (true)
>>>>  #define kvm_lapic_enabled(x) (true)
>>>> +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
>>>> +  struct kvm_lapic_irq *irq)
>>>> +{
>>>> +  /* IA64 has no apicv supporting, do nothing here */
>>>> +}
>>>> +
>>>>  #endif
>>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>>> index dc87b65..d797ade 100644
>>>> --- a/arch/x86/include/asm/kvm_host.h
>>>> +++ b/arch/x86/include/asm/kvm_host.h
>>>> @@ -697,6 +697,10 @@ struct kvm_x86_ops {
>>>>void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
>>>>void (*enable_irq_window)(struct kvm_vcpu *vcpu);
>>>>void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
>>>> +  int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
>>>> +  void (*update_irq)(struct kvm_vcpu *vcpu, int max_irr);
>>> Lets call it update_apic_irq since this is what is does.
>> Ok.
>> 
>>>> +/*
>>>>   * Read pending interrupt vector and intack.
>>>>   */
>>>>  int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
>>>>  {
>>>> -  struct kvm_pic *s;
>>>>    int vector;
>>>>  
>>>>if (!irqchip_in_kernel(v->kvm))
>>>>return v->arch.interrupt.nr;
>>>> -  vector = kvm_get_apic_interrupt(v); /* APIC */
>>>> -  if (vector == -1) {
>>>> -  if (kvm_apic_accept_pic_intr(v)) {
>>>> -  s = pic_irqchip(v->kvm);
>>>> -  s->output = 0;  /* PIC */
>>>> -  vector = kvm_pic_read_irq(v->kvm);
>>>> -  }
>>>> +  if (kvm_apic_vid_enabled(v))
>>>> +  vector = kvm_cpu_get_extint(v); /* non-APIC */
>>>> +  else {
>>>> +  vector = kvm_get_apic_interrupt(v); /* APIC */
>>>> +  if (vector == -1)
>>>> +  vector = kvm_cpu_get_extint(v); /* non-APIC */
>>>>}

RE: [PATCH v6 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-12 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-12:
> On Wed, Dec 12, 2012 at 12:56:47PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Virtual interrupt delivery avoids KVM to inject vAPIC interrupts
>> manually, which is fully taken care of by the hardware. This needs
>> some special awareness into existing interrupr injection path:
>> 
>> - for pending interrupt, instead of direct injection, we may need
>>   update architecture specific indicators before resuming to guest.
>> - A pending interrupt, which is masked by ISR, should be also
>>   considered in above update action, since hardware will decide
>>   when to inject it at right time. Current has_interrupt and
>>   get_interrupt only returns a valid vector from injection p.o.v.
>> Signed-off-by: Kevin Tian 
>> Signed-off-by: Yang Zhang 
>> ---
>> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
>> index ebd98d0..537ce4b 100644
>> --- a/arch/x86/kvm/irq.c
>> +++ b/arch/x86/kvm/irq.c
>> @@ -38,37 +38,81 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
>>  EXPORT_SYMBOL(kvm_cpu_has_pending_timer);
>>  
>>  /*
>> + * check if there is pending interrupt from
>> + * non-APIC source without intack.
>> + */
>> +static int kvm_cpu_has_extint(struct kvm_vcpu *v)
>> +{
>> +if (kvm_apic_accept_pic_intr(v))
>> +return pic_irqchip(v->kvm)->output; /* PIC */
>> +else
>> +return 0;
>> +}
>> +
>> +/*
>> + * check if there is injectable interrupt:
>> + * when virtual interrupt delivery enabled,
>> + * interrupt from apic will handled by hardware,
>> + * we don't need to check it here.
>> + */
>> +int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v)
>> +{
>> +if (!irqchip_in_kernel(v->kvm))
>> +return v->arch.interrupt.pending;
>> +
>> +if (kvm_cpu_has_extint(v))
>> +return 1;
>> +else if (!kvm_apic_vid_enabled(v))
>> +return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
>> +
> I think:
>    if (kvm_cpu_has_extint(v))
>        return 1;
>    if (kvm_apic_vid_enabled(v))
>        return 0;
>    return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
> is clearer.
OK.
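
For reference, the suggested control flow can be exercised in isolation with
a small stand-alone model. Everything below is a simplification with made-up
names (toy_vcpu, toy_has_extint, toy_has_injectable_intr); it only mirrors
the three checks in the suggestion, with plain fields standing in for the
PIC, the APIC and the vid setting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of the state the real helpers consult. */
struct toy_vcpu {
    bool accepts_pic_intr;  /* stands in for kvm_apic_accept_pic_intr() */
    bool pic_output;        /* stands in for pic_irqchip()->output */
    bool vid_enabled;       /* stands in for kvm_apic_vid_enabled() */
    int  apic_vector;       /* highest pending APIC vector, -1 if none */
};

static bool toy_has_extint(const struct toy_vcpu *v)
{
    return v->accepts_pic_intr && v->pic_output;
}

/* Same order as the suggestion: ExtINT, then vid, then the APIC. */
static bool toy_has_injectable_intr(const struct toy_vcpu *v)
{
    assert(v != NULL);
    if (toy_has_extint(v))
        return true;
    if (v->vid_enabled)
        return false;   /* APIC interrupts are left to the hardware */
    return v->apic_vector != -1;
}
```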
 
>> +/*
>>   * Read pending interrupt vector and intack.
>>   */
>>  int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
>>  {
>> -struct kvm_pic *s;
>> +int vector;
>> 
>>  if (!irqchip_in_kernel(v->kvm))
>>  return v->arch.interrupt.nr;
>> -if (kvm_apic_accept_pic_intr(v) && pic_irqchip(v->kvm)->output)
>> -return kvm_pic_read_irq(v->kvm);/* PIC */
>> +vector = kvm_cpu_get_extint(v);
>> +
>> +if (kvm_apic_vid_enabled(v))
>> +return vector;  /* PIC */
>> +else if (vector == -1)
>> +vector = kvm_get_apic_interrupt(v); /* APIC */
>> 
> No need "else" here:
>   if (kvm_apic_vid_enabled(v) || vector != -1)
> return vector;
>   return kvm_get_apic_interrupt(v);
Ok.
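
The same kind of stand-alone model works for the vector lookup. Again this is
only a sketch with invented names (toy_irq_src, toy_get_interrupt); it
assumes -1 means "no vector" and ignores the interrupt-acknowledge side
effects that the real functions have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inputs: what the ExtINT and APIC lookups would return. */
struct toy_irq_src {
    bool vid_enabled;
    int  extint_vector;  /* -1 if no ExtINT is pending */
    int  apic_vector;    /* -1 if no APIC interrupt is pending */
};

/* Without the "else": return early, fall through to the APIC otherwise. */
static int toy_get_interrupt(const struct toy_irq_src *s)
{
    assert(s != NULL);
    int vector = s->extint_vector;

    if (s->vid_enabled || vector != -1)
        return vector;
    return s->apic_vector;
}
```

With vid enabled the APIC vector is never returned from this path, since the
hardware delivers it; only an ExtINT vector (or -1) comes back.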

>> -return kvm_get_apic_interrupt(v);   /* APIC */
>> +return vector;
>>  }
>>  EXPORT_SYMBOL_GPL(kvm_cpu_get_interrupt);
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 0664c13..0dfbd47 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -236,12 +236,14 @@ static inline void kvm_apic_set_id(struct kvm_lapic *apic, u8 id)
>>  {
>>  apic_set_reg(apic, APIC_ID, id << 24);
>>  recalculate_apic_map(apic->vcpu->kvm);
>> +ioapic_update_eoi_exitmap(apic->vcpu->kvm);
>>  }
>>  
>>  static inline void kvm_apic_set_ldr(struct kvm_lapic *apic, u32 id)
>>  {
>>  apic_set_reg(apic, APIC_LDR, id);
>>  recalculate_apic_map(apic->vcpu->kvm);
>> +ioapic_update_eoi_exitmap(apic->vcpu->kvm);
>>  }
>>  
>>  static inline int apic_lvt_enabled(struct kvm_lapic *apic, int lvt_type)
>> @@ -577,6 +579,63 @@ int kvm_apic_match_dest(struct kvm_vcpu *vcpu,
> struct kvm_lapic *source,
>>  return result;
>>  }
>> +static void kvm_apic_update_eoi_exitmap(struct kvm_vcpu *vcpu,
>> +u32 vector, bool set)
>> +{
>> +kvm_x86_ops->update_eoi_exitmap(vcpu, vector, set);
>> +}
>> +
>> +void kvm_update_eoi_exitmap(struct kvm *kvm, struct kvm_lapic_irq *irq)
>> +{
> We probably should move the whole function into vmx code, not just
> bitmap update logic. Sorry for not mentioning it earlier.
Sure.

>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index dcb7952..b501d5a 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -3573,6 +3573,27 @@ static void update_cr8_intercept(struct kvm_vcpu
> *vcpu, int tpr, int irr)
>>  set_cr_intercept(svm, INTERCEPT_CR8_WRITE);
>>  }
>> +static int svm_has_virtual_interrupt_delivery(struct kvm_vcpu *vcpu)
>> +{
>> +return 0;
>> +}
>> +
>> +static void svm_update_apic_irq(struct kvm_vcpu *vcpu, int max_irr)
>> +{
>> +return ;
>> +}
> You do not need this function any more since caller checks for NULL
> pointer.
>> +.has_virtual_interrupt_delivery = svm_has_virtual_interrupt_delivery,
>> +.update_apic_irq = svm_upd

RE: [PATCH 1/2] x86: Enable ack interrupt on vmexit

2012-12-12 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-13:
> On Thu, Dec 13, 2012 at 03:29:39PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> Ack interrupt on vmexit is required by Posted Interrupt. With it,
>> when external interrupt caused vmexit, the cpu will acknowledge the
>> interrupt controller and save the interrupt's vector in vmcs. Only
>> enable it when posted interrupt is enabled.
>> 
>> There are several approaches to enable it. This patch uses a simply
>> way: re-generate an interrupt via self ipi.
>> 
>> Signed-off-by: Yang Zhang 
>> ---
>>  arch/x86/kvm/vmx.c |   20 +---
>>  1 files changed, 17 insertions(+), 3 deletions(-)
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 8cd9eb7..6b6bd03 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -2549,7 +2549,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
>>  #ifdef CONFIG_X86_64
>>  	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
>>  #endif
>> -	opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT;
>> +	opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT | VM_EXIT_ACK_INTR_ON_EXIT;
>>  	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
>>  				&_vmexit_control) < 0)
>>  		return -EIO;
>> @@ -3913,6 +3913,7 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
>>  	unsigned long a;
>>  #endif
>>  	int i;
>> +	u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;
>> 
>>  	/* I/O */
>>  	vmcs_write64(IO_BITMAP_A, __pa(vmx_io_bitmap_a));
>> @@ -3996,8 +3997,10 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
>>  		vmx->guest_msrs[j].mask = -1ull;
>>  		++vmx->nmsrs;
>>  	}
>> -
>> -vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
>> +
>> +if(!enable_apicv_pi)
>> +vmexit_ctrl &= ~VM_EXIT_ACK_INTR_ON_EXIT;
>> +vmcs_write32(VM_EXIT_CONTROLS, vmexit_ctrl);
>> 
>>  /* 22.2.1, 20.8.1 */
>>  vmcs_write32(VM_ENTRY_CONTROLS, vmcs_config.vmentry_ctrl);
>> @@ -6267,6 +6270,17 @@ static void vmx_complete_atomic_exit(struct
> vcpu_vmx *vmx)
>>  asm("int $2");
>>  kvm_after_handle_nmi(&vmx->vcpu);
>>  }
>> +	if ((exit_intr_info & INTR_INFO_INTR_TYPE_MASK) == INTR_TYPE_EXT_INTR &&
>> +	    (exit_intr_info & INTR_INFO_VALID_MASK) && enable_apicv_pi) {
>> +		unsigned int vector, tmr;
>> +
>> +		vector = exit_intr_info & INTR_INFO_VECTOR_MASK;
>> +		tmr = apic_read(APIC_TMR + ((vector & ~0x1f) >> 1));
>> +		apic_eoi();
>> +		if ( !((1 << (vector % 32)) & tmr) )
>> +			apic->send_IPI_self(vector);
>> +	}
> What happen with the idea to dispatch interrupt through idt without IPI?
I am not sure the upstream maintainers will allow exporting the IDT to a
module. If that is not a problem, I can do it as you suggested.
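
To make the quoted hunk easier to follow, here is a stand-alone sketch of the TMR arithmetic it relies on. The helper names are hypothetical. APIC_TMR is the 0x180 offset of the first Trigger Mode Register, each register covers 32 vectors, and consecutive registers sit 0x10 bytes apart. A clear TMR bit means the vector is edge-triggered, which is the case the patch re-sends as a self IPI after the EOI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define APIC_TMR 0x180u  /* offset of the first Trigger Mode Register */

/* Offset of the 32-bit TMR register that holds the bit for 'vector'. */
static uint32_t tmr_reg_offset(unsigned int vector)
{
	assert(vector < 256);
	/* (vector / 32) * 0x10, written the same way as in the patch. */
	return APIC_TMR + ((vector & ~0x1fu) >> 1);
}

/* Bit mask of 'vector' inside that register. */
static uint32_t tmr_bit_mask(unsigned int vector)
{
	return UINT32_C(1) << (vector % 32);
}

/* True when the TMR value marks 'vector' as edge-triggered (bit clear). */
static bool vector_is_edge(unsigned int vector, uint32_t tmr_value)
{
	return !(tmr_value & tmr_bit_mask(vector));
}
```

For vector 0x31, for example, the bit lives in the register at offset 0x190, at bit position 17.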

>> +
>>  }
>>  
>>  static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx)
>> --
>> 1.7.1
> 
> --
>   Gleb.
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Best regards,
Yang




RE: [PATCH 1/2] x86: Enable ack interrupt on vmexit

2012-12-13 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-13:
> On Thu, Dec 13, 2012 at 07:54:35AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-13:
>>> On Thu, Dec 13, 2012 at 03:29:39PM +0800, Yang Zhang wrote:
>>>> From: Yang Zhang 
>>>> 
>>>> Ack interrupt on vmexit is required by Posted Interrupt. With it,
>>>> when external interrupt caused vmexit, the cpu will acknowledge the
>>>> interrupt controller and save the interrupt's vector in vmcs. Only
>>>> enable it when posted interrupt is enabled.
>>>> 
>>>> There are several approaches to enable it. This patch uses a simply
>>>> way: re-generate an interrupt via self ipi.
>>>> 
>>>> Signed-off-by: Yang Zhang 
>>>> ---
>>>>  arch/x86/kvm/vmx.c |   20 +---
>>>>  1 files changed, 17 insertions(+), 3 deletions(-)
>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>> index 8cd9eb7..6b6bd03 100644
>>>> --- a/arch/x86/kvm/vmx.c
>>>> +++ b/arch/x86/kvm/vmx.c
>>>> @@ -2549,7 +2549,7 @@ static __init int setup_vmcs_config(struct
>>> vmcs_config *vmcs_conf)
>>>>  #ifdef CONFIG_X86_64
>>>>min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
>>>>  #endif
>>>> -  opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT;
>>>> +  opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
>>> VM_EXIT_ACK_INTR_ON_EXIT;
>>>>if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
>>>>&_vmexit_control) < 0)  return -EIO; @@ 
>>>> -3913,6 +3913,7 @@
>>>>  static int vmx_vcpu_setup(struct vcpu_vmx *vmx)   unsigned long a;
>>>>  #endifint i;
>>>> +  u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;
>>>> 
>>>>/* I/O */   vmcs_write64(IO_BITMAP_A, __pa(vmx_io_bitmap_a)); @@
>>>>  -3996,8 +3997,10 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
>>>>vmx->guest_msrs[j].mask = -1ull;++vmx->nmsrs;   
>>>> }
>>>> -
>>>> -  vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
>>>> +
>>>> +  if(!enable_apicv_pi)
>>>> +  vmexit_ctrl &= ~VM_EXIT_ACK_INTR_ON_EXIT;
>>>> +  vmcs_write32(VM_EXIT_CONTROLS, vmexit_ctrl);
>>>> 
>>>>/* 22.2.1, 20.8.1 */
>>>>vmcs_write32(VM_ENTRY_CONTROLS, vmcs_config.vmentry_ctrl);
>>>> @@ -6267,6 +6270,17 @@ static void vmx_complete_atomic_exit(struct
>>> vcpu_vmx *vmx)
>>>>asm("int $2");
>>>>kvm_after_handle_nmi(&vmx->vcpu);
>>>>}
>>>> +	if ((exit_intr_info & INTR_INFO_INTR_TYPE_MASK) == INTR_TYPE_EXT_INTR &&
>>>> +	    (exit_intr_info & INTR_INFO_VALID_MASK) && enable_apicv_pi) {
>>>> +		unsigned int vector, tmr;
>>>> +
>>>> +		vector = exit_intr_info & INTR_INFO_VECTOR_MASK;
>>>> +		tmr = apic_read(APIC_TMR + ((vector & ~0x1f) >> 1));
>>>> +		apic_eoi();
>>>> +		if ( !((1 << (vector % 32)) & tmr) )
>>>> +			apic->send_IPI_self(vector);
>>>> +	}
>>> What happen with the idea to dispatch interrupt through idt without IPI?
>> I am not sure upstream guys will allow to export idt to a module. If it
>> is not a problem, then can do it as you suggested.
>> 
> I replied to that before. No need to export idt to modules. Add function
> to entry_32/64.S that does dispatching and export it instead.
It still needs to touch common code. Do you think the upstream maintainers will accept this?

>>>> +
>>>>  }
>>>>  
>>>>  static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx)
>>>> --
>>>> 1.7.1
>>> 
>>> --
>>> Gleb.
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>> 
>> Best regards,
>> Yang
>> 
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH 1/2] x86: Enable ack interrupt on vmexit

2012-12-13 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-13:
> On Thu, Dec 13, 2012 at 08:03:06AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-13:
>>> On Thu, Dec 13, 2012 at 07:54:35AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2012-12-13:
>>>>> On Thu, Dec 13, 2012 at 03:29:39PM +0800, Yang Zhang wrote:
>>>>>> From: Yang Zhang 
>>>>>> 
>>>>>> Ack interrupt on vmexit is required by Posted Interrupt. With it,
>>>>>> when external interrupt caused vmexit, the cpu will acknowledge the
>>>>>> interrupt controller and save the interrupt's vector in vmcs. Only
>>>>>> enable it when posted interrupt is enabled.
>>>>>> 
>>>>>> There are several approaches to enable it. This patch uses a simply
>>>>>> way: re-generate an interrupt via self ipi.
>>>>>> 
>>>>>> Signed-off-by: Yang Zhang 
>>>>>> ---
>>>>>>  arch/x86/kvm/vmx.c |   20 +---
>>>>>>  1 files changed, 17 insertions(+), 3 deletions(-)
>>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>>>> index 8cd9eb7..6b6bd03 100644
>>>>>> --- a/arch/x86/kvm/vmx.c
>>>>>> +++ b/arch/x86/kvm/vmx.c
>>>>>> @@ -2549,7 +2549,7 @@ static __init int setup_vmcs_config(struct
>>>>> vmcs_config *vmcs_conf)
>>>>>>  #ifdef CONFIG_X86_64
>>>>>>  min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
>>>>>>  #endif
>>>>>> -opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT;
>>>>>> +opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
>>>>> VM_EXIT_ACK_INTR_ON_EXIT;
>>>>>>  if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
>>>>>>  &_vmexit_control) < 0)  return 
>>>>>> -EIO; @@ -3913,6 +3913,7 @@
>>>>>>  static int vmx_vcpu_setup(struct vcpu_vmx *vmx) unsigned long a;
>>>>>>  #endif  int i;
>>>>>> +u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;
>>>>>> 
>>>>>>  /* I/O */   vmcs_write64(IO_BITMAP_A, 
>>>>>> __pa(vmx_io_bitmap_a)); @@
>>>>>>  -3996,8 +3997,10 @@ static int vmx_vcpu_setup(struct vcpu_vmx
>>>>>>  *vmx)   vmx->guest_msrs[j].mask = -1ull;
>>>>>> ++vmx->nmsrs;   }
>>>>>> -
>>>>>> -vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
>>>>>> +
>>>>>> +if(!enable_apicv_pi)
>>>>>> +vmexit_ctrl &= ~VM_EXIT_ACK_INTR_ON_EXIT;
>>>>>> +vmcs_write32(VM_EXIT_CONTROLS, vmexit_ctrl);
>>>>>> 
>>>>>>  /* 22.2.1, 20.8.1 */
>>>>>>  vmcs_write32(VM_ENTRY_CONTROLS, vmcs_config.vmentry_ctrl);
>>>>>> @@ -6267,6 +6270,17 @@ static void vmx_complete_atomic_exit(struct
>>>>> vcpu_vmx *vmx)
>>>>>>  asm("int $2");
>>>>>>  kvm_after_handle_nmi(&vmx->vcpu);
>>>>>>  }
>>>>>> +	if ((exit_intr_info & INTR_INFO_INTR_TYPE_MASK) == INTR_TYPE_EXT_INTR &&
>>>>>> +	    (exit_intr_info & INTR_INFO_VALID_MASK) && enable_apicv_pi) {
>>>>>> +		unsigned int vector, tmr;
>>>>>> +
>>>>>> +		vector = exit_intr_info & INTR_INFO_VECTOR_MASK;
>>>>>> +		tmr = apic_read(APIC_TMR + ((vector & ~0x1f) >> 1));
>>>>>> +		apic_eoi();
>>>>>> +		if ( !((1 << (vector % 32)) & tmr) )
>>>>>> +			apic->send_IPI_self(vector);
>>>>>> +	}
>>>>> What happen with the idea to dispatch interrupt through idt without IPI?
>>>> I am not sure upstream guys will allow to export idt to a module. If it
>>>> is not a problem, then can do it as you suggested.
>>>> 
>>> I replied to that before. No need to export idt to modules. Add function
>>> to entry_32/64.S that does dispatching and export it instead.
>> It still need to touch common code. Do you think upstream guys will
>> buy-in this?
>> 
> What's the problem with touching common code

RE: [PATCH 1/2] x86: Enable ack interrupt on vmexit

2012-12-13 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-13:
> On Thu, Dec 13, 2012 at 08:19:01AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-13:
>>> On Thu, Dec 13, 2012 at 08:03:06AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2012-12-13:
>>>>> On Thu, Dec 13, 2012 at 07:54:35AM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2012-12-13:
>>>>>>> On Thu, Dec 13, 2012 at 03:29:39PM +0800, Yang Zhang wrote:
>>>>>>>> From: Yang Zhang 
>>>>>>>> 
>>>>>>>> Ack interrupt on vmexit is required by Posted Interrupt. With it,
>>>>>>>> when external interrupt caused vmexit, the cpu will acknowledge the
>>>>>>>> interrupt controller and save the interrupt's vector in vmcs. Only
>>>>>>>> enable it when posted interrupt is enabled.
>>>>>>>> 
>>>>>>>> There are several approaches to enable it. This patch uses a simply
>>>>>>>> way: re-generate an interrupt via self ipi.
>>>>>>>> 
>>>>>>>> Signed-off-by: Yang Zhang 
>>>>>>>> ---
>>>>>>>>  arch/x86/kvm/vmx.c |   20 +---
>>>>>>>>  1 files changed, 17 insertions(+), 3 deletions(-)
>>>>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>>>>>> index 8cd9eb7..6b6bd03 100644
>>>>>>>> --- a/arch/x86/kvm/vmx.c
>>>>>>>> +++ b/arch/x86/kvm/vmx.c
>>>>>>>> @@ -2549,7 +2549,7 @@ static __init int setup_vmcs_config(struct
>>>>>>> vmcs_config *vmcs_conf)
>>>>>>>>  #ifdef CONFIG_X86_64
>>>>>>>>min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
>>>>>>>>  #endif
>>>>>>>> -  opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT;
>>>>>>>> +  opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
>>>>>>> VM_EXIT_ACK_INTR_ON_EXIT;
>>>>>>>>if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
>>>>>>>>&_vmexit_control) < 0)  return 
>>>>>>>> -EIO; @@ -3913,6 +3913,7 @@
>>>>>>>>  static int vmx_vcpu_setup(struct vcpu_vmx *vmx)   unsigned long
>>>>>>>>  a; #endif int i;
>>>>>>>> +  u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;
>>>>>>>> 
>>>>>>>>/* I/O */   vmcs_write64(IO_BITMAP_A, 
>>>>>>>> __pa(vmx_io_bitmap_a)); @@
>>>>>>>>  -3996,8 +3997,10 @@ static int vmx_vcpu_setup(struct vcpu_vmx
>>>>>>>>  *vmx) vmx->guest_msrs[j].mask = -1ull;
>>>>>>>> ++vmx->nmsrs;
>   }
>>>>>>>> -
>>>>>>>> -  vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
>>>>>>>> +
>>>>>>>> +  if(!enable_apicv_pi)
>>>>>>>> +  vmexit_ctrl &= ~VM_EXIT_ACK_INTR_ON_EXIT;
>>>>>>>> +  vmcs_write32(VM_EXIT_CONTROLS, vmexit_ctrl);
>>>>>>>> 
>>>>>>>>/* 22.2.1, 20.8.1 */
>>>>>>>>vmcs_write32(VM_ENTRY_CONTROLS,
> vmcs_config.vmentry_ctrl);
>>>>>>>> @@ -6267,6 +6270,17 @@ static void
> vmx_complete_atomic_exit(struct
>>>>>>> vcpu_vmx *vmx)
>>>>>>>>asm("int $2");
>>>>>>>>kvm_after_handle_nmi(&vmx->vcpu);
>>>>>>>>}
>>>>>>>> +	if ((exit_intr_info & INTR_INFO_INTR_TYPE_MASK) == INTR_TYPE_EXT_INTR &&
>>>>>>>> +	    (exit_intr_info & INTR_INFO_VALID_MASK) && enable_apicv_pi) {
>>>>>>>> +		unsigned int vector, tmr;
>>>>>>>> +
>>>>>>>> +		vector = exit_intr_info & INTR_INFO_VECTOR_MASK;
>>>>>>>> +		tmr = apic_read(APIC_TMR + ((vector & ~0x1f) >> 1));
>>>>>>>> +		apic_eoi();
>>>>>>>> +		if ( !((1 << (vector % 32)) & tmr) )
>>>>>>>> +			apic->send_IPI_self(vector);
>>>>>>>> +	}
>>>>>>> What happen with the idea to dispatch interrupt through idt without
> IPI?
>>>>>> I am not sure upstream guys will allow to export idt to a module. If it
>>>>>> is not a problem, then can do it as you suggested.
>>>>>> 
>>>>> I replied to that before. No need to export idt to modules. Add function
>>>>> to entry_32/64.S that does dispatching and export it instead.
>>>> It still need to touch common code. Do you think upstream guys will
>>>> buy-in this?
>>>> 
>>> What's the problem with touching common code? Show the code, get the
>>> acks. But wait for merge window to close.
>> You are right. We hope to push the PI patch ASAP. If we touch common code,
>> it may take a long time to discuss and reach a final decision. As we
>> discussed earlier, I will later enable this feature in KVM in general, not
>> just when PI is enabled. At that time, we can get some performance data and
>> see whether the self IPI is a big problem. Until the data is ready, I think
>> limiting all changes to the KVM modules is the better way. What do you think?
>> 
> I think we have plenty of time till 3.9. We should do it right, not
> quick.
Sure. I will change it in the next version.

Best regards,
Yang




RE: [PATCH 1/2] x86: Enable ack interrupt on vmexit

2012-12-16 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-13:
> On Thu, Dec 13, 2012 at 08:19:01AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-13:
>>> On Thu, Dec 13, 2012 at 08:03:06AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2012-12-13:
>>>>> On Thu, Dec 13, 2012 at 07:54:35AM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2012-12-13:
>>>>>>> On Thu, Dec 13, 2012 at 03:29:39PM +0800, Yang Zhang wrote:
>>>>>>>> From: Yang Zhang 
>>>>>>>> 
>>>>>>>> Ack interrupt on vmexit is required by Posted Interrupt. With it,
>>>>>>>> when external interrupt caused vmexit, the cpu will acknowledge the
>>>>>>>> interrupt controller and save the interrupt's vector in vmcs. Only
>>>>>>>> enable it when posted interrupt is enabled.
>>>>>>>> 
>>>>>>>> There are several approaches to enable it. This patch uses a simply
>>>>>>>> way: re-generate an interrupt via self ipi.
>>>>>>>> 
>>>>>>>> Signed-off-by: Yang Zhang 
>>>>>>>> ---
>>>>>>>>  arch/x86/kvm/vmx.c |   20 +---
>>>>>>>>  1 files changed, 17 insertions(+), 3 deletions(-)
>>>>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>>>>>> index 8cd9eb7..6b6bd03 100644
>>>>>>>> --- a/arch/x86/kvm/vmx.c
>>>>>>>> +++ b/arch/x86/kvm/vmx.c
>>>>>>>> @@ -2549,7 +2549,7 @@ static __init int setup_vmcs_config(struct
>>>>>>> vmcs_config *vmcs_conf)
>>>>>>>>  #ifdef CONFIG_X86_64
>>>>>>>>min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
>>>>>>>>  #endif
>>>>>>>> -  opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT;
>>>>>>>> +  opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
>>>>>>> VM_EXIT_ACK_INTR_ON_EXIT;
>>>>>>>>if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
>>>>>>>>&_vmexit_control) < 0)  return 
>>>>>>>> -EIO; @@ -3913,6 +3913,7 @@
>>>>>>>>  static int vmx_vcpu_setup(struct vcpu_vmx *vmx)   unsigned long
>>>>>>>>  a; #endif int i;
>>>>>>>> +  u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;
>>>>>>>> 
>>>>>>>>/* I/O */   vmcs_write64(IO_BITMAP_A, 
>>>>>>>> __pa(vmx_io_bitmap_a)); @@
>>>>>>>>  -3996,8 +3997,10 @@ static int vmx_vcpu_setup(struct vcpu_vmx
>>>>>>>>  *vmx) vmx->guest_msrs[j].mask = -1ull;
>>>>>>>> ++vmx->nmsrs;
>   }
>>>>>>>> -
>>>>>>>> -  vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
>>>>>>>> +
>>>>>>>> +  if(!enable_apicv_pi)
>>>>>>>> +  vmexit_ctrl &= ~VM_EXIT_ACK_INTR_ON_EXIT;
>>>>>>>> +  vmcs_write32(VM_EXIT_CONTROLS, vmexit_ctrl);
>>>>>>>> 
>>>>>>>>/* 22.2.1, 20.8.1 */
>>>>>>>>vmcs_write32(VM_ENTRY_CONTROLS,
> vmcs_config.vmentry_ctrl);
>>>>>>>> @@ -6267,6 +6270,17 @@ static void
> vmx_complete_atomic_exit(struct
>>>>>>> vcpu_vmx *vmx)
>>>>>>>>asm("int $2");
>>>>>>>>kvm_after_handle_nmi(&vmx->vcpu);
>>>>>>>>}
>>>>>>>> +	if ((exit_intr_info & INTR_INFO_INTR_TYPE_MASK) == INTR_TYPE_EXT_INTR &&
>>>>>>>> +	    (exit_intr_info & INTR_INFO_VALID_MASK) && enable_apicv_pi) {
>>>>>>>> +		unsigned int vector, tmr;
>>>>>>>> +
>>>>>>>> +		vector = exit_intr_info & INTR_INFO_VECTOR_MASK;
>>>>>>>> +		tmr = apic_read(APIC_TMR + ((vector & ~0x1f) >> 1));
>>>>>>>> +		apic_eoi();
>>>>>>>> +		if ( !((1 << (vector % 32)) & tmr) )
>>>>>>>> +			apic->send_IPI_self(vector);
>>>>>>>> +	}
>>>>>>> What happen with the idea to dispatch interrupt through idt without
> IPI?
>>>>>> I am not sure upstream guys will allow to export idt to a module. If it
>>>>>> is not a problem, then can do it as you suggested.
>>>>>> 
>>>>> I replied to that before. No need to export idt to modules. Add function
>>>>> to entry_32/64.S that does dispatching and export it instead.
>>>> It still need to touch common code. Do you think upstream guys will
>>>> buy-in this?
>>>> 
>>> What's the problem with touching common code? Show the code, get the
>>> acks. But wait for merge window to close.
>> You are right. We hope to push the PI patch ASAP. If we touch common code,
>> it may take a long time to discuss and reach a final decision. As we
>> discussed earlier, I will later enable this feature in KVM in general, not
>> just when PI is enabled. At that time, we can get some performance data and
>> see whether the self IPI is a big problem. Until the data is ready, I think
>> limiting all changes to the KVM modules is the better way. What do you think?
>> 
> I think we have plenty of time till 3.9. We should do it right, not
> quick.
Another choice is to get the IDT entry through the IDTR. This way, we can
achieve the same goal while keeping the changes inside the KVM module.
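
As a rough illustration of what reading an IDT entry involves, the sketch below decodes the handler address from a 64-bit mode interrupt gate. It assumes the 16-byte gate layout described in the Intel SDM; `struct idt_gate64` and `gate_handler_address()` are hypothetical names, and locating the table itself through the IDTR (for example with sidt) is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit mode IDT gate descriptor, 16 bytes (hypothetical type name). */
struct idt_gate64 {
	uint16_t offset_low;     /* handler address bits 15..0 */
	uint16_t segment;        /* code segment selector */
	uint16_t flags;          /* IST, gate type, DPL, present bit */
	uint16_t offset_middle;  /* handler address bits 31..16 */
	uint32_t offset_high;    /* handler address bits 63..32 */
	uint32_t reserved;
};

/* Reassemble the handler entry point from its three pieces. */
static uint64_t gate_handler_address(const struct idt_gate64 *gate)
{
	assert(sizeof(struct idt_gate64) == 16);

	return (uint64_t)gate->offset_low |
	       ((uint64_t)gate->offset_middle << 16) |
	       ((uint64_t)gate->offset_high << 32);
}
```

The entry for a given vector would sit at index `vector` of the table whose base address the IDTR holds.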

Best regards,
Yang



RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-23 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-20:
> On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
>> basically to benefit from apicv, we need clear MSR bitmap for
>> corresponding x2apic MSRs:
>> 0x800 - 0x8ff: no read intercept for apicv register virtualization
>> TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery
> We do not set "Virtualize x2APIC mode" bit in secondary execution
> control. If I read the spec correctly without that those MSR read/writes
> will go straight to physical local APIC.
Right. It brings no benefit now, but we may enable that bit in the future,
and then we can benefit from it.
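
For readers checking the MSR numbers used in the patch (0x808, 0x80b, 0x83f), the usual x2APIC rule is that a register at xAPIC offset N is accessed through MSR 0x800 + (N >> 4). A small sketch, with a hypothetical helper name and register offsets as commonly defined (TPR 0x80, EOI 0xB0, SELF-IPI 0x3F0):

```c
#include <assert.h>
#include <stdint.h>

#define X2APIC_MSR_BASE 0x800u

/* Map a 16-byte-aligned APIC register offset to its x2APIC MSR index. */
static uint32_t x2apic_msr_for_offset(uint32_t reg_offset)
{
	assert((reg_offset & 0xfu) == 0 && reg_offset < 0x1000u);
	return X2APIC_MSR_BASE + (reg_offset >> 4);
}
```

The whole register page therefore maps onto MSRs 0x800 - 0x8ff, which is the range the patch loops over.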

>> Signed-off-by: Yang Zhang 
>> Signed-off-by: Kevin Tian 
>> ---
>>  arch/x86/kvm/vmx.c |   62 ++--
>>  1 files changed, 55 insertions(+), 7 deletions(-)
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index be66c3e..9b5e7a2 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3773,7 +3773,10 @@ static void free_vpid(struct vcpu_vmx *vmx)
>>  spin_unlock(&vmx_vpid_lock);
>>  }
>> -static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr)
>> +#define MSR_TYPE_R	1
>> +#define MSR_TYPE_W	2
>> +static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
>> +						u32 msr, int type)
>>  {
>>  	int f = sizeof(unsigned long);
>> @@ -3786,20 +3789,52 @@ static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr)
>>  	 * We can control MSRs 0x00000000-0x00001fff and 0xc0000000-0xc0001fff.
>>  	 */
>>  	if (msr <= 0x1fff) {
>> -__clear_bit(msr, msr_bitmap + 0x000 / f); /* read-low */
>> -__clear_bit(msr, msr_bitmap + 0x800 / f); /* write-low */
>> +if (type & MSR_TYPE_R)
>> +/* read-low */
>> +__clear_bit(msr, msr_bitmap + 0x000 / f);
>> +
>> +if (type & MSR_TYPE_W)
>> +/* write-low */
>> +__clear_bit(msr, msr_bitmap + 0x800 / f);
>> +
>>  	} else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff)) {
>>  msr &= 0x1fff;
>> -__clear_bit(msr, msr_bitmap + 0x400 / f); /* read-high */
>> -__clear_bit(msr, msr_bitmap + 0xc00 / f); /* write-high */
>> +if (type & MSR_TYPE_R)
>> +/* read-high */
>> +__clear_bit(msr, msr_bitmap + 0x400 / f);
>> +
>> +if (type & MSR_TYPE_W)
>> +/* write-high */
>> +__clear_bit(msr, msr_bitmap + 0xc00 / f);
>> +
>>  }
>>  }
>>  
>>  static void vmx_disable_intercept_for_msr(u32 msr, bool longmode_only)
>>  {
>>  if (!longmode_only)
>> -__vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, msr);
>> -__vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, msr);
>> +		__vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy,
>> +						msr, MSR_TYPE_R | MSR_TYPE_W);
>> +	__vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode,
>> +						msr, MSR_TYPE_R | MSR_TYPE_W);
>> +}
>> +
>> +static void vmx_disable_intercept_for_msr_read(u32 msr, bool longmode_only)
>> +{
>> +	if (!longmode_only)
>> +		__vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy,
>> +						msr, MSR_TYPE_R);
>> +	__vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode,
>> +						msr, MSR_TYPE_R);
>> +}
>> +
>> +static void vmx_disable_intercept_for_msr_write(u32 msr, bool longmode_only)
>> +{
>> +	if (!longmode_only)
>> +		__vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy,
>> +						msr, MSR_TYPE_W);
>> +	__vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode,
>> +						msr, MSR_TYPE_W);
>>  }
>>  
>>  /*
>> @@ -7633,6 +7668,19 @@ static int __init vmx_init(void)
>>  vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
>>  vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
>> +if (enable_apicv_reg_vid) {
>> +int msr;
>> +for (msr = 0x800; msr <= 0x8ff; msr++)
>> +vmx_disable_intercept_for_msr_read(msr, false);
>> +
>> +/* TPR */
>> +vmx_disable_intercept_for_msr_write(0x808, false);
>> +/* EOI */
>> +vmx_disable_intercept_for_msr_write(0x80b, false);
>> +/* SELF-IPI */
>> +vmx_disable_intercept_for_msr_write(0x83f, false);
>> +}
>> +
>>  if (enable_ept) {
>>  kvm_mmu_set_mask_ptes(0ull,
>>  (enable_ept_ad_bits) ? VMX_EPT_ACCESS_BIT : 0ull,
>> --
>> 1.7.1
> 
> --
>   Gleb.


Best regards,
Yang



RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-23 Thread Zhang, Yang Z
Zhang, Yang Z wrote on 2012-12-24:
> Gleb Natapov wrote on 2012-12-20:
>> On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
>>> basically to benefit from apicv, we need clear MSR bitmap for
>>> corresponding x2apic MSRs:
>>> 0x800 - 0x8ff: no read intercept for apicv register virtualization
>>> TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery
>> We do not set "Virtualize x2APIC mode" bit in secondary execution
>> control. If I read the spec correctly without that those MSR read/writes
>> will go straight to physical local APIC.
> Right. Now it cannot get benefit, but we may enable it in future and then we 
> can
> benefit from it.
How about adding the following check:
if (apicv_enabled && virtual_x2apic_enabled)
	clear_msr();


>>> Signed-off-by: Yang Zhang 
>>> Signed-off-by: Kevin Tian 
>>> ---
>>>  arch/x86/kvm/vmx.c |   62
>>>  ++-- 1 files changed,
>>>  55 insertions(+), 7 deletions(-)
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index be66c3e..9b5e7a2 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -3773,7 +3773,10 @@ static void free_vpid(struct vcpu_vmx *vmx)
>>> spin_unlock(&vmx_vpid_lock);
>>>  }
>>> -static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr)
>>> +#define MSR_TYPE_R	1
>>> +#define MSR_TYPE_W	2
>>> +static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
>>> +						u32 msr, int type)
>>>  {
>>> 	int f = sizeof(unsigned long);
>>> @@ -3786,20 +3789,52 @@ static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr)
>>> 	 * We can control MSRs 0x00000000-0x00001fff and 0xc0000000-0xc0001fff.
>>> 	 */
>>> 	if (msr <= 0x1fff) {
>>> -   __clear_bit(msr, msr_bitmap + 0x000 / f); /* read-low */
>>> -   __clear_bit(msr, msr_bitmap + 0x800 / f); /* write-low */
>>> +   if (type & MSR_TYPE_R)
>>> +   /* read-low */
>>> +   __clear_bit(msr, msr_bitmap + 0x000 / f);
>>> +
>>> +   if (type & MSR_TYPE_W)
>>> +   /* write-low */
>>> +   __clear_bit(msr, msr_bitmap + 0x800 / f);
>>> +
>>> 	} else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff)) {
>>> msr &= 0x1fff;
>>> -   __clear_bit(msr, msr_bitmap + 0x400 / f); /* read-high */
>>> -   __clear_bit(msr, msr_bitmap + 0xc00 / f); /* write-high */
>>> +   if (type & MSR_TYPE_R)
>>> +   /* read-high */
>>> +   __clear_bit(msr, msr_bitmap + 0x400 / f);
>>> +
>>> +   if (type & MSR_TYPE_W)
>>> +   /* write-high */
>>> +   __clear_bit(msr, msr_bitmap + 0xc00 / f);
>>> +
>>> }
>>>  }
>>>  
>>>  static void vmx_disable_intercept_for_msr(u32 msr, bool longmode_only)
>>>  {
>>> if (!longmode_only)
>>> -   __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, msr);
>>> -   __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, msr);
>>> +		__vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy,
>>> +						msr, MSR_TYPE_R | MSR_TYPE_W);
>>> +	__vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode,
>>> +						msr, MSR_TYPE_R | MSR_TYPE_W);
>>> +}
>>> +
>>> +static void vmx_disable_intercept_for_msr_read(u32 msr, bool longmode_only)
>>> +{
>>> +	if (!longmode_only)
>>> +		__vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy,
>>> +						msr, MSR_TYPE_R);
>>> +	__vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode,
>>> +						msr, MSR_TYPE_R);
>>> +}
>>> +
>>> +static void vmx_disable_intercept_for_msr_write(u32 msr, bool longmode_only)
>>> +{
>>> +	if (!longmode_only)
>>> +		__vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy,
>>> +						msr, MSR_TYPE_W);
>>> +	__vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode,
>>> +						msr, MSR_TYPE_W);
>>>  }
>>>

RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-24:
> On Mon, Dec 24, 2012 at 02:35:35AM +0000, Zhang, Yang Z wrote:
>> Zhang, Yang Z wrote on 2012-12-24:
>>> Gleb Natapov wrote on 2012-12-20:
>>>> On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
>>>>> basically to benefit from apicv, we need clear MSR bitmap for
>>>>> corresponding x2apic MSRs:
>>>>> 0x800 - 0x8ff: no read intercept for apicv register virtualization
>>>>> TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery
>>>> We do not set "Virtualize x2APIC mode" bit in secondary execution
>>>> control. If I read the spec correctly without that those MSR read/writes
>>>> will go straight to physical local APIC.
>>> Right. Now it cannot get benefit, but we may enable it in future and
>>> then we can benefit from it.
> Without enabling it you cannot disable MSR intercept for x2apic MSRs.
> 
>> how about to add the following check:
>> if (apicv_enabled && virtual_x2apic_enabled)
>>  clear_msr();
>> 
> I do not understand what do you mean here.
This patch clears the MSR bitmap for 0x800 - 0x8ff whenever apicv is enabled.
As you said, since KVM does not set "virtualize x2APIC mode", APIC register
virtualization never takes effect for those MSRs. So we should clear the MSR
bitmap only when apicv is enabled and "virtualize x2APIC mode" is set.
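
A stand-alone model of the proposed guard is sketched below. It is only an illustration under stated assumptions: the bitmap is a plain 4 KiB byte array laid out like the VMX MSR bitmap for low MSRs (read bits at byte offset 0x000, write bits at 0x800), only the low MSR range is modelled, and every `model_*` name and both flag parameters are hypothetical rather than taken from KVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MSR_TYPE_R 1
#define MSR_TYPE_W 2
#define MSR_BITMAP_BYTES 4096

/* Clear the intercept bit(s) of a low MSR (0x0 - 0x1fff). */
static void model_disable_intercept(uint8_t *bitmap, uint32_t msr, int type)
{
	assert(msr <= 0x1fff);
	if (type & MSR_TYPE_R)  /* read-low region */
		bitmap[0x000 + msr / 8] &= (uint8_t)~(1u << (msr % 8));
	if (type & MSR_TYPE_W)  /* write-low region */
		bitmap[0x800 + msr / 8] &= (uint8_t)~(1u << (msr % 8));
}

/* True if an access of the given single type to a low MSR is intercepted. */
static bool model_intercepts(const uint8_t *bitmap, uint32_t msr, int type)
{
	uint32_t base = (type == MSR_TYPE_R) ? 0x000 : 0x800;

	assert(msr <= 0x1fff);
	return bitmap[base + msr / 8] & (1u << (msr % 8));
}

/* Start with everything intercepted; open x2APIC MSRs only if both are on. */
static void model_setup_x2apic_bitmap(uint8_t *bitmap, bool apicv_enabled,
				      bool virt_x2apic_enabled)
{
	uint32_t msr;

	memset(bitmap, 0xff, MSR_BITMAP_BYTES);
	if (!(apicv_enabled && virt_x2apic_enabled))
		return;

	for (msr = 0x800; msr <= 0x8ff; msr++)
		model_disable_intercept(bitmap, msr, MSR_TYPE_R);
	model_disable_intercept(bitmap, 0x808, MSR_TYPE_W);  /* TPR */
	model_disable_intercept(bitmap, 0x80b, MSR_TYPE_W);  /* EOI */
	model_disable_intercept(bitmap, 0x83f, MSR_TYPE_W);  /* SELF-IPI */
}
```

With apicv on but "virtualize x2APIC mode" off, every x2APIC MSR access stays intercepted in this model, which is the behaviour the check is meant to guarantee.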

>> 
>>>>> Signed-off-by: Yang Zhang 
>>>>> Signed-off-by: Kevin Tian 
>>>>> ---
>>>>>  arch/x86/kvm/vmx.c |   62
>>>>>  ++-- 1 files
>>>>>  changed, 55 insertions(+), 7 deletions(-)
>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>>> index be66c3e..9b5e7a2 100644
>>>>> --- a/arch/x86/kvm/vmx.c
>>>>> +++ b/arch/x86/kvm/vmx.c
>>>>> @@ -3773,7 +3773,10 @@ static void free_vpid(struct vcpu_vmx *vmx)
>>>>>   spin_unlock(&vmx_vpid_lock);
>>>>>  }
>>>>> -static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr)
>>>>> +#define MSR_TYPE_R	1
>>>>> +#define MSR_TYPE_W	2
>>>>> +static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
>>>>> +						u32 msr, int type)
>>>>>  {
>>>>>   int f = sizeof(unsigned long);
>>>>> @@ -3786,20 +3789,52 @@ static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, u32 msr)
>>>>>    * We can control MSRs 0x00000000-0x00001fff and 0xc0000000-0xc0001fff.
>>>>>    */
>>>>>   if (msr <= 0x1fff) {
>>>>> - __clear_bit(msr, msr_bitmap + 0x000 / f); /* read-low */
>>>>> - __clear_bit(msr, msr_bitmap + 0x800 / f); /* write-low */
>>>>> + if (type & MSR_TYPE_R)
>>>>> + /* read-low */
>>>>> + __clear_bit(msr, msr_bitmap + 0x000 / f);
>>>>> +
>>>>> + if (type & MSR_TYPE_W)
>>>>> + /* write-low */
>>>>> + __clear_bit(msr, msr_bitmap + 0x800 / f);
>>>>> +
>>>>>   } else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff)) {
>>>>>   msr &= 0x1fff;
>>>>> - __clear_bit(msr, msr_bitmap + 0x400 / f); /* read-high */
>>>>> - __clear_bit(msr, msr_bitmap + 0xc00 / f); /* write-high */
>>>>> + if (type & MSR_TYPE_R)
>>>>> + /* read-high */
>>>>> + __clear_bit(msr, msr_bitmap + 0x400 / f);
>>>>> +
>>>>> + if (type & MSR_TYPE_W)
>>>>> + /* write-high */
>>>>> + __clear_bit(msr, msr_bitmap + 0xc00 / f);
>>>>> +
>>>>>   }
>>>>>  }
>>>>>  
>>>>>  static void vmx_disable_intercept_for_msr(u32 msr, bool longmode_only)
>>>>>  {
>>>>>   if (!longmode_only)
>>>>> - __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy, msr);
>>>>> - __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode, msr);
>>>>> + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy,
>>>>> +   msr, MSR_TYPE_R | MSR_TYPE_W);
>>>>> + __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode,
>>>>> +
>>>>>

RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-25:
> On Mon, Dec 24, 2012 at 11:53:37PM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-24:
>>> On Mon, Dec 24, 2012 at 02:35:35AM +0000, Zhang, Yang Z wrote:
>>>> Zhang, Yang Z wrote on 2012-12-24:
>>>>> Gleb Natapov wrote on 2012-12-20:
>>>>>> On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
>>>>>>> basically to benefit from apicv, we need clear MSR bitmap for
>>>>>>> corresponding x2apic MSRs:
>>>>>>> 0x800 - 0x8ff: no read intercept for apicv register virtualization
>>>>>>> TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery
>>>>>> We do not set "Virtualize x2APIC mode" bit in secondary execution
>>>>>> control. If I read the spec correctly without that those MSR read/writes
>>>>>> will go straight to physical local APIC.
>>>>> Right. Now it cannot get benefit, but we may enable it in future and
>>>>> then we can benefit from it.
>>> Without enabling it you cannot disable MSR intercept for x2apic MSRs.
>>> 
>>>> how about to add the following check:
>>>> if (apicv_enabled && virtual_x2apic_enabled)
>>>>clear_msr();
>>>> 
>>> I do not understand what do you mean here.
>> In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled. As
>> you said, since kvm doesn't set "virtualize x2apic mode", APIC register
>> virtualization never take effect. So we need to clear MSR bitmap only when
>> apicv enabled and virtualize x2apic mode set.
>> 
> But currently it is never set.
So you think the third patch is not necessary for now, unless we enable
"virtualize x2apic mode"?

Best regards,
Yang




RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-25:
> On Tue, Dec 25, 2012 at 06:42:59AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-25:
>>> On Mon, Dec 24, 2012 at 11:53:37PM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2012-12-24:
>>>>> On Mon, Dec 24, 2012 at 02:35:35AM +0000, Zhang, Yang Z wrote:
>>>>>> Zhang, Yang Z wrote on 2012-12-24:
>>>>>>> Gleb Natapov wrote on 2012-12-20:
>>>>>>>> On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
>>>>>>>>> basically to benefit from apicv, we need clear MSR bitmap for
>>>>>>>>> corresponding x2apic MSRs:
>>>>>>>>> 0x800 - 0x8ff: no read intercept for apicv register virtualization
>>>>>>>>> TPR,EOI,SELF-IPI: no write intercept for virtual interrupt 
>>>>>>>>> delivery
>>>>>>>> We do not set "Virtualize x2APIC mode" bit in secondary execution
>>>>>>>> control. If I read the spec correctly without that those MSR 
>>>>>>>> read/writes
>>>>>>>> will go straight to physical local APIC.
>>>>>>> Right. Now it cannot get benefit, but we may enable it in future and
>>>>>>> then we can benefit from it.
>>>>> Without enabling it you cannot disable MSR intercept for x2apic MSRs.
>>>>> 
>>>>>> how about to add the following check:
>>>>>> if (apicv_enabled && virtual_x2apic_enabled)
>>>>>>  clear_msr();
>>>>>> 
>>>>> I do not understand what do you mean here.
>>>> In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled.
>>>> As you said, since kvm doesn't set "virtualize x2apic mode", APIC register
>>>> virtualization never take effect. So we need to clear MSR bitmap only
>>>> when apicv enabled and virtualize x2apic mode set.
>>>> 
>>> But currently it is never set.
>> So you think the third patch is not necessary currently? Unless we
>> enabled "virtualize x2apic mode".
>> 
> Without third patch vid will not work properly if a guest is in x2apic
> mode. Actually second and third patches need to be reordered to not have
> a windows where x2apic is broken. The problem is that this patch itself
> is buggy since it does not set "virtualize x2apic mode" flag. It should
> set the flag if vid is enabled and if the flag cannot be set vid should
> be forced off.
Under what conditions can this flag not be set? I think the only case is when
KVM doesn't expose the x2apic capability to the guest. If so, the guest will
never use x2apic, and we can still use vid.

Best regards,
Yang




RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-24 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-25:
> On Tue, Dec 25, 2012 at 07:25:15AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-25:
>>> On Tue, Dec 25, 2012 at 06:42:59AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2012-12-25:
>>>>> On Mon, Dec 24, 2012 at 11:53:37PM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2012-12-24:
>>>>>>> On Mon, Dec 24, 2012 at 02:35:35AM +0000, Zhang, Yang Z wrote:
>>>>>>>> Zhang, Yang Z wrote on 2012-12-24:
>>>>>>>>> Gleb Natapov wrote on 2012-12-20:
>>>>>>>>>> On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
>>>>>>>>>>> basically to benefit from apicv, we need clear MSR bitmap for
>>>>>>>>>>> corresponding x2apic MSRs:
>>>>>>>>>>> 0x800 - 0x8ff: no read intercept for apicv register virtualization
>>>>>>>>>>> TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery
>>>>>>>>>> We do not set "Virtualize x2APIC mode" bit in secondary
>>>>>>>>>> execution control. If I read the spec correctly without that
>>>>>>>>>> those MSR read/writes will go straight to physical local APIC.
>>>>>>>>> Right. Now it cannot get benefit, but we may enable it in future and
>>>>>>>>> then we can benefit from it.
>>>>>>> Without enabling it you cannot disable MSR intercept for x2apic MSRs.
>>>>>>> 
>>>>>>>> how about to add the following check:
>>>>>>>> if (apicv_enabled && virtual_x2apic_enabled)
>>>>>>>>clear_msr();
>>>>>>>> 
>>>>>>> I do not understand what do you mean here.
>>>>>> In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled.
>>>>>> As you said, since kvm doesn't set "virtualize x2apic mode", APIC register
>>>>>> virtualization never take effect. So we need to clear MSR bitmap only
>>>>>> when apicv enabled and virtualize x2apic mode set.
>>>>>> 
>>>>> But currently it is never set.
>>>> So you think the third patch is not necessary currently? Unless we
>>>> enabled "virtualize x2apic mode".
>>>> 
>>> Without third patch vid will not work properly if a guest is in x2apic
>>> mode. Actually second and third patches need to be reordered to not have
>>> a windows where x2apic is broken. The problem is that this patch itself
>>> is buggy since it does not set "virtualize x2apic mode" flag. It should
>>> set the flag if vid is enabled and if the flag cannot be set vid should
>>> be forced off.
>> In what conditions this flag cannot be set? I think the only case is that KVM
>> doesn't expose the x2apic capability to guest, if this is true, the guest
>> will never use x2apic and we still can use vid.
>> 
> We can indeed set "virtualize x2apic mode" unconditionally since it does
> not take any effect if x2apic MSRs are intercepted.
No. "Virtualize APIC accesses" must be cleared if "virtualize x2apic mode" is
set, so if the guest still uses xAPIC there will be lots of EPT violations
for APIC access emulation. This will hurt performance.
We should only set "virtualize x2apic mode" when the guest really uses x2apic
(the guest sets the x2APIC enable bit, bit 10, of APIC_BASE_MSR).
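
The rule above can be sketched as a small selection function. This is a
minimal userspace illustration in C, not the KVM implementation. It assumes
that x2APIC mode is signalled by the x2APIC enable bit (EXTD, bit 10) of the
APIC base MSR together with the global enable bit (bit 11), and that the two
secondary execution controls sit at bits 0 and 4. All macro and function
names here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the APIC base MSR. */
#define APIC_BASE_EXTD (1ull << 10)   /* x2APIC mode enable */
#define APIC_BASE_EN   (1ull << 11)   /* APIC global enable */

/* Assumed bit positions in the secondary processor-based controls. */
#define SEC_EXEC_VIRTUALIZE_APIC_ACCESSES 0x00000001u
#define SEC_EXEC_VIRTUALIZE_X2APIC_MODE   0x00000010u

static bool guest_in_x2apic_mode(uint64_t apic_base)
{
    return (apic_base & APIC_BASE_EN) && (apic_base & APIC_BASE_EXTD);
}

/*
 * Pick exactly one of the two mutually exclusive controls, based on the
 * mode the guest has actually programmed into its APIC base MSR.
 */
static uint32_t select_apic_virtualization(uint32_t sec_exec, uint64_t apic_base)
{
    sec_exec &= ~(SEC_EXEC_VIRTUALIZE_APIC_ACCESSES |
                  SEC_EXEC_VIRTUALIZE_X2APIC_MODE);

    if (guest_in_x2apic_mode(apic_base))
        sec_exec |= SEC_EXEC_VIRTUALIZE_X2APIC_MODE;
    else
        sec_exec |= SEC_EXEC_VIRTUALIZE_APIC_ACCESSES;

    return sec_exec;
}
```

In this sketch the function would be called each time the guest writes the
APIC base MSR, so an xAPIC guest keeps the APIC-access page and never pays
the EPT violation cost described above.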

Best regards,
Yang




RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-24 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2012-12-21:
> On Thu, Dec 20, 2012 at 03:12:32PM +0200, Gleb Natapov wrote:
>> On Thu, Dec 20, 2012 at 10:53:16AM -0200, Marcelo Tosatti wrote:
>>> On Thu, Dec 20, 2012 at 08:42:06AM +0200, Gleb Natapov wrote:
>>>> On Wed, Dec 19, 2012 at 10:59:36PM -0200, Marcelo Tosatti wrote:
>>>>> On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
>>>>>> From: Yang Zhang
>>>>>>
>>>>>> Virtual interrupt delivery avoids KVM to inject vAPIC interrupts
>>>>>> manually, which is fully taken care of by the hardware. This needs
>>>>>> some special awareness into existing interrupt injection path:
>>>>>>
>>>>>> - for pending interrupt, instead of direct injection, we may need
>>>>>>   update architecture specific indicators before resuming to guest.
>>>>>> - A pending interrupt, which is masked by ISR, should be also
>>>>>>   considered in above update action, since hardware will decide
>>>>>>   when to inject it at right time. Current has_interrupt and
>>>>>>   get_interrupt only returns a valid vector from injection p.o.v.
>>>>>> Signed-off-by: Kevin Tian
>>>>>> Signed-off-by: Yang Zhang
>>>>>
>>>>>
>>>>> Resuming previous discussion:
>>>>>
>>>>>>> How about to recaculate irr_pending according the VIRR on each
>>>>>>> vmexit?
>>>>>>>
>>>>>> No need really. Since HW can only clear VIRR the only situation that
>>>>>> may happen is that irr_pending will be true but VIRR is empty and
>>>>>> apic_find_highest_irr() will return correct result in this case.
>>>>>
>>>>> Self-IPI does cause VIRR to be set, see "29.1.5 Self-IPI
>>>>> Virtualization".
>>>>>
>>>> True. But as I said later in that discussion once irr_pending is set
>>>> to true it never becomes false, so the optimization is effectively
>>>> disable. We can set it to true doing apic initialization to make it
>>>> explicit.
>>> 
>>> Its just confusing, to have a variable which has different meanings
>>> in different configurations. I would rather have it explicit that
>>> its not used rather than check every time the i read the code.
>>> 
>>> if (apic_vid() == 0 && !apic->irr_pending)
>>> return -1;
>>> 
>> I'd prefer to avoid this additional if() especially as its sole purpose
>> is documentation.  We can add comment instead. Note that irr_pending
>> is just a hint anyway.  It can be true when no interrupt is pending in
>> irr. We can even rename it to irr_pending_hint or something.
> 
> Works for me (documentation).
> 
>>> Not sure if you can skip it, its probably necessary to calculate it
>>> before HW does so (say migration etc).
>> kvm_apic_has_interrupt() is not called during migration and
>> kvm_apic_post_state_restore() calls apic_update_ppr() explicitly.
>> I am not sure it is needed though since migrated value should be already
>> correct anyway.
> 
> Ok, best force isr_count to 1 if apic vintr enabled (and add a comment,
> please).

Sorry for the late reply. I will address these problems according to your
comments.
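
For reference, the irr_pending hint discussed in the quoted text can be
sketched as follows. This is a simplified userspace model in C under my own
assumptions, not the lapic.c code: struct lapic_sketch and the function names
are hypothetical, and the IRR is modelled as eight 32-bit words. It shows why
the hint may be stale-true (the search still returns -1) but must never be
stale-false once hardware can set request bits on its own, which is why it
would be forced to true at initialization when vid is enabled.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified local APIC state. */
struct lapic_sketch {
    uint32_t irr[8];     /* 256 interrupt request bits */
    bool irr_pending;    /* hint: "IRR may be non-empty" */
};

/* With vid enabled the hint is forced on, so the shortcut is never taken. */
static void lapic_sketch_init(struct lapic_sketch *apic, bool vid_enabled)
{
    memset(apic->irr, 0, sizeof(apic->irr));
    apic->irr_pending = vid_enabled;
}

/* Software path for raising a vector: sets the bit and the hint. */
static void lapic_sketch_set_irr(struct lapic_sketch *apic, int vec)
{
    apic->irr[vec / 32] |= 1u << (vec % 32);
    apic->irr_pending = true;
}

/* Full scan of the IRR, highest vector first; -1 when it is empty. */
static int search_irr(const struct lapic_sketch *apic)
{
    int word, bit;

    for (word = 7; word >= 0; word--) {
        if (!apic->irr[word])
            continue;
        for (bit = 31; bit >= 0; bit--)
            if (apic->irr[word] & (1u << bit))
                return word * 32 + bit;
    }
    return -1;
}

/* The hint only allows skipping the scan; it never changes the result of one. */
static int find_highest_irr(const struct lapic_sketch *apic)
{
    if (!apic->irr_pending)
        return -1;
    return search_irr(apic);
}
```

If the hint were left false while hardware set a bit behind software's back
(for example through self-IPI virtualization), find_highest_irr() would miss
the vector, so forcing the hint on is the safe choice in that configuration.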

Best regards,
Yang




RE: [PATCH v7 3/3] x86, apicv: add virtual x2apic support

2012-12-25 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-25:
> On Tue, Dec 25, 2012 at 07:46:53AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-25:
>>> On Tue, Dec 25, 2012 at 07:25:15AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2012-12-25:
>>>>> On Tue, Dec 25, 2012 at 06:42:59AM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2012-12-25:
>>>>>>> On Mon, Dec 24, 2012 at 11:53:37PM +0000, Zhang, Yang Z wrote:
>>>>>>>> Gleb Natapov wrote on 2012-12-24:
>>>>>>>>> On Mon, Dec 24, 2012 at 02:35:35AM +0000, Zhang, Yang Z wrote:
>>>>>>>>>> Zhang, Yang Z wrote on 2012-12-24:
>>>>>>>>>>> Gleb Natapov wrote on 2012-12-20:
>>>>>>>>>>>> On Mon, Dec 17, 2012 at 01:30:50PM +0800, Yang Zhang wrote:
>>>>>>>>>>>>> basically to benefit from apicv, we need clear MSR bitmap for
>>>>>>>>>>>>> corresponding x2apic MSRs:
>>>>>>>>>>>>> 0x800 - 0x8ff: no read intercept for apicv register virtualization
>>>>>>>>>>>>> TPR,EOI,SELF-IPI: no write intercept for virtual interrupt delivery
>>>>>>>>>>>> We do not set "Virtualize x2APIC mode" bit in secondary
>>>>>>>>>>>> execution control. If I read the spec correctly without that
>>>>>>>>>>>> those MSR read/writes will go straight to physical local APIC.
>>>>>>>>>>> Right. Now it cannot get benefit, but we may enable it in
>>>>>>>>>>> future and then we can benefit from it.
>>>>>>>>> Without enabling it you cannot disable MSR intercept for x2apic
>>>>>>>>> MSRs.
>>>>>>>>> 
>>>>>>>>>> how about to add the following check:
>>>>>>>>>> if (apicv_enabled && virtual_x2apic_enabled)
>>>>>>>>>>  clear_msr();
>>>>>>>>>> 
>>>>>>>>> I do not understand what do you mean here.
>>>>>>>> In this patch, it will clear MSR bitmap(0x800 -0x8ff) when apicv enabled.
>>>>>>>> As you said, since kvm doesn't set "virtualize x2apic mode", APIC register
>>>>>>>> virtualization never take effect. So we need to clear MSR bitmap only
>>>>>>>> when apicv enabled and virtualize x2apic mode set.
>>>>>>>> 
>>>>>>> But currently it is never set.
>>>>>> So you think the third patch is not necessary currently? Unless we
>>>>>> enabled "virtualize x2apic mode".
>>>>>> 
>>>>> Without third patch vid will not work properly if a guest is in x2apic
>>>>> mode. Actually second and third patches need to be reordered to not have
>>>>> a windows where x2apic is broken. The problem is that this patch itself
>>>>> is buggy since it does not set "virtualize x2apic mode" flag. It should
>>>>> set the flag if vid is enabled and if the flag cannot be set vid should
>>>>> be forced off.
>>>> In what conditions this flag cannot be set? I think the only case is that KVM
>>>> doesn't expose the x2apic capability to guest, if this is true, the
>>>> guest will never use x2apic and we still can use vid.
>>>> 
>>> We can indeed set "virtualize x2apic mode" unconditionally since it does
>>> not take any effect if x2apic MSRs are intercepted.
>> No. Since "Virtual APIC access" must be cleared if "virtualize x2apic mode"
>> is set, and if guest still use xAPIC, then there should be lots of ept
>> violations for apic access emulation. This will hurt performance.
> Stupid HW, why this pointless limitation? Can you point me where SDM says
> that?
Vol 3, 26.2.1.1

>> We should only set "virtualize x2apic mode" when guest really uses
>> x2apic(guest set bit 10 of APIC_BASE_MSR).
>> 
> Looks like SDM force us to.
> 
> --
>   Gleb.


Best regards,
Yang




RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-26 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-21:
> On Fri, Dec 21, 2012 at 09:39:20AM -0200, Marcelo Tosatti wrote:
>> On Fri, Dec 21, 2012 at 09:51:40AM +0200, Gleb Natapov wrote:
>>> On Thu, Dec 20, 2012 at 08:59:11PM -0200, Marcelo Tosatti wrote:
>>>> On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
>>>>> From: Yang Zhang
>>>>>
>>>>> Virtual interrupt delivery avoids KVM to inject vAPIC interrupts
>>>>> manually, which is fully taken care of by the hardware. This needs
>>>>> some special awareness into existing interrupt injection path:
>>>>>
>>>>> - for pending interrupt, instead of direct injection, we may need
>>>>>   update architecture specific indicators before resuming to guest.
>>>>> - A pending interrupt, which is masked by ISR, should be also
>>>>>   considered in above update action, since hardware will decide
>>>>>   when to inject it at right time. Current has_interrupt and
>>>>>   get_interrupt only returns a valid vector from injection p.o.v.
>>>>> Signed-off-by: Kevin Tian
>>>>> Signed-off-by: Yang Zhang
>>>>> ---
>>>>>  arch/ia64/kvm/lapic.h           |    6 ++
>>>>>  arch/x86/include/asm/kvm_host.h |    6 ++
>>>>>  arch/x86/include/asm/vmx.h      |   11 +++
>>>>>  arch/x86/kvm/irq.c              |   56 +-
>>>>>  arch/x86/kvm/lapic.c            |   65 ++---
>>>>>  arch/x86/kvm/lapic.h            |   28 ++-
>>>>>  arch/x86/kvm/svm.c              |   24 ++
>>>>>  arch/x86/kvm/vmx.c              |  154 ++-
>>>>>  arch/x86/kvm/x86.c              |   11 ++-
>>>>>  include/linux/kvm_host.h        |    2 +
>>>>>  virt/kvm/ioapic.c               |   36 +
>>>>>  virt/kvm/ioapic.h               |    1 +
>>>>>  virt/kvm/irq_comm.c             |   20 +
>>>>>  13 files changed, 379 insertions(+), 41 deletions(-)
>>>>> diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
>>>>> index c5f92a9..cb59eb4 100644
>>>>> --- a/arch/ia64/kvm/lapic.h
>>>>> +++ b/arch/ia64/kvm/lapic.h
>>>>> @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
>>>>>  #define kvm_apic_present(x) (true)
>>>>>  #define kvm_lapic_enabled(x) (true)
>>>>> +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
>>>>> + struct kvm_lapic_irq *irq)
>>>>> +{
>>>>> + /* IA64 has no apicv supporting, do nothing here */
>>>>> +}
>>>>> +
>>>>>  #endif
>>>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>>>> index c431b33..b63a144 100644
>>>>> --- a/arch/x86/include/asm/kvm_host.h
>>>>> +++ b/arch/x86/include/asm/kvm_host.h
>>>>> @@ -697,6 +697,11 @@ struct kvm_x86_ops {
>>>>>   void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
>>>>>   void (*enable_irq_window)(struct kvm_vcpu *vcpu);
>>>>>   void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
>>>>> + int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
>>>>> + void (*update_apic_irq)(struct kvm_vcpu *vcpu, int max_irr);
>>>>> + void (*update_eoi_exitmap)(struct kvm *kvm, struct kvm_lapic_irq *irq);
>>>>> + void (*reset_eoi_exitmap)(struct kvm_vcpu *vcpu);
>>>>> + void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
>>>>>   int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>>>>>   int (*get_tdp_level)(void);
>>>>>   u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
>>>>
>>>> EOI exit bitmap is problematic (its racy). Please do this:
>>>>
>>>> 1. Make a synchronous (1) KVM_REQ_EOIBITMAP request on IOAPIC
>>>>    register modifications which require EOI exit bitmap updates.
>>>> 2. On VM-entry, during KVM_REQ_EOIBITMAP processing, each checks
>>>>    IOAPIC map and adjusts its own EOI exit bitmap VMCS registers.
>>>>
>>>> 1) that waits until remote executing VCPUs have acknowledge the request,
>>>> using make_all_cpus_request (see virt/kvm/kvm_main.c), similarly to
>>>> remote TLB flushes.
>>>>
>>>> What is the problem now: there is no control over _when_ a VCPU
>>>> updates its EOI exit bitmap VMCS register from the (remotely updated)
>>>> master EOI exit bitmap. The VCPU can be processing a
>>>> KVM_REQ_EOIBITMAP relative to a precedence IOAPIC register write
>>>> while the current IOAPIC register write is updating the EOI exit
>>>> bitmap. There is no way to fix that without locking (which can be
>>>> avoided if the IOAPIC->EOI exit bitmap synchronization is vcpu local).
>>>>
>>> The race is benign. We have similar one for interrupt injection and
>>> the same race exists on a real HW. The two cases that can happen due
>>> to the race are:
>>> 
>>> 1. exitbitmap bit X is changed from 1 to 0
>>>   No problem. It is harmless to do an exit, on the next entry
>>>   exitbitmap will be fixed.
>>> 2. exitbitmap bit X is changed from 0 to 1
>>>   If vcpu serves X at the time this happens it was delivered as edge,
>>>   so no need to exit. The exitbitmap will be updated after the next
>>>   vmexit which will happen due to KVM_REQ_EOIBITMAP processing.
>> 
>> 1. Missed the ca

RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-26 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2012-12-21:
> On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
>> From: Yang Zhang 
>> 
>> @@ -3925,6 +3942,15 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
>>  vmx_secondary_exec_control(vmx));
>>  }
>> +if (enable_apicv_reg_vid) {
>> +vmcs_write64(EOI_EXIT_BITMAP0, 0);
>> +vmcs_write64(EOI_EXIT_BITMAP1, 0);
>> +vmcs_write64(EOI_EXIT_BITMAP2, 0);
>> +vmcs_write64(EOI_EXIT_BITMAP3, 0);
>> +
>> +vmcs_write16(GUEST_INTR_STATUS, 0);
>> +}
> 
> AFAICS SVI should be regenerated on migration. Consider:
> 
> 1. vintr delivery, sets SVI = vector = RVI.
> 2. clears RVI.
> 3. migration.
> 4. RVI properly set from VIRR on entry.
> 5. SVI = 0.
> 6. EOI -> EOI virtualization with SVI = 0.
> 
> Could hook into kvm_apic_post_state_restore() to do that (set highest
> index of bit set in VISR).
OK. How about making a request (KVM_REQ_UPDATE_SVI) and handling it on vmentry
to set SVI to the highest index of the bits set in VISR?
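
A rough sketch of what handling such a request could compute, written as
standalone C rather than KVM code. It assumes the 16-bit guest interrupt
status keeps RVI in the low byte and SVI in the high byte, and models VISR as
eight 32-bit words. The function names are hypothetical.

```c
#include <stdint.h>

/* Highest vector whose bit is set in a 256-bit register, or -1 if none. */
static int find_highest_vector(const uint32_t reg[8])
{
    int word, bit;

    for (word = 7; word >= 0; word--) {
        if (!reg[word])
            continue;
        for (bit = 31; bit >= 0; bit--)
            if (reg[word] & (1u << bit))
                return word * 32 + bit;
    }
    return -1;
}

/*
 * Rebuild SVI (high byte) from the virtual ISR after state restore and
 * keep RVI (low byte) as it is. An empty VISR yields SVI = 0.
 */
static uint16_t regenerate_svi(uint16_t guest_intr_status, const uint32_t visr[8])
{
    int vec = find_highest_vector(visr);
    unsigned int svi = vec < 0 ? 0u : (unsigned int)vec;

    return (uint16_t)((guest_intr_status & 0x00ffu) | (svi << 8));
}
```

With in-service vectors 0x31 and 0x51 and a status of 0x0041, the result
would be 0x5141, so a later virtualized EOI clears the right in-service bit
instead of acting on SVI = 0.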

Best regards,
Yang




RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

2012-12-26 Thread Zhang, Yang Z
Gleb Natapov wrote on 2012-12-27:
> On Thu, Dec 27, 2012 at 02:24:04AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-21:
>>> On Fri, Dec 21, 2012 at 09:39:20AM -0200, Marcelo Tosatti wrote:
>>>> On Fri, Dec 21, 2012 at 09:51:40AM +0200, Gleb Natapov wrote:
>>>>> On Thu, Dec 20, 2012 at 08:59:11PM -0200, Marcelo Tosatti wrote:
>>>>>> On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
>>>>>>> From: Yang Zhang 
>>>>>>> 
>>>>>>> Virtual interrupt delivery avoids KVM to inject vAPIC interrupts
>>>>>>> manually, which is fully taken care of by the hardware. This needs
>>>>>>> some special awareness into existing interrupt injection path:
>>>>>>>
>>>>>>> - for pending interrupt, instead of direct injection, we may need
>>>>>>>   update architecture specific indicators before resuming to guest.
>>>>>>> - A pending interrupt, which is masked by ISR, should be also
>>>>>>>   considered in above update action, since hardware will decide
>>>>>>>   when to inject it at right time. Current has_interrupt and
>>>>>>>   get_interrupt only returns a valid vector from injection p.o.v.
>>>>>>> Signed-off-by: Kevin Tian
>>>>>>> Signed-off-by: Yang Zhang
>>>>>>> ---
>>>>>>>  arch/ia64/kvm/lapic.h           |    6 ++
>>>>>>>  arch/x86/include/asm/kvm_host.h |    6 ++
>>>>>>>  arch/x86/include/asm/vmx.h      |   11 +++
>>>>>>>  arch/x86/kvm/irq.c              |   56 +-
>>>>>>>  arch/x86/kvm/lapic.c            |   65 ++---
>>>>>>>  arch/x86/kvm/lapic.h            |   28 ++-
>>>>>>>  arch/x86/kvm/svm.c              |   24 ++
>>>>>>>  arch/x86/kvm/vmx.c              |  154 ++-
>>>>>>>  arch/x86/kvm/x86.c              |   11 ++-
>>>>>>>  include/linux/kvm_host.h        |    2 +
>>>>>>>  virt/kvm/ioapic.c               |   36 +
>>>>>>>  virt/kvm/ioapic.h               |    1 +
>>>>>>>  virt/kvm/irq_comm.c             |   20 +
>>>>>>>  13 files changed, 379 insertions(+), 41 deletions(-)
>>>>>>> diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
>>>>>>> index c5f92a9..cb59eb4 100644
>>>>>>> --- a/arch/ia64/kvm/lapic.h
>>>>>>> +++ b/arch/ia64/kvm/lapic.h
>>>>>>> @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
>>>>>>>  #define kvm_apic_present(x) (true)
>>>>>>>  #define kvm_lapic_enabled(x) (true)
>>>>>>> +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
>>>>>>> +   struct kvm_lapic_irq *irq)
>>>>>>> +{
>>>>>>> +   /* IA64 has no apicv supporting, do nothing here */
>>>>>>> +}
>>>>>>> +
>>>>>>>  #endif
>>>>>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>>>>>> index c431b33..b63a144 100644
>>>>>>> --- a/arch/x86/include/asm/kvm_host.h
>>>>>>> +++ b/arch/x86/include/asm/kvm_host.h
>>>>>>> @@ -697,6 +697,11 @@ struct kvm_x86_ops {
>>>>>>>     void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
>>>>>>>     void (*enable_irq_window)(struct kvm_vcpu *vcpu);
>>>>>>>     void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
>>>>>>> +   int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
>>>>>>> +   void (*update_apic_irq)(struct kvm_vcpu *vcpu, int max_irr);
>>>>>>> +   void (*update_eoi_exitmap)(struct kvm *kvm, struct kvm_lapic_irq *irq);
>>>>>>> +   void (*reset_eoi_exitmap)(struct kvm_vcpu *vcpu);
>>>>>>> +   void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
>>>>>>>     int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>>>>>>>     int (*get_tdp_level)(void);
>>>>>>>     u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bo
