2014-11-16 23:49+0200, Nadav Amit:
> apic_find_highest_irr assumes irr_pending is set if any vector in APIC_IRR is
> set.  If this assumption is broken and apicv is disabled, the injection of
> interrupts may be deferred until another interrupt is delivered to the guest.
> Ultimately, if no other interrupt should be injected to that vCPU, the pending
> interrupt may be lost.
> 
> commit 56cc2406d68c ("KVM: nVMX: fix "acknowledge interrupt on exit" when APICv
> is in use") changed the behavior of apic_clear_irr so irr_pending is cleared
> after setting APIC_IRR vector. After this commit, if apic_set_irr and
> apic_clear_irr run simultaneously, a race may occur, resulting in APIC_IRR
> vector set, and irr_pending cleared. In the following example, assume a single
> vector is set in IRR prior to calling apic_clear_irr:
> 
> apic_set_irr                          apic_clear_irr
> ------------                          --------------
> apic->irr_pending = true;
>                                       apic_clear_vector(...);
>                                       vec = apic_search_irr(apic);
>                                       // => vec == -1
> apic_set_vector(...);
>                                       apic->irr_pending = (vec != -1);
>                                       // => apic->irr_pending == false
> 
> Nonetheless, it appears the race might even occur prior to this commit:
> 
> apic_set_irr                          apic_clear_irr
> ------------                          --------------
> apic->irr_pending = true;
>                                       apic->irr_pending = false;
>                                       apic_clear_vector(...);
>                                       if (apic_search_irr(apic) != -1)
>                                               apic->irr_pending = true;
>                                       // => apic->irr_pending == false
> apic_set_vector(...);
> 
> Fixing this issue by:
> 1. Restoring the previous behavior of apic_clear_irr: clear irr_pending, call
>    apic_clear_vector, and then if APIC_IRR is non-zero, set irr_pending.
> 2. On apic_set_irr: first call apic_set_vector, then set irr_pending.
> 
> Signed-off-by: Nadav Amit <na...@cs.technion.ac.il>
> ---
>  arch/x86/kvm/lapic.c | 18 ++++++++++++------
>  1 file changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 6e8ce5a..e0e5642 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -341,8 +341,12 @@ EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
>  
>  static inline void apic_set_irr(int vec, struct kvm_lapic *apic)
>  {
> -     apic->irr_pending = true;
>       apic_set_vector(vec, apic->regs + APIC_IRR);
> +     /*
> +      * irr_pending must be true if any interrupt is pending; set it after
> +      * APIC_IRR to avoid race with apic_clear_irr
> +      */
> +     apic->irr_pending = true;

(A race that ends up with 'irr_pending = true' and zero IRR is
 harmless.)

>  }
>  
>  static inline int apic_search_irr(struct kvm_lapic *apic)
> @@ -374,13 +378,15 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
>  
>       vcpu = apic->vcpu;
>  
> -     apic_clear_vector(vec, apic->regs + APIC_IRR);
> -     if (unlikely(kvm_apic_vid_enabled(vcpu->kvm)))
> +     if (unlikely(kvm_apic_vid_enabled(vcpu->kvm))) {
>               /* try to update RVI */
> +             apic_clear_vector(vec, apic->regs + APIC_IRR);
>               kvm_make_request(KVM_REQ_EVENT, vcpu);
> -     else {
> -             vec = apic_search_irr(apic);
> -             apic->irr_pending = (vec != -1);
> +     } else {
> +             apic->irr_pending = false;
> +             apic_clear_vector(vec, apic->regs + APIC_IRR);
> +             if (apic_search_irr(apic) != -1)
> +                     apic->irr_pending = true;
>       }

Works because apic_clear_vector() is also a compiler barrier ...

Reviewed-by: Radim Krčmář <rkrc...@redhat.com>

(I hope the performance gain of irr_pending is worth its complexity.)
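
For anyone who wants to check the orderings from the commit message
without staring at the tables, here is a rough user-space sketch. It is
not kernel code: struct toy_apic, the toy_* helpers and the vector
numbers 3 and 5 are made up for illustration. It treats every step as
atomic and sequentially ordered, so it says nothing about compiler or
CPU reordering. It simply replays every interleaving of the two step
sequences, starting from one vector set in IRR, and counts how many end
with a bit set in IRR while irr_pending is false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical step names; each mirrors one line of the tables above. */
enum toy_op {
	SET_FLAG_TRUE,     /* apic_set_irr:   irr_pending = true          */
	SET_VECTOR,        /* apic_set_irr:   apic_set_vector(...)        */
	CLR_FLAG_FALSE,    /* apic_clear_irr: irr_pending = false         */
	CLR_VECTOR,        /* apic_clear_irr: apic_clear_vector(...)      */
	CLR_SEARCH,        /* apic_clear_irr: found = (search != -1)      */
	CLR_FLAG_ASSIGN,   /* apic_clear_irr: irr_pending = found         */
	CLR_FLAG_IF_FOUND, /* apic_clear_irr: if (found) irr_pending = true */
};

/* Toy stand-in for the relevant state; not struct kvm_lapic. */
struct toy_apic {
	uint32_t irr;      /* one bit per vector, 32 vectors only */
	bool irr_pending;
	bool found;        /* local result of the search on the clear side */
};

static void toy_step(struct toy_apic *a, enum toy_op op)
{
	switch (op) {
	case SET_FLAG_TRUE:     a->irr_pending = true;             break;
	case SET_VECTOR:        a->irr |= 1u << 5;                 break;
	case CLR_FLAG_FALSE:    a->irr_pending = false;            break;
	case CLR_VECTOR:        a->irr &= ~(1u << 3);              break;
	case CLR_SEARCH:        a->found = (a->irr != 0);          break;
	case CLR_FLAG_ASSIGN:   a->irr_pending = a->found;         break;
	case CLR_FLAG_IF_FOUND: if (a->found) a->irr_pending = true; break;
	}
}

/* Count interleavings that end with IRR non-zero and irr_pending false. */
static int toy_count_lost(const enum toy_op *set, int ns,
			  const enum toy_op *clr, int nc)
{
	int total = ns + nc, lost = 0;

	for (unsigned mask = 0; mask < (1u << total); mask++) {
		int bits = 0, si = 0, ci = 0;
		/* Start with vector 3 pending, as in the commit message. */
		struct toy_apic a = { .irr = 1u << 3, .irr_pending = true };

		for (int i = 0; i < total; i++)
			bits += (mask >> i) & 1;
		if (bits != ns)
			continue; /* mask must pick exactly ns slots for the set side */

		for (int i = 0; i < total; i++)
			toy_step(&a, ((mask >> i) & 1) ? set[si++] : clr[ci++]);
		assert(si == ns && ci == nc);

		if (a.irr != 0 && !a.irr_pending)
			lost++;
	}
	return lost;
}

/* Ordering after 56cc2406d68c, before this patch (first table). */
static int toy_lost_before_patch(void)
{
	const enum toy_op set[] = { SET_FLAG_TRUE, SET_VECTOR };
	const enum toy_op clr[] = { CLR_VECTOR, CLR_SEARCH, CLR_FLAG_ASSIGN };
	return toy_count_lost(set, 2, clr, 3);
}

/* Only apic_clear_irr restored, apic_set_irr unchanged (second table). */
static int toy_lost_clear_restored_only(void)
{
	const enum toy_op set[] = { SET_FLAG_TRUE, SET_VECTOR };
	const enum toy_op clr[] = { CLR_FLAG_FALSE, CLR_VECTOR, CLR_SEARCH,
				    CLR_FLAG_IF_FOUND };
	return toy_count_lost(set, 2, clr, 4);
}

/* Both changes from this patch applied. */
static int toy_lost_after_patch(void)
{
	const enum toy_op set[] = { SET_VECTOR, SET_FLAG_TRUE };
	const enum toy_op clr[] = { CLR_FLAG_FALSE, CLR_VECTOR, CLR_SEARCH,
				    CLR_FLAG_IF_FOUND };
	return toy_count_lost(set, 2, clr, 4);
}
```

Calling the three toy_lost_* functions from a small main() should report
lost interleavings for the first two orderings and none for the patched
one, which matches the reasoning in the commit message under the
simplifying assumptions stated above.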