Re: [kvm-devel] [PATCH 0/9] in-kernel APIC v9 (kernel side)

Dong, Eddie Tue, 12 Jun 2007 16:51:02 -0700

Gregory Haskins wrote:
> Hi Eddie,
>  Back from vacation. Just catching up on email now....
> 
> On Thu, 2007-06-07 at 16:20 +0800, Dong, Eddie wrote:
>> 
>> Greg:
>>      Here are some detail comments towarding the LAPIC device model.
1:
>>      irqabstraction layer vcpu->irq.pending holds the abstract
processor
>> interrupt request (called localint, extint, nmi etc in V09).  But
>> API kvm_vcpu_intr only set the interrupt request, no clear.
> 
> By design
> 
>>   So far I only noticed when an
>> interrupt request is pop or injected,  vcpu->irq.pending bit may be
>> cleared. But there is case when guest raise TPR bar, an localint
>> could be cleared.
> 
> Yep, by design.  And it will be set again when the TPR bar is lowered.


Mmm, since the vcpu->irq.pending is not cleared, now KVM will think
there 
is an irq to inject before the TPR is lowered.


> 
>>  Same for extint when PIC IMR is masked.
> 
> Yeah, we have a hole here.  If IMR is changed to mask a pending
> vector, it is conceivable that it could remain pending in the kernel.
> Note that
> this hole was always in KVM, even before my patch.  Because of

No, in original KVM, this kind of hole doesn't exist since all the PIC 
logic are in Qemu. Each time if Qemu want to do KVM_INTERRUPT,
Qemu will check the eflag.if, interrupt window etc. I.e. each
KVM_INTERRUPT
injected IRQ will be injected to guest immediately.

> this I am
> of the opinion that it doesn't need to be fixed as part of the LAPIC
> work per se before merging.  However, that being said, we can lay the
> groundwork to support this now by adding an "int level" to the
> structure that is passed in the KVM_ISA_INTERRUPT ioctl.  Going
> forward, someone can patch the QEMU::i8259 to send clear events when
> IMR changes. 

No matter a new API or extend previous API, anyway the kernel interrupt
IRR should
be able to be set and cleared  by Qemu :-)

> 
>> 
>>      In Xen, since there is no notion of abstract processor interrupt
>> request, VMM check interrupt directly from PIC and APIC, so Xen
>> doesn't have problem. I guess Windows will fail here.
> 
> No, this is incorrect (and Windows boots fine even w/ACPI
> enabled, BTW).
> For the LAPIC, we do check the LAPIC model directly (and therefore
> consider TPR, etc).  The abstract interrupt mechanism simply tells the
> vcpu that it needs to check the LAPIC.  It is not authoritative in
> deciding if something actually gets injected.
> 
> For the PIC, we don't check directly (the PICs vectors are
> cached in the
> kernel), but again note that this is how KVM has always
> worked.  I think
> fixing the KVM_ISA_INTERRUPT::level mechanism + adding
> IMR-clear support
> shores up any issue there, however.
> 
>> 
>>      2: APIC timer
>>      a: V09 uses hrtimer for LAPIC timer, apic->timer.last_update is
>> updated every time when __apic_timer_fn is invoked at time of the
>> APIC timer fired. This impose an accumulated difference since the
>> fire time is already some ns later after expected time.
>>      Xen solve this issue by increase apic->timer.last_update with
>> the PERIOD, i.e. APIC_BUS_CYCLE_NS * apic->timer.divide_count *
>>      APIC_TMICT. b: Seems current approach starts hrtimer whenever
>> APIC_TMICT is updated. Should we check APIC_LVT to see if it is
>> masked here? (instead of doing in its callback
>> function:__apic_timer_fn). Also why APIC_TMCCT is updated here? I
>>      think TMCCT is reloaded only when it reaches 0 and LVTT works in
>> periodic mode. c: I didn't see LVTT mask status refelect the hrtimer 
>> cancel/start, do I miss something?
> 
> I inherited most of lapic.c from Dor, and I believe he
> inherited most of
> it from an older version of Xen.  While I have come to understand much
> of the inner workings of the LAPIC during the course of developing
> this patch, the timer is still a relative enigma to me.  Therefore, I
> do not have any comment as to the reasons why something was done here
> the way it was, nor to the validity of the problems you are
> highlighting. Perhaps Dor will know. 
> 
> But that being said, patches against v09 to fix problems you see are
> always welcome. 
> 
> 
>> 
>>      3:  Assume a senario there is an valid IDT_VECTORING_INFO_FIELD,
>> following code(after patch) in handle_exception push back the failed
>> interrupt vector, i.e. vcpu->irq.deferred.
>> 
>> ........
>>         if (is_external_interrupt(vect_info)) {
>>                 /*
>>                  * An exception was taken while we were trying to
>> inject an 
>>                  * IRQ.  We must defer the injection of the vector
>> until 
>>                  * the next window.
>>                  */
>>                 int irq = vect_info & VECTORING_INFO_VECTOR_MASK;
>>                 kvm_vcpu_irq_push(vcpu, irq);
>>         }
>> 
>> 
>>      a:  Now, the abstracted processor interrupt, i.e.
>> vcpu->irq.pending, could be 0, __kvm_vcpu_irq_pending invoked at
>> beginning of do_interrupt_requests will set kvm_irqpin_localint in
>> the abstracted processor interrupt (vcpu->irq.pending). But if at
>> same time, we get an external IRQ, i.e. vcpu->irq.pending is set
>>      with both localint & extint. From following code ,
>> kvm_irqpin_extint has higher priority than kvm_irqpin_localint.
> 
> Yes, I understand this scenario.  It is the same problem I was
> trying to
> describe earlier when I said extint/nmi can inadvertently get
> prioritized over deferred.  I will fix this.
> 
>> 
>>         while (pending) {
>>                 kvm_irqpin_t pin = __fls(pending);
>> 
>>                 switch (pin) {
>>                 case kvm_irqpin_localint:
>>                 case kvm_irqpin_extint:
>>                 case kvm_irqpin_nmi:
>>                         do_intr_requests(vcpu, kvm_run, pin);
>>                         break;
>>              ..............
>> 
>>      Now in do_intr_requests, we get an extirq vector instead of
>> vcpu->irq.deferred from following code since pin=kvm_irqpin_extint.
>> That means we inject an new irq instead of original failed irq in
>> IDT_VECTORING_INFO_FIELD. 
>> 
>>              ........
>>                 switch (pin) {
>>                 case kvm_irqpin_localint:
>>                         r = kvm_vcpu_irq_pop(vcpu, &ack);
>>                         break;
>>                 case kvm_irqpin_extint:
>>                         r = kvm_irqdevice_ack(&vcpu->kvm->isa_irq,
>>                         0, &ack); if (!(ack.flags &
>> KVM_IRQACKDATA_VECTOR_PENDING))
>>                                 __clear_bit(pin, &vcpu->irq.pending);
>>                         break;
>>                 case kvm_irqpin_nmi:
>> 
>> 
>>      Anyway due to SMP & in-kernel APIC, I'd like to suggest we move
>> IDT_VECTORING_INFO_FIELD to do_interrupt_requests like
>> vmx_intr_assist in Xen where physical IRQ is disabled.
> 
> I haven't looked at the new Xen code, but I will try to take a peek.
> Its probably moot since I am confident that the fix I am suggesting
> allows the deferred mechanism to work the way I intended, but I will
> keep an open mind to alternative solutions.

The Xen side doesn't support NMI yet though the patch for NMI is in hand
now :-(
Hopefully it will be out this week :-)

> 
>> 
>> 
>> 
>> thx, eddie

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
kvm-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/kvm-devel

Re: [kvm-devel] [PATCH 0/9] in-kernel APIC v9 (kernel side)

Reply via email to