Re: [question] Why newer QEMU may lose irq when doing migration?

2014-12-22 Thread Wincy Van
2014-12-19 17:50 GMT+08:00 Paolo Bonzini :
>
>
> On 19/12/2014 04:58, Wincy Van wrote:
>> 2014-12-17 18:46 GMT+08:00 Paolo Bonzini :
>>>
>>>
>>> On 17/12/2014 04:46, Wincy Van wrote:
 Hi, all:

 The patchset (https://lkml.org/lkml/2014/3/18/309) fixed migration of
 Windows guests, but commit 0bc830b05c667218d703f2026ec866c49df974fc
 (KVM: ioapic: clear IRR for edge-triggered interrupts at delivery)
 introduced a bug (see
 https://www.mail-archive.com/kvm@vger.kernel.org/msg109813.html).

 From the description "Unlike the old qemu-kvm, which really never did
 that, with new QEMU it is for some reason
 somewhat likely to migrate a VM with a nonzero IRR in the ioapic."

 Why could new QEMU do that? I can not find any codes about the "some 
 reason"..
 As we know, once a irq is set in kvm's ioapic, the ioapic will send
 that irq to lapic, this is an "atomic" operation.
>>>
>>> It can happen if the IRQ is masked in the IOAPIC, for example.  Until
>>> commit 0bc830b, KVM could not distinguish two cases:
>>>
>>> 1) an edge-triggered interrupt that was raised while the IOAPIC had it
>>> masked
>>>
>>> 2) an edge-triggered interrupt that was raised and delivered, but for
>>> which userspace left the level to 1.
>>
>> It seems that QEMU's rtc behavior is case 2. But before this patchset, a rtc
>> interrupt may be lost when doing migration, and guest will not acknowledge 
>> it,
>> then the newer rtc interrupts are ignored forever. I think this is
>> none of the cases above, because the interrupt was lost. It must be
>> something wrong here.
>
> There is a third case actually.
>
> If the source kernel is an old one before commit 2c2bf0113697 (KVM: Use
> eoi to track RTC interrupt delivery status, 2013-04-11), ioapic->irr can
> also be set if the RTC interrupt was coalesced (for example because the
> PPR was too high to deliver it).
>

Yeah, understood, thanks.

> Instead, commit 2c2bf0113697 will not set ioapic->irr in this case.
> Yang, was this intentional?
>
> The question, however, is then why my patch set worked (fixed migration)
> even without moving
>
> ioapic->irr |= mask;
>
> above this:
>
> if (irq == RTC_GSI && line_status &&
> rtc_irq_check_coalesced(ioapic)) {
> ret = 0; /* coalesced */
> goto out;
> }
>

Paolo, how about this patch?

It uses a new field named irr_delivered to record the status of
edge-triggered interrupt,
and clears the delivered irr in kvm_get_ioapic. So it have the same
effect of commit
0bc830b while avoid the bug of Windows guests.

diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index b1947e0..eda7905 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -206,6 +206,8 @@ static int ioapic_set_irq(struct kvm_ioapic
*ioapic, unsigned int irq,

old_irr = ioapic->irr;
ioapic->irr |= mask;
+  if (edge)
+  ioapic->irr_delivered &= ~mask;
if ((edge && old_irr == ioapic->irr) ||
(!edge && entry.fields.remote_irr)) {
ret = 0;
@@ -349,7 +351,7 @@ static int ioapic_service(struct kvm_ioapic
*ioapic, int irq, bool line_status)
irqe.shorthand = 0;

if (irqe.trig_mode == IOAPIC_EDGE_TRIG)
-   ioapic->irr &= ~(1 << irq);
+  ioapic->irr_delivered |= 1 << irq;

if (irq == RTC_GSI && line_status) {
/*
@@ -597,6 +599,7 @@ static void kvm_ioapic_reset(struct kvm_ioapic *ioapic)
ioapic->base_address = IOAPIC_DEFAULT_BASE_ADDRESS;
ioapic->ioregsel = 0;
ioapic->irr = 0;
+  ioapic->irr_delivered = 0;
ioapic->id = 0;
memset(ioapic->irq_eoi, 0x00, IOAPIC_NUM_PINS);
rtc_irq_eoi_tracking_reset(ioapic);
@@ -654,6 +657,7 @@ int kvm_get_ioapic(struct kvm *kvm, struct
kvm_ioapic_state *state)

spin_lock(&ioapic->lock);
memcpy(state, ioapic, sizeof(struct kvm_ioapic_state));
+  state->irr &= ~ioapic->irr_delivered;
spin_unlock(&ioapic->lock);
return 0;
 }
diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
index 3c91955..a5cdfc0 100644
--- a/arch/x86/kvm/ioapic.h
+++ b/arch/x86/kvm/ioapic.h
@@ -77,6 +77,7 @@ struct kvm_ioapic {
struct rtc_status rtc_status;
struct delayed_work eoi_inject;
u32 irq_eoi[IOAPIC_NUM_PINS];
+  u32 irr_delivered;
 };

 #ifdef DEBUG


Thanks,

Wincy
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [question] Why newer QEMU may lose irq when doing migration?

2014-12-19 Thread Paolo Bonzini


On 19/12/2014 04:58, Wincy Van wrote:
> 2014-12-17 18:46 GMT+08:00 Paolo Bonzini :
>>
>>
>> On 17/12/2014 04:46, Wincy Van wrote:
>>> Hi, all:
>>>
>>> The patchset (https://lkml.org/lkml/2014/3/18/309) fixed migration of
>>> Windows guests, but commit 0bc830b05c667218d703f2026ec866c49df974fc
>>> (KVM: ioapic: clear IRR for edge-triggered interrupts at delivery)
>>> introduced a bug (see
>>> https://www.mail-archive.com/kvm@vger.kernel.org/msg109813.html).
>>>
>>> From the description "Unlike the old qemu-kvm, which really never did
>>> that, with new QEMU it is for some reason
>>> somewhat likely to migrate a VM with a nonzero IRR in the ioapic."
>>>
>>> Why could new QEMU do that? I can not find any codes about the "some 
>>> reason"..
>>> As we know, once a irq is set in kvm's ioapic, the ioapic will send
>>> that irq to lapic, this is an "atomic" operation.
>>
>> It can happen if the IRQ is masked in the IOAPIC, for example.  Until
>> commit 0bc830b, KVM could not distinguish two cases:
>>
>> 1) an edge-triggered interrupt that was raised while the IOAPIC had it
>> masked
>>
>> 2) an edge-triggered interrupt that was raised and delivered, but for
>> which userspace left the level to 1.
> 
> It seems that QEMU's rtc behavior is case 2. But before this patchset, a rtc
> interrupt may be lost when doing migration, and guest will not acknowledge it,
> then the newer rtc interrupts are ignored forever. I think this is
> none of the cases above, because the interrupt was lost. It must be
> something wrong here.

There is a third case actually.

If the source kernel is an old one before commit 2c2bf0113697 (KVM: Use
eoi to track RTC interrupt delivery status, 2013-04-11), ioapic->irr can
also be set if the RTC interrupt was coalesced (for example because the
PPR was too high to deliver it).

Instead, commit 2c2bf0113697 will not set ioapic->irr in this case.
Yang, was this intentional?

The question, however, is then why my patch set worked (fixed migration)
even without moving

ioapic->irr |= mask;

above this:

if (irq == RTC_GSI && line_status &&
rtc_irq_check_coalesced(ioapic)) {
ret = 0; /* coalesced */
goto out;
}

>> No, QEMU does not save the pending IRQ.  IRQs are stateless in QEMU.
>> The assumption is that after a qemu_set_irq the IRQ will be
>> delivered---possibly on the other side of the migration, but it will be
>> delivered.
> 
> I find that in kvm_arch_vcpu_ioctl_get_sregs, KVM will save pending IRQs
> into sregs->interrupt_bitmap and QEMU will save it.
> Isn't it right?

This is a pending interrupt that has been queued by the kernel but not
delivered yet.  It can happen if the interrupt controller is implemented
in userspace.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [question] Why newer QEMU may lose irq when doing migration?

2014-12-18 Thread Wincy Van
2014-12-17 18:46 GMT+08:00 Paolo Bonzini :
>
>
> On 17/12/2014 04:46, Wincy Van wrote:
>> Hi, all:
>>
>> The patchset (https://lkml.org/lkml/2014/3/18/309) fixed migration of
>> Windows guests, but commit 0bc830b05c667218d703f2026ec866c49df974fc
>> (KVM: ioapic: clear IRR for edge-triggered interrupts at delivery)
>> introduced a bug (see
>> https://www.mail-archive.com/kvm@vger.kernel.org/msg109813.html).
>>
>> From the description "Unlike the old qemu-kvm, which really never did
>> that, with new QEMU it is for some reason
>> somewhat likely to migrate a VM with a nonzero IRR in the ioapic."
>>
>> Why could new QEMU do that? I can not find any codes about the "some 
>> reason"..
>> As we know, once a irq is set in kvm's ioapic, the ioapic will send
>> that irq to lapic, this is an "atomic" operation.
>
> It can happen if the IRQ is masked in the IOAPIC, for example.  Until
> commit 0bc830b, KVM could not distinguish two cases:
>
> 1) an edge-triggered interrupt that was raised while the IOAPIC had it
> masked
>
> 2) an edge-triggered interrupt that was raised and delivered, but for
> which userspace left the level to 1.
>

Thank you Paolo.

It seems that QEMU's rtc behavior is case 2. But before this patchset, a rtc
interrupt may be lost when doing migration, and guest will not acknowledge it,
then the newer rtc interrupts are ignored forever. I think this is
none of the cases
above, because the interrupt was lost. It must be something wrong here.

>> Then, kvm will inject them in inject_pending_event(or set rvi in
>> apic-v case). QEMU will also save the pending irq when doing
>> migration.
>
> No, QEMU does not save the pending IRQ.  IRQs are stateless in QEMU.
> The assumption is that after a qemu_set_irq the IRQ will be
> delivered---possibly on the other side of the migration, but it will be
> delivered.
>

I find that in kvm_arch_vcpu_ioctl_get_sregs, KVM will save pending IRQs
into sregs->interrupt_bitmap and QEMU will save it.
Isn't it right?


Thanks,

Wincy



> Paolo
>
>> I can not find a point which guest could lose a irq, but this scenario
>> really exists.
>>
>> Any ideas?
>>
>>
>> Thanks,
>>
>> Wincy
>>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [question] Why newer QEMU may lose irq when doing migration?

2014-12-17 Thread Paolo Bonzini


On 17/12/2014 04:46, Wincy Van wrote:
> Hi, all:
> 
> The patchset (https://lkml.org/lkml/2014/3/18/309) fixed migration of
> Windows guests, but commit 0bc830b05c667218d703f2026ec866c49df974fc
> (KVM: ioapic: clear IRR for edge-triggered interrupts at delivery)
> introduced a bug (see
> https://www.mail-archive.com/kvm@vger.kernel.org/msg109813.html).
> 
> From the description "Unlike the old qemu-kvm, which really never did
> that, with new QEMU it is for some reason
> somewhat likely to migrate a VM with a nonzero IRR in the ioapic."
> 
> Why could new QEMU do that? I can not find any codes about the "some reason"..
> As we know, once a irq is set in kvm's ioapic, the ioapic will send
> that irq to lapic, this is an "atomic" operation.

It can happen if the IRQ is masked in the IOAPIC, for example.  Until
commit 0bc830b, KVM could not distinguish two cases:

1) an edge-triggered interrupt that was raised while the IOAPIC had it
masked

2) an edge-triggered interrupt that was raised and delivered, but for
which userspace left the level to 1.

> Then, kvm will inject them in inject_pending_event(or set rvi in
> apic-v case). QEMU will also save the pending irq when doing
> migration.

No, QEMU does not save the pending IRQ.  IRQs are stateless in QEMU.
The assumption is that after a qemu_set_irq the IRQ will be
delivered---possibly on the other side of the migration, but it will be
delivered.

Paolo

> I can not find a point which guest could lose a irq, but this scenario
> really exists.
> 
> Any ideas?
> 
> 
> Thanks,
> 
> Wincy
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[question] Why newer QEMU may lose irq when doing migration?

2014-12-16 Thread Wincy Van
Hi, all:

The patchset (https://lkml.org/lkml/2014/3/18/309) fixed migration of
Windows guests, but commit 0bc830b05c667218d703f2026ec866c49df974fc
(KVM: ioapic: clear IRR for edge-triggered interrupts at delivery)
introduced a bug (see
https://www.mail-archive.com/kvm@vger.kernel.org/msg109813.html).

>From the description "Unlike the old qemu-kvm, which really never did
that, with new QEMU it is for some reason
somewhat likely to migrate a VM with a nonzero IRR in the ioapic."

Why could new QEMU do that? I can not find any codes about the "some reason"..
As we know, once a irq is set in kvm's ioapic, the ioapic will send
that irq to lapic, this is an "atomic" operation.
Then, kvm will inject them in inject_pending_event(or set rvi in
apic-v case). QEMU will also save the pending irq when doing
migration.

I can not find a point which guest could lose a irq, but this scenario
really exists.

Any ideas?


Thanks,

Wincy
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html