Dear all,
we are facing some difficulties with GSI interrupt storms
originating from a PCI card that seem to be caused by
ipipe: The card is passed through to qemu-kvm (the setup
is based on the patches sent by Jan some time ago). Once
the card becomes active, we are hit by a tremendous number
of interrupts (> 100000/s) that keep ipipe fully occupied.
The observed pattern is (excerpt from the ipipe tracer):
:| common_interrupt+0x20 (__ipipe_spin_unlock_irqrestore+0x62)
:| __ipipe_handle_irq+0x11 (common_interrupt+0x27)
(...)
: handle_irq+0x9 (do_IRQ+0x66)
: irq_to_desc+0x4 (handle_irq+0x15)
: handle_fasteoi_irq+0x14 (handle_irq+0x22)
(...)
: unmask_ioapic_irq+0x4 (handle_fasteoi_irq+0x94)
: unmask_ioapic+0xd (unmask_ioapic_irq+0x14)
: __ipipe_spin_lock_irqsave+0x7 (unmask_ioapic+0x23)
:| __ipipe_spin_lock_irqsave+0x93 (unmask_ioapic+0x23)
:| __io_apic_modify_irq+0x4 (unmask_ioapic+0x41)
:| __ipipe_unlock_irq+0x11 (unmask_ioapic+0x66)
:| __ipipe_spin_unlock_irqrestore+0x9 (unmask_ioapic+0x75)
:| __ipipe_spin_unlock_irqrestore+0x60 (unmask_ioapic+0x75)
:| common_interrupt+0x20 (__ipipe_spin_unlock_irqrestore+0x62)
That is, as soon as the IRQ in question is unmasked, the
next one is immediately received, and the interrupt handler
in non-RT context never gets a chance to actually service
the interrupt.
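
To make the pattern above concrete, here is a small user-space toy
model (not kernel code, and not how ipipe is implemented). It assumes
a level-triggered line that the device keeps asserted until its driver
services it, and that servicing is deferred until no further interrupt
can be delivered. All names (toy_line, toy_run, budget) are made up
for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a level-triggered interrupt line (hypothetical names). */
struct toy_line {
    bool asserted;  /* device keeps the line asserted until serviced */
    bool masked;    /* line is masked at the simulated interrupt controller */
};

/*
 * Deliver interrupts until either the line can no longer fire or
 * 'budget' interrupts have been taken. If 'early_unmask' is set, the
 * line is unmasked right after each interrupt, before the deferred
 * handler has run. Returns the number of interrupts taken and reports
 * through 'serviced' whether the deferred handler ever got to run.
 */
static unsigned int toy_run(bool early_unmask, unsigned int budget,
                            bool *serviced)
{
    struct toy_line line = { .asserted = true, .masked = false };
    unsigned int taken = 0;

    assert(serviced != NULL);
    *serviced = false;

    while (taken < budget && line.asserted && !line.masked) {
        taken++;
        line.masked = true;          /* flow handler masks the line */
        if (early_unmask)
            line.masked = false;     /* still asserted: fires again at once */
    }

    /* The deferred handler only runs once nothing can be delivered. */
    if (line.masked || !line.asserted) {
        line.asserted = false;       /* driver quiesces the device */
        line.masked = false;         /* now it is safe to unmask */
        *serviced = true;
    }
    return taken;
}
```

With early_unmask set, toy_run() exhausts whatever budget it is given
and never reaches the handler; with it cleared, a single interrupt is
taken and the handler runs. This only mirrors the behaviour seen in
the trace under the stated assumptions.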
The problem seems to be caused by unmasking the IRQ in
handle_fasteoi_irq(), and with a hack along the lines of
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -586,7 +586,8 @@ handle_fasteoi_irq(unsigned int irq, struct irq_desc *desc)
 	raw_spin_lock(&desc->lock);
 	desc->status &= ~IRQ_INPROGRESS;
 #ifdef CONFIG_IPIPE
-	desc->irq_data.chip->irq_unmask(&desc->irq_data);
+	if (irq != WHICHEVER_IRQ_CAUSES_THE_STORM)
+		desc->irq_data.chip->irq_unmask(&desc->irq_data);
 out:
 #else
 out:
the issue is solved.
So the question is: Why is it okay to unconditionally unmask
all interrupts in the fasteoi handler? All cards that re-send
interrupts at high frequencies unless they are properly handled
by their device driver should cause the same problem.
I take it the early unmasking is an optimisation, or are there any
further reasons for the unconditional unmasking in
handle_fasteoi_irq()?
Thanks & best regards, Wolfgang
--
Siemens AG, Open Source Platforms,
Corporate Competence Centre Embedded Linux