Re: [patch] genirq: temporary fix for level-triggered IRQ resend

2007-08-09 Thread Jean-Baptiste Vignaud
> Hi,
>
> I see there is a bit of complaining on this original resend temporary
> patch. But, since it seems to do a good job for some people, here is
> my proposal to limit the 'range of fire' a little bit.
>
> Marcin and Jean-Baptiste: try to test this with 2.6.23-rc2, please.
> (Unless Ingo or Thomas have other plans with this problem?)

2.6.23-rc2 + this patch, and the box's cards are still networking after 20 
hours.

RX bytes:452423991847 (421.3 GiB)  TX bytes:13464471620 (12.5 GiB)

still testing.

Jb


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] genirq: temporary fix for level-triggered IRQ resend

2007-08-02 Thread Gabriel C
Ingo Molnar wrote:
> Linus,
>
> with -rc2 approaching i think we should apply the minimal fix below to
> get Marcin's ne2k-pci networking back in working order. The
> WARN_ON_ONCE() will not prevent the system from working and it will be a
> reminder.
>
> a better workaround would be to inhibit the resent vector via the
> IO-APIC irqchip - but i'd still like to have the patch below because the
> ne2k driver _should_ be able to survive the spurious irq that happens.
> (even on Marcin's system that ne2k-pci irq line is shared with another
> networking card, so an irq could happen at any moment - it's just that
> with the delayed-disable logic it happens _all the time_.)
>

I get a warning on each boot now with this patch .. 

[   63.686613] WARNING: at kernel/irq/resend.c:70 check_irq_resend()
[   63.686636]  [<c013c55c>] check_irq_resend+0x8c/0xa0
[   63.686653]  [<c013c15f>] enable_irq+0xad/0xb3
[   63.686662]  [<e886481e>] vortex_timer+0x20c/0x3d5 [3c59x]
[   63.686675]  [<c01164b9>] scheduler_tick+0x154/0x273
[   63.686685]  [<c012fed1>] getnstimeofday+0x34/0xe3
[   63.686697]  [<c0121f4a>] run_timer_softirq+0x137/0x197
[   63.686709]  [<e8864612>] vortex_timer+0x0/0x3d5 [3c59x]
[   63.686720]  [<c011ed09>] __do_softirq+0x75/0xe1
[   63.686729]  [<c011edac>] do_softirq+0x37/0x3d
[   63.686735]  [<c011ef85>] irq_exit+0x7c/0x7e
[   63.686740]  [<c010e013>] smp_apic_timer_interrupt+0x59/0x84
[   63.686751]  [<c0103428>] apic_timer_interrupt+0x28/0x30
[   63.686759]  [<c0101355>] default_idle+0x0/0x3f
[   63.686767]  [<c0101385>] default_idle+0x30/0x3f
[   63.686773]  [<c0100c19>] cpu_idle+0x5e/0x8e
[   63.686779]  [<c03fdc5f>] start_kernel+0x2d7/0x368


That means ?:)


>   Ingo
>


Gabriel


Re: [patch] genirq: temporary fix for level-triggered IRQ resend

2007-08-02 Thread Ingo Molnar

* Gabriel C [EMAIL PROTECTED] wrote:

> I get a warning on each boot now with this patch ..
>
> [   63.686613] WARNING: at kernel/irq/resend.c:70 check_irq_resend()
> [   63.686636]  [<c013c55c>] check_irq_resend+0x8c/0xa0
> [   63.686653]  [<c013c15f>] enable_irq+0xad/0xb3
> [   63.686662]  [<e886481e>] vortex_timer+0x20c/0x3d5 [3c59x]
> [   63.686675]  [<c01164b9>] scheduler_tick+0x154/0x273
> [   63.686685]  [<c012fed1>] getnstimeofday+0x34/0xe3
> [   63.686697]  [<c0121f4a>] run_timer_softirq+0x137/0x197
> [   63.686709]  [<e8864612>] vortex_timer+0x0/0x3d5 [3c59x]
> [   63.686720]  [<c011ed09>] __do_softirq+0x75/0xe1
> [   63.686729]  [<c011edac>] do_softirq+0x37/0x3d
> [   63.686735]  [<c011ef85>] irq_exit+0x7c/0x7e
> [   63.686740]  [<c010e013>] smp_apic_timer_interrupt+0x59/0x84
> [   63.686751]  [<c0103428>] apic_timer_interrupt+0x28/0x30
> [   63.686759]  [<c0101355>] default_idle+0x0/0x3f
> [   63.686767]  [<c0101385>] default_idle+0x30/0x3f
> [   63.686773]  [<c0100c19>] cpu_idle+0x5e/0x8e
> [   63.686779]  [<c03fdc5f>] start_kernel+0x2d7/0x368
>
>
> That means ?:)

if your network still works fine then you can ignore it :-)

we are still trying to figure out what happens with ne2k-pci. The 
message will vanish soon.

Ingo


Re: [patch] genirq: temporary fix for level-triggered IRQ resend

2007-07-31 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

> Linus,
>
> with -rc2 approaching i think we should apply the minimal fix below to
> get Marcin's ne2k-pci networking back in working order. The
> WARN_ON_ONCE() will not prevent the system from working and it will be
> a reminder.

there's one more test-patch that Marcin has not tested yet (see below) - 
perhaps a POST artifact in ne2k could explain this bug.

Ingo

-

* Alan Cox [EMAIL PROTECTED] wrote:

> Ok the logic behind the 8390 is very simple:

thanks for the explanation Alan! A few comments and a question:

> Things to know
>   - IRQ delivery is asynchronous to the PCI bus
>   - Blocking the local CPU IRQ via spin locks was too slow
>   - The chip has register windows needing locking work
>
> So the path was once (I say once as people appear to have changed it
> in the mean time and it now looks rather bogus if the changes to use
> disable_irq_nosync_irqsave are disabling the local IRQ)
>
>
>   Take the page lock
>   Mask the IRQ on chip
>   Disable the IRQ (but not mask locally- someone seems to have
>   broken this with the lock validator stuff)
>   [This must be _nosync as the page lock may otherwise
>   deadlock us]

( side-note: you can ignore the lock validator stuff here, the validator
  changes are supposed to be a NOP in the !lockdep case. Local irqs will
  only be disabled if the validator is running. This could cause dropped
  serial irqs on very old boxes but i doubt anyone will want to run the
  validator on those. )

>   Drop the page lock and turn IRQs back on
>
>   At this point an existing IRQ may still be running but we can't
>   get a new one
>
>   Take the lock (so we know the IRQ has terminated) but don't mask
> the IRQs on the processor
>   Set irqlock [for debug]
>
>   Transmit (slow as )
>
>   re-enable the IRQ
>
>
> We have to use disable_irq because otherwise you will get delayed
> interrupts on the APIC bus deadlocking the transmit path.
>
> Quite hairy but the chip simply wasn't designed for SMP and you can't
> even ACK an interrupt without risking corrupting other parallel
> activities on the chip.
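
The ordering Alan describes can be walked through with a small single-threaded user-space model. This is only a sketch under stated assumptions: every toy_* name is hypothetical, the booleans stand in for the page lock, the chip's interrupt mask and the irq line state, and none of this is the 8390 driver's code. It shows only that, with this ordering, the handler cannot start while the slow transmit runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state touched by the transmit path. */
struct toy_nic {
    bool page_lock;    /* stands in for the driver's page lock */
    bool chip_irq_on;  /* stands in for the chip's interrupt mask */
    bool irq_line_on;  /* stands in for disable_irq()/enable_irq() state */
    bool irqlock;      /* debug flag, as in the description */
    int  frames_sent;
};

static void toy_lock(struct toy_nic *n)   { assert(!n->page_lock); n->page_lock = true; }
static void toy_unlock(struct toy_nic *n) { assert(n->page_lock);  n->page_lock = false; }

/* In this model the handler can only start if both gates are open. */
static bool toy_handler_can_start(const struct toy_nic *n)
{
    return n->chip_irq_on && n->irq_line_on;
}

static void toy_start_xmit(struct toy_nic *n)
{
    toy_lock(n);              /* take the page lock */
    n->chip_irq_on = false;   /* mask the IRQ on chip */
    n->irq_line_on = false;   /* disable the IRQ without waiting (nosync) */
    toy_unlock(n);            /* drop the page lock */

    toy_lock(n);              /* retake it: a running handler has finished */
    n->irqlock = true;        /* set irqlock [for debug] */
    assert(!toy_handler_can_start(n));
    n->frames_sent++;         /* the slow transmit happens here */
    n->irqlock = false;
    n->chip_irq_on = true;    /* turn chip interrupts back on */
    toy_unlock(n);
    n->irq_line_on = true;    /* re-enable the IRQ */
}
```

Calling toy_start_xmit() on a toy_nic whose two gates start open leaves both open again afterwards, with the lock released and one frame counted.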

So the whole locking is to be able to keep irqs enabled for a long time, 
without risking entry of the same IRQ handler on this same CPU, correct?

Marcin's test results suggest that if an IRQ is resent right at the 
enable_irq() point [be that via the hw irq-resend mechanism or the sw 
irq-resend mechanism], the hang happens.

In the previous 2.6.20 logic we'd not normally generate an IRQ at that 
point (because we masked the irq and the card itself deasserts the line 
so any level-triggered irq is now moot).

Once Thomas hacked off this resend mechanism for level-triggered irqs, 
Marcin saw the hangs go away.
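
The difference between resending and not resending a level-triggered irq at enable time can be illustrated with a toy user-space model. It is a sketch, not the genirq code: the toy_* names and the resend_level switch are hypothetical, and it assumes that an irq arriving while the line is disabled is recorded as pending and replayed on enable, and that a level-triggered line fires again by itself only if the device still asserts it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one interrupt line. */
struct toy_irq {
    bool level_triggered;
    bool line_asserted;  /* device is currently asserting the line */
    bool disabled;       /* software disable flag */
    bool pending;        /* an irq arrived while disabled */
    int  handled;        /* number of handler invocations */
};

static void toy_raise(struct toy_irq *irq)
{
    irq->line_asserted = true;
    if (irq->disabled) {
        irq->pending = true;   /* remember it for enable time */
        return;
    }
    irq->handled++;
}

static void toy_device_deassert(struct toy_irq *irq) { irq->line_asserted = false; }
static void toy_disable(struct toy_irq *irq)         { irq->disabled = true; }

/* Returns 1 if enabling ran the handler although the line was idle. */
static int toy_enable(struct toy_irq *irq, bool resend_level)
{
    int spurious = 0;

    assert(irq->disabled);     /* enable must pair with a disable */
    irq->disabled = false;
    if (!irq->pending)
        return 0;
    irq->pending = false;

    if (irq->level_triggered && !resend_level) {
        /* No resend: the handler runs only if the line is still asserted. */
        if (irq->line_asserted)
            irq->handled++;
    } else {
        /* Resend: the handler runs regardless of the current line state. */
        irq->handled++;
        if (!irq->line_asserted)
            spurious = 1;
    }
    return spurious;
}
```

In this model, a device that raises the line while disabled and then deasserts it gets one spurious handler run with resend_level set and none with it cleared, which matches the behaviour discussed above.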

So it seems to me that maybe the driver could be surprised via these 
spurious interrupts that happen right after the enable_irq(). Does the 
patch below make any sense in your opinion?

Ingo

Index: linux/drivers/net/lib8390.c
===================================================================
--- linux.orig/drivers/net/lib8390.c
+++ linux/drivers/net/lib8390.c
@@ -375,6 +375,8 @@ static int ei_start_xmit(struct sk_buff 
 	/* Turn 8390 interrupts back on. */
 	ei_local->irqlock = 0;
 	ei_outb_p(ENISR_ALL, e8390_base + EN0_IMR);
+	/* force POST: */
+	ei_inb_p(e8390_base + EN0_IMR);
 
 	spin_unlock(&ei_local->page_lock);
 	enable_irq_lockdep_irqrestore(dev->irq, &flags);
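
The idea behind the read-back in the test patch can be sketched with a toy model of a posted write: the write may sit in a buffer until a later read from the same device forces it out. The toy_* names and the single-entry buffer are hypothetical, and whether posting actually occurs on the ne2k-pci register path is exactly what the test patch is meant to find out, so this is an illustration of the general technique only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-register device behind a one-entry write buffer. */
struct toy_bus {
    uint8_t device_reg;  /* value the device has actually received */
    uint8_t posted_val;  /* write still held in the buffer */
    bool    has_posted;
};

/* A posted write returns before the device has seen the value. */
static void toy_write(struct toy_bus *b, uint8_t v)
{
    b->posted_val = v;
    b->has_posted = true;
}

static void toy_flush(struct toy_bus *b)
{
    if (b->has_posted) {
        b->device_reg = b->posted_val;
        b->has_posted = false;
    }
}

/* A read cannot complete until earlier writes have reached the device. */
static uint8_t toy_read(struct toy_bus *b)
{
    toy_flush(b);
    assert(!b->has_posted);
    return b->device_reg;
}
```

In the model, the register still holds its old value right after toy_write(), and only the following toy_read() guarantees the device has the new one before the caller goes on to re-enable the irq.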