Re: [PATCH] natsemi: netpoll fixes

2007-03-13 Thread Sergei Shtylyov

Hello.

Mark Brown wrote:


Subject: natsemi: Fix NAPI for interrupt sharing



The interrupt status register for the natsemi chips is clear on read and
was read unconditionally from both the interrupt and from the NAPI poll
routine, meaning that if the interrupt service routine was called (for 
example, due to a shared interrupt) while a NAPI poll was scheduled

interrupts could be missed.  This patch fixes that by ensuring that the
interrupt status register is only read by the interrupt handler when
interrupts are enabled from the chip.

It also reverts a workaround for this problem from the netpoll hook and
improves the trace for interrupt events.

Thanks to Sergei Shtylyov [EMAIL PROTECTED] for spotting the
issue, Mark Huth [EMAIL PROTECTED] for a simpler method and Simon
Blake [EMAIL PROTECTED] for testing resources.



Signed-Off-By: Mark Brown [EMAIL PROTECTED]



Index: linux-2.6/drivers/net/natsemi.c
===
--- linux-2.6.orig/drivers/net/natsemi.c2007-03-11 02:32:43.0 
+
+++ linux-2.6/drivers/net/natsemi.c 2007-03-13 00:12:29.0 +

[...]

@@ -2131,17 +2133,23 @@
   dev-name, np-intr_status,
   readl(ioaddr + IntrMask));
 
-	if (!np-intr_status)

-   return IRQ_NONE;
-
-   prefetch(np-rx_skbuff[np-cur_rx % RX_RING_SIZE]);
+   if (np-intr_status) {
+   prefetch(np-rx_skbuff[np-cur_rx % RX_RING_SIZE]);
 
-	if (netif_rx_schedule_prep(dev)) {

/* Disable interrupts and register for poll */
-   natsemi_irq_disable(dev);
-   __netif_rx_schedule(dev);
+   if (netif_rx_schedule_prep(dev)) {
+   natsemi_irq_disable(dev);
+   __netif_rx_schedule(dev);
+   } else
+   printk(KERN_WARNING
+  %s: Ignoring interrupt, status %#08x, mask 
%#08x.\n,
+  dev-name, np-intr_status,
+  readl(ioaddr + IntrMask));
+
+   return IRQ_HANDLED;
}
-   return IRQ_HANDLED;
+
+   return IRQ_NONE;
 }


   The only complaint I have is that this restructuring seems 
unnecessary: the only real change it does is an addition of else to the if 
statement.


WBR, Sergei
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] natsemi: netpoll fixes

2007-03-13 Thread Sergei Shtylyov

Hello.

Mark Brown wrote:


Moving netdev_rx() would fix that one but there's some others too -
there's one in the timer routine if the chip crashes.  In the case you


  Erm, sorry, I'm not seeing it -- could you point with finger please? 
  :-)



In netdev_timer() when the device is using PORT_TP if the DspCfg read
back from the chip differs from the one we think we programmed into it
then the driver thinks the PHY fell over.  It then goes through an init
sequence, including init_registers() which will reset IntrEnable among
other things.


   What's more important for us, it will also clear IntrStatus (and ignore 
all pending interrupts).  Well, as it will also reinit the whole TX/RX rings, 
so that all packets will be lost...



describe above the consequences shouldn't be too bad since it tends to
only occur at high volume so further traffic will tend to occur and
cause things to recover - all the testing of that patch was done with
the bug present and no ill effects.



  Oversized packets occur only at high volume? Is it some errata?



It's an errata - AN 1287 which you can get from the National web site.
It's not actually that chip that's getting oversided packets, what
happens is that the state machine which reads data off the wire gets
confused and eventually locks up.  Before locking up it will usually
report one or more oversided packets so this is a useful hint that we
should reset the recieve state machine in order to recover from this.


   That's all good by why we need to completely lose TX and other interrupts
in the meantime? High inbound traffic doesn't necessarily mean a high outbound 
one, does it?


WBR, Sergei

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] natsemi: netpoll fixes

2007-03-13 Thread Mark Brown
On Tue, Mar 13, 2007 at 04:53:54PM +0300, Sergei Shtylyov wrote:
 Mark Brown wrote:

 confused and eventually locks up.  Before locking up it will usually
 report one or more oversided packets so this is a useful hint that we
 should reset the recieve state machine in order to recover from this.

That's all good by why we need to completely lose TX and other interrupts
 in the meantime? High inbound traffic doesn't necessarily mean a high 
 outbound one, does it?

While the code in the driver can cope if the chip takes a while to
respond to the reset as far as I have been able to tell in testing 
it does so close enough to immediately to avoid repeating the loop at
all.  The effect on transmit processing should be minimal.

-- 
You grabbed my hand and we fell into it, like a daydream - or a fever.


signature.asc
Description: Digital signature


Re: [PATCH] natsemi: netpoll fixes

2007-03-12 Thread Sergei Shtylyov

Hello.

Mark Brown wrote:

  Oops, I was going to recast the patch but my attention switched 
  elsewhere for couple of days, and it slipped into mainline. I'm now 
preparing a better patch to also protect...



Ah, I was also looking at it.  I enclose my current patch which appears
to work although I have not finished testing it yet.


interrupt handler, check the dev-state __LINK_STATE_SCHED flag - if 
it's set, leave immediately, it can't be our interrupt. If it's clear, 
read the irq enable hardware register.  If enabled, do the rest of the 
interrupt handler.


  It seems that there's no need to read it, as it gets set to 0 
synchronously with setting the 'hands_off' flag (except in NAPI 
callback)...



hands_off is stronger than that - it's used for sync with some of the
other code paths like suspend/resume and means don't touch the chip.
I've added a new driver local flag instead.


   I'm not sure it was worth it -- we already had __LINK_STATE_RX_SCHED...

  Yeah, it seems currently unjustified.  However IntrEnable would have 
  been an ultimate criterion on taking or ignoring an interrupt otherwise...


I guess the tradeoff depends on the probability 
of getting the isr called when NAPI is active for the device.



  Didn't get it... :-/


   I mean I didn't understand why there's tradeoff and how it depends on the 
probability...



Some systems can provoke this fairly easily - Sokeris have some boards
where at least three natsemis share a single interrupt line, for example
(the model I'm looking at has three, they look to have another
configuration where 5 would end up on the line).


   PCI means IRQ sharing, generally. So, it should have been fixed anyway. :-)


  BTW, it seems I've found another interrupt lossage path in the driver:



netdev_poll() - netdev_rx() - reset_rx()


If the netdev_rx() detects an oversized packet, it will call reset_rx() 
which will spin in a loop collecting interrupt status until it sees 
RxResetDone there.  The issue is 'intr_status' field will get overwritten 
and interrupt status lost after netdev_rx() returns to netdev_poll().  How 
do you think, is this corner case worth fixing (by moving netdev_rx() call

to the top of a do/while loop)?



Moving netdev_rx() would fix that one but there's some others too -
there's one in the timer routine if the chip crashes.  In the case you


   Erm, sorry, I'm not seeing it -- could you point with finger please? :-)


describe above the consequences shouldn't be too bad since it tends to
only occur at high volume so further traffic will tend to occur and
cause things to recover - all the testing of that patch was done with
the bug present and no ill effects.


   Oversized packets occur only at high volume? Is it some errata?


Subject: natsemi: Fix NAPI for interrupt sharing
To: Jeff Garzik [EMAIL PROTECTED]
Cc: Sergei Shtylyov [EMAIL PROTECTED], Simon Blake [EMAIL PROTECTED], John 
Philips [EMAIL PROTECTED], netdev@vger.kernel.org



The interrupt status register for the natsemi chips is clear on read and
was read unconditionally from both the interrupt and from the NAPI poll
routine, meaning that if the interrupt service routine was called (for 
example, due to a shared interrupt) while a NAPI poll was scheduled

interrupts could be missed.  This patch fixes that by ensuring that the
interrupt status register is only read when there is no poll scheduled.



It also reverts a workaround for this problem from the netpoll hook.



Thanks to Sergei Shtylyov [EMAIL PROTECTED] for spotting the
issue and Simon Blake [EMAIL PROTECTED] for testing resources.


   Thanks for the patch!
   (If I only knew somebody else was working on that issue, it could have 
saved my cycles, sigh... but well, I should have said  that I was going to 
recast the patch. :-)



Signed-Off-By: Mark Brown [EMAIL PROTECTED]



Index: linux-2.6/drivers/net/natsemi.c
===
--- linux-2.6.orig/drivers/net/natsemi.c2007-03-11 02:32:43.0 
+
+++ linux-2.6/drivers/net/natsemi.c 2007-03-11 12:09:14.0 +
@@ -571,6 +571,8 @@
int oom;
/* Interrupt status */
u32 intr_status;
+   int poll_active;
+   spinlock_t intr_lock;
/* Do not touch the nic registers */
int hands_off;
/* Don't pay attention to the reported link state. */
@@ -812,9 +814,11 @@
pci_set_drvdata(pdev, dev);
np-iosize = iosize;
spin_lock_init(np-lock);
+   spin_lock_init(np-intr_lock);
np-msg_enable = (debug = 0) ? (1debug)-1 : NATSEMI_DEF_MSG;
np-hands_off = 0;
np-intr_status = 0;
+   np-poll_active = 0;
np-eeprom_size = natsemi_pci_info[chip_idx].eeprom_size;
if (natsemi_pci_info[chip_idx].flags  NATSEMI_FLAG_IGNORE_PHY)
np-ignore_phy = 1;
@@ -1406,6 +1410,8 @@
writel(rfcr, ioaddr + RxFilterAddr);
 }
 
+/* MUST be called so that both NAPI poll and 

Re: [PATCH] natsemi: netpoll fixes

2007-03-12 Thread Mark Huth

Mark Brown wrote:

On Sat, Mar 10, 2007 at 11:25:05PM +0300, Sergei Shtylyov wrote:

  
   Oops, I was going to recast the patch but my attention switched 
   elsewhere for couple of days, and it slipped into mainline. I'm now 
preparing a better patch to also protect...



Ah, I was also looking at it.  I enclose my current patch which appears
to work although I have not finished testing it yet.

  
interrupt handler, check the dev-state __LINK_STATE_SCHED flag - if 
it's set, leave immediately, it can't be our interrupt. If it's clear, 
read the irq enable hardware register.  If enabled, do the rest of the 
interrupt handler.
  


  
   It seems that there's no need to read it, as it gets set to 0 
synchronously with setting the 'hands_off' flag (except in NAPI 
callback)...



hands_off is stronger than that - it's used for sync with some of the
other code paths like suspend/resume and means don't touch the chip.
I've added a new driver local flag instead.

  
   Yeah, it seems currently unjustified.  However IntrEnable would have 
   been an ultimate criterion on taking or ignoring an interrupt otherwise...



  
I guess the tradeoff depends on the probability 
of getting the isr called when NAPI is active for the device.
  


  

   Didn't get it... :-/



Some systems can provoke this fairly easily - Sokeris have some boards
where at least three natsemis share a single interrupt line, for example
(the model I'm looking at has three, they look to have another
configuration where 5 would end up on the line).

  

   BTW, it seems I've found another interrupt lossage path in the driver:



  

netdev_poll() - netdev_rx() - reset_rx()



  
If the netdev_rx() detects an oversized packet, it will call reset_rx() 
which will spin in a loop collecting interrupt status until it sees 
RxResetDone there.  The issue is 'intr_status' field will get overwritten 
and interrupt status lost after netdev_rx() returns to netdev_poll().  How 
do you think, is this corner case worth fixing (by moving netdev_rx() call

to the top of a do/while loop)?



Moving netdev_rx() would fix that one but there's some others too -
there's one in the timer routine if the chip crashes.  In the case you
describe above the consequences shouldn't be too bad since it tends to
only occur at high volume so further traffic will tend to occur and
cause things to recover - all the testing of that patch was done with
the bug present and no ill effects.
  
The proposed patch would seem to be way overkill.  The problem, as 
solved in this patch, at least is just the prevention of the ISR from 
reading the intr_status while another context (NAPI) is still active.  
The simplest fix is to revert the netpoll attempted fix and then modify 
the isr etnry criteria as follows:


--- natsemi.c   2006-09-19 20:42:06.0 -0700
+++ natsemi.c.new   2007-03-12 11:35:28.0 -0700
@@ -2094,7 +2094,7 @@
   struct netdev_private *np = netdev_priv(dev);
   void __iomem * ioaddr = ns_ioaddr(dev);

-if (np-hands_off)
+   if (np-hands_off || !readl(ioaddr + IntrEnable))
   return IRQ_NONE;

   /* Reading automatically acknowledges. */

Since the interrupts are enabled as the NAPI-callback exits, and the 
interrupts are disabled in the isr after the callback is scheduled, this 
fully avoids the potential race conditions, and requires no locking.  If 
we want to avoid the ioaccess, then this could be augmented with a flag 
in the device private area which would be tested first, short-circuiting 
the ioaccess most of the time when the callback is already scheduled.  
But the IntrEnable still would need to be check to avoid the race on the 
enable/disable condition between the callback and the isr.


It is possible to entirely avoid the IntrEnable access and use only a 
flag, but that requires a local_irqsave/restore around the calback's 
enabling of the interrupt and and clearing of the poll_active flag.  
That is fully up-safe, but may have a theoretical possibility of an 
unserviced (spurious) interrupt on SMP systems.  Such a possibility 
would only exist if an architecture is able to accept an interrupt on a 
second processor and get to the natsemi isr before the the first 
proccesor can clear the poll_active flag.  That can be further minimized 
by omitting the readback of the IntrEnable register, or deferring it 
until after the flag is cleared.


If you would like me to prepare formal patches based on any of these 
concepts, let me know.


Mark Huth

Subject: natsemi: Fix NAPI for interrupt sharing
To: Jeff Garzik [EMAIL PROTECTED]
Cc: Sergei Shtylyov [EMAIL PROTECTED], Simon Blake [EMAIL PROTECTED], John 
Philips [EMAIL PROTECTED], netdev@vger.kernel.org

The interrupt status register for the natsemi chips is clear on read and
was read unconditionally from both the interrupt and from the NAPI poll
routine, meaning that if the interrupt service routine was called (for 

Re: [PATCH] natsemi: netpoll fixes

2007-03-12 Thread Mark Brown
On Mon, Mar 12, 2007 at 04:05:48PM +0300, Sergei Shtylyov wrote:
 Mark Brown wrote:

 hands_off is stronger than that - it's used for sync with some of the
 other code paths like suspend/resume and means don't touch the chip.
 I've added a new driver local flag instead.

I'm not sure it was worth it -- we already had __LINK_STATE_RX_SCHED...

It would be nice but as well as being a general layering violation it
could cause problems when trying to quiesce the hardware since
netif_poll_disable() sets this without actually scheduling the poll.

   Yeah, it seems currently unjustified.  However IntrEnable would have 
   been an ultimate criterion on taking or ignoring an interrupt 
   otherwise...

I guess the tradeoff depends on the probability
of getting the isr called when NAPI is active for the device.

I mean I didn't understand why there's tradeoff and how it depends on 
the probability...

Reading device registers means going over the PCI bus, which is
expensive.  Shadowing the interrupt mask state in the driver makes
the driver more complicated but means that we avoid synchronous PCI
accesses, reducing the number of cycles the CPU spends stalled doing
those.

The performance benefit from the extra code complexity depends on how
often we end up doing the extra read.  Since one of the things NAPI does
is provide some interrupt mitigation the extra work is most likely to be
noticable if some other device is generating interrupts that trigger the
natsemi interrupt handler.

 Moving netdev_rx() would fix that one but there's some others too -
 there's one in the timer routine if the chip crashes.  In the case you

Erm, sorry, I'm not seeing it -- could you point with finger please? 
:-)

In netdev_timer() when the device is using PORT_TP if the DspCfg read
back from the chip differs from the one we think we programmed into it
then the driver thinks the PHY fell over.  It then goes through an init
sequence, including init_registers() which will reset IntrEnable among
other things.

 describe above the consequences shouldn't be too bad since it tends to
 only occur at high volume so further traffic will tend to occur and
 cause things to recover - all the testing of that patch was done with
 the bug present and no ill effects.

Oversized packets occur only at high volume? Is it some errata?

It's an errata - AN 1287 which you can get from the National web site.
It's not actually that chip that's getting oversided packets, what
happens is that the state machine which reads data off the wire gets
confused and eventually locks up.  Before locking up it will usually
report one or more oversided packets so this is a useful hint that we
should reset the recieve state machine in order to recover from this.

(If I only knew somebody else was working on that issue, it could have 
 saved my cycles, sigh... but well, I should have said  that I was going to 
 recast the patch. :-)

Yeah, me too.  I'll submit this one once I've finished testing, then
audit other IntrStatus accesses.

 -   readl(ioaddr + IntrMask));
 +spin_lock_irqsave(np-intr_lock, flags);

Yeah, I've suspected that we need to grab np-lock here... but does that 
 separate spinlock actually protect us from anything?

...

 +/* We need to ensure that the ISR doesn't run between telling
 + * NAPI we're done and enabling the interrupt. */
 
Why? :-O

This is to ensure that the interrupt handler can't run between us
marking that the poll has complete and reenabling the interrupt from the
chip.  That would mean that the chip would end up with interrupts from
the chip enabled while the poll is scheduled.

-- 
You grabbed my hand and we fell into it, like a daydream - or a fever.


signature.asc
Description: Digital signature


Re: [PATCH] natsemi: netpoll fixes

2007-03-12 Thread Sergei Shtylyov

Hello, I wrote:


Subject: natsemi: Fix NAPI for interrupt sharing
To: Jeff Garzik [EMAIL PROTECTED]
Cc: Sergei Shtylyov [EMAIL PROTECTED], Simon Blake 
[EMAIL PROTECTED], John Philips [EMAIL PROTECTED], 
netdev@vger.kernel.org



The interrupt status register for the natsemi chips is clear on read and
was read unconditionally from both the interrupt and from the NAPI poll
routine, meaning that if the interrupt service routine was called (for 
example, due to a shared interrupt) while a NAPI poll was scheduled

interrupts could be missed.  This patch fixes that by ensuring that the
interrupt status register is only read when there is no poll scheduled.



It also reverts a workaround for this problem from the netpoll hook.



Thanks to Sergei Shtylyov [EMAIL PROTECTED] for spotting the


   Well, I've blithely overlooked it, and it's you who did spot it. :-)


issue and Simon Blake [EMAIL PROTECTED] for testing resources.



   Thanks for the patch!
   (If I only knew somebody else was working on that issue, it could 
have saved my cycles, sigh... but well, I should have said  that I was 
going to recast the patch. :-)



Signed-Off-By: Mark Brown [EMAIL PROTECTED]



Index: linux-2.6/drivers/net/natsemi.c
===
--- linux-2.6.orig/drivers/net/natsemi.c2007-03-11 
02:32:43.0 +
+++ linux-2.6/drivers/net/natsemi.c2007-03-11 12:09:14.0 
+

@@ -571,6 +571,8 @@
 int oom;
 /* Interrupt status */
 u32 intr_status;
+int poll_active;
+spinlock_t intr_lock;
 /* Do not touch the nic registers */
 int hands_off;
 /* Don't pay attention to the reported link state. */
@@ -812,9 +814,11 @@
 pci_set_drvdata(pdev, dev);
 np-iosize = iosize;
 spin_lock_init(np-lock);
+spin_lock_init(np-intr_lock);
 np-msg_enable = (debug = 0) ? (1debug)-1 : NATSEMI_DEF_MSG;
 np-hands_off = 0;
 np-intr_status = 0;
+np-poll_active = 0;
 np-eeprom_size = natsemi_pci_info[chip_idx].eeprom_size;
 if (natsemi_pci_info[chip_idx].flags  NATSEMI_FLAG_IGNORE_PHY)
 np-ignore_phy = 1;
@@ -1406,6 +1410,8 @@
 writel(rfcr, ioaddr + RxFilterAddr);
 }
 
+/* MUST be called so that both NAPI poll and ISR are excluded due to

+ * use of intr_status. */
 static void reset_rx(struct net_device *dev)
 {
 int i;
@@ -2118,30 +2124,45 @@
 struct net_device *dev = dev_instance;
 struct netdev_private *np = netdev_priv(dev);
 void __iomem * ioaddr = ns_ioaddr(dev);
+unsigned long flags;
+irqreturn_t status = IRQ_NONE;
 
 if (np-hands_off)

 return IRQ_NONE;
 
-/* Reading automatically acknowledges. */

-np-intr_status = readl(ioaddr + IntrStatus);
-
-if (netif_msg_intr(np))
-printk(KERN_DEBUG
-   %s: Interrupt, status %#08x, mask %#08x.\n,
-   dev-name, np-intr_status,
-   readl(ioaddr + IntrMask));
+spin_lock_irqsave(np-intr_lock, flags);


   Yeah, I've suspected that we need to grab np-lock here... but does 
that separate spinlock actually protect us from anything?


   I'm also not sure that we need to disable interrupts here.


-if (!np-intr_status)
-return IRQ_NONE;
+/* Reading IntrStatus automatically acknowledges so don't do
+ * that while a poll is scheduled.  */
+if (!np-poll_active) {
+np-intr_status = readl(ioaddr + IntrStatus);
 
-prefetch(np-rx_skbuff[np-cur_rx % RX_RING_SIZE]);

+if (netif_msg_intr(np))
+printk(KERN_DEBUG
+   %s: Interrupt, status %#08x, mask %#08x.\n,
+   dev-name, np-intr_status,
+   readl(ioaddr + IntrMask));
+
+if (np-intr_status) {
+prefetch(np-rx_skbuff[np-cur_rx % RX_RING_SIZE]);
+
+/* Disable interrupts and register for poll */
+if (netif_rx_schedule_prep(dev)) {
+natsemi_irq_disable(dev);
+__netif_rx_schedule(dev);
+np-poll_active = 1;
+} else
+printk(KERN_WARNING
+  %s: Ignoring interrupt, status %#08x, 
mask %#08x.\n,

+   dev-name, np-intr_status,
+   readl(ioaddr + IntrMask));
 
-if (netif_rx_schedule_prep(dev)) {

-/* Disable interrupts and register for poll */
-natsemi_irq_disable(dev);
-__netif_rx_schedule(dev);
+status = IRQ_HANDLED;
+}
 }
-return IRQ_HANDLED;
+
+spin_unlock_irqrestore(np-intr_lock, flags);
+return status;
 }
 
 /* This is the NAPI poll routine.  As well as the standard RX handling

@@ -2154,8 +2175,15 @@
 
 int work_to_do = min(*budget, dev-quota);

 int work_done = 0;
+unsigned long flags;
 
 do {

+if (netif_msg_intr(np))
+printk(KERN_DEBUG
+   %s: Poll, status %#08x, mask %#08x.\n,
+   dev-name, np-intr_status,
+  

Re: [PATCH] natsemi: netpoll fixes

2007-03-12 Thread Mark Brown
On Mon, Mar 12, 2007 at 12:05:46PM -0700, Mark Huth wrote:

 Since the interrupts are enabled as the NAPI-callback exits, and the 
 interrupts are disabled in the isr after the callback is scheduled, this 
 fully avoids the potential race conditions, and requires no locking.  If 

I've benchmarked both approaches and they appear to be much of a
muchness for performance in my tests so I've updated my patch to use
this approach instead.  Thanks for the suggestion, it is simpler.

 If you would like me to prepare formal patches based on any of these 
 concepts, let me know.

That's OK - unless any further suggestions I'll submit the patch below
after further testing.  The improvements to trace and correctness of the
interupt handler seem useful.

Subject: natsemi: Fix NAPI for interrupt sharing

The interrupt status register for the natsemi chips is clear on read and
was read unconditionally from both the interrupt and from the NAPI poll
routine, meaning that if the interrupt service routine was called (for 
example, due to a shared interrupt) while a NAPI poll was scheduled
interrupts could be missed.  This patch fixes that by ensuring that the
interrupt status register is only read by the interrupt handler when
interrupts are enabled from the chip.

It also reverts a workaround for this problem from the netpoll hook and
improves the trace for interrupt events.

Thanks to Sergei Shtylyov [EMAIL PROTECTED] for spotting the
issue, Mark Huth [EMAIL PROTECTED] for a simpler method and Simon
Blake [EMAIL PROTECTED] for testing resources.

Signed-Off-By: Mark Brown [EMAIL PROTECTED]

Index: linux-2.6/drivers/net/natsemi.c
===
--- linux-2.6.orig/drivers/net/natsemi.c2007-03-11 02:32:43.0 
+
+++ linux-2.6/drivers/net/natsemi.c 2007-03-13 00:12:29.0 +
@@ -2119,10 +2119,12 @@
struct netdev_private *np = netdev_priv(dev);
void __iomem * ioaddr = ns_ioaddr(dev);
 
-   if (np-hands_off)
+   /* Reading IntrStatus automatically acknowledges so don't do
+* that while interrupts are disabled, (for example, while a
+* poll is scheduled).  */
+   if (np-hands_off || !readl(ioaddr + IntrEnable))
return IRQ_NONE;
 
-   /* Reading automatically acknowledges. */
np-intr_status = readl(ioaddr + IntrStatus);
 
if (netif_msg_intr(np))
@@ -2131,17 +2133,23 @@
   dev-name, np-intr_status,
   readl(ioaddr + IntrMask));
 
-   if (!np-intr_status)
-   return IRQ_NONE;
-
-   prefetch(np-rx_skbuff[np-cur_rx % RX_RING_SIZE]);
+   if (np-intr_status) {
+   prefetch(np-rx_skbuff[np-cur_rx % RX_RING_SIZE]);
 
-   if (netif_rx_schedule_prep(dev)) {
/* Disable interrupts and register for poll */
-   natsemi_irq_disable(dev);
-   __netif_rx_schedule(dev);
+   if (netif_rx_schedule_prep(dev)) {
+   natsemi_irq_disable(dev);
+   __netif_rx_schedule(dev);
+   } else
+   printk(KERN_WARNING
+  %s: Ignoring interrupt, status %#08x, mask 
%#08x.\n,
+  dev-name, np-intr_status,
+  readl(ioaddr + IntrMask));
+
+   return IRQ_HANDLED;
}
-   return IRQ_HANDLED;
+
+   return IRQ_NONE;
 }
 
 /* This is the NAPI poll routine.  As well as the standard RX handling
@@ -2156,6 +2164,12 @@
int work_done = 0;
 
do {
+   if (netif_msg_intr(np))
+   printk(KERN_DEBUG
+  %s: Poll, status %#08x, mask %#08x.\n,
+  dev-name, np-intr_status,
+  readl(ioaddr + IntrMask));
+
if (np-intr_status 
(IntrTxDone | IntrTxIntr | IntrTxIdle | IntrTxErr)) {
spin_lock(np-lock);
@@ -2399,19 +2413,8 @@
 #ifdef CONFIG_NET_POLL_CONTROLLER
 static void natsemi_poll_controller(struct net_device *dev)
 {
-   struct netdev_private *np = netdev_priv(dev);
-
disable_irq(dev-irq);
-
-   /*
-* A real interrupt might have already reached us at this point
-* but NAPI might still haven't called us back.  As the interrupt
-* status register is cleared by reading, we should prevent an
-* interrupt loss in this case...
-*/
-   if (!np-intr_status)
-   intr_handler(dev-irq, dev);
-
+   intr_handler(dev-irq, dev);
enable_irq(dev-irq);
 }
 #endif

-- 
You grabbed my hand and we fell into it, like a daydream - or a fever.


signature.asc
Description: Digital signature


Re: [PATCH] natsemi: netpoll fixes

2007-03-11 Thread Mark Brown
On Sat, Mar 10, 2007 at 11:25:05PM +0300, Sergei Shtylyov wrote:

Oops, I was going to recast the patch but my attention switched 
elsewhere for couple of days, and it slipped into mainline. I'm now 
 preparing a better patch to also protect...

Ah, I was also looking at it.  I enclose my current patch which appears
to work although I have not finished testing it yet.

 interrupt handler, check the dev-state __LINK_STATE_SCHED flag - if 
 it's set, leave immediately, it can't be our interrupt. If it's clear, 
 read the irq enable hardware register.  If enabled, do the rest of the 
 interrupt handler.

It seems that there's no need to read it, as it gets set to 0 
 synchronously with setting the 'hands_off' flag (except in NAPI 
 callback)...

hands_off is stronger than that - it's used for sync with some of the
other code paths like suspend/resume and means don't touch the chip.
I've added a new driver local flag instead.

Yeah, it seems currently unjustified.  However IntrEnable would have 
been an ultimate criterion on taking or ignoring an interrupt otherwise...

 I guess the tradeoff depends on the probability 
 of getting the isr called when NAPI is active for the device.

Didn't get it... :-/

Some systems can provoke this fairly easily - Sokeris have some boards
where at least three natsemis share a single interrupt line, for example
(the model I'm looking at has three, they look to have another
configuration where 5 would end up on the line).

BTW, it seems I've found another interrupt lossage path in the driver:

 netdev_poll() - netdev_rx() - reset_rx()

 If the netdev_rx() detects an oversized packet, it will call reset_rx() 
 which will spin in a loop collecting interrupt status until it sees 
 RxResetDone there.  The issue is 'intr_status' field will get overwritten 
 and interrupt status lost after netdev_rx() returns to netdev_poll().  How 
 do you think, is this corner case worth fixing (by moving netdev_rx() call
 to the top of a do/while loop)?

Moving netdev_rx() would fix that one but there's some others too -
there's one in the timer routine if the chip crashes.  In the case you
describe above the consequences shouldn't be too bad since it tends to
only occur at high volume so further traffic will tend to occur and
cause things to recover - all the testing of that patch was done with
the bug present and no ill effects.

Subject: natsemi: Fix NAPI for interrupt sharing
To: Jeff Garzik [EMAIL PROTECTED]
Cc: Sergei Shtylyov [EMAIL PROTECTED], Simon Blake [EMAIL PROTECTED], John 
Philips [EMAIL PROTECTED], netdev@vger.kernel.org

The interrupt status register for the natsemi chips is clear on read and
was read unconditionally from both the interrupt and from the NAPI poll
routine, meaning that if the interrupt service routine was called (for 
example, due to a shared interrupt) while a NAPI poll was scheduled
interrupts could be missed.  This patch fixes that by ensuring that the
interrupt status register is only read when there is no poll scheduled.

It also reverts a workaround for this problem from the netpoll hook.

Thanks to Sergei Shtylyov [EMAIL PROTECTED] for spotting the
issue and Simon Blake [EMAIL PROTECTED] for testing resources.

Signed-Off-By: Mark Brown [EMAIL PROTECTED]

Index: linux-2.6/drivers/net/natsemi.c
===
--- linux-2.6.orig/drivers/net/natsemi.c2007-03-11 02:32:43.0 
+
+++ linux-2.6/drivers/net/natsemi.c 2007-03-11 12:09:14.0 +
@@ -571,6 +571,8 @@
int oom;
/* Interrupt status */
u32 intr_status;
+   int poll_active;
+   spinlock_t intr_lock;
/* Do not touch the nic registers */
int hands_off;
/* Don't pay attention to the reported link state. */
@@ -812,9 +814,11 @@
pci_set_drvdata(pdev, dev);
np-iosize = iosize;
spin_lock_init(np-lock);
+   spin_lock_init(np-intr_lock);
np-msg_enable = (debug = 0) ? (1debug)-1 : NATSEMI_DEF_MSG;
np-hands_off = 0;
np-intr_status = 0;
+   np-poll_active = 0;
np-eeprom_size = natsemi_pci_info[chip_idx].eeprom_size;
if (natsemi_pci_info[chip_idx].flags  NATSEMI_FLAG_IGNORE_PHY)
np-ignore_phy = 1;
@@ -1406,6 +1410,8 @@
writel(rfcr, ioaddr + RxFilterAddr);
 }
 
+/* MUST be called so that both NAPI poll and ISR are excluded due to
+ * use of intr_status. */
 static void reset_rx(struct net_device *dev)
 {
int i;
@@ -2118,30 +2124,45 @@
struct net_device *dev = dev_instance;
struct netdev_private *np = netdev_priv(dev);
void __iomem * ioaddr = ns_ioaddr(dev);
+   unsigned long flags;
+   irqreturn_t status = IRQ_NONE;
 
if (np-hands_off)
return IRQ_NONE;
 
-   /* Reading automatically acknowledges. */
-   np-intr_status = readl(ioaddr + IntrStatus);
-
-   if (netif_msg_intr(np))
-  

Re: [PATCH] natsemi: netpoll fixes

2007-03-10 Thread Sergei Shtylyov

Hello.

Mark Huth wrote:


 #ifdef CONFIG_NET_POLL_CONTROLLER
 static void natsemi_poll_controller(struct net_device *dev)
 {
+ struct netdev_private *np = netdev_priv(dev);
+
  disable_irq(dev-irq);
- intr_handler(dev-irq, dev);
+
+ /*
+  * A real interrupt might have already reached us at this point
+  * but NAPI might still haven't called us back.  As the
interrupt
+  * status register is cleared by reading, we should prevent an
+  * interrupt loss in this case...
+  */
+ if (!np-intr_status)
+ intr_handler(dev-irq, dev);
+
  enable_irq(dev-irq);


   Oops, I was going to recast the patch but my attention switched elsewhere 
for couple of days, and it slipped into mainline. I'm now preparing a better 
patch to also protect...



Is it possible for this to run at the same time as the NAPI poll?  If so
then it is possible for the netpoll poll to run between np-intr_status
being cleared and netif_rx_complete() being called.  If the hardware
asserts an interrupt at the wrong moment then this could cause the


Well, there is a whole task of analyzing the netpoll conditions under 
smp.  There appears to me to be a race with netpoll and NAPI on another 
processor, given that netpoll can be called with virtually any system 
condition on a debug breakpoint or crash dump initiation.  I'm spending 
some time looking into it, but don't have a smoking gun immediately.  
Regardless, if such a condition does exist, it is shared across many or 
all of the potential netpolled devices.  Since that is exactly the 
condition  the suggested patch purports to solve, it is pointless if the 
whole NAPI/netpoll race exists.  Such a race would lead to various and 
imaginative failures in the system.  So don't fix that problem in a 
particular driver.  If it exists, fix it generally in the netpoll/NAPI 
infrastructure.


   Any takers? :-)


In any case, this is a problem independently of netpoll if the chip
shares an interrupt with anything so the interrupt handler should be
fixed to cope with this situation instead.


Yes, that would appear so.  If an interrupt line is shared with this 
device, then the interrupt handler can be called again, even though the 
device's interrupts are disabled on the interface.  So, in the actual 
interrupt handler, check the dev-state __LINK_STATE_SCHED flag - if 
it's set, leave immediately, it can't be our interrupt. If it's clear, 
read the irq enable hardware register.  If enabled, do the rest of the 
interrupt handler.


   It seems that there's no need to read it, as it gets set to 0 
synchronously with setting the 'hands_off' flag (except in NAPI callback)...


Since the isr is disabled only by the interrupt 
handler, and enabled only by the poll routine, the race on the interrupt 
cause register is prevented.  And, as a byproduct, the netpoll race is 
also prevented.  You could just always read the isr enable hardware 
register, but that means you always do an operation to the chip, which 
can be painfully slow.


   Yeah, it seems currently unjustified.  However IntrEnable would have been 
an ultimate criterion on taking or ignoring an interrupt otherwise...


I guess the tradeoff depends on the probability 
of getting the isr called when NAPI is active for the device.


   Didn't get it... :-/

If this results in netpoll not getting a packet right away, that's okay, 
since the netpoll users should try again.


   Well, in certain stuations (like KGDBoE), netpoll callback being called 
*while* NAPI callback is being executed would mean a deadlock, I think (as 
NAPI callback will never complete)...

   BTW, it seems I've found another interrupt lossage path in the driver:

netdev_poll() - netdev_rx() - reset_rx()

If the netdev_rx() detects an oversized packet, it will call reset_rx() which 
will spin in a loop collecting interrupt status until it sees RxResetDone 
there.  The issue is 'intr_status' field will get overwritten and interrupt 
status lost after netdev_rx() returns to netdev_poll().  How do you think, is 
this corner case worth fixing (by moving netdev_rx() call to the top of a 
do/while loop)?



Mark Huth


WBR, Sergei
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] natsemi: netpoll fixes

2007-03-06 Thread Jeff Garzik

Sergei Shtylyov wrote:

Fix two issues in this driver's netpoll path: one usual, with spin_unlock_irq()
enabling interrupts which nobody asks it to do (that has been fixed recently in
a number of drivers) and one unusual, with poll_controller() method possibly
causing loss of interrupts due to the interrupt status register being cleared
by a simple read and the interrpupt handler simply storing it, not accumulating.

Signed-off-by: Sergei Shtylyov [EMAIL PROTECTED]

---
 drivers/net/natsemi.c |   24 +++-
 1 files changed, 19 insertions(+), 5 deletions(-)


applied


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] natsemi: netpoll fixes

2007-03-05 Thread Sergei Shtylyov
Fix two issues in this driver's netpoll path: one usual, with spin_unlock_irq()
enabling interrupts which nobody asks it to do (that has been fixed recently in
a number of drivers) and one unusual, with poll_controller() method possibly
causing loss of interrupts due to the interrupt status register being cleared
by a simple read and the interrpupt handler simply storing it, not accumulating.

Signed-off-by: Sergei Shtylyov [EMAIL PROTECTED]

---
 drivers/net/natsemi.c |   24 +++-
 1 files changed, 19 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/net/natsemi.c
===
--- linux-2.6.orig/drivers/net/natsemi.c
+++ linux-2.6/drivers/net/natsemi.c
@@ -2024,6 +2024,7 @@ static int start_tx(struct sk_buff *skb,
struct netdev_private *np = netdev_priv(dev);
void __iomem * ioaddr = ns_ioaddr(dev);
unsigned entry;
+   unsigned long flags;
 
/* Note: Ordering is important here, set the field with the
   ownership bit last, and only then increment cur_tx. */
@@ -2037,7 +2038,7 @@ static int start_tx(struct sk_buff *skb,
 
np-tx_ring[entry].addr = cpu_to_le32(np-tx_dma[entry]);
 
-   spin_lock_irq(np-lock);
+   spin_lock_irqsave(np-lock, flags);
 
if (!np-hands_off) {
np-tx_ring[entry].cmd_status = cpu_to_le32(DescOwn | skb-len);
@@ -2056,7 +2057,7 @@ static int start_tx(struct sk_buff *skb,
dev_kfree_skb_irq(skb);
np-stats.tx_dropped++;
}
-   spin_unlock_irq(np-lock);
+   spin_unlock_irqrestore(np-lock, flags);
 
dev-trans_start = jiffies;
 
@@ -,6 +2223,8 @@ static void netdev_rx(struct net_device 
pkt_len = (desc_status  DescSizeMask) - 4;
if ((desc_status(DescMore|DescPktOK|DescRxLong)) != DescPktOK){
if (desc_status  DescMore) {
+   unsigned long flags;
+
if (netif_msg_rx_err(np))
printk(KERN_WARNING
%s: Oversized(?) Ethernet 
@@ -2236,12 +2239,12 @@ static void netdev_rx(struct net_device 
 * reset procedure documented in
 * AN-1287. */
 
-   spin_lock_irq(np-lock);
+   spin_lock_irqsave(np-lock, flags);
reset_rx(dev);
reinit_rx(dev);
writel(np-ring_dma, ioaddr + RxRingPtr);
check_link(dev);
-   spin_unlock_irq(np-lock);
+   spin_unlock_irqrestore(np-lock, flags);
 
/* We'll enable RX on exit from this
 * function. */
@@ -2396,8 +2399,19 @@ static struct net_device_stats *get_stat
 #ifdef CONFIG_NET_POLL_CONTROLLER
 static void natsemi_poll_controller(struct net_device *dev)
 {
+   struct netdev_private *np = netdev_priv(dev);
+
disable_irq(dev-irq);
-   intr_handler(dev-irq, dev);
+
+   /*
+* A real interrupt might have already reached us at this point
+* but NAPI might still haven't called us back.  As the interrupt
+* status register is cleared by reading, we should prevent an
+* interrupt loss in this case...
+*/
+   if (!np-intr_status)
+   intr_handler(dev-irq, dev);
+
enable_irq(dev-irq);
 }
 #endif

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] natsemi: netpoll fixes

2007-03-05 Thread Mark Brown
On Tue, Mar 06, 2007 at 12:10:08AM +0400, Sergei Shtylyov wrote:

  #ifdef CONFIG_NET_POLL_CONTROLLER
  static void natsemi_poll_controller(struct net_device *dev)
  {
 + struct netdev_private *np = netdev_priv(dev);
 +
   disable_irq(dev-irq);
 - intr_handler(dev-irq, dev);
 +
 + /*
 +  * A real interrupt might have already reached us at this point
 +  * but NAPI might still haven't called us back.  As the interrupt
 +  * status register is cleared by reading, we should prevent an
 +  * interrupt loss in this case...
 +  */
 + if (!np-intr_status)
 + intr_handler(dev-irq, dev);
 +
   enable_irq(dev-irq);

Is it possible for this to run at the same time as the NAPI poll?  If so
then it is possible for the netpoll poll to run between np-intr_status
being cleared and netif_rx_complete() being called.  If the hardware
asserts an interrupt at the wrong moment then this could cause the 

In any case, this is a problem independently of netpoll if the chip
shares an interrupt with anything so the interrupt handler should be
fixed to cope with this situation instead.

-- 
You grabbed my hand and we fell into it, like a daydream - or a fever.


signature.asc
Description: Digital signature


Re: [PATCH] natsemi: netpoll fixes

2007-03-05 Thread Mark Brown
[Once more with CCs]

On Tue, Mar 06, 2007 at 12:10:08AM +0400, Sergei Shtylyov wrote:

  #ifdef CONFIG_NET_POLL_CONTROLLER
  static void natsemi_poll_controller(struct net_device *dev)
  {
 + struct netdev_private *np = netdev_priv(dev);
 +
   disable_irq(dev-irq);
 - intr_handler(dev-irq, dev);
 +
 + /*
 +  * A real interrupt might have already reached us at this point
 +  * but NAPI might still haven't called us back.  As the
 interrupt
 +  * status register is cleared by reading, we should prevent an
 +  * interrupt loss in this case...
 +  */
 + if (!np-intr_status)
 + intr_handler(dev-irq, dev);
 +
   enable_irq(dev-irq);

Is it possible for this to run at the same time as the NAPI poll?  If so
then it is possible for the netpoll poll to run between np-intr_status
being cleared and netif_rx_complete() being called.  If the hardware
asserts an interrupt at the wrong moment then this could cause the

In any case, this is a problem independently of netpoll if the chip
shares an interrupt with anything so the interrupt handler should be
fixed to cope with this situation instead.

--
You grabbed my hand and we fell into it, like a daydream - or a fever.


signature.asc
Description: Digital signature