Re: [PATCH] netpoll: break recursive loop in netpoll rx path

2006-06-12 Thread Matt Mackall
On Mon, Jun 12, 2006 at 11:40:29AM -0400, Neil Horman wrote:
 Hey there-
   the netpoll system currently has a rx to tx path via:
 netpoll_rx
  __netpoll_rx
   arp_reply
netpoll_send_skb
 dev-hard_start_tx
 
   This rx-tx loop places network drivers at risk of
 inadvertently causing a deadlock or BUG halt by recursively trying
 to acquire a spinlock that is used in both their rx and tx paths
 (this problem was origionally reported to me in the 3c59x driver,
 which shares a spinlock between the boomerang_interrupt and
 boomerang_start_xmit routines).

Grumble.

 This patch breaks this loop, by queueing arp frames, so that they
 can be responded to after all receive operations have been
 completed. Tested by myself and the reported with successful
 results.

Tested how? kgdb?

 + if (likely(npi))
 + while ((skb = skb_dequeue(npi-arp_tx)) != NULL)
 + arp_reply(skb);
 +

Assignment inside tests is frowned upon. I'd prefer pulling this out
to its own function too.


Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] netpoll: break recursive loop in netpoll rx path

2006-06-12 Thread Neil Horman
On Mon, Jun 12, 2006 at 10:51:21AM -0500, Matt Mackall wrote:
 On Mon, Jun 12, 2006 at 11:40:29AM -0400, Neil Horman wrote:
  Hey there-
  the netpoll system currently has a rx to tx path via:
  netpoll_rx
   __netpoll_rx
arp_reply
 netpoll_send_skb
  dev-hard_start_tx
  
  This rx-tx loop places network drivers at risk of
  inadvertently causing a deadlock or BUG halt by recursively trying
  to acquire a spinlock that is used in both their rx and tx paths
  (this problem was origionally reported to me in the 3c59x driver,
  which shares a spinlock between the boomerang_interrupt and
  boomerang_start_xmit routines).
 
 Grumble.
 
  This patch breaks this loop, by queueing arp frames, so that they
  can be responded to after all receive operations have been
  completed. Tested by myself and the reported with successful
  results.
 
 Tested how? kgdb?
 
Specifically It was tested with netdump.  Heres the BZ with details:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=194055


  +   if (likely(npi))
  +   while ((skb = skb_dequeue(npi-arp_tx)) != NULL)
  +   arp_reply(skb);
  +
 
 Assignment inside tests is frowned upon. I'd prefer pulling this out
 to its own function too.
 
Sure, no problem.  New patch attached with suggested modifications made

Regards
Neil

Signed-off-by: Neil Horman [EMAIL PROTECTED]

 include/linux/netpoll.h |1 +
 net/core/netpoll.c  |   27 +--
 2 files changed, 26 insertions(+), 2 deletions(-)


--- linux-2.6/include/linux/netpoll.h.orig  2006-06-12 09:11:01.0 
-0400
+++ linux-2.6/include/linux/netpoll.h   2006-06-12 09:34:17.0 -0400
@@ -31,6 +31,7 @@ struct netpoll_info {
int rx_flags;
spinlock_t rx_lock;
struct netpoll *rx_np; /* netpoll that registered an rx_hook */
+   struct sk_buff_head arp_tx; /* list of arp requests to reply to */
 };
 
 void netpoll_poll(struct netpoll *np);
--- linux-2.6/net/core/netpoll.c.orig   2006-06-12 09:11:01.0 -0400
+++ linux-2.6/net/core/netpoll.c2006-06-12 13:43:25.0 -0400
@@ -54,6 +54,7 @@ static atomic_t trapped;
sizeof(struct iphdr) + sizeof(struct ethhdr))
 
 static void zap_completion_queue(void);
+static void arp_reply(struct sk_buff *skb);
 
 static void queue_process(void *p)
 {
@@ -153,8 +154,25 @@ static void poll_napi(struct netpoll *np
}
 }
 
+static void service_arp_queue(struct netpoll_info *npi)
+{
+   struct sk_buff *skb;
+
+   if(unlikely(!npi))
+   return;
+
+   skb = skb_dequeue(npi-arp_tx);
+
+   while (skb != NULL) {
+   arp_reply(skb);
+   skb = skb_dequeue(npi-arp_tx);
+   }
+   return;
+}
+
 void netpoll_poll(struct netpoll *np)
 {
+
if(!np-dev || !netif_running(np-dev) || !np-dev-poll_controller)
return;
 
@@ -163,6 +181,8 @@ void netpoll_poll(struct netpoll *np)
if (np-dev-poll)
poll_napi(np);
 
+   service_arp_queue(np-dev-npinfo);
+
zap_completion_queue();
 }
 
@@ -449,7 +469,9 @@ int __netpoll_rx(struct sk_buff *skb)
int proto, len, ulen;
struct iphdr *iph;
struct udphdr *uh;
-   struct netpoll *np = skb-dev-npinfo-rx_np;
+   struct netpoll_info *npi = skb-dev-npinfo;
+   struct netpoll *np = npi-rx_np;
+
 
if (!np)
goto out;
@@ -459,7 +481,7 @@ int __netpoll_rx(struct sk_buff *skb)
/* check if netpoll clients need ARP */
if (skb-protocol == __constant_htons(ETH_P_ARP) 
atomic_read(trapped)) {
-   arp_reply(skb);
+   skb_queue_tail(npi-arp_tx, skb);
return 1;
}
 
@@ -654,6 +676,7 @@ int netpoll_setup(struct netpoll *np)
npinfo-poll_owner = -1;
npinfo-tries = MAX_RETRIES;
spin_lock_init(npinfo-rx_lock);
+   skb_queue_head_init(npinfo-arp_tx);
} else
npinfo = ndev-npinfo;
 
-- 
/***
 *Neil Horman
 *Software Engineer
 *gpg keyid: 1024D / 0x92A74FA1 - http://pgp.mit.edu
 ***/
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html