Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-24 Thread Matt Mackall
Simon, can you test this patch? I think it's the most straightforward
2.6.24 fix.

diff -r c60016ba6237 net/core/netpoll.c
--- a/net/core/netpoll.cTue Nov 13 09:09:36 2007 -0800
+++ b/net/core/netpoll.cFri Nov 23 13:10:28 2007 -0600
@@ -203,6 +203,12 @@ static void refill_skbs(void)
spin_unlock_irqrestore(skb_pool.lock, flags);
 }
 
+/* used to mark an skb as owned by netpoll */
+static void netpoll_skb_destroy(struct sk_buff *skb)
+{
+   return;
+}
+
 static void zap_completion_queue(void)
 {
unsigned long flags;
@@ -219,10 +225,12 @@ static void zap_completion_queue(void)
while (clist != NULL) {
struct sk_buff *skb = clist;
clist = clist-next;
-   if (skb-destructor)
+   if (skb-destructor == netpoll_skb_destroy) {
+   skb-destructor = NULL;
+   __kfree_skb(skb);
+   }
+   else
dev_kfree_skb_any(skb); /* put this one back */
-   else
-   __kfree_skb(skb);
}
}
 
@@ -252,6 +260,7 @@ repeat:
 
atomic_set(skb-users, 1);
skb_reserve(skb, reserve);
+   skb-destructor = netpoll_skb_destroy;
return skb;
 }
 

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Andrew Morton
On Thu, 22 Nov 2007 19:47:35 + Simon Arlott [EMAIL PROTECTED] wrote:

 WARN during log message being output to ttyS0 and netconsole:
 
 [2059664.615816] __iptables__: init4 IN=ppp0 OUT=ppp0 WARNING: at 
 kernel/softirq.c:139 local_bh_enable()
 [2059664.620535]  [80120364] local_bh_enable+0x3c/0x97
 [2059664.620553]  [802e3356] __nf_ct_ext_destroy+0x35/0x5b
 [2059664.620569]  [802dfbeb] destroy_conntrack+0x5e/0xf6
 [2059664.620577]  [802db821] nf_conntrack_destroy+0x1f/0x3f
 [2059664.620585]  [802c0c71] __kfree_skb+0xb8/0xf6
 [2059664.620597]  [802d00f0] zap_completion_queue+0x3e/0x64
 [2059664.620606]  [802d0583] find_skb+0x14/0x6b
 [2059664.620612]  [801167cc] inc_nr_running+0x12/0x25
 [2059664.620622]  [802d0fd8] netpoll_send_udp+0x2d/0x251
 [2059664.620630]  [80226135] uart_console_write+0x2a/0x33
 [2059664.620645]  [80241319] write_msg+0x32/0x41
 [2059664.620657]  [8011c205] __call_console_drivers+0x61/0x6d
 [2059664.620669]  [8011c3fc] release_console_sem+0x164/0x1bf
 [2059664.620679]  [8011c81f] vprintk+0x27a/0x2ff
 [2059664.620692]  [8013881e] handle_IRQ_event+0x1a/0x3f
 [2059664.620702]  [801203a5] local_bh_enable+0x7d/0x97
 [2059664.620709]  [8031ae6a] fn_hash_lookup+0xb0/0xcd
 [2059664.620723]  [8011c8bf] printk+0x1b/0x1f
 [2059664.620731]  [8032b372] ipt_log_packet+0x71/0x15e
 [2059664.620747]  [8032b4a0] ipt_log_target+0x41/0x4a
 [2059664.620757]  [802ea446] ipt_limit_match+0x58/0x76
 [2059664.620766]  [8032b45f] ipt_log_target+0x0/0x4a
 [2059664.620774]  [803288c5] ipt_do_table+0x3e2/0x479
 [2059664.620785]  [80331228] xfrm_policy_put_afinfo+0xa/0x1e
 [2059664.620800]  [80332219] xfrm_lookup+0x15/0x69
 [2059664.620809]  [802db7d0] nf_iterate+0x38/0x6a
 [2059664.620822]  [802dbb60] nf_hook_slow+0x57/0xe0
 [2059664.620830]  [802f1e7c] ip_forward_finish+0x0/0x22
 [2059664.620841]  [802f20d1] ip_forward+0x233/0x2ae
 [2059664.620849]  [802f1e7c] ip_forward_finish+0x0/0x22
 [2059664.620859]  [802f0e7a] ip_rcv+0x46f/0x49c
 [2059664.620866]  [802f04a8] ip_rcv_finish+0x0/0x2ab
 [2059664.620876]  [802f0a0b] ip_rcv+0x0/0x49c
 [2059664.620884]  [802c544a] netif_receive_skb+0x326/0x3ae
 [2059664.620894]  [802c6efe] process_backlog+0x6d/0xd2
 [2059664.620903]  [802c766e] net_rx_action+0x86/0x193
 [2059664.620911]  [80120001] __do_softirq+0x40/0x85
 [2059664.620919]  [8012006d] do_softirq+0x27/0x2b
 [2059664.620925]  [8012023d] irq_exit+0x2d/0x37
 [2059664.620931]  [801062d9] do_IRQ+0x7a/0x8d
 [2059664.620943]  [8010479b] common_interrupt+0x23/0x28
 [2059664.620954]  [80209f85] acpi_processor_idle+0x235/0x3a0
 [2059664.620967]  [80102344] cpu_idle+0x43/0x6c
 [2059664.620973]  [804aa99e] start_kernel+0x210/0x215
 [2059664.620989]  [804aa317] unknown_bootoption+0x0/0x195
 [2059664.620998]  ===
 [2059664.852188] SRC=66.249.93.147 DST=* LEN=40 TOS=0x00 PREC=0x00 TTL=53 
 ID=23868 DF PROTO=TCP SPT=80 DPT=55219 WINDOW=0 RES=0x00 RST URGP=0

If that trace is to be beieved we're doing nefilter stuff on packets which
were sent across netconsole.

This probably isn't anything the netfilter guys have thought about.  And
probably we don't want them to.  Is there some simple way in which we can
exempt netconsole from netfilter processing?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Evgeniy Polyakov
On Fri, Nov 23, 2007 at 01:11:20PM -0600, Matt Mackall ([EMAIL PROTECTED]) 
wrote:
 On Fri, Nov 23, 2007 at 09:59:06PM +0300, Evgeniy Polyakov wrote:
  On Fri, Nov 23, 2007 at 09:51:01PM +0300, Evgeniy Polyakov ([EMAIL 
  PROTECTED]) wrote:
   On Fri, Nov 23, 2007 at 09:48:51PM +0300, Evgeniy Polyakov ([EMAIL 
   PROTECTED]) wrote:
Stop, we are trying to free skb without destructor and catch connection
tracking, so it is not a solution. To fix the problem we need to check
if it is not netfilter related, kind of this (not tested), Simon please
give it a try:
   
   And to be really cool we need to bypass skbs with xfrm attached, since
   its freeing also assumes BH context.
  
  What about compile options?
 
 What about my original suggestion that we mark skbs owned by netpoll
 and free only those. Much safer, no? Untested:

This should work if there are netpoll's skbs, but if we are under memory
pressure we want to free not only netpoll skbs, but at least one, and 
what if there are no netpoll skbs in the queue?

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Matt Mackall
On Fri, Nov 23, 2007 at 10:54:11PM +0300, Evgeniy Polyakov wrote:
 On Fri, Nov 23, 2007 at 01:41:39PM -0600, Matt Mackall ([EMAIL PROTECTED]) 
 wrote:
  Here's another thought: move all this logic into the networking core,
  unify it with current softirq zapper, then allow it to be called from
  various other places (like atomic allocators). Then it'll all be in
  central maintained place with more users.
 
 This can be done quite easily - put a check into __kfree_skb() if
 netpoll is compiled-in and we are in hardirq context, then put skb
 into softirq freeing queue. Then zap_completion_queue() can free
 anything without ever knowing about nature of the packet, since this
 will be checked in __kfree_skb() anyway.

What I had in mind was moving the whole zap_completion_queue concept
into net/core/skbuff. So that netpoll (and, say, atomic kmalloc) can
simply call something like clean_completion_queue.

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Evgeniy Polyakov
On Fri, Nov 23, 2007 at 10:54:10PM +0300, Evgeniy Polyakov ([EMAIL PROTECTED]) 
wrote:
 On Fri, Nov 23, 2007 at 01:41:39PM -0600, Matt Mackall ([EMAIL PROTECTED]) 
 wrote:
  Here's another thought: move all this logic into the networking core,
  unify it with current softirq zapper, then allow it to be called from
  various other places (like atomic allocators). Then it'll all be in
  central maintained place with more users.
 
 This can be done quite easily - put a check into __kfree_skb() if
 netpoll is compiled-in and we are in hardirq context, then put skb
 into softirq freeing queue. Then zap_completion_queue() can free
 anything without ever knowing about nature of the packet, since this
 will be checked in __kfree_skb() anyway.

And let's add some mess...
But should fix the case when netpoll code is being executed in interrupt
context and is about to free skb, which should not be freed.

Frankly saying this looks like crap.

Crap-added-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 758dafe..88f8ea9 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -196,10 +196,7 @@ static void zap_completion_queue(void)
while (clist != NULL) {
struct sk_buff *skb = clist;
clist = clist-next;
-   if (skb-destructor)
-   dev_kfree_skb_any(skb); /* put this one back */
-   else
-   __kfree_skb(skb);
+   __kfree_skb(skb);
}
}
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 27cfe5f..8642097 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -318,6 +318,26 @@ void kfree_skbmem(struct sk_buff *skb)
 
 void __kfree_skb(struct sk_buff *skb)
 {
+#if defined(CONFIG_NETPOLL) || defined(CONFIG_NETPOLL_TRAP)
+   if (in_irq() || irqs_disabled()) {
+   if (skb-destructor) {
+   dev_kfree_skb_irq(skb);
+   return;
+   }
+#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
+   if (skb-nfct || skb-nfct_reasm) {
+   dev_kfree_skb_irq(skb);
+   return;
+   }
+#endif
+#ifdef CONFIG_XFRM
+   if (skb-sp) {
+   dev_kfree_skb_irq(skb);
+   return;
+   }
+#endif
+   }
+#endif
dst_release(skb-dst);
 #ifdef CONFIG_XFRM
secpath_put(skb-sp);

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Matt Mackall
On Fri, Nov 23, 2007 at 09:59:06PM +0300, Evgeniy Polyakov wrote:
 On Fri, Nov 23, 2007 at 09:51:01PM +0300, Evgeniy Polyakov ([EMAIL 
 PROTECTED]) wrote:
  On Fri, Nov 23, 2007 at 09:48:51PM +0300, Evgeniy Polyakov ([EMAIL 
  PROTECTED]) wrote:
   Stop, we are trying to free skb without destructor and catch connection
   tracking, so it is not a solution. To fix the problem we need to check
   if it is not netfilter related, kind of this (not tested), Simon please
   give it a try:
  
  And to be really cool we need to bypass skbs with xfrm attached, since
  its freeing also assumes BH context.
 
 What about compile options?

What about my original suggestion that we mark skbs owned by netpoll
and free only those. Much safer, no? Untested:

diff -r c60016ba6237 net/core/netpoll.c
--- a/net/core/netpoll.cTue Nov 13 09:09:36 2007 -0800
+++ b/net/core/netpoll.cFri Nov 23 13:10:28 2007 -0600
@@ -203,6 +203,12 @@ static void refill_skbs(void)
spin_unlock_irqrestore(skb_pool.lock, flags);
 }
 
+/* used to mark an skb as owned by netpoll */
+static void netpoll_skb_destroy(struct sk_buff *skb)
+{
+   return;
+}
+
 static void zap_completion_queue(void)
 {
unsigned long flags;
@@ -219,10 +225,12 @@ static void zap_completion_queue(void)
while (clist != NULL) {
struct sk_buff *skb = clist;
clist = clist-next;
-   if (skb-destructor)
+   if (skb-destructor == netpoll_skb_destroy) {
+   skb-destructor = NULL;
+   __kfree_skb(skb);
+   }
+   else
dev_kfree_skb_any(skb); /* put this one back */
-   else
-   __kfree_skb(skb);
}
}
 
@@ -252,6 +260,7 @@ repeat:
 
atomic_set(skb-users, 1);
skb_reserve(skb, reserve);
+   skb-destructor = netpoll_skb_destroy;
return skb;
 }
 

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Matt Mackall
On Fri, Nov 23, 2007 at 08:57:57PM +0300, Evgeniy Polyakov wrote:
 On Fri, Nov 23, 2007 at 11:07:56AM -0600, Matt Mackall ([EMAIL PROTECTED]) 
 wrote:
  On Fri, Nov 23, 2007 at 01:55:19PM +0300, Evgeniy Polyakov wrote:
   On Fri, Nov 23, 2007 at 12:21:57AM -0800, Andrew Morton ([EMAIL 
   PROTECTED]) wrote:
 [2059664.615816] __iptables__: init4 IN=ppp0 OUT=ppp0 WARNING: at 
 kernel/softirq.c:139 local_bh_enable()
 [2059664.620535]  [80120364] local_bh_enable+0x3c/0x97
   
 [2059664.620657]  [8011c205] __call_console_drivers+0x61/0x6d
 [2059664.620669]  [8011c3fc] release_console_sem+0x164/0x1bf
 [2059664.620679]  [8011c81f] vprintk+0x27a/0x2ff

If that trace is to be beieved we're doing nefilter stuff on packets 
which
were sent across netconsole.

This probably isn't anything the netfilter guys have thought about.  And
probably we don't want them to.  Is there some simple way in which we 
can
exempt netconsole from netfilter processing?
   
   This is not about netfilter, but about freeing skb in interrupt context, 
   which is not allowed, and in interrupt skbs are queued to be freed in 
   softirq,
   but netcnsole wants to flush softirq freeing queue. That is a question: 
   why?
  
  My memory here is hazy, but I think this exists to rescue netconsole
  in low-memory situations. This bit originated with Ingo, so maybe he
  can recall.
  
  Netpoll can process an arbitrary number of skbs inside a single
  interrupt. Think sysrq-t at one packet per line or kgdboe where the
  entire trace session can happen inside one very long interrupt.
  
  Perhaps we can refine this to mark netpoll's skbs (perhaps with
  -destructor?) and delete only skbs we own. As these are never passed
  through any of the other route/xfrm/filter code, they should be safe
  to delete even in irq context, yes?
  
   Removing zap_completion_queue() from find_skb() will fix the warning,
   but I'm not sure this is a correct fix. I've added Matt to the Cc list.
  
  Care to try the sysrq-t or OOM message tests?
 
 We basically can not free skbs there - if it is interrupt context and
 we are freeing some skb with destructor we will catch the warning anyway.

Perhaps I'm missing some context here. We don't free skbs with
destructors in irq context in zap_completion_queue. We reinsert them on the
completion list. We do this by calling dev_kfree_skb_any.

So I'd be surprised if that was a problem. But I can imagine having
problems for skbs without destructors which run into one of these in
__kfree_skb:

dst_release
secpath_put
nf_conntrack_put
nf_conntrack_put_reasm
nf_bridge_put

..some or all of which assume a softirq context.

 No matter if we are under memory pressure or whatever - it is not
 allowed - a lot of skbs are supposed to be freed in softirq context,
 that is why dev_kfree_skb_any() exists.

Some skbs we definitely -can- free in irq context. The only ones we
care about are the ones generated by netpoll. If there's a reason you
think netpoll's own skbs can't be freed, please describe it.

 I think we can drop skbs _without_ destructor from the queue though in
 that conditions given that we actually need only one.

Huh?

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Evgeniy Polyakov
On Fri, Nov 23, 2007 at 09:48:51PM +0300, Evgeniy Polyakov ([EMAIL PROTECTED]) 
wrote:
 Stop, we are trying to free skb without destructor and catch connection
 tracking, so it is not a solution. To fix the problem we need to check
 if it is not netfilter related, kind of this (not tested), Simon please
 give it a try:

And to be really cool we need to bypass skbs with xfrm attached, since
its freeing also assumes BH context.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 758dafe..5f86e60 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -196,7 +196,8 @@ static void zap_completion_queue(void)
while (clist != NULL) {
struct sk_buff *skb = clist;
clist = clist-next;
-   if (skb-destructor)
+   if (skb-destructor || skb-nfct ||
+   skb-nfct_reasm || skb-sp)
dev_kfree_skb_any(skb); /* put this one back */
else
__kfree_skb(skb);


-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Evgeniy Polyakov
On Fri, Nov 23, 2007 at 08:57:57PM +0300, Evgeniy Polyakov ([EMAIL PROTECTED]) 
wrote:
  My memory here is hazy, but I think this exists to rescue netconsole
  in low-memory situations. This bit originated with Ingo, so maybe he
  can recall.
  
  Netpoll can process an arbitrary number of skbs inside a single
  interrupt. Think sysrq-t at one packet per line or kgdboe where the
  entire trace session can happen inside one very long interrupt.
  
  Perhaps we can refine this to mark netpoll's skbs (perhaps with
  -destructor?) and delete only skbs we own. As these are never passed
  through any of the other route/xfrm/filter code, they should be safe
  to delete even in irq context, yes?
  
   Removing zap_completion_queue() from find_skb() will fix the warning,
   but I'm not sure this is a correct fix. I've added Matt to the Cc list.
  
  Care to try the sysrq-t or OOM message tests?
 
 We basically can not free skbs there - if it is interrupt context and
 we are freeing some skb with destructor we will catch the warning anyway.
 
 No matter if we are under memory pressure or whatever - it is not
 allowed - a lot of skbs are supposed to be freed in softirq context,
 that is why dev_kfree_skb_any() exists.
 
 I think we can drop skbs _without_ destructor from the queue though in
 that conditions given that we actually need only one.

Stop, we are trying to free skb without destructor and catch connection
tracking, so it is not a solution. To fix the problem we need to check
if it is not netfilter related, kind of this (not tested), Simon please
give it a try:

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 758dafe..855bb3f 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -196,7 +196,7 @@ static void zap_completion_queue(void)
while (clist != NULL) {
struct sk_buff *skb = clist;
clist = clist-next;
-   if (skb-destructor)
+   if (skb-destructor || skb-nfct || skb-nfct_reasm)
dev_kfree_skb_any(skb); /* put this one back */
else
__kfree_skb(skb);


-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Evgeniy Polyakov
On Fri, Nov 23, 2007 at 12:21:57AM -0800, Andrew Morton ([EMAIL PROTECTED]) 
wrote:
  [2059664.615816] __iptables__: init4 IN=ppp0 OUT=ppp0 WARNING: at 
  kernel/softirq.c:139 local_bh_enable()
  [2059664.620535]  [80120364] local_bh_enable+0x3c/0x97

  [2059664.620657]  [8011c205] __call_console_drivers+0x61/0x6d
  [2059664.620669]  [8011c3fc] release_console_sem+0x164/0x1bf
  [2059664.620679]  [8011c81f] vprintk+0x27a/0x2ff
 
 If that trace is to be beieved we're doing nefilter stuff on packets which
 were sent across netconsole.
 
 This probably isn't anything the netfilter guys have thought about.  And
 probably we don't want them to.  Is there some simple way in which we can
 exempt netconsole from netfilter processing?

This is not about netfilter, but about freeing skb in interrupt context, 
which is not allowed, and in interrupt skbs are queued to be freed in softirq,
but netcnsole wants to flush softirq freeing queue. That is a question: why?

Removing zap_completion_queue() from find_skb() will fix the warning,
but I'm not sure this is a correct fix. I've added Matt to the Cc list.

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Evgeniy Polyakov
On Fri, Nov 23, 2007 at 01:41:39PM -0600, Matt Mackall ([EMAIL PROTECTED]) 
wrote:
 Here's another thought: move all this logic into the networking core,
 unify it with current softirq zapper, then allow it to be called from
 various other places (like atomic allocators). Then it'll all be in
 central maintained place with more users.

This can be done quite easily - put a check into __kfree_skb() if
netpoll is compiled-in and we are in hardirq context, then put skb
into softirq freeing queue. Then zap_completion_queue() can free
anything without ever knowing about nature of the packet, since this
will be checked in __kfree_skb() anyway.

Kind of this:

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 758dafe..88f8ea9 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -196,10 +196,7 @@ static void zap_completion_queue(void)
while (clist != NULL) {
struct sk_buff *skb = clist;
clist = clist-next;
-   if (skb-destructor)
-   dev_kfree_skb_any(skb); /* put this one back */
-   else
-   __kfree_skb(skb);
+   __kfree_skb(skb);
}
}
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 27cfe5f..f720685 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -318,6 +318,12 @@ void kfree_skbmem(struct sk_buff *skb)
 
 void __kfree_skb(struct sk_buff *skb)
 {
+#if defined(CONFIG_NETPOLL) || defined(CONFIG_NETPOLL_TRAP)
+   if (in_irq() || irqs_disabled()) {
+   dev_kfree_skb_irq(skb);
+   return;
+   }
+#endif
dst_release(skb-dst);
 #ifdef CONFIG_XFRM
secpath_put(skb-sp);

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Evgeniy Polyakov
On Fri, Nov 23, 2007 at 09:51:01PM +0300, Evgeniy Polyakov ([EMAIL PROTECTED]) 
wrote:
 On Fri, Nov 23, 2007 at 09:48:51PM +0300, Evgeniy Polyakov ([EMAIL 
 PROTECTED]) wrote:
  Stop, we are trying to free skb without destructor and catch connection
  tracking, so it is not a solution. To fix the problem we need to check
  if it is not netfilter related, kind of this (not tested), Simon please
  give it a try:
 
 And to be really cool we need to bypass skbs with xfrm attached, since
 its freeing also assumes BH context.

What about compile options?

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 758dafe..adb3c54 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -196,10 +196,25 @@ static void zap_completion_queue(void)
while (clist != NULL) {
struct sk_buff *skb = clist;
clist = clist-next;
-   if (skb-destructor)
+   if (skb-destructor) {
dev_kfree_skb_any(skb); /* put this one back */
-   else
-   __kfree_skb(skb);
+   continue;
+   }
+
+#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
+   if (skb-nfct || skb-nfct_reasm) {
+   dev_kfree_skb_any(skb); /* put this one back */
+   continue;
+   }
+#endif
+
+#ifdef CONFIG_XFRM
+   if (skb-sp) {
+   dev_kfree_skb_any(skb); /* put this one back */
+   continue;
+   }
+#endif
+   __kfree_skb(skb);
}
}
 

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Evgeniy Polyakov
On Fri, Nov 23, 2007 at 11:07:56AM -0600, Matt Mackall ([EMAIL PROTECTED]) 
wrote:
 On Fri, Nov 23, 2007 at 01:55:19PM +0300, Evgeniy Polyakov wrote:
  On Fri, Nov 23, 2007 at 12:21:57AM -0800, Andrew Morton ([EMAIL PROTECTED]) 
  wrote:
[2059664.615816] __iptables__: init4 IN=ppp0 OUT=ppp0 WARNING: at 
kernel/softirq.c:139 local_bh_enable()
[2059664.620535]  [80120364] local_bh_enable+0x3c/0x97
  
[2059664.620657]  [8011c205] __call_console_drivers+0x61/0x6d
[2059664.620669]  [8011c3fc] release_console_sem+0x164/0x1bf
[2059664.620679]  [8011c81f] vprintk+0x27a/0x2ff
   
   If that trace is to be beieved we're doing nefilter stuff on packets which
   were sent across netconsole.
   
   This probably isn't anything the netfilter guys have thought about.  And
   probably we don't want them to.  Is there some simple way in which we can
   exempt netconsole from netfilter processing?
  
  This is not about netfilter, but about freeing skb in interrupt context, 
  which is not allowed, and in interrupt skbs are queued to be freed in 
  softirq,
  but netcnsole wants to flush softirq freeing queue. That is a question: why?
 
 My memory here is hazy, but I think this exists to rescue netconsole
 in low-memory situations. This bit originated with Ingo, so maybe he
 can recall.
 
 Netpoll can process an arbitrary number of skbs inside a single
 interrupt. Think sysrq-t at one packet per line or kgdboe where the
 entire trace session can happen inside one very long interrupt.
 
 Perhaps we can refine this to mark netpoll's skbs (perhaps with
 -destructor?) and delete only skbs we own. As these are never passed
 through any of the other route/xfrm/filter code, they should be safe
 to delete even in irq context, yes?
 
  Removing zap_completion_queue() from find_skb() will fix the warning,
  but I'm not sure this is a correct fix. I've added Matt to the Cc list.
 
 Care to try the sysrq-t or OOM message tests?

We basically can not free skbs there - if it is interrupt context and
we are freeing some skb with destructor we will catch the warning anyway.

No matter if we are under memory pressure or whatever - it is not
allowed - a lot of skbs are supposed to be freed in softirq context,
that is why dev_kfree_skb_any() exists.

I think we can drop skbs _without_ destructor from the queue though in
that conditions given that we actually need only one.

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Matt Mackall
On Fri, Nov 23, 2007 at 01:55:19PM +0300, Evgeniy Polyakov wrote:
 On Fri, Nov 23, 2007 at 12:21:57AM -0800, Andrew Morton ([EMAIL PROTECTED]) 
 wrote:
   [2059664.615816] __iptables__: init4 IN=ppp0 OUT=ppp0 WARNING: at 
   kernel/softirq.c:139 local_bh_enable()
   [2059664.620535]  [80120364] local_bh_enable+0x3c/0x97
 
   [2059664.620657]  [8011c205] __call_console_drivers+0x61/0x6d
   [2059664.620669]  [8011c3fc] release_console_sem+0x164/0x1bf
   [2059664.620679]  [8011c81f] vprintk+0x27a/0x2ff
  
  If that trace is to be beieved we're doing nefilter stuff on packets which
  were sent across netconsole.
  
  This probably isn't anything the netfilter guys have thought about.  And
  probably we don't want them to.  Is there some simple way in which we can
  exempt netconsole from netfilter processing?
 
 This is not about netfilter, but about freeing skb in interrupt context, 
 which is not allowed, and in interrupt skbs are queued to be freed in softirq,
 but netcnsole wants to flush softirq freeing queue. That is a question: why?

My memory here is hazy, but I think this exists to rescue netconsole
in low-memory situations. This bit originated with Ingo, so maybe he
can recall.

Netpoll can process an arbitrary number of skbs inside a single
interrupt. Think sysrq-t at one packet per line or kgdboe where the
entire trace session can happen inside one very long interrupt.

Perhaps we can refine this to mark netpoll's skbs (perhaps with
-destructor?) and delete only skbs we own. As these are never passed
through any of the other route/xfrm/filter code, they should be safe
to delete even in irq context, yes?

 Removing zap_completion_queue() from find_skb() will fix the warning,
 but I'm not sure this is a correct fix. I've added Matt to the Cc list.

Care to try the sysrq-t or OOM message tests?

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Evgeniy Polyakov
On Fri, Nov 23, 2007 at 12:59:43PM -0600, Matt Mackall ([EMAIL PROTECTED]) 
wrote:
 So I'd be surprised if that was a problem. But I can imagine having
 problems for skbs without destructors which run into one of these in
 __kfree_skb:
 
 dst_release
 secpath_put
 nf_conntrack_put
 nf_conntrack_put_reasm
 nf_bridge_put
 
 ..some or all of which assume a softirq context.

bridging is ok, others require softirq context.
I've sent a patch (the last one should be ok) to guard against xfrm and
connection tracking.

  No matter if we are under memory pressure or whatever - it is not
  allowed - a lot of skbs are supposed to be freed in softirq context,
  that is why dev_kfree_skb_any() exists.
 
 Some skbs we definitely -can- free in irq context. The only ones we
 care about are the ones generated by netpoll. If there's a reason you
 think netpoll's own skbs can't be freed, please describe it.

Only some and to distinguish them we can not use destructor - if it is
set (even empty function) it will fire an alarm.

  I think we can drop skbs _without_ destructor from the queue though in
  that conditions given that we actually need only one.
 
 Huh?

Don't mind - friday...
I posted a patch (third one should be ok) to fix this issue.

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Matt Mackall
On Fri, Nov 23, 2007 at 10:15:24PM +0300, Evgeniy Polyakov wrote:
 On Fri, Nov 23, 2007 at 12:59:43PM -0600, Matt Mackall ([EMAIL PROTECTED]) 
 wrote:
  So I'd be surprised if that was a problem. But I can imagine having
  problems for skbs without destructors which run into one of these in
  __kfree_skb:
  
  dst_release
  secpath_put
  nf_conntrack_put
  nf_conntrack_put_reasm
  nf_bridge_put
  
  ..some or all of which assume a softirq context.
 
 bridging is ok, others require softirq context.
 I've sent a patch (the last one should be ok) to guard against xfrm and
 connection tracking.
 
   No matter if we are under memory pressure or whatever - it is not
   allowed - a lot of skbs are supposed to be freed in softirq context,
   that is why dev_kfree_skb_any() exists.
  
  Some skbs we definitely -can- free in irq context. The only ones we
  care about are the ones generated by netpoll. If there's a reason you
  think netpoll's own skbs can't be freed, please describe it.
 
 Only some and to distinguish them we can not use destructor - if it is
 set (even empty function) it will fire an alarm.

Yep, please look at the patch I just posted.

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-23 Thread Matt Mackall
On Fri, Nov 23, 2007 at 10:32:22PM +0300, Evgeniy Polyakov wrote:
 On Fri, Nov 23, 2007 at 01:11:20PM -0600, Matt Mackall ([EMAIL PROTECTED]) 
 wrote:
  On Fri, Nov 23, 2007 at 09:59:06PM +0300, Evgeniy Polyakov wrote:
   On Fri, Nov 23, 2007 at 09:51:01PM +0300, Evgeniy Polyakov ([EMAIL 
   PROTECTED]) wrote:
On Fri, Nov 23, 2007 at 09:48:51PM +0300, Evgeniy Polyakov ([EMAIL 
PROTECTED]) wrote:
 Stop, we are trying to free skb without destructor and catch 
 connection
 tracking, so it is not a solution. To fix the problem we need to check
 if it is not netfilter related, kind of this (not tested), Simon 
 please
 give it a try:

And to be really cool we need to bypass skbs with xfrm attached, since
its freeing also assumes BH context.
   
   What about compile options?
  
  What about my original suggestion that we mark skbs owned by netpoll
  and free only those. Much safer, no? Untested:
 
 This should work if there are netpoll's skbs, but if we are under memory
 pressure we want to free not only netpoll skbs, but at least one, and 
 what if there are no netpoll skbs in the queue?

Yeah, that's a concern (but note that we do have a private reserve and
we only really need the zap when our reserve is depleted). But I worry
that it's too fragile and if we add a new unsafe case, it won't be
noticed for a long time. This is the first report I've seen of this
particular problem, so this has been a latent bug for three or four
years now.

Here's another thought: move all this logic into the networking core,
unify it with current softirq zapper, then allow it to be called from
various other places (like atomic allocators). Then it'll all be in
central maintained place with more users.

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html