Re: [PATCH 4/11] use ether_addr_equal_64bits

2013-12-31 Thread Ben Greear
org/majordomo-info.html -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.h

Re: [PATCH 4/11] use ether_addr_equal_64bits

2013-12-31 Thread Ben Greear
On 12/31/2013 08:09 AM, Julia Lawall wrote: On Tue, 31 Dec 2013, Ben Greear wrote: On 12/30/2013 10:32 PM, Julia Lawall wrote: I'm just thinking of a programmer, e.g. changing a struct like this: struct foo { u8 addr[ETH_ALEN]; - u16 dummy; }; I don't know of a way to catch
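
A note on the concern quoted above: ether_addr_equal_64bits() reads each address as a 64-bit word, so every 6-byte address must be followed by at least 2 more readable bytes. One possible way to catch the removal of a trailing field at build time is a compile-time size check. The following is only a userspace sketch under that assumption; `struct foo` mirrors the example in the thread, and the `BYTES_AFTER` macro is a hypothetical helper, not a kernel API.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

#define ETH_ALEN 6

/* Hypothetical struct mirroring the one quoted in the thread. */
struct foo {
    uint8_t  addr[ETH_ALEN];
    uint16_t dummy;   /* doubles as the 2 bytes that must follow addr */
};

/* Bytes of the enclosing struct that lie after the given member. */
#define BYTES_AFTER(type, member) \
    (sizeof(type) - offsetof(type, member) - sizeof(((type *)0)->member))

/* The build breaks here if someone deletes 'dummy'. */
static_assert(BYTES_AFTER(struct foo, addr) >= 2,
              "addr must be followed by at least 2 bytes");
```

The check counts any bytes inside the struct after `addr`, including compiler padding, so it only guards the case where the address ends up last with nothing behind it.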

Re: Is __ffs64 supposed to be zero based?

2013-11-06 Thread Ben Greear
On 11/06/2013 03:52 AM, Clemens Ladisch wrote: > Ben Greear wrote: >> Similarly named methods elsewhere seem to indicate it is supposed to be >> ones-based counting (i.e., bit (1<<0) would be considered 'bit 1'). > > ffs() is defined to use one-based counting: > <htt
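
To make the two conventions in this thread concrete, here is a small userspace sketch. It does not reproduce the kernel's __ffs64(); both helper names are hypothetical, and the code assumes the GCC/Clang builtin `__builtin_ctzll`. It contrasts a zero-based "index of lowest set bit" with the one-based convention that POSIX ffs() uses.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-based index of the lowest set bit: bit (1<<0) is bit 0.
 * The result is undefined for word == 0, so the sketch asserts on it. */
static inline unsigned int lowest_bit_zero_based(uint64_t word)
{
    assert(word != 0);
    return (unsigned int)__builtin_ctzll(word);
}

/* One-based, ffs()-style: bit (1<<0) is bit 1, and 0 means "no bit set". */
static inline unsigned int lowest_bit_one_based(uint64_t word)
{
    return word ? (unsigned int)__builtin_ctzll(word) + 1 : 0;
}
```

With these helpers, the value `1ULL << 0` yields 0 in the zero-based form and 1 in the one-based form, which is exactly the off-by-one the question is about.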

Re: Is __ffs64 supposed to be zero based?

2013-11-06 Thread Ben Greear
On 11/06/2013 03:52 AM, Clemens Ladisch wrote: Ben Greear wrote: Similarly named methods elsewhere seem to indicate it is supposed to be ones-based counting (i.e., bit (1<<0) would be considered 'bit 1'). ffs() is defined to use one-based counting: http://pubs.opengroup.org/onlinepubs/9699919799

Is __ffs64 supposed to be zero based?

2013-11-05 Thread Ben Greear
Similarly named methods elsewhere seem to indicate it is supposed to be ones-based counting (i.e., bit (1<<0) would be considered 'bit 1'). Thanks, Ben

3.11.0-rc1: Mellanox mlx5 fails to compile on 32-bit kernels

2013-07-17 Thread Ben Greear
Seems there is a 64-bit division in there somewhere. Thanks, Ben
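
For background on this kind of failure: on 32-bit builds the kernel does not link the compiler's 64-bit division helpers, so a plain `/` on 64-bit operands fails at link time, and code is expected to use helpers such as do_div() or div_u64_rem() instead. The sketch below is a userspace illustration of what such a helper has to do, using only shifts, compares and subtraction. The name `div_u64_rem_sketch` is hypothetical and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Divide a 64-bit value by a 32-bit divisor with binary long division.
 * Returns the quotient and stores the remainder in *remainder.
 * The caller must pass a non-zero divisor. */
static uint64_t div_u64_rem_sketch(uint64_t dividend, uint32_t divisor,
                                   uint32_t *remainder)
{
    uint64_t quotient = 0;
    uint64_t rem = 0;

    assert(divisor != 0);
    for (int bit = 63; bit >= 0; bit--) {
        /* Bring down the next bit of the dividend. */
        rem = (rem << 1) | ((dividend >> bit) & 1);
        if (rem >= divisor) {
            rem -= divisor;
            quotient |= (uint64_t)1 << bit;
        }
    }
    *remainder = (uint32_t)rem;
    return quotient;
}
```

The running remainder always stays below the divisor before each shift, so it never overflows the 64-bit accumulator.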

Re: kernel panic in skb_copy_bits

2013-06-29 Thread Ben Greear
On 06/29/2013 09:26 AM, Eric Dumazet wrote: On Sat, 2013-06-29 at 09:11 -0700, Ben Greear wrote: Do you know if your patch should go in 3.9? Yes it should. Ok, I'll add that to my tree. Your test case sounds a bit like what gives us the rare crash in tcp_collapse (we have lots

Re: kmemleak reports in kernel 3.9.5+

2013-06-17 Thread Ben Greear
On 06/13/2013 08:50 AM, Catalin Marinas wrote: On Wed, Jun 12, 2013 at 01:28:13AM +0100, Ben Greear wrote: On 06/11/2013 12:52 PM, Ben Greear wrote: On 06/10/2013 03:32 PM, Catalin Marinas wrote: On 10 June 2013 19:22, Ben Greear wrote: We had a system go OOM while doing lots of wireless

Question on rcu_access_pointer, rcu_assign_pointer and locking.

2013-06-13 Thread Ben Greear
h this code within a single RCU period? I think that if the rcu_assign_pointer logic wasn't 'published' before a second thread came through this logic it could cause this leakage? The actual code I'm curious about is in net/mac80211/scan.c, in the cfg80211_bss_update method. Thanks, Ben
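
The scenario in this question can be illustrated outside the kernel. If two updaters each read the old pointer, publish a replacement, and then free what they read, one replacement can be overwritten without ever being freed. The sketch below is a userspace model with C11 atomics, not the mac80211 or cfg80211 code; `struct bss_entry`, `published` and `replace_entry` are hypothetical names. It shows one way to make each updater responsible for exactly the entry it displaced. In the kernel, updaters are normally serialized by a lock, and the old entry may only be freed after an RCU grace period.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a scan-result entry. */
struct bss_entry {
    int signal;
};

/* The currently published entry, read by lock-free readers. */
static _Atomic(struct bss_entry *) published;

/* Publish 'fresh' and return the entry it displaced (NULL if none).
 * Because the swap is a single atomic exchange, two concurrent callers
 * can never both receive the same old pointer, so nothing is lost. */
static struct bss_entry *replace_entry(struct bss_entry *fresh)
{
    return atomic_exchange_explicit(&published, fresh,
                                    memory_order_acq_rel);
}
```

A caller would allocate the new entry, call `replace_entry()`, and release the returned pointer once no reader can still hold it.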

Re: kmemleak reports in kernel 3.9.5+

2013-06-11 Thread Ben Greear
On 06/11/2013 12:52 PM, Ben Greear wrote: On 06/10/2013 03:32 PM, Catalin Marinas wrote: On 10 June 2013 19:22, Ben Greear wrote: We had a system go OOM while doing lots of wireless stations. (System had 8GB of RAM, so I suspect a leak). I enabled kmemleak in a 3.9.5 (plus some local

Re: kmemleak reports in kernel 3.9.5+

2013-06-11 Thread Ben Greear
On 06/10/2013 03:32 PM, Catalin Marinas wrote: On 10 June 2013 19:22, Ben Greear wrote: We had a system go OOM while doing lots of wireless stations. (System had 8GB of RAM, so I suspect a leak). I enabled kmemleak in a 3.9.5 (plus some local patches) and I see the entries below. Any idea

kmemleak reports in kernel 3.9.5+

2013-06-10 Thread Ben Greear
_rcu.clone.1+0x58/0x22a [] call_rcu+0x17/0x19 [] put_object+0x46/0x4a [] delete_object_full+0x2d/0x32 [] kmemleak_free+0x59/0x7a [] slab_free_hook+0x21/0x87 [] kmem_cache_free+0xbe/0x15d [] final_putname+0x38/0x3c

Re: [PATCH v3] Fix lockup related to stop_machine being stuck in __do_softirq.

2013-06-10 Thread Ben Greear
On 06/06/2013 02:40 PM, Tejun Heo wrote: On Thu, Jun 06, 2013 at 02:29:49PM -0700, gree...@candelatech.com wrote: From: Ben Greear The stop machine logic can lock up if all but one of the migration threads make it through the disable-irq step and the one remaining thread gets stuck

kmemleak reports in kernel 3.9.5+

2013-06-10 Thread Ben Greear
[815de663] kmemleak_free+0x59/0x7a [8118bc0a] slab_free_hook+0x21/0x87 [8118e888] kmem_cache_free+0xbe/0x15d [811a51c6] final_putname+0x38/0x3c

Re: [PATCH v2] Fix lockup related to stop_machine being stuck in __do_softirq.

2013-06-06 Thread Ben Greear
add a link to the email thread.. The commit message and patch has enough info that I think an interested party could find the email thread easily enough if they needed more history. And, much of the email thread is me running in circles thinking I am going insane :) Thanks, Ben

Re: stop_machine lockup issue in 3.9.y.

2013-06-06 Thread Ben Greear
On 06/06/2013 01:55 PM, Tejun Heo wrote: Hello, Ben. On Wed, Jun 05, 2013 at 08:41:01PM -0700, Ben Greear wrote: On 06/05/2013 08:26 PM, Eric Dumazet wrote: On Wed, 2013-06-05 at 20:14 -0700, Tejun Heo wrote: Ah, so, that's why it's showing up now. We probably have had the same issue all

Re: stop_machine lockup issue in 3.9.y.

2013-06-05 Thread Ben Greear
On 06/05/2013 08:46 PM, Eric Dumazet wrote: On Wed, 2013-06-05 at 20:41 -0700, Ben Greear wrote: On 06/05/2013 08:26 PM, Eric Dumazet wrote: On Wed, 2013-06-05 at 20:14 -0700, Tejun Heo wrote: Ah, so, that's why it's showing up now. We probably have had the same issue all along but it used

Re: stop_machine lockup issue in 3.9.y.

2013-06-05 Thread Ben Greear
agree on the max number of loops (and if indeed my version of the patch is acceptable). Thanks, Ben

Re: stop_machine lockup issue in 3.9.y.

2013-06-05 Thread Ben Greear
On 06/05/2013 02:11 PM, Tejun Heo wrote: (cc'ing wireless crowd, tglx and Ingo. The original thread is at http://thread.gmane.org/gmane.linux.kernel/1500158/focus=55005 ) Hello, Ben. On Wed, Jun 05, 2013 at 01:58:31PM -0700, Ben Greear wrote: Hmm, wonder if I found it. I previously saw

Re: stop_machine lockup issue in 3.9.y.

2013-06-05 Thread Ben Greear
On 06/05/2013 12:31 PM, Ben Greear wrote: This is no longer really about the module unlink, so changing subject. On 06/05/2013 12:11 PM, Ben Greear wrote: On 06/05/2013 11:48 AM, Tejun Heo wrote: Hello, Ben. On Wed, Jun 05, 2013 at 09:59:00AM -0700, Ben Greear wrote: One pattern I notice

stop_machine lockup issue in 3.9.y.

2013-06-05 Thread Ben Greear
This is no longer really about the module unlink, so changing subject. On 06/05/2013 12:11 PM, Ben Greear wrote: On 06/05/2013 11:48 AM, Tejun Heo wrote: Hello, Ben. On Wed, Jun 05, 2013 at 09:59:00AM -0700, Ben Greear wrote: One pattern I notice repeating for at least most of the hangs

Re: Please add to stable: module: don't unlink the module until we've removed all exposure.

2013-06-05 Thread Ben Greear
On 06/05/2013 11:48 AM, Tejun Heo wrote: Hello, Ben. On Wed, Jun 05, 2013 at 09:59:00AM -0700, Ben Greear wrote: One pattern I notice repeating for at least most of the hangs is that all but one CPU thread has irqs disabled and is in state 2. But, there will be one thread in state 1

Re: 3.9.x: Possible race related to stop_machine leads to lockup.

2013-06-05 Thread Ben Greear
On 06/04/2013 09:41 PM, Rusty Russell wrote: Ben Greear writes: On 06/04/2013 02:18 PM, Ben Greear wrote: I've been trying to figure out why I see the migration/* processes hang in a busy loop While reading the stop_machine.c file, I think I might have an answer. The set_state() method

Re: 3.9.x: Possible race related to stop_machine leads to lockup.

2013-06-04 Thread Ben Greear
On 06/04/2013 02:18 PM, Ben Greear wrote: I've been trying to figure out why I see the migration/* processes hang in a busy loop While reading the stop_machine.c file, I think I might have an answer. The set_state() method sets the thread_ack to the current number of threads. Each
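
The acknowledgement counter described in this message can be modelled in a few lines. The following is a simplified userspace sketch using C11 atomics, not the code from stop_machine.c; the state names and `struct sm_data` are assumptions made for illustration. Setting a state arms the counter with the number of threads, each thread acknowledges once, and the last acknowledgement advances to the next state.

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative state names; the real enum may differ. */
enum sm_state { SM_NONE, SM_PREPARE, SM_DISABLE_IRQ, SM_RUN, SM_EXIT };

struct sm_data {
    int num_threads;        /* threads that must acknowledge each state */
    atomic_int state;       /* current state, polled by every thread */
    atomic_int thread_ack;  /* acknowledgements still outstanding */
};

/* Arm the counter first, then make the new state visible. */
static void set_state(struct sm_data *d, int new_state)
{
    assert(d->num_threads > 0);
    atomic_store(&d->thread_ack, d->num_threads);
    atomic_store(&d->state, new_state);
}

/* Called once per thread per state; the last caller moves things on. */
static void ack_state(struct sm_data *d)
{
    if (atomic_fetch_sub(&d->thread_ack, 1) == 1)
        set_state(d, atomic_load(&d->state) + 1);
}
```

In this model, if a single thread never reaches `ack_state()`, the counter stays at 1 and every other thread keeps spinning on the old state, which matches the shape of the hang discussed in these threads.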

3.9.x: Possible race related to stop_machine leads to lockup.

2013-06-04 Thread Ben Greear
. Does this make sense? Any ideas on how to fix this properly? Thanks, Ben

Re: Please add to stable: module: don't unlink the module until we've removed all exposure.

2013-06-04 Thread Ben Greear
On 06/04/2013 09:53 AM, Ben Greear wrote: On 06/04/2013 07:07 AM, Joe Lawrence wrote: On Tue, 04 Jun 2013 15:26:28 +0930 Rusty Russell wrote: Do you have a backtrace of the 3.9.4 crash? You can add "CFLAGS_module.o = -O0" to get a clearer backtrace if you want... Hi Rusty,

Re: Please add to stable: module: don't unlink the module until we've removed all exposure.

2013-06-04 Thread Ben Greear
s at 1, so no progress is ever made. http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg443471.html Thanks, Ben

Re: Please add to stable: module: don't unlink the module until we've removed all exposure.

2013-06-03 Thread Ben Greear
On 06/03/2013 08:59 AM, Ben Greear wrote: On 06/03/2013 07:17 AM, Joe Lawrence wrote: Hi Rusty, I had pointed Ben (offlist) to that bugzilla entry without realizing there were other earlier related fixes in this space. Re-viewing bz-58011, it looks like it was opened against 3.8.12, while

Re: Please add to stable: module: don't unlink the module until we've removed all exposure.

2013-06-03 Thread Ben Greear
ists in 3.9.4 for me, so I will also try applying that other kobject patch and continue testing today... Thanks, Ben

Re: 3.9.4+ watchdog overflow and migration lockup soon after boot (cpu_stopper_thread related?)

2013-05-31 Thread Ben Greear
On 05/31/2013 02:40 PM, Ben Greear wrote: On 05/31/2013 12:22 PM, Ben Greear wrote: While trying to verify that the kobject patch (see "Please add to stable: module: don't unlink the module until we've removed all exposure." email) fixed the problems I was seeing, I hit what

Re: 3.9.4+ watchdog overflow and migration lockup soon after boot (cpu_stopper_thread related?)

2013-05-31 Thread Ben Greear
On 05/31/2013 12:22 PM, Ben Greear wrote: While trying to verify that the kobject patch (see "Please add to stable: module: don't unlink the module until we've removed all exposure." email) fixed the problems I was seeing, I hit what I believe is a different problem. Much harder to

Please add to stable: module: don't unlink the module until we've removed all exposure.

2013-05-31 Thread Ben Greear
it Thanks, Ben

Re: Question on mod_sysfs_init and kobject_put in error handling code.

2013-05-30 Thread Ben Greear
On 05/30/2013 12:39 PM, Ben Greear wrote: I'm seeing a crash (on hacked 3.9.3+ kernels). It's rare, but in a kernel larded down with debugging, we are having some luck reproducing it. Please note, this kernel is running a fair amount of my patches, so it could be my bug. We did not see

Question on mod_sysfs_init and kobject_put in error handling code.

2013-05-30 Thread Ben Greear
); if (kobj) { printk(KERN_ERR "%s: module is already loaded\n", mod->name); kobject_put(kobj); err = -EINVAL; goto out; }

Re: [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets

2013-05-28 Thread Ben Greear
On 05/28/2013 09:15 AM, Rafael Aquini wrote: On Tue, May 28, 2013 at 09:00:45AM -0700, Ben Greear wrote: On 05/27/2013 03:41 PM, Francois Romieu wrote: atom...@redhat.com : [...] Failed GFP_ATOMIC allocations by the network stack result in dropped packets, which will be received

Re: [Patch v2] skbuff: Hide GFP_ATOMIC page allocation failures for dropped packets

2013-05-28 Thread Ben Greear
paper over it just because some shit ends in your backyard. We should rate-limit these messages at least. When a system is low on memory the logs can quickly fill up with useless OOM messages, further slowing the system... Ben
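
The rate-limiting suggested here follows a common pattern: allow a small burst of messages per time window and count the rest as suppressed. The kernel has its own facilities for this, such as printk_ratelimited() and net_ratelimit(). The sketch below is only a userspace illustration of the idea; `struct ratelimit` and `ratelimit_ok` are hypothetical names, and the caller supplies the timestamps.

```c
#include <stdbool.h>
#include <stdint.h>

struct ratelimit {
    uint64_t interval_ms;      /* length of one window */
    unsigned int burst;        /* messages allowed per window */
    uint64_t window_start_ms;  /* start of the current window */
    unsigned int printed;      /* messages let through in this window */
    unsigned int missed;       /* messages suppressed so far */
};

/* Returns true if a message may be logged at time now_ms. */
static bool ratelimit_ok(struct ratelimit *rl, uint64_t now_ms)
{
    /* Open a fresh window once the interval has elapsed. */
    if (now_ms - rl->window_start_ms >= rl->interval_ms) {
        rl->window_start_ms = now_ms;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->missed++;
    return false;
}
```

A logging path would call `ratelimit_ok()` before each allocation-failure message and could report the `missed` count when a new window opens, so the log stays readable under memory pressure.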

soft lockup in 3.9.3 (with local patches)

2013-05-21 Thread Ben Greear
/0xb0 [810b4954] ? kthread_freezable_should_stop+0x60/0x60 Shutting down cpus with NMI drm_kms_helper: panic occurred, switching back to text console Rebooting in 10 seconds..

Re: relayfs question related to removing parent directory.

2013-05-08 Thread Ben Greear
On 05/08/2013 01:35 PM, Al Viro wrote: On Wed, May 08, 2013 at 01:32:06PM -0700, Ben Greear wrote: I'm seeing a crash when unloading the ath9k module. It seems relay_close() is being passed bad memory. The relay_open call uses an ath9k debugfs directory, so that may be removed before the call

relayfs question related to removing parent directory.

2013-05-08 Thread Ben Greear
is removed? Thanks, Ben

Re: Jitter and latency benchmarks with netlink / nl80211

2013-04-10 Thread Ben Greear
On 04/10/2013 11:05 AM, Luis R. Rodriguez wrote: On Wed, Apr 10, 2013 at 10:59 AM, Ben Greear wrote: On 04/10/2013 10:55 AM, Luis R. Rodriguez wrote: Curious if anyone has worked on latency and jitter benchmarks in using netlink, specifically with nl80211. Has anyone benchmarked this? Ben

Re: Jitter and latency benchmarks with netlink / nl80211

2013-04-10 Thread Ben Greear
timing related to netlink configuration/report requests? We have lots of ways to measure wifi packet throughput latencies and jitter (to/from userspace and/or to/from pktgen), but it doesn't use netlink Thanks, Ben

Re: 3.7.10+: BUG Dentry still in use [unmount of cifs cifs]

2013-03-07 Thread Ben Greear
On 03/07/2013 11:19 AM, Mateusz Guzik wrote: On Tue, Mar 05, 2013 at 10:54:56AM -0800, Ben Greear wrote: In doing some CIFS testing (utilizing its feature to bind to local address..but not sure that matters), we saw this error when trying to un-mount. Our kernel is patched (nfs, some

Re: 3.7.10+: BUG Dentry still in use [unmount of cifs cifs]

2013-03-05 Thread Ben Greear
On 03/05/2013 01:09 PM, Jeff Layton wrote: On Tue, 05 Mar 2013 11:42:46 -0800 Ben Greear wrote: On 03/05/2013 11:22 AM, Jeff Layton wrote: On Tue, 5 Mar 2013 14:08:49 -0500 Jeff Layton wrote: On Tue, 05 Mar 2013 10:54:56 -0800 Ben Greear wrote: In doing some CIFS testing (utilizing

Re: 3.7.10+: BUG Dentry still in use [unmount of cifs cifs]

2013-03-05 Thread Ben Greear
On 03/05/2013 11:22 AM, Jeff Layton wrote: On Tue, 5 Mar 2013 14:08:49 -0500 Jeff Layton wrote: On Tue, 05 Mar 2013 10:54:56 -0800 Ben Greear wrote: In doing some CIFS testing (utilizing its feature to bind to local address..but not sure that matters), we saw this error when trying to un

3.7.10+: BUG Dentry still in use [unmount of cifs cifs]

2013-03-05 Thread Ben Greear
39 e3 75 3c 48 8b 93 90 00 00 00 48 RIP [] shrink_dcache_for_umount_subtree+0x84/0x194 RSP ---[ end trace 9b2978a89532c292 ]--- -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the b

3.7.10+: BUG Dentry still in use [unmount of cifs cifs]

2013-03-05 Thread Ben Greear
8800c0085dc8 ---[ end trace 9b2978a89532c292 ]--- -- Ben Greear gree...@candelatech.com Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http

Re: 3.7.10+: BUG Dentry still in use [unmount of cifs cifs]

2013-03-05 Thread Ben Greear
On 03/05/2013 11:22 AM, Jeff Layton wrote: On Tue, 5 Mar 2013 14:08:49 -0500 Jeff Layton jlay...@redhat.com wrote: On Tue, 05 Mar 2013 10:54:56 -0800 Ben Greear gree...@candelatech.com wrote: In doing some CIFS testing (utilizing its feature to bind to local address..but not sure

Re: 3.7.10+: BUG Dentry still in use [unmount of cifs cifs]

2013-03-05 Thread Ben Greear
On 03/05/2013 01:09 PM, Jeff Layton wrote: On Tue, 05 Mar 2013 11:42:46 -0800 Ben Greear gree...@candelatech.com wrote: On 03/05/2013 11:22 AM, Jeff Layton wrote: On Tue, 5 Mar 2013 14:08:49 -0500 Jeff Layton jlay...@redhat.com wrote: On Tue, 05 Mar 2013 10:54:56 -0800 Ben Greear gree

Re: [RFC PATCH 0/5] net: low latency Ethernet device polling

2013-02-27 Thread Ben Greear
as the low latency traffic at all. Have you done any tests for bulk throughput with busy-poll? Yes, it will eat a core, but that might be worth it in some cases if there was significant throughput increase... Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com

Re: [RFC PATCH 0/5] net: low latency Ethernet device polling

2013-02-27 Thread Ben Greear
as the low latency traffic at all. Have you done any tests for bulk throughput with busy-poll? Yes, it will eat a core, but that might be worth it in some cases if there was significant throughput increase... Thanks, Ben -- Ben Greear gree...@candelatech.com Candela Technologies Inc http

[tip:core/locking] lockdep: Print more info when MAX_LOCK_DEPTH is exceeded

2013-02-22 Thread tip-bot for Ben Greear
Commit-ID: c0540606837af79b2ae101e5e7b2206e3844d150 Gitweb: http://git.kernel.org/tip/c0540606837af79b2ae101e5e7b2206e3844d150 Author: Ben Greear AuthorDate: Wed, 6 Feb 2013 10:56:19 -0800 Committer: Ingo Molnar CommitDate: Tue, 19 Feb 2013 08:42:44 +0100 lockdep: Print more info when

[tip:core/locking] lockdep: Print more info when MAX_LOCK_DEPTH is exceeded

2013-02-22 Thread tip-bot for Ben Greear
Commit-ID: c0540606837af79b2ae101e5e7b2206e3844d150 Gitweb: http://git.kernel.org/tip/c0540606837af79b2ae101e5e7b2206e3844d150 Author: Ben Greear gree...@candelatech.com AuthorDate: Wed, 6 Feb 2013 10:56:19 -0800 Committer: Ingo Molnar mi...@kernel.org CommitDate: Tue, 19 Feb 2013 08:42

[tip:core/locking] lockdep: Print more info when MAX_LOCK_DEPTH is exceeded

2013-02-07 Thread tip-bot for Ben Greear
Commit-ID: 7a508076d4efdfd4fcb6fbd50a32d2c1a6e98791 Gitweb: http://git.kernel.org/tip/7a508076d4efdfd4fcb6fbd50a32d2c1a6e98791 Author: Ben Greear AuthorDate: Wed, 6 Feb 2013 10:56:19 -0800 Committer: Ingo Molnar CommitDate: Thu, 7 Feb 2013 12:17:43 +0100 lockdep: Print more info when

[tip:core/locking] lockdep: Print more info when MAX_LOCK_DEPTH is exceeded

2013-02-07 Thread tip-bot for Ben Greear
Commit-ID: 7a508076d4efdfd4fcb6fbd50a32d2c1a6e98791 Gitweb: http://git.kernel.org/tip/7a508076d4efdfd4fcb6fbd50a32d2c1a6e98791 Author: Ben Greear gree...@candelatech.com AuthorDate: Wed, 6 Feb 2013 10:56:19 -0800 Committer: Ingo Molnar mi...@kernel.org CommitDate: Thu, 7 Feb 2013 12:17

Re: [PATCH] net: mac80211/cfg.c: fix error using of sizeof()

2013-02-06 Thread Ben Greear
On 02/06/2013 09:54 AM, Ben Greear wrote: On 02/06/2013 08:27 AM, Johannes Berg wrote: On Wed, 2013-02-06 at 17:23 +0100, Cong Ding wrote: Using 'sizeof' on array given as function argument returns size of a pointer rather than the size of array. Oops, yeah, Stephen Hemminger pointed
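
For readers skimming this thread, the sizeof() pitfall described above can be sketched roughly as follows. This is an illustration only, not the code from the patch: NUM_BANDS, size_seen_by_callee() and size_intended() are hypothetical names made up for the example.

```c
#include <stddef.h>

#define NUM_BANDS 3 /* hypothetical array length, for illustration only */

/*
 * A parameter declared as `int rates[NUM_BANDS]` is adjusted by the
 * compiler to `int *rates`. It is spelled as a pointer here so the
 * example builds without a sizeof-on-array-parameter warning.
 */
size_t size_seen_by_callee(const int *rates)
{
	return sizeof(rates); /* size of a pointer, not of the array */
}

/* One possible fix: size the copy from the element type and the count. */
size_t size_intended(void)
{
	return sizeof(int) * NUM_BANDS;
}
```

A caller that holds a real `int rates[NUM_BANDS]` gets the full array size from sizeof(rates), while the callee only ever sees the pointer size, so any memcpy() sized inside the callee that way copies too few (or too many) bytes.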

Re: [PATCH] net: mac80211/cfg.c: fix error using of sizeof()

2013-02-06 Thread Ben Greear
turn 0; } -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read

Re: [PATCH] net: mac80211/cfg.c: fix error using of sizeof()

2013-02-06 Thread Ben Greear
(setup->mcast_rate)); Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-in

Re: Question on lockdep and MAX_LOCK_DEPTH

2013-02-06 Thread Ben Greear
On 02/06/2013 08:07 AM, Steven Rostedt wrote: On Wed, 2013-02-06 at 07:56 -0800, Ben Greear wrote: I'm 99% sure that the bug is in your modifications. I'm sorry, I tried to make that clear. You said it was an out of tree module, I didn't realize it had changes to the core Linux as well

Re: Question on lockdep and MAX_LOCK_DEPTH

2013-02-06 Thread Ben Greear
On 02/06/2013 05:21 AM, Steven Rostedt wrote: On Tue, 2013-02-05 at 22:23 -0800, Ben Greear wrote: On 02/05/2013 08:36 PM, Steven Rostedt wrote: On Tue, 2013-02-05 at 19:30 -0800, Ben Greear wrote: It's huge, so here's a link: http://www.candelatech.com/~greearb/debug.tgz The trace shows
