Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
Hi Dave, I think here is the overflow problem. Not the stackoverflow, but the array index overflow. Please have a look at the following path: numa_init() |---> numa_register_memblks() | |---> memblock_set_node(memory) set correct nid in memblock.memory | |--->

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 01:17:21PM +0800, Tang Chen wrote: > Seeing from your earlier mail, it crashed at: > > while (zonelist_zone_idx(z) > highest_zoneidx) > de: 3b 77 08 cmp 0x8(%rdi),%esi > > > I stuck this at the top of the function.. >

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
On 01/28/2014 12:47 PM, Dave Jones wrote: On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote: > On 01/28/2014 11:55 AM, Dave Jones wrote: > > On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote: > > > >> > I did a bisect with the patch above applied each step of

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
On 01/28/2014 12:47 PM, Dave Jones wrote: On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote: > On 01/28/2014 11:55 AM, Dave Jones wrote: > > On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote: > > > >> > I did a bisect with the patch above applied each step of

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote: > On 01/28/2014 11:55 AM, Dave Jones wrote: > > On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote: > > > > > > I did a bisect with the patch above applied each step of the way. > > > > This time I got a plausible

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
On 01/28/2014 11:55 AM, Dave Jones wrote: On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote: > > I did a bisect with the patch above applied each step of the way. > > This time I got a plausible looking result > > I cannot reproduce this. Would you please share how to

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote: > > I did a bisect with the patch above applied each step of the way. > > This time I got a plausible looking result > > I cannot reproduce this. Would you please share how to reproduce it ? > Or does it just happen during the

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
On 01/28/2014 10:55 AM, Dave Jones wrote: On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote: > On 01/28/2014 08:32 AM, David Rientjes wrote: > > On Wed, 22 Jan 2014, David Rientjes wrote: > > > >>>arch/x86/mm/numa.c | 2 +- > >>>1 file changed, 1 insertion(+), 1

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
On 01/28/2014 10:55 AM, Dave Jones wrote: On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote: > On 01/28/2014 08:32 AM, David Rientjes wrote: > > On Wed, 22 Jan 2014, David Rientjes wrote: > > > >>>arch/x86/mm/numa.c | 2 +- > >>>1 file changed, 1 insertion(+), 1

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote: > On 01/28/2014 08:32 AM, David Rientjes wrote: > > On Wed, 22 Jan 2014, David Rientjes wrote: > > > >>> arch/x86/mm/numa.c | 2 +- > >>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>> > >>> diff --git a/arch/x86/mm/numa.c

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
On 01/28/2014 08:32 AM, David Rientjes wrote: On Wed, 22 Jan 2014, David Rientjes wrote: arch/x86/mm/numa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c index 81b2750..ebefeb7 100644 --- a/arch/x86/mm/numa.c +++

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread David Rientjes
On Wed, 22 Jan 2014, David Rientjes wrote: > > arch/x86/mm/numa.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c > > index 81b2750..ebefeb7 100644 > > --- a/arch/x86/mm/numa.c > > +++ b/arch/x86/mm/numa.c > > @@ -562,10

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Mon, Jan 27, 2014 at 03:29:03PM +0800, Tang Chen wrote: > > Some build tests show.. > > > > MAXSMP ( NODESHIFT=10 ) : Bug > > NRCPUS=4& NODESHIFT=10 : Bug > > NRCPUS=4& NODESHIFT=1 : no bug > > > > > > The middle config test was accidental, I hadn't realised disabling MAXSMP > >

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Mon, Jan 27, 2014 at 03:29:03PM +0800, Tang Chen wrote: Some build tests show.. MAXSMP ( NODESHIFT=10 ) : Bug NRCPUS=4 NODESHIFT=10 : Bug NRCPUS=4 NODESHIFT=1 : no bug The middle config test was accidental, I hadn't realised disabling MAXSMP wouldn't reset

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread David Rientjes
On Wed, 22 Jan 2014, David Rientjes wrote: arch/x86/mm/numa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c index 81b2750..ebefeb7 100644 --- a/arch/x86/mm/numa.c +++ b/arch/x86/mm/numa.c @@ -562,10 +562,10 @@ static

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote: On 01/28/2014 08:32 AM, David Rientjes wrote: On Wed, 22 Jan 2014, David Rientjes wrote: arch/x86/mm/numa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
On 01/28/2014 10:55 AM, Dave Jones wrote: On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote: On 01/28/2014 08:32 AM, David Rientjes wrote: On Wed, 22 Jan 2014, David Rientjes wrote: arch/x86/mm/numa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
On 01/28/2014 10:55 AM, Dave Jones wrote: On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote: On 01/28/2014 08:32 AM, David Rientjes wrote: On Wed, 22 Jan 2014, David Rientjes wrote: arch/x86/mm/numa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote: I did a bisect with the patch above applied each step of the way. This time I got a plausible looking result I cannot reproduce this. Would you please share how to reproduce it ? Or does it just happen during the booting

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
On 01/28/2014 11:55 AM, Dave Jones wrote: On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote: I did a bisect with the patch above applied each step of the way. This time I got a plausible looking result I cannot reproduce this. Would you please share how to

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote: On 01/28/2014 11:55 AM, Dave Jones wrote: On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote: I did a bisect with the patch above applied each step of the way. This time I got a plausible looking result

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
On 01/28/2014 12:47 PM, Dave Jones wrote: On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote: On 01/28/2014 11:55 AM, Dave Jones wrote: On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote: I did a bisect with the patch above applied each step of the

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
On 01/28/2014 12:47 PM, Dave Jones wrote: On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote: On 01/28/2014 11:55 AM, Dave Jones wrote: On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote: I did a bisect with the patch above applied each step of the

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 01:17:21PM +0800, Tang Chen wrote: Seeing from your earlier mail, it crashed at: while (zonelist_zone_idx(z) > highest_zoneidx) de: 3b 77 08 cmp 0x8(%rdi),%esi I stuck this at the top of the function..

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen
Hi Dave, I think here is the overflow problem. Not the stackoverflow, but the array index overflow. Please have a look at the following path: numa_init() |---> numa_register_memblks() | |---> memblock_set_node(memory) set correct nid in memblock.memory | |--->

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-26 Thread Tang Chen
On 01/24/2014 06:31 AM, Dave Jones wrote: On Thu, Jan 23, 2014 at 01:58:24AM -0500, Dave Jones wrote: > 128 bytes is a pretty small amount of stack though, so I'm just as confused > as to what the actual bug here is. > > After trying the proposed fix, I got another oops in the

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-26 Thread Tang Chen
On 01/24/2014 06:31 AM, Dave Jones wrote: On Thu, Jan 23, 2014 at 01:58:24AM -0500, Dave Jones wrote: 128 bytes is a pretty small amount of stack though, so I'm just as confused as to what the actual bug here is. After trying the proposed fix, I got another oops in the early

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-23 Thread Dave Jones
On Thu, Jan 23, 2014 at 01:58:24AM -0500, Dave Jones wrote: > 128 bytes is a pretty small amount of stack though, so I'm just as confused > as to what the actual bug here is. > > After trying the proposed fix, I got another oops in the early init code.. > > > nr_free_zone_pages >

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-23 Thread Dave Jones
On Wed, Jan 22, 2014 at 10:15:51PM -0800, David Rientjes wrote: > On Thu, 23 Jan 2014, Dave Jones wrote: > > > It's 10, because I had MAXSMP set. > > > > So, MAX_NUMNODES = 1 << 10 > > > > And the bitmask is made of longs. 1024 of them. > > > > How does this work ? > > > > It's

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-23 Thread Dave Jones
On Wed, Jan 22, 2014 at 10:15:51PM -0800, David Rientjes wrote: On Thu, 23 Jan 2014, Dave Jones wrote: It's 10, because I had MAXSMP set. So, MAX_NUMNODES = 1 << 10 And the bitmask is made of longs. 1024 of them. How does this work ? It's 1024 bits. ok, I got

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-23 Thread Dave Jones
On Thu, Jan 23, 2014 at 01:58:24AM -0500, Dave Jones wrote: 128 bytes is a pretty small amount of stack though, so I'm just as confused as to what the actual bug here is. After trying the proposed fix, I got another oops in the early init code.. trace nr_free_zone_pages

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Tang Chen
On 01/23/2014 02:13 PM, Dave Jones wrote: On Wed, Jan 22, 2014 at 10:06:14PM -0800, David Rientjes wrote: > On Thu, 23 Jan 2014, Tang Chen wrote: > .. > > I guess it depends on what Dave's CONFIG_NODES_SHIFT is? It's 10, because I had MAXSMP set. So, MAX_NUMNODES = 1 << 10 And

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread David Rientjes
On Thu, 23 Jan 2014, Dave Jones wrote: > It's 10, because I had MAXSMP set. > > So, MAX_NUMNODES = 1 << 10 > > And the bitmask is made of longs. 1024 of them. > > How does this work ? > It's 1024 bits.
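
For reference, the arithmetic behind the "1024 bits" answer can be checked with a small userspace sketch. This is not kernel code. It assumes the node mask is a bitmap of MAX_NUMNODES bits stored in an array of unsigned longs, and the helper names bits_to_longs() and nodemask_bytes() are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits in one unsigned long on the build machine (64 on x86-64). */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: how many unsigned longs are needed to hold nbits
 * bits, rounding up to a whole long. */
static size_t bits_to_longs(size_t nbits)
{
    return (nbits + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;
}

/* Hypothetical helper: bytes occupied by a node bitmap when
 * MAX_NUMNODES = 1 << nodes_shift (assumed layout: one bit per node). */
static size_t nodemask_bytes(unsigned int nodes_shift)
{
    /* Guard against shifting by the full width of size_t. */
    assert(nodes_shift < sizeof(size_t) * CHAR_BIT);

    size_t max_numnodes = (size_t)1 << nodes_shift;
    return bits_to_longs(max_numnodes) * sizeof(unsigned long);
}
```

With NODES_SHIFT=10 this gives 1024 bits, not 1024 longs, which on a 64-bit build is 16 longs or 128 bytes. That agrees with the "128 bytes" figure mentioned elsewhere in the thread. With NODES_SHIFT=1 the mask fits in a single long.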

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Dave Jones
On Wed, Jan 22, 2014 at 10:06:14PM -0800, David Rientjes wrote: > On Thu, 23 Jan 2014, Tang Chen wrote: > > > Dave found that the kernel will hang during boot. This is because > > the nodemask_t type stack variable numa_kernel_nodes is large enough > > to overflow the stack. > > > > This

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread David Rientjes
On Thu, 23 Jan 2014, Tang Chen wrote: > Dave found that the kernel will hang during boot. This is because > the nodemask_t type stack variable numa_kernel_nodes is large enough > to overflow the stack. > > This doesn't always happen. According to Dave, this happened once > in about five boots.

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Andrew Morton
On Thu, 23 Jan 2014 13:49:28 +0800 Tang Chen wrote: > Dave found that the kernel will hang during boot. This is because > the nodemask_t type stack variable numa_kernel_nodes is large enough > to overflow the stack. > > This doesn't always happen. According to Dave, this happened once > in

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Dave Jones
On Thu, Jan 23, 2014 at 01:49:28PM +0800, Tang Chen wrote: > This doesn't always happen. According to Dave, this happened once > in about five boots. The backtrace is like the following: > > dump_stack > panic > ? numa_clear_kernel_node_hotplug > __stack_chk_fail >

[PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Tang Chen
Dave found that the kernel will hang during boot. This is because the nodemask_t type stack variable numa_kernel_nodes is large enough to overflow the stack. This doesn't always happen. According to Dave, this happened once in about five boots. The backtrace is like the following: dump_stack

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Dave Jones
On Thu, Jan 23, 2014 at 01:49:28PM +0800, Tang Chen wrote: This doesn't always happen. According to Dave, this happened once in about five boots. The backtrace is like the following: dump_stack panic ? numa_clear_kernel_node_hotplug __stack_chk_fail

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Andrew Morton
On Thu, 23 Jan 2014 13:49:28 +0800 Tang Chen tangc...@cn.fujitsu.com wrote: Dave found that the kernel will hang during boot. This is because the nodemask_t type stack variable numa_kernel_nodes is large enough to overflow the stack. This doesn't always happen. According to Dave, this

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread David Rientjes
On Thu, 23 Jan 2014, Tang Chen wrote: Dave found that the kernel will hang during boot. This is because the nodemask_t type stack variable numa_kernel_nodes is large enough to overflow the stack. This doesn't always happen. According to Dave, this happened once in about five boots. The

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Dave Jones
On Wed, Jan 22, 2014 at 10:06:14PM -0800, David Rientjes wrote: On Thu, 23 Jan 2014, Tang Chen wrote: Dave found that the kernel will hang during boot. This is because the nodemask_t type stack variable numa_kernel_nodes is large enough to overflow the stack. This doesn't

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread David Rientjes
On Thu, 23 Jan 2014, Dave Jones wrote: It's 10, because I had MAXSMP set. So, MAX_NUMNODES = 1 << 10 And the bitmask is made of longs. 1024 of them. How does this work ? It's 1024 bits.

Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Tang Chen
On 01/23/2014 02:13 PM, Dave Jones wrote: On Wed, Jan 22, 2014 at 10:06:14PM -0800, David Rientjes wrote: On Thu, 23 Jan 2014, Tang Chen wrote: .. I guess it depends on what Dave's CONFIG_NODES_SHIFT is? It's 10, because I had MAXSMP set. So, MAX_NUMNODES = 1 << 10 And the