Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen


Hi Dave,

I think this is the overflow problem. It is not a stack overflow,
but an array index overflow.

Please have a look at the following path:

numa_init()
 |---> numa_register_memblks()
 |       |---> memblock_set_node(memory)     set correct nid in memblock.memory
 |       |---> memblock_set_node(reserved)   set correct nid in memblock.reserved
 |       |..
 |       |---> setup_node_data()
 |               |---> memblock_alloc_nid()  here, nid is set to MAX_NUMNODES (1024)
 |..
 |---> numa_clear_kernel_node_hotplug()
         |---> node_set()                    here, the index is 1024, and it overflows
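
To make the failing step concrete, here is a minimal user-space sketch of the
index overflow described in the path above. It is only a model, not kernel
code: nodemask_model_t, nid_is_valid(), node_set_checked() and node_is_set()
are hypothetical names, and the value 1024 for MAX_NUMNODES is taken from the
trace. The mask holds bits 0..1023, so an nid equal to MAX_NUMNODES would land
one bit past the end of the array unless it is rejected first.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; the names only mirror the kernel's. */
#define MAX_NUMNODES  1024 /* value reported in the thread */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS    ((MAX_NUMNODES + BITS_PER_LONG - 1) / BITS_PER_LONG)

typedef struct {
    unsigned long bits[MASK_LONGS]; /* one bit per node id 0..MAX_NUMNODES-1 */
} nodemask_model_t;

/* MAX_NUMNODES itself is not a valid index into the mask. */
static bool nid_is_valid(int nid)
{
    return nid >= 0 && nid < MAX_NUMNODES;
}

/* Set the bit for nid, refusing ids that would write outside the array. */
static bool node_set_checked(int nid, nodemask_model_t *mask)
{
    assert(mask != NULL);
    if (!nid_is_valid(nid))
        return false;
    mask->bits[(size_t)nid / BITS_PER_LONG] |=
        1UL << ((size_t)nid % BITS_PER_LONG);
    return true;
}

/* Report whether the bit for nid is set; invalid ids are never set. */
static bool node_is_set(int nid, const nodemask_model_t *mask)
{
    assert(mask != NULL);
    if (!nid_is_valid(nid))
        return false;
    return (mask->bits[(size_t)nid / BITS_PER_LONG] >>
            ((size_t)nid % BITS_PER_LONG)) & 1UL;
}
```

With this model, node_set_checked(1024, &mask) returns false instead of
touching memory beyond the mask. Whether the real fix should skip such regions
or give them a proper nid is a separate question this sketch does not answer.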

For now, I think this is the first problem you mentioned.

Will send a new patch to fix it and do more tests.

Thanks.

On 01/28/2014 01:31 PM, Tang Chen wrote:

On 01/28/2014 12:47 PM, Dave Jones wrote:

On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote:
> On 01/28/2014 11:55 AM, Dave Jones wrote:
> > On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote:
> >
> > > > I did a bisect with the patch above applied each step of the way.
> > > > This time I got a plausible looking result
> > >
> > > > I cannot reproduce this. Would you please share how to reproduce it ?
> > > > Or does it just happen during the booting ?
> > >
> > > Just during boot. Very early. So early in fact, I have no logging facilities
> > > like usb-serial, just what is on vga console.
> >
> > If you want me to add some printk's, I can add a while (1); before
> > the part that oopses so we can diagnose further..
>
> Sure. Would you please do that for me ? Maybe we can find something in
> the early log.

I was hoping you'd have suggestions what you'd like me to dump ;-)



I think I found something.

Since I can reproduce the first problem on 3.10, I looked there and found
that some memory ranges in memblock have nid = 1024. When we call node_set()
with that nid, it crashes.

I'll see if we have the same problem on the latest kernel.

[ 0.00] NUMA: Initialized distance table, cnt=2
[ 0.00] NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10
[ 0.00] NUMA: Node 0 [mem 0x-0x7fff] + [mem 0x1-0x47fff] -> [mem 0x-0x47fff]
[ 0.00] Initmem setup node 0 [mem 0x-0x47fff]
[ 0.00] NODE_DATA [mem 0x47ffd9000-0x47fff]
[ 0.00] Initmem setup node 1 [mem 0x48000-0x87fff]
[ 0.00] NODE_DATA [mem 0x87ffbb000-0x87ffe1fff]
[ 0.00] : i = 0, nid = 0
[ 0.00] : i = 1, nid = 0
[ 0.00] : i = 2, nid = 0
[ 0.00] : i = 3, nid = 0
[ 0.00] : i = 4, nid = 1024
[ 0.00] : i = 5, nid = 1024
[ 0.00] : i = 6, nid = 1
[ 0.00] : i = 7, nid = 1
[ 0.00] Reserving 128MB of memory at 704MB for crashkernel (System RAM: 32406MB)
[ 0.00] [ea00-ea0011ff] PMD -> [88047020-88047fdf] on node 0
[ 0.00] [ea001200-ea0021ff] PMD -> [88086f60-88087f5f] on node 1
[ 0.00] Zone ranges:
[ 0.00] DMA [mem 0x1000-0x00ff]
[ 0.00] DMA32 [mem 0x0100-0x]
[ 0.00] Normal [mem 0x1-0x87fff]
[ 0.00] Movable zone start for each node
[ 0.00] Early memory node ranges
[ 0.00] node 0: [mem 0x1000-0x00098fff]
[ 0.00] node 0: [mem 0x0010-0x696f7fff]
[ 0.00] node 0: [mem 0x1-0x47fff]
[ 0.00] node 1: [mem 0x48000-0x87fff]
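
As a rough illustration of the check behind the "i = ..., nid = ..." lines in
the log above, the following user-space sketch counts regions whose nid could
not be used as a nodemask index. struct region_model and count_bad_nids() are
hypothetical names, not memblock APIs; the eight nid values are copied from
the log, and MAX_NUMNODES = 1024 is the value mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUMNODES 1024 /* value reported in the thread */

/* Hypothetical stand-in for a memblock region; only the nid matters here. */
struct region_model {
    int nid;
};

/* Count regions whose nid falls outside the range 0..MAX_NUMNODES-1. */
static size_t count_bad_nids(const struct region_model *regions, size_t cnt)
{
    size_t i, bad = 0;

    assert(regions != NULL || cnt == 0);
    for (i = 0; i < cnt; i++) {
        if (regions[i].nid < 0 || regions[i].nid >= MAX_NUMNODES)
            bad++;
    }
    return bad;
}

/* The eight nid values printed in the log (i = 0 .. 7). */
static const struct region_model logged_regions[] = {
    {0}, {0}, {0}, {0}, {1024}, {1024}, {1}, {1},
};
```

Run over logged_regions, the count is 2, matching the two entries (i = 4 and
i = 5) that carry nid = 1024.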

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/




Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 01:17:21PM +0800, Tang Chen wrote:

 > Seeing from your earlier mail, it crashed at:
 > 
 >  while (zonelist_zone_idx(z) > highest_zoneidx)
 >    de:   3b 77 08    cmp    0x8(%rdi),%esi
 > 
 > 
 >  I stuck this at the top of the function..
 > 
 >  printk(KERN_ERR "z:%p nodes:%p highest:%d\n", z, nodes, highest_zoneidx);
 > 
 >  and got
 > 
 >  z: 1d08   nodes: (null)  highest:3
 > 
 > 
 > nodes=null and highest=3, they are correct. When looking into next_zones_zonelist(),
 > I cannot see why it crashed. So, can you print the zone id in the
 > for_each_zone_zonelist() loop in nr_free_zone_pages() ?
 > I want to know why it crashed. A NULL pointer ?  Which one ?

It's not so easy further in the function, because the oops scrolls off
any useful printks, there's no scrollback, and no logging..
I even tried adding some udelays to slow things down (and using boot_delay)
but that just makes things hang, seemingly indefinitely.

What about that 'z' ptr though ? 0x1d08 seems like a strange address
for us to have a structure at, though I'm not too familiar with the early
boot code, so maybe we do have something down there ?
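
For reference, here is a small user-space model of the loop being discussed.
It is a sketch under assumptions, not the kernel function: struct
zoneref_model and skip_high_zones() are hypothetical names, the sample zone
indexes are made up, and the layout (a pointer followed by an int) is assumed
to match a 64-bit build. Under that assumption the index sits at offset 8,
which appears consistent with the quoted "cmp 0x8(%rdi),%esi" reading through
the pointer passed in as z.

```c
#include <assert.h>
#include <stddef.h>

struct zone_model; /* opaque in this sketch */

/* Hypothetical model of a zonelist entry: a zone pointer plus its index. */
struct zoneref_model {
    struct zone_model *zone;
    int zone_idx;
};

/* Advance past entries whose index is above highest_zoneidx. */
static const struct zoneref_model *
skip_high_zones(const struct zoneref_model *z, int highest_zoneidx)
{
    assert(z != NULL);
    while (z->zone_idx > highest_zoneidx) /* first read goes through z */
        z++;
    return z;
}

/* Made-up list, highest index first, ending with index 0 so the walk stops. */
static const struct zoneref_model sample_zonelist[] = {
    { NULL, 5 }, { NULL, 4 }, { NULL, 2 }, { NULL, 0 },
};
```

The point of the model is that the very first comparison already dereferences
z, so if z does not point at a valid zonelist entry, the fault would happen
there before any zone is examined.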

Dave 



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen

On 01/28/2014 12:47 PM, Dave Jones wrote:

On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote:
  >  On 01/28/2014 11:55 AM, Dave Jones wrote:
  >  >  On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote:
  >  >
  >  >>   >   I did a bisect with the patch above applied each step of the way.
  >  >>   >   This time I got a plausible looking result
  >  >>
  >  >>   I cannot reproduce this. Would you please share how to reproduce it ?
  >  >>   Or does it just happen during the booting ?
  >  >
  >  >  Just during boot. Very early. So early in fact, I have no logging facilities
  >  >  like usb-serial, just what is on vga console.
  >  >
  >  >  If you want me to add some printk's, I can add a while (1); before
  >  >  the part that oopses so we can diagnose further..
  >
  >  Sure. Would you please do that for me ? Maybe we can find something in
  >  the early log.

I was hoping you'd have suggestions what you'd like me to dump ;-)



I think I found something.

Since I can reproduce the first problem on 3.10, I looked there and found
that some memory ranges in memblock have nid = 1024. When we call node_set()
with that nid, it crashes.

I'll see if we have the same problem on the latest kernel.

[0.00] NUMA: Initialized distance table, cnt=2
[0.00] NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10
[0.00] NUMA: Node 0 [mem 0x-0x7fff] + [mem 0x1-0x47fff] -> [mem 0x-0x47fff]

[0.00] Initmem setup node 0 [mem 0x-0x47fff]
[0.00]   NODE_DATA [mem 0x47ffd9000-0x47fff]
[0.00] Initmem setup node 1 [mem 0x48000-0x87fff]
[0.00]   NODE_DATA [mem 0x87ffbb000-0x87ffe1fff]
[0.00] : i = 0, nid = 0
[0.00] : i = 1, nid = 0
[0.00] : i = 2, nid = 0
[0.00] : i = 3, nid = 0
[0.00] : i = 4, nid = 1024
[0.00] : i = 5, nid = 1024
[0.00] : i = 6, nid = 1
[0.00] : i = 7, nid = 1
[0.00] Reserving 128MB of memory at 704MB for crashkernel (System RAM: 32406MB)
[0.00]  [ea00-ea0011ff] PMD -> [88047020-88047fdf] on node 0
[0.00]  [ea001200-ea0021ff] PMD -> [88086f60-88087f5f] on node 1

[0.00] Zone ranges:
[0.00]   DMA  [mem 0x1000-0x00ff]
[0.00]   DMA32[mem 0x0100-0x]
[0.00]   Normal   [mem 0x1-0x87fff]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x1000-0x00098fff]
[0.00]   node   0: [mem 0x0010-0x696f7fff]
[0.00]   node   0: [mem 0x1-0x47fff]
[0.00]   node   1: [mem 0x48000-0x87fff]

Thanks.


Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen

On 01/28/2014 12:47 PM, Dave Jones wrote:

On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote:
  >  On 01/28/2014 11:55 AM, Dave Jones wrote:
  >  >  On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote:
  >  >
  >  >>   >   I did a bisect with the patch above applied each step of the way.
  >  >>   >   This time I got a plausible looking result
  >  >>
  >  >>   I cannot reproduce this. Would you please share how to reproduce it ?
  >  >>   Or does it just happen during the booting ?
  >  >
  >  >  Just during boot. Very early. So early in fact, I have no logging facilities
  >  >  like usb-serial, just what is on vga console.
  >  >
  >  >  If you want me to add some printk's, I can add a while (1); before
  >  >  the part that oopses so we can diagnose further..
  >
  >  Sure. Would you please do that for me ? Maybe we can find something in
  >  the early log.

I was hoping you'd have suggestions what you'd like me to dump ;-)


Sorry. I didn't say it clearly. :)

Seeing from your earlier mail, it crashed at:

while (zonelist_zone_idx(z) > highest_zoneidx)
  de:   3b 77 08    cmp    0x8(%rdi),%esi


I stuck this at the top of the function..

printk(KERN_ERR "z:%p nodes:%p highest:%d\n", z, nodes, highest_zoneidx);

and got

z: 1d08   nodes: (null)  highest:3


nodes=null and highest=3, they are correct. When looking into next_zones_zonelist(),
I cannot see why it crashed. So, can you print the zone id in the
for_each_zone_zonelist() loop in nr_free_zone_pages() ?

I want to know why it crashed. A NULL pointer ?  Which one ?

Thanks.



Dave





Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote:
 > On 01/28/2014 11:55 AM, Dave Jones wrote:
 > > On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote:
 > >
 > >   >  >  I did a bisect with the patch above applied each step of the way.
 > >   >  >  This time I got a plausible looking result
 > >   >
 > >   >  I cannot reproduce this. Would you please share how to reproduce it ?
 > >   >  Or does it just happen during the booting ?
 > >
 > > Just during boot. Very early. So early in fact, I have no logging facilities
 > > like usb-serial, just what is on vga console.
 > >
 > > If you want me to add some printk's, I can add a while (1); before
 > > the part that oopses so we can diagnose further..
 > 
 > Sure. Would you please do that for me ? Maybe we can find something in 
 > the early log.

I was hoping you'd have suggestions what you'd like me to dump ;-)

Dave



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen

On 01/28/2014 11:55 AM, Dave Jones wrote:

On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote:

  >  >  I did a bisect with the patch above applied each step of the way.
  >  >  This time I got a plausible looking result
  >
  >  I cannot reproduce this. Would you please share how to reproduce it ?
  >  Or does it just happen during the booting ?

Just during boot. Very early. So early in fact, I have no logging facilities
like usb-serial, just what is on vga console.

If you want me to add some printk's, I can add a while (1); before
the part that oopses so we can diagnose further..


Sure. Would you please do that for me ? Maybe we can find something in 
the early log.


Thanks.



Dave


Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote:

 > > I did a bisect with the patch above applied each step of the way.
 > > This time I got a plausible looking result
 > 
 > I cannot reproduce this. Would you please share how to reproduce it ?
 > Or does it just happen during the booting ?

Just during boot. Very early. So early in fact, I have no logging facilities
like usb-serial, just what is on vga console.

If you want me to add some printk's, I can add a while (1); before
the part that oopses so we can diagnose further..

Dave


Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen

On 01/28/2014 10:55 AM, Dave Jones wrote:

On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote:
  >  On 01/28/2014 08:32 AM, David Rientjes wrote:
  >  >  On Wed, 22 Jan 2014, David Rientjes wrote:
  >  >
  >  >>>arch/x86/mm/numa.c | 2 +-
  >  >>>1 file changed, 1 insertion(+), 1 deletion(-)
  >  >>>
  >  >>>  diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
  >  >>>  index 81b2750..ebefeb7 100644
  >  >>>  --- a/arch/x86/mm/numa.c
  >  >>>  +++ b/arch/x86/mm/numa.c
  >  >>>  @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
  >  >>>}
  >  >>>}
  >  >>>
  >  >>>  +static nodemask_t numa_kernel_nodes __initdata;
  >  >>>static void __init numa_clear_kernel_node_hotplug(void)
  >  >>>{
  >  >>>int i, nid;
  >  >>>  - nodemask_t numa_kernel_nodes;
  >  >>>unsigned long start, end;
  >  >>>struct memblock_type *type =
  >  >>>
  >  >>
  >  >>  Isn't this also a bugfix since you never initialize numa_kernel_nodes when
  >  >>  it's allocated on the stack with NODE_MASK_NONE?
  >  >>
  >  >
  >  >  This hasn't been answered and the patch still isn't in linux-kernel yet
  >  >  Dave tested it as good.  I'm suspicious of the changelog that indicates
  >  >  this nodemask is the result of a stack overflow itself which only manages
  >  >  to reproduce itself in the init patch slightly more than 50% of the time.
  >  >  How is that possible?
  >  >
  >  >  I think the changelog should indicate this also fixes an uninitialized
  >  >  nodemask issue.
  >
  >  Hi David,
  >
  >  I'm still working on this problem, but unfortunately nothing new for now.
  >  And the test till now shows no more problem here.
  >
  >  I'm digging into it, but need more time.
  >
  >  I'll resend a new patch and modify the changelog soon. Before we find the
  >  root cause, I think we can use this patch as a temporary solution.

Ok, I hit the 2nd bug again (oops in next_zones_zonelist...)

I did a bisect with the patch above applied each step of the way.
This time I got a plausible looking result


I cannot reproduce this. Would you please share how to reproduce it ?
Or does it just happen during the booting ?




a0acda917284183f9b71e2d08b0aa0aea722b321 is the first bad commit
commit a0acda917284183f9b71e2d08b0aa0aea722b321
Author: Tang Chen
Date:   Tue Jan 21 15:49:32 2014 -0800

 acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggable


Reverting this commit of course removes the whole function from above,
so we haven't really learned anything new, other than that commit is broken,
even after the above fix-up.

Dave



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen

On 01/28/2014 10:55 AM, Dave Jones wrote:

On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote:
  >  On 01/28/2014 08:32 AM, David Rientjes wrote:
  >  >  On Wed, 22 Jan 2014, David Rientjes wrote:
  >  >
  >  >>>arch/x86/mm/numa.c | 2 +-
  >  >>>1 file changed, 1 insertion(+), 1 deletion(-)
  >  >>>
  >  >>>  diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
  >  >>>  index 81b2750..ebefeb7 100644
  >  >>>  --- a/arch/x86/mm/numa.c
  >  >>>  +++ b/arch/x86/mm/numa.c
  >  >>>  @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
  >  >>>}
  >  >>>}
  >  >>>
  >  >>>  +static nodemask_t numa_kernel_nodes __initdata;
  >  >>>static void __init numa_clear_kernel_node_hotplug(void)
  >  >>>{
  >  >>>int i, nid;
  >  >>>  - nodemask_t numa_kernel_nodes;
  >  >>>unsigned long start, end;
  >  >>>struct memblock_type *type =
  >  >>>
  >  >>
  >  >>  Isn't this also a bugfix since you never initialize numa_kernel_nodes when
  >  >>  it's allocated on the stack with NODE_MASK_NONE?
  >  >>
  >  >
  >  >  This hasn't been answered and the patch still isn't in linux-kernel yet
  >  >  Dave tested it as good.  I'm suspicious of the changelog that indicates
  >  >  this nodemask is the result of a stack overflow itself which only manages
  >  >  to reproduce itself in the init patch slightly more than 50% of the time.
  >  >  How is that possible?
  >  >
  >  >  I think the changelog should indicate this also fixes an uninitialized
  >  >  nodemask issue.
  >
  >  Hi David,
  >
  >  I'm still working on this problem, but unfortunately nothing new for now.
  >  And the test till now shows no more problem here.
  >
  >  I'm digging into it, but need more time.
  >
  >  I'll resend a new patch and modify the changelog soon. Before we find the
  >  root cause, I think we can use this patch as a temporary solution.

Ok, I hit the 2nd bug again (oops in next_zones_zonelist...)

I did a bisect with the patch above applied each step of the way.
This time I got a plausible looking result


a0acda917284183f9b71e2d08b0aa0aea722b321 is the first bad commit
commit a0acda917284183f9b71e2d08b0aa0aea722b321
Author: Tang Chen
Date:   Tue Jan 21 15:49:32 2014 -0800

 acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggable


Reverting this commit of course removes the whole function from above,
so we haven't really learned anything new, other than that commit is broken,
even after the above fix-up.


If we revert this commit, memory hot-remove won't work.
Let's try to fix it before the merge window closes.



Dave





Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote:
 > On 01/28/2014 08:32 AM, David Rientjes wrote:
 > > On Wed, 22 Jan 2014, David Rientjes wrote:
 > >
 > >>>   arch/x86/mm/numa.c | 2 +-
 > >>>   1 file changed, 1 insertion(+), 1 deletion(-)
 > >>>
 > >>> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
 > >>> index 81b2750..ebefeb7 100644
 > >>> --- a/arch/x86/mm/numa.c
 > >>> +++ b/arch/x86/mm/numa.c
 > >>> @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
 > >>>  }
 > >>>   }
 > >>>
 > >>> +static nodemask_t numa_kernel_nodes __initdata;
 > >>>   static void __init numa_clear_kernel_node_hotplug(void)
 > >>>   {
 > >>>  int i, nid;
 > >>> -nodemask_t numa_kernel_nodes;
 > >>>  unsigned long start, end;
 > >>>  struct memblock_type *type =
 > >>>
 > >>
 > >> Isn't this also a bugfix since you never initialize numa_kernel_nodes when
 > >> it's allocated on the stack with NODE_MASK_NONE?
 > >>
 > >
 > > This hasn't been answered and the patch still isn't in linux-kernel yet
 > > Dave tested it as good.  I'm suspicious of the changelog that indicates
 > > this nodemask is the result of a stack overflow itself which only manages
 > > to reproduce itself in the init patch slightly more than 50% of the time.
 > > How is that possible?
 > >
 > > I think the changelog should indicate this also fixes an uninitialized
 > > nodemask issue.
 > 
 > Hi David,
 > 
 > I'm still working on this problem, but unfortunately nothing new for now.
 > And the test till now shows no more problem here.
 > 
 > I'm digging into it, but need more time.
 > 
 > I'll resend a new patch and modify the changelog soon. Before we find the
 > root cause, I think we can use this patch as a temporary solution.

Ok, I hit the 2nd bug again (oops in next_zones_zonelist...)

I did a bisect with the patch above applied each step of the way.
This time I got a plausible looking result


a0acda917284183f9b71e2d08b0aa0aea722b321 is the first bad commit
commit a0acda917284183f9b71e2d08b0aa0aea722b321
Author: Tang Chen 
Date:   Tue Jan 21 15:49:32 2014 -0800

acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggable


Reverting this commit of course removes the whole function from above,
so we haven't really learned anything new, other than that commit is broken,
even after the above fix-up.

Dave



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen

On 01/28/2014 08:32 AM, David Rientjes wrote:

On Wed, 22 Jan 2014, David Rientjes wrote:


  arch/x86/mm/numa.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 81b2750..ebefeb7 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -562,10 +562,10 @@ static void __init numa_init_array(void)
}
  }

+static nodemask_t numa_kernel_nodes __initdata;
  static void __init numa_clear_kernel_node_hotplug(void)
  {
int i, nid;
-   nodemask_t numa_kernel_nodes;
unsigned long start, end;
struct memblock_type *type =



Isn't this also a bugfix since you never initialize numa_kernel_nodes when
it's allocated on the stack with NODE_MASK_NONE?



This hasn't been answered and the patch still isn't in linux-kernel yet
Dave tested it as good.  I'm suspicious of the changelog that indicates
this nodemask is the result of a stack overflow itself which only manages
to reproduce itself in the init patch slightly more than 50% of the time.
How is that possible?

I think the changelog should indicate this also fixes an uninitialized
nodemask issue.


Hi David,

I'm still working on this problem, but unfortunately nothing new for now.
And the test till now shows no more problem here.

I'm digging into it, but need more time.

I'll resend a new patch and modify the changelog soon. Before we find the
root cause, I think we can use this patch as a temporary solution.

Thanks.




Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread David Rientjes
On Wed, 22 Jan 2014, David Rientjes wrote:

> >  arch/x86/mm/numa.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > index 81b2750..ebefeb7 100644
> > --- a/arch/x86/mm/numa.c
> > +++ b/arch/x86/mm/numa.c
> > @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
> > }
> >  }
> >  
> > +static nodemask_t numa_kernel_nodes __initdata;
> >  static void __init numa_clear_kernel_node_hotplug(void)
> >  {
> > int i, nid;
> > -   nodemask_t numa_kernel_nodes;
> > unsigned long start, end;
> > struct memblock_type *type = &memblock.reserved;
> >  
> 
> Isn't this also a bugfix since you never initialize numa_kernel_nodes when 
> it's allocated on the stack with NODE_MASK_NONE?
> 

This hasn't been answered and the patch still isn't in linux-kernel yet 
Dave tested it as good.  I'm suspicious of the changelog that indicates 
this nodemask is the result of a stack overflow itself which only manages 
to reproduce itself in the init patch slightly more than 50% of the time.  
How is that possible?

I think the changelog should indicate this also fixes an uninitialized 
nodemask issue.
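
The difference between the two declarations in the quoted diff can be shown
with a small user-space sketch. This only illustrates the C storage rules,
under the assumption that they are what matters here; nodemask_model_t,
nodes_clear_model() and nodes_empty_model() are hypothetical names, not the
kernel helpers. An object with static storage starts out all zeroes, while an
automatic (stack) object has indeterminate contents until it is cleared.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_NUMNODES 1024 /* value mentioned in this thread */
#define MASK_BYTES   (MAX_NUMNODES / CHAR_BIT)

/* Hypothetical model of a node mask, one bit per node. */
typedef struct {
    unsigned char bits[MASK_BYTES];
} nodemask_model_t;

/* Static storage duration: the C language zero-initializes this object. */
static nodemask_model_t static_mask;

/* Explicitly empty a mask, which a stack-allocated one needs before use. */
static void nodes_clear_model(nodemask_model_t *mask)
{
    assert(mask != NULL);
    memset(mask, 0, sizeof(*mask));
}

/* Return true when no bit is set in the mask. */
static bool nodes_empty_model(const nodemask_model_t *mask)
{
    size_t i;

    assert(mask != NULL);
    for (i = 0; i < sizeof(mask->bits); i++) {
        if (mask->bits[i] != 0)
            return false;
    }
    return true;
}

/* A stack mask is only known to be empty after it has been cleared. */
static bool local_mask_is_empty_after_clear(void)
{
    nodemask_model_t local;    /* indeterminate contents at this point */

    nodes_clear_model(&local); /* must happen before the first read */
    return nodes_empty_model(&local);
}
```

If this is indeed what happened, moving the mask to static storage would hide
the missing initialization rather than a stack overflow, which would fit an
intermittent failure that depends on whatever was left on the stack.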


Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Mon, Jan 27, 2014 at 03:29:03PM +0800, Tang Chen wrote:

 > > Some build tests show..
 > >
 > > MAXSMP ( NODESHIFT=10 ) : Bug
 > > NRCPUS=4&  NODESHIFT=10 : Bug
 > > NRCPUS=4&  NODESHIFT=1 : no bug
 > >
 > >
 > > The middle config test was accidental, I hadn't realised disabling MAXSMP
 > > wouldn't reset NODESHIFT to something sane.
 > >
 > > I'll start bisecting, as MAXSMP worked fine until a few days ago.
 > 
 > Hi Dave,
 > 
 > I didn't reproduce this bug. Would you please share the bisect result ?

The bisect pointed at something completely unrelated, and then when I tried 
again on git HEAD, I couldn't reproduce it.. 
If I manage to get it happening again I'll look into it more, but for now
it seems to have gone into hiding.

Dave


Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote:
  On 01/28/2014 08:32 AM, David Rientjes wrote:
   On Wed, 22 Jan 2014, David Rientjes wrote:
  
 arch/x86/mm/numa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  
   diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
   index 81b2750..ebefeb7 100644
   --- a/arch/x86/mm/numa.c
   +++ b/arch/x86/mm/numa.c
   @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
}
 }
  
   +static nodemask_t numa_kernel_nodes __initdata;
 static void __init numa_clear_kernel_node_hotplug(void)
 {
int i, nid;
   -nodemask_t numa_kernel_nodes;
unsigned long start, end;
struct memblock_type *type =memblock.reserved;
  
  
   Isn't this also a bugfix since you never initialize numa_kernel_nodes when
   it's allocated on the stack with NODE_MASK_NONE?
  
  
   This hasn't been answered and the patch still isn't in linux-kernel yet
   Dave tested it as good.  I'm suspicious of the changelog that indicates
   this nodemask is the result of a stack overflow itself which only manages
   to reproduce itself in the init patch slightly more than 50% of the time.
   How is that possible?
  
   I think the changelog should indicate this also fixes an uninitialized
   nodemask issue.
  
  Hi David,
  
  I'm still working on this problem, but unfortunately nothing new for now.
  And the test till now shows no more problem here.
  
  I'm digging into it, but need more time.
  
  I'll resend a new patch and modify the changelog soon. Before we find the
  root cause, I think we can use this patch as a temporary solution.

Ok, I hit the 2nd bug again (oops in next_zones_zonelist...)

I did a bisect with the patch above applied each step of the way.
This time I got a plausible looking result


a0acda917284183f9b71e2d08b0aa0aea722b321 is the first bad commit
commit a0acda917284183f9b71e2d08b0aa0aea722b321
Author: Tang Chen <tangc...@cn.fujitsu.com>
Date:   Tue Jan 21 15:49:32 2014 -0800

acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggable


Reverting this commit of course removes the whole function from above,
so we haven't really learned anything new, other than that commit is broken,
even after the above fix-up.

Dave



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen

On 01/28/2014 10:55 AM, Dave Jones wrote:

On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote:
On 01/28/2014 08:32 AM, David Rientjes wrote:
  On Wed, 22 Jan 2014, David Rientjes wrote:

arch/x86/mm/numa.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

  diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
  index 81b2750..ebefeb7 100644
  --- a/arch/x86/mm/numa.c
  +++ b/arch/x86/mm/numa.c
  @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
}
}

  +static nodemask_t numa_kernel_nodes __initdata;
static void __init numa_clear_kernel_node_hotplug(void)
{
int i, nid;
  - nodemask_t numa_kernel_nodes;
unsigned long start, end;
struct memblock_type *type = &memblock.reserved;


  Isn't this also a bugfix since you never initialize numa_kernel_nodes 
when
  it's allocated on the stack with NODE_MASK_NONE?


  This hasn't been answered and the patch still isn't in linux-kernel yet
  Dave tested it as good.  I'm suspicious of the changelog that indicates
  this nodemask is the result of a stack overflow itself which only 
manages
  to reproduce itself in the init patch slightly more than 50% of the 
time.
  How is that possible?

  I think the changelog should indicate this also fixes an uninitialized
  nodemask issue.
  
Hi David,
  
I'm still working on this problem, but unfortunately nothing new for now.
And the test till now shows no more problem here.
  
I'm digging into it, but need more time.
  
I'll resend a new patch and modify the changelog soon. Before we find the
root cause, I think we can use this patch as a temporary solution.

Ok, I hit the 2nd bug again (oops in next_zones_zonelist...)

I did a bisect with the patch above applied each step of the way.
This time I got a plausible looking result


a0acda917284183f9b71e2d08b0aa0aea722b321 is the first bad commit
commit a0acda917284183f9b71e2d08b0aa0aea722b321
Author: Tang Chen <tangc...@cn.fujitsu.com>
Date:   Tue Jan 21 15:49:32 2014 -0800

 acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggable


Reverting this commit of course removes the whole function from above,
so we haven't really learned anything new, other than that commit is broken,
even after the above fix-up.


If we revert this commit, memory hot-remove won't be able to work.
Let's try to fix it before the merge window closes.



Dave





Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen

On 01/28/2014 10:55 AM, Dave Jones wrote:

On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote:
On 01/28/2014 08:32 AM, David Rientjes wrote:
  On Wed, 22 Jan 2014, David Rientjes wrote:

arch/x86/mm/numa.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

  diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
  index 81b2750..ebefeb7 100644
  --- a/arch/x86/mm/numa.c
  +++ b/arch/x86/mm/numa.c
  @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
}
}

  +static nodemask_t numa_kernel_nodes __initdata;
static void __init numa_clear_kernel_node_hotplug(void)
{
int i, nid;
  - nodemask_t numa_kernel_nodes;
unsigned long start, end;
struct memblock_type *type = &memblock.reserved;


  Isn't this also a bugfix since you never initialize numa_kernel_nodes 
when
  it's allocated on the stack with NODE_MASK_NONE?


  This hasn't been answered and the patch still isn't in linux-kernel yet
  Dave tested it as good.  I'm suspicious of the changelog that indicates
  this nodemask is the result of a stack overflow itself which only 
manages
  to reproduce itself in the init patch slightly more than 50% of the 
time.
  How is that possible?

  I think the changelog should indicate this also fixes an uninitialized
  nodemask issue.
  
Hi David,
  
I'm still working on this problem, but unfortunately nothing new for now.
And the test till now shows no more problem here.
  
I'm digging into it, but need more time.
  
I'll resend a new patch and modify the changelog soon. Before we find the
root cause, I think we can use this patch as a temporary solution.

Ok, I hit the 2nd bug again (oops in next_zones_zonelist...)

I did a bisect with the patch above applied each step of the way.
This time I got a plausible looking result


I cannot reproduce this. Would you please share how to reproduce it ?
Or does it just happen during the booting ?




a0acda917284183f9b71e2d08b0aa0aea722b321 is the first bad commit
commit a0acda917284183f9b71e2d08b0aa0aea722b321
Author: Tang Chen <tangc...@cn.fujitsu.com>
Date:   Tue Jan 21 15:49:32 2014 -0800

 acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggable


Reverting this commit of course removes the whole function from above,
so we haven't really learned anything new, other than that commit is broken,
even after the above fix-up.

Dave



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote:

   I did a bisect with the patch above applied each step of the way.
   This time I got a plausible looking result
  
  I cannot reproduce this. Would you please share how to reproduce it ?
  Or does it just happen during the booting ?

Just during boot. Very early. So early in fact, I have no logging facilities
like usb-serial, just what is on vga console.

If you want me to add some printk's, I can add a while (1); before
the part that oopses so we can diagnose further..

Dave


Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen

On 01/28/2014 11:55 AM, Dave Jones wrote:

On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote:

  I did a bisect with the patch above applied each step of the way.
  This time I got a plausible looking result
  
I cannot reproduce this. Would you please share how to reproduce it ?
Or does it just happen during the booting ?

Just during boot. Very early. So early in fact, I have no logging facilities
like usb-serial, just what is on vga console.

If you want me to add some printk's, I can add a while (1); before
the part that oopses so we can diagnose further..


Sure. Would you please do that for me ? Maybe we can find something in 
the early log.


Thanks.



Dave


Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote:
  On 01/28/2014 11:55 AM, Dave Jones wrote:
   On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote:
  
 I did a bisect with the patch above applied each step of the way.
 This time I got a plausible looking result
 
   I cannot reproduce this. Would you please share how to reproduce it ?
   Or does it just happen during the booting ?
  
   Just during boot. Very early. So early in fact, I have no logging 
   facilities
   like usb-serial, just what is on vga console.
  
   If you want me to add some printk's, I can add a while (1); before
   the part that oopses so we can diagnose further..
  
  Sure. Would you please do that for me ? Maybe we can find something in 
  the early log.

I was hoping you'd have suggestions what you'd like me to dump ;-)

Dave



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen

On 01/28/2014 12:47 PM, Dave Jones wrote:

On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote:
On 01/28/2014 11:55 AM, Dave Jones wrote:
  On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote:

  I did a bisect with the patch above applied each step of the 
way.
  This time I got a plausible looking result

   I cannot reproduce this. Would you please share how to reproduce 
it ?
   Or does it just happen during the booting ?

  Just during boot. Very early. So early in fact, I have no logging 
facilities
  like usb-serial, just what is on vga console.

  If you want me to add some printk's, I can add a while (1); before
  the part that oopses so we can diagnose further..
  
Sure. Would you please do that for me ? Maybe we can find something in
the early log.

I was hoping you'd have suggestions what you'd like me to dump ;-)


Sorry. I didn't say it clearly. :)

Seeing from your earlier mail, it crashed at:

while (zonelist_zone_idx(z) > highest_zoneidx)
  de:   3b 77 08                cmp    0x8(%rdi),%esi


I stuck this at the top of the function..

printk(KERN_ERR "z:%p nodes:%p highest:%d\n", z, nodes, highest_zoneidx);

and got

z: 1d08   nodes: (null)  highest:3


nodes=null and highest=3, they are correct. When looking into 
next_zones_zonelist(),

I cannot see why it crashed. So, can you print the zone id in the
for_each_zone_zonelist() loop in nr_free_zone_pages() ?

I want to know why it crashed. A NULL pointer ?  Which one ?

Thanks.



Dave





Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen

On 01/28/2014 12:47 PM, Dave Jones wrote:

On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote:
On 01/28/2014 11:55 AM, Dave Jones wrote:
  On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote:

  I did a bisect with the patch above applied each step of the 
way.
  This time I got a plausible looking result

   I cannot reproduce this. Would you please share how to reproduce 
it ?
   Or does it just happen during the booting ?

  Just during boot. Very early. So early in fact, I have no logging 
facilities
  like usb-serial, just what is on vga console.

  If you want me to add some printk's, I can add a while (1); before
  the part that oopses so we can diagnose further..
  
Sure. Would you please do that for me ? Maybe we can find something in
the early log.

I was hoping you'd have suggestions what you'd like me to dump ;-)



I think I found something.

Since I can reproduce the first problem on 3.10, I found some memory 
ranges in memblock

have nid = 1024. When we use node_set(), it will crash.

I'll see if we have the same problem on the latest kernel.

[0.00] NUMA: Initialized distance table, cnt=2
[0.00] NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10
[0.00] NUMA: Node 0 [mem 0x-0x7fff] + [mem 0x1-0x47fff] -> [mem 0x-0x47fff]

[0.00] Initmem setup node 0 [mem 0x-0x47fff]
[0.00]   NODE_DATA [mem 0x47ffd9000-0x47fff]
[0.00] Initmem setup node 1 [mem 0x48000-0x87fff]
[0.00]   NODE_DATA [mem 0x87ffbb000-0x87ffe1fff]
[0.00] : i = 0, nid = 0
[0.00] : i = 1, nid = 0
[0.00] : i = 2, nid = 0
[0.00] : i = 3, nid = 0
[0.00] : i = 4, nid = 1024
[0.00] : i = 5, nid = 1024
[0.00] : i = 6, nid = 1
[0.00] : i = 7, nid = 1
[0.00] Reserving 128MB of memory at 704MB for crashkernel (System RAM: 32406MB)
[0.00]  [ea00-ea0011ff] PMD -> [88047020-88047fdf] on node 0
[0.00]  [ea001200-ea0021ff] PMD -> [88086f60-88087f5f] on node 1

[0.00] Zone ranges:
[0.00]   DMA  [mem 0x1000-0x00ff]
[0.00]   DMA32[mem 0x0100-0x]
[0.00]   Normal   [mem 0x1-0x87fff]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x1000-0x00098fff]
[0.00]   node   0: [mem 0x0010-0x696f7fff]
[0.00]   node   0: [mem 0x1-0x47fff]
[0.00]   node   1: [mem 0x48000-0x87fff]

Thanks.


Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Dave Jones
On Tue, Jan 28, 2014 at 01:17:21PM +0800, Tang Chen wrote:

  Seeing from your earlier mail, it crashed at:
  
   while (zonelist_zone_idx(z) > highest_zoneidx)
     de:   3b 77 08                cmp    0x8(%rdi),%esi
  
  
   I stuck this at the top of the function..
  
   printk(KERN_ERR "z:%p nodes:%p highest:%d\n", z, nodes, highest_zoneidx);
  
   and got
  
   z: 1d08   nodes: (null)  highest:3
  
  
  nodes=null and highest=3, they are correct. When looking into 
  next_zones_zonelist(),
  I cannot see why it crashed. So, can you print the zone id in the
  for_each_zone_zonelist() loop in nr_free_zone_pages() ?
  I want to know why it crashed. A NULL pointer ?  Which one ?

It's not so easy further in the function, because the oops scrolls off
any useful printks, there's no scrollback, and no logging..
I even tried adding some udelays to slow things down (and using boot_delay)
but that makes things just hang seemingly indefinitely.

What about that 'z' ptr though ? 0x1d08 seems like a strange address
for us to have a structure at, though I'm not too familiar with the early
boot code, so maybe we do have something down there ?
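
[Note: a minimal user-space sketch of why a bogus z pointer faults at this instruction. It assumes the kernel's struct zoneref is a zone pointer followed by an int zone_idx, and that z arrives in %rdi as the first argument; zoneref_sketch, zone_idx_offset and faulting_address are hypothetical names used only here. Under those assumptions, cmp 0x8(%rdi),%esi is the load of z->zone_idx, so the address touched is z plus the offset of zone_idx.]

```c
#include <stddef.h>

struct zone;	/* opaque; only the pointer size matters for this sketch */

/* Assumed layout, modelled on the kernel's struct zoneref (hypothetical copy). */
struct zoneref_sketch {
	struct zone *zone;	/* pointer to the actual zone */
	int zone_idx;		/* the field zonelist_zone_idx(z) returns */
};

/* Byte offset of zone_idx inside the structure (8 on an LP64 build). */
size_t zone_idx_offset(void)
{
	return offsetof(struct zoneref_sketch, zone_idx);
}

/* Address the zone_idx load would touch for a given value of z. */
unsigned long faulting_address(unsigned long z)
{
	return z + zone_idx_offset();
}
```

With z = 0x1d08 the load lands just past that address, which is nowhere near a plausible kernel pointer, so the question of where z came from seems like the right one to chase.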

Dave 



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-27 Thread Tang Chen


Hi Dave,

I think here is the overflow problem. Not the stackoverflow,
but the array index overflow.

Please have a look at the following path:

numa_init()
 |---> numa_register_memblks()
 |      |---> memblock_set_node(memory)    set correct nid in memblock.memory
 |      |---> memblock_set_node(reserved)  set correct nid in memblock.reserved
 |      |..
 |      |---> setup_node_data()
 |             |---> memblock_alloc_nid()  here, nid is set to MAX_NUMNODES (1024)
 |..
 |---> numa_clear_kernel_node_hotplug()
        |---> node_set()                   here, we have an index 1024, and overflowed

For now, I think this is the first problem you mentioned.
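
[Note: a user-space sketch of the index overflow described above, not the fix that was eventually sent. nodemask_sketch_t, node_set_checked and mark_kernel_nodes are hypothetical stand-ins for the kernel's nodemask_t, node_set() and the loop in numa_clear_kernel_node_hotplug(). Valid bit indexes are 0..MAX_NUMNODES-1, so a region whose nid is MAX_NUMNODES (1024) would set a bit one past the end of the bitmap unless it is rejected first.]

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NUMNODES	1024	/* 1 << NODES_SHIFT with NODES_SHIFT = 10 */
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS	((MAX_NUMNODES + BITS_PER_LONG - 1) / BITS_PER_LONG)

typedef struct {
	unsigned long bits[MASK_LONGS];
} nodemask_sketch_t;

/* Set bit nid only when it is a valid index; report whether it was set. */
bool node_set_checked(int nid, nodemask_sketch_t *mask)
{
	if (nid < 0 || nid >= MAX_NUMNODES)
		return false;	/* nid == MAX_NUMNODES would index bits[MASK_LONGS] */
	mask->bits[nid / BITS_PER_LONG] |= 1UL << (nid % BITS_PER_LONG);
	return true;
}

/* Clear the mask, mark every valid nid, and return how many were rejected. */
int mark_kernel_nodes(const int *nids, int count, nodemask_sketch_t *mask)
{
	int skipped = 0;

	memset(mask, 0, sizeof(*mask));
	for (int i = 0; i < count; i++)
		if (!node_set_checked(nids[i], mask))
			skipped++;
	return skipped;
}
```

Fed the eight nid values from the debug output quoted below (0, 0, 0, 0, 1024, 1024, 1, 1), this sketch marks nodes 0 and 1 and rejects the two regions that carry nid 1024.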

Will send a new patch to fix it and do more tests.

Thanks.

On 01/28/2014 01:31 PM, Tang Chen wrote:

On 01/28/2014 12:47 PM, Dave Jones wrote:

On Tue, Jan 28, 2014 at 12:47:11PM +0800, Tang Chen wrote:
 On 01/28/2014 11:55 AM, Dave Jones wrote:
  On Tue, Jan 28, 2014 at 11:24:37AM +0800, Tang Chen wrote:
 
I did a bisect with the patch above applied each step of the way.
This time I got a plausible looking result
  
   I cannot reproduce this. Would you please share how to reproduce
it ?
   Or does it just happen during the booting ?
 
  Just during boot. Very early. So early in fact, I have no logging
facilities
  like usb-serial, just what is on vga console.
 
  If you want me to add some printk's, I can add a while (1); before
  the part that oopses so we can diagnose further..

 Sure. Would you please do that for me ? Maybe we can find something in
 the early log.

I was hoping you'd have suggestions what you'd like me to dump ;-)



I think I found something.

Since I can reproduce the first problem on 3.10, I found some memory
ranges in memblock
have nid = 1024. When we use node_set(), it will crash.

I'll see if we have the same problem on the latest kernel.

[ 0.00] NUMA: Initialized distance table, cnt=2
[ 0.00] NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10
[ 0.00] NUMA: Node 0 [mem 0x-0x7fff] + [mem 0x1-0x47fff] -> [mem 0x-0x47fff]
[ 0.00] Initmem setup node 0 [mem 0x-0x47fff]
[ 0.00] NODE_DATA [mem 0x47ffd9000-0x47fff]
[ 0.00] Initmem setup node 1 [mem 0x48000-0x87fff]
[ 0.00] NODE_DATA [mem 0x87ffbb000-0x87ffe1fff]
[ 0.00] : i = 0, nid = 0
[ 0.00] : i = 1, nid = 0
[ 0.00] : i = 2, nid = 0
[ 0.00] : i = 3, nid = 0
[ 0.00] : i = 4, nid = 1024
[ 0.00] : i = 5, nid = 1024
[ 0.00] : i = 6, nid = 1
[ 0.00] : i = 7, nid = 1
[ 0.00] Reserving 128MB of memory at 704MB for crashkernel (System RAM: 32406MB)
[ 0.00] [ea00-ea0011ff] PMD -> [88047020-88047fdf] on node 0
[ 0.00] [ea001200-ea0021ff] PMD -> [88086f60-88087f5f] on node 1
[ 0.00] Zone ranges:
[ 0.00] DMA [mem 0x1000-0x00ff]
[ 0.00] DMA32 [mem 0x0100-0x]
[ 0.00] Normal [mem 0x1-0x87fff]
[ 0.00] Movable zone start for each node
[ 0.00] Early memory node ranges
[ 0.00] node 0: [mem 0x1000-0x00098fff]
[ 0.00] node 0: [mem 0x0010-0x696f7fff]
[ 0.00] node 0: [mem 0x1-0x47fff]
[ 0.00] node 1: [mem 0x48000-0x87fff]

Thanks.


Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-26 Thread Tang Chen




On 01/24/2014 06:31 AM, Dave Jones wrote:

On Thu, Jan 23, 2014 at 01:58:24AM -0500, Dave Jones wrote:

  >  128 bytes is a pretty small amount of stack though, so I'm just as confused
  >  as to what the actual bug here is.
  >
  >  After trying the proposed fix, I got another oops in the early init code..
  >
  >  trace
  >  nr_free_zone_pages
  >  nr_free_pagecache_pages
  >  build_all_zonelists
  >  start_kernel
  >  rip bc164b1e next_zones_zonelist
  >  rsp bcc01f00

Ok, this is crashing here in next_zones_zonelist...

 while (zonelist_zone_idx(z) > highest_zoneidx)
   de:   3b 77 08                cmp    0x8(%rdi),%esi


I stuck this at the top of the function..

printk(KERN_ERR "z:%p nodes:%p highest:%d\n", z, nodes, highest_zoneidx);

and got

z: 1d08   nodes: (null)  highest:3


Some build tests show..

MAXSMP ( NODESHIFT=10 ) : Bug
NRCPUS=4 & NODESHIFT=10 : Bug
NRCPUS=4 & NODESHIFT=1 : no bug


The middle config test was accidental, I hadn't realised disabling MAXSMP
wouldn't reset NODESHIFT to something sane.

I'll start bisecting, as MAXSMP worked fine until a few days ago.


Hi Dave,

I didn't reproduce this bug. Would you please share the bisect result ?

Thanks.



Dave





Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-23 Thread Dave Jones
On Thu, Jan 23, 2014 at 01:58:24AM -0500, Dave Jones wrote:

 > 128 bytes is a pretty small amount of stack though, so I'm just as confused
 > as to what the actual bug here is.
 > 
 > After trying the proposed fix, I got another oops in the early init code..
 > 
 > trace
 > nr_free_zone_pages
 > nr_free_pagecache_pages
 > build_all_zonelists
 > start_kernel
 > rip bc164b1e next_zones_zonelist
 > rsp bcc01f00

Ok, this is crashing here in next_zones_zonelist...

while (zonelist_zone_idx(z) > highest_zoneidx)
  de:   3b 77 08                cmp    0x8(%rdi),%esi


I stuck this at the top of the function..

printk(KERN_ERR "z:%p nodes:%p highest:%d\n", z, nodes, highest_zoneidx);

and got

z: 1d08   nodes: (null)  highest:3


Some build tests show..

MAXSMP ( NODESHIFT=10 ) : Bug
NRCPUS=4 & NODESHIFT=10 : Bug
NRCPUS=4 & NODESHIFT=1 : no bug


The middle config test was accidental, I hadn't realised disabling MAXSMP
wouldn't reset NODESHIFT to something sane.

I'll start bisecting, as MAXSMP worked fine until a few days ago.

Dave



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-23 Thread Dave Jones
On Wed, Jan 22, 2014 at 10:15:51PM -0800, David Rientjes wrote:
 > On Thu, 23 Jan 2014, Dave Jones wrote:
 > 
 > > It's 10, because I had MAXSMP set.
 > > 
 > > So, MAX_NUMNODES = 1 << 10
 > > 
 > > And the bitmask is made of longs. 1024 of them.
 > > 
 > > How does this work ?
 > > 
 > 
 > It's 1024 bits.

ok, I got lost in the maze of macros.

128 bytes is a pretty small amount of stack though, so I'm just as confused
as to what the actual bug here is.


After trying the proposed fix, I got another oops in the early init code..


trace
nr_free_zone_pages
nr_free_pagecache_pages
build_all_zonelists
start_kernel
rip bc164b1e next_zones_zonelist
rsp bcc01f00

I'll poke at it more in the morning. Too sleepy.

Dave



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Tang Chen

On 01/23/2014 02:13 PM, Dave Jones wrote:

On Wed, Jan 22, 2014 at 10:06:14PM -0800, David Rientjes wrote:
  >  On Thu, 23 Jan 2014, Tang Chen wrote:
  >

..

  >
  >  I guess it depends on what Dave's CONFIG_NODES_SHIFT is?

It's 10, because I had MAXSMP set.

So, MAX_NUMNODES = 1 << 10

And the bitmask is made of longs. 1024 of them.

How does this work ?


I have the same config with you.

Would you please try it for me ?  Does it work on your box ?

I cannot reproduce this problem on the latest kernel.
But I can reproduce it on 3.10.

Thanks



Dave



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread David Rientjes
On Thu, 23 Jan 2014, Dave Jones wrote:

> It's 10, because I had MAXSMP set.
> 
> So, MAX_NUMNODES = 1 << 10
> 
> And the bitmask is made of longs. 1024 of them.
> 
> How does this work ?
> 

It's 1024 bits.
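
[Note: a small user-space sketch of the arithmetic behind this answer. The macro names mirror the kernel's, but they are redefined locally here, and nodemask_sketch_t, nodemask_bytes and nodemask_longs are hypothetical. The mask holds one bit per possible node, packed into unsigned longs, so NODES_SHIFT = 10 gives 1024 bits rather than 1024 longs.]

```c
#include <limits.h>
#include <stddef.h>

#define NODES_SHIFT	10			/* the value reported with MAXSMP */
#define MAX_NUMNODES	(1 << NODES_SHIFT)	/* 1024 possible node ids */
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* One bit per node, rounded up to a whole number of longs. */
typedef struct {
	unsigned long bits[BITS_TO_LONGS(MAX_NUMNODES)];
} nodemask_sketch_t;

/* Total size of the mask in bytes. */
size_t nodemask_bytes(void)
{
	return sizeof(nodemask_sketch_t);
}

/* Number of longs backing the mask (16 when a long is 64 bits). */
size_t nodemask_longs(void)
{
	return BITS_TO_LONGS(MAX_NUMNODES);
}
```

The mask comes to 128 bytes either way, since 1024 bits divide evenly into 32-bit or 64-bit longs.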


Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Dave Jones
On Wed, Jan 22, 2014 at 10:06:14PM -0800, David Rientjes wrote:
 > On Thu, 23 Jan 2014, Tang Chen wrote:
 > 
 > > Dave found that the kernel will hang during boot. This is because
 > > the nodemask_t type stack variable numa_kernel_nodes is large enough
 > > to overflow the stack.
 > > 
 > > This doesn't always happen. According to Dave, this happened once
 > > in about five boots. The backtrace is like the following:
 > > 
 > > dump_stack
 > > panic
 > > ? numa_clear_kernel_node_hotplug
 > > __stack_chk_fail
 > > numa_clear_kernel_node_hotplug
 > > ? memblock_search_pfn_nid
 > > ? __early_pfn_to_nid
 > > numa_init
 > > x86_numa_init
 > > initmem_init
 > > setup_arch
 > > start_kernel
 > > 
 > > This patch fix this problem by defining numa_kernel_nodes as a
 > > static global variable in __initdata area.
 > > 
 > > Reported-by: Dave Jones 
 > > Signed-off-by: Tang Chen 
 > > Tested-by: Gu Zheng 
 > 
 > I guess it depends on what Dave's CONFIG_NODES_SHIFT is?

It's 10, because I had MAXSMP set.

So, MAX_NUMNODES = 1 << 10

And the bitmask is made of longs. 1024 of them.

How does this work ?

Dave



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread David Rientjes
On Thu, 23 Jan 2014, Tang Chen wrote:

> Dave found that the kernel will hang during boot. This is because
> the nodemask_t type stack variable numa_kernel_nodes is large enough
> to overflow the stack.
> 
> This doesn't always happen. According to Dave, this happened once
> in about five boots. The backtrace is like the following:
> 
> dump_stack
> panic
> ? numa_clear_kernel_node_hotplug
> __stack_chk_fail
> numa_clear_kernel_node_hotplug
> ? memblock_search_pfn_nid
> ? __early_pfn_to_nid
> numa_init
> x86_numa_init
> initmem_init
> setup_arch
> start_kernel
> 
> This patch fix this problem by defining numa_kernel_nodes as a
> static global variable in __initdata area.
> 
> Reported-by: Dave Jones <da...@redhat.com>
> Signed-off-by: Tang Chen <tangc...@cn.fujitsu.com>
> Tested-by: Gu Zheng <guz.f...@cn.fujitsu.com>

I guess it depends on what Dave's CONFIG_NODES_SHIFT is?

> ---
>  arch/x86/mm/numa.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> index 81b2750..ebefeb7 100644
> --- a/arch/x86/mm/numa.c
> +++ b/arch/x86/mm/numa.c
> @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
>   }
>  }
>  
> +static nodemask_t numa_kernel_nodes __initdata;
>  static void __init numa_clear_kernel_node_hotplug(void)
>  {
>   int i, nid;
> - nodemask_t numa_kernel_nodes;
>   unsigned long start, end;
>   struct memblock_type *type = &memblock.reserved;
>  

Isn't this also a bugfix since you never initialize numa_kernel_nodes when 
it's allocated on the stack with NODE_MASK_NONE?
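
To illustrate the initialization point, here is a minimal userspace sketch, not the kernel code. The `demo_` names and the 16-long mask size are assumptions made for the example, and `memset()` stands in for the kernel's `nodes_clear()` / `NODE_MASK_NONE`. An object with static storage duration starts out zeroed, while an automatic (stack) object has indeterminate contents until it is cleared explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for nodemask_t: 16 longs, chosen only for the demo. */
#define DEMO_MASK_LONGS 16

typedef struct {
    unsigned long bits[DEMO_MASK_LONGS];
} demo_nodemask_t;

/* Static storage duration: the C standard guarantees zero initialization. */
static demo_nodemask_t demo_static_mask;

/* Return 1 if no bit is set in the mask, 0 otherwise. */
static int demo_mask_empty(const demo_nodemask_t *m)
{
    size_t i;

    assert(m != NULL);
    for (i = 0; i < DEMO_MASK_LONGS; i++)
        if (m->bits[i] != 0)
            return 0;
    return 1;
}

/* The static mask is empty without any explicit clearing. */
int demo_static_mask_is_empty(void)
{
    return demo_mask_empty(&demo_static_mask);
}

/* A stack mask must be cleared before use; its initial contents are indeterminate. */
int demo_stack_mask_is_empty_after_clear(void)
{
    demo_nodemask_t m;

    memset(&m, 0, sizeof(m)); /* stands in for nodes_clear() / NODE_MASK_NONE */
    return demo_mask_empty(&m);
}
```

Under these assumptions, moving the variable to static storage gives it a zeroed starting state as a side effect, which is why the change can also be read as an initialization fix.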


Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Andrew Morton
On Thu, 23 Jan 2014 13:49:28 +0800 Tang Chen <tangc...@cn.fujitsu.com> wrote:

> Dave found that the kernel will hang during boot. This is because
> the nodemask_t type stack variable numa_kernel_nodes is large enough
> to overflow the stack.
> 
> This doesn't always happen. According to Dave, this happened once
> in about five boots. The backtrace is like the following:
> 
> dump_stack
> panic
> ? numa_clear_kernel_node_hotplug
> __stack_chk_fail
> numa_clear_kernel_node_hotplug
> ? memblock_search_pfn_nid
> ? __early_pfn_to_nid
> numa_init
> x86_numa_init
> initmem_init
> setup_arch
> start_kernel
> 
> This patch fix this problem by defining numa_kernel_nodes as a
> static global variable in __initdata area.
> 
> ...
>
> --- a/arch/x86/mm/numa.c
> +++ b/arch/x86/mm/numa.c
> @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
>   }
>  }
>  
> +static nodemask_t numa_kernel_nodes __initdata;
>  static void __init numa_clear_kernel_node_hotplug(void)
>  {
>   int i, nid;
> - nodemask_t numa_kernel_nodes;
>   unsigned long start, end;
>   struct memblock_type *type = &memblock.reserved;

Seems odd.  The maximum size of a nodemask_t is 128 bytes, isn't it? 
If so, what the heck have we done in there to consume so much stack?
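
For reference, the 128-byte figure can be checked with a small userspace sketch. The `DEMO_` macros below are hypothetical stand-ins that mirror the usual one-bit-per-node layout; they are not the kernel's own definitions, and the shift of 10 is the value reported in this thread for a MAXSMP build.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel configuration and helper macros. */
#define DEMO_NODES_SHIFT      10                        /* value reported with MAXSMP */
#define DEMO_MAX_NUMNODES     (1 << DEMO_NODES_SHIFT)   /* 1024 possible node ids */
#define DEMO_BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n) (((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* One bit per node, packed into an array of unsigned longs. */
typedef struct {
    unsigned long bits[DEMO_BITS_TO_LONGS(DEMO_MAX_NUMNODES)];
} demo_nodemask_t;

/* Size of the mask in bytes: 1024 bits / 8 bits per byte. */
size_t demo_nodemask_bytes(void)
{
    return sizeof(demo_nodemask_t);
}

/* Number of longs backing the mask (16 when long is 64 bits wide). */
size_t demo_nodemask_longs(void)
{
    size_t longs = sizeof(demo_nodemask_t) / sizeof(unsigned long);

    /* The longs together must hold exactly one bit per node. */
    assert(longs * DEMO_BITS_PER_LONG == DEMO_MAX_NUMNODES);
    return longs;
}
```

With 8-bit bytes the mask occupies 1024 / 8 = 128 bytes whether `long` is 32 or 64 bits wide, so the mask alone is far too small to exhaust the stack.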



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Dave Jones
On Thu, Jan 23, 2014 at 01:49:28PM +0800, Tang Chen wrote:
 
 > This doesn't always happen. According to Dave, this happened once
 > in about five boots. The backtrace is like the following:
 > 
 > dump_stack
 > panic
 > ? numa_clear_kernel_node_hotplug
 > __stack_chk_fail
 > numa_clear_kernel_node_hotplug
 > ? memblock_search_pfn_nid
 > ? __early_pfn_to_nid
 > numa_init
 > x86_numa_init
 > initmem_init
 > setup_arch
 > start_kernel
 > 
 > This patch fix this problem by defining numa_kernel_nodes as a
 > static global variable in __initdata area.
 > 
 > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
 > index 81b2750..ebefeb7 100644
 > --- a/arch/x86/mm/numa.c
 > +++ b/arch/x86/mm/numa.c
 > @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
 >  }
 >  }
 >  
 > +static nodemask_t numa_kernel_nodes __initdata;
 >  static void __init numa_clear_kernel_node_hotplug(void)
 >  {
 >  int i, nid;
 > -nodemask_t numa_kernel_nodes;
 >  unsigned long start, end;
 >  struct memblock_type *type = &memblock.reserved;

I'm surprised that this worked for anyone.
By my math, nodemask_t is 1024 longs, which should fill the whole stack.

Any idea why it only broke sometimes ?

There are other on-stack nodemask_t's in the tree too, why are they safe ?

Dave



Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread David Rientjes
On Thu, 23 Jan 2014, Dave Jones wrote:

> It's 10, because I had MAXSMP set.
> 
> So, MAX_NUMNODES = 1 << 10
> 
> And the bitmask is made of longs. 1024 of them.
> 
> How does this work ?
> 

It's 1024 bits.
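
A minimal userspace sketch of what "1024 bits" means for indexing follows. The `demo_` helpers are hypothetical and are not the kernel's `node_set()`; in particular, the range check is something this sketch adds for illustration. Each node id selects one bit: the word is `nid / BITS_PER_LONG` and the bit within it is `nid % BITS_PER_LONG`, so valid ids run from 0 to 1023.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins; 1024 matches MAX_NUMNODES = 1 << 10 from the thread. */
#define DEMO_MAX_NUMNODES  1024
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_MASK_LONGS    (DEMO_MAX_NUMNODES / DEMO_BITS_PER_LONG)

typedef struct {
    unsigned long bits[DEMO_MASK_LONGS];
} demo_nodemask_t;

/* Set the bit for nid; returns 0 on success, -1 if nid is out of range. */
int demo_node_set(int nid, demo_nodemask_t *mask)
{
    assert(mask != NULL);
    if (nid < 0 || nid >= DEMO_MAX_NUMNODES)
        return -1; /* range check added by this sketch */
    mask->bits[(size_t)nid / DEMO_BITS_PER_LONG] |=
        1UL << ((size_t)nid % DEMO_BITS_PER_LONG);
    return 0;
}

/* Return 1 if the bit for nid is set, 0 if it is clear or nid is out of range. */
int demo_node_isset(int nid, const demo_nodemask_t *mask)
{
    assert(mask != NULL);
    if (nid < 0 || nid >= DEMO_MAX_NUMNODES)
        return 0;
    return (int)((mask->bits[(size_t)nid / DEMO_BITS_PER_LONG] >>
                  ((size_t)nid % DEMO_BITS_PER_LONG)) & 1UL);
}
```

In this sketch an id of 1024 is rejected because the last valid bit is 1023; without such a check, the write would land one long past the end of the array.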


Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.

2014-01-22 Thread Tang Chen

On 01/23/2014 02:13 PM, Dave Jones wrote:
> On Wed, Jan 22, 2014 at 10:06:14PM -0800, David Rientjes wrote:
> > On Thu, 23 Jan 2014, Tang Chen wrote:
> > >
> > > ..
> > >
> > I guess it depends on what Dave's CONFIG_NODES_SHIFT is?
>
> It's 10, because I had MAXSMP set.
>
> So, MAX_NUMNODES = 1 << 10
>
> And the bitmask is made of longs. 1024 of them.
>
> How does this work ?

I have the same config as you.

Would you please try it for me ?  Does it work on your box ?

I cannot reproduce this problem on the latest kernel.
But I can reproduce it on 3.10.

Thanks

> Dave



