On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote:
> On 01/28/2014 08:32 AM, David Rientjes wrote:
> > On Wed, 22 Jan 2014, David Rientjes wrote:
> >
> >>>  arch/x86/mm/numa.c | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> >>> index 81b2750..ebefeb7 100644
> >>> --- a/arch/x86/mm/numa.c
> >>> +++ b/arch/x86/mm/numa.c
> >>> @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
> >>>  		}
> >>>  }
> >>>
> >>> +static nodemask_t numa_kernel_nodes __initdata;
> >>>  static void __init numa_clear_kernel_node_hotplug(void)
> >>>  {
> >>>  	int i, nid;
> >>> -	nodemask_t numa_kernel_nodes;
> >>>  	unsigned long start, end;
> >>>  	struct memblock_type *type =&memblock.reserved;
> >>>
> >>
> >> Isn't this also a bugfix since you never initialize numa_kernel_nodes when
> >> it's allocated on the stack with NODE_MASK_NONE?
> >>
> >
> > This hasn't been answered and the patch still isn't in linux-kernel yet
> > Dave tested it as good.  I'm suspicious of the changelog that indicates
> > this nodemask is the result of a stack overflow itself which only manages
> > to reproduce itself in the init patch slightly more than 50% of the time.
> > How is that possible?
> >
> > I think the changelog should indicate this also fixes an uninitialized
> > nodemask issue.
>
> Hi David,
>
> I'm still working on this problem, but unfortunately nothing new for now.
> And the test till now shows no more problem here.
>
> I'm digging into it, but need more time.
>
> I'll resend a new patch and modify the changelog soon. Before we find the
> root cause, I think we can use this patch as a temporary solution.

Ok, I hit the 2nd bug again (oops in next_zones_zonelist...)

I did a bisect with the patch above applied each step of the way.
This time I got a plausible looking result....

a0acda917284183f9b71e2d08b0aa0aea722b321 is the first bad commit
commit a0acda917284183f9b71e2d08b0aa0aea722b321
Author: Tang Chen <tangc...@cn.fujitsu.com>
Date:   Tue Jan 21 15:49:32 2014 -0800

    acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggable

Reverting this commit of course removes the whole function from above,
so we haven't really learned anything new, other than that commit is broken,
even after the above fix-up.

	Dave
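
For reference, the uninitialized-nodemask point raised in the quoted thread can be sketched in plain userspace C. This is only an illustration, not the kernel code: `nodemask_sketch_t`, `MAX_NODES_SKETCH` and the helper names below are made up for the example and merely stand in for `nodemask_t`, `MAX_NUMNODES` and the kernel's nodemask helpers. It shows why moving the mask to static storage also fixes the missing initialization: an object with static storage duration starts out zeroed, while one on the stack has indeterminate contents until it is cleared explicitly.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's MAX_NUMNODES and nodemask_t. */
#define MAX_NODES_SKETCH 64
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS ((MAX_NODES_SKETCH + BITS_PER_WORD - 1) / BITS_PER_WORD)

typedef struct {
	unsigned long bits[MASK_WORDS];
} nodemask_sketch_t;

/* Static storage duration: C guarantees this starts out as all zeros. */
static nodemask_sketch_t static_mask;

static void mask_set(nodemask_sketch_t *m, unsigned nid)
{
	assert(nid < MAX_NODES_SKETCH);
	m->bits[nid / BITS_PER_WORD] |= 1UL << (nid % BITS_PER_WORD);
}

/* Count the bits set in the mask. */
static int mask_weight(const nodemask_sketch_t *m)
{
	int count = 0;

	for (size_t i = 0; i < MASK_WORDS; i++) {
		unsigned long w = m->bits[i];

		while (w) {
			w &= w - 1;	/* clear the lowest set bit */
			count++;
		}
	}
	return count;
}

/*
 * Stack variant: the mask must be cleared before bits are OR-ed in.
 * Without the memset() the initial contents are indeterminate, so the
 * result could contain stale bits from whatever was on the stack.
 */
int count_nodes_on_stack(const unsigned *nids, size_t n)
{
	nodemask_sketch_t mask;

	memset(&mask, 0, sizeof(mask));
	for (size_t i = 0; i < n; i++)
		mask_set(&mask, nids[i]);
	return mask_weight(&mask);
}

/*
 * Static variant, meant to be called once (as an init-time function
 * would be): no explicit clear is needed on the first call because the
 * object was zeroed at program start.
 */
int count_nodes_static(const unsigned *nids, size_t n)
{
	for (size_t i = 0; i < n; i++)
		mask_set(&static_mask, nids[i]);
	return mask_weight(&static_mask);
}
```

In the kernel itself the equivalent of the memset() would be initializing with NODE_MASK_NONE, as suggested in the quoted review, or calling nodes_clear() on the mask.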