On Tue, 22 Apr 2014, Peter Zijlstra wrote:

> On Tue, Apr 22, 2014 at 01:01:51PM -0700, Andrew Morton wrote:
> > On Tue, 22 Apr 2014 10:15:15 +0200 Peter Zijlstra <pet...@infradead.org> wrote:
> >
> > > On Tue, Apr 22, 2014 at 01:27:15PM +0800, Jiang Liu wrote:
> > > > When calling kzalloc_node(size, flags, node), we should first check
> > > > whether node is onlined, otherwise it may cause invalid memory access
> > > > as below.
> > >
> > > But this is only for memory less node crap, right?
> >
> > um, why are memoryless nodes crap?
>
> Why wouldn't they be? Having CPUs with no local memory seems decidedly
> suboptimal.

The quick fix for memoryless node issues is usually just to use
cpu_to_mem() rather than cpu_to_node() in the caller. This assumes that
the arch is set up correctly to handle memoryless nodes with
CONFIG_HAVE_MEMORYLESS_NODES (and we've had problems recently with
memoryless nodes not being configured correctly on powerpc). That type of
fix would probably be better handled in the slab allocator, though:
kmalloc_node(nid) shouldn't crash just because nid is memoryless, so we
should be doing local_memory_node(node) when allocating the slab pages.

However, I don't think memoryless nodes are the problem here. Jiang is
testing for !node_online(nid) in his patch, so the problem is
cpu_to_node() pointing to an offline node. It makes sense for the page
allocator to crash in such a case, since the node id is erroneous.

So either the cpu-to-node mapping is invalid or alloc_fair_sched_group()
is allocating memory for a cpu on an offline node. The
for_each_possible_cpu() looks suspicious, since it also visits cpus that
are possible but not online. There's no guarantee that
local_memory_node(node) for an offline node will return anything with
affinity, so falling back to NUMA_NO_NODE looks appropriate in Jiang's
patch.
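
To make the two cases concrete, here is a rough userspace model of the
node selection discussed above. It is only a sketch and not kernel code:
the mock_* helpers and the topology tables are hypothetical stand-ins for
node_online(), cpu_to_node() and cpu_to_mem(), and combining both checks
in one helper is my own illustration, not what Jiang's patch does.

```c
#include <stdbool.h>

/* Userspace model only; nothing here is the kernel implementation. */
#define NUMA_NO_NODE  (-1)  /* "no node preference", same value as the kernel's */
#define MOCK_NR_NODES 4     /* hypothetical machine size */
#define MOCK_NR_CPUS  4

/* Hypothetical topology: node 1 is online but memoryless, node 2 is offline. */
static const bool mock_node_is_online[MOCK_NR_NODES] = { true, true, false, true };

/* Node each cpu belongs to (what cpu_to_node() would report). */
static const int mock_cpu_node[MOCK_NR_CPUS] = { 0, 1, 2, 3 };

/* Nearest node with memory for each cpu (what cpu_to_mem() would report). */
static const int mock_cpu_mem[MOCK_NR_CPUS] = { 0, 0, 2, 3 };

static bool mock_node_online(int nid)
{
	return nid >= 0 && nid < MOCK_NR_NODES && mock_node_is_online[nid];
}

static int mock_cpu_to_node(int cpu)
{
	return mock_cpu_node[cpu];
}

static int mock_cpu_to_mem(int cpu)
{
	return mock_cpu_mem[cpu];
}

/*
 * Pick the node hint to hand to a kzalloc_node()-style allocator.
 * Offline node: the id is not usable, so express no preference.
 * Online node: prefer the nearest node that actually has memory.
 */
static int mock_alloc_node_for_cpu(int cpu)
{
	int nid = mock_cpu_to_node(cpu);

	if (!mock_node_online(nid))
		return NUMA_NO_NODE;

	return mock_cpu_to_mem(cpu);
}
```

With the tables above, cpu 1 (memoryless node 1) is steered to node 0,
while cpu 2 (offline node 2) gets NUMA_NO_NODE and leaves the choice to
the allocator.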