Hi Chris,

After another round of code review, I found that my previous analysis of the BUG_ON() issue was incorrect.

The real issue is that early_kmem_cache_node_alloc() calls inc_slabs_node(kmem_cache_node, node, page->objects) to increase the object count on the local node, regardless of whether the page was allocated from the local node or from a remote node. With the current implementation this is OK, because every memory node has normal memory, so the page is always allocated from the local node.

We are now working on a patch set to improve memory hotplug. The basic idea is to let some memory nodes host only ZONE_MOVABLE, so that we can easily remove a whole memory node when needed. That means some memory nodes have no ZONE_NORMAL/ZONE_DMA, and in early_kmem_cache_node_alloc() the page will be allocated from a remote node. But early_kmem_cache_node_alloc() still increases the object count on the local node, which eventually triggers the BUG_ON() when the affected memory node is removed.

I will try to work out another version of the patch.

Thanks!
Gerry
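To illustrate the accounting mismatch described above, here is a minimal user-space sketch. It is a toy model, not kernel code: every name in it (toy_node, toy_page, toy_early_alloc, and so on) is hypothetical, and it assumes that releasing a slab decrements the counter of the node the page actually lives on. Charging the page's own node is shown only as one possible way to keep the counters balanced, not necessarily what the next patch version will do.

```c
#include <stdbool.h>

#define MAX_NODES 2

/* Toy stand-ins for kernel structures; all names are hypothetical. */
struct toy_node {
	bool has_normal_memory;	/* false models a ZONE_MOVABLE-only node */
	long nr_slabs;		/* per-node slab counter (debug accounting) */
};

struct toy_page {
	int nid;		/* node the page was actually taken from */
};

static struct toy_node nodes[MAX_NODES];

/* Node 0 has normal memory, node 1 is ZONE_MOVABLE-only. */
static void toy_reset(void)
{
	nodes[0].has_normal_memory = true;
	nodes[0].nr_slabs = 0;
	nodes[1].has_normal_memory = false;
	nodes[1].nr_slabs = 0;
}

/* Prefer 'node', fall back to any node with normal memory; -1 if none. */
static int toy_alloc_page(int node, struct toy_page *page)
{
	if (nodes[node].has_normal_memory) {
		page->nid = node;
		return 0;
	}
	for (int i = 0; i < MAX_NODES; i++) {
		if (nodes[i].has_normal_memory) {
			page->nid = i;
			return 0;
		}
	}
	return -1;
}

/*
 * Models the early per-node allocation. With charge_page_node == false
 * the requested node is charged (the behaviour described in the mail);
 * with true, the node that supplied the page is charged instead.
 */
static int toy_early_alloc(int node, bool charge_page_node,
			   struct toy_page *page)
{
	if (toy_alloc_page(node, page) != 0)
		return -1;
	if (charge_page_node)
		nodes[page->nid].nr_slabs++;
	else
		nodes[node].nr_slabs++;
	return 0;
}

/* Assumption of this model: release uncharges the page's own node. */
static void toy_release(const struct toy_page *page)
{
	nodes[page->nid].nr_slabs--;
}

/* Models the offline-time check: no slabs may still be charged. */
static bool toy_node_can_offline(int node)
{
	return nodes[node].nr_slabs == 0;
}
```

In this model, calling toy_early_alloc(1, false, &page) takes the page from node 0 but charges node 1, so after toy_release(&page) node 1 still has a nonzero count and toy_node_can_offline(1) returns false. With charge_page_node set to true, the charge and the release hit the same node and the counters return to zero.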
On 07/18/2012 01:39 AM, Christoph Lameter wrote:
> On Wed, 18 Jul 2012, Jiang Liu wrote:
>
>> From: Jianguo Wu <wujian...@huawei.com>
>>
>> SLUB allocator may cause a BUG_ON() when offlining a memory node if
>> CONFIG_SLUB_DEBUG is on. The scenario is:
>>
>> 1) when creating kmem_cache_node slab, it cause inc_slabs_node() twice.
>> early_kmem_cache_node_alloc
>>   ->new_slab
>>     ->inc_slabs_node
>>   ->inc_slabs_node
>
> New slab will not be able to increment the slab counter. It will
> check that there is no per node structure yet and then skip the inc slabs
> node.
>
> This suggests that a call to early_kmem_cache_node_alloc was not needed
> because the per node structure already existed. Lets fix that instead.