On Mon, Nov 26, 2018 at 04:45:54AM -0500, Qian Cai wrote:
> On 11/25/18 11:52 PM, Qian Cai wrote:
> > BTW, calling debug_objects_mem_init() before kmemleak_init() actually
> > could trigger a loop on machines with 160+ CPUs until the pool is
> > filled up,
> >
> > debug_objects_pool_min_level += num_possible_cpus() * 4;
> >
> > [1] while (obj_pool_free < debug_objects_pool_min_level)
> >
> > kmemleak_init
> >   kmemleak_ignore (from replaced static debug objects)
> >     make_black_object
> >       put_object
> >         __call_rcu (kernel/rcu/tree.c)
> >           debug_rcu_head_queue
> >             debug_object_activate
> >               debug_object_init
> >                 fill_pool
> >                   kmemleak_ignore (looping in [1])
> >                     make_black_object
> >                     ...
> >
> > I think until this is resolved, there is no way to move
> > debug_objects_mem_init() before kmemleak_init().
>
> I believe this is a separate issue: kmemleak is broken with
> CONFIG_DEBUG_OBJECTS_RCU_HEAD anyway, since the infinite loop above can
> be triggered in the existing code as well. Once the pool needs to be
> refilled (fill_pool()) after the system boots, debug object creation
> calls kmemleak_ignore(), which creates a new RCU debug object
> (debug_object_init()), which calls fill_pool() again, and so on. As a
> result, the system locks up during kernel compilations.
>
> Hence, I'll send out a patch for debug objects with large CPU counts
> anyway and deal with the kmemleak + CONFIG_DEBUG_OBJECTS_RCU_HEAD issue
> later.
I haven't hit this before, but I can see why it happens. Kmemleak uses
RCU for freeing its own data structures to avoid a recursive call into
the sl*b allocator (kmem_cache_free()). Since we already tell kmemleak
to ignore the debug_objects_cache allocations, I think we could just as
well add SLAB_NOLEAKTRACE to the kmem_cache_create() call (I haven't
tried it yet; this was never a documented kmemleak API, rather something
used internally to keep kmemleak from tracking its own allocations).

-- 
Catalin
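Untested sketch of what that change would look like; the exact context of the kmem_cache_create() call in lib/debugobjects.c may differ between kernel versions:

```diff
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ void __init debug_objects_mem_init(void)
-	obj_cache = kmem_cache_create("debug_objects_cache",
-				      sizeof (struct debug_obj), 0,
-				      SLAB_DEBUG_OBJECTS, NULL);
+	obj_cache = kmem_cache_create("debug_objects_cache",
+				      sizeof (struct debug_obj), 0,
+				      SLAB_DEBUG_OBJECTS | SLAB_NOLEAKTRACE,
+				      NULL);
```

With SLAB_NOLEAKTRACE set on the cache, kmemleak never creates tracking objects for these allocations in the first place, so the kmemleak_ignore() call (and the RCU-freed black object it creates) drops out of the fill_pool() path entirely.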