On 06/04/2015 02:45 AM, Gu Zheng wrote:
> The following lockdep warning occurs when running with the latest kernel:
> [ 3.178000] ------------[ cut here ]------------
> [ 3.183000] WARNING: CPU: 128 PID: 0 at kernel/locking/lockdep.c:2755 lockdep_trace_alloc+0xdd/0xe0()
> [ 3.193000] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
> [ 3.199000] Modules linked in:
>
> [ 3.203000] CPU: 128 PID: 0 Comm: swapper/128 Not tainted 4.1.0-rc3 #70
> [ 3.221000] 0000000000000000 2d6601fb3e6d4e4c ffff88086fd5fc38 ffffffff81773f0a
> [ 3.230000] 0000000000000000 ffff88086fd5fc90 ffff88086fd5fc78 ffffffff8108c85a
> [ 3.238000] ffff88086fd60000 0000000000000092 ffff88086fd60000 00000000000000d0
> [ 3.246000] Call Trace:
> [ 3.249000] [<ffffffff81773f0a>] dump_stack+0x4c/0x65
> [ 3.255000] [<ffffffff8108c85a>] warn_slowpath_common+0x8a/0xc0
> [ 3.261000] [<ffffffff8108c8e5>] warn_slowpath_fmt+0x55/0x70
> [ 3.268000] [<ffffffff810ee24d>] lockdep_trace_alloc+0xdd/0xe0
> [ 3.274000] [<ffffffff811cda0d>] __alloc_pages_nodemask+0xad/0xca0
> [ 3.281000] [<ffffffff810ec7ad>] ? __lock_acquire+0xf6d/0x1560
> [ 3.288000] [<ffffffff81219c8a>] alloc_page_interleave+0x3a/0x90
> [ 3.295000] [<ffffffff8121b32d>] alloc_pages_current+0x17d/0x1a0
> [ 3.301000] [<ffffffff811c869e>] ? __get_free_pages+0xe/0x50
> [ 3.308000] [<ffffffff811c869e>] __get_free_pages+0xe/0x50
> [ 3.314000] [<ffffffff8102640b>] init_espfix_ap+0x17b/0x320
> [ 3.320000] [<ffffffff8105c691>] start_secondary+0xf1/0x1f0
> [ 3.327000] ---[ end trace 1b3327d9d6a1d62c ]---
>
> We allocate pages with GFP_KERNEL in init_espfix_ap(), which is called
> before local irqs are enabled. The lockdep subsystem treats this as
> allocating memory with GFP_FS while local irqs are disabled, and
> triggers the warning shown above.
>
> We could allocate the pages on the boot CPU side and hand them over to
> the secondary CPU, but that seems a bit wasteful if some of the CPUs are
> offline. As there is no need for these pages (the espfix stack) until we
> try to run user code, we postpone the initialization of the espfix stack
> and let the boot-up routine init the espfix stack for the target CPU
> after it has booted, to avoid the noise.
>
It isn't *at all* obvious, to me at least, that if the GFP_KERNEL
allocation fails we may not get rescheduled on another CPU and/or get
stuck.

I'm starting to think that the right thing to do is to allocate these on
the CPU that is bringing up the other CPU, at the same time we allocate
the percpu area. This won't affect offline CPUs.

	-hpa
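
For illustration only, the "allocate on the CPU doing the bringup" idea
above can be modeled in user-space C. This is a rough sketch, not kernel
code and not a proposed patch: struct cpu_model, alloc_may_sleep(),
prepare_cpu(), start_secondary_model() and bring_up_cpu() are all
hypothetical names, and calloc() merely stands in for a GFP_KERNEL
allocation. The counter records any "may sleep" allocation made while the
caller has interrupts off, which is the condition lockdep complained about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS   4     /* hypothetical CPU count for this model */
#define PAGE_SIZE 4096  /* assumed page size, for illustration only */

/* Hypothetical per-CPU state; not the kernel's real data structures. */
struct cpu_model {
    bool  online;
    bool  irqs_enabled;
    void *percpu_area;
    void *espfix_page;
};

static struct cpu_model cpus[NR_CPUS];
static int sleeping_allocs_with_irqs_off;

/* Stand-in for a GFP_KERNEL allocation: it may sleep, so it is only
 * safe when the calling CPU has interrupts enabled. */
static void *alloc_may_sleep(const struct cpu_model *caller, size_t size)
{
    if (!caller->irqs_enabled)
        sleeping_allocs_with_irqs_off++;  /* the case lockdep warns about */
    return calloc(1, size);
}

/* The boot CPU is already running with interrupts enabled. */
static void boot_cpu_init(void)
{
    cpus[0].online = true;
    cpus[0].irqs_enabled = true;
}

/* Runs on the CPU doing the bringup: allocate the espfix page at the
 * same time as the percpu area, before the target CPU starts. */
static int prepare_cpu(const struct cpu_model *boot, int target)
{
    struct cpu_model *t = &cpus[target];

    t->percpu_area = alloc_may_sleep(boot, PAGE_SIZE);
    t->espfix_page = alloc_may_sleep(boot, PAGE_SIZE);
    if (!t->percpu_area || !t->espfix_page) {
        free(t->percpu_area);
        free(t->espfix_page);
        t->percpu_area = NULL;
        t->espfix_page = NULL;
        return -1;  /* failure is reported on the bringup CPU */
    }
    return 0;
}

/* Runs on the secondary CPU with interrupts off: it only consumes the
 * pages handed over to it and never allocates. */
static void start_secondary_model(int cpu)
{
    struct cpu_model *self = &cpus[cpu];

    assert(!self->irqs_enabled);
    assert(self->espfix_page != NULL);
    self->online = true;
    self->irqs_enabled = true;
}

static int bring_up_cpu(int boot_cpu, int target)
{
    if (prepare_cpu(&cpus[boot_cpu], target) != 0)
        return -1;
    start_secondary_model(target);
    return 0;
}
```

In this model, calling boot_cpu_init() and then bring_up_cpu(0, 1) leaves
CPU 1 online with its espfix page set, while CPUs that are never brought
up get no pages at all, and no sleeping allocation happens with
interrupts off.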