Hi, all
        I am working on some debug features' development.
I use kmalloc in some places of *scheduler*. And the gfp_flag is GFP_ATOMIC, 
code looks like 
p = kmalloc(sizeof(*p), GFP_ATOMIC);

however I notice GFP_ATOMIC is still not enough. because when system is at low 
memory state, slub might try to wakeup kswapd. then some weird issues hit.

for example, this is one bug log.

[   83.713009, 0]BUG: spinlock recursion on CPU#0, rcu_preempt/7
[   83.719463, 0] lock: 0xffff88007ac12140, .magic: dead4ead, .owner: 
rcu_preempt/7, .owner_cpu: 0
[   83.729211, 0]CPU: 0 PID: 7 Comm: rcu_preempt Tainted: G        W    
3.14.37-x86_64-gfa0f236-dirty #364
[   83.739733, 0]Hardware name: Intel Corporation CHERRYVIEW C0 PLATFORM/Cherry 
Trail FFD, BIOS BYO-P2.X64.0023.R03.1509161045 09/16/2015
[   83.753266, 0] ffff88007ac12140 ffff880074863880 ffffffff8198bca9 
ffff8800749118b0
[   83.761871, 0] ffff8800748638a0 ffffffff819886d2 ffff88007ac12140 
ffffffff81d89797
[   83.770451, 0] ffff8800748638c0 ffffffff819886fd ffff88007ac12140 
ffff880070dc38a8
[   83.779066, 0]Call Trace:
[   83.782014, 0] [<ffffffff8198bca9>] dump_stack+0x4e/0x7a
[   83.787975, 0] [<ffffffff819886d2>] spin_dump+0x91/0x96
[   83.793839, 0] [<ffffffff819886fd>] spin_bug+0x26/0x2b
[   83.799606, 0] [<ffffffff810d3366>] do_raw_spin_lock+0x116/0x140
[   83.806344, 0] [<ffffffff81998d8f>] _raw_spin_lock+0x1f/0x30
[   83.812693, 0] [<ffffffff810bb884>] try_to_wake_up+0x154/0x2c0
[   83.819237, 0] [<ffffffff810bba62>] default_wake_function+0x12/0x20
[   83.826265, 0] [<ffffffff810cb1e8>] autoremove_wake_function+0x18/0x40
[   83.833585, 0] [<ffffffff810caaf8>] __wake_up_common+0x58/0x90
[   83.840128, 0] [<ffffffff810cad29>] __wake_up+0x39/0x50
[   83.845994, 0] [<ffffffff811659cd>] wakeup_kswapd+0xcd/0x140
[   83.852343, 0] [<ffffffff8115cbdd>] __alloc_pages_nodemask+0x95d/0xa30
[   83.859666, 0] [<ffffffff81195d2f>] new_slab+0x6f/0x2b0
[   83.865529, 0] [<ffffffff81989e5d>] __slab_alloc.constprop.64+0x26c/0x49f
[   83.873141, 0] [<ffffffff810af6c5>] ? insert_kill_task+0x25/0xa0
[   83.879880, 0] [<ffffffff81387f54>] ? __list_del_entry+0x14/0xf0
[   83.886618, 0] [<ffffffff811975f4>] kmem_cache_alloc_trace+0x174/0x1b0
[   83.893938, 0] [<ffffffff810af6c5>] ? insert_kill_task+0x25/0xa0
[   83.900677, 0] [<ffffffff810af6c5>] insert_kill_task+0x25/0xa0 //this 
function is simple, just treat it as kmalloc :)
[   83.907220, 0] [<ffffffff81994846>] __schedule+0x6a6/0x870
[   83.913375, 0] [<ffffffff81994a39>] schedule+0x29/0x70
[   83.919141, 0] [<ffffffff81993b02>] schedule_timeout+0x172/0x310
[   83.925879, 0] [<ffffffff81998e8e>] ? _raw_spin_unlock_irqrestore+0x1e/0x40
[   83.933685, 0] [<ffffffff810937c0>] ? __internal_add_timer+0x130/0x130
[   83.941005, 0] [<ffffffff810cb047>] ? prepare_to_wait_event+0x87/0xf0
[   83.948230, 0] [<ffffffff810e371a>] rcu_gp_kthread+0x40a/0x6e0
[   83.954774, 0] [<ffffffff810cb1d0>] ? abort_exclusive_wait+0xb0/0xb0
[   83.961899, 0] [<ffffffff810e3310>] ? rcu_try_advance_all_cbs+0xf0/0xf0
[   83.969315, 0] [<ffffffff810aa8a4>] kthread+0xe4/0x100
[   83.975081, 0] [<ffffffff810aa7c0>] ? kthread_create_on_node+0x190/0x190
[   83.982596, 0] [<ffffffff819a0f48>] ret_from_fork+0x58/0x90
[   83.988847, 0] [<ffffffff810aa7c0>] ? kthread_create_on_node+0x190/0x190

After some simple check, I change my codes. this time code looks like:
p = kmalloc(sizeof(*p), GFP_ATOMIC | __GFP_NO_KSWAPD);
I think this flag will forbid slub to call any scheduler codes. But issue still 
hit. :(

my test result shows that __GFP_NO_KSWAPD is cleared when slub pass gfp_flag to 
page allocator!!!

at last I found it is clear by codes below.
1441 static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
1442 {
1443         if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
1444                 pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
1445                 BUG();
1446         }
1447 
1448         return allocate_slab(s,
1449                 flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), 
node);//all other flags will be cleared. my god!!!
1450 }

I think GFP_RECLAIM_MASK should include as many available flags as possible. :)

thanks
xinhui

On 2015年10月14日 13:36, Pan Xinhui wrote:
> From: Pan Xinhui <xinhuix....@intel.com>
> 
> GFP_RECLAIM_MASK was introduced in commit 6cb062296f73 ("Categorize GFP
> flags"). In slub subsystem, this macro controls slub's allocation
> behavior. In particular, some flags which are not in GFP_RECLAIM_MASK
> will be cleared. So when slub pass this new gfp_flag into page
> allocator, we might lost some very important flags.
> 
> There are some mistakes when we introduce __GFP_NO_KSWAPD. This flag is
> used to avoid any scheduler-related codes recursive.  But it seems like
> patch author forgot to add it into GFP_RECLAIM_MASK. So lets add it now.
> 
> Signed-off-by: Pan Xinhui <xinhuix....@intel.com>
> ---
>  include/linux/gfp.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index f92cbd2..9ebad4d 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -130,7 +130,8 @@ struct vm_area_struct;
>  /* Control page allocator reclaim behavior */
>  #define GFP_RECLAIM_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS|\
>                       __GFP_NOWARN|__GFP_REPEAT|__GFP_NOFAIL|\
> -                     __GFP_NORETRY|__GFP_MEMALLOC|__GFP_NOMEMALLOC)
> +                     __GFP_NORETRY|__GFP_MEMALLOC|__GFP_NOMEMALLOC|\
> +                     __GFP_NO_KSWAPD)
>  
>  /* Control slab gfp mask during early boot */
>  #define GFP_BOOT_MASK (__GFP_BITS_MASK & ~(__GFP_WAIT|__GFP_IO|__GFP_FS))
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to