Re: [RFC PATCH] mm: memcg: fix css double put in mem_cgroup_iter
On Thu 27-07-17 11:30:50, Wenwei Tao wrote:
> 2017-07-26 21:44 GMT+08:00 Michal Hocko:
> > On Wed 26-07-17 21:07:42, Wenwei Tao wrote:
[...]
> >> I think there is a css double put in mem_cgroup_iter. Under reclaim,
> >> we call mem_cgroup_iter the first time with prev == NULL, get the
> >> last_visited memcg from the per-zone reclaim_iter, and call
> >> __mem_cgroup_iter_next to find the next alive memcg.
> >> __mem_cgroup_iter_next can return NULL if last_visited is already
> >> the last one, so we put the last_visited memcg's css and continue to
> >> the next iteration of the while loop. This time we might not do
> >> css_tryget(&last_visited->css) if the dead_count has changed, but
> >> we still do css_put(&last_visited->css). We put it twice, which could
> >> trigger the BUG_ON at kernel/cgroup.c:893.
> >
> > Yes, I guess you are right and I suspect that this has been silently
> > fixed by 519ebea3bf6d ("mm: memcontrol: factor out reclaim iterator
> > loading and updating"). I think a more appropriate fix would be the
> > one below. Are you able to reproduce and re-test it?
> > ---
>
> Yes, I think this commit fixes the issue. I backported it to the
> 3.10.107 kernel and can no longer reproduce the problem. I guess
> this commit might need to be backported to the 3.10.y stable kernel.

Please send it to the kernel-stable mailing list. 3.10 still seems to be
maintained.
--
Michal Hocko
SUSE Labs
Re: [RFC PATCH] mm: memcg: fix css double put in mem_cgroup_iter
2017-07-26 21:44 GMT+08:00 Michal Hocko:
> On Wed 26-07-17 21:07:42, Wenwei Tao wrote:
>> From: Wenwei Tao
>>
>> By removing the child cgroup while the parent cgroup is
>> under reclaim, we could trigger the following kernel panic
>> on kernel 3.10:
>>
>> kernel BUG at kernel/cgroup.c:893!
>> invalid opcode: [#1] SMP
>> CPU: 1 PID: 22477 Comm: kworker/1:1 Not tainted 3.10.107 #1
>> Workqueue: cgroup_destroy css_dput_fn
>> task: 8817959a5780 ti: 8817e8886000 task.ti: 8817e8886000
>> RIP: 0010:[] []
>> cgroup_diput+0xc0/0xf0
>> RSP: :8817e8887da0 EFLAGS: 00010246
>> RAX: RBX: 8817a5dd5d40 RCX: dead0200
>> RDX: RSI: 8817973a6910 RDI: 8817f54c2a00
>> RBP: 8817e8887dc8 R08: 8817a5dd5dd0 R09: df9fb35794b01820
>> R10: df9fb35794b01820 R11: 7fa95b1efcda R12: 8817a5dd5d9c
>> R13: 8817f38b3a40 R14: 8817973a6910 R15: 8817973a6910
>> FS: () GS:88181f22()
>> knlGS:
>> CS: 0010 DS: ES: CR0: 80050033
>> CR2: 7fa6e6234000 CR3: 00179f19d000 CR4: 000407e0
>> DR0: DR1: DR2:
>> DR3: DR6: 0ff0 DR7: 0400
>> Stack:
>> 8817a5dd5d40 8817a5dd5d9c 8817f38b3a40 8817973a6910
>> 0040 8817e8887df8 811b37c2 8817fa23c000
>> 8817f57dbb80 88181f232ac0 88181f237500 8817e8887e10
>> Call Trace:
>> [] dput+0x1a2/0x2f0
>> [] cgroup_dput.isra.21+0x1c/0x30
>> [] css_dput_fn+0x1d/0x20
>> [] process_one_work+0x17c/0x460
>> [] worker_thread+0x116/0x3b0
>> [] ? manage_workers.isra.25+0x290/0x290
>> [] kthread+0xc0/0xd0
>> [] ? insert_kthread_work+0x40/0x40
>> [] ret_from_fork+0x58/0x90
>> [] ? insert_kthread_work+0x40/0x40
>> Code: 41 5e 41 5f 5d c3 0f 1f 44 00 00 48 8b 7f 78 48 8b 07 a8 01 74 15
>> 48 81 c7 30 01 00 00 48 c7 c6 a0 a7 0c 81 e8 b2 83 02 00 eb c8 <0f> 0b
>> 49 8b 4e 18 48 c7 c2 7e f1 7a 81 be 85 03 00 00 48 c7 c7
>> RIP [] cgroup_diput+0xc0/0xf0
>> RSP
>> ---[ end trace 85eeea5212c44f51 ]---
>>
>>
>> I think there is a css double put in mem_cgroup_iter. Under reclaim,
>> we call mem_cgroup_iter the first time with prev == NULL, get the
>> last_visited memcg from the per-zone reclaim_iter, and call
>> __mem_cgroup_iter_next to find the next alive memcg.
>> __mem_cgroup_iter_next can return NULL if last_visited is already
>> the last one, so we put the last_visited memcg's css and continue to
>> the next iteration of the while loop. This time we might not do
>> css_tryget(&last_visited->css) if the dead_count has changed, but
>> we still do css_put(&last_visited->css). We put it twice, which could
>> trigger the BUG_ON at kernel/cgroup.c:893.
>
> Yes, I guess you are right and I suspect that this has been silently
> fixed by 519ebea3bf6d ("mm: memcontrol: factor out reclaim iterator
> loading and updating"). I think a more appropriate fix would be the
> one below. Are you able to reproduce and re-test it?
> ---

Yes, I think this commit fixes the issue. I backported it to the
3.10.107 kernel and can no longer reproduce the problem. I guess
this commit might need to be backported to the 3.10.y stable kernel.

> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 437ae2cbe102..0848ec05c12a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1224,6 +1224,8 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
>  				if (last_visited && last_visited != root &&
>  				    !css_tryget(&last_visited->css))
>  					last_visited = NULL;
> +			} else {
> +				last_visited = NULL;
>  			}
>  		}
>
> --
> Michal Hocko
> SUSE Labs
Re: [RFC PATCH] mm: memcg: fix css double put in mem_cgroup_iter
On Wed 26-07-17 21:07:42, Wenwei Tao wrote:
> From: Wenwei Tao
>
> By removing the child cgroup while the parent cgroup is
> under reclaim, we could trigger the following kernel panic
> on kernel 3.10:
>
> kernel BUG at kernel/cgroup.c:893!
> invalid opcode: [#1] SMP
> CPU: 1 PID: 22477 Comm: kworker/1:1 Not tainted 3.10.107 #1
> Workqueue: cgroup_destroy css_dput_fn
> task: 8817959a5780 ti: 8817e8886000 task.ti: 8817e8886000
> RIP: 0010:[] []
> cgroup_diput+0xc0/0xf0
> RSP: :8817e8887da0 EFLAGS: 00010246
> RAX: RBX: 8817a5dd5d40 RCX: dead0200
> RDX: RSI: 8817973a6910 RDI: 8817f54c2a00
> RBP: 8817e8887dc8 R08: 8817a5dd5dd0 R09: df9fb35794b01820
> R10: df9fb35794b01820 R11: 7fa95b1efcda R12: 8817a5dd5d9c
> R13: 8817f38b3a40 R14: 8817973a6910 R15: 8817973a6910
> FS: () GS:88181f22()
> knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 7fa6e6234000 CR3: 00179f19d000 CR4: 000407e0
> DR0: DR1: DR2:
> DR3: DR6: 0ff0 DR7: 0400
> Stack:
> 8817a5dd5d40 8817a5dd5d9c 8817f38b3a40 8817973a6910
> 0040 8817e8887df8 811b37c2 8817fa23c000
> 8817f57dbb80 88181f232ac0 88181f237500 8817e8887e10
> Call Trace:
> [] dput+0x1a2/0x2f0
> [] cgroup_dput.isra.21+0x1c/0x30
> [] css_dput_fn+0x1d/0x20
> [] process_one_work+0x17c/0x460
> [] worker_thread+0x116/0x3b0
> [] ? manage_workers.isra.25+0x290/0x290
> [] kthread+0xc0/0xd0
> [] ? insert_kthread_work+0x40/0x40
> [] ret_from_fork+0x58/0x90
> [] ? insert_kthread_work+0x40/0x40
> Code: 41 5e 41 5f 5d c3 0f 1f 44 00 00 48 8b 7f 78 48 8b 07 a8 01 74 15
> 48 81 c7 30 01 00 00 48 c7 c6 a0 a7 0c 81 e8 b2 83 02 00 eb c8 <0f> 0b
> 49 8b 4e 18 48 c7 c2 7e f1 7a 81 be 85 03 00 00 48 c7 c7
> RIP [] cgroup_diput+0xc0/0xf0
> RSP
> ---[ end trace 85eeea5212c44f51 ]---
>
>
> I think there is a css double put in mem_cgroup_iter. Under reclaim,
> we call mem_cgroup_iter the first time with prev == NULL, get the
> last_visited memcg from the per-zone reclaim_iter, and call
> __mem_cgroup_iter_next to find the next alive memcg.
> __mem_cgroup_iter_next can return NULL if last_visited is already
> the last one, so we put the last_visited memcg's css and continue to
> the next iteration of the while loop. This time we might not do
> css_tryget(&last_visited->css) if the dead_count has changed, but
> we still do css_put(&last_visited->css). We put it twice, which could
> trigger the BUG_ON at kernel/cgroup.c:893.

Yes, I guess you are right and I suspect that this has been silently
fixed by 519ebea3bf6d ("mm: memcontrol: factor out reclaim iterator
loading and updating"). I think a more appropriate fix would be the
one below. Are you able to reproduce and re-test it?
---
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 437ae2cbe102..0848ec05c12a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1224,6 +1224,8 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 				if (last_visited && last_visited != root &&
 				    !css_tryget(&last_visited->css))
 					last_visited = NULL;
+			} else {
+				last_visited = NULL;
 			}
 		}

--
Michal Hocko
SUSE Labs
[RFC PATCH] mm: memcg: fix css double put in mem_cgroup_iter
From: Wenwei Tao

By removing the child cgroup while the parent cgroup is
under reclaim, we could trigger the following kernel panic
on kernel 3.10:

kernel BUG at kernel/cgroup.c:893!
invalid opcode: [#1] SMP
CPU: 1 PID: 22477 Comm: kworker/1:1 Not tainted 3.10.107 #1
Workqueue: cgroup_destroy css_dput_fn
task: 8817959a5780 ti: 8817e8886000 task.ti: 8817e8886000
RIP: 0010:[] [] cgroup_diput+0xc0/0xf0
RSP: :8817e8887da0 EFLAGS: 00010246
RAX: RBX: 8817a5dd5d40 RCX: dead0200
RDX: RSI: 8817973a6910 RDI: 8817f54c2a00
RBP: 8817e8887dc8 R08: 8817a5dd5dd0 R09: df9fb35794b01820
R10: df9fb35794b01820 R11: 7fa95b1efcda R12: 8817a5dd5d9c
R13: 8817f38b3a40 R14: 8817973a6910 R15: 8817973a6910
FS: () GS:88181f22() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 7fa6e6234000 CR3: 00179f19d000 CR4: 000407e0
DR0: DR1: DR2:
DR3: DR6: 0ff0 DR7: 0400
Stack:
8817a5dd5d40 8817a5dd5d9c 8817f38b3a40 8817973a6910
0040 8817e8887df8 811b37c2 8817fa23c000
8817f57dbb80 88181f232ac0 88181f237500 8817e8887e10
Call Trace:
[] dput+0x1a2/0x2f0
[] cgroup_dput.isra.21+0x1c/0x30
[] css_dput_fn+0x1d/0x20
[] process_one_work+0x17c/0x460
[] worker_thread+0x116/0x3b0
[] ? manage_workers.isra.25+0x290/0x290
[] kthread+0xc0/0xd0
[] ? insert_kthread_work+0x40/0x40
[] ret_from_fork+0x58/0x90
[] ? insert_kthread_work+0x40/0x40
Code: 41 5e 41 5f 5d c3 0f 1f 44 00 00 48 8b 7f 78 48 8b 07 a8 01 74 15
48 81 c7 30 01 00 00 48 c7 c6 a0 a7 0c 81 e8 b2 83 02 00 eb c8 <0f> 0b
49 8b 4e 18 48 c7 c2 7e f1 7a 81 be 85 03 00 00 48 c7 c7
RIP [] cgroup_diput+0xc0/0xf0
RSP
---[ end trace 85eeea5212c44f51 ]---

I think there is a css double put in mem_cgroup_iter. Under reclaim,
we call mem_cgroup_iter the first time with prev == NULL, get the
last_visited memcg from the per-zone reclaim_iter, and call
__mem_cgroup_iter_next to find the next alive memcg.
__mem_cgroup_iter_next can return NULL if last_visited is already
the last one, so we put the last_visited memcg's css and continue to
the next iteration of the while loop. This time we might not do
css_tryget(&last_visited->css) if the dead_count has changed, but
we still do css_put(&last_visited->css). We put it twice, which could
trigger the BUG_ON at kernel/cgroup.c:893.

Reported-by: Wang Yu
Tested-by: Wang Yu
Signed-off-by: Wenwei Tao
---
 mm/memcontrol.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 437ae2c..3d7a046 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1230,8 +1230,10 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 		memcg = __mem_cgroup_iter_next(root, last_visited);

 		if (reclaim) {
-			if (last_visited && last_visited != root)
+			if (last_visited && last_visited != root) {
 				css_put(&last_visited->css);
+				last_visited = NULL;
+			}

 			iter->last_visited = memcg;
 			smp_wmb();
--
1.8.3.1
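
The sequence described in this patch can be modeled outside the kernel.
The sketch below is not kernel code. It is a user-space simplification
that assumes the loop body runs twice, that the dead_count check passes
on the first pass and fails on the second, and that
__mem_cgroup_iter_next returns NULL both times. The names fake_css,
fake_css_get, fake_css_put, enum fix and iterate are hypothetical and
exist only for this illustration. The two fix modes correspond to
resetting last_visited after the put (this patch) and resetting it when
the dead_count check fails (the variant suggested in the reply).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg's css: only a reference count. */
struct fake_css {
	int refcnt;
};

static void fake_css_get(struct fake_css *css)
{
	css->refcnt++;
}

static void fake_css_put(struct fake_css *css)
{
	/* Dropping a reference nobody holds would be a bug in the model itself. */
	assert(css->refcnt > 0);
	css->refcnt--;
}

enum fix {
	FIX_NONE,              /* behavior described in the report */
	FIX_RESET_AFTER_PUT,   /* clear last_visited right after the put */
	FIX_RESET_ON_MISMATCH  /* clear last_visited when dead_count differs */
};

/*
 * Simplified model of two passes through the reclaim part of the loop.
 * "cached" plays the role of iter->last_visited.
 * Returns how many puts were issued against it.
 */
static int iterate(struct fake_css *cached, enum fix fix)
{
	struct fake_css *last_visited = NULL;
	int puts = 0;

	for (int pass = 0; pass < 2; pass++) {
		/* Assumption: a child cgroup is removed between the passes. */
		bool dead_count_matches = (pass == 0);

		if (dead_count_matches) {
			last_visited = cached;
			fake_css_get(last_visited); /* models a successful tryget */
		} else if (fix == FIX_RESET_ON_MISMATCH) {
			last_visited = NULL;
		}
		/* Without a fix, last_visited keeps its stale value here. */

		/* The "next" lookup is assumed to return NULL in both passes. */

		if (last_visited) {
			fake_css_put(last_visited);
			puts++;
			if (fix == FIX_RESET_AFTER_PUT)
				last_visited = NULL;
		}
	}
	return puts;
}
```

Starting from a css whose only reference is the base one (refcnt == 1),
iterate() with FIX_NONE issues two puts for one get and leaves the
count at 0, which is the imbalance described above. With either fix
mode it issues one put and the count stays at 1.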