Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-19 Thread Petr Mladek
On Tue 2016-04-19 10:01:21, Michal Hocko wrote: > On Mon 18-04-16 16:40:23, Petr Mladek wrote: > > On Fri 2016-04-15 10:38:15, Tejun Heo wrote: > > > > Anyway, before we go that way, can we at least consider the possibility > > > > of removing the kworker creation dependency on the global rwsem?

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-19 Thread Michal Hocko
On Mon 18-04-16 16:40:23, Petr Mladek wrote: > On Fri 2016-04-15 10:38:15, Tejun Heo wrote: > > > Anyway, before we go that way, can we at least consider the possibility > > > of removing the kworker creation dependency on the global rwsem? AFAIU > > > this locking was added because of the pid

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-18 Thread Petr Mladek
On Fri 2016-04-15 10:38:15, Tejun Heo wrote: > > Anyway, before we go that way, can we at least consider the possibility > > of removing the kworker creation dependency on the global rwsem? AFAIU > > this locking was added because of the pid controller. Do we even care > > about something as

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-17 Thread Michal Hocko
On Fri 15-04-16 11:25:26, Tejun Heo wrote: > Hello, Hi, > On Fri, Apr 15, 2016 at 05:08:15PM +0200, Michal Hocko wrote: [...] > > Well it certainly is not that trivial because it relies on being > > exclusive with global context. I will have to look closer of course but > > I cannot guarantee I

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-15 Thread Tejun Heo
Hello, On Fri, Apr 15, 2016 at 05:08:15PM +0200, Michal Hocko wrote: > On Fri 15-04-16 10:38:15, Tejun Heo wrote: > > Not necessarily. The only thing necessary is flushing the work item > > after releasing locks but before returning to user. > > cpuset_post_attach_flush() does exactly the same

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-15 Thread Michal Hocko
On Fri 15-04-16 10:38:15, Tejun Heo wrote: > Hello, Michal. > > On Fri, Apr 15, 2016 at 09:06:01AM +0200, Michal Hocko wrote: > > Tejun was proposing to do the migration async (move the whole > > mem_cgroup_move_charge into the work item). This would solve the problem > > of course. I haven't

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-15 Thread Tejun Heo
Hello, Michal. On Fri, Apr 15, 2016 at 09:06:01AM +0200, Michal Hocko wrote: > Tejun was proposing to do the migration async (move the whole > mem_cgroup_move_charge into the work item). This would solve the problem > of course. I haven't checked whether this would be safe but it at least >

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-15 Thread Michal Hocko
On Thu 14-04-16 13:50:55, Johannes Weiner wrote: > On Wed, Apr 13, 2016 at 09:23:14PM +0200, Michal Hocko wrote: > > I think we can live without lru_add_drain_all() in the migration path. > > Agreed. Michal, would you care to send a patch to remove it? Now that I am looking closer I am not sure

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-14 Thread Johannes Weiner
On Wed, Apr 13, 2016 at 09:23:14PM +0200, Michal Hocko wrote: > I think we can live without lru_add_drain_all() in the migration path. Agreed. Michal, would you care to send a patch to remove it?

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-14 Thread Tejun Heo
Hello, On Thu, Apr 14, 2016 at 09:06:23AM +0200, Michal Hocko wrote: > On Wed 13-04-16 21:48:20, Michal Hocko wrote: > [...] > > I was thinking about something like flush_per_cpu_work() which would > > assert on group_threadgroup_rwsem held for write. > > I have thought about this some more and

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-14 Thread Michal Hocko
On Wed 13-04-16 21:48:20, Michal Hocko wrote: [...] > I was thinking about something like flush_per_cpu_work() which would > assert on group_threadgroup_rwsem held for write. I have thought about this some more and I guess this is not limited to per cpu workers. Basically any flush_work with

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-13 Thread Tejun Heo
Hello, Michal. On Wed, Apr 13, 2016 at 09:23:14PM +0200, Michal Hocko wrote: > I think we can live without lru_add_drain_all() in the migration path. > We are talking about 4 pagevecs so 56 pages. The charge migration is Ah, nice. > racy anyway. What concerns me more is how all this is fragile.

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-13 Thread Michal Hocko
On Wed 13-04-16 15:37:34, Tejun Heo wrote: > Hello, Michal. > > On Wed, Apr 13, 2016 at 09:23:14PM +0200, Michal Hocko wrote: > > I think we can live without lru_add_drain_all() in the migration path. > > We are talking about 4 pagevecs so 56 pages. The charge migration is > > Ah, nice. > > >

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-13 Thread Michal Hocko
On Wed 13-04-16 21:23:13, Michal Hocko wrote: > On Wed 13-04-16 14:33:09, Tejun Heo wrote: > > Hello, Petr. > > > > (cc'ing Johannes) > > > > On Wed, Apr 13, 2016 at 11:42:16AM +0200, Petr Mladek wrote: > > ... > > > By other words, "memcg_move_char/2860" flushes a work. But it cannot > > > get

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-13 Thread Michal Hocko
On Wed 13-04-16 14:33:09, Tejun Heo wrote: > Hello, Petr. > > (cc'ing Johannes) > > On Wed, Apr 13, 2016 at 11:42:16AM +0200, Petr Mladek wrote: > ... > > By other words, "memcg_move_char/2860" flushes a work. But it cannot > > get flushed because one worker is blocked and another one could not

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-13 Thread Tejun Heo
On Wed, Apr 13, 2016 at 02:33:09PM -0400, Tejun Heo wrote: > An easy solution would be to make lru_add_drain_all() use a > WQ_MEM_RECLAIM workqueue. A better way would be making charge moving > asynchronous similar to cpuset node migration but I don't know whether > that's realistic. Will prep a

Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-13 Thread Tejun Heo
Hello, Petr. (cc'ing Johannes) On Wed, Apr 13, 2016 at 11:42:16AM +0200, Petr Mladek wrote: ... > By other words, "memcg_move_char/2860" flushes a work. But it cannot > get flushed because one worker is blocked and another one could not > get created. All these operations are blocked by the very

[BUG] cgroup/workques/fork: deadlock when moving cgroups

2016-04-13 Thread Petr Mladek
Hi, Cyril reported a system lock up when running memcg_move_charge_at_immigrate_test.sh test[*] repeatedly. I have reproduced it also with the plain 4.6-rc3. There seems to be a deadlock where 4 processes are involved. It makes the system unable to fork any new processes. I had to use alt-sysrq
