Hello, Petr.

(cc'ing Johannes)

On Wed, Apr 13, 2016 at 11:42:16AM +0200, Petr Mladek wrote:
...
> By other words, "memcg_move_char/2860" flushes a work. But it cannot
> get flushed because one worker is blocked and another one could not
> get created. All these operations are blocked by the very same
> "memcg_move_char/2860".
> 
> Note that also "systemd/1" is waiting for "cgroup_mutex" in
> proc_cgroup_show(). But it seems that it is not in the main
> cycle causing the deadlock.
> 
> I am able to reproduce this problem quite easily (within few minutes).
> There are often even more tasks waiting for the cgroups-related locks
> but they are not causing the deadlock.
> 
> 
> The question is how to solve this problem. I see several possibilities:
> 
>   + avoid using workqueues in lru_add_drain_all()
> 
>   + make lru_add_drain_all() killable and restartable
> 
>   + do not block fork() when lru_add_drain_all() is running,
>     e.g. using some lazy techniques like RCU, workqueues
> 
>   + at least do not block fork of workers; AFAIK, they have a limited
>      cgroups usage anyway because they are marked with PF_NO_SETAFFINITY
> 
> 
> I am willing to test any potential fix or even work on the fix.
> But I do not have that big insight into the problem, so I would
> need some pointers.

An easy solution would be to make lru_add_drain_all() use a
WQ_MEM_RECLAIM workqueue, whose rescuer can execute the work items
even when no new worker can be forked.  A better way would be to make
charge moving asynchronous, similar to cpuset node migration, but I
don't know whether that's realistic.  Will prep a patch to add a
rescuer to lru_add_drain_all().

Thanks.

-- 
tejun
