On Thu, 26 Sep 2013, Tetsuo Handa wrote:

> > wait_for_completion() is scary if that completion requires memory that 
> > cannot be allocated because the caller is killed but uninterruptible.
> 
> I don't think these lines are specific to wait_for_completion() users.
> 
> Currently the OOM killer is disabled throughout from "the moment the OOM
> killer chose a process to kill" to "the moment the task_struct of the chosen
> process becomes unreachable". Any blocking functions which wait in
> TASK_UNINTERRUPTIBLE (e.g. mutex_lock()) can disable the OOM killer if the
> current thread is chosen by the OOM killer. Therefore, any users of blocking
> functions which wait in TASK_UNINTERRUPTIBLE are considered scary if they
> assume that the current thread will not be chosen by the OOM killer.
> 

Yeah, that's always been true.

> But it seems to me that re-enabling the OOM killer at some point is more
> realizable than purging all such users.
> 
> To re-enable the OOM killer at some point, the OOM killer needs to choose more
> processes if the to-be-killed process cannot be terminated within an adequate
> period.
> 
> For example, add "unsigned long memdie_stamp;" to "struct task_struct" and do
> "p->memdie_stamp = jiffies + 5 * HZ;" before
> "set_tsk_thread_flag(p, TIF_MEMDIE);" and do
> 
>       if (test_tsk_thread_flag(task, TIF_MEMDIE)) {
>               if (unlikely(frozen(task)))
>                       __thaw_task(task);
> +             /* Choose more processes if the chosen process cannot die. */
> +             if (time_after(jiffies, task->memdie_stamp) &&
> +                 task->state == TASK_UNINTERRUPTIBLE)
> +                     return OOM_SCAN_CONTINUE;
>               if (!force_kill)
>                       return OOM_SCAN_ABORT;
>       }
> 
> in oom_scan_process_thread().
> 
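
For reference, the proposed check can be modeled in userspace roughly as
below.  This is only a sketch, not kernel code: every model_* name is made
up for illustration, MODEL_HZ is an arbitrary tick rate, and the 5 * HZ
deadline is taken from the quoted proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
#define MODEL_HZ 100UL

enum model_scan { MODEL_SCAN_OK, MODEL_SCAN_CONTINUE, MODEL_SCAN_ABORT };

struct model_task {
	bool memdie;                /* models TIF_MEMDIE */
	bool uninterruptible;       /* models state == TASK_UNINTERRUPTIBLE */
	unsigned long memdie_stamp; /* deadline in ticks, set when chosen */
};

/* Wrap-safe "a is after b", the same idea as the kernel's time_after(). */
static inline bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Mark a task as the OOM victim and give it 5 seconds to exit. */
static inline void model_mark_victim(struct model_task *t, unsigned long now)
{
	assert(t != NULL);
	t->memdie = true;
	t->memdie_stamp = now + 5 * MODEL_HZ;
}

/* Decide what the scan does when it reaches this task. */
static inline enum model_scan model_scan_task(const struct model_task *t,
					      unsigned long now,
					      bool force_kill)
{
	assert(t != NULL);
	if (t->memdie) {
		/* Victim is stuck past its deadline: skip it, keep scanning. */
		if (model_time_after(now, t->memdie_stamp) &&
		    t->uninterruptible)
			return MODEL_SCAN_CONTINUE;
		/* Otherwise wait for the victim instead of killing more. */
		if (!force_kill)
			return MODEL_SCAN_ABORT;
	}
	return MODEL_SCAN_OK;
}
```

In this model, a victim that is still within its deadline, or that is not in
uninterruptible sleep, still aborts the scan; only a stuck victim past the
deadline lets the scan move on to other tasks.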

There may not be any eligible processes left, and then the machine panics.  
Time-based delays like this have also caused complete depletion of memory 
reserves when more than one process is chosen and each consumes a 
non-negligible amount of memory, which then causes livelock.  We used to 
have a jiffies-based rekill in 2.6.18 internally, and we could finally 
remove it once the mm->mmap_sem issues were fixed (mostly by checking for 
fatal_signal_pending() and aborting when necessary).
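
The check-and-abort pattern can be sketched in userspace roughly as below.
This is an illustrative model only: the model_* names are hypothetical, the
fatal_signal field stands in for fatal_signal_pending(current), and a real
kernel path would sleep rather than poll.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct model_waiter {
	bool fatal_signal; /* models fatal_signal_pending(current) */
};

struct model_lock {
	bool held; /* true while some other task owns the lock */
};

/*
 * Poll for the lock at most max_tries times.
 * Returns 0 on success, -EINTR if the waiter has been killed,
 * and -EAGAIN if the lock is still contended after max_tries polls.
 */
static inline int model_lock_killable(const struct model_waiter *w,
				      struct model_lock *l, int max_tries)
{
	assert(w != NULL && l != NULL);
	for (int i = 0; i < max_tries; i++) {
		/* A killed waiter gives up so that it can exit and free memory. */
		if (w->fatal_signal)
			return -EINTR;
		if (!l->held) {
			l->held = true;
			return 0;
		}
	}
	return -EAGAIN;
}
```

The point of the pattern is that a task chosen by the OOM killer backs out
of the wait and exits, instead of sleeping on a resource whose owner may
itself be waiting for memory.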

> [PATCH v3] kthread: Make kthread_create() killable.
> 
> Any user process callers of wait_for_completion() except global init process
> might be chosen by the OOM killer while waiting for completion() call by some
> other process which does memory allocation.
> 
> When such users are chosen by the OOM killer when they are waiting for
> completion() in TASK_UNINTERRUPTIBLE, the system will be kept stressed
> due to memory starvation because the OOM killer cannot kill such users.
> 
> kthread_create() is one of such users and this patch fixes the problem for
> kthreadd by making kthread_create() killable.
> 
> Signed-off-by: Tetsuo Handa <penguin-ker...@i-love.sakura.ne.jp>
> Cc: Oleg Nesterov <o...@redhat.com>
> Acked-by: David Rientjes <rient...@google.com>
> Signed-off-by: Andrew Morton <a...@linux-foundation.org>

Absolutely, thanks.