On 04/12, Michal Hocko wrote:
>
> We shouldn't consider the task
> unless the whole thread group is going down.

Yes, agreed. I'd even say that oom-killer should never look at individual
task/threads, it should work with mm's. And one of the big mistakes (imo)
was the s/for_each_process/for_each_thread/ change in select_bad_process()
a while ago.

Michal, I won't even try to actually review this patch, I lost any hope
to understand OOM-killer a long ago ;) But I do agree with this change,
we obviously should not rely on PF_EXITING.

>  static inline bool task_will_free_mem(struct task_struct *task)
>  {
> +     struct signal_struct *sig = task->signal;
> +
>       /*
>        * A coredumping process may sleep for an extended period in exit_mm(),
>        * so the oom killer cannot assume that the process will promptly exit
>        * and release memory.
>        */
> -     return (task->flags & PF_EXITING) &&
> -             !(task->signal->flags & SIGNAL_GROUP_COREDUMP);
> +     if (sig->flags & SIGNAL_GROUP_COREDUMP)
> +             return false;
> +
> +     if (!(task->flags & PF_EXITING))
> +             return false;
> +
> +     /* Make sure that the whole thread group is going down */
> +     if (!thread_group_empty(task) && !(sig->flags & SIGNAL_GROUP_EXIT))
> +             return false;
> +
> +     return true;

So this looks certainly better to me, but perhaps it should do

        if (SIGNAL_GROUP_COREDUMP)
                return false;

        if (SIGNAL_GROUP_EXIT)
                return true;

        if (thread_group_empty() && PF_EXITING)
                return true;

        return false;

?

I won't insist, I do not even know if this would be better or not. But if
SIGNAL_GROUP_EXIT is set all sub-threads should go away even if PF_EXITING
is not set yet because this task didn't dequeue SIGKILL yet.

Up to you in any case.

Oleg.

Reply via email to