On Mon 30-05-16 20:18:16, Oleg Nesterov wrote:
> On 05/30, Michal Hocko wrote:
> >
> > @@ -852,8 +852,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
> >                     continue;
> >             if (same_thread_group(p, victim))
> >                     continue;
> > -           if (unlikely(p->flags & PF_KTHREAD) || is_global_init(p) ||
> > -               p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN) {
> > +           if (unlikely(p->flags & PF_KTHREAD) || is_global_init(p)) {
> >                     /*
> >                      * We cannot use oom_reaper for the mm shared by this
> >                      * process because it wouldn't get killed and so the
> > @@ -862,6 +861,11 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
> >                     can_oom_reap = false;
> >                     continue;
> >             }
> > +           if (p->signal->oom_score_adj == OOM_ADJUST_MIN)
> > +                   pr_warn("%s pid=%d shares mm with oom disabled %s pid=%d. Seems like misconfiguration, killing anyway!"
> > +                                   " Report at linux...@kvack.org\n",
> > +                                   victim->comm, task_pid_nr(victim),
> > +                                   p->comm, task_pid_nr(p));
> 
> Oh, yes, I personally do agree ;)
> 
> Perhaps the is_global_init() == T case needs a warning too? The previous changes
> take care of vfork() from /sbin/init, so the only reason we can see it true
> is that /sbin/init shares the memory with a memory hog... Never mind, forget it.

I have two more patches waiting for this to settle, and one of them
adds a warning to that path.

> This is a bit off-topic, but perhaps we can also change the PF_KTHREAD check later.
> Of course we should not try to kill this kthread, but can_oom_reap can be true in
> this case. A kernel thread which does use_mm() should handle the errors correctly
> if (say) get_user() fails because we unmap the memory.

I was worried that the kernel thread would see a zero page, which could
lead to data corruption.
-- 
Michal Hocko
SUSE Labs
