On Thu 04-02-16 23:22:18, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > From: Michal Hocko <mho...@suse.com>
> >
> > When oom_reaper manages to unmap all the eligible vmas there shouldn't
> > be much of the freable memory held by the oom victim left anymore so it
> > makes sense to clear the TIF_MEMDIE flag for the victim and allow the
> > OOM killer to select another task.
>
> Just a confirmation. Is it safe to clear TIF_MEMDIE without reaching do_exit()
> with regard to freezing_slow_path()? Since clearing TIF_MEMDIE from the OOM
> reaper confuses
>
>     wait_event(oom_victims_wait, !atomic_read(&oom_victims));
>
> in oom_killer_disable(), I'm worrying that the freezing operation continues
> before the OOM victim which escaped the __refrigerator() actually releases
> memory. Does this cause consistency problem?

This is a good question! At first sight it seems this is not safe and we
might need to make the oom_reaper freezable so that it doesn't wake up
during suspend and interfere. Let me think about that.

> > +	/*
> > +	 * Clear TIF_MEMDIE because the task shouldn't be sitting on a
> > +	 * reasonably reclaimable memory anymore. OOM killer can continue
> > +	 * by selecting other victim if unmapping hasn't led to any
> > +	 * improvements. This also means that selecting this task doesn't
> > +	 * make any sense.
> > +	 */
> > +	tsk->signal->oom_score_adj = OOM_SCORE_ADJ_MIN;
> > +	exit_oom_victim(tsk);
>
> I noticed that updating only one thread group's oom_score_adj disables
> further wake_oom_reaper() calls due to rough-grained can_oom_reap check at
>
>     p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN
>
> in oom_kill_process(). I think we need to either update all thread groups'
> oom_score_adj using the reaped mm equally or use more fine-grained
> can_oom_reap check which ignores OOM_SCORE_ADJ_MIN if all threads in that
> thread group are dying or exiting.

I do not understand. Why would you want to reap the mm again when this
has been done already? The mm is shared, right?
-- 
Michal Hocko
SUSE Labs