Michal Hocko wrote:
> On Fri 28-07-17 22:55:51, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > On Fri 28-07-17 22:15:01, Tetsuo Handa wrote:
> > > > task_will_free_mem(current) in out_of_memory() returning false due to
> > > > MMF_OOM_SKIP already set allowed each thread sharing that mm to select 
> > > > a new
> > > > OOM victim. If task_will_free_mem(current) in out_of_memory() did not 
> > > > return
> > > > false, threads sharing MMF_OOM_SKIP mm would not have selected new 
> > > > victims
> > > > to the level where all OOM killable processes are killed and calls 
> > > > panic().
> > > 
> > > I am not sure I understand. Do you mean this?
> > 
> > Yes.
> > 
> > > ---
> > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > > index 9e8b4f030c1c..671e4a4107d0 100644
> > > --- a/mm/oom_kill.c
> > > +++ b/mm/oom_kill.c
> > > @@ -779,13 +779,6 @@ static bool task_will_free_mem(struct task_struct 
> > > *task)
> > >   if (!__task_will_free_mem(task))
> > >           return false;
> > >  
> > > - /*
> > > -  * This task has already been drained by the oom reaper so there are
> > > -  * only small chances it will free some more
> > > -  */
> > > - if (test_bit(MMF_OOM_SKIP, &mm->flags))
> > > -         return false;
> > > -
> > >   if (atomic_read(&mm->mm_users) <= 1)
> > >           return true;
> > >  
> > > If yes I would have to think about this some more because that might
> > > have weird side effects (e.g. oom_victims counting after threads passed
> > > exit_oom_victim).
> > 
> > But this check should not be removed unconditionally. We should still return
> > false if returning true was not sufficient to solve the OOM situation, for
> > we need to select next OOM victim in that case.
> > 

I think that below one can manage this race condition.

---
 include/linux/sched.h |  1 +
 mm/oom_kill.c         | 21 ++++++++++++++-------
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0db4870..3fccf72 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -652,6 +652,7 @@ struct task_struct {
        /* disallow userland-initiated cgroup migration */
        unsigned                        no_cgroup_migration:1;
 #endif
+       unsigned                        oom_kill_free_check_raced:1;
 
        unsigned long                   atomic_flags; /* Flags requiring atomic 
access. */
 
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 9e8b4f0..a093193 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -779,13 +779,6 @@ static bool task_will_free_mem(struct task_struct *task)
        if (!__task_will_free_mem(task))
                return false;
 
-       /*
-        * This task has already been drained by the oom reaper so there are
-        * only small chances it will free some more
-        */
-       if (test_bit(MMF_OOM_SKIP, &mm->flags))
-               return false;
-
        if (atomic_read(&mm->mm_users) <= 1)
                return true;
 
@@ -806,6 +799,20 @@ static bool task_will_free_mem(struct task_struct *task)
        }
        rcu_read_unlock();
 
+       /*
+        * It is possible that current thread fails to try allocation from
+        * memory reserves if the OOM reaper set MMF_OOM_SKIP on this mm before
+        * current thread calls out_of_memory() in order to get TIF_MEMDIE.
+        * In that case, allow current thread to try TIF_MEMDIE allocation
+        * before start selecting next OOM victims.
+        */
+       if (ret && test_bit(MMF_OOM_SKIP, &mm->flags)) {
+               if (task == current && !task->oom_kill_free_check_raced)
+                       task->oom_kill_free_check_raced = true;
+               else
+                       ret = false;
+       }
+
        return ret;
 }
 
-- 
1.8.3.1

What is "oom_victims counting after threads passed exit_oom_victim" ?

Reply via email to