On Tue, 6 Oct 2015, Michal Hocko wrote:
> On Tue 06-10-15 18:28:04, Oleg Nesterov wrote:
> > oom_kill_process() does atomic_inc(&mm->mm_users) to ensure that
> > this ->mm can't go away, and this is wrong; change it to rely on
> > ->mm_count and mmdrop().
> > 
> > Firstly, we do not want to delay exit_mmap/etc if the victim exits
> > before we do mmput(), but this is minor.
> > 
> > More importantly, we simply cannot do mmput() in oom_kill_process(),
> > this can deadlock if (for example) the caller holds i_mmap_rwsem and
> > mmput() actually leads to exit_mmap(); the victim can have this file
> > mmapped and in this case unmap_vmas/free_pgtables paths will take the
> > same lock for writing. And at least huge_pmd_share() does pmd_alloc()
> > under i_mmap_rwsem because VM_HUGETLB memory is not reclaimable.
> 
> Ouch, I completely missed this during review! Thanks for catching
> this. On second thought it is clear now: we really want to pin the
> mm_struct, not the address space.
> 
> > Signed-off-by: Oleg Nesterov <[email protected]>
> 
> Acked-by: Michal Hocko <[email protected]>

Acked-by: Hugh Dickins <[email protected]>

Thanks: it looks like this is what was behind the recent trinity/KSM deadlock,
https://lkml.org/lkml/2015/10/1/563

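For readers less familiar with the two counters, here is a toy userspace
sketch of the distinction the patch relies on. It is not kernel code: the
struct mm_model type, the *_model helpers, the mapped/freed flags and
oom_pin_demo() are all hypothetical names made up for illustration. The only
things taken from the kernel are the roles of the counters: mm_users counts
users of the address space, mm_count counts references to the structure
itself, and all mm_users together hold a single mm_count reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy model only; field names mirror the kernel's, everything else is assumed. */
struct mm_model {
    atomic_int mm_users;  /* users of the address space */
    atomic_int mm_count;  /* references to the struct; all users hold one */
    bool mapped;          /* stands in for "mappings still exist" */
    bool freed;           /* stands in for "struct has been freed" */
};

static void mm_model_init(struct mm_model *mm)
{
    atomic_init(&mm->mm_users, 1);
    atomic_init(&mm->mm_count, 1);
    mm->mapped = true;
    mm->freed = false;
}

/* Drop a struct reference; the last one frees the struct, no teardown here. */
static void mmdrop_model(struct mm_model *mm)
{
    if (atomic_fetch_sub(&mm->mm_count, 1) == 1)
        mm->freed = true;
}

/*
 * Drop a user reference; the last one tears down the address space
 * (exit_mmap() in the kernel, the step that can need i_mmap_rwsem)
 * and then drops the struct reference held on behalf of all users.
 */
static void mmput_model(struct mm_model *mm)
{
    if (atomic_fetch_sub(&mm->mm_users, 1) == 1) {
        mm->mapped = false;
        mmdrop_model(mm);
    }
}

/* Pin via mm_count, as in the patch; returns 0 if the model behaves as expected. */
static int oom_pin_demo(void)
{
    struct mm_model mm;

    mm_model_init(&mm);
    atomic_fetch_add(&mm.mm_count, 1);  /* OOM killer pins the struct only */
    mmput_model(&mm);                   /* victim exits: teardown is not delayed */
    assert(!mm.mapped && !mm.freed);    /* pointer stays valid for comparison */
    mmdrop_model(&mm);                  /* final drop frees; no teardown from here */
    return mm.freed ? 0 : 1;
}
```

In this model the exiting victim, not the OOM killer, performs the teardown,
so the caller's final mmdrop can never end up in the unmap path. Pinning
mm_users instead would let the OOM killer hold the last user reference and
run the teardown itself from its final mmput.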
> 
> > ---
> >  mm/oom_kill.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 034d219..52abb78 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -571,7 +571,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
> >  
> >     /* Get a reference to safely compare mm after task_unlock(victim) */
> >     mm = victim->mm;
> > -   atomic_inc(&mm->mm_users);
> > +   atomic_inc(&mm->mm_count);
> >     /*
> >      * We should send SIGKILL before setting TIF_MEMDIE in order to prevent
> >      * the OOM victim from depleting the memory reserves from the user
> > @@ -609,7 +609,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
> >     }
> >     rcu_read_unlock();
> >  
> > -   mmput(mm);
> > +   mmdrop(mm);
> >     put_task_struct(victim);
> >  }
> >  #undef K
> > -- 
> > 2.4.3
> > 
> 
> -- 
> Michal Hocko
> SUSE Labs