Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-02 Thread Tetsuo Handa
Oleg Nesterov wrote: > Tetsuo, sorry, I don't understand your question... > > > because it is possible that T starts the coredump, T sends SIGKILL to P, > > P calls out_of_memory() on GFP_FS allocation, > > yes, and since fatal_signal_pending() == T we do not even check > task_will_free_mem(). >

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-02 Thread Michal Hocko
On Fri 02-10-15 15:57:56, Oleg Nesterov wrote: > On 10/02, Michal Hocko wrote: > > > > So clone without CLONE_THREAD should create a new thread group leader > > and so create a new thread group. > > Yes. > > > Unless there is some other trickery > > which I do not see right now for_each_thread fr

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-02 Thread Oleg Nesterov
On 10/02, Tetsuo Handa wrote: > > Indeed. I retested with updated patch. Not all of them are killed from > the same zap_process, but all of them are killed by the same SEGV event. > I think that coredump stops all threads sharing the same memory. Yes sure. Please see other emails. Oleg. -- To un

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-02 Thread Tetsuo Handa
Michal Hocko wrote: > > --- a/fs/coredump.c > > +++ b/fs/coredump.c > > @@ -295,6 +295,8 @@ static int zap_process(struct task_struct *start, int > > exit_code, int flags) > > for_each_thread(start, t) { > > task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK); > >

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-02 Thread Oleg Nesterov
Tetsuo, sorry, I don't understand your question... On 10/02, Tetsuo Handa wrote: > > Oleg Nesterov wrote: > > On 10/01, Michal Hocko wrote: > > > > > > zap_process will add SIGKILL to all threads but the > > > current which will go on without being killed and if this is not a > > > thread group le

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-02 Thread Oleg Nesterov
On 10/02, Michal Hocko wrote: > > So clone without CLONE_THREAD should create a new thread group leader > and so create a new thread group. Yes. > Unless there is some other trickery > which I do not see right now for_each_thread from the parent task > shouldn't see those which are cloned without

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-02 Thread Michal Hocko
On Fri 02-10-15 21:33:08, Tetsuo Handa wrote: > Michal Hocko wrote: > > > Since T sends SIGKILL to all clone(CLONE_VM) tasks upon coredump, P needs > > > to do > > > > It does that only to all threads in the _same_ thread group AFAIU. > > I'm confused. What the _same_ thread group? > > I can obs

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-02 Thread Tetsuo Handa
Michal Hocko wrote: > > Since T sends SIGKILL to all clone(CLONE_VM) tasks upon coredump, P needs > > to do > > It does that only to all threads in the _same_ thread group AFAIU. I'm confused. What the _same_ thread group? I can observe that SIGKILL is sent to all clone(CLONE_THREAD | CLONE_S

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-02 Thread Michal Hocko
On Fri 02-10-15 20:32:41, Tetsuo Handa wrote: > Oleg Nesterov wrote: > > On 10/01, Michal Hocko wrote: > > > > > > zap_process will add SIGKILL to all threads but the > > > current which will go on without being killed and if this is not a > > > thread group leader then we would miss it. > > > > Y

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-02 Thread Tetsuo Handa
Oleg Nesterov wrote: > On 10/01, Michal Hocko wrote: > > > > zap_process will add SIGKILL to all threads but the > > current which will go on without being killed and if this is not a > > thread group leader then we would miss it. > > Yes. And note that de_thread() does the same. Speaking of oom-k

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-01 Thread Oleg Nesterov
On 10/01, Michal Hocko wrote: > > zap_process will add SIGKILL to all threads but the > current which will go on without being killed and if this is not a > thread group leader then we would miss it. Yes. And note that de_thread() does the same. Speaking of oom-killer this is mostly fine, the exec

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-01 Thread Michal Hocko
On Thu 01-10-15 17:41:15, Oleg Nesterov wrote: > On 10/01, Michal Hocko wrote: > > > > On Thu 01-10-15 17:00:10, Oleg Nesterov wrote: > > > On 10/01, Michal Hocko wrote: > > > > > > > > On Wed 30-09-15 20:24:05, Oleg Nesterov wrote: > > > > [...] > > > > > It is possible that the group leader > > >

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-01 Thread Oleg Nesterov
On 10/01, Michal Hocko wrote: > > On Thu 01-10-15 17:00:10, Oleg Nesterov wrote: > > On 10/01, Michal Hocko wrote: > > > > > > On Wed 30-09-15 20:24:05, Oleg Nesterov wrote: > > > [...] > > > > It is possible that the group leader > > > > has the pending SIGKILL because its sub-thread originated th

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-01 Thread Michal Hocko
On Thu 01-10-15 17:00:10, Oleg Nesterov wrote: > On 10/01, Michal Hocko wrote: > > > > On Wed 30-09-15 20:24:05, Oleg Nesterov wrote: > > [...] > > > It is possible that the group leader > > > has the pending SIGKILL because its sub-thread originated the coredump, > > > in this case we must not ski

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-01 Thread Oleg Nesterov
On 10/01, Michal Hocko wrote: > > On Wed 30-09-15 20:24:05, Oleg Nesterov wrote: > [...] > > It is possible that the group leader > > has the pending SIGKILL because its sub-thread originated the coredump, > > in this case we must not skip this process. > > I do not understand this. If the group le

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-10-01 Thread Michal Hocko
On Wed 30-09-15 20:24:05, Oleg Nesterov wrote: [...] > It is possible that the group leader > has the pending SIGKILL because its sub-thread originated the coredump, > in this case we must not skip this process. I do not understand this. If the group leader has SIGKILL pending it will die anyway r

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending()check in oom_kill_process()

2015-10-01 Thread Tetsuo Handa
David Rientjes wrote: > On Wed, 30 Sep 2015, Oleg Nesterov wrote: > > > The fatal_signal_pending() was added to suppress unnecessary "sharing > > same memory" message, but it can't 100% help anyway because it can be > > false-negative; SIGKILL can be already dequeued. > > > > And worse, it can be

Re: [PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-09-30 Thread David Rientjes
On Wed, 30 Sep 2015, Oleg Nesterov wrote: > The fatal_signal_pending() was added to suppress unnecessary "sharing > same memory" message, but it can't 100% help anyway because it can be > false-negative; SIGKILL can be already dequeued. > > And worse, it can be false-positive due to exec or cored

[PATCH -mm v2 1/3] mm/oom_kill: remove the wrong fatal_signal_pending() check in oom_kill_process()

2015-09-30 Thread Oleg Nesterov
The fatal_signal_pending() was added to suppress unnecessary "sharing same memory" message, but it can't 100% help anyway because it can be false-negative; SIGKILL can be already dequeued. And worse, it can be false-positive due to exec or coredump. exec is mostly fine, but coredump is not. It is