Re: [RFC -v2] panic_on_oom_timeout

2015-07-29 Thread Tetsuo Handa
Michal Hocko wrote:
> On Wed 17-06-15 15:24:27, Michal Hocko wrote:
> > On Wed 17-06-15 14:51:27, Michal Hocko wrote:
> > [...]
> > > The important thing is to decide what is the reasonable way forward. We
> > > have two implementations of panic based timeout. So we should decide
> >
> > And the most

Re: [RFC -v2] panic_on_oom_timeout

2015-07-29 Thread Michal Hocko
On Wed 17-06-15 15:24:27, Michal Hocko wrote:
> On Wed 17-06-15 14:51:27, Michal Hocko wrote:
> [...]
> > The important thing is to decide what is the reasonable way forward. We
> > have two implementations of panic based timeout. So we should decide
>
> And the most obvious question, of course.

Re: [RFC -v2] panic_on_oom_timeout

2015-06-20 Thread Tetsuo Handa
Tetsuo Handa wrote:
> One case is that the system can not panic of threads are unable to call
> out_of_memory() for some reason.
  ^ if
> Well, if without analysis purpose,
>
> if (time_after(jiffies, oom_start + sysctl_panic_on_oom_timeout * HZ))
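
For readers unfamiliar with the idiom, the time_after() check quoted above tests a deadline against the tick counter in a way that keeps working when the counter wraps around. The following is a minimal userspace sketch of that comparison, not the kernel's implementation and not the patch under discussion. The 32-bit counter width and the names HZ_ASSUMED, time_after_u32 and oom_deadline_passed are assumptions made up for illustration.

```c
#include <stdint.h>

/* Assumed tick rate for this sketch; the kernel's HZ is a build-time option. */
#define HZ_ASSUMED 100u

/*
 * Wraparound-safe "a is after b" for a 32-bit tick counter.
 * The unsigned subtraction wraps modulo 2^32; reinterpreting the result
 * as signed gives a negative value exactly when a is ahead of b by less
 * than half the counter range. (The unsigned-to-signed conversion is
 * implementation-defined in older C standards but behaves this way on
 * common two's-complement platforms.)
 */
static int time_after_u32(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/*
 * Hypothetical helper mirroring the shape of the quoted condition:
 * has more than timeout_secs elapsed since oom_start?
 */
static int oom_deadline_passed(uint32_t now, uint32_t oom_start,
                               uint32_t timeout_secs)
{
    return time_after_u32(now, oom_start + timeout_secs * HZ_ASSUMED);
}
```

A plain `now > deadline` comparison would give the wrong answer once `oom_start + timeout` overflows the counter, which is why the subtraction form is used.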

Re: [RFC -v2] panic_on_oom_timeout

2015-06-19 Thread Tetsuo Handa
Michal Hocko wrote:
> Yes I was thinking about this as well because the primary assumption
> of the OOM killer is that the victim will release some memory. And it
> doesn't matter whether the OOM killer was constrained or the global
> one. So the above looks good at first sight, I am just afraid it is

Re: [RFC -v2] panic_on_oom_timeout

2015-06-19 Thread Michal Hocko
On Fri 19-06-15 20:30:10, Tetsuo Handa wrote:
> Michal Hocko wrote:
[...]
> > Fixed in my local version. I will post the new version of the patch
> > after we settle with the approach.
> >
>
> I'd like to see now,

Sure see below

[...]
> But oom_victims is incremented via mark_oom_victim() for both

Re: [RFC -v2] panic_on_oom_timeout

2015-06-19 Thread Tetsuo Handa
Michal Hocko wrote:
> On Wed 17-06-15 22:59:54, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> [...]
> > > But you have a point that we could have
> > > - constrained OOM which elevates oom_victims
> > > - global OOM killer strikes but wouldn't start the timer
> > >
> > > This is certainly possible and

Re: [RFC -v2] panic_on_oom_timeout

2015-06-17 Thread Michal Hocko
On Wed 17-06-15 22:59:54, Tetsuo Handa wrote:
> Michal Hocko wrote:
[...]
> > But you have a point that we could have
> > - constrained OOM which elevates oom_victims
> > - global OOM killer strikes but wouldn't start the timer
> >
> > This is certainly possible and timer_pending(&panic_on_oom) replacing

Re: [RFC -v2] panic_on_oom_timeout

2015-06-17 Thread Tetsuo Handa
Michal Hocko wrote:
> > > +	if (sysctl_panic_on_oom_timeout) {
> > > +		if (sysctl_panic_on_oom > 1) {
> > > +			pr_warn("panic_on_oom_timeout is ignored for panic_on_oom=2\n");
> > > +		} else {
> > > +			/*
> > > +			 * Only schedule the delayed

Re: [RFC -v2] panic_on_oom_timeout

2015-06-17 Thread Michal Hocko
On Wed 17-06-15 14:51:27, Michal Hocko wrote:
[...]
> The important thing is to decide what is the reasonable way forward. We
> have two implementations of panic based timeout. So we should decide

And the most obvious question, of course.
- Should we add a panic timeout at all?

> - Should be

Re: [RFC -v2] panic_on_oom_timeout

2015-06-17 Thread Michal Hocko
On Wed 17-06-15 21:31:21, Tetsuo Handa wrote:
> Michal Hocko wrote:
[...]
> > I think we can rely on timers. A downside would be that we cannot dump
> > the full OOM report from the IRQ context because we rely on task_lock
> > which is not IRQ safe. But I do not think we really need it. An OOM
> > report

Re: [RFC -v2] panic_on_oom_timeout

2015-06-17 Thread Tetsuo Handa
Michal Hocko wrote:
> Hi,
> I was thinking about this and I am more and more convinced that we
> shouldn't care about panic_on_oom=2 configuration for now and go with
> the simplest solution first. I have revisited my original patch and
> replaced delayed work by a timer based on the feedback from

[RFC -v2] panic_on_oom_timeout

2015-06-17 Thread Michal Hocko
Hi, I was thinking about this and I am more and more convinced that we shouldn't care about panic_on_oom=2 configuration for now and go with the simplest solution first. I have revisited my original patch and replaced delayed work by a timer based on the feedback from Tetsuo. I think we can rely
