Re: [RFC -v2] panic_on_oom_timeout

2015-07-29 Tetsuo Handa
Michal Hocko wrote: > On Wed 17-06-15 15:24:27, Michal Hocko wrote: > > On Wed 17-06-15 14:51:27, Michal Hocko wrote: > > [...] > > > The important thing is to decide what is the reasonable way forward. We > > > have two implementations of panic based timeout. So we should decide > > > > And t

Re: [RFC -v2] panic_on_oom_timeout

2015-07-29 Michal Hocko
On Wed 17-06-15 15:24:27, Michal Hocko wrote: > On Wed 17-06-15 14:51:27, Michal Hocko wrote: > [...] > > The important thing is to decide what is the reasonable way forward. We > > have two implementations of panic based timeout. So we should decide > > And the most obvious question, of cours

Re: [RFC -v2] panic_on_oom_timeout

2015-06-20 Tetsuo Handa
Tetsuo Handa wrote: > One case is that the system can not panic of threads are unable to call > out_of_memory() for some reason. ^ if > Well, if without analysis purpose, > > if (time_after(jiffies, oom_start + sysctl_panic_on_oom_timeout * HZ)) >
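The `time_after()` comparison quoted in the preview above is a wraparound-safe deadline check on the tick counter. A minimal userspace sketch of that idea follows. It assumes a stand-in `HZ` value, and the names `tick_after()` and `oom_timeout_expired()` are hypothetical helpers introduced for illustration only; they are not taken from the patch.

```c
#include <stdbool.h>
#include <limits.h>

/* Stand-in for the kernel tick rate; the value 100 is an assumption. */
#define HZ 100UL

/*
 * Same idea as the kernel's time_after(a, b): true when tick count a is
 * later than b. The unsigned difference is interpreted as signed, so the
 * result stays correct when the counter wraps around (this relies on the
 * usual two's-complement conversion, as the kernel does).
 */
static bool tick_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/*
 * Hypothetical helper mirroring the quoted condition:
 *   time_after(jiffies, oom_start + sysctl_panic_on_oom_timeout * HZ)
 * now          - current tick count (jiffies in the kernel)
 * oom_start    - tick count when the OOM condition began
 * timeout_secs - configured timeout in seconds
 */
static bool oom_timeout_expired(unsigned long now,
                                unsigned long oom_start,
                                unsigned long timeout_secs)
{
    return tick_after(now, oom_start + timeout_secs * HZ);
}
```

With these assumptions, `oom_timeout_expired(now, start, 5)` only becomes true once more than 500 ticks have elapsed since `start`, including the case where `start + 500` overflows the counter.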

Re: [RFC -v2] panic_on_oom_timeout

2015-06-19 Tetsuo Handa
Michal Hocko wrote: > Yes I was thinking about this as well because the primary assumption > of the OOM killer is that the victim will release some memory. And it > doesn't matter whether the OOM killer was constrained or the global > one. So the above looks good at first sight, I am just afraid it

Re: [RFC -v2] panic_on_oom_timeout

2015-06-19 Michal Hocko
On Fri 19-06-15 20:30:10, Tetsuo Handa wrote: > Michal Hocko wrote: [...] > > Fixed in my local version. I will post the new version of the patch > > after we settle with the approach. > > > > I'd like to see now, Sure see below [...] > But oom_victims is incremented via mark_oom_victim() for b

Re: [RFC -v2] panic_on_oom_timeout

2015-06-19 Tetsuo Handa
Michal Hocko wrote: > On Wed 17-06-15 22:59:54, Tetsuo Handa wrote: > > Michal Hocko wrote: > [...] > > > But you have a point that we could have > > > - constrained OOM which elevates oom_victims > > > - global OOM killer strikes but wouldn't start the timer > > > > > > This is certainly possible

Re: [RFC -v2] panic_on_oom_timeout

2015-06-17 Michal Hocko
On Wed 17-06-15 22:59:54, Tetsuo Handa wrote: > Michal Hocko wrote: [...] > > But you have a point that we could have > > - constrained OOM which elevates oom_victims > > - global OOM killer strikes but wouldn't start the timer > > > > This is certainly possible and timer_pending(&panic_on_oom) re

Re: [RFC -v2] panic_on_oom_timeout

2015-06-17 Tetsuo Handa
Michal Hocko wrote: > > > + if (sysctl_panic_on_oom_timeout) { > > > + if (sysctl_panic_on_oom > 1) { > > > + pr_warn("panic_on_oom_timeout is ignored for > > > panic_on_oom=2\n"); > > > + } else { > > > + /* > > > + * Only schedule
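The branch structure visible in the quoted patch fragment above can be sketched in plain C as follows. The `else` branch is cut off in the preview, so this sketch only assumes that it arms the panic timer, which matches the timer-based approach mentioned elsewhere in the thread. The enum and the function `classify_timeout()` are hypothetical names for illustration, not part of the patch.

```c
/* Possible outcomes of the configuration check (hypothetical names). */
enum timeout_action {
    TIMEOUT_UNSET,    /* no timeout configured                    */
    TIMEOUT_IGNORED,  /* timeout set, but panic_on_oom=2 wins     */
    TIMEOUT_ARMED     /* assumption: the truncated branch arms it */
};

/*
 * Mirrors the visible conditions of the quoted fragment:
 *   if (sysctl_panic_on_oom_timeout) {
 *       if (sysctl_panic_on_oom > 1)  -> warn, timeout ignored
 *       else                          -> (truncated in the preview)
 *   }
 */
static enum timeout_action classify_timeout(int panic_on_oom,
                                            unsigned long timeout_secs)
{
    if (!timeout_secs)
        return TIMEOUT_UNSET;
    if (panic_on_oom > 1)
        return TIMEOUT_IGNORED;
    return TIMEOUT_ARMED;
}
```

For example, under these assumptions `classify_timeout(2, 10)` reports that the timeout is ignored, which corresponds to the `pr_warn()` message in the fragment.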

Re: [RFC -v2] panic_on_oom_timeout

2015-06-17 Michal Hocko
On Wed 17-06-15 14:51:27, Michal Hocko wrote: [...] > The important thing is to decide what is the reasonable way forward. We > have two implementations of panic based timeout. So we should decide And the most obvious question, of course. - Should we add a panic timeout at all? > - Should be

Re: [RFC -v2] panic_on_oom_timeout

2015-06-17 Michal Hocko
On Wed 17-06-15 21:31:21, Tetsuo Handa wrote: > Michal Hocko wrote: [...] > > I think we can rely on timers. A downside would be that we cannot dump > > the full OOM report from the IRQ context because we rely on task_lock > > which is not IRQ safe. But I do not think we really need it. An OOM > >

Re: [RFC -v2] panic_on_oom_timeout

2015-06-17 Tetsuo Handa
Michal Hocko wrote: > Hi, > I was thinking about this and I am more and more convinced that we > shouldn't care about panic_on_oom=2 configuration for now and go with > the simplest solution first. I have revisited my original patch and > replaced delayed work by a timer based on the feedback from

[RFC -v2] panic_on_oom_timeout

2015-06-17 Michal Hocko
Hi, I was thinking about this and I am more and more convinced that we shouldn't care about panic_on_oom=2 configuration for now and go with the simplest solution first. I have revisited my original patch and replaced delayed work by a timer based on the feedback from Tetsuo. I think we can rely o