Michal Hocko wrote:
> Hi,
> I was thinking about this and I am more and more convinced that we
> shouldn't care about panic_on_oom=2 configuration for now and go with
> the simplest solution first. I have revisited my original patch and
> replaced delayed work by a timer based on the feedback from Tetsuo.
>

To me, obsoleting panic_on_oom > 0 sounds cleaner.

> I think we can rely on timers. A downside would be that we cannot dump
> the full OOM report from the IRQ context because we rely on task_lock
> which is not IRQ safe. But I do not think we really need it. An OOM
> report will be in the log already most of the time and show_mem will
> tell us the current memory situation.
>
> What do you think?

We can rely on timers, but we can't rely on a global timer.

> +	if (sysctl_panic_on_oom_timeout) {
> +		if (sysctl_panic_on_oom > 1) {
> +			pr_warn("panic_on_oom_timeout is ignored for panic_on_oom=2\n");
> +		} else {
> +			/*
> +			 * Only schedule the delayed panic_on_oom when this is
> +			 * the first OOM triggered. oom_lock will protect us
> +			 * from races
> +			 */
> +			if (atomic_read(&oom_victims))
> +				return;
> +
> +			mod_timer(&panic_on_oom_timer,
> +				  jiffies + (sysctl_panic_on_oom_timeout * HZ));
> +			return;
> +		}
> +	}

Since this version uses a global panic_on_oom_timer, you cannot handle
an OOM race like the one below.

  (1) p1 in memcg1 calls out_of_memory().
  (2) A 5-second timeout is started by p1.
  (3) p1 takes 3 seconds for some reason.
  (4) p2 in memcg2 calls out_of_memory().
  (5) p1 calls unmark_oom_victim() but the timer continues.
  (6) p2 takes 2 seconds for some reason.
  (7) The 5-second timeout expires even though each individual delay
      was less than 5 seconds.
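
To make the race concrete, here is a small userspace model of the
timeline, not kernel code. Every name in it (struct oom_episode,
global_timer_panics, per_episode_timer_panics) is made up for
illustration. It assumes the global timer is armed only when no victim
exists and is cancelled only once the last victim is gone, which is my
reading of the patch, and it compares that with a hypothetical timer
tracked per OOM episode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One OOM episode: when out_of_memory() was entered and how long the
 * victim needed to exit. Illustrative type, not a kernel structure. */
struct oom_episode {
	long start_ms;
	long duration_ms;
};

/* Model of one global timer. Assumption: it is armed only when no
 * victim is outstanding and cancelled only when the last victim exits.
 * Episodes must be sorted by start time. */
bool global_timer_panics(const struct oom_episode *ep, size_t n,
			 long timeout_ms)
{
	long armed_at = 0, busy_until = 0;

	for (size_t i = 0; i < n; i++) {
		long end = ep[i].start_ms + ep[i].duration_ms;

		assert(i == 0 || ep[i].start_ms >= ep[i - 1].start_ms);
		if (i == 0 || ep[i].start_ms >= busy_until) {
			/* No victim outstanding: the timer is (re)armed. */
			armed_at = ep[i].start_ms;
			busy_until = end;
		} else if (end > busy_until) {
			/* Overlapping episode: the running timer is kept. */
			busy_until = end;
		}
		if (busy_until > armed_at + timeout_ms)
			return true;
	}
	return false;
}

/* Hypothetical alternative: each episode is timed on its own. */
bool per_episode_timer_panics(const struct oom_episode *ep, size_t n,
			      long timeout_ms)
{
	for (size_t i = 0; i < n; i++)
		if (ep[i].duration_ms > timeout_ms)
			return true;
	return false;
}
```

With p1 as { 0, 3000 } and p2 as { 2900, 2200 } (p2 enters just before
p1 is unmarked; the exact numbers are assumed) and a 5000 ms timeout,
the global model reports a panic while the per-episode model does not.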