On 03.09.2015 14:06, Kirill Tkhai wrote:
>
>
> On 03.09.2015 13:13, Vladimir Davydov wrote:
>> On Thu, Sep 03, 2015 at 01:09:36PM +0300, Kirill Tkhai wrote:
>>>
>>>
>>> On 14.08.2015 20:03, Vladimir Davydov wrote:
>>>> If an oom victim process has a low prio (nice or via cpu cgroup), it may
>>>> take it very long to complete, which is bad, because the system cannot
>>>> make progress until it dies. To avoid that, this patch makes the oom
>>>> killer set the victim task's prio to the highest possible.
>>>>
>>>> It might be worth submitting this patch upstream. I will probably try.
>>>>
>>>> Signed-off-by: Vladimir Davydov <vdavy...@parallels.com>
>>>> ---
>>>>  mm/oom_kill.c | 17 +++++++++++++++--
>>>>  1 file changed, 15 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>>>> index 0e6f7535a565..ca765a82fa1a 100644
>>>> --- a/mm/oom_kill.c
>>>> +++ b/mm/oom_kill.c
>>>> @@ -294,6 +294,15 @@ enum oom_scan_t oom_scan_process_thread(struct task_struct *task,
>>>>  	return OOM_SCAN_OK;
>>>>  }
>>>>
>>>> +static void boost_dying_task(struct task_struct *p)
>>>> +{
>>>> +	/*
>>>> +	 * Set the dying task scheduling priority to the highest possible so
>>>> +	 * that it will die quickly irrespective of its scheduling policy.
>>>> +	 */
>>>> +	sched_boost_task(p, 0);
>>>> +}
>>>> +
>>>>  /*
>>>>   * Simple selection loop. We chose the process with the highest
>>>>   * number of 'points'.
>>>> @@ -321,6 +330,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
>>>>  		case OOM_SCAN_CONTINUE:
>>>>  			continue;
>>>>  		case OOM_SCAN_ABORT:
>>>> +			boost_dying_task(p);
>>>
>>> This is a potential livelock, as you are holding at least the
>>> try_set_zonelist_oom() bits locked, and a concurrent thread may use
>>> GFP_NOFAIL in __alloc_pages_slowpath(). In this case it will be
>>> looping forever.
>>
>> It won't. There are schedule_timeouts all over the place. Besides, if
>> try_set_zonelist_oom fails, the caller will call schedule_timeout.
>
> Really?
> What if a victim has the signal_pending() flag set?
>
> Even if it's not, you can't rely on schedule_timeout(). There is no
> guarantee the lock holder will be chosen for execution at all.
Ah, schedule_timeout_uninterruptible() is there. So it's OK. But there are still no guarantees...

>>>
>>> Furthermore, you manually do schedule_timeout_killable() in
>>> out_of_memory(), so this is a problem for a !PREEMPTIBLE kernel too.
>>
>> I don't get this sentence. What's the problem?
>
> It's a clarification of the main problem: it affects us.
>
>>>
>>> You mustn't leave the processor before you've cleared the bits.
>>
>> Wrong, see above.

_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel