Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Michal Hocko
On Thu 05-10-17 15:02:18, David Rientjes wrote: [...] > I would need to add patches to add the "evaluate as a whole but do not > kill all" knob and a knob for "oom priority" so that userspace has the > same influence over a cgroup based comparison that it does with a process > based comparison t

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread David Rientjes
On Thu, 5 Oct 2017, Roman Gushchin wrote: > > This patchset exists because overcommit is real, exactly the same as > > overcommit within memcg hierarchies is real. 99% of the time we don't run > > into global oom because people aren't using their limits so it just works > > out. 1% of the tim

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread David Rientjes
On Thu, 5 Oct 2017, Johannes Weiner wrote: > > It is, because it can quite clearly be a DoS and was prevented with > > Roman's earlier design of iterating usage up the hierarchy and comparing > > siblings based on that criteria. I know exactly why he chose that > > implementation detail early o

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Roman Gushchin
On Thu, Oct 05, 2017 at 01:12:30PM +0200, Michal Hocko wrote: > On Thu 05-10-17 11:27:07, Roman Gushchin wrote: > > On Wed, Oct 04, 2017 at 02:24:26PM -0700, Shakeel Butt wrote: > [...] > > > Sorry about the confusion. There are two things. First, should we do a > > > css_get on the newly selected

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Michal Hocko
On Wed 04-10-17 16:46:35, Roman Gushchin wrote: > Traditionally, the OOM killer is operating on a process level. > Under oom conditions, it finds a process with the highest oom score > and kills it. > > This behavior doesn't suit well the system with many running > containers: > > 1) There is no

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Michal Hocko
On Wed 04-10-17 16:31:38, Johannes Weiner wrote: > On Wed, Oct 04, 2017 at 01:17:14PM -0700, David Rientjes wrote: > > On Wed, 4 Oct 2017, Roman Gushchin wrote: > > > > > > > @@ -828,6 +828,12 @@ static void __oom_kill_process(struct > > > > > task_struct *victim) > > > > > struct mm_struct

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Michal Hocko
On Thu 05-10-17 11:27:07, Roman Gushchin wrote: > On Wed, Oct 04, 2017 at 02:24:26PM -0700, Shakeel Butt wrote: [...] > > Sorry about the confusion. There are two things. First, should we do a > > css_get on the newly selected memcg within the for loop when we still > > have a reference to it? > >

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Roman Gushchin
On Thu, Oct 05, 2017 at 01:40:09AM -0700, David Rientjes wrote: > On Wed, 4 Oct 2017, Johannes Weiner wrote: > > > > By only considering leaf memcgs, does this penalize users if their memcg > > > becomes oc->chosen_memcg purely because it has aggregated all of its > > > processes to be members o

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Roman Gushchin
On Wed, Oct 04, 2017 at 02:24:26PM -0700, Shakeel Butt wrote: > >> > + if (memcg_has_children(iter)) > >> > + continue; > >> > >> && iter != root_mem_cgroup ? > > > > Oh, sure. I had a stupid bug in my test script, which prevented me from > > catching this. Thank

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Johannes Weiner
On Thu, Oct 05, 2017 at 01:40:09AM -0700, David Rientjes wrote: > On Wed, 4 Oct 2017, Johannes Weiner wrote: > > > > By only considering leaf memcgs, does this penalize users if their memcg > > > becomes oc->chosen_memcg purely because it has aggregated all of its > > > processes to be members o

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread David Rientjes
On Wed, 4 Oct 2017, Johannes Weiner wrote: > > By only considering leaf memcgs, does this penalize users if their memcg > > becomes oc->chosen_memcg purely because it has aggregated all of its > > processes to be members of that memcg, which would otherwise be the > > standard behavior? > > >

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-04 Thread Shakeel Butt
>> > + if (memcg_has_children(iter)) >> > + continue; >> >> && iter != root_mem_cgroup ? > > Oh, sure. I had a stupid bug in my test script, which prevented me from > catching this. Thanks! > > This should fix the problem. > -- > diff --git a/mm/memcontrol.c b/mm

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-04 Thread Johannes Weiner
On Wed, Oct 04, 2017 at 01:27:14PM -0700, David Rientjes wrote: > By only considering leaf memcgs, does this penalize users if their memcg > becomes oc->chosen_memcg purely because it has aggregated all of its > processes to be members of that memcg, which would otherwise be the > standard behav

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-04 Thread Johannes Weiner
On Wed, Oct 04, 2017 at 01:17:14PM -0700, David Rientjes wrote: > On Wed, 4 Oct 2017, Roman Gushchin wrote: > > > > > @@ -828,6 +828,12 @@ static void __oom_kill_process(struct task_struct > > > > *victim) > > > > struct mm_struct *mm; > > > > bool can_oom_reap = true; > > > > >

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-04 Thread David Rientjes
On Wed, 4 Oct 2017, Roman Gushchin wrote: > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index b4de17a78dc1..79f30c281185 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -2670,6 +2670,178 @@ static inline bool memcg_has_children(struct > mem_cgroup *memcg) > return ret; > }

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-04 Thread Roman Gushchin
On Wed, Oct 04, 2017 at 01:17:14PM -0700, David Rientjes wrote: > On Wed, 4 Oct 2017, Roman Gushchin wrote: > > > > > @@ -828,6 +828,12 @@ static void __oom_kill_process(struct task_struct > > > > *victim) > > > > struct mm_struct *mm; > > > > bool can_oom_reap = true; > > > > >

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-04 Thread David Rientjes
On Wed, 4 Oct 2017, Roman Gushchin wrote: > > > @@ -828,6 +828,12 @@ static void __oom_kill_process(struct task_struct > > > *victim) > > > struct mm_struct *mm; > > > bool can_oom_reap = true; > > > > > > + if (is_global_init(victim) || (victim->flags & PF_KTHREAD) || > > > + victim->s

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-04 Thread Roman Gushchin
On Wed, Oct 04, 2017 at 12:48:03PM -0700, Shakeel Butt wrote: > > + > > +static void select_victim_memcg(struct mem_cgroup *root, struct > > oom_control *oc) > > +{ > > + struct mem_cgroup *iter; > > + > > + oc->chosen_memcg = NULL; > > + oc->chosen_points = 0; > > + > > +

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-04 Thread Roman Gushchin
On Wed, Oct 04, 2017 at 03:27:20PM -0400, Johannes Weiner wrote: > On Wed, Oct 04, 2017 at 04:46:35PM +0100, Roman Gushchin wrote: > > Traditionally, the OOM killer is operating on a process level. > > Under oom conditions, it finds a process with the highest oom score > > and kills it. > > > > Th

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-04 Thread Shakeel Butt
> + > +static void select_victim_memcg(struct mem_cgroup *root, struct oom_control > *oc) > +{ > + struct mem_cgroup *iter; > + > + oc->chosen_memcg = NULL; > + oc->chosen_points = 0; > + > + /* > +* The oom_score is calculated for leaf memory cgroups (including > +

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-04 Thread Johannes Weiner
On Wed, Oct 04, 2017 at 04:46:35PM +0100, Roman Gushchin wrote: > Traditionally, the OOM killer is operating on a process level. > Under oom conditions, it finds a process with the highest oom score > and kills it. > > This behavior doesn't suit well the system with many running > containers: > >

[v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-04 Thread Roman Gushchin
Traditionally, the OOM killer is operating on a process level. Under oom conditions, it finds a process with the highest oom score and kills it. This behavior doesn't suit well the system with many running containers: 1) There is no fairness between containers. A small container with few large pr