Re: [PATCH 0/3] OOM detection rework v4

2016-03-13 Thread Tetsuo Handa
Tetsuo Handa wrote: > Michal Hocko wrote: > > OK, that would suggest that the oom rework patches are not really > > related. They just moved from the livelock to a sleep which is good in > > general IMHO. We even know that it is most probably the IO that is the > > problem because we know that more

Re: [PATCH 0/3] OOM detection rework v4

2016-03-11 Thread Tetsuo Handa
Michal Hocko wrote: > OK, that would suggest that the oom rework patches are not really > related. They just moved from the livelock to a sleep which is good in > general IMHO. We even know that it is most probably the IO that is the > problem because we know that more than half of the reclaimable

Re: [PATCH 0/3] OOM detection rework v4

2016-03-11 Thread Tetsuo Handa
Michal Hocko wrote: > On Sat 12-03-16 01:49:26, Tetsuo Handa wrote: > > Michal Hocko wrote: > > > What happens without this patch applied. In other words, it all smells > > > like the IO got stuck somewhere and the direct reclaim cannot perform it > > > so we have to wait for the flushers to make a

Re: [PATCH 0/3] OOM detection rework v4

2016-03-11 Thread Michal Hocko
On Sat 12-03-16 01:49:26, Tetsuo Handa wrote: > Michal Hocko wrote: > > On Fri 11-03-16 22:32:02, Tetsuo Handa wrote: > > > Michal Hocko wrote: > > > > On Fri 11-03-16 19:45:29, Tetsuo Handa wrote: > > > > > (Posting as a reply to this thread.) > > > > > > > > I really do not see how this is relat

Re: [PATCH 0/3] OOM detection rework v4

2016-03-11 Thread Tetsuo Handa
Michal Hocko wrote: > On Fri 11-03-16 22:32:02, Tetsuo Handa wrote: > > Michal Hocko wrote: > > > On Fri 11-03-16 19:45:29, Tetsuo Handa wrote: > > > > (Posting as a reply to this thread.) > > > > > > I really do not see how this is related to this thread. > > > > All allocating tasks are looping

Re: [PATCH 0/3] OOM detection rework v4

2016-03-11 Thread Michal Hocko
On Fri 11-03-16 22:32:02, Tetsuo Handa wrote: > Michal Hocko wrote: > > On Fri 11-03-16 19:45:29, Tetsuo Handa wrote: > > > (Posting as a reply to this thread.) > > > > I really do not see how this is related to this thread. > > All allocating tasks are looping at > > /*

Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-11 Thread Michal Hocko
On Fri 11-03-16 23:53:18, Joonsoo Kim wrote: > 2016-03-09 19:41 GMT+09:00 Michal Hocko : > > On Wed 09-03-16 02:03:59, Joonsoo Kim wrote: > >> 2016-03-09 1:05 GMT+09:00 Michal Hocko : > >> > On Wed 09-03-16 00:19:03, Joonsoo Kim wrote: [...] > >> >> What's your purpose of OOM rework? From my unders

Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-11 Thread Joonsoo Kim
2016-03-09 19:41 GMT+09:00 Michal Hocko : > On Wed 09-03-16 02:03:59, Joonsoo Kim wrote: >> 2016-03-09 1:05 GMT+09:00 Michal Hocko : >> > On Wed 09-03-16 00:19:03, Joonsoo Kim wrote: >> >> 2016-03-08 1:08 GMT+09:00 Michal Hocko : >> >> > On Mon 29-02-16 22:02:13, Michal Hocko wrote: >> >> >> Andrew

Re: [PATCH 0/3] OOM detection rework v4

2016-03-11 Thread Tetsuo Handa
Michal Hocko wrote: > On Fri 11-03-16 19:45:29, Tetsuo Handa wrote: > > (Posting as a reply to this thread.) > > I really do not see how this is related to this thread. All allocating tasks are looping at /* * If we didn't make any progress and ha

Re: [PATCH 0/3] OOM detection rework v4

2016-03-11 Thread Michal Hocko
On Fri 11-03-16 19:45:29, Tetsuo Handa wrote: > (Posting as a reply to this thread.) I really do not see how this is related to this thread. -- Michal Hocko SUSE Labs

Re: [PATCH 0/3] OOM detection rework v4

2016-03-11 Thread Tetsuo Handa
(Posting as a reply to this thread.) I was trying to test side effect of "oom, oom_reaper: disable oom_reaper for oom_kill_allocating_task" compared to "oom: clear TIF_MEMDIE after oom_reaper managed to unmap the address space" using a reproducer shown below. -- Reproducer start -

Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-09 Thread Michal Hocko
On Wed 09-03-16 02:03:59, Joonsoo Kim wrote: > 2016-03-09 1:05 GMT+09:00 Michal Hocko : > > On Wed 09-03-16 00:19:03, Joonsoo Kim wrote: > >> 2016-03-08 1:08 GMT+09:00 Michal Hocko : > >> > On Mon 29-02-16 22:02:13, Michal Hocko wrote: > >> >> Andrew, > >> >> could you queue this one as well, pleas

Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-08 Thread Joonsoo Kim
2016-03-09 1:05 GMT+09:00 Michal Hocko : > On Wed 09-03-16 00:19:03, Joonsoo Kim wrote: >> 2016-03-08 1:08 GMT+09:00 Michal Hocko : >> > On Mon 29-02-16 22:02:13, Michal Hocko wrote: >> >> Andrew, >> >> could you queue this one as well, please? This is more a band aid than a >> >> real solution whi

Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-08 Thread Michal Hocko
On Wed 09-03-16 00:19:03, Joonsoo Kim wrote: > 2016-03-08 1:08 GMT+09:00 Michal Hocko : > > On Mon 29-02-16 22:02:13, Michal Hocko wrote: > >> Andrew, > >> could you queue this one as well, please? This is more a band aid than a > >> real solution which I will be working on as soon as I am able to

Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-08 Thread Joonsoo Kim
2016-03-08 1:08 GMT+09:00 Michal Hocko : > On Mon 29-02-16 22:02:13, Michal Hocko wrote: >> Andrew, >> could you queue this one as well, please? This is more a band aid than a >> real solution which I will be working on as soon as I am able to >> reproduce the issue but the patch should help to som

Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-08 Thread Michal Hocko
On Tue 08-03-16 18:58:24, Sergey Senozhatsky wrote: > On (03/07/16 17:08), Michal Hocko wrote: > > On Mon 29-02-16 22:02:13, Michal Hocko wrote: > > > Andrew, > > > could you queue this one as well, please? This is more a band aid than a > > > real solution which I will be working on as soon as I a

Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-08 Thread Hugh Dickins
On Mon, 7 Mar 2016, Michal Hocko wrote: > On Mon 29-02-16 22:02:13, Michal Hocko wrote: > > Andrew, > > could you queue this one as well, please? This is more a band aid than a > > real solution which I will be working on as soon as I am able to > > reproduce the issue but the patch should help to

Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-08 Thread Sergey Senozhatsky
On (03/07/16 17:08), Michal Hocko wrote: > On Mon 29-02-16 22:02:13, Michal Hocko wrote: > > Andrew, > > could you queue this one as well, please? This is more a band aid than a > > real solution which I will be working on as soon as I am able to > > reproduce the issue but the patch should help to

Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-08 Thread Sergey Senozhatsky
On (03/08/16 10:08), Michal Hocko wrote: > On Tue 08-03-16 12:51:04, Sergey Senozhatsky wrote: > > Hello Michal, > > > > On (03/07/16 17:08), Michal Hocko wrote: > > > On Mon 29-02-16 22:02:13, Michal Hocko wrote: > > > > Andrew, > > > > could you queue this one as well, please? This is more a ban

Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-08 Thread Michal Hocko
On Tue 08-03-16 12:51:04, Sergey Senozhatsky wrote: > Hello Michal, > > On (03/07/16 17:08), Michal Hocko wrote: > > On Mon 29-02-16 22:02:13, Michal Hocko wrote: > > > Andrew, > > > could you queue this one as well, please? This is more a band aid than a > > > real solution which I will be workin

Re: [PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-07 Thread Sergey Senozhatsky
Hello Michal, On (03/07/16 17:08), Michal Hocko wrote: > On Mon 29-02-16 22:02:13, Michal Hocko wrote: > > Andrew, > > could you queue this one as well, please? This is more a band aid than a > > real solution which I will be working on as soon as I am able to > > reproduce the issue but the patch

[PATCH] mm, oom: protect !costly allocations some more (was: Re: [PATCH 0/3] OOM detection rework v4)

2016-03-07 Thread Michal Hocko
On Mon 29-02-16 22:02:13, Michal Hocko wrote: > Andrew, > could you queue this one as well, please? This is more a band aid than a > real solution which I will be working on as soon as I am able to > reproduce the issue but the patch should help to some degree at least. Joonsoo wasn't very happy a

Re: [PATCH 0/3] OOM detection rework v4

2016-03-06 Thread Joonsoo Kim
On Fri, Mar 04, 2016 at 04:15:58PM +0100, Michal Hocko wrote: > On Fri 04-03-16 14:23:27, Joonsoo Kim wrote: > > On Thu, Mar 03, 2016 at 04:25:15PM +0100, Michal Hocko wrote: > > > On Thu 03-03-16 23:10:09, Joonsoo Kim wrote: > > > > 2016-03-03 18:26 GMT+09:00 Michal Hocko : > [...] > > > > >> I gu

Re: [PATCH 0/3] OOM detection rework v4

2016-03-04 Thread Michal Hocko
On Fri 04-03-16 16:15:58, Michal Hocko wrote: > On Fri 04-03-16 14:23:27, Joonsoo Kim wrote: [...] > > Unconditional 16 looping and then OOM kill really doesn't make any > > sense, because it doesn't mean that we already do our best. > > 16 is not really that important. We can change that if that

Re: [PATCH 0/3] OOM detection rework v4

2016-03-04 Thread Michal Hocko
On Fri 04-03-16 14:23:27, Joonsoo Kim wrote: > On Thu, Mar 03, 2016 at 04:25:15PM +0100, Michal Hocko wrote: > > On Thu 03-03-16 23:10:09, Joonsoo Kim wrote: > > > 2016-03-03 18:26 GMT+09:00 Michal Hocko : [...] > > > >> I guess that usual case for high order allocation failure has enough > > > >>

Re: [PATCH 0/3] OOM detection rework v4

2016-03-04 Thread Michal Hocko
On Thu 03-03-16 01:54:43, Hugh Dickins wrote: > On Tue, 1 Mar 2016, Michal Hocko wrote: > > [Adding Vlastimil and Joonsoo for compaction related things - this was a > > large thread but the more interesting part starts with > > http://lkml.kernel.org/r/alpine.LSU.2.11.1602241832160.15564@eggly.anvi

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Joonsoo Kim
On Thu, Mar 03, 2016 at 01:54:43AM -0800, Hugh Dickins wrote: > On Tue, 1 Mar 2016, Michal Hocko wrote: > > [Adding Vlastimil and Joonsoo for compaction related things - this was a > > large thread but the more interesting part starts with > > http://lkml.kernel.org/r/alpine.LSU.2.11.1602241832160.

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Vlastimil Babka
On 03/03/2016 09:57 PM, Hugh Dickins wrote: > >> >> I do not have an explanation why it would cause oom sooner but this >> turned out to be incomplete. There is another watermark check deeper in the >> compaction path. Could you try the one from >> http://lkml.kernel.org/r/20160302130022.gg26...@dhcp

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Joonsoo Kim
On Thu, Mar 03, 2016 at 04:50:16PM +0100, Vlastimil Babka wrote: > On 03/03/2016 03:10 PM, Joonsoo Kim wrote: > > > >> [...] > > At least, reset no_progress_loops when did_some_progress. High > > order allocation up to PAGE_ALLOC_COSTLY_ORDER is as important > > as order 0. And, reclai

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Joonsoo Kim
On Thu, Mar 03, 2016 at 04:25:15PM +0100, Michal Hocko wrote: > On Thu 03-03-16 23:10:09, Joonsoo Kim wrote: > > 2016-03-03 18:26 GMT+09:00 Michal Hocko : > > > On Wed 02-03-16 23:34:21, Joonsoo Kim wrote: > > >> 2016-03-02 23:06 GMT+09:00 Michal Hocko : > > >> > On Wed 02-03-16 22:32:09, Joonsoo K

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Hugh Dickins
On Thu, 3 Mar 2016, Michal Hocko wrote: > On Thu 03-03-16 01:54:43, Hugh Dickins wrote: > > On Tue, 1 Mar 2016, Michal Hocko wrote: > [...] > > > So I have tried the following: > > > diff --git a/mm/compaction.c b/mm/compaction.c > > > index 4d99e1f5055c..7364e48cf69a 100644 > > > --- a/mm/compacti

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Michal Hocko
On Thu 03-03-16 16:50:16, Vlastimil Babka wrote: > On 03/03/2016 03:10 PM, Joonsoo Kim wrote: > > > >> [...] > > At least, reset no_progress_loops when did_some_progress. High > > order allocation up to PAGE_ALLOC_COSTLY_ORDER is as important > > as order 0. And, reclaim something woul

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Vlastimil Babka
On 03/03/2016 03:10 PM, Joonsoo Kim wrote: > >> [...] > At least, reset no_progress_loops when did_some_progress. High > order allocation up to PAGE_ALLOC_COSTLY_ORDER is as important > as order 0. And, reclaim something would increase probability of > compaction success.

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Michal Hocko
On Thu 03-03-16 23:10:09, Joonsoo Kim wrote: > 2016-03-03 18:26 GMT+09:00 Michal Hocko : > > On Wed 02-03-16 23:34:21, Joonsoo Kim wrote: > >> 2016-03-02 23:06 GMT+09:00 Michal Hocko : > >> > On Wed 02-03-16 22:32:09, Joonsoo Kim wrote: > >> >> 2016-03-02 18:50 GMT+09:00 Michal Hocko : > >> >> > On

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Joonsoo Kim
2016-03-03 18:26 GMT+09:00 Michal Hocko : > On Wed 02-03-16 23:34:21, Joonsoo Kim wrote: >> 2016-03-02 23:06 GMT+09:00 Michal Hocko : >> > On Wed 02-03-16 22:32:09, Joonsoo Kim wrote: >> >> 2016-03-02 18:50 GMT+09:00 Michal Hocko : >> >> > On Wed 02-03-16 11:19:54, Joonsoo Kim wrote: >> >> >> On Mo

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Michal Hocko
On Thu 03-03-16 01:54:43, Hugh Dickins wrote: > On Tue, 1 Mar 2016, Michal Hocko wrote: [...] > > So I have tried the following: > > diff --git a/mm/compaction.c b/mm/compaction.c > > index 4d99e1f5055c..7364e48cf69a 100644 > > --- a/mm/compaction.c > > +++ b/mm/compaction.c > > @@ -1276,6 +1276,9

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Tetsuo Handa
Michal Hocko wrote: > Sure we can be more intelligent and reset the counter if the > feedback from compaction is optimistic and we are making some > progress. This would be less hackish and the XXX comment points into > that direction. For now I would like this to catch most loads reasonably > and

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Hugh Dickins
On Tue, 1 Mar 2016, Michal Hocko wrote: > [Adding Vlastimil and Joonsoo for compaction related things - this was a > large thread but the more interesting part starts with > http://lkml.kernel.org/r/alpine.LSU.2.11.1602241832160.15564@eggly.anvils] > > On Mon 29-02-16 23:29:06, Hugh Dickins wrote:

Re: [PATCH 0/3] OOM detection rework v4

2016-03-03 Thread Michal Hocko
On Wed 02-03-16 23:34:21, Joonsoo Kim wrote: > 2016-03-02 23:06 GMT+09:00 Michal Hocko : > > On Wed 02-03-16 22:32:09, Joonsoo Kim wrote: > >> 2016-03-02 18:50 GMT+09:00 Michal Hocko : > >> > On Wed 02-03-16 11:19:54, Joonsoo Kim wrote: > >> >> On Mon, Feb 29, 2016 at 10:02:13PM +0100, Michal Hocko

Re: [PATCH 0/3] OOM detection rework v4

2016-03-02 Thread Minchan Kim
On Wed, Mar 02, 2016 at 10:50:56AM +0100, Michal Hocko wrote: > On Wed 02-03-16 11:19:54, Joonsoo Kim wrote: > > On Mon, Feb 29, 2016 at 10:02:13PM +0100, Michal Hocko wrote: > [...] > > > > + /* > > > > +* OK, so the watermark check has failed. Make sure we do all the > > > > +

Re: [PATCH 0/3] OOM detection rework v4

2016-03-02 Thread Joonsoo Kim
2016-03-02 23:06 GMT+09:00 Michal Hocko : > On Wed 02-03-16 22:32:09, Joonsoo Kim wrote: >> 2016-03-02 18:50 GMT+09:00 Michal Hocko : >> > On Wed 02-03-16 11:19:54, Joonsoo Kim wrote: >> >> On Mon, Feb 29, 2016 at 10:02:13PM +0100, Michal Hocko wrote: >> > [...] >> >> > > + /* >> >> > > + * OK, so

Re: [PATCH 0/3] OOM detection rework v4

2016-03-02 Thread Joonsoo Kim
2016-03-02 21:37 GMT+09:00 Michal Hocko : > On Wed 02-03-16 11:55:07, Joonsoo Kim wrote: >> On Tue, Mar 01, 2016 at 07:14:08PM +0100, Vlastimil Babka wrote: > [...] >> > Yes, compaction is historically quite careful to avoid making low >> > memory conditions worse, and to prevent work if it doesn't

Re: [PATCH 0/3] OOM detection rework v4

2016-03-02 Thread Michal Hocko
On Wed 02-03-16 22:32:09, Joonsoo Kim wrote: > 2016-03-02 18:50 GMT+09:00 Michal Hocko : > > On Wed 02-03-16 11:19:54, Joonsoo Kim wrote: > >> On Mon, Feb 29, 2016 at 10:02:13PM +0100, Michal Hocko wrote: > > [...] > >> > > + /* > >> > > + * OK, so the watermark check has failed. Make sure we do al

Re: [PATCH 0/3] OOM detection rework v4

2016-03-02 Thread Joonsoo Kim
2016-03-02 18:50 GMT+09:00 Michal Hocko : > On Wed 02-03-16 11:19:54, Joonsoo Kim wrote: >> On Mon, Feb 29, 2016 at 10:02:13PM +0100, Michal Hocko wrote: > [...] >> > > + /* >> > > + * OK, so the watermark check has failed. Make sure we do all the >> > > + * retries for !costly high order requests

Re: [PATCH 0/3] OOM detection rework v4

2016-03-02 Thread Vlastimil Babka
On 03/02/2016 01:24 PM, Michal Hocko wrote: On Tue 01-03-16 19:14:08, Vlastimil Babka wrote: I was under impression that similar checks to compaction_suitable() were done also in compact_finished(), to stop compacting if memory got low due to parallel activity. But I guess it was a patch from J

Re: [PATCH 0/3] OOM detection rework v4

2016-03-02 Thread Michal Hocko
On Wed 02-03-16 11:28:46, Joonsoo Kim wrote: > On Tue, Mar 01, 2016 at 02:38:46PM +0100, Michal Hocko wrote: > > > I'd expect a build in 224M > > > RAM plus 2G of swap to take so long, that I'd be very grateful to be > > > OOM killed, even if there is technically enough space. Unless > > > perhaps

Re: [PATCH 0/3] OOM detection rework v4

2016-03-02 Thread Michal Hocko
On Wed 02-03-16 11:55:07, Joonsoo Kim wrote: > On Tue, Mar 01, 2016 at 07:14:08PM +0100, Vlastimil Babka wrote: [...] > > Yes, compaction is historically quite careful to avoid making low > > memory conditions worse, and to prevent work if it doesn't look like > > it can ultimately succeed the allo

Re: [PATCH 0/3] OOM detection rework v4

2016-03-02 Thread Michal Hocko
On Wed 02-03-16 11:19:54, Joonsoo Kim wrote: > On Mon, Feb 29, 2016 at 10:02:13PM +0100, Michal Hocko wrote: [...] > > > + /* > > > + * OK, so the watermark check has failed. Make sure we do all the > > > + * retries for !costly high order requests and hope that multiple > > > + * runs of compact

Re: [PATCH 0/3] OOM detection rework v4

2016-03-01 Thread Joonsoo Kim
On Tue, Mar 01, 2016 at 07:14:08PM +0100, Vlastimil Babka wrote: > On 03/01/2016 02:38 PM, Michal Hocko wrote: > >$ grep compact /proc/vmstat > >compact_migrate_scanned 113983 > >compact_free_scanned 1433503 > >compact_isolated 134307 > >compact_stall 128 > >compact_fail 26 > >compact_success 102 >

Re: [PATCH 0/3] OOM detection rework v4

2016-03-01 Thread Joonsoo Kim
On Tue, Mar 01, 2016 at 02:38:46PM +0100, Michal Hocko wrote: > > I'd expect a build in 224M > > RAM plus 2G of swap to take so long, that I'd be very grateful to be > > OOM killed, even if there is technically enough space. Unless > > perhaps it's some superfast swap that you have? > > the swap

Re: [PATCH 0/3] OOM detection rework v4

2016-03-01 Thread Joonsoo Kim
On Mon, Feb 29, 2016 at 10:02:13PM +0100, Michal Hocko wrote: > Andrew, > could you queue this one as well, please? This is more a band aid than a > real solution which I will be working on as soon as I am able to > reproduce the issue but the patch should help to some degree at least. I'm not sur

Re: [PATCH 0/3] OOM detection rework v4

2016-03-01 Thread Vlastimil Babka
On 03/01/2016 02:38 PM, Michal Hocko wrote: $ grep compact /proc/vmstat compact_migrate_scanned 113983 compact_free_scanned 1433503 compact_isolated 134307 compact_stall 128 compact_fail 26 compact_success 102 compact_kcompatd_wake 0 So the whole load has done the direct compaction only 128 time

Re: [PATCH 0/3] OOM detection rework v4

2016-03-01 Thread Michal Hocko
On Tue 01-03-16 14:38:46, Michal Hocko wrote: [...] > the time increased but I haven't checked how stable the result is. And those results vary a lot (even when executed from the fresh boot) as per my further testing. Sure it might be related to the virtual environment but I do not think this par

Re: [PATCH 0/3] OOM detection rework v4

2016-03-01 Thread Michal Hocko
[Adding Vlastimil and Joonsoo for compaction related things - this was a large thread but the more interesting part starts with http://lkml.kernel.org/r/alpine.LSU.2.11.1602241832160.15564@eggly.anvils] On Mon 29-02-16 23:29:06, Hugh Dickins wrote: > On Mon, 29 Feb 2016, Michal Hocko wrote: > > On

Re: [PATCH 0/3] OOM detection rework v4

2016-02-29 Thread Hugh Dickins
On Mon, 29 Feb 2016, Michal Hocko wrote: > On Wed 24-02-16 19:47:06, Hugh Dickins wrote: > [...] > > Boot with mem=1G (or boot your usual way, and do something to occupy > > most of the memory: I think /proc/sys/vm/nr_hugepages provides a great > > way to gobble up most of the memory, though it's n

Re: [PATCH 0/3] OOM detection rework v4

2016-02-29 Thread Michal Hocko
Andrew, could you queue this one as well, please? This is more a band aid than a real solution which I will be working on as soon as I am able to reproduce the issue but the patch should help to some degree at least. On Thu 25-02-16 10:23:15, Michal Hocko wrote: > From d09de26cee148b4d8c486943b4e8

Re: [PATCH 0/3] OOM detection rework v4

2016-02-29 Thread Michal Hocko
On Wed 24-02-16 19:47:06, Hugh Dickins wrote: [...] > Boot with mem=1G (or boot your usual way, and do something to occupy > most of the memory: I think /proc/sys/vm/nr_hugepages provides a great > way to gobble up most of the memory, though it's not how I've done it). > > Make sure you have swap:

Re: [PATCH 0/3] OOM detection rework v4

2016-02-26 Thread Michal Hocko
On Fri 26-02-16 18:27:16, Hillf Danton wrote: > >> > > > --- a/mm/page_alloc.c Thu Feb 25 15:43:18 2016 > > > +++ b/mm/page_alloc.c Fri Feb 26 15:18:55 2016 > > > @@ -3113,6 +3113,8 @@ should_reclaim_retry(gfp_t gfp_mask, uns > > > struct zone *zone; > > > struct zoneref *z; > > > > >

Re: [PATCH 0/3] OOM detection rework v4

2016-02-26 Thread Hillf Danton
>> > > --- a/mm/page_alloc.c Thu Feb 25 15:43:18 2016 > > +++ b/mm/page_alloc.c Fri Feb 26 15:18:55 2016 > > @@ -3113,6 +3113,8 @@ should_reclaim_retry(gfp_t gfp_mask, uns > > struct zone *zone; > > struct zoneref *z; > > > > + if (order <= PAGE_ALLOC_COSTLY_ORDER) > > +

Re: [PATCH 0/3] OOM detection rework v4

2016-02-26 Thread Michal Hocko
On Thu 25-02-16 22:32:54, Hugh Dickins wrote: > On Thu, 25 Feb 2016, Michal Hocko wrote: [...] > > From d09de26cee148b4d8c486943b4e8f3bd7ad6f4be Mon Sep 17 00:00:00 2001 > > From: Michal Hocko > > Date: Thu, 4 Feb 2016 14:56:59 +0100 > > Subject: [PATCH] mm, oom: protect !costly allocations some m

Re: [PATCH 0/3] OOM detection rework v4

2016-02-26 Thread Michal Hocko
On Fri 26-02-16 15:54:19, Hillf Danton wrote: > > > > It didn't really help, I'm afraid: it reduces the actual number of OOM > > kills which occur before the job is terminated, but doesn't stop the > > job from being terminated very soon. > > > > I also tried Hillf's patch (separately) too, but a

Re: [PATCH 0/3] OOM detection rework v4

2016-02-25 Thread Hillf Danton
> > It didn't really help, I'm afraid: it reduces the actual number of OOM > kills which occur before the job is terminated, but doesn't stop the > job from being terminated very soon. > > I also tried Hillf's patch (separately) too, but as you expected, > it didn't seem to make any difference. >

Re: [PATCH 0/3] OOM detection rework v4

2016-02-25 Thread Hugh Dickins
On Thu, 25 Feb 2016, Michal Hocko wrote: > On Wed 24-02-16 19:47:06, Hugh Dickins wrote: > [...] > > Boot with mem=1G (or boot your usual way, and do something to occupy > > most of the memory: I think /proc/sys/vm/nr_hugepages provides a great > > way to gobble up most of the memory, though it's n

Re: [PATCH 0/3] OOM detection rework v4

2016-02-25 Thread Sergey Senozhatsky
On (02/25/16 17:48), Hillf Danton wrote: > > > Can you please schedule a run for the diff attached, in which > > > non-expensive allocators are allowed to burn more CPU cycles. > > > > I do not think your patch will help. As you can see, both OOMs were for > > order-2 and there simply are no order

Re: [PATCH 0/3] OOM detection rework v4

2016-02-25 Thread Hillf Danton
> > > > Can you please schedule a run for the diff attached, in which > > non-expensive allocators are allowed to burn more CPU cycles. > > I do not think your patch will help. As you can see, both OOMs were for > order-2 and there simply are no order-2+ free blocks usable for the > allocation req

Re: [PATCH 0/3] OOM detection rework v4

2016-02-25 Thread Michal Hocko
On Thu 25-02-16 17:17:45, Hillf Danton wrote: [...] > > OOM example: > > > > [ 2392.663170] zram-test.sh invoked oom-killer: > > gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=2, > > oom_score_adj=0 [...] > > [ 2392.663260] DMA: 4*4kB (M) 1*8kB (M) 4*16kB (ME) 1*32kB (M) 2*64kB (UE

Re: [PATCH 0/3] OOM detection rework v4

2016-02-25 Thread Michal Hocko
On Wed 24-02-16 19:47:06, Hugh Dickins wrote: [...] > Boot with mem=1G (or boot your usual way, and do something to occupy > most of the memory: I think /proc/sys/vm/nr_hugepages provides a great > way to gobble up most of the memory, though it's not how I've done it). > > Make sure you have swap:

Re: [PATCH 0/3] OOM detection rework v4

2016-02-25 Thread Hillf Danton
> > On (02/24/16 19:47), Hugh Dickins wrote: > > On Wed, 3 Feb 2016, Michal Hocko wrote: > > > Hi, > > > this thread went mostly quiet. Are all the main concerns clarified? > > > Are there any new concerns? Are there any objections to targeting > > > this for the next merge window? > > > > Sorry t

Re: [PATCH 0/3] OOM detection rework v4

2016-02-24 Thread Sergey Senozhatsky
Hello, On (02/24/16 19:47), Hugh Dickins wrote: > On Wed, 3 Feb 2016, Michal Hocko wrote: > > Hi, > > this thread went mostly quiet. Are all the main concerns clarified? > > Are there any new concerns? Are there any objections to targeting > > this for the next merge window? > > Sorry to say at t

Re: [PATCH 0/3] OOM detection rework v4

2016-02-24 Thread Hugh Dickins
On Wed, 3 Feb 2016, Michal Hocko wrote: > Hi, > this thread went mostly quiet. Are all the main concerns clarified? > Are there any new concerns? Are there any objections to targeting > this for the next merge window? Sorry to say at this late date, but I do have one concern: hopefully you can twe

Re: [PATCH 0/3] OOM detection rework v4

2016-02-16 Thread Michal Hocko
On Tue 16-02-16 22:10:01, Tetsuo Handa wrote: > Michal Hocko wrote: > > On Sun 07-02-16 13:09:33, Tetsuo Handa wrote: > > [...] > > > FYI, I again hit unexpected OOM-killer during genxref on linux-4.5-rc2 > > > source. > > > I think current patchset is too fragile to merge. > > > -

Re: [PATCH 0/3] OOM detection rework v4

2016-02-16 Thread Tetsuo Handa
Michal Hocko wrote: > On Sun 07-02-16 13:09:33, Tetsuo Handa wrote: > [...] > > FYI, I again hit unexpected OOM-killer during genxref on linux-4.5-rc2 > > source. > > I think current patchset is too fragile to merge. > > > > [ 3101.626995] smbd invoked oom-

Re: [PATCH 0/3] OOM detection rework v4

2016-02-15 Thread Michal Hocko
On Sun 07-02-16 13:09:33, Tetsuo Handa wrote: [...] > FYI, I again hit unexpected OOM-killer during genxref on linux-4.5-rc2 source. > I think current patchset is too fragile to merge. > > [ 3101.626995] smbd invoked oom-killer: > gfp_mask=0x27000c0(GFP_KER

Re: [PATCH 0/3] OOM detection rework v4

2016-02-06 Thread Tetsuo Handa
Michal Hocko wrote: > On Thu 04-02-16 22:10:54, Tetsuo Handa wrote: > > Michal Hocko wrote: > > > I am not sure we can fix these pathological loads where we hit the > > > higher order depletion and there is a chance that one of the thousands > > > tasks terminates in an unpredictable way which happ

Re: [PATCH 0/3] OOM detection rework v4

2016-02-04 Thread Michal Hocko
On Thu 04-02-16 14:39:05, Michal Hocko wrote: > On Thu 04-02-16 22:10:54, Tetsuo Handa wrote: > > Michal Hocko wrote: > > > I am not sure we can fix these pathological loads where we hit the > > > higher order depletion and there is a chance that one of the thousands > > > tasks terminates in an un

Re: [PATCH 0/3] OOM detection rework v4

2016-02-04 Thread Michal Hocko
On Thu 04-02-16 22:10:54, Tetsuo Handa wrote: > Michal Hocko wrote: > > I am not sure we can fix these pathological loads where we hit the > > higher order depletion and there is a chance that one of the thousands > > tasks terminates in an unpredictable way which happens to race with the > > OOM k

Re: [PATCH 0/3] OOM detection rework v4

2016-02-04 Thread Tetsuo Handa
Michal Hocko wrote: > I am not sure we can fix these pathological loads where we hit the > higher order depletion and there is a chance that one of the thousands > tasks terminates in an unpredictable way which happens to race with the > OOM killer. When I hit this problem on Dec 24th, I didn't ru

Re: [PATCH 0/3] OOM detection rework v4

2016-02-04 Thread Michal Hocko
On Wed 03-02-16 14:58:06, David Rientjes wrote: > On Wed, 3 Feb 2016, Michal Hocko wrote: > > > Hi, > > this thread went mostly quiet. Are all the main concerns clarified? > > Are there any new concerns? Are there any objections to targeting > > this for the next merge window? > > Did we ever fig

Re: [PATCH 0/3] OOM detection rework v4

2016-02-03 Thread David Rientjes
On Wed, 3 Feb 2016, Michal Hocko wrote: > Hi, > this thread went mostly quiet. Are all the main concerns clarified? > Are there any new concerns? Are there any objections to targeting > this for the next merge window? Did we ever figure out what was causing the oom killer to be called much earli

Re: [PATCH 0/3] OOM detection rework v4

2016-02-03 Thread Michal Hocko
Hi, this thread went mostly quiet. Are all the main concerns clarified? Are there any new concerns? Are there any objections to targeting this for the next merge window? -- Michal Hocko SUSE Labs

Re: [PATCH 0/3] OOM detection rework v4

2016-01-28 Thread Michal Hocko
On Wed 27-01-16 15:18:11, David Rientjes wrote: > On Wed, 20 Jan 2016, Michal Hocko wrote: > > > > That trigger was introduced by commit 97a16fc82a7c5b0c ("mm, page_alloc: > > > only > > > enforce watermarks for order-0 allocations"), and "mm, oom: rework oom > > > detection" > > > patch hits th

Re: [PATCH 0/3] OOM detection rework v4

2016-01-27 Thread David Rientjes
On Wed, 20 Jan 2016, Michal Hocko wrote: > > That trigger was introduced by commit 97a16fc82a7c5b0c ("mm, page_alloc: > > only > > enforce watermarks for order-0 allocations"), and "mm, oom: rework oom > > detection" > > patch hits the trigger. > [] > > [ 154.829582] zone=DMA32 reclaimable=

Re: [PATCH 0/3] OOM detection rework v4

2016-01-06 Thread Vlastimil Babka
On 12/28/2015 03:13 PM, Tetsuo Handa wrote: > Tetsuo Handa wrote: >> Tetsuo Handa wrote: >> > I got OOM killers while running heavy disk I/O (extracting kernel source, >> > running lxr's genxref command). (Environ: 4 CPUs / 2048MB RAM / no swap / >> > XFS) >> > Do you think these OOM killers reaso

Re: [PATCH 0/3] OOM detection rework v4

2016-01-02 Thread Tetsuo Handa
Tetsuo Handa wrote: > Michal Hocko wrote: > > On Mon 28-12-15 21:08:56, Tetsuo Handa wrote: > > > Tetsuo Handa wrote: > > > > I got OOM killers while running heavy disk I/O (extracting kernel > > > > source, > > > > running lxr's genxref command). (Environ: 4 CPUs / 2048MB RAM / no swap > > > > /

Re: [PATCH 0/3] OOM detection rework v4

2015-12-30 Thread Tetsuo Handa
Michal Hocko wrote: > On Mon 28-12-15 21:08:56, Tetsuo Handa wrote: > > Tetsuo Handa wrote: > > > I got OOM killers while running heavy disk I/O (extracting kernel source, > > > running lxr's genxref command). (Environ: 4 CPUs / 2048MB RAM / no swap / > > > XFS) > > > Do you think these OOM killer

Re: [PATCH 0/3] OOM detection rework v4

2015-12-29 Thread Michal Hocko
On Mon 28-12-15 21:08:56, Tetsuo Handa wrote: > Tetsuo Handa wrote: > > I got OOM killers while running heavy disk I/O (extracting kernel source, > > running lxr's genxref command). (Environ: 4 CPUs / 2048MB RAM / no swap / > > XFS) > > Do you think these OOM killers reasonable? Too weak against f

Re: [PATCH 0/3] OOM detection rework v4

2015-12-29 Thread Michal Hocko
On Thu 24-12-15 21:41:19, Tetsuo Handa wrote: > I got OOM killers while running heavy disk I/O (extracting kernel source, > running lxr's genxref command). (Environ: 4 CPUs / 2048MB RAM / no swap / XFS) > Do you think these OOM killers reasonable? Too weak against fragmentation? I will have a look

Re: [PATCH 0/3] OOM detection rework v4

2015-12-28 Thread Tetsuo Handa
Tetsuo Handa wrote: > Tetsuo Handa wrote: > > I got OOM killers while running heavy disk I/O (extracting kernel source, > > running lxr's genxref command). (Environ: 4 CPUs / 2048MB RAM / no swap / > > XFS) > > Do you think these OOM killers reasonable? Too weak against fragmentation? > > Since I

Re: [PATCH 0/3] OOM detection rework v4

2015-12-28 Thread Tetsuo Handa
Tetsuo Handa wrote: > I got OOM killers while running heavy disk I/O (extracting kernel source, > running lxr's genxref command). (Environ: 4 CPUs / 2048MB RAM / no swap / XFS) > Do you think these OOM killers reasonable? Too weak against fragmentation? Well, current patch invokes OOM killers when

Re: [PATCH 0/3] OOM detection rework v4

2015-12-24 Thread Tetsuo Handa
I got OOM killers while running heavy disk I/O (extracting kernel source, running lxr's genxref command). (Environ: 4 CPUs / 2048MB RAM / no swap / XFS) Do you think these OOM killers reasonable? Too weak against fragmentation? [ 3902.430630] kthreadd invoked oom-killer: order=2, oom_score_adj=0,

Re: [PATCH 0/3] OOM detection rework v4

2015-12-18 Thread Johannes Weiner
On Fri, Dec 18, 2015 at 02:15:09PM +0100, Michal Hocko wrote: > On Wed 16-12-15 15:58:44, Andrew Morton wrote: > > It's hard to say how long declaration of oom should take. Correctness > > comes first. But what is "correct"? oom isn't a binary condition - > > there's a chance that if we keep chu

Re: [PATCH 0/3] OOM detection rework v4

2015-12-18 Thread Michal Hocko
On Wed 16-12-15 15:58:44, Andrew Morton wrote: > On Tue, 15 Dec 2015 19:19:43 +0100 Michal Hocko wrote: > > > > > ... > > > > * base kernel > > $ grep "Killed process" base-oom-run1.log | tail -n1 > > [ 211.824379] Killed process 3086 (mem_eater) total-vm:85852kB, > > anon-rss:81996kB, file-rs

Re: [PATCH 0/3] OOM detection rework v4

2015-12-18 Thread Michal Hocko
On Wed 16-12-15 15:35:13, Andrew Morton wrote: [...] > So... please have a think about it? What can we add in here to make it > as easy as possible for us (ie: you ;)) to get this code working well? > At this time, too much developer support code will be better than too > little. We can take it

Re: [PATCH 0/3] OOM detection rework v4

2015-12-16 Thread Andrew Morton
On Tue, 15 Dec 2015 19:19:43 +0100 Michal Hocko wrote: > > ... > > * base kernel > $ grep "Killed process" base-oom-run1.log | tail -n1 > [ 211.824379] Killed process 3086 (mem_eater) total-vm:85852kB, > anon-rss:81996kB, file-rss:332kB, shmem-rss:0kB > $ grep "Killed process" base-oom-run2.lo

Re: [PATCH 0/3] OOM detection rework v4

2015-12-16 Thread Andrew Morton
On Tue, 15 Dec 2015 19:19:43 +0100 Michal Hocko wrote: > This is an attempt to make the OOM detection more deterministic and > easier to follow because each reclaimer basically tracks its own > progress which is implemented at the page allocator layer rather spread > out between the allocator and

[PATCH 0/3] OOM detection rework v4

2015-12-15 Thread Michal Hocko
Hi, This is v4 of the series. The previous version was posted [1]. I have dropped the RFC because this has been sitting and waiting for the fundamental objections for quite some time and there were none. I still do not think we should rush this and merge it no sooner than 4.6. Having this in the