Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-30 Thread Mel Gorman
On Fri, Dec 30, 2016 at 12:05:45PM +0100, Michal Hocko wrote: > On Fri 30-12-16 10:19:26, Mel Gorman wrote: > > On Mon, Dec 26, 2016 at 01:48:40PM +0100, Michal Hocko wrote: > > > On Fri 23-12-16 23:26:00, Nils Holland wrote: > > > > On Fri, Dec 23, 2016 at 03:47:39PM +0100, Michal Hocko wrote: >

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-30 Thread Michal Hocko
On Fri 30-12-16 10:19:26, Mel Gorman wrote: > On Mon, Dec 26, 2016 at 01:48:40PM +0100, Michal Hocko wrote: > > On Fri 23-12-16 23:26:00, Nils Holland wrote: > > > On Fri, Dec 23, 2016 at 03:47:39PM +0100, Michal Hocko wrote: > > > > > > > > Nils, even though this is still highly experimental,

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-30 Thread Michal Hocko
On Fri 30-12-16 11:05:22, Minchan Kim wrote: > On Thu, Dec 29, 2016 at 10:04:32AM +0100, Michal Hocko wrote: > > On Thu 29-12-16 10:20:26, Minchan Kim wrote: > > > On Tue, Dec 27, 2016 at 04:55:33PM +0100, Michal Hocko wrote: [...] > > > > + * given zone_idx > > > > + */ > > > > +static unsigned

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-30 Thread Mel Gorman
On Mon, Dec 26, 2016 at 01:48:40PM +0100, Michal Hocko wrote: > On Fri 23-12-16 23:26:00, Nils Holland wrote: > > On Fri, Dec 23, 2016 at 03:47:39PM +0100, Michal Hocko wrote: > > > > > > Nils, even though this is still highly experimental, could you give it a > > > try please? > > > > Yes, no

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-29 Thread Minchan Kim
On Thu, Dec 29, 2016 at 10:04:32AM +0100, Michal Hocko wrote: > On Thu 29-12-16 10:20:26, Minchan Kim wrote: > > On Tue, Dec 27, 2016 at 04:55:33PM +0100, Michal Hocko wrote: > > > Hi, > > > could you try to run with the following patch on top of the previous > > > one? I do not think it will make

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-29 Thread Michal Hocko
On Thu 29-12-16 10:20:26, Minchan Kim wrote: > On Tue, Dec 27, 2016 at 04:55:33PM +0100, Michal Hocko wrote: > > Hi, > > could you try to run with the following patch on top of the previous > > one? I do not think it will make a large change in your workload but > > I think we need something like

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-29 Thread Michal Hocko
On Thu 29-12-16 09:48:24, Minchan Kim wrote: > On Thu, Dec 29, 2016 at 09:31:54AM +0900, Minchan Kim wrote: [...] > > Acked-by: Minchan Kim Thanks! > Nit: > > WARNING: line over 80 characters > #53: FILE: include/linux/memcontrol.h:689: > +unsigned long mem_cgroup_get_zone_lru_size(struct

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-28 Thread Minchan Kim
On Tue, Dec 27, 2016 at 04:55:33PM +0100, Michal Hocko wrote: > Hi, > could you try to run with the following patch on top of the previous > one? I do not think it will make a large change in your workload but > I think we need something like that so some testing under which is known > to make a

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-28 Thread Minchan Kim
On Thu, Dec 29, 2016 at 09:31:54AM +0900, Minchan Kim wrote: > On Mon, Dec 26, 2016 at 01:48:40PM +0100, Michal Hocko wrote: > > On Fri 23-12-16 23:26:00, Nils Holland wrote: > > > On Fri, Dec 23, 2016 at 03:47:39PM +0100, Michal Hocko wrote: > > > > > > > > Nils, even though this is still highly

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-28 Thread Minchan Kim
On Mon, Dec 26, 2016 at 01:48:40PM +0100, Michal Hocko wrote: > On Fri 23-12-16 23:26:00, Nils Holland wrote: > > On Fri, Dec 23, 2016 at 03:47:39PM +0100, Michal Hocko wrote: > > > > > > Nils, even though this is still highly experimental, could you give it a > > > try please? > > > > Yes, no

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-28 Thread Michal Hocko
On Tue 27-12-16 20:33:09, Nils Holland wrote: > On Tue, Dec 27, 2016 at 04:55:33PM +0100, Michal Hocko wrote: > > Hi, > > could you try to run with the following patch on top of the previous > > one? I do not think it will make a large change in your workload but > > I think we need something like

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-27 Thread Nils Holland
On Tue, Dec 27, 2016 at 04:55:33PM +0100, Michal Hocko wrote: > Hi, > could you try to run with the following patch on top of the previous > one? I do not think it will make a large change in your workload but > I think we need something like that so some testing under which is known > to make a

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-27 Thread Michal Hocko
Hi, could you try to run with the following patch on top of the previous one? I do not think it will make a large change in your workload but I think we need something like that so some testing under which is known to make a high lowmem pressure would be really appreciated. If you have more time

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-27 Thread Michal Hocko
On Tue 27-12-16 12:23:13, Nils Holland wrote: > On Tue, Dec 27, 2016 at 09:08:38AM +0100, Michal Hocko wrote: > > On Mon 26-12-16 19:57:03, Nils Holland wrote: > > > On Mon, Dec 26, 2016 at 01:48:40PM +0100, Michal Hocko wrote: > > > > On Fri 23-12-16 23:26:00, Nils Holland wrote: > > > > > On

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-27 Thread Nils Holland
On Tue, Dec 27, 2016 at 09:08:38AM +0100, Michal Hocko wrote: > On Mon 26-12-16 19:57:03, Nils Holland wrote: > > On Mon, Dec 26, 2016 at 01:48:40PM +0100, Michal Hocko wrote: > > > On Fri 23-12-16 23:26:00, Nils Holland wrote: > > > > On Fri, Dec 23, 2016 at 03:47:39PM +0100, Michal Hocko wrote:

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-27 Thread Michal Hocko
On Mon 26-12-16 19:57:03, Nils Holland wrote: > On Mon, Dec 26, 2016 at 01:48:40PM +0100, Michal Hocko wrote: > > On Fri 23-12-16 23:26:00, Nils Holland wrote: > > > On Fri, Dec 23, 2016 at 03:47:39PM +0100, Michal Hocko wrote: > > > > > > > > Nils, even though this is still highly experimental,

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-26 Thread Nils Holland
On Mon, Dec 26, 2016 at 01:48:40PM +0100, Michal Hocko wrote: > On Fri 23-12-16 23:26:00, Nils Holland wrote: > > On Fri, Dec 23, 2016 at 03:47:39PM +0100, Michal Hocko wrote: > > > > > > Nils, even though this is still highly experimental, could you give it a > > > try please? > > > > Yes, no

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-26 Thread Michal Hocko
On Fri 23-12-16 23:26:00, Nils Holland wrote: > On Fri, Dec 23, 2016 at 03:47:39PM +0100, Michal Hocko wrote: > > > > Nils, even though this is still highly experimental, could you give it a > > try please? > > Yes, no problem! So I kept the very first patch you sent but had to > revert the

Re: [RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-23 Thread Nils Holland
On Fri, Dec 23, 2016 at 03:47:39PM +0100, Michal Hocko wrote: > > Nils, even though this is still highly experimental, could you give it a > try please? Yes, no problem! So I kept the very first patch you sent but had to revert the latest version of the debugging patch (the one in which you

[RFC PATCH] mm, memcg: fix (Re: OOM: Better, but still there on)

2016-12-23 Thread Michal Hocko
[Add Mel, Johannes and Vladimir - the email thread started here http://lkml.kernel.org/r/20161215225702.ga27...@boerne.fritz.box The long story short, the zone->node reclaim change has broken active list aging for lowmem requests when memory cgroups are enabled. More details below. On Fri

Re: OOM: Better, but still there on

2016-12-23 Thread Michal Hocko
On Fri 23-12-16 13:18:51, Nils Holland wrote: > On Fri, Dec 23, 2016 at 11:51:57AM +0100, Michal Hocko wrote: > > TL;DR > > drop the last patch, check whether memory cgroup is enabled and retest > > with cgroup_disable=memory to see whether this is memcg related and if > > it is _not_ then try to

Re: OOM: Better, but still there on

2016-12-23 Thread Nils Holland
On Fri, Dec 23, 2016 at 11:51:57AM +0100, Michal Hocko wrote: > TL;DR > drop the last patch, check whether memory cgroup is enabled and retest > with cgroup_disable=memory to see whether this is memcg related and if > it is _not_ then try to test with the patch below Right, it seems we might be

Re: OOM: Better, but still there on

2016-12-23 Thread Michal Hocko
TL;DR drop the last patch, check whether memory cgroup is enabled and retest with cgroup_disable=memory to see whether this is memcg related and if it is _not_ then try to test with the patch below On Thu 22-12-16 22:46:11, Nils Holland wrote: > On Thu, Dec 22, 2016 at 08:17:19PM +0100, Michal

Re: OOM: Better, but still there on

2016-12-22 Thread Nils Holland
On Thu, Dec 22, 2016 at 08:17:19PM +0100, Michal Hocko wrote: > TL;DR I still do not see what is going on here and it still smells like > multiple issues. Please apply the patch below on _top_ of what you had. I've run the usual procedure again with the new patch on top and the log is now up at:

Re: OOM: Better, but still there on

2016-12-22 Thread Michal Hocko
TL;DR I still do not see what is going on here and it still smells like multiple issues. Please apply the patch below on _top_ of what you had. On Thu 22-12-16 11:10:29, Nils Holland wrote: [...] > http://ftp.tisys.org/pub/misc/boerne_2016-12-22.log.xz It took me a while to realize that

Re: OOM: Better, but still there on

2016-12-22 Thread Tetsuo Handa
Nils Holland wrote: > Well, the issue is that I could only do everything via ssh today and > don't have any physical access to the machines. In fact, both seem to > have suffered a genuine kernel panic, which is also visible in the > last few lines of the log I provided today. So, basically, both

Re: OOM: Better, but still there on

2016-12-22 Thread Nils Holland
On Thu, Dec 22, 2016 at 11:27:25AM +0100, Michal Hocko wrote: > On Thu 22-12-16 11:10:29, Nils Holland wrote: > > > However, the log comes from machine #2 again today, as I'm > > unfortunately forced to try this via VPN from work to home today, so I > > have exactly one attempt per machine before

Re: OOM: Better, but still there on

2016-12-22 Thread Michal Hocko
On Thu 22-12-16 11:10:29, Nils Holland wrote: > On Wed, Dec 21, 2016 at 08:36:59AM +0100, Michal Hocko wrote: > > TL;DR > > there is another version of the debugging patch. Just revert the > > previous one and apply this one instead. It's still not clear what > > is going on but I suspect either

Re: OOM: Better, but still there on

2016-12-22 Thread Nils Holland
On Wed, Dec 21, 2016 at 08:36:59AM +0100, Michal Hocko wrote: > TL;DR > there is another version of the debugging patch. Just revert the > previous one and apply this one instead. It's still not clear what > is going on but I suspect either some misaccounting or unexpected > pages on the LRU lists.

Re: OOM: Better, but still there on

2016-12-21 Thread Chris Mason
On Wed, Dec 21, 2016 at 12:16:53PM +0100, Michal Hocko wrote: On Wed 21-12-16 20:00:38, Tetsuo Handa wrote: One thing to note here, when we are talking about 32b kernel, things have changed in 4.8 when we moved from the zone based to node based reclaim (see b2e18757f2c9 ("mm, vmscan: begin

Re: OOM: Better, but still there on

2016-12-21 Thread Michal Hocko
On Wed 21-12-16 20:00:38, Tetsuo Handa wrote: > Michal Hocko wrote: > > TL;DR > > there is another version of the debugging patch. Just revert the > > previous one and apply this one instead. It's still not clear what > > is going on but I suspect either some misaccounting or unexpected > > pages

Re: OOM: Better, but still there on

2016-12-21 Thread Tetsuo Handa
Michal Hocko wrote: > TL;DR > there is another version of the debugging patch. Just revert the > previous one and apply this one instead. It's still not clear what > is going on but I suspect either some misaccounting or unexpected > pages on the LRU lists. I have added one more tracepoint, so

Re: OOM: Better, but still there on

2016-12-20 Thread Michal Hocko
TL;DR there is another version of the debugging patch. Just revert the previous one and apply this one instead. It's still not clear what is going on but I suspect either some misaccounting or unexpected pages on the LRU lists. I have added one more tracepoint, so please enable also

Re: OOM: Better, but still there on

2016-12-19 Thread Nils Holland
On Mon, Dec 19, 2016 at 02:45:34PM +0100, Michal Hocko wrote: > Unfortunately shrink_active_list doesn't have any tracepoint so we do > not know whether we managed to rotate those pages. If they are referenced > quickly enough we might just keep refaulting them... Could you try to apply > the

Re: OOM: Better, but still there on

2016-12-19 Thread Michal Hocko
On Sat 17-12-16 22:06:47, Nils Holland wrote: [...] > Unfortunately, the reclaim trace messages stopped a while after the first > OOM messages show up - most likely my "cat" had been killed at that > point or became unresponsive. :-/ The latter is more probable because I do not see the OOM killer

Re: OOM: Better, but still there on

2016-12-17 Thread Tetsuo Handa
Nils Holland wrote: > On Sat, Dec 17, 2016 at 11:44:45PM +0900, Tetsuo Handa wrote: > > On 2016/12/17 21:59, Nils Holland wrote: > > > On Sat, Dec 17, 2016 at 01:02:03AM +0100, Michal Hocko wrote: > > >> mount -t tracefs none /debug/trace > > >> echo 1 > /debug/trace/events/vmscan/enable > > >>

Re: OOM: Better, but still there on

2016-12-17 Thread Xin Zhou
"Nils Holland" <nholl...@tisys.org>, "Michal Hocko" <mho...@kernel.org> Cc: linux-kernel@vger.kernel.org, linux...@kvack.org, "Chris Mason" <c...@fb.com>, "David Sterba" <dste...@suse.cz>, linux-bt...@vger.kernel.org Subject: Re: OOM: Better, bu

Re: OOM: Better, but still there on

2016-12-17 Thread Xin Zhou
Hocko" Cc: linux-kernel@vger.kernel.org, linux...@kvack.org, "Chris Mason" , "David Sterba" , linux-bt...@vger.kernel.org Subject: Re: OOM: Better, but still there on On 2016/12/17 21:59, Nils Holland wrote: > On Sat, Dec 17, 2016 at 01:02:03AM +0100, Michal Hocko wrote: >>

Re: OOM: Better, but still there on

2016-12-17 Thread Nils Holland
On Sat, Dec 17, 2016 at 11:44:45PM +0900, Tetsuo Handa wrote: > On 2016/12/17 21:59, Nils Holland wrote: > > On Sat, Dec 17, 2016 at 01:02:03AM +0100, Michal Hocko wrote: > >> mount -t tracefs none /debug/trace > >> echo 1 > /debug/trace/events/vmscan/enable > >> cat /debug/trace/trace_pipe >

Re: OOM: Better, but still there on

2016-12-17 Thread Nils Holland
On Sat, Dec 17, 2016 at 11:44:45PM +0900, Tetsuo Handa wrote: > On 2016/12/17 21:59, Nils Holland wrote: > > On Sat, Dec 17, 2016 at 01:02:03AM +0100, Michal Hocko wrote: > >> mount -t tracefs none /debug/trace > >> echo 1 > /debug/trace/events/vmscan/enable > >> cat /debug/trace/trace_pipe >

Re: OOM: Better, but still there on

2016-12-17 Thread Tetsuo Handa
On 2016/12/17 21:59, Nils Holland wrote: > On Sat, Dec 17, 2016 at 01:02:03AM +0100, Michal Hocko wrote: >> mount -t tracefs none /debug/trace >> echo 1 > /debug/trace/events/vmscan/enable >> cat /debug/trace/trace_pipe > trace.log >> >> should help >> [...] > > No problem! I enabled writing the

Re: OOM: Better, but still there on

2016-12-17 Thread Nils Holland
On Sat, Dec 17, 2016 at 01:02:03AM +0100, Michal Hocko wrote: > On Fri 16-12-16 19:47:00, Nils Holland wrote: > > > > Dec 16 18:56:24 boerne.fritz.box kernel: Purging GPU memory, 37 pages > > freed, 10219 pages still pinned. > > Dec 16 18:56:29 boerne.fritz.box kernel: kthreadd invoked

Re: OOM: Better, but still there on

2016-12-16 Thread Michal Hocko
On Fri 16-12-16 19:47:00, Nils Holland wrote: [...] > Despite the fact that I'm no expert, I can see that there's no more > GFP_NOFS being logged, which seems to be what the patches tried to > achieve. What the still present OOMs mean remains up for > interpretation by the experts, all I can say

Re: OOM: Better, but still there on 4.9

2016-12-16 Thread Michal Hocko
On Fri 16-12-16 17:47:25, Chris Mason wrote: > On 12/16/2016 05:14 PM, Michal Hocko wrote: > > On Fri 16-12-16 13:15:18, Chris Mason wrote: > > > On 12/16/2016 02:39 AM, Michal Hocko wrote: > > [...] > > > > I believe the right way to go around this is to pursue what I've started > > > > in [1]. I

Re: OOM: Better, but still there on 4.9

2016-12-16 Thread Chris Mason
On 12/16/2016 05:14 PM, Michal Hocko wrote: On Fri 16-12-16 13:15:18, Chris Mason wrote: On 12/16/2016 02:39 AM, Michal Hocko wrote: [...] I believe the right way to go around this is to pursue what I've started in [1]. I will try to prepare something for testing today for you. Stay tuned.

Re: OOM: Better, but still there on 4.9

2016-12-16 Thread Michal Hocko
On Fri 16-12-16 13:15:18, Chris Mason wrote: > On 12/16/2016 02:39 AM, Michal Hocko wrote: [...] > > I believe the right way to go around this is to pursue what I've started > > in [1]. I will try to prepare something for testing today for you. Stay > > tuned. But I would be really happy if

Re: OOM: Better, but still there on 4.9

2016-12-16 Thread Chris Mason
On 12/16/2016 02:39 AM, Michal Hocko wrote: [CC linux-mm and btrfs guys] On Thu 15-12-16 23:57:04, Nils Holland wrote: [...] Of course, none of this are workloads that are new / special in any way - prior to 4.8, I never experienced any issues doing the exact same things. Dec 15 19:02:16

Re: OOM: Better, but still there on

2016-12-16 Thread Nils Holland
On Fri, Dec 16, 2016 at 04:58:06PM +0100, Michal Hocko wrote: > On Fri 16-12-16 08:39:41, Michal Hocko wrote: > [...] > > That being said, the OOM killer invocation is clearly pointless and > > pre-mature. We normally do not invoke it normally for GFP_NOFS requests > > exactly for these reasons.

Re: OOM: Better, but still there on

2016-12-16 Thread Michal Hocko
On Fri 16-12-16 08:39:41, Michal Hocko wrote: [...] > That being said, the OOM killer invocation is clearly pointless and > pre-mature. We normally do not invoke it normally for GFP_NOFS requests > exactly for these reasons. But this is GFP_NOFS|__GFP_NOFAIL which > behaves differently. I am about

Re: OOM: Better, but still there on 4.9

2016-12-15 Thread Michal Hocko
[CC linux-mm and btrfs guys] On Thu 15-12-16 23:57:04, Nils Holland wrote: [...] > Of course, none of this are workloads that are new / special in any > way - prior to 4.8, I never experienced any issues doing the exact > same things. > > Dec 15 19:02:16 teela kernel: kworker/u4:5 invoked
