Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-10 Thread Michal Hocko
On Thu 06-11-14 11:28:45, Tejun Heo wrote: > On Thu, Nov 06, 2014 at 05:02:23PM +0100, Michal Hocko wrote: [...] > > > We're doing one thing for non-PM freezing and the other way around for > > > PM freezing, which indicates one of the two directions is wrong. > > > > Because those two paths are q

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-06 Thread Michal Hocko
On Thu 06-11-14 11:33:04, Tejun Heo wrote: > On Thu, Nov 06, 2014 at 05:31:24PM +0100, Michal Hocko wrote: > > On Thu 06-11-14 11:12:11, Tejun Heo wrote: [...] > > > Draining oom killer then doesn't mean anything, no? OOM killer may > > > have been disabled and drained but the killed tasks might w

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-06 Thread Tejun Heo
On Thu, Nov 06, 2014 at 05:31:24PM +0100, Michal Hocko wrote: > On Thu 06-11-14 11:12:11, Tejun Heo wrote: > > On Thu, Nov 06, 2014 at 05:01:58PM +0100, Michal Hocko wrote: > > > Yes, OOM killer simply kicks the process sets TIF_MEMDIE and terminates. > > > That will release the read_lock, allow th

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-06 Thread Michal Hocko
On Thu 06-11-14 11:12:11, Tejun Heo wrote: > On Thu, Nov 06, 2014 at 05:01:58PM +0100, Michal Hocko wrote: > > Yes, OOM killer simply kicks the process sets TIF_MEMDIE and terminates. > > That will release the read_lock, allow this to take the write lock and > > check whether it the current has bee

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-06 Thread Tejun Heo
On Thu, Nov 06, 2014 at 05:02:23PM +0100, Michal Hocko wrote: > > Why would PM freezing make OOM killing fail? That doesn't make much > > sense. Sure, it can block it for a finite duration for sync purposes > > but making OOM killing fail seems the wrong way around. > > We cannot block in the

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-06 Thread Tejun Heo
On Thu, Nov 06, 2014 at 05:01:58PM +0100, Michal Hocko wrote: > Yes, OOM killer simply kicks the process sets TIF_MEMDIE and terminates. > That will release the read_lock, allow this to take the write lock and > check whether it the current has been killed without any races. > OOM killer doesn't wa

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-06 Thread Michal Hocko
On Thu 06-11-14 10:01:21, Tejun Heo wrote: > On Thu, Nov 06, 2014 at 01:49:53PM +0100, Michal Hocko wrote: > > On Wed 05-11-14 12:55:27, Tejun Heo wrote: > > > On Wed, Nov 05, 2014 at 06:46:09PM +0100, Michal Hocko wrote: > > > > Because out_of_memory can be called from mutliple paths. And > > > >

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-06 Thread Michal Hocko
On Thu 06-11-14 10:09:27, Tejun Heo wrote: > On Thu, Nov 06, 2014 at 02:05:43PM +0100, Michal Hocko wrote: > > But this is nothing new. Suspend hasn't been checking for fatal signals > > nor for TIF_MEMDIE since the OOM disabling was introduced and I suppose > > even before. > > > > This is not ha

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-06 Thread Tejun Heo
On Thu, Nov 06, 2014 at 02:05:43PM +0100, Michal Hocko wrote: > But this is nothing new. Suspend hasn't been checking for fatal signals > nor for TIF_MEMDIE since the OOM disabling was introduced and I suppose > even before. > > This is not harmful though. The previous OOM kill attempt would kick

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-06 Thread Tejun Heo
On Thu, Nov 06, 2014 at 01:49:53PM +0100, Michal Hocko wrote: > On Wed 05-11-14 12:55:27, Tejun Heo wrote: > > On Wed, Nov 05, 2014 at 06:46:09PM +0100, Michal Hocko wrote: > > > Because out_of_memory can be called from mutliple paths. And > > > the only interesting one should be the page allocatio

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-06 Thread Michal Hocko
On Wed 05-11-14 12:01:11, Tejun Heo wrote: > On Wed, Nov 05, 2014 at 11:54:28AM -0500, Tejun Heo wrote: > > > Still not following. How do you want to detect an on-going OOM without > > > any interface around out_of_memory? > > > > I thought you were using oom_killer_allowed_start() outside OOM pat

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-06 Thread Michal Hocko
On Wed 05-11-14 12:55:27, Tejun Heo wrote: > On Wed, Nov 05, 2014 at 06:46:09PM +0100, Michal Hocko wrote: > > Because out_of_memory can be called from mutliple paths. And > > the only interesting one should be the page allocation path. > > pagefault_out_of_memory is not interesting because it cann

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Tejun Heo
On Wed, Nov 05, 2014 at 06:46:09PM +0100, Michal Hocko wrote: > Because out_of_memory can be called from mutliple paths. And > the only interesting one should be the page allocation path. > pagefault_out_of_memory is not interesting because it cannot happen for > the frozen task. Hmmm wouldn't

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Michal Hocko
On Wed 05-11-14 11:54:28, Tejun Heo wrote: > On Wed, Nov 05, 2014 at 05:39:56PM +0100, Michal Hocko wrote: > > On Wed 05-11-14 11:29:29, Tejun Heo wrote: > > > Hello, Michal. > > > > > > On Wed, Nov 05, 2014 at 05:01:15PM +0100, Michal Hocko wrote: > > > > I am not sure I am following. With the la

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Tejun Heo
On Wed, Nov 05, 2014 at 11:54:28AM -0500, Tejun Heo wrote: > > Still not following. How do you want to detect an on-going OOM without > > any interface around out_of_memory? > > I thought you were using oom_killer_allowed_start() outside OOM path. > Ugh why is everything weirdly structured? o

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Tejun Heo
On Wed, Nov 05, 2014 at 05:39:56PM +0100, Michal Hocko wrote: > On Wed 05-11-14 11:29:29, Tejun Heo wrote: > > Hello, Michal. > > > > On Wed, Nov 05, 2014 at 05:01:15PM +0100, Michal Hocko wrote: > > > I am not sure I am following. With the latest patch OOM path is no > > > longer blocked by the P

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Michal Hocko
On Wed 05-11-14 11:29:29, Tejun Heo wrote: > Hello, Michal. > > On Wed, Nov 05, 2014 at 05:01:15PM +0100, Michal Hocko wrote: > > I am not sure I am following. With the latest patch OOM path is no > > longer blocked by the PM (aka oom_killer_disable()). Allocations simply > > fail if the read_tryl

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Tejun Heo
Hello, Michal. On Wed, Nov 05, 2014 at 05:01:15PM +0100, Michal Hocko wrote: > I am not sure I am following. With the latest patch OOM path is no > longer blocked by the PM (aka oom_killer_disable()). Allocations simply > fail if the read_trylock fails. > oom_killer_disable is moved before tasks a

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Michal Hocko
On Wed 05-11-14 10:44:36, Tejun Heo wrote: > On Wed, Nov 05, 2014 at 02:42:19PM +0100, Michal Hocko wrote: > > On Wed 05-11-14 14:31:00, Michal Hocko wrote: > > > On Wed 05-11-14 08:02:47, Tejun Heo wrote: > > [...] > > > > Also, why isn't this part of > > > > oom_killer_disable/enable()? The way

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Tejun Heo
On Wed, Nov 05, 2014 at 02:42:19PM +0100, Michal Hocko wrote: > On Wed 05-11-14 14:31:00, Michal Hocko wrote: > > On Wed 05-11-14 08:02:47, Tejun Heo wrote: > [...] > > > Also, why isn't this part of > > > oom_killer_disable/enable()? The way they're implemented is really > > > silly now. It just

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Michal Hocko
Ups, just noticed that I have a compile fix staged which didn't make it into git format-patch. Will repost after/if you are OK with this approach. But I guess this is much better outcome. Thanks for pushing Tejun! On Wed 05-11-14 15:14:58, Michal Hocko wrote: [...] > diff --git a/mm/oom_kill.c b/m

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Michal Hocko
On Wed 05-11-14 13:46:20, Michal Hocko wrote: [...] > From ef6227565fa65b52986c4626d49ba53b499e54d1 Mon Sep 17 00:00:00 2001 > From: Michal Hocko > Date: Wed, 5 Nov 2014 11:49:14 +0100 > Subject: [PATCH] OOM, PM: make OOM detection in the freezer path raceless > > 5695be142e20 (OOM, PM: OOM kille

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Michal Hocko
On Wed 05-11-14 14:42:19, Michal Hocko wrote: > On Wed 05-11-14 14:31:00, Michal Hocko wrote: > > On Wed 05-11-14 08:02:47, Tejun Heo wrote: > [...] > > > Also, why isn't this part of > > > oom_killer_disable/enable()? The way they're implemented is really > > > silly now. It just sets a flag and

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Michal Hocko
On Wed 05-11-14 14:31:00, Michal Hocko wrote: > On Wed 05-11-14 08:02:47, Tejun Heo wrote: [...] > > Also, why isn't this part of > > oom_killer_disable/enable()? The way they're implemented is really > > silly now. It just sets a flag and returns whether there's a > > currently running instance

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Michal Hocko
On Wed 05-11-14 08:02:47, Tejun Heo wrote: > Hello, Michal. > > On Wed, Nov 05, 2014 at 01:46:20PM +0100, Michal Hocko wrote: > > As I've said I wasn't entirely happy with this half solution but it helped > > the current situation at the time. The full solution would require to > > I don't think

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Tejun Heo
Hello, Michal. On Wed, Nov 05, 2014 at 01:46:20PM +0100, Michal Hocko wrote: > As I've said I wasn't entirely happy with this half solution but it helped > the current situation at the time. The full solution would require to I don't think this helps the situation. It just makes the bug more obs

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-05 Thread Michal Hocko
On Tue 04-11-14 14:27:05, Tejun Heo wrote: > Hello, > > Sorry about the delay. > > On Tue, Oct 21, 2014 at 04:29:39PM +0200, Michal Hocko wrote: > > Reduce the race window by checking all tasks after OOM killer has been > > Ugh... this is never a good direction to take. It often just ends up >

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-11-04 Thread Tejun Heo
Hello, Sorry about the delay. On Tue, Oct 21, 2014 at 04:29:39PM +0200, Michal Hocko wrote: > Reduce the race window by checking all tasks after OOM killer has been Ugh... this is never a good direction to take. It often just ends up making bugs harder to reproduce and track down. > disabled.

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-10-26 Thread Pavel Machek
Hi! > > +/* > + * Number of OOM killer invocations (including memcg OOM killer). > + * Primarily used by PM freezer to check for potential races with > + * OOM killed frozen task. > + */ > +static atomic_t oom_kills = ATOMIC_INIT(0); > + > +int oom_kills_count(void) > +{ > + return atomic_rea

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-10-26 Thread Pavel Machek
Hi! > + > + /* > + * There might have been an OOM kill while we were > + * freezing tasks and the killed task might be still > + * on the way out so we have to double check for race. > + */ ", so" > /* > + * PM-freezer sh

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-10-22 Thread Rafael J. Wysocki
On Wednesday, October 22, 2014 04:22:26 PM Michal Hocko wrote: > On Wed 22-10-14 16:39:12, Rafael J. Wysocki wrote: > > On Tuesday, October 21, 2014 04:29:39 PM Michal Hocko wrote: > > > On Tue 21-10-14 16:41:07, Rafael J. Wysocki wrote: > > > > On Tuesday, October 21, 2014 04:11:59 PM Michal Hocko

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-10-22 Thread Michal Hocko
On Wed 22-10-14 16:39:12, Rafael J. Wysocki wrote: > On Tuesday, October 21, 2014 04:29:39 PM Michal Hocko wrote: > > On Tue 21-10-14 16:41:07, Rafael J. Wysocki wrote: > > > On Tuesday, October 21, 2014 04:11:59 PM Michal Hocko wrote: > > [...] > > > > OK, incremental diff on top. I will post the

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-10-22 Thread Rafael J. Wysocki
On Tuesday, October 21, 2014 04:29:39 PM Michal Hocko wrote: > On Tue 21-10-14 16:41:07, Rafael J. Wysocki wrote: > > On Tuesday, October 21, 2014 04:11:59 PM Michal Hocko wrote: > [...] > > > OK, incremental diff on top. I will post the complete patch if you are > > > happier with this change > >

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-10-21 Thread Michal Hocko
On Tue 21-10-14 16:41:07, Rafael J. Wysocki wrote: > On Tuesday, October 21, 2014 04:11:59 PM Michal Hocko wrote: [...] > > OK, incremental diff on top. I will post the complete patch if you are > > happier with this change > > Yes, I am. --- >From 9ab46fe539cded8e7b6425b2cd23ba9184002fde Mon Sep

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-10-21 Thread Rafael J. Wysocki
On Tuesday, October 21, 2014 04:11:59 PM Michal Hocko wrote: > On Tue 21-10-14 15:42:23, Rafael J. Wysocki wrote: > > On Tuesday, October 21, 2014 03:14:45 PM Michal Hocko wrote: > > > On Tue 21-10-14 14:09:27, Rafael J. Wysocki wrote: > > > [...] > > > > > @@ -131,12 +132,40 @@ int freeze_processe

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-10-21 Thread Michal Hocko
On Tue 21-10-14 15:42:23, Rafael J. Wysocki wrote: > On Tuesday, October 21, 2014 03:14:45 PM Michal Hocko wrote: > > On Tue 21-10-14 14:09:27, Rafael J. Wysocki wrote: > > [...] > > > > @@ -131,12 +132,40 @@ int freeze_processes(void) > > > > > > > > printk("Freezing user space processes

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-10-21 Thread Rafael J. Wysocki
On Tuesday, October 21, 2014 03:14:45 PM Michal Hocko wrote: > On Tue 21-10-14 14:09:27, Rafael J. Wysocki wrote: > [...] > > > @@ -131,12 +132,40 @@ int freeze_processes(void) > > > > > > printk("Freezing user space processes ... "); > > > pm_freezing = true; > > > + oom_kills_saved = oom_ki

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-10-21 Thread Michal Hocko
On Tue 21-10-14 14:09:27, Rafael J. Wysocki wrote: [...] > > @@ -131,12 +132,40 @@ int freeze_processes(void) > > > > printk("Freezing user space processes ... "); > > pm_freezing = true; > > + oom_kills_saved = oom_kills_count(); > > error = try_to_freeze_tasks(true); > > if (!

Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-10-21 Thread Rafael J. Wysocki
On Tuesday, October 21, 2014 09:27:14 AM Michal Hocko wrote: > PM freezer relies on having all tasks frozen by the time devices are > getting frozen so that no task will touch them while they are getting > frozen. But OOM killer is allowed to kill an already frozen task in > order to handle OOM sit

[PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

2014-10-21 Thread Michal Hocko
PM freezer relies on having all tasks frozen by the time devices are getting frozen so that no task will touch them while they are getting frozen. But OOM killer is allowed to kill an already frozen task in order to handle OOM situtation. In order to protect from late wake ups OOM killer is disable