Re: System freezes after OOM

2016-07-18 Thread David Rientjes
On Mon, 18 Jul 2016, Michal Hocko wrote: > > There's > > two fundamental ways to go about it: (1) ensure mempool_alloc() can make > > forward progress (whether that's by way of gfp flags or access to memory > > reserves, which may depend on the process context such as PF_MEMALLOC) or > > (2) r

Re: System freezes after OOM

2016-07-18 Thread Johannes Weiner
CC Dave Chinner, who I recall had strong opinions on the mempool model The context is commit f9054c7 ("mm, mempool: only set __GFP_NOMEMALLOC if there are free elements"), which gives MEMALLOC/TIF_MEMDIE mempool allocations access to the system emergency reserves when there is no reserved object c

Re: System freezes after OOM

2016-07-18 Thread Michal Hocko
On Fri 15-07-16 14:47:30, David Rientjes wrote: > On Fri, 15 Jul 2016, Michal Hocko wrote: [...] > > And let me repeat your proposed patch > > has undesirable side effects so we should think about a way to deal > > with those cases. It might work for your setups but it shouldn't break > > others

Re: System freezes after OOM

2016-07-18 Thread Michal Hocko
On Fri 15-07-16 13:02:17, Mikulas Patocka wrote: > > > On Fri, 15 Jul 2016, Michal Hocko wrote: > > > On Fri 15-07-16 08:11:22, Mikulas Patocka wrote: > > > > > > The stacktraces showed that the kcryptd process was throttled when it > > > tried to do mempool allocation. Mempool adds the __GFP_

Re: System freezes after OOM

2016-07-15 Thread Mikulas Patocka
On Fri, 15 Jul 2016, David Rientjes wrote: > Kworkers are processing writeback, ext4_writepages() relies on kmem that ext4_writepages is above device mapper, not below, so how it could block device mapper progress? Do you use device mapper on the top of block loop device? Writing to loop is

Re: System freezes after OOM

2016-07-15 Thread David Rientjes
On Fri, 15 Jul 2016, Mikulas Patocka wrote: > And what about the oom reaper? It should have freed all victim's pages > even if the victim is looping in mempool_alloc. Why the oom reaper didn't > free up memory? > Is that possible with mlock or shared memory? Nope. The oom killer does not ha

Re: System freezes after OOM

2016-07-15 Thread David Rientjes
On Fri, 15 Jul 2016, Michal Hocko wrote: > > If PF_MEMALLOC context is allocating too much memory reserves, then I'd > > argue that is a problem independent of using mempool_alloc() since > > mempool_alloc() can evolve directly into a call to the page allocator. > > How does such a process gua

Re: System freezes after OOM

2016-07-15 Thread Mikulas Patocka
On Fri, 15 Jul 2016, David Rientjes wrote: > On Fri, 15 Jul 2016, Mikulas Patocka wrote: > > > > There is no guarantee that _anything_ can return memory to the mempool, > > > > You misunderstand mempools if you make such claims. > > > > There is in fact guarantee that objects will be returned

Re: System freezes after OOM

2016-07-15 Thread David Rientjes
On Fri, 15 Jul 2016, Mikulas Patocka wrote: > > There is no guarantee that _anything_ can return memory to the mempool, > > You misunderstand mempools if you make such claims. > > There is in fact guarantee that objects will be returned to mempool. In > the past I reviewed device mapper thoroug

Re: System freezes after OOM

2016-07-15 Thread David Rientjes
On Fri, 15 Jul 2016, Mikulas Patocka wrote: > > Umm, show me an explicit guarantee where the oom reaper will free memory > > such that other threads may return memory to this process's mempool so it > > can make forward progress in mempool_alloc() without the need of utilizing > > memory reserv

Re: System freezes after OOM

2016-07-15 Thread Mikulas Patocka
On Fri, 15 Jul 2016, Michal Hocko wrote: > On Fri 15-07-16 08:11:22, Mikulas Patocka wrote: > > > > The stacktraces showed that the kcryptd process was throttled when it > > tried to do mempool allocation. Mempool adds the __GFP_NORETRY flag to the > > allocation, but unfortunately, this fla

Re: System freezes after OOM

2016-07-15 Thread Michal Hocko
On Fri 15-07-16 08:11:22, Mikulas Patocka wrote: > > > On Fri, 15 Jul 2016, Michal Hocko wrote: > > > On Thu 14-07-16 13:35:35, Mikulas Patocka wrote: > > > On Thu, 14 Jul 2016, Michal Hocko wrote: > > > > On Thu 14-07-16 10:00:16, Mikulas Patocka wrote: > > > > > But it needs other changes to h

Re: System freezes after OOM

2016-07-15 Thread Mikulas Patocka
On Fri, 15 Jul 2016, Michal Hocko wrote: > On Thu 14-07-16 13:35:35, Mikulas Patocka wrote: > > On Thu, 14 Jul 2016, Michal Hocko wrote: > > > On Thu 14-07-16 10:00:16, Mikulas Patocka wrote: > > > > But it needs other changes to honor the PF_LESS_THROTTLE flag: > > > > > > > > static int curre

Re: System freezes after OOM

2016-07-15 Thread Mikulas Patocka
On Fri, 15 Jul 2016, Michal Hocko wrote: > On Thu 14-07-16 13:38:42, David Rientjes wrote: > > On Thu, 14 Jul 2016, Michal Hocko wrote: > > > > > > It prevents the whole system from livelocking due to an oom killed > > > > process > > > > stalling forever waiting for mempool_alloc() to return

Re: System freezes after OOM

2016-07-15 Thread Tetsuo Handa
On 2016/07/15 2:07, Ondrej Kozina wrote: > On 07/14/2016 05:31 PM, Michal Hocko wrote: >> On Thu 14-07-16 16:08:28, Ondrej Kozina wrote: >> [...] >>> As Mikulas pointed out, this doesn't work. The system froze as well with the >>> patch above. Will try to tweak the patch with Mikulas's suggestion..

Re: System freezes after OOM

2016-07-15 Thread Mikulas Patocka
On Thu, 14 Jul 2016, David Rientjes wrote: > On Thu, 14 Jul 2016, Tetsuo Handa wrote: > > > David Rientjes wrote: > > > On Wed, 13 Jul 2016, Mikulas Patocka wrote: > > > > > > > What are the real problems that > > > > f9054c70d28bc214b2857cf8db8269f4f45a5e23 > > > > tries to fix? > > > > >

Re: System freezes after OOM

2016-07-15 Thread Mikulas Patocka
On Thu, 14 Jul 2016, David Rientjes wrote: > There is no guarantee that _anything_ can return memory to the mempool, You misunderstand mempools if you make such claims. There is in fact guarantee that objects will be returned to mempool. In the past I reviewed device mapper thoroughly to make

Re: System freezes after OOM

2016-07-15 Thread Michal Hocko
On Thu 14-07-16 13:35:35, Mikulas Patocka wrote: > On Thu, 14 Jul 2016, Michal Hocko wrote: > > On Thu 14-07-16 10:00:16, Mikulas Patocka wrote: > > > But it needs other changes to honor the PF_LESS_THROTTLE flag: > > > > > > static int current_may_throttle(void) > > > { > > > return !(cur

Re: System freezes after OOM

2016-07-15 Thread Michal Hocko
Let me paste the patch with the full changelog and the explanation so that we can reason about it more easily. If I am making some false assumptions then please point them out. --- From ed46e3f7f5a6e896331eeadc9d09e2796acb3d01 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Thu, 14 Jul 2016 19

Re: System freezes after OOM

2016-07-15 Thread Michal Hocko
On Thu 14-07-16 13:38:42, David Rientjes wrote: > On Thu, 14 Jul 2016, Michal Hocko wrote: > > > > It prevents the whole system from livelocking due to an oom killed > > > process > > > stalling forever waiting for mempool_alloc() to return. No other threads > > > may be oom killed while waiti

Re: System freezes after OOM

2016-07-14 Thread David Rientjes
On Fri, 15 Jul 2016, Tetsuo Handa wrote: > Whether the OOM reaper will free some memory no longer matters. Instead, > whether the OOM reaper will let the OOM killer select next OOM victim matters. > > Are you aware that the OOM reaper will let the OOM killer select next OOM > victim (currently by

Re: System freezes after OOM

2016-07-14 Thread Tetsuo Handa
David Rientjes wrote: > On Thu, 14 Jul 2016, Tetsuo Handa wrote: > > > David Rientjes wrote: > > > On Wed, 13 Jul 2016, Mikulas Patocka wrote: > > > > > > > What are the real problems that > > > > f9054c70d28bc214b2857cf8db8269f4f45a5e23 > > > > tries to fix? > > > > > > > > > > It prevents t

Re: System freezes after OOM

2016-07-14 Thread David Rientjes
On Thu, 14 Jul 2016, Michal Hocko wrote: > > It prevents the whole system from livelocking due to an oom killed process > > stalling forever waiting for mempool_alloc() to return. No other threads > > may be oom killed while waiting for it to exit. > > But it is true that the patch has uninten

Re: System freezes after OOM

2016-07-14 Thread David Rientjes
On Thu, 14 Jul 2016, Tetsuo Handa wrote: > David Rientjes wrote: > > On Wed, 13 Jul 2016, Mikulas Patocka wrote: > > > > > What are the real problems that f9054c70d28bc214b2857cf8db8269f4f45a5e23 > > > tries to fix? > > > > > > > It prevents the whole system from livelocking due to an oom kill

Re: System freezes after OOM

2016-07-14 Thread David Rientjes
On Thu, 14 Jul 2016, Mikulas Patocka wrote: > > schedule > > schedule_timeout > > io_schedule_timeout > > mempool_alloc > > __split_and_process_bio > > dm_request > > generic_make_request > > submit_bio > > mpage_readpages > > ext4_readpages > > __do_page_cache_readahead > > ra_submit > > filemap_

Re: System freezes after OOM

2016-07-14 Thread Mikulas Patocka
On Thu, 14 Jul 2016, Michal Hocko wrote: > On Thu 14-07-16 10:00:16, Mikulas Patocka wrote: > > > > > > On Thu, 14 Jul 2016, Michal Hocko wrote: > > > > > On Wed 13-07-16 11:02:15, Mikulas Patocka wrote: > > > > > > > diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c > > > > > index

Re: System freezes after OOM

2016-07-14 Thread Michal Hocko
On Thu 14-07-16 19:36:59, Michal Hocko wrote: > On Thu 14-07-16 19:07:52, Ondrej Kozina wrote: > > On 07/14/2016 05:31 PM, Michal Hocko wrote: > > > On Thu 14-07-16 16:08:28, Ondrej Kozina wrote: > > > [...] > > > > As Mikulas pointed out, this doesn't work. The system froze as well > > > > with t

Re: System freezes after OOM

2016-07-14 Thread Michal Hocko
On Thu 14-07-16 19:07:52, Ondrej Kozina wrote: > On 07/14/2016 05:31 PM, Michal Hocko wrote: > > On Thu 14-07-16 16:08:28, Ondrej Kozina wrote: > > [...] > > > As Mikulas pointed out, this doesn't work. The system froze as well with > > > the > > > patch above. Will try to tweak the patch with Mik

Re: System freezes after OOM

2016-07-14 Thread Ondrej Kozina
On 07/14/2016 05:31 PM, Michal Hocko wrote: On Thu 14-07-16 16:08:28, Ondrej Kozina wrote: [...] As Mikulas pointed out, this doesn't work. The system froze as well with the patch above. Will try to tweak the patch with Mikulas's suggestion... Thank you for testing! Do you happen to have trace

Re: System freezes after OOM

2016-07-14 Thread Michal Hocko
On Thu 14-07-16 16:08:28, Ondrej Kozina wrote: [...] > As Mikulas pointed out, this doesn't work. The system froze as well with the > patch above. Will try to tweak the patch with Mikulas's suggestion... Thank you for testing! Do you happen to have traces of the frozen processes? Does the flusher

Re: System freezes after OOM

2016-07-14 Thread Michal Hocko
On Wed 13-07-16 16:53:28, David Rientjes wrote: > On Wed, 13 Jul 2016, Mikulas Patocka wrote: > > > What are the real problems that f9054c70d28bc214b2857cf8db8269f4f45a5e23 > > tries to fix? > > > > It prevents the whole system from livelocking due to an oom killed process > stalling forever w

Re: System freezes after OOM

2016-07-14 Thread Ondrej Kozina
On 07/14/2016 04:59 PM, Michal Hocko wrote: On Thu 14-07-16 10:00:16, Mikulas Patocka wrote: On Thu, 14 Jul 2016, Michal Hocko wrote: On Wed 13-07-16 11:02:15, Mikulas Patocka wrote: diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index 4f3cb3554944..0b806810efab 100644 --- a/d

Re: System freezes after OOM

2016-07-14 Thread Ondrej Kozina
On 07/14/2016 02:51 PM, Michal Hocko wrote: On Wed 13-07-16 11:02:15, Mikulas Patocka wrote: On Wed, 13 Jul 2016, Michal Hocko wrote: [...] We are discussing several topics together so let's focus on this particular thing for now The kernel 4.7-rc almost deadlocks in another way. The machine

Re: System freezes after OOM

2016-07-14 Thread Michal Hocko
On Thu 14-07-16 10:00:16, Mikulas Patocka wrote: > > > On Thu, 14 Jul 2016, Michal Hocko wrote: > > > On Wed 13-07-16 11:02:15, Mikulas Patocka wrote: > > > > > diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c > > > > index 4f3cb3554944..0b806810efab 100644 > > > > --- a/drivers/md/dm

Re: System freezes after OOM

2016-07-14 Thread Mikulas Patocka
On Thu, 14 Jul 2016, Michal Hocko wrote: > On Wed 13-07-16 11:02:15, Mikulas Patocka wrote: > > > diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c > > > index 4f3cb3554944..0b806810efab 100644 > > > --- a/drivers/md/dm-crypt.c > > > +++ b/drivers/md/dm-crypt.c > > > @@ -1392,11 +1392,

Re: System freezes after OOM

2016-07-14 Thread Michal Hocko
On Wed 13-07-16 11:02:15, Mikulas Patocka wrote: > On Wed, 13 Jul 2016, Michal Hocko wrote: [...] We are discussing several topics together so let's focus on this particular thing for now > > > The kernel 4.7-rc almost deadlocks in another way. The machine got stuck > > > and the following stackt

Re: System freezes after OOM

2016-07-14 Thread Mikulas Patocka
On Thu, 14 Jul 2016, Tetsuo Handa wrote: > Michal Hocko wrote: > > OK, this is the part I have missed. I didn't realize that the swapout > > path, which is indeed PF_MEMALLOC, can get down to blk code which uses > > mempools. A quick code traversal shows that at least > > make_request_fn = blk

Re: System freezes after OOM

2016-07-14 Thread Mikulas Patocka
On Wed, 13 Jul 2016, David Rientjes wrote: > On Wed, 13 Jul 2016, Mikulas Patocka wrote: > > > What are the real problems that f9054c70d28bc214b2857cf8db8269f4f45a5e23 > > tries to fix? > > > > It prevents the whole system from livelocking due to an oom killed process > stalling forever wai

Re: System freezes after OOM

2016-07-14 Thread Tetsuo Handa
Michal Hocko wrote: > OK, this is the part I have missed. I didn't realize that the swapout > path, which is indeed PF_MEMALLOC, can get down to blk code which uses > mempools. A quick code travers shows that at least > make_request_fn = blk_queue_bio > blk_queue_bio > get_reque

Re: System freezes after OOM

2016-07-14 Thread Milan Broz
On 07/14/2016 11:09 AM, Michal Hocko wrote: > On Wed 13-07-16 11:21:41, Mikulas Patocka wrote: >> >> >> On Wed, 13 Jul 2016, Milan Broz wrote: >> >>> On 07/13/2016 02:50 PM, Michal Hocko wrote: On Wed 13-07-16 13:10:06, Michal Hocko wrote: > On Tue 12-07-16 19:44:11, Mikulas Patocka wrote:

Re: System freezes after OOM

2016-07-14 Thread Michal Hocko
On Wed 13-07-16 11:21:41, Mikulas Patocka wrote: > > > On Wed, 13 Jul 2016, Milan Broz wrote: > > > On 07/13/2016 02:50 PM, Michal Hocko wrote: > > > On Wed 13-07-16 13:10:06, Michal Hocko wrote: > > >> On Tue 12-07-16 19:44:11, Mikulas Patocka wrote: > > > [...] > > >>> As long as swapping is i

Re: System freezes after OOM

2016-07-13 Thread David Rientjes
On Wed, 13 Jul 2016, Tetsuo Handa wrote: > I wonder whether commit f9054c70d28bc214 ("mm, mempool: only set > __GFP_NOMEMALLOC if there are free elements") is doing correct thing. > It says > > If an oom killed thread calls mempool_alloc(), it is possible that it'll > loop forever if ther

Re: System freezes after OOM

2016-07-13 Thread David Rientjes
On Wed, 13 Jul 2016, Mikulas Patocka wrote: > What are the real problems that f9054c70d28bc214b2857cf8db8269f4f45a5e23 > tries to fix? > It prevents the whole system from livelocking due to an oom killed process stalling forever waiting for mempool_alloc() to return. No other threads may be

Re: System freezes after OOM

2016-07-13 Thread Mikulas Patocka
On Wed, 13 Jul 2016, Milan Broz wrote: > On 07/13/2016 02:50 PM, Michal Hocko wrote: > > On Wed 13-07-16 13:10:06, Michal Hocko wrote: > >> On Tue 12-07-16 19:44:11, Mikulas Patocka wrote: > > [...] > >>> As long as swapping is in progress, the free memory is below the limit > >>> (because the

Re: System freezes after OOM

2016-07-13 Thread Mikulas Patocka
On Wed, 13 Jul 2016, Michal Hocko wrote: > On Wed 13-07-16 10:18:35, Mikulas Patocka wrote: > > > > > > On Wed, 13 Jul 2016, Michal Hocko wrote: > > > > > [CC David] > > > > > > > > It is caused by the commit f9054c70d28bc214b2857cf8db8269f4f45a5e23. > > > > > Prior to this commit, mempool

Re: System freezes after OOM

2016-07-13 Thread Michal Hocko
On Wed 13-07-16 10:18:35, Mikulas Patocka wrote: > > > On Wed, 13 Jul 2016, Michal Hocko wrote: > > > [CC David] > > > > > > It is caused by the commit f9054c70d28bc214b2857cf8db8269f4f45a5e23. > > > > Prior to this commit, mempool allocations set __GFP_NOMEMALLOC, so > > > > they never exhau

Re: System freezes after OOM

2016-07-13 Thread Mikulas Patocka
On Wed, 13 Jul 2016, Michal Hocko wrote: > On Tue 12-07-16 19:44:11, Mikulas Patocka wrote: > > The problem of swapping to dm-crypt is this. > > > > The free memory goes low, kswapd decides that some page should be swapped > > out. However, when you swap to an encrypted device, writeback of eac

Re: System freezes after OOM

2016-07-13 Thread Mikulas Patocka
On Wed, 13 Jul 2016, Michal Hocko wrote: > [CC David] > > > > It is caused by the commit f9054c70d28bc214b2857cf8db8269f4f45a5e23. > > > Prior to this commit, mempool allocations set __GFP_NOMEMALLOC, so > > > they never exhausted reserved memory. With this commit, mempool > > > allocations

Re: System freezes after OOM

2016-07-13 Thread Mikulas Patocka
On Wed, 13 Jul 2016, Michal Hocko wrote: > On Wed 13-07-16 10:35:01, Jerome Marchand wrote: > > On 07/13/2016 01:44 AM, Mikulas Patocka wrote: > > > The problem of swapping to dm-crypt is this. > > > > > > The free memory goes low, kswapd decides that some page should be swapped > > > out. How

Re: System freezes after OOM

2016-07-13 Thread Milan Broz
On 07/13/2016 02:50 PM, Michal Hocko wrote: > On Wed 13-07-16 13:10:06, Michal Hocko wrote: >> On Tue 12-07-16 19:44:11, Mikulas Patocka wrote: > [...] >>> As long as swapping is in progress, the free memory is below the limit >>> (because the swapping activity itself consumes any memory over the

Re: System freezes after OOM

2016-07-13 Thread Michal Hocko
[CC David] On Wed 13-07-16 22:19:23, Tetsuo Handa wrote: > >> On Mon 11-07-16 11:43:02, Mikulas Patocka wrote: > >> [...] > >>> The general problem is that the memory allocator does 16 retries to > >>> allocate a page and then triggers the OOM killer (and it doesn't take > >>> into > >>> accoun

Re: System freezes after OOM

2016-07-13 Thread Tetsuo Handa
> On Tue, 12 Jul 2016, Michal Hocko wrote: > >> On Mon 11-07-16 11:43:02, Mikulas Patocka wrote: >> [...] >>> The general problem is that the memory allocator does 16 retries to >>> allocate a page and then triggers the OOM killer (and it doesn't take into >>> account how much swap space is free

Re: System freezes after OOM

2016-07-13 Thread Michal Hocko
On Wed 13-07-16 13:10:06, Michal Hocko wrote: > On Tue 12-07-16 19:44:11, Mikulas Patocka wrote: [...] > > As long as swapping is in progress, the free memory is below the limit > > (because the swapping activity itself consumes any memory over the limit). > > And that triggered the OOM killer pr

Re: System freezes after OOM

2016-07-13 Thread Michal Hocko
On Wed 13-07-16 10:35:01, Jerome Marchand wrote: > On 07/13/2016 01:44 AM, Mikulas Patocka wrote: > > The problem of swapping to dm-crypt is this. > > > > The free memory goes low, kswapd decides that some page should be swapped > > out. However, when you swap to an encrypted device, writeback of

Re: System freezes after OOM

2016-07-13 Thread Michal Hocko
On Tue 12-07-16 19:44:11, Mikulas Patocka wrote: > The problem of swapping to dm-crypt is this. > > The free memory goes low, kswapd decides that some page should be swapped > out. However, when you swap to an encrypted device, writeback of each page > requires another page to hold the encrypted

Re: System freezes after OOM

2016-07-13 Thread Jerome Marchand
On 07/13/2016 01:44 AM, Mikulas Patocka wrote: > The problem of swapping to dm-crypt is this. > > The free memory goes low, kswapd decides that some page should be swapped > out. However, when you swap to an encrypted device, writeback of each page > requires another page to hold the encrypted da

Re: System freezes after OOM

2016-07-12 Thread Mikulas Patocka
The problem of swapping to dm-crypt is this. The free memory goes low, kswapd decides that some page should be swapped out. However, when you swap to an encrypted device, writeback of each page requires another page to hold the encrypted data. dm-crypt uses mempools for all its structures and pa

Re: System freezes after OOM

2016-07-11 Thread Michal Hocko
On Mon 11-07-16 11:43:02, Mikulas Patocka wrote: [...] > The general problem is that the memory allocator does 16 retries to > allocate a page and then triggers the OOM killer (and it doesn't take into > account how much swap space is free or how many dirty pages were really > swapped out while

Re: System freezes after OOM

2016-07-11 Thread Mikulas Patocka
On Mon, 11 Jul 2016, Ondrej Kozina wrote: > On 07/11/2016 01:55 PM, Jerome Marchand wrote: > > On 07/11/2016 01:03 PM, Stanislav Kozina wrote: > > > Hi Jerome, > > > > > > On upstream mailing lists there have been reports of freezing systems > > > due to OOM. Ondra (on CC) managed to reproduce