Which version are you on?

-- Leif 

> On Jan 18, 2017, at 3:21 PM, Kapil Sharma (kapsharm) <[email protected]> 
> wrote:
> 
> We are seeing interesting behavior at high memory usage. Apologies for the
> long and detailed email.
> 
> Our ATS caches have been running for many months and have reached a point
> where ATS has allocated a huge amount of memory to its freelist pools (we can
> confirm this by dumping the mempools). I understand that this is a known ATS
> behavior/limitation/issue, where freelist mempools, once allocated, are never
> reclaimed (and that there may be a patch adding reclamation support).
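> (For reference, we confirm the freelist growth by having traffic_server
> periodically dump its memory/freelist counters to traffic.out. Assuming this
> build still has the standard records.config knob, that is roughly:
>
>   # dump freelist/memory info to traffic.out every hour; 0 disables it
>   CONFIG proxy.config.dump_mem_info_frequency INT 3600
>
> and then watching the per-allocator totals grow over time.)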
> 
> But my question is about an interesting side issue that we observe once an
> ATS cache reaches this stage. We see our ATS caches allocating a large amount
> of memory for slab caches, primarily the dentry cache. This is probably okay,
> because even though the kernel does greedy allocation for its internal caches
> (page cache, dentry cache, inode cache, etc.), all of that memory is
> reclaimable under memory pressure. Now, the more interesting behavior is that
> in this low-memory state, only one of the NUMA nodes is exhausting its pages,
> and this particular ATS cache has been in that state for several days.
> Here is a snipped /proc/zoneinfo output:
> 
> <snip>
> Node 0, zone   Normal
>   pages free     6320539   <<< Roughly 25GB free (note: the system has 512GB total)
>         min      8129
>         low      10161
>         high     12193
>         scanned  0
>         spanned  66584576
>         present  65674240
>     nr_free_pages 6320539
>     nr_inactive_anon 71
>     nr_active_anon 79274
>     nr_inactive_file 1720428
>     nr_active_file 4580107
>     nr_unevictable 39168773
>     nr_mlock     39168773
>     nr_anon_pages 39239109
>     nr_mapped    13298
>     nr_file_pages 6309563
>     nr_dirty     91
>     nr_writeback 0
>     nr_slab_reclaimable 4581560  <<< ~19G.
>     nr_slab_unreclaimable 16047
>  <snip>
> Node 1, zone   Normal
>   pages free     10224        <<<< Check this. It is below low watermark!!!
>         min      8193
>         low      10241
>         high     12289
>         scanned  0
>         spanned  67108864
>         present  66191360
>     nr_free_pages 10224
>     nr_inactive_anon 64
>     nr_active_anon 20886
>     nr_inactive_file 42840
>     nr_active_file 330486
>     nr_unevictable 45630255
>     nr_mlock     45630255
>     nr_anon_pages 45649954
>     nr_mapped    2151
>     nr_file_pages 374576
>     nr_dirty     9
>     nr_writeback 0
>     nr_slab_reclaimable 11939312   <<< ~48G 
>     nr_slab_unreclaimable 17135   
> <snip>
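> (In case it is useful, here is the rough sketch we use to pull the numbers
> above out of /proc/zoneinfo. The 4 KiB page size is an assumption for x86_64,
> and the field names are just the ones visible in the snippet:
>
>   #!/usr/bin/env python3
>   # Summarize free pages vs. watermarks and reclaimable slab per zone.
>   PAGE_KIB = 4  # assumed x86_64 page size
>
>   def gib(pages):
>       return pages * PAGE_KIB / (1024.0 * 1024.0)
>
>   zone, stats = None, {}
>   with open("/proc/zoneinfo") as f:
>       for line in f:
>           parts = line.split()
>           if parts and parts[0] == "Node":
>               zone = "Node %s zone %s" % (parts[1].rstrip(","), parts[3])
>               stats[zone] = {}
>           elif zone and len(parts) == 2:
>               try:
>                   stats[zone][parts[0]] = int(parts[1])
>               except ValueError:
>                   pass
>
>   for zone, s in sorted(stats.items()):
>       if "nr_free_pages" in s and "low" in s:
>           print("%s: free %.1f GiB (low watermark %.1f GiB), "
>                 "slab_reclaimable %.1f GiB"
>                 % (zone, gib(s["nr_free_pages"]), gib(s["low"]),
>                    gib(s.get("nr_slab_reclaimable", 0))))
>
> which is consistent with the annotations above.)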
> 
> It would appear that page allocations for slab (per slabtop, it is pretty
> much all dentry) are disproportionately hitting NUMA node 1. Under these
> conditions, my guess is that node 1 will be constantly under memory pressure,
> causing page scan/reclaim to run continuously. Without knowing much about the
> Linux kernel MM, I am guessing this may be suboptimal?
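> As a sanity check on the "it is pretty much all dentry, and it is skewed to
> one node" part, we look at /proc/sys/fs/dentry-state and the per-node
> SReclaimable counters; a rough sketch (the sysfs paths are standard, but
> treat them as an assumption for your kernel version):
>
>   #!/usr/bin/env python3
>   # Total dentry count plus reclaimable slab per NUMA node.
>   import glob
>
>   # /proc/sys/fs/dentry-state: nr_dentry, nr_unused, age_limit, want_pages, ...
>   with open("/proc/sys/fs/dentry-state") as f:
>       nr_dentry, nr_unused = f.read().split()[:2]
>   print("dentries: %s total, %s unused" % (nr_dentry, nr_unused))
>
>   # Reclaimable slab per node (mostly dentry/inode caches in our case).
>   for path in sorted(glob.glob("/sys/devices/system/node/node*/meminfo")):
>       for line in open(path):
>           if "SReclaimable" in line:
>               # e.g. "Node 1 SReclaimable:   47757248 kB"
>               fields = line.split()
>               print("node %s: SReclaimable %s kB" % (fields[1], fields[3]))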
> 
> Please correct my (wild) assumptions about why we may be observing this:
> - My guess is that a dentry is being created for each new "accepted"
> connection socket.
> - There is only one ACCEPT thread handling port 80 requests in our cache
> configuration. The ACCEPT thread is responsible for opening the FDs for
> accepted socket connections.
> - The ACCEPT thread is confined to run on a cpuset belonging to one NUMA node
> only...
> (I am connecting a lot of dots here; a rough way to check the thread affinity
> is sketched below.)
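> Here is a rough sketch of how we plan to verify the last point, by looking at
> the affinity of the traffic_server accept thread(s) -- assuming the accept
> threads still show up with "ACCEPT" in their comm name, which I have not
> double-checked:
>
>   #!/usr/bin/env python3
>   # Print CPU affinity of traffic_server threads whose name mentions ACCEPT.
>   import glob, sys
>
>   pid = sys.argv[1]  # pass the traffic_server pid, e.g. from pidof
>   for task in sorted(glob.glob("/proc/%s/task/*" % pid)):
>       with open(task + "/comm") as f:
>           name = f.read().strip()
>       if "ACCEPT" not in name:
>           continue
>       for line in open(task + "/status"):
>           if line.startswith("Cpus_allowed_list"):
>               tid = task.rsplit("/", 1)[1]
>               print("%s (tid %s): cpus %s"
>                     % (name, tid, line.split(":", 1)[1].strip()))
>
> If all of those CPUs land on one NUMA node, that would at least be consistent
> with the allocations piling up on node 1.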
> 
> Any insight will be appreciated.
> 
> thanks
> Kapil