Which version are you on? -- Leif
> On Jan 18, 2017, at 3:21 PM, Kapil Sharma (kapsharm) <[email protected]> wrote:
>
> We are seeing interesting behavior at high memory usage. Apologies for a long
> and detailed email.
>
> Our ATS caches have been running for many months and have reached a point
> where ATS has allocated a huge amount of memory to free list pools (we can
> confirm this by dumping the mempools). I understand that this is a known ATS
> behavior/limitation/issue, where freelist mempools, once allocated, are never
> reclaimed (and that there may be a patch adding reclamation support).
>
> But my question is about an interesting side issue we observe once the ATS
> cache reaches this stage. We see our ATS caches allocating a large amount of
> memory for the slab cache -- primarily the "dentry" cache. This is probably
> okay, because even though the kernel allocates greedily for its internal
> caches (page cache, dentry, inode cache, etc.), all of that memory is
> reclaimable under memory pressure. Now, the more interesting behavior is that
> in this low-memory state, only one of the NUMA zones is exhausting its pages!
> This particular ATS cache has been in that state for several days.
>
> Here is snipped output from /proc/zoneinfo:
>
> <snip>
> Node 0, zone   Normal
>   pages free     6320539   <<< roughly 25GB free (note: the system has 512GB total)
>         min      8129
>         low      10161
>         high     12193
>         scanned  0
>         spanned  66584576
>         present  65674240
>     nr_free_pages 6320539
>     nr_inactive_anon 71
>     nr_active_anon 79274
>     nr_inactive_file 1720428
>     nr_active_file 4580107
>     nr_unevictable 39168773
>     nr_mlock 39168773
>     nr_anon_pages 39239109
>     nr_mapped 13298
>     nr_file_pages 6309563
>     nr_dirty 91
>     nr_writeback 0
>     nr_slab_reclaimable 4581560   <<< ~10G
>     nr_slab_unreclaimable 16047
> <snip>
> Node 1, zone   Normal
>   pages free     10224     <<<< check this -- it is below the low watermark!
>         min      8193
>         low      10241
>         high     12289
>         scanned  0
>         spanned  67108864
>         present  66191360
>     nr_free_pages 10224
>     nr_inactive_anon 64
>     nr_active_anon 20886
>     nr_inactive_file 42840
>     nr_active_file 330486
>     nr_unevictable 45630255
>     nr_mlock 45630255
>     nr_anon_pages 45649954
>     nr_mapped 2151
>     nr_file_pages 374576
>     nr_dirty 9
>     nr_writeback 0
>     nr_slab_reclaimable 11939312   <<< ~48G
>     nr_slab_unreclaimable 17135
> <snip>
>
> It would appear that page allocations for slab (from slabtop it is pretty
> much all dentry) are disproportionately hitting NUMA node 1. Under these
> conditions, my guess is that node 1's memory will be constantly under
> low-memory pressure, causing page scan/reclaim to run constantly. Without
> knowing much about the Linux kernel MM, I am guessing this may be suboptimal?
>
> Please correct my (wild) assumptions about why we may be observing this:
> - My guess is that a dentry is created for each newly accepted connection
>   socket.
> - There is only one ACCEPT thread handling port 80 requests in our cache
>   configuration. The ACCEPT thread is responsible for opening FDs for
>   accepted socket connections.
> - The ACCEPT thread is confined to run on a cpuset belonging to one NUMA
>   node only... (I am connecting a lot of dots here.)
>
> Any insight will be appreciated.
>
> thanks
> Kapil
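
To test the last assumption above (the ACCEPT thread being confined to CPUs of a
single NUMA node), one option is a small standalone check along the lines of the
sketch below. This is only illustrative, not ATS code, and it assumes libnuma is
installed (it uses numa_node_of_cpu; build with -lnuma). Run inside the thread of
interest, it prints every NUMA node that the thread's CPU affinity mask maps to;
the same information can also be read for a specific ATS thread from the
Cpus_allowed_list field in /proc/<pid>/task/<tid>/status.

    /* Illustrative sketch only (not ATS code): print the NUMA node(s) that
     * the calling thread's CPU affinity mask maps onto.  If the ACCEPT
     * thread reports a single node, the theory that its allocations all
     * land on that node becomes more plausible.
     * Build with: gcc -o numa_check numa_check.c -lnuma */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <numa.h>

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "libnuma: NUMA not available on this system\n");
            return 1;
        }

        cpu_set_t set;
        if (sched_getaffinity(0, sizeof(set), &set) != 0) {  /* 0 = calling thread */
            perror("sched_getaffinity");
            return 1;
        }

        int seen[256] = { 0 };
        for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
            if (CPU_ISSET(cpu, &set)) {
                int node = numa_node_of_cpu(cpu);
                if (node >= 0 && node < 256 && !seen[node]) {
                    seen[node] = 1;
                    printf("affinity mask includes CPUs on NUMA node %d\n", node);
                }
            }
        }
        return 0;
    }

If this reports only one node for the ACCEPT thread, a possible follow-up is to
compare per-node slab growth with numastat while traffic is running.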
