[v7 PATCH 01/12] mm: vmscan: use nid from shrink_control for tracepoint

2021-02-09 Thread Yang Shi
. It seems confusing. And the following patch will remove using nid directly in do_shrink_slab(), this patch also helps cleanup the code. Acked-by: Vlastimil Babka Acked-by: Kirill Tkhai Signed-off-by: Yang Shi --- mm/vmscan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[v7 PATCH 0/12] Make shrinker's nr_deferred memcg aware

2021-02-09 Thread Yang Shi
emcgs would need ~3.2MB memory. It seems fine. We have been running the patched kernel on some hosts of our fleet (test and production) for months, it works very well. The monitor data shows the working set is sustained as expected. Yang Shi (12): mm: vmscan: use nid from shrink_c

[v7 PATCH 02/12] mm: vmscan: consolidate shrinker_maps handling code

2021-02-09 Thread Yang Shi
can.c for tighter integration with shrinker code, and remove the "memcg_" prefix. There is no functional change. Acked-by: Vlastimil Babka Acked-by: Kirill Tkhai Signed-off-by: Yang Shi --- include/linux/memcontrol.h | 11 ++-- mm/huge_memory.c | 4 +- mm/list_lru.c

[v7 PATCH 07/12] mm: vmscan: use a new flag to indicate shrinker is registered

2021-02-09 Thread Yang Shi
This would prevent the shrinkers from unregistering correctly. Remove SHRINKER_REGISTERING since we could check if shrinker is registered successfully by the new flag. Acked-by: Kirill Tkhai Signed-off-by: Yang Shi --- include/linux/shrinker.h | 7 --- mm/vmscan.c

Re: [v8 PATCH 00/13] Make shrinker's nr_deferred memcg aware

2021-02-25 Thread Yang Shi
h the others so that it can get a broader test. What do you think about it? Thanks, Yang On Tue, Feb 16, 2021 at 4:13 PM Yang Shi wrote: > > > Changelog > v7 --> v8: > * Added lockdep assert in expand_shrinker_info() per Roman. > * Added patch 05/13 to use kvfree

[PATCH] doc: memcontrol: add description for oom_kill

2021-02-25 Thread Yang Shi
. Signed-off-by: Yang Shi --- Documentation/admin-guide/cgroup-v1/memory.rst | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst index 0936412e044e..44d5429636e2 100644 --- a/Documentation/admin-guide

Re: [PATCH] doc: memcontrol: add description for oom_kill

2021-02-26 Thread Yang Shi
On Thu, Feb 25, 2021 at 11:30 PM Michal Hocko wrote: > > On Thu 25-02-21 18:12:54, Yang Shi wrote: > > When debugging an oom issue, I found the oom_kill counter of memcg is > > confusing. At the first glance without checking document, I thought it > > just counts for mem

Re: [PATCH] doc: memcontrol: add description for oom_kill

2021-02-26 Thread Yang Shi
On Fri, Feb 26, 2021 at 8:42 AM Yang Shi wrote: > > On Thu, Feb 25, 2021 at 11:30 PM Michal Hocko wrote: > > > > On Thu 25-02-21 18:12:54, Yang Shi wrote: > > > When debugging an oom issue, I found the oom_kill counter of memcg is > > > confusing. At the fi

Re: [PATCH] shmem, memcg: enable memcg aware shrinker

2020-06-01 Thread Yang Shi
On Sun, May 31, 2020 at 8:22 PM Greg Thelen wrote: > > Since v4.19 commit b0dedc49a2da ("mm/vmscan.c: iterate only over charged > shrinkers during memcg shrink_slab()") a memcg aware shrinker is only > called when the per-memcg per-node shrinker_map indicates that the > shrinker may have objects t

Re: [RFC linux-next PATCH] mm: khugepaged: remove error message when checking external pins

2020-05-18 Thread Yang Shi
On 5/18/20 3:19 AM, Kirill A. Shutemov wrote: On Wed, May 13, 2020 at 05:03:03AM +0800, Yang Shi wrote: When running khugepaged with higher frequency (for example, set scan_sleep_millisecs to 0), the below error message was reported: khugepaged: expected_refcount (1024) > refcount (

Re: [RFC linux-next PATCH] mm: khugepaged: remove error message when checking external pins

2020-05-21 Thread Yang Shi
On 5/21/20 7:56 AM, Qian Cai wrote: On Wed, May 13, 2020 at 05:03:03AM +0800, Yang Shi wrote: [] mm/khugepaged.c | 24 +--- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 1fdd677..048f5d4 100644 --- a/mm

Re: [PATCH 1/3] mm: don't call activate_page() on new ksm pages

2020-08-12 Thread Yang Shi
On Tue, Aug 11, 2020 at 9:04 PM Yu Zhao wrote: > > lru_cache_add_active_or_unevictable() already adds new ksm pages to > active lru. Calling activate_page() isn't really necessary in this > case. > > Signed-off-by: Yu Zhao > --- > mm/swapfile.c | 10 +- > 1 file changed, 5 insertions(+),

Re: [PATCH] mm, THP, swap: fix allocating cluster for swapfile by mistake

2020-08-19 Thread Yang Shi
On Wed, Aug 19, 2020 at 1:15 PM Gao Xiang wrote: > > Hi Andrew, > > On Wed, Aug 19, 2020 at 01:05:06PM -0700, Andrew Morton wrote: > > On Thu, 20 Aug 2020 03:56:13 +0800 Gao Xiang wrote: > > > > > SWP_FS doesn't mean the device is file-backed swap device, > > > which just means each writeback req

Re: [PATCH v2 1/3] mm: remove activate_page() from unuse_pte()

2020-08-19 Thread Yang Shi
commit. And make the function static while we are at it. > > Before the commit, we called lru_cache_add_active_or_unevictable() to > add new ksm pages to active lruvec. Therefore, activate_page() wasn't > necessary for them in the first place. Reviewed-by: Yang Shi > > Sign

Re: [PATCH v2 2/3] mm: remove superfluous __ClearPageActive()

2020-08-19 Thread Yang Shi
> always held while calling SetPageActive() on a page. > > SetPageSlabPfmemalloc() also uses SetPageActive(), but it's irrelevant > to LRU pages. Seems fine to me. Reviewed-by: Yang Shi > > Signed-off-by: Yu Zhao > --- > mm/memremap.c | 2 -- > mm/swap.c | 2 -- >

Re: [PATCH v2 3/3] mm: remove superfluous __ClearPageWaiters()

2020-08-19 Thread Yang Shi
On Tue, Aug 18, 2020 at 11:47 AM Yu Zhao wrote: > > Presumably __ClearPageWaiters() was added to follow the previously > removed __ClearPageActive() pattern. > > Only flags that are in PAGE_FLAGS_CHECK_AT_FREE needs to be properly > cleared because otherwise we think there may be some kind of leak

Re: [PATCH v2 3/3] mm: remove superfluous __ClearPageWaiters()

2020-08-19 Thread Yang Shi
On Wed, Aug 19, 2020 at 4:39 PM Yu Zhao wrote: > > On Wed, Aug 19, 2020 at 04:06:32PM -0700, Yang Shi wrote: > > On Tue, Aug 18, 2020 at 11:47 AM Yu Zhao wrote: > > > > > > Presumably __ClearPageWaiters() was added to follow the previously > >

Re: [RFC][PATCH 0/9] [v3] Migrate Pages in lieu of discard

2020-08-19 Thread Yang Shi
re. My initial thought is to make cpuset process only (the threads in the same process must be in the same cpuset group), but it sounds not too feasible either since it may break some user configurations. > * Migration failures will result in pages being unreclaimable. >Need to be able to fal

Re: [PATCH v2] mm, THP, swap: fix allocating cluster for swapfile by mistake

2020-08-20 Thread Yang Shi
stress --vm 2 --vm-bytes 600M # doesn't matter too much as well > > Symptoms: > - FS corruption (e.g. checksum failure) > - memory corruption at: 0xd2808010 > - segfault > > Fixes: f0eea189e8e9 ("mm, THP, swap: Don't allocate huge cluster for file > ba

Re: [RFC][PATCH 5/9] mm/migrate: demote pages during reclaim

2020-08-20 Thread Yang Shi
On Thu, Aug 20, 2020 at 8:22 AM Dave Hansen wrote: > > On 8/20/20 1:06 AM, Huang, Ying wrote: > >> +/* Migrate pages selected for demotion */ > >> +nr_reclaimed += demote_page_list(&ret_pages, &demote_pages, pgdat, > >> sc); > >> + > >> pgactivate = stat->nr_activate[0] + stat->nr_ac

Re: [RFC][PATCH 2/9] mm/numa: automatically generate node migration order

2020-08-20 Thread Yang Shi
hat node_demotion[] > locking has no chance of becoming a bottleneck on large systems > with lots of CPUs in direct reclaim. > > This code is unused for now. It will be called later in the > series. > > Signed-off-by: Dave Hansen >

Re: [RFC][PATCH 3/9] mm/migrate: update migration order during on hotplug events

2020-08-20 Thread Yang Shi
> > This recalculation is far from optimal, most glaringly that it does > not even attempt to figure out if nodes are actually coming or going. > But, given the expected paucity of hotplug events, this should be > fine. > > Signed-off-by: Dave Hansen > Cc: Yang Shi > Cc:

Re: [RFC][PATCH 6/9] mm/vmscan: add page demotion counter

2020-08-20 Thread Yang Shi
On Tue, Aug 18, 2020 at 11:53 AM Dave Hansen wrote: > > > From: Yang Shi > > Account the number of demoted pages into reclaim_state->nr_demoted. > > Add pgdemote_kswapd and pgdemote_direct VM counters showed in > /proc/vmstat. > > [ daveh: >- __count_vm_e

Re: [RFC][PATCH 5/9] mm/migrate: demote pages during reclaim

2020-08-20 Thread Yang Shi
On Tue, Aug 18, 2020 at 11:51 AM Dave Hansen wrote: > > > From: Dave Hansen > > This is mostly derived from a patch from Yang Shi: > > > https://lore.kernel.org/linux-mm/1560468577-101178-10-git-send-email-yang@linux.alibaba.com/ > > Add code to the

Re: [RFC][PATCH 8/9] mm/vmscan: never demote for memcg reclaim

2020-08-20 Thread Yang Shi
al is to reduce the > total memory consumption of the entire memcg, across all > nodes. Migration does not assist memcg reclaim because > it just moves page contents between nodes rather than > actually reducing memory consumption. > > Signed-off-by: Dave Hansen > Suggested-by:

Re: [RFC][PATCH 6/9] mm/vmscan: add page demotion counter

2020-08-20 Thread Yang Shi
On Tue, Aug 18, 2020 at 11:53 AM Dave Hansen wrote: > > > From: Yang Shi > > Account the number of demoted pages into reclaim_state->nr_demoted. > > Add pgdemote_kswapd and pgdemote_direct VM counters showed in > /proc/vmstat. BTW we'd better add promotion coun

Re: [RFC][PATCH 6/9] mm/vmscan: add page demotion counter

2020-08-20 Thread Yang Shi
On Thu, Aug 20, 2020 at 3:26 PM Yang Shi wrote: > > On Tue, Aug 18, 2020 at 11:53 AM Dave Hansen > wrote: > > > > > > From: Yang Shi > > > > Account the number of demoted pages into reclaim_state->nr_demoted. > > > > Add pgdemote_kswapd and

Re: [v3 PATCH 03/11] mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation

2021-01-11 Thread Yang Shi
On Wed, Jan 6, 2021 at 1:55 AM Kirill Tkhai wrote: > > On 06.01.2021 01:58, Yang Shi wrote: > > Since memcg_shrinker_map_size just can be changd under holding > > shrinker_rwsem > > exclusively, the read side can be protected by holding read lock, so it > > so

Re: [v3 PATCH 04/11] mm: vmscan: remove memcg_shrinker_map_size

2021-01-11 Thread Yang Shi
On Wed, Jan 6, 2021 at 2:16 AM Kirill Tkhai wrote: > > On 06.01.2021 01:58, Yang Shi wrote: > > Both memcg_shrinker_map_size and shrinker_nr_max is maintained, but > > actually the > > map size can be calculated via shrinker_nr_max, so it seems unnecessary to

Re: [v3 PATCH 05/11] mm: vmscan: use a new flag to indicate shrinker is registered

2021-01-11 Thread Yang Shi
On Wed, Jan 6, 2021 at 2:22 AM Kirill Tkhai wrote: > > On 06.01.2021 01:58, Yang Shi wrote: > > Currently registered shrinker is indicated by non-NULL > > shrinker->nr_deferred. > > This approach is fine with nr_deferred at the shrinker level, but the > >

Re: [v3 PATCH 06/11] mm: memcontrol: rename shrinker_map to shrinker_info

2021-01-11 Thread Yang Shi
On Wed, Jan 6, 2021 at 3:39 AM Kirill Tkhai wrote: > > On 06.01.2021 01:58, Yang Shi wrote: > > The following patch is going to add nr_deferred into shrinker_map, the > > change will > > make shrinker_map not only include map anymore, so rename it to a more >

Re: [v3 PATCH 07/11] mm: vmscan: add per memcg shrinker nr_deferred

2021-01-11 Thread Yang Shi
On Wed, Jan 6, 2021 at 3:07 AM Kirill Tkhai wrote: > > On 06.01.2021 01:58, Yang Shi wrote: > > Currently the number of deferred objects are per shrinker, but some slabs, > > for example, > > vfs inode/dentry cache are per memcg, this would result in poor isolation >

Re: [v3 PATCH 09/11] mm: vmscan: don't need allocate shrinker->nr_deferred for memcg aware shrinkers

2021-01-11 Thread Yang Shi
On Wed, Jan 6, 2021 at 3:16 AM Kirill Tkhai wrote: > > On 06.01.2021 01:58, Yang Shi wrote: > > Now nr_deferred is available on per memcg level for memcg aware shrinkers, > > so don't need > > allocate shrinker->nr_deferred for such shrinkers anymore. > >

Re: [v3 PATCH 10/11] mm: memcontrol: reparent nr_deferred when memcg offline

2021-01-11 Thread Yang Shi
On Wed, Jan 6, 2021 at 3:35 AM Kirill Tkhai wrote: > > On 06.01.2021 01:58, Yang Shi wrote: > > Now shrinker's nr_deferred is per memcg for memcg aware shrinkers, add to > > parent's > > corresponding nr_deferred when memcg offline. > > > > Signe

Re: [v3 PATCH 03/11] mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation

2021-01-11 Thread Yang Shi
On Mon, Jan 11, 2021 at 9:34 AM Kirill Tkhai wrote: > > On 11.01.2021 20:08, Yang Shi wrote: > > On Wed, Jan 6, 2021 at 1:55 AM Kirill Tkhai wrote: > >> > >> On 06.01.2021 01:58, Yang Shi wrote: > >>> Since memcg_shrinker_map_size just can be

Re: [v3 PATCH 02/11] mm: vmscan: consolidate shrinker_maps handling code

2021-01-11 Thread Yang Shi
On Wed, Jan 6, 2021 at 4:14 PM Roman Gushchin wrote: > > On Tue, Jan 05, 2021 at 02:58:08PM -0800, Yang Shi wrote: > > The shrinker map management is not really memcg specific, it's just > > allocation > > In the current form it doesn't look so, especially becau

Re: [v3 PATCH 02/11] mm: vmscan: consolidate shrinker_maps handling code

2021-01-11 Thread Yang Shi
On Mon, Jan 11, 2021 at 11:37 AM Roman Gushchin wrote: > > On Mon, Jan 11, 2021 at 11:00:17AM -0800, Yang Shi wrote: > > On Wed, Jan 6, 2021 at 4:14 PM Roman Gushchin wrote: > > > > > > On Tue, Jan 05, 2021 at 02:58:08PM -0800, Yang Shi wrote: > > > >

Re: [PATCH] mm: vmscan: support equal reclaim for anon and file pages

2021-01-11 Thread Yang Shi
On Mon, Jan 11, 2021 at 12:59 PM Sudarshan Rajagopalan wrote: > > When performing memory reclaim support treating anonymous and > file backed pages equally. > Swapping anonymous pages out to memory can be efficient enough > to justify treating anonymous and file backed pages equally. > > Signed-of

Re: [PATCH 5/9] mm: memcontrol: add per memcg shrinker nr_deferred

2020-12-10 Thread Yang Shi
On Thu, Dec 10, 2020 at 7:36 AM Johannes Weiner wrote: > > On Wed, Dec 02, 2020 at 10:27:21AM -0800, Yang Shi wrote: > > @@ -504,6 +577,34 @@ int memcg_expand_shrinker_maps(int new_id) > > return ret; > > } > > > > +int memcg_expand_shrinker_deferred(i

Re: [PATCH 5/9] mm: memcontrol: add per memcg shrinker nr_deferred

2020-12-11 Thread Yang Shi
On Thu, Dec 10, 2020 at 11:12 AM Yang Shi wrote: > > On Thu, Dec 10, 2020 at 7:36 AM Johannes Weiner wrote: > > > > On Wed, Dec 02, 2020 at 10:27:21AM -0800, Yang Shi wrote: > > > @@ -504,6 +577,34 @@ int memcg_expand_shrinker_maps(int new_id) > > > r

Re: [PATCH 2/9] mm: vmscan: use nid from shrink_control for tracepoint

2020-12-11 Thread Yang Shi
On Wed, Dec 2, 2020 at 7:13 PM Xiaqing (A) wrote: > > > > On 2020/12/3 2:27, Yang Shi wrote: > > The tracepoint's nid should show what node the shrink happens on, the start > > tracepoint > > uses nid from shrinkctl, but the nid might be set to 0 before end

Re: [PATCH 6/9] mm: vmscan: use per memcg nr_deferred of shrinker

2020-12-08 Thread Yang Shi
On Thu, Dec 3, 2020 at 3:40 AM Kirill Tkhai wrote: > > On 02.12.2020 21:27, Yang Shi wrote: > > Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's > > nr_deferred > > will be used in the following cases: > > 1. Non memcg aware shri

Re: [v3 PATCH 03/11] mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation

2021-01-12 Thread Yang Shi
On Mon, Jan 11, 2021 at 1:34 PM Kirill Tkhai wrote: > > On 11.01.2021 21:57, Yang Shi wrote: > > On Mon, Jan 11, 2021 at 9:34 AM Kirill Tkhai wrote: > >> > >> On 11.01.2021 20:08, Yang Shi wrote: > >>> On Wed, Jan 6, 2021 at 1:55 AM Kirill Tkhai wrot

Re: [v3 PATCH 05/11] mm: vmscan: use a new flag to indicate shrinker is registered

2021-01-12 Thread Yang Shi
On Mon, Jan 11, 2021 at 1:38 PM Kirill Tkhai wrote: > > On 11.01.2021 21:17, Yang Shi wrote: > > On Wed, Jan 6, 2021 at 2:22 AM Kirill Tkhai wrote: > >> > >> On 06.01.2021 01:58, Yang Shi wrote: > >>> Currently registered shrinker is indicated by non-NUL

Re: [v3 PATCH 03/11] mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation

2021-01-13 Thread Yang Shi
On Tue, Jan 12, 2021 at 1:23 PM Yang Shi wrote: > > On Mon, Jan 11, 2021 at 1:34 PM Kirill Tkhai wrote: > > > > On 11.01.2021 21:57, Yang Shi wrote: > > > On Mon, Jan 11, 2021 at 9:34 AM Kirill Tkhai wrote: > > >> > > >> On 11.01.2021 20:08,

Re: [PATCH] mm: net: memcg accounting for TCP rx zerocopy

2021-01-13 Thread Yang Shi
On Wed, Jan 13, 2021 at 11:13 AM Shakeel Butt wrote: > > On Wed, Jan 13, 2021 at 10:43 AM Roman Gushchin wrote: > > > > On Tue, Jan 12, 2021 at 04:18:44PM -0800, Shakeel Butt wrote: > > > On Tue, Jan 12, 2021 at 4:12 PM Arjun Roy wrote: > > > > > > > > On Tue, Jan 12, 2021 at 3:48 PM Roman Gushc

Re: [v3 PATCH 04/11] mm: vmscan: remove memcg_shrinker_map_size

2021-01-13 Thread Yang Shi
On Wed, Jan 6, 2021 at 2:16 AM Kirill Tkhai wrote: > > On 06.01.2021 01:58, Yang Shi wrote: > > Both memcg_shrinker_map_size and shrinker_nr_max is maintained, but > > actually the > > map size can be calculated via shrinker_nr_max, so it seems unnecessary to

Re: [v3 PATCH 07/11] mm: vmscan: add per memcg shrinker nr_deferred

2021-01-13 Thread Yang Shi
On Wed, Jan 6, 2021 at 3:07 AM Kirill Tkhai wrote: > > On 06.01.2021 01:58, Yang Shi wrote: > > Currently the number of deferred objects are per shrinker, but some slabs, > > for example, > > vfs inode/dentry cache are per memcg, this would result in poor isolation >

Re: [v3 PATCH 02/11] mm: vmscan: consolidate shrinker_maps handling code

2021-01-07 Thread Yang Shi
On Wed, Jan 6, 2021 at 4:14 PM Roman Gushchin wrote: > > On Tue, Jan 05, 2021 at 02:58:08PM -0800, Yang Shi wrote: > > The shrinker map management is not really memcg specific, it's just > > allocation > > In the current form it doesn't look so, especially becau

Re: [v3 PATCH 08/11] mm: vmscan: use per memcg nr_deferred of shrinker

2021-01-07 Thread Yang Shi
On Wed, Jan 6, 2021 at 4:17 PM Roman Gushchin wrote: > > On Tue, Jan 05, 2021 at 02:58:14PM -0800, Yang Shi wrote: > > Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's > > nr_deferred > > will be used in the following cases: > >

Re: [PATCH 1/2] mm: memcg: fix memcg file_dirty numa stat

2020-12-28 Thread Yang Shi
e > for cgroup v2") exposed numa stats for the memcg. So fixing the > file_dirty per-memcg numa stat. > > Fixes: 5f9a4f4a7096 ("mm: memcontrol: add the missing numa_stat interface for > cgroup v2") > Signed-off-by: Shakeel Butt > Cc: Acked-by: Yang Shi > ---

Re: [PATCH 2/2] mm: fix numa stats for thp migration

2020-12-28 Thread Yang Shi
__mod_lruvec_state(new_lruvec, NR_SHMEM, nr); > > } > > if (dirty && mapping_can_writeback(mapping)) { > > - __dec_lruvec_state(old_lruvec, NR_FILE_DIRTY); > > - __dec_zone_state(old

Re: [v2 PATCH 3/9] mm: vmscan: guarantee shrinker_slab_memcg() sees valid shrinker_maps for online memcg

2020-12-28 Thread Yang Shi
k the pointer of the struct. So, it seems this patch is not necessary anymore. This patch will be dropped in v3. On Tue, Dec 15, 2020 at 12:31 PM Yang Shi wrote: > > On Tue, Dec 15, 2020 at 9:16 AM Johannes Weiner wrote: > > > > On Mon, Dec 14, 2020 at 02:37:16PM -0800, Ya

Re: [PATCH v3 1/5] mm/migrate.c: make putback_movable_page() static

2021-03-25 Thread Yang Shi
remove all the 3 VM_BUG_ON_PAGE(). Reviewed-by: Yang Shi > > Signed-off-by: Miaohe Lin > --- > include/linux/migrate.h | 1 - > mm/migrate.c| 7 +-- > 2 files changed, 1 insertion(+), 7 deletions(-) > > diff --git a/include/linux/migrate.h b/inclu

Re: [PATCH v5 1/2] mm: huge_memory: a new debugfs interface for splitting THP tests.

2021-03-19 Thread Yang Shi
elftests/vm to utilize the interface by splitting > PMD THPs and PTE-mapped THPs. > > This does not change the old behavior, i.e., writing 1 to the interface > to split all THPs in the system. > > Changelog: > > From v5: > 1. Skipped special VMAs and other fixes. (sugge

Re: [PATCH v5 2/2] mm: huge_memory: debugfs for file-backed THP split.

2021-03-19 Thread Yang Shi
put_page(fpage); > + } > + > + filp_close(candidate, NULL); > + ret = 0; > + > + pr_info("%lu of %lu file-backed THP split\n", split, total); > +out: > + putname(file); > + return ret; > +} > + > +#define MAX_INPUT

Re: [PATCH v2 5/5] mm/migrate.c: fix potential deadlock in NUMA balancing shared exec THP case

2021-03-23 Thread Yang Shi
looked this in the first place. Your fix is correct, and please add the above justification to your commit log. Reviewed-by: Yang Shi > > Fixes: c77c5cbafe54 ("mm: migrate: skip shared exec THP for NUMA balancing") > Signed-off-by: Miaohe Lin > --- > mm/migrate.c | 4 -

Re: [PATCH v2 2/5] mm/migrate.c: remove unnecessary rc != MIGRATEPAGE_SUCCESS check in 'else' case

2021-03-23 Thread Yang Shi
On Tue, Mar 23, 2021 at 6:54 AM Miaohe Lin wrote: > > It's guaranteed that in the 'else' case of the rc == MIGRATEPAGE_SUCCESS > check, rc does not equal to MIGRATEPAGE_SUCCESS. Remove this unnecessary > check. Reviewed-by: Yang Shi > > Reviewed-by: David Hil

Re: [PATCH v2 1/5] mm/migrate.c: remove unnecessary VM_BUG_ON_PAGE on putback_movable_page()

2021-03-23 Thread Yang Shi
On Tue, Mar 23, 2021 at 6:54 AM Miaohe Lin wrote: > > The !PageLocked() check is implicitly done in PageMovable(). Remove this > explicit one. TBH, I'm a little bit reluctant to have this kind change. If "locked" check is necessary we'd better make it explicit otherwise just remove it. And why n

Re: [PATCH v2 5/5] mm/migrate.c: fix potential deadlock in NUMA balancing shared exec THP case

2021-03-23 Thread Yang Shi
On Tue, Mar 23, 2021 at 10:17 AM Yang Shi wrote: > > On Tue, Mar 23, 2021 at 6:55 AM Miaohe Lin wrote: > > > > Since commit c77c5cbafe54 ("mm: migrate: skip shared exec THP for NUMA > > balancing"), the NUMA balancing would skip shared exec transhuge pag

Re: [PATCH] mm: gup: remove FOLL_SPLIT

2021-03-30 Thread Yang Shi
On Tue, Mar 30, 2021 at 12:08 AM John Hubbard wrote: > > On 3/29/21 12:38 PM, Yang Shi wrote: > > Since commit 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of > > FOLL_SPLIT") > > and commit ba925fa35057 ("s390/gmap: improve THP splitting") FOLL_

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-03-30 Thread Yang Shi
On Tue, Mar 30, 2021 at 7:42 AM Gerald Schaefer wrote: > > On Mon, 29 Mar 2021 11:33:06 -0700 > Yang Shi wrote: > > > > > When the THP NUMA fault support was added THP migration was not supported > > yet. > > So the ad hoc THP migration was implemented in NU

Re: [PATCH 5/6] mm: migrate: don't split THP for misplaced NUMA page

2021-03-30 Thread Yang Shi
On Tue, Mar 30, 2021 at 7:42 AM Gerald Schaefer wrote: > > On Mon, 29 Mar 2021 11:33:11 -0700 > Yang Shi wrote: > > > The old behavior didn't split THP if migration is failed due to lack of > > memory on the target node. But the THP migration does split THP, so

Re: [PATCH 3/6] mm: migrate: teach migrate_misplaced_page() about THP

2021-03-30 Thread Yang Shi
On Mon, Mar 29, 2021 at 5:21 PM Huang, Ying wrote: > > Yang Shi writes: > > > In the following patch the migrate_misplaced_page() will be used to migrate > > THP > > for NUMA faul too. Prepare to deal with THP. > > > > Signed-off-by: Yang Shi

Re: [PATCH 4/6] mm: thp: refactor NUMA fault handling

2021-03-30 Thread Yang Shi
On Mon, Mar 29, 2021 at 5:41 PM Huang, Ying wrote: > > Yang Shi writes: > > > When the THP NUMA fault support was added THP migration was not supported > > yet. > > So the ad hoc THP migration was implemented in NUMA fault handling. Since > > v4.14 > >

[v2 PATCH] mm: gup: remove FOLL_SPLIT

2021-03-30 Thread Yang Shi
Since commit 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT") and commit ba925fa35057 ("s390/gmap: improve THP splitting") FOLL_SPLIT has not been used anymore. Remove the dead code. Reviewed-by: John Hubbard Signed-off-by: Yang Shi --- v2: Rem

Re: [PATCH mmotm] mm: vmscan: fix shrinker_rwsem in free_shrinker_info()

2021-03-31 Thread Yang Shi
15.GB28839@xsang-OptiPlex-9020 > > Reported-by: kernel test robot > > Signed-off-by: Hugh Dickins > > Cc: Yang Shi > > --- > > Sorry, I've made no attempt to work out precisely where in the series > > the locking went missing, nor tried to fit this in as a

Re: [PATCH v5 1/2] mm: huge_memory: a new debugfs interface for splitting THP tests.

2021-03-22 Thread Yang Shi
On Sun, Mar 21, 2021 at 7:11 PM Zi Yan wrote: > > On 19 Mar 2021, at 19:37, Yang Shi wrote: > > > On Thu, Mar 18, 2021 at 5:52 PM Zi Yan wrote: > >> > >> From: Zi Yan > >> > >> We did not have a direct user interface of splitting the compound

Re: [PATCH v1 00/14] Multigenerational LRU

2021-03-15 Thread Yang Shi
On Fri, Mar 12, 2021 at 11:57 PM Yu Zhao wrote: > > TLDR > > The current page reclaim is too expensive in terms of CPU usage and > often making poor choices about what to evict. We would like to offer > a performant, versatile and straightforward augment. > > Repo > > git fetch https://l

Re: [PATCH v3] mm: huge_memory: a new debugfs interface for splitting THP tests.

2021-03-15 Thread Yang Shi
On Mon, Mar 15, 2021 at 11:37 AM Zi Yan wrote: > > On 15 Mar 2021, at 8:07, Kirill A. Shutemov wrote: > > > On Thu, Mar 11, 2021 at 07:57:12PM -0500, Zi Yan wrote: > >> From: Zi Yan > >> > >> We do not have a direct user interface of splitting the compound page > >> backing a THP > > > > But we d

Re: [PATCH v3 0/4] mm/slub: Fix count_partial() problem

2021-03-15 Thread Yang Shi
On Mon, Mar 15, 2021 at 12:15 PM Roman Gushchin wrote: > > > On Mon, Mar 15, 2021 at 07:49:57PM +0100, Vlastimil Babka wrote: > > On 3/9/21 4:25 PM, Xunlei Pang wrote: > > > count_partial() can hold n->list_lock spinlock for quite long, which > > > makes much trouble to the system. This series eli

Re: [PATCH v4 1/2] mm: huge_memory: a new debugfs interface for splitting THP tests.

2021-03-16 Thread Yang Shi
id code to a separate >function. > 2. Added the missing put_page for not split pages. > 3. pr_debug -> pr_info, make reading results simpler. > > From v2: > > 1. Reused existing /split_huge_pages interface. (suggested by >Yang Shi) > > From v1: >

Re: [PATCH v4 2/2] mm: huge_memory: debugfs for file-backed THP split.

2021-03-16 Thread Yang Shi
On Mon, Mar 15, 2021 at 1:34 PM Zi Yan wrote: > > From: Zi Yan > > Further extend /split_huge_pages to accept > ",," for file-backed THP split tests since > tmpfs may have file backed by THP that mapped nowhere. > > Update selftest program to test file-backed THP split too. > > Suggested-by: Kiri

[PATCH 2/6] mm: memory: make numa_migrate_prep() non-static

2021-03-29 Thread Yang Shi
The numa_migrate_prep() will be used by huge NUMA fault as well in the following patch, make it non-static. Signed-off-by: Yang Shi --- mm/internal.h | 3 +++ mm/memory.c | 5 ++--- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index 1432feec62df

[RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-03-29 Thread Yang Shi
fcount. I saw there were some hacks about gup from git history, but I didn't figure out if they have been removed or not since I just found FOLL_NUMA code in the current gup implementation and they seems useful. Yang Shi (6): mm: memory: add orig_pmd to struct vm_fault m

[PATCH 5/6] mm: migrate: don't split THP for misplaced NUMA page

2021-03-29 Thread Yang Shi
The old behavior didn't split THP if migration is failed due to lack of memory on the target node. But the THP migration does split THP, so keep the old behavior for misplaced NUMA page migration. Signed-off-by: Yang Shi --- mm/migrate.c | 3 ++- 1 file changed, 2 insertions(+), 1 del

[PATCH 4/6] mm: thp: refactor NUMA fault handling

2021-03-29 Thread Yang Shi
s not required anymore to avoid the race. The page refcount elevation when holding ptl should prevent from THP split. Signed-off-by: Yang Shi --- include/linux/migrate.h | 23 -- mm/huge_memory.c| 132 -- mm/migrate.c

[PATCH 1/6] mm: memory: add orig_pmd to struct vm_fault

2021-03-29 Thread Yang Shi
Add orig_pmd to struct vm_fault so the "orig_pmd" parameter used by huge page fault could be removed, just like its PTE counterpart does. Signed-off-by: Yang Shi --- include/linux/huge_mm.h | 9 - include/linux/mm.h | 1 + mm/huge_memory.c| 9 ++--- m

[PATCH 6/6] mm: migrate: remove redundant page count check for THP

2021-03-29 Thread Yang Shi
Don't have to keep the redundant page count check for THP anymore after switching to use generic migration code. Signed-off-by: Yang Shi --- mm/migrate.c | 12 1 file changed, 12 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 1c0c873375ab..328f76848d6c 100644 ---

[PATCH 3/6] mm: migrate: teach migrate_misplaced_page() about THP

2021-03-29 Thread Yang Shi
In the following patch the migrate_misplaced_page() will be used to migrate THP for NUMA faul too. Prepare to deal with THP. Signed-off-by: Yang Shi --- include/linux/migrate.h | 6 -- mm/memory.c | 2 +- mm/migrate.c| 2 +- 3 files changed, 6 insertions(+), 4

[PATCH] mm: gup: remove FOLL_SPLIT

2021-03-29 Thread Yang Shi
Since commit 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT") and commit ba925fa35057 ("s390/gmap: improve THP splitting") FOLL_SPLIT has not been used anymore. Remove the dead code. Signed-off-by: Yang Shi --- include/linux/mm.h | 1 - mm/

[v2 RFC PATCH 0/7] mm: thp: use generic THP migration for NUMA hinting fault

2021-04-13 Thread Yang Shi
aration patches. Patch #3 is the real meat. Patch #4 ~ #6 keep consistent counters and behaviors with before. Patch #7 skips change huge PMD to prot_none if thp migration is not supported. Yang Shi (7): mm: memory: add orig_pmd to struct vm_fault mm: memory: make numa_migrate_pre

[v2 PATCH 2/7] mm: memory: make numa_migrate_prep() non-static

2021-04-13 Thread Yang Shi
The numa_migrate_prep() will be used by huge NUMA fault as well in the following patch, make it non-static. Signed-off-by: Yang Shi --- mm/internal.h | 3 +++ mm/memory.c | 5 ++--- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index f469f69309de

[v2 PATCH 1/7] mm: memory: add orig_pmd to struct vm_fault

2021-04-13 Thread Yang Shi
Add orig_pmd to struct vm_fault so the "orig_pmd" parameter used by huge page fault could be removed, just like its PTE counterpart does. Signed-off-by: Yang Shi --- include/linux/huge_mm.h | 9 - include/linux/mm.h | 3 +++ mm/huge_memory.c| 9 ++--- m

[v2 PATCH 3/7] mm: thp: refactor NUMA fault handling

2021-04-13 Thread Yang Shi
been reworked a lot, it seems anon_vma lock is not required anymore to avoid the race. The page refcount elevation when holding ptl should prevent from THP split. Use migrate_misplaced_page() for both base page and THP NUMA hinting fault and remove all the dead and duplicate code. Signed-off-by: Yan

[v2 PATCH 4/7] mm: migrate: account THP NUMA migration counters correctly

2021-04-13 Thread Yang Shi
Now both base page and THP NUMA migration is done via migrate_misplaced_page(), keep the counters correctly for THP. Signed-off-by: Yang Shi --- mm/migrate.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 333448aa53f1..a473f25fbd01

[v2 PATCH 5/7] mm: migrate: don't split THP for misplaced NUMA page

2021-04-13 Thread Yang Shi
The old behavior didn't split THP if migration is failed due to lack of memory on the target node. But the THP migration does split THP, so keep the old behavior for misplaced NUMA page migration. Signed-off-by: Yang Shi --- mm/migrate.c | 4 +++- 1 file changed, 3 insertions(+), 1 del

[v2 PATCH 6/7] mm: migrate: check mapcount for THP instead of ref count

2021-04-13 Thread Yang Shi
The generic migration path will check refcount, so there is no need to check refcount here. But the old code actually prevents migrating shared THP (mapped by multiple processes), so bail out early if mapcount is > 1 to keep the behavior. Signed-off-by: Yang Shi --- mm/migrate.c |

[v2 PATCH 7/7] mm: thp: skip make PMD PROT_NONE if THP migration is not supported

2021-04-13 Thread Yang Shi
MA hinting faults on S390. Signed-off-by: Yang Shi --- mm/huge_memory.c | 4 1 file changed, 4 insertions(+) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 94981907fd4c..f63445f3a17d 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1741,6 +1741,7 @@ bool move_huge_

Re: [v2 PATCH 3/7] mm: thp: refactor NUMA fault handling

2021-04-14 Thread Yang Shi
On Tue, Apr 13, 2021 at 7:44 PM Huang, Ying wrote: > > Yang Shi writes: > > > When the THP NUMA fault support was added THP migration was not supported > > yet. > > So the ad hoc THP migration was implemented in NUMA fault handling. Since > > v4.14 > >

Re: [v2 PATCH 6/7] mm: migrate: check mapcount for THP instead of ref count

2021-04-14 Thread Yang Shi
On Tue, Apr 13, 2021 at 8:00 PM Huang, Ying wrote: > > Yang Shi writes: > > > The generic migration path will check refcount, so no need check refcount > > here. > > But the old code actually prevents from migrating shared THP (mapped by > > multiple >

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-04-06 Thread Yang Shi
On Tue, Apr 6, 2021 at 5:03 AM Gerald Schaefer wrote: > > On Thu, 1 Apr 2021 13:10:49 -0700 > Yang Shi wrote: > > [...] > > > > > > > > Yes, it could be. The old behavior of migration was to return -ENOMEM > > > > if THP migration is not supp

Re: [PATCH 2/2] mm: khugepaged: check MMF_DISABLE_THP ahead of iterating over vmas

2021-04-06 Thread Yang Shi
On Mon, Apr 5, 2021 at 8:05 PM Xu, Yanfei wrote: > > > > On 4/6/21 10:51 AM, Xu, Yanfei wrote: > > > > > > On 4/6/21 2:20 AM, Yang Shi wrote: > >> [Please note: This e-mail is from an EXTERNAL e-mail address] > >> > >> On Sun, A

Re: High kmalloc-32 slab cache consumption with 10k containers

2021-04-06 Thread Yang Shi
On Tue, Apr 6, 2021 at 3:05 AM Bharata B Rao wrote: > > On Mon, Apr 05, 2021 at 11:08:26AM -0700, Yang Shi wrote: > > On Sun, Apr 4, 2021 at 10:49 PM Bharata B Rao wrote: > > > > > > Hi, > > > > > > When running 1 (more-or-less-empty-)contain

Re: [PATCH mmotm] mm: vmscan: fix shrinker_rwsem in free_shrinker_info()

2021-03-31 Thread Yang Shi
On Wed, Mar 31, 2021 at 2:13 PM Hugh Dickins wrote: > > On Wed, 31 Mar 2021, Yang Shi wrote: > > On Wed, Mar 31, 2021 at 6:54 AM Shakeel Butt wrote: > > > On Tue, Mar 30, 2021 at 4:44 PM Hugh Dickins wrote: > > > > > > > > Lockdep warns mm/vmscan.c:

Re: [PATCH 05/10] mm/migrate: demote pages during reclaim

2021-04-01 Thread Yang Shi
On Thu, Apr 1, 2021 at 11:35 AM Dave Hansen wrote: > > > From: Dave Hansen > > This is mostly derived from a patch from Yang Shi: > > > https://lore.kernel.org/linux-mm/1560468577-101178-10-git-send-email-yang@linux.alibaba.com/ > > Add code to the

Re: [PATCH 10/10] mm/migrate: new zone_reclaim_mode to enable reclaim migration

2021-04-01 Thread Yang Shi
led page demotion may move data to a NUMA node > that does not fall into the cpuset of the allocating process. > This could be construed to violate the guarantees of cpusets. > However, since this is an opt-in mechanism, the assumption is > that anyone enabling it is content to relax the

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-04-01 Thread Yang Shi
On Wed, Mar 31, 2021 at 4:47 AM Gerald Schaefer wrote: > > On Tue, 30 Mar 2021 09:51:46 -0700 > Yang Shi wrote: > > > On Tue, Mar 30, 2021 at 7:42 AM Gerald Schaefer > > wrote: > > > > > > On Mon, 29 Mar 2021 11:33:06 -0700 > > > Yang Shi w

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-04-01 Thread Yang Shi
On Wed, Mar 31, 2021 at 6:20 AM Mel Gorman wrote: > > On Tue, Mar 30, 2021 at 04:42:00PM +0200, Gerald Schaefer wrote: > > Could there be a work-around by splitting THP pages instead of marking them > > as migrate pmds (via pte swap entries), at least when THP migration is not > > supported? I gue
