Re: [PATCH 4/6] mm: thp: refactor NUMA fault handling

2021-03-29 Thread Huang, Ying
implementation and your new implementation. Originally, the PMD is restored after trying to migrate the misplaced THP. I think this can reduce the TLB shootdown IPIs. Best Regards, Huang, Ying > In the old code anon_vma lock was needed to serialize THP migration > against THP split, but si
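The ordering at issue, as a minimal C-style sketch (hypothetical helper names, not the kernel code): restoring the PMD before the migration attempt forces the THP to be unmapped a second time for migration, which costs an extra TLB shootdown.

    /* Old flow: keep the PMD inaccessible while migration is tried and
     * restore it only on failure -- one TLB shootdown in the common case. */
    if (!try_migrate_misplaced_thp(vma, pmd, addr, page))
            restore_pmd(vma, pmd, addr);

    /* Refactored flow: restore the PMD first, then migrate; migration
     * must unmap the THP again, adding a second TLB shootdown. */
    restore_pmd(vma, pmd, addr);
    try_migrate_misplaced_thp(vma, pmd, addr, page);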

Re: [PATCH 3/6] mm: migrate: teach migrate_misplaced_page() about THP

2021-03-29 Thread Huang, Ying
--- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -2127,7 +2127,7 @@ static inline bool is_shared_exec_page(struct > vm_area_struct *vma, > * the page that will be dropped by this function before returning. > */ > int migrate_misplaced_page(struct page *page, struct vm_area_struc

[RFC] NUMA balancing: reduce TLB flush via delaying mapping on hint page fault

2021-03-29 Thread Huang Ying
inaccessible. But the difference in the accessible window is small, because the page will be made inaccessible soon for migration. Signed-off-by: "Huang, Ying" Cc: Peter Zijlstra Cc: Mel Gorman Cc: Peter Xu Cc: Johannes Weiner Cc: Vlastimil Babka Cc: "Matthew Wilcox"

Re: [RFC] mm: activate access-more-than-once page via NUMA balancing

2021-03-26 Thread Huang, Ying
Mel Gorman writes: > On Thu, Mar 25, 2021 at 12:33:45PM +0800, Huang, Ying wrote: >> > I caution against this patch. >> > >> > It's non-deterministic for a number of reasons. As it requires NUMA >> > balancing to be enabled, the pageout behaviour of a

Re: [RFC] mm: activate access-more-than-once page via NUMA balancing

2021-03-24 Thread Huang, Ying
Hi, Mel, Thanks for comment! Mel Gorman writes: > On Wed, Mar 24, 2021 at 04:32:09PM +0800, Huang Ying wrote: >> One idea behind the LRU page reclaiming algorithm is to put the >> access-once pages in the inactive list and access-more-than-once pages >> in the activ

[RFC] mm: activate access-more-than-once page via NUMA balancing

2021-03-24 Thread Huang Ying
and cold pages. But generally, I don't think it is a good idea to improve performance purely by increasing system overhead. Signed-off-by: "Huang, Ying" Inspired-by: Yu Zhao Cc: Hillf Danton Cc: Johannes Weiner Cc: Joonsoo Kim Cc: Matthew Wilcox Cc: Mel Gorman Cc: Michal

Re: [PATCH v1 09/14] mm: multigenerational lru: mm_struct list

2021-03-24 Thread Huang, Ying
Yu Zhao writes: > On Mon, Mar 22, 2021 at 11:13:19AM +0800, Huang, Ying wrote: >> Yu Zhao writes: >> >> > On Wed, Mar 17, 2021 at 11:37:38AM +0800, Huang, Ying wrote: >> >> Yu Zhao writes: >> >> >> >> > On Tue, Mar 16, 20

Re: [PATCH v1 09/14] mm: multigenerational lru: mm_struct list

2021-03-21 Thread Huang, Ying
Yu Zhao writes: > On Wed, Mar 17, 2021 at 11:37:38AM +0800, Huang, Ying wrote: >> Yu Zhao writes: >> >> > On Tue, Mar 16, 2021 at 02:44:31PM +0800, Huang, Ying wrote: >> > The scanning overhead is only one of the two major problems of the

Re: [PATCH v1 09/14] mm: multigenerational lru: mm_struct list

2021-03-16 Thread Huang, Ying
Yu Zhao writes: > On Tue, Mar 16, 2021 at 02:44:31PM +0800, Huang, Ying wrote: >> Yu Zhao writes: >> >> > On Tue, Mar 16, 2021 at 10:07:36AM +0800, Huang, Ying wrote: >> >> Rik van Riel writes: >> >> >>

Re: [PATCH v1 10/14] mm: multigenerational lru: core

2021-03-16 Thread Huang, Ying
Yu Zhao writes: > On Tue, Mar 16, 2021 at 02:52:52PM +0800, Huang, Ying wrote: >> Yu Zhao writes: >> >> > On Tue, Mar 16, 2021 at 10:08:51AM +0800, Huang, Ying wrote: >> >> Yu Zhao writes: >> >> [snip]

Re: [PATCH v1 10/14] mm: multigenerational lru: core

2021-03-16 Thread Huang, Ying
Yu Zhao writes: > On Tue, Mar 16, 2021 at 10:08:51AM +0800, Huang, Ying wrote: >> Yu Zhao writes: >> [snip] >> >> > +/* Main function used by foreground, background and user-triggered aging. >> > */ >> > +static bool walk_mm_li

Re: [PATCH v1 09/14] mm: multigenerational lru: mm_struct list

2021-03-16 Thread Huang, Ying
Yu Zhao writes: > On Tue, Mar 16, 2021 at 10:07:36AM +0800, Huang, Ying wrote: >> Rik van Riel writes: >> >> > On Sat, 2021-03-13 at 00:57 -0700, Yu Zhao wrote: >> > >> >> +/* >> >> + * After pages are faulted in, they become the younge

Re: [PATCH v1 10/14] mm: multigenerational lru: core

2021-03-15 Thread Huang, Ying
ation of the function? And maybe the number of mm_struct and the number of pages scanned. In comparison, in the traditional LRU algorithm, for each round, only a small subset of the whole physical memory is scanned. Best Regards, Huang, Ying > + > + if (!last) { > +

Re: [PATCH v1 09/14] mm: multigenerational lru: mm_struct list

2021-03-15 Thread Huang, Ying
scheduled after the previous scanning will not be scanned. I guess that this helps OOM kills? If so, how about just taking advantage of that information for OOM killing and page reclaiming? For example, if a process hasn't been scheduled for a long time, just reclaim its private pages. Best Regards, Huang, Ying

Re: [PATCH] vmscan: retry without cache trim mode if nothing scanned

2021-03-11 Thread Huang, Ying
Hi, Butt, Shakeel Butt writes: > On Wed, Mar 10, 2021 at 4:47 PM Huang, Ying wrote: >> >> From: Huang Ying >> >> In shrink_node(), to determine whether to enable cache trim mode, the >> LRU size is obtained via lruvec_page_state(). That gets th

[RFC -V6 5/6] memory tiering: rate limit NUMA migration throughput

2021-03-11 Thread Huang Ying
decreases 51.4% (from 213.0 MB/s to 103.6 MB/s) with the patch, while the benchmark score decreases only 1.8%. A new sysctl knob kernel.numa_balancing_rate_limit_mbps is added for the users to specify the limit. TODO: Add ABI document for new sysctl knob. Signed-off-by: "Huang, Ying"
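A minimal sketch of such a rate limiter (illustrative only; the per-node window fields are hypothetical, and sysctl_numa_balancing_rate_limit_mbps is the knob named above):

    /* Refuse further promotions once this second's budget is spent. */
    static bool promotion_rate_limited(struct pglist_data *pgdat, int nr_pages)
    {
            unsigned long budget = sysctl_numa_balancing_rate_limit_mbps <<
                                   (20 - PAGE_SHIFT);      /* MB/s -> pages/s */

            if (time_after(jiffies, pgdat->promo_window_end)) {
                    pgdat->promo_window_end = jiffies + HZ; /* new 1s window */
                    pgdat->promo_pages_window = 0;
            }
            if (pgdat->promo_pages_window + nr_pages > budget)
                    return true;                            /* over budget */
            pgdat->promo_pages_window += nr_pages;
            return false;
    }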

[RFC -V6 6/6] memory tiering: adjust hot threshold automatically

2021-03-11 Thread Huang Ying
% with 32.4% fewer NUMA page migrations on a 2 socket Intel server with Optane DC Persistent Memory. Because it improves the accuracy of the hot page selection. Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Michal Hocko Cc: Rik van Riel Cc: Mel Gorman Cc: Peter Zijlstra Cc: Ingo
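The adjustment can be thought of as a simple feedback loop (a sketch under assumed names; the real patch adjusts the threshold periodically in small steps):

    /* Lower the hot threshold when recent promotion throughput exceeds
     * the configured limit (fewer pages then qualify as hot); raise it
     * when promotion is below the limit. */
    if (recent_promote_mbps > sysctl_numa_balancing_rate_limit_mbps)
            hot_threshold_ms = max(hot_threshold_ms - step_ms, min_threshold_ms);
    else
            hot_threshold_ms = min(hot_threshold_ms + step_ms, max_threshold_ms);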

[RFC -V6 4/6] memory tiering: hot page selection with hint page fault latency

2021-03-11 Thread Huang Ying
nse. - If fast response is more important for system performance, the administrator can set a higher hot threshold. Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Michal Hocko Cc: Rik van Riel Cc: Mel Gorman Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Dave Hansen Cc: Dan Williams Cc:
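The selection criterion in sketch form (hypothetical helpers; the series records the scan time so the hint fault can compute a latency):

    /* A page is considered hot if the NUMA hint page fault arrives
     * shortly after the PTE was made inaccessible by the scanner. */
    unsigned int latency_ms = jiffies_to_msecs(jiffies - page_scan_time(page));

    if (latency_ms < sysctl_numa_balancing_hot_threshold_ms)
            promote_to_fast_memory(page);   /* hot: migrate to DRAM */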

[RFC -V6 3/6] memory tiering: skip to scan fast memory

2021-03-11 Thread Huang Ying
-by: "Huang, Ying" Suggested-by: Dave Hansen Cc: Andrew Morton Cc: Michal Hocko Cc: Rik van Riel Cc: Mel Gorman Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Dan Williams Cc: linux-kernel@vger.kernel.org Cc: linux...@kvack.org --- mm/huge_memory.c | 30 +- mm/mprotect

[RFC -V6 0/6] NUMA balancing: optimize memory placement for memory tiering system

2021-03-11 Thread Huang Ying
cleanup. - Rebased on the latest page demotion patchset. v2: - Addressed comments for V1. - Rebased on v5.5. Huang Ying (6): NUMA balancing: optimize page placement for memory tiering system memory tiering: add page promotion counter memory tiering: skip to scan fast memory memo

[RFC -V6 1/6] NUMA balancing: optimize page placement for memory tiering system

2021-03-11 Thread Huang Ying
TODO: - Update ABI document: Documentation/sysctl/kernel.txt Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Michal Hocko Cc: Rik van Riel Cc: Mel Gorman Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Dave Hansen Cc: Dan Williams Cc: linux-kernel@vger.kernel.org Cc: linux...@kvack.or

[RFC -V6 2/6] memory tiering: add page promotion counter

2021-03-11 Thread Huang Ying
To distinguish the number of memory-tiering promoted pages from that of the original inter-socket NUMA balancing migrated pages. The counter is per-node (counted on the target node). So it can be used to identify promotion imbalance among the NUMA nodes. Signed-off-by: "Huang, Ying

[PATCH] vmscan: retry without cache trim mode if nothing scanned

2021-03-10 Thread Huang, Ying
From: Huang Ying In shrink_node(), to determine whether to enable cache trim mode, the LRU size is obtained via lruvec_page_state(). That gets the value from a per-CPU counter (mem_cgroup_per_node->lruvec_stat[]). The error of the per-CPU counter from CPU-local counting and the descendant mem
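The retry logic, roughly (a sketch of the idea, not the literal diff):

    /* If cache trim mode was enabled because the per-CPU LRU counters
     * over-reported the file LRU size, nothing gets scanned; retry the
     * node shrink with cache trim mode disabled. */
    again:
            shrink_node_memcgs(pgdat, sc);
            if (sc->nr_scanned == 0 && sc->cache_trim_mode) {
                    sc->cache_trim_mode = 0;
                    goto again;
            }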

Re: [RFC -V5 1/6] NUMA balancing: optimize page placement for memory tiering system

2021-02-05 Thread Huang, Ying
Hillf Danton writes: > On Thu, 4 Feb 2021 18:10:51 +0800 Huang Ying wrote: >> With the advent of various new memory types, some machines will have >> multiple types of memory, e.g. DRAM and PMEM (persistent memory). The >> memory subsystem of these machines can be

[RFC -V5 6/6] memory tiering: add page promotion counter

2021-02-04 Thread Huang Ying
To distinguish the number of memory-tiering promoted pages from that of the original inter-socket NUMA balancing migrated pages. The counter is per-node (counted on the target node). So it can be used to identify promotion imbalance among the NUMA nodes. Signed-off-by: "Huang, Ying

[RFC -V5 4/6] memory tiering: rate limit NUMA migration throughput

2021-02-04 Thread Huang Ying
decreases 51.4% (from 213.0 MB/s to 103.6 MB/s) with the patch, while the benchmark score decreases only 1.8%. A new sysctl knob kernel.numa_balancing_rate_limit_mbps is added for the users to specify the limit. TODO: Add ABI document for new sysctl knob. Signed-off-by: "Huang, Ying"

[RFC -V5 5/6] memory tiering: adjust hot threshold automatically

2021-02-04 Thread Huang Ying
% with 32.4% fewer NUMA page migrations on a 2 socket Intel server with Optane DC Persistent Memory. Because it improves the accuracy of the hot page selection. Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Michal Hocko Cc: Rik van Riel Cc: Mel Gorman Cc: Peter Zijlstra Cc: Ingo

[RFC -V5 3/6] memory tiering: hot page selection with hint page fault latency

2021-02-04 Thread Huang Ying
nse. - If fast response is more important for system performance, the administrator can set a higher hot threshold. Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Michal Hocko Cc: Rik van Riel Cc: Mel Gorman Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Dave Hansen Cc: Dan Williams Cc:

[RFC -V5 2/6] memory tiering: skip to scan fast memory

2021-02-04 Thread Huang Ying
-by: "Huang, Ying" Suggested-by: Dave Hansen Cc: Andrew Morton Cc: Michal Hocko Cc: Rik van Riel Cc: Mel Gorman Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Dan Williams Cc: linux-kernel@vger.kernel.org Cc: linux...@kvack.org --- include/linux/node.h | 5 + mm/huge_memory.

[RFC -V5 1/6] NUMA balancing: optimize page placement for memory tiering system

2021-02-04 Thread Huang Ying
TODO: - Update ABI document: Documentation/sysctl/kernel.txt Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Michal Hocko Cc: Rik van Riel Cc: Mel Gorman Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Dave Hansen Cc: Dan Williams Cc: linux-kernel@vger.kernel.org Cc: linux...@kvack.or

[RFC -V5 0/6] autonuma: Optimize memory placement for memory tiering system

2021-02-04 Thread Huang Ying
ebased on the latest page demotion patchset. v2: - Addressed comments for V1. - Rebased on v5.5. Huang Ying (6): NUMA balancing: optimize page placement for memory tiering system memory tiering: skip to scan fast memory memory tiering: hot page selection with hint page fault latency memory

Re: [PATCH] mm/swap_state: Constify static struct attribute_group

2021-02-01 Thread Huang, Ying
d to me. Acked-by: "Huang, Ying" > --- > mm/swap_state.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/swap_state.c b/mm/swap_state.c > index d0d417efeecc..3cdee7b11da9 100644 > --- a/mm/swap_state.c > +++ b/mm/swap_state.c > @@

Re: [PATCH -V9 2/3] NOT kernel/man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING

2021-01-21 Thread Huang, Ying
"Alejandro Colomar (man-pages)" writes: > Hi Huang Ying, > > On 1/20/21 7:12 AM, Huang Ying wrote: >> Signed-off-by: "Huang, Ying" >> Cc: "Alejandro Colomar" > > Sorry, for the confusion. > I have a different email for reading lists.

Re: [PATCH] swap: Check nrexceptional of swap cache before being freed

2021-01-21 Thread Huang, Ying
Matthew Wilcox writes: > On Wed, Jan 20, 2021 at 03:27:11PM +0800, Huang Ying wrote: >> To catch the error in updating the swap cache shadow entries or their count. > > I just resent a patch that removes nrexceptional tracking. > > Can you use !mapping_empty() inst

Re: [PATCH] swap: Check nrexceptional of swap cache before being freed

2021-01-19 Thread Huang, Ying
Michal Hocko writes: > On Wed 20-01-21 15:27:11, Huang Ying wrote: >> To catch the error in updating the swap cache shadow entries or their count. > > What is the error? There's no error in the current code. But we will change the related code in the future. So this checki

[PATCH] swap: Check nrexceptional of swap cache before being freed

2021-01-19 Thread Huang Ying
To catch the error in updating the swap cache shadow entries or their count. Signed-off-by: "Huang, Ying" Cc: Minchan Kim Cc: Joonsoo Kim Cc: Johannes Weiner Cc: Vlastimil Babka Cc: Hugh Dickins Cc: Mel Gorman Cc: Michal Hocko Cc: Dan Williams Cc: Christoph Hellwig Cc: Il

[PATCH -V9 2/3] NOT kernel/man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING

2021-01-19 Thread Huang Ying
Signed-off-by: "Huang, Ying" Cc: "Alejandro Colomar" --- man2/set_mempolicy.2 | 22 ++ 1 file changed, 22 insertions(+) diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2 index 68011eecb..fa64a1820 100644 --- a/man2/set_mempolicy.2 +++ b/man2/set_m

[PATCH -V9 3/3] NOT kernel/numactl: Support to enable Linux kernel NUMA balancing

2021-01-19 Thread Huang Ying
be used before the --membind/-m memory policy in the command line. With it, the Linux kernel NUMA balancing will be enabled for the process if --membind/-m is used and the feature is supported by the kernel. Signed-off-by: "Huang, Ying" --- libnuma.c | 14 ++ numa.3

[PATCH -V9 1/3] numa balancing: Migrate on fault among multiple bound nodes

2021-01-19 Thread Huang Ying
from node 1 to node 3 after killing the memory eater, and the pmbench score can increase about 17.5%. Signed-off-by: "Huang, Ying" Acked-by: Mel Gorman Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Rik van Riel Cc: Johannes Weiner Cc: "Matthew Wilcox (Oracle)" Cc: Dave Hanse
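For context, a user-space sketch of how the new flag is meant to be used (the fallback #define is an assumption for headers that predate the series; kernels without the feature reject the flag with EINVAL):

    #include <stdio.h>
    #include <numaif.h>

    #ifndef MPOL_F_NUMA_BALANCING
    #define MPOL_F_NUMA_BALANCING (1 << 13)   /* assumed uapi value */
    #endif

    int main(void)
    {
            /* Bind to nodes 1 and 3, but allow NUMA balancing to migrate
             * pages among the bound nodes on hint faults. */
            unsigned long nodemask = (1UL << 1) | (1UL << 3);

            if (set_mempolicy(MPOL_BIND | MPOL_F_NUMA_BALANCING,
                              &nodemask, 8 * sizeof(nodemask) + 1))
                    perror("set_mempolicy");
            return 0;
    }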

[PATCH -V9 0/3] numa balancing: Migrate on fault among multiple bound nodes

2021-01-19 Thread Huang Ying
necessary. v4: - Use new flags instead of reuse MPOL_MF_LAZY. v3: - Rebased on latest upstream (v5.10-rc3) - Revised the change log. v2: - Rebased on latest upstream (v5.10-rc1) Best Regards, Huang, Ying

Re: [PATCH] mm: Free unused swap cache page in write protection fault handler

2021-01-15 Thread Huang, Ying
Linus Torvalds writes: > On Tue, Jan 12, 2021 at 9:24 PM huang ying wrote: >> > >> > Couldn't we just move it to the tail of the LRU list so it's reclaimed >> > first? Or is locking going to be a problem here? >> >> Yes. That's a way to

Re: [PATCH] mm: Free unused swap cache page in write protection fault handler

2021-01-12 Thread huang ying
On Wed, Jan 13, 2021 at 11:12 AM Matthew Wilcox wrote: > > On Wed, Jan 13, 2021 at 11:08:56AM +0800, huang ying wrote: > > On Wed, Jan 13, 2021 at 10:47 AM Linus Torvalds > > wrote: > > > > > > On Tue, Jan 12, 2021 at 6:43 PM Huang Ying wrote: > >

Re: [PATCH] mm: Free unused swap cache page in write protection fault handler

2021-01-12 Thread huang ying
On Wed, Jan 13, 2021 at 10:47 AM Linus Torvalds wrote: > > On Tue, Jan 12, 2021 at 6:43 PM Huang Ying wrote: > > > > So in this patch, at the end of wp_page_copy(), we try to free the old, > > unused swap cache page. > > I'd much rather free it later

[PATCH] mm: Free unused swap cache page in write protection fault handler

2021-01-12 Thread Huang Ying
SwapCached: 1240 kB AnonPages: 1904 kB BTW: I think this should be in stable after v5.9. Fixes: 09854ba94c6a ("mm: do_wp_page() simplification") Signed-off-by: "Huang, Ying" Cc: Linus Torvalds Cc: Peter Xu Cc: Hugh Dickins Cc: Johannes Weiner Cc: Mel Gorman
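The cleanup at the end of wp_page_copy(), in sketch form (simplified; the real patch goes through the existing swap-cache helpers and handles the reference counting):

    /* After a successful CoW copy, the old page may sit in the swap
     * cache with no remaining mappings; drop it right away instead of
     * letting it linger until reclaim finds it. */
    if (page_copied && PageSwapCache(old_page) &&
        !page_mapped(old_page) && trylock_page(old_page)) {
            try_to_free_swap(old_page);
            unlock_page(old_page);
    }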

Re: [PATCH -V8 1/3] numa balancing: Migrate on fault among multiple bound nodes

2021-01-11 Thread Huang, Ying
Hi, Peter, Huang Ying writes: > Now, NUMA balancing can only optimize the page placement among the > NUMA nodes if the default memory policy is used. Because the memory > policy specified explicitly should take precedence. But this seems > too strict in some situations.

[PATCH -V8 2/3] NOT kernel/man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING

2021-01-05 Thread Huang Ying
Signed-off-by: "Huang, Ying" Cc: "Alejandro Colomar" --- man2/set_mempolicy.2 | 22 ++ 1 file changed, 22 insertions(+) diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2 index 68011eecb..fa64a1820 100644 --- a/man2/set_mempolicy.2 +++ b/man2/set_m

[PATCH -V8 3/3] NOT kernel/numactl: Support to enable Linux kernel NUMA balancing

2021-01-05 Thread Huang Ying
be used before the --membind/-m memory policy in the command line. With it, the Linux kernel NUMA balancing will be enabled for the process if --membind/-m is used and the feature is supported by the kernel. Signed-off-by: "Huang, Ying" --- libnuma.c | 14 ++ numa.3

[PATCH -V8 1/3] numa balancing: Migrate on fault among multiple bound nodes

2021-01-05 Thread Huang Ying
from node 1 to node 3 after killing the memory eater, and the pmbench score can increase about 17.5%. Signed-off-by: "Huang, Ying" Acked-by: Mel Gorman Cc: Andrew Morton Cc: Ingo Molnar Cc: Rik van Riel Cc: Johannes Weiner Cc: "Matthew Wilcox (Oracle)" Cc: Dave Hanse

[PATCH -V8 0/3] numa balancing: Migrate on fault among multiple bound nodes

2021-01-05 Thread Huang Ying
: - Rebased on latest upstream (v5.10-rc3) - Revised the change log. v2: - Rebased on latest upstream (v5.10-rc1) Best Regards, Huang, Ying

Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING

2020-12-20 Thread Huang, Ying
"Alejandro Colomar (mailing lists; readonly)" writes: > Hi Huang, Ying, > > Sorry I forgot to answer. > See below. > > BTW, Linux 5.10 has been released recently; > is this series already merged for 5.11? > If not yet, could you just write '5.??' and we'll

Re: [PATCH] x86/mtrr: Correct the returned MTRR type of mtrr_type_lookup.

2020-12-14 Thread Huang, Ying-Tsun
On Fri, 11 Dec 2020, Borislav Petkov wrote: > On Mon, Dec 07, 2020 at 02:12:26PM +0800, Ying-Tsun Huang wrote: > > In mtrr_type_lookup, if the input memory address region is not in the > > MTRR, over 4GB, and not over the top of memory, write-back attribute > >

Re: [PATCH -V6 RESEND 1/3] numa balancing: Migrate on fault among multiple bound nodes

2020-12-10 Thread Huang, Ying
"Huang, Ying" writes: > Peter Zijlstra writes: > >> On Wed, Dec 02, 2020 at 11:40:54AM +, Mel Gorman wrote: >>> On Wed, Dec 02, 2020 at 04:42:32PM +0800, Huang Ying wrote: >>> > Now, NUMA balancing can only optimize the page placement among the &g

Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING

2020-12-08 Thread Huang, Ying
Hi, Alex, Sorry for late, I just notice this email today. "Alejandro Colomar (mailing lists; readonly)" writes: > Hi Huang Ying, > > Please see a few fixes below. > > Michael, as always, some question for you too ;) > > Thanks, > > Alex > > On 12/2/

Re: [PATCH -V7 2/3] NOT kernel/man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING

2020-12-06 Thread Huang, Ying
Hi, Alex, "Alejandro Colomar (man-pages)" writes: > Hi Huang Ying, > > Please, see a few fixes below. > > Thanks, > > Alex > > On 12/4/20 10:15 AM, Huang Ying wrote: >> Signed-off-by: "Huang, Ying" >> --- >> man2/set_

Re: [PATCH -V6 RESEND 1/3] numa balancing: Migrate on fault among multiple bound nodes

2020-12-04 Thread Huang, Ying
Peter Zijlstra writes: > On Wed, Dec 02, 2020 at 11:40:54AM +, Mel Gorman wrote: >> On Wed, Dec 02, 2020 at 04:42:32PM +0800, Huang Ying wrote: >> > Now, NUMA balancing can only optimize the page placement among the >> > NUMA nodes if the default memory policy i

[PATCH -V7 3/3] NOT kernel/numactl: Support to enable Linux kernel NUMA balancing

2020-12-04 Thread Huang Ying
be used before the --membind/-m memory policy in the command line. With it, the Linux kernel NUMA balancing will be enabled for the process if --membind/-m is used and the feature is supported by the kernel. Signed-off-by: "Huang, Ying" --- libnuma.c | 14 ++ numa.3

[PATCH -V7 2/3] NOT kernel/man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING

2020-12-04 Thread Huang Ying
Signed-off-by: "Huang, Ying" --- man2/set_mempolicy.2 | 14 ++ 1 file changed, 14 insertions(+) diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2 index 68011eecb..fb2e6fd96 100644 --- a/man2/set_mempolicy.2 +++ b/man2/set_mempolicy.2 @@ -113,6 +113,15 @@ A no

[PATCH -V7 1/3] numa balancing: Migrate on fault among multiple bound nodes

2020-12-04 Thread Huang Ying
from node 1 to node 3 after killing the memory eater, and the pmbench score can increase about 17.5%. Signed-off-by: "Huang, Ying" Acked-by: Mel Gorman Cc: Andrew Morton Cc: Ingo Molnar Cc: Rik van Riel Cc: Johannes Weiner Cc: "Matthew Wilcox (Oracle)" Cc: Dave Hanse

[PATCH -V7 0/3] numa balancing: Migrate on fault among multiple bound nodes

2020-12-04 Thread Huang Ying
the change log. v2: - Rebased on latest upstream (v5.10-rc1) Best Regards, Huang, Ying

Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING

2020-12-02 Thread Huang, Ying
Mel Gorman writes: > On Wed, Dec 02, 2020 at 04:42:33PM +0800, Huang Ying wrote: >> Signed-off-by: "Huang, Ying" >> --- >> man2/set_mempolicy.2 | 9 + >> 1 file changed, 9 insertions(+) >> >> diff --git a/man2/set_mempolicy.2 b/man2/set

[PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING

2020-12-02 Thread Huang Ying
Signed-off-by: "Huang, Ying" --- man2/set_mempolicy.2 | 9 + 1 file changed, 9 insertions(+) diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2 index 68011eecb..3754b3e12 100644 --- a/man2/set_mempolicy.2 +++ b/man2/set_mempolicy.2 @@ -113,6 +113,12 @@ A nonempty .

[PATCH -V6 RESEND 1/3] numa balancing: Migrate on fault among multiple bound nodes

2020-12-02 Thread Huang Ying
es can be migrated from node 1 to node 3 after killing the memory eater, and the pmbench score can increase about 17.5%. Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Ingo Molnar Cc: Mel Gorman Cc: Rik van Riel Cc: Johannes Weiner Cc: "Matthew Wilcox (Oracle)" Cc: Dav

[PATCH -V6 RESEND 3/3] NOT kernel/numactl: Support to enable Linux kernel NUMA balancing

2020-12-02 Thread Huang Ying
be used before the memory policy options in the command line. With it, the Linux kernel NUMA balancing will be enabled for the process if the feature is supported by the kernel. Signed-off-by: "Huang, Ying" --- libnuma.c | 14 ++ numa.3| 15 +

[PATCH -V6 RESEND 0/3] numa balancing: Migrate on fault among multiple bound nodes

2020-12-02 Thread Huang Ying
, because it's not clear that it's necessary. v4: - Use new flags instead of reuse MPOL_MF_LAZY. v3: - Rebased on latest upstream (v5.10-rc3) - Revised the change log. v2: - Rebased on latest upstream (v5.10-rc1) Best Regards, Huang, Ying

Re: [PATCH -v6 2/3] NOT kernel/man-pages man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING

2020-11-30 Thread Huang, Ying
Dave Hansen writes: > On 11/25/20 9:32 PM, Huang Ying wrote: >> --- a/man2/set_mempolicy.2 >> +++ b/man2/set_mempolicy.2 >> @@ -113,6 +113,11 @@ A nonempty >> .I nodemask >> specifies node IDs that are relative to the set of >> node IDs allowed by the pr

[PATCH -V6 1/3] numa balancing: Migrate on fault among multiple bound nodes

2020-11-25 Thread Huang Ying
es can be migrated from node 1 to node 3 after killing the memory eater, and the pmbench score can increase about 17.5%. Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Ingo Molnar Cc: Mel Gorman Cc: Rik van Riel Cc: Johannes Weiner Cc: "Matthew Wilcox (Oracle)" Cc: Dav

[PATCH -V6 0/3] autonuma: Migrate on fault among multiple bound nodes

2020-11-25 Thread Huang Ying
, because it's not clear that it's necessary. v4: - Use new flags instead of reuse MPOL_MF_LAZY. v3: - Rebased on latest upstream (v5.10-rc3) - Revised the change log. v2: - Rebased on latest upstream (v5.10-rc1) Best Regards, Huang, Ying

[PATCH -V6 3/3] NOT kernel/numactl: Support to enable Linux kernel NUMA balancing

2020-11-25 Thread Huang Ying
From: Huang Ying A new API: numa_set_membind_balancing() is added to libnuma. It is the same as numa_set_membind() except that the Linux kernel NUMA balancing will be enabled for the task if the feature is supported by the kernel. At the same time, a new option: --balancing (-b) is added
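Usage of the new libnuma call, per the description above (a sketch assuming the signature mirrors numa_set_membind()):

    #include <numa.h>

    /* Bind the task's memory to nodes 1 and 3 and ask the kernel to
     * balance pages among them; on kernels without the feature this
     * behaves like plain numa_set_membind(). */
    struct bitmask *nodes = numa_parse_nodestring("1,3");

    numa_set_membind_balancing(nodes);
    numa_free_nodemask(nodes);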

[PATCH -v6 2/3] NOT kernel/man-pages man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING

2020-11-25 Thread Huang Ying
From: Huang Ying Signed-off-by: "Huang, Ying" --- man2/set_mempolicy.2 | 8 1 file changed, 8 insertions(+) diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2 index 68011eecb..fb16bb351 100644 --- a/man2/set_mempolicy.2 +++ b/man2/set_mempolicy.2 @@ -113,6 +113,11 @@

Re: [PATCH 2/2] mm/vmalloc: rework the drain logic

2020-11-24 Thread Huang, Ying
list may >> become too long in some cases. And the code/algorithm changes needed to >> control the length of the purging list are much smaller than those needed >> for merging. So I suggest doing length control first, then merging. Again, just my 2 cents. >> > All such tuning parameters work for one case and not for > others. Therefore I prefer to have something more generic that tends > to improve things, instead of thinking how to tune parameters to > cover all test cases and workloads. It's a new mechanism to control the length of the purging list directly. So I don't think that's just parameter tuning. It's a simple and direct method that can work together with the merging method to control the purging latency even if the vmap areas cannot be merged in some cases. But these cases may not exist in practice, so I will not insist on this method. Best Regards, Huang, Ying

Re: [PATCH 2/2] mm/vmalloc: rework the drain logic

2020-11-23 Thread Huang, Ying
Do you think so? >> >> >> > If we set lazy_max_pages() to a vague value such as 100, the performance >> > will just be destroyed. >> >> Sorry, my original words weren't clear enough. What I really want to >> suggest is to control the length of the purging list instead of reducing >> lazy_max_pages() directly. That is, we can have an "atomic_t >> nr_purge_item" to record the length of the purging list and start >> purging if (vmap_lazy_nr > lazy_max_pages && nr_purge_item > >> max_purge_item). vmap_lazy_nr is to control the virtual address space, >> nr_purge_item is to control the batched purging latency. "100" is just >> an example; the real value should be determined according to the test >> results. >> > OK. Now I see what you meant. Please note, the merging is in place, so > the list size gets reduced. Yes. In theory, even with merging, the length of the purging list may become too long in some cases. And the code/algorithm changes needed to control the length of the purging list are much smaller than those needed for merging. So I suggest doing length control first, then merging. Again, just my 2 cents. Best Regards, Huang, Ying
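The suggestion, spelled out (purely illustrative; nr_purge_item and max_purge_item are the names proposed in the email, not merged code):

    static atomic_t nr_purge_item;          /* current purging-list length */

    /* vmap_lazy_nr bounds the lazily-freed address space; nr_purge_item
     * bounds how much is purged in one batch, i.e. the purging latency. */
    if (atomic_long_read(&vmap_lazy_nr) > lazy_max_pages() &&
        atomic_read(&nr_purge_item) > max_purge_item)
            try_purge_vmap_area_lazy();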

Re: [PATCH 2/2] mm/vmalloc: rework the drain logic

2020-11-19 Thread Huang, Ying
Uladzislau Rezki writes: > On Thu, Nov 19, 2020 at 09:40:29AM +0800, Huang, Ying wrote: >> Uladzislau Rezki writes: >> >> > On Wed, Nov 18, 2020 at 10:44:13AM +0800, huang ying wrote: >> >> On Tue, Nov 17, 2020 at 9:04 PM Uladzislau Rezki wrote: >>

Re: [RFC -V5] autonuma: Migrate on fault among multiple bound nodes

2020-11-19 Thread Huang, Ying
Mel Gorman writes: > On Thu, Nov 19, 2020 at 02:17:21PM +0800, Huang, Ying wrote: >> >> Various page placement optimization based on the NUMA balancing can be >> >> done with these flags. As the first step, in this patch, if the >> >> memory of the

Re: [RFC -V5] autonuma: Migrate on fault among multiple bound nodes

2020-11-18 Thread Huang, Ying
Mel Gorman writes: > On Wed, Nov 18, 2020 at 01:19:52PM +0800, Huang Ying wrote: >> Now, AutoNUMA can only optimize the page placement among the NUMA > > Note that the feature is referred to as NUMA_BALANCING in the kernel > configs as AUTONUMA as it was first presen

Re: [PATCH 2/2] mm/vmalloc: rework the drain logic

2020-11-18 Thread Huang, Ying
Uladzislau Rezki writes: > On Wed, Nov 18, 2020 at 10:44:13AM +0800, huang ying wrote: >> On Tue, Nov 17, 2020 at 9:04 PM Uladzislau Rezki wrote: >> > >> > On Tue, Nov 17, 2020 at 10:37:34AM +0800, huang ying wrote: >> > > On Tue, Nov 17, 2020 at 6:00 A

[RFC -V5] autonuma: Migrate on fault among multiple bound nodes

2020-11-17 Thread Huang Ying
AutoNUMA for a specific memory area inside an application, so we only add the flag at the thread level (set_mempolicy()) instead of the memory area level (mbind()). We can do that when it becomes necessary. Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Ingo Molnar Cc: Mel Gorman C

Re: [PATCH 2/2] mm/vmalloc: rework the drain logic

2020-11-17 Thread huang ying
On Tue, Nov 17, 2020 at 9:04 PM Uladzislau Rezki wrote: > > On Tue, Nov 17, 2020 at 10:37:34AM +0800, huang ying wrote: > > On Tue, Nov 17, 2020 at 6:00 AM Uladzislau Rezki (Sony) > > wrote: > > > > > > A current "lazy drain" model suffers f

Re: [PATCH 2/2] mm/vmalloc: rework the drain logic

2020-11-16 Thread huang ying
__purge_vmap_area_lazy() as follows, if (atomic_long_read(&vmap_lazy_nr) < resched_threshold) cond_resched_lock(&free_vmap_area_lock); If it works properly, the latency problem can be solved. Can you check whether this works for you? Best Regards, Huang, Ying

Re: [PATCH v21 07/19] mm: page_idle_get_page() does not need lru_lock

2020-11-11 Thread huang ying
!PageLRU(page) || > !get_page_unless_zero(page)) > return NULL; > > - pgdat = page_pgdat(page); > - spin_lock_irq(&pgdat->lru_lock); get_page_unless_zero() is a full memory barrier. But do we need a compiler barrier here to prevent the compiler from caching Pag

Re: [PATCH -V2 2/2] autonuma: Migrate on fault among multiple bound nodes

2020-11-10 Thread Huang, Ying
Hi, Mel, Mel Gorman writes: > On Wed, Nov 04, 2020 at 01:36:58PM +0800, Huang, Ying wrote: >> > I've no specific objection to the patch or the name change. I can't >> > remember exactly why I picked the name, it was 8 years ago but I think it >> > was because t

[RFC -V4] autonuma: Migrate on fault among multiple bound nodes

2020-11-10 Thread Huang Ying
it seems not a good API/ABI for the purpose of the patch. Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Ingo Molnar Cc: Mel Gorman Cc: Rik van Riel Cc: Johannes Weiner Cc: "Matthew Wilcox (Oracle)" Cc: Dave Hansen Cc: Andi Kleen Cc: Michal Hocko Cc: David Rientje

[PATCH -V3 2/2] autonuma: Migrate on fault among multiple bound nodes

2020-11-09 Thread Huang Ying
.] Signed-off-by: "Huang, Ying" Acked-by: Mel Gorman Cc: Andrew Morton Cc: Ingo Molnar Cc: Rik van Riel Cc: Johannes Weiner Cc: "Matthew Wilcox (Oracle)" Cc: Dave Hansen Cc: Andi Kleen Cc: Michal Hocko Cc: David Rientjes --- mm/mempolicy.c | 17 +++-- 1 file ch

[PATCH -V3 0/2] autonuma: Migrate on fault among multiple bound nodes

2020-11-09 Thread Huang Ying
To make it possible to optimize cross-socket memory accessing with AutoNUMA even if the memory of the application is bound to multiple NUMA nodes. Changes: v3: - Rebased on latest upstream (v5.10-rc3) - Revised the change log. v2: - Rebased on latest upstream (v5.10-rc1) Huang Ying (2

[PATCH -V3 1/2] mempolicy: Rename MPOL_F_MORON to MPOL_F_MOPRON

2020-11-09 Thread Huang Ying
. The flag is upper case with prefix, so it looks generally OK by itself. But in the following patch, we will introduce a label named after the flag, which is lower case and without prefix, so it's better to rename it. Signed-off-by: "Huang, Ying" Suggested-by: "Matthew Wilcox (

Re: [PATCH -V2 2/2] autonuma: Migrate on fault among multiple bound nodes

2020-11-05 Thread Huang, Ying
Mel Gorman writes: > On Wed, Nov 04, 2020 at 01:36:58PM +0800, Huang, Ying wrote: >> But from another point of view, I suggest to remove the constraints of >> MPOL_F_MOF in the future. If the overhead of AutoNUMA isn't acceptable, >> why not just disable AutoNUMA glo

Re: [RFC PATCH 0/2] mm: fix OOMs for binding workloads to movable zone only node

2020-11-05 Thread Huang, Ying
performance of PMEM is much worse than that of DRAM. If we find that some pages on PMEM are accessed frequently (hot), we may want to move them to DRAM to optimize the system performance. If unmovable pages are allocated on PMEM and become hot, it's possible that we cannot move them to DRAM without rebooting the system. So we think we should make the PMEM nodes MOVABLE-only. Best Regards, Huang, Ying

Re: [PATCH -V2 2/2] autonuma: Migrate on fault among multiple bound nodes

2020-11-03 Thread Huang, Ying
Hi, Mel, Thanks for comments! Mel Gorman writes: > On Wed, Oct 28, 2020 at 10:34:11AM +0800, Huang Ying wrote: >> Now, AutoNUMA can only optimize the page placement among the NUMA nodes if >> the >> default memory policy is used. Because the memory policy specified &g

Re: [PATCH -V2 1/2] mempolicy: Rename MPOL_F_MORON to MPOL_F_MOPRON

2020-11-01 Thread Huang, Ying
Michal Hocko writes: > On Fri 30-10-20 15:27:51, Huang, Ying wrote: >> Michal Hocko writes: >> >> > On Wed 28-10-20 10:34:10, Huang Ying wrote: >> >> To follow code-of-conduct better. >> > >> > This is changing a user visible interface an

Re: [PATCH -V2 1/2] mempolicy: Rename MPOL_F_MORON to MPOL_F_MOPRON

2020-10-30 Thread Huang, Ying
Michal Hocko writes: > On Wed 28-10-20 10:34:10, Huang Ying wrote: >> To follow code-of-conduct better. > > This is changing a user visible interface and any userspace which refers > to the existing name will fail to compile unless I am missing something. Although these flags

[RFC -V4 4/6] autonuma, memory tiering: Rate limit NUMA migration throughput

2020-10-27 Thread Huang Ying
MB/s to 105.9 MB/s) with the patch, while the benchmark score decreases only 3.3%. A new sysctl knob kernel.numa_balancing_rate_limit_mbps is added for the users to specify the limit. TODO: Add ABI document for new sysctl knob. Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Michal

[RFC -V4 6/6] autonuma, memory tiering: Add page promotion counter

2020-10-27 Thread Huang Ying
To distinguish the number of promotions from the original inter-socket NUMA balancing migrations. The counter is per-node (target node). This is to identify imbalance among NUMA nodes. Signed-off-by: "Huang, Ying" --- include/linux/mmzone.h | 1 + mm/migrate.c | 10

[RFC -V4 1/6] autonuma: Optimize page placement for memory tiering system

2020-10-27 Thread Huang Ying
ocumentation/sysctl/kernel.txt Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Michal Hocko Cc: Rik van Riel Cc: Mel Gorman Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Dave Hansen Cc: Dan Williams Cc: linux-kernel@vger.kernel.org Cc: linux...@kvack.org --- include/linux/sched/

[RFC -V4 2/6] autonuma, memory tiering: Skip to scan fast memory

2020-10-27 Thread Huang Ying
. So that the page faults can be avoided too. In the test, if only the memory tiering AutoNUMA mode is enabled, the number of AutoNUMA hint faults for the DRAM node is reduced to almost 0 with the patch, while the benchmark score doesn't change visibly. Signed-off-by: "Huang,
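The skip, in sketch form (node_is_toptier() and the mode bit are named as in the series; the check is simplified and sits inside the NUMA-hinting scan loop):

    /* In memory-tiering mode only pages in slow memory (e.g. PMEM) need
     * hint faults for promotion, so leave fast-memory PTEs accessible. */
    if ((sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
        node_is_toptier(page_to_nid(page)))
            continue;       /* already in fast memory: skip this page */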

[RFC -V4 3/6] autonuma, memory tiering: Hot page selection with hint page fault latency

2020-10-27 Thread Huang Ying
ant for system performance, the administrator can set a higher hot threshold. Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Michal Hocko Cc: Rik van Riel Cc: Mel Gorman Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Dave Hansen Cc: Dan Williams Cc: linux-kernel@vger.kernel.

[RFC -V4 5/6] autonuma, memory tiering: Adjust hot threshold automatically

2020-10-27 Thread Huang Ying
NUMA page migrations on a 2 socket Intel server with Optane DC Persistent Memory. Because it improves the accuracy of the hot page selection. Signed-off-by: "Huang, Ying" Cc: Andrew Morton Cc: Michal Hocko Cc: Rik van Riel Cc: Mel Gorman Cc: Peter Zijlstra Cc: D

[RFC -V4 0/6] autonuma: Optimize memory placement for memory tiering system

2020-10-27 Thread Huang Ying
ssed comments for V1. - Rebased on v5.5. Huang Ying (6): autonuma: Optimize page placement for memory tiering system autonuma, memory tiering: Skip to scan fast memory autonuma, memory tiering: Hot page selection with hint page fault latency autonuma, memory tiering: Rate limit NUMA

Re: [mm, thp] 85b9f46e8e: vm-scalability.throughput -8.7% regression

2020-10-20 Thread Huang, Ying
David Rientjes writes: > On Tue, 20 Oct 2020, Huang, Ying wrote: > >> compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode: >>

Re: [mm, thp] 85b9f46e8e: vm-scalability.throughput -8.7% regression

2020-10-19 Thread Huang, Ying
t memory access latency at the hardware level when > running on a NUMA system. So you think it's better to bind processes to a NUMA node or CPU? But we want to use this test case to capture NUMA/CPU placement/balance issues too. 0day solves the problem in another way. We run the test case multiple times and calculate the average and standard deviation, then compare. For this specific regression, I found something strange, 10.93 ± 15% +10.8 21.78 ± 10% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.shmem_getpage_gfp.shmem_fault It appears the lock contention becomes heavier with the patch. But I cannot understand why either. Best Regards, Huang, Ying

Re: [RFC][PATCH 6/9] mm/vmscan: add page demotion counter

2020-10-19 Thread Huang, Ying
/proc/vmstat. > > [ daveh: > - __count_vm_events() a bit, and made them look at the THP > size directly rather than getting data from migrate_pages() It appears that we get the data from migrate_pages() now. > ] > > Signed-off-by: Yang Shi > Signed-off-by: Dave Hansen > Cc: Dav

Re: [PATCH] mm: Fix a race during split THP

2020-10-09 Thread Huang, Ying
Matthew Wilcox writes: > On Fri, Oct 09, 2020 at 03:36:47PM +0800, Huang, Ying wrote: >> +if (PageSwapCache(head)) { >> +swp_entry_t entry = { .val = page_private(head) }; >> + >> +split_swap_cluster(entry); >> +} > ...
