Re: [PATCH -v2 01/10] swap: Change SWAPFILE_CLUSTER to 512

2016-09-02 Thread Huang, Ying
Andrew Morton <a...@linux-foundation.org> writes: > On Thu, 01 Sep 2016 16:04:57 -0700 "Huang\, Ying" <ying.hu...@intel.com> > wrote: > >> >> } >> >> >> >> -#define SWAPFILE_CLUSTER 256 >> >> +#define SWAPFILE

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-09-06 Thread Huang, Ying
siders > itself to be rotational storage. It takes the paths that are optimised to > minimise seeks but it's quite slow. When tree_lock contention is reduced, > workload is dominated by scan_swap_map. It's a one-line fix and I have > a patch for it but it only really matters if ramdisk is being used as a > simulator for swapping to fast storage. We (LKP people) use drivers/nvdimm/pmem.c instead of drivers/block/brd.c as the ramdisk, which considers itself to be non-rotational storage. We also have a series to optimize other locks in the swap path, for example by batching swap space allocation and freeing. If your solution for batching the removal of pages from the swap cache can be merged, that will help us a lot! Best Regards, Huang, Ying

Re: [PATCH -v2 01/10] swap: Change SWAPFILE_CLUSTER to 512

2016-09-01 Thread Huang, Ying
Andrew Morton <a...@linux-foundation.org> writes: > On Thu, 1 Sep 2016 08:16:54 -0700 "Huang, Ying" <ying.hu...@intel.com> wrote: > >> From: Huang Ying <ying.hu...@intel.com> >> >> In this patch, the size of the swap cluster is changed to

Re: [LKP] [lkp] [f2fs] ec795418c4: fsmark.files_per_sec -36.3% regression

2016-08-30 Thread Huang, Ying
e.net/project/aimbench/aim-suite7/Initial%20release/s7110.tar.Z > > Thank you for the codes. > > I've run this workload on the latest f2fs and compared performance having > without the reported patch. (1TB nvme SSD, 16 cores, 16GB DRAM) > Interestingly, I could find slight performance improvement rather than > regression. :( > Not sure how to reproduce this. I think the difference lies in the disk used. A ramdisk is used in the original test, but it appears that your memory is too small to set up the RAM disk for the test. So it may be impossible for you to reproduce the test unless you can find more memory :) But we can help you to root cause the issue. What additional data do you want? perf-profile data before and after the patch? Best Regards, Huang, Ying

Re: [PATCH] mm: Don't use radix tree writeback tags for pages in swap cache

2016-08-29 Thread Huang, Ying
Hi, Rik, Thanks for comments! Rik van Riel <r...@redhat.com> writes: > On Thu, 2016-08-25 at 12:27 -0700, Huang, Ying wrote: >> File pages use a set of radix tags (DIRTY, TOWRITE, WRITEBACK, etc.) >> to >> accelerate finding the pages with a specific tag in the

[PATCH -v2] mm: Don't use radix tree writeback tags for pages in swap cache

2016-08-30 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> File pages use a set of radix tree tags (DIRTY, TOWRITE, WRITEBACK, etc.) to accelerate finding the pages with a specific tag in the radix tree during inode writeback. But for anonymous pages in the swap cache, there is no inode writebac

[PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patchset is to optimize the performance of Transparent Huge Page (THP) swap. Hi, Andrew, could you help me to check whether the overall design is reasonable? Hi, Hugh, Shaohua, Minchan and Rik, could you help me to review the swa

[PATCH -v3 06/10] mm, THP, swap: Support to clear SWAP_HAS_CACHE for huge page

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> __swapcache_free() is added to support to clear the SWAP_HAS_CACHE flag for the huge page. This will free the specified swap cluster now. Because now this function will be called only in the error path to free the swap cluster just allocate

[PATCH -v2 01/10] swap: Change SWAPFILE_CLUSTER to 512

2016-09-01 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> In this patch, the size of the swap cluster is changed to that of the THP (Transparent Huge Page) on x86_64 architecture (512). This is for the THP swap support on x86_64. Where one swap cluster will be used to hold the contents of each THP swapp
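The figure of 512 in this patch description follows from page-size arithmetic. A minimal sketch in userspace C, assuming the usual x86_64 configuration of 4 KiB base pages and 2 MiB PMD-mapped huge pages; the macro and function names below are illustrative and are not the kernel's own definitions.

```c
#include <stddef.h>

/* Assumed x86_64 defaults; illustrative names, not kernel macros. */
#define BASE_PAGE_SIZE  (4UL * 1024)          /* 4 KiB base page */
#define HUGE_PAGE_SIZE  (2UL * 1024 * 1024)   /* 2 MiB PMD-mapped THP */

/* Number of base pages, and so swap slots, needed to hold one huge page. */
static inline size_t slots_per_huge_page(size_t huge_size, size_t base_size)
{
    return huge_size / base_size;
}
```

Under these assumptions, slots_per_huge_page(HUGE_PAGE_SIZE, BASE_PAGE_SIZE) evaluates to 512, so a 512-slot swap cluster holds exactly one THP.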

[PATCH -v2 02/10] mm, memcg: Add swap_cgroup_iter iterator

2016-09-01 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> The swap cgroup uses a kind of discontinuous array to record the information for the swap entries. lookup_swap_cgroup() provides a good encapsulation to access one element of the discontinuous array. To make it easier to access multiple el

[PATCH -v2 03/10] mm, memcg: Support to charge/uncharge multiple swap entries

2016-09-01 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patch makes it possible to charge or uncharge a set of continuous swap entries in the swap cgroup. The number of swap entries is specified via an added parameter. This will be used for the THP (Transparent Huge Page) swap support. Where a swap c

[PATCH -v2 07/10] mm, THP, swap: Support to add/delete THP to/from swap cache

2016-09-01 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> With this patch, a THP (Transparent Huge Page) can be added/deleted to/from the swap cache as a set of sub-pages (512 on x86_64). This will be used for the THP (Transparent Huge Page) swap support. Where one THP may be added/deleted to/from the swap

[PATCH -v2 08/10] mm, THP: Add can_split_huge_page()

2016-09-01 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> Separates checking whether we can split the huge page from split_huge_page_to_list() into a function. This will help to check that before splitting the THP (Transparent Huge Page) really. This will be used for delaying splitting THP during swappi

[PATCH -v2 04/10] mm, THP, swap: Add swap cluster allocate/free functions

2016-09-01 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> The swap cluster allocation/free functions are added based on the existing swap cluster management mechanism for SSD. These functions don't work for the rotating hard disks because the existing swap cluster management mechanism doesn't work fo

[PATCH -v2 09/10] mm, THP, swap: Support to split THP in swap cache

2016-09-01 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patch enhanced the split_huge_page_to_list() to work properly for the THP (Transparent Huge Page) in the swap cache during swapping out. This is used for delaying splitting the THP during swapping out. Where for a THP to be swapped o

[PATCH -v2 05/10] mm, THP, swap: Add get_huge_swap_page()

2016-09-01 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> A variation of get_swap_page(), get_huge_swap_page(), is added to allocate a swap cluster (512 swap slots) based on the swap cluster allocation function. A fairly simple algorithm is used, that is, only the first swap device in the priority list will be

[PATCH -v2 06/10] mm, THP, swap: Support to clear SWAP_HAS_CACHE for huge page

2016-09-01 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> __swapcache_free() is added to support to clear the SWAP_HAS_CACHE flag for the huge page. This will free the specified swap cluster now. Because now this function will be called only in the error path to free the swap cluster just allocate

[PATCH -v2 10/10] mm, THP, swap: Delay splitting THP during swap out

2016-09-01 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> In this patch, splitting huge page is delayed from almost the first step of swapping out to after allocating the swap space for the THP (Transparent Huge Page) and adding the THP into the swap cache. This will reduce lock acquiring/releasing for the

[PATCH -v2 00/10] THP swap: Delay splitting THP during swapping out

2016-09-01 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patchset is to optimize the performance of Transparent Huge Page (THP) swap. Hi, Andrew, could you help me to check whether the overall design is reasonable? Hi, Hugh, Shaohua, Minchan and Rik, could you help me to review the swa

Re: [PATCH -v2] mm: Don't use radix tree writeback tags for pages in swap cache

2016-08-31 Thread Huang, Ying
Mel Gorman <mgor...@techsingularity.net> writes: > On Wed, Aug 31, 2016 at 08:17:24AM -0700, Huang, Ying wrote: >> Mel Gorman <mgor...@techsingularity.net> writes: >> >> > On Tue, Aug 30, 2016 at 10:28:09AM -0700, Huang, Ying wrote: >> >> From: Hua

Re: [PATCH -v2] mm: Don't use radix tree writeback tags for pages in swap cache

2016-08-31 Thread Huang, Ying
Mel Gorman <mgor...@techsingularity.net> writes: > On Tue, Aug 30, 2016 at 10:28:09AM -0700, Huang, Ying wrote: >> From: Huang Ying <ying.hu...@intel.com> >> >> File pages use a set of radix tree tags (DIRTY, TOWRITE, WRITEBACK, >> etc.) to accelerat

Re: [PATCH -v3 04/10] mm, THP, swap: Add swap cluster allocate/free functions

2016-09-08 Thread Huang, Ying
Anshuman Khandual <khand...@linux.vnet.ibm.com> writes: > On 09/07/2016 10:16 PM, Huang, Ying wrote: >> From: Huang Ying <ying.hu...@intel.com> >> >> The swap cluster allocation/free functions are added based on the >> existing swap cluster management mecha

Re: [PATCH -v3 05/10] mm, THP, swap: Add get_huge_swap_page()

2016-09-08 Thread Huang, Ying
"Kirill A. Shutemov" <kir...@shutemov.name> writes: > On Wed, Sep 07, 2016 at 09:46:04AM -0700, Huang, Ying wrote: >> From: Huang Ying <ying.hu...@intel.com> >> >> A variation of get_swap_page(), get_huge_swap_page(), is added to >> allocate

Re: [PATCH -v3 08/10] mm, THP: Add can_split_huge_page()

2016-09-08 Thread Huang, Ying
Hi, Kirill, Thanks for your comments! "Kirill A. Shutemov" <kir...@shutemov.name> writes: > On Wed, Sep 07, 2016 at 09:46:07AM -0700, Huang, Ying wrote: >> From: Huang Ying <ying.hu...@intel.com> >> >> Separates checking whether we can split th

Re: [PATCH -v3 07/10] mm, THP, swap: Support to add/delete THP to/from swap cache

2016-09-08 Thread Huang, Ying
Hi, Anshuman, Thanks for comments! Anshuman Khandual <khand...@linux.vnet.ibm.com> writes: > On 09/07/2016 10:16 PM, Huang, Ying wrote: >> From: Huang Ying <ying.hu...@intel.com> >> >> With this patch, a THP (Transparent Huge Page) can be added/deleted >>

Re: [PATCH -v3 01/10] mm, swap: Make swap cluster size same of THP size on x86_64

2016-09-08 Thread Huang, Ying
"Kirill A. Shutemov" <kir...@shutemov.name> writes: > On Wed, Sep 07, 2016 at 09:46:00AM -0700, Huang, Ying wrote: >> From: Huang Ying <ying.hu...@intel.com> >> >> In this patch, the size of the swap cluster is changed to that of the >> THP (T

Re: [PATCH -v3 01/10] mm, swap: Make swap cluster size same of THP size on x86_64

2016-09-08 Thread Huang, Ying
Anshuman Khandual <khand...@linux.vnet.ibm.com> writes: > On 09/07/2016 10:16 PM, Huang, Ying wrote: >> From: Huang Ying <ying.hu...@intel.com> >> >> In this patch, the size of the swap cluster is changed to that of the >> THP (Transparent

Re: [PATCH -v3 01/10] mm, swap: Make swap cluster size same of THP size on x86_64

2016-09-08 Thread Huang, Ying
"Kirill A. Shutemov" <kir...@shutemov.name> writes: > On Wed, Sep 07, 2016 at 09:46:00AM -0700, Huang, Ying wrote: >> From: Huang Ying <ying.hu...@intel.com> >> >> In this patch, the size of the swap cluster is changed to that of the >> THP (T

Re: [PATCH -v3 03/10] mm, memcg: Support to charge/uncharge multiple swap entries

2016-09-08 Thread Huang, Ying
Anshuman Khandual <khand...@linux.vnet.ibm.com> writes: > On 09/07/2016 10:16 PM, Huang, Ying wrote: >> From: Huang Ying <ying.hu...@intel.com> >> >> This patch make it possible to charge or uncharge a set of continuous >> swap entries in the s

Re: [RFC PATCH 0/4] Reduce tree_lock contention during swap and reclaim of a single file v1

2016-09-09 Thread Huang, Ying
> Agreed. I don't mind leaving it on the back burner unless Dave reports > it really helps or a new bug report about realistic tree_lock contention > shows up. Best Regards, Huang, Ying

Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-13 Thread Huang, Ying
Minchan Kim <minc...@kernel.org> writes: > On Tue, Sep 13, 2016 at 02:40:00PM +0800, Huang, Ying wrote: >> Minchan Kim <minc...@kernel.org> writes: >> >> > Hi Huang, >> > >> > On Fri, Sep 09, 2016 at 01:35:12PM -0700, Huang, Ying wrote: >

Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-13 Thread Huang, Ying
Minchan Kim <minc...@kernel.org> writes: > Hi Huang, > > On Fri, Sep 09, 2016 at 01:35:12PM -0700, Huang, Ying wrote: > > < snip > > >> >> Recently, the performance of the storage devices improved so fast that >> >> we cannot saturate the dis

Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-09 Thread Huang, Ying
Hi, Minchan, Minchan Kim <minc...@kernel.org> writes: > Hi Huang, > > On Wed, Sep 07, 2016 at 09:45:59AM -0700, Huang, Ying wrote: >> From: Huang Ying <ying.hu...@intel.com> >> >> This patchset is to optimize the performance of Transparent Huge Page >

Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-25 Thread Huang, Ying
Shaohua Li <s...@kernel.org> writes: > On Fri, Sep 23, 2016 at 10:32:39AM +0800, Huang, Ying wrote: >> Rik van Riel <r...@redhat.com> writes: >> >> > On Thu, 2016-09-22 at 15:56 -0700, Shaohua Li wrote: >> >> On Wed, Sep

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-09-26 Thread Huang, Ying
Hi, Christoph, "Huang, Ying" <ying.hu...@intel.com> writes: > Hi, Christoph, > > "Huang, Ying" <ying.hu...@intel.com> writes: > >> Christoph Hellwig <h...@lst.de> writes: >> >>> Snipping the long contest:

Re: [LKP] [lkp] [f2fs] ec795418c4: fsmark.files_per_sec -36.3% regression

2016-09-26 Thread Huang, Ying
Hi, Jaegeuk, "Huang, Ying" <ying.hu...@intel.com> writes: > Jaegeuk Kim <jaeg...@kernel.org> writes: > >> Hello, >> >> On Sat, Aug 27, 2016 at 10:13:34AM +0800, Fengguang Wu wrote: >>> Hi Jaegeuk, >>> >>>

Re: [LKP] [lkp] [f2fs] ec795418c4: fsmark.files_per_sec -36.3% regression

2016-09-26 Thread Huang, Ying
Jaegeuk Kim <jaeg...@kernel.org> writes: > On Mon, Sep 26, 2016 at 02:26:06PM +0800, Huang, Ying wrote: >> Hi, Jaegeuk, >> >> "Huang, Ying" <ying.hu...@intel.com> writes: >> >> > Jaegeuk Kim <jaeg...@kernel.org> writes: >

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-09-26 Thread Huang, Ying
be fixed that this uncovered. But > in the end will probably stuck with a slight regression in this > artificial workload. I see. Thanks for the update. Please keep me posted. Best Regards, Huang, Ying

Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-17 Thread Huang, Ying
Minchan Kim <minc...@kernel.org> writes: > On Tue, Sep 13, 2016 at 04:53:49PM +0800, Huang, Ying wrote: >> Minchan Kim <minc...@kernel.org> writes: >> > On Tue, Sep 13, 2016 at 02:40:00PM +0800, Huang, Ying wrote: >> >> Minchan Kim <minc..

Re: [PATCH -v3 01/10] mm, swap: Make swap cluster size same of THP size on x86_64

2016-09-19 Thread Huang, Ying
Hi, Johannes, Johannes Weiner <han...@cmpxchg.org> writes: > On Thu, Sep 08, 2016 at 11:15:52AM +0530, Anshuman Khandual wrote: >> On 09/07/2016 10:16 PM, Huang, Ying wrote: >> > From: Huang Ying <ying.hu...@intel.com> >> > >> > In thi

Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-19 Thread Huang, Ying
Hi, Minchan, Minchan Kim <minc...@kernel.org> writes: > Hi Huang, > > On Sun, Sep 18, 2016 at 09:53:39AM +0800, Huang, Ying wrote: >> Minchan Kim <minc...@kernel.org> writes: >> >> > On Tue, Sep 13, 2016 at 04:53:49PM +0800, Huang, Ying wrote: >

Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-19 Thread Huang, Ying
Minchan Kim <minc...@kernel.org> writes: > Hi Huang, > > On Tue, Sep 20, 2016 at 10:54:35AM +0800, Huang, Ying wrote: >> Hi, Minchan, >> >> Minchan Kim <minc...@kernel.org> writes: >> > Hi Huang, >> > >> > On Sun, Sep 18, 20

Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-22 Thread Huang, Ying
Rik van Riel <r...@redhat.com> writes: > On Thu, 2016-09-22 at 15:56 -0700, Shaohua Li wrote: >> On Wed, Sep 07, 2016 at 09:45:59AM -0700, Huang, Ying wrote: >> >  >> > - It will help the memory fragmentation, especially when the THP is >> >   heavily us

Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-22 Thread Huang, Ying
Hi, Shaohua, Thanks for comments! Shaohua Li <s...@kernel.org> writes: > On Wed, Sep 07, 2016 at 09:45:59AM -0700, Huang, Ying wrote: >> >> The advantages of the THP swap support include: Sorry for the confusion. These are the advantages of the final goal, that is, avoid

[PATCH -v3 03/10] mm, memcg: Support to charge/uncharge multiple swap entries

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patch makes it possible to charge or uncharge a set of continuous swap entries in the swap cgroup. The number of swap entries is specified via an added parameter. This will be used for the THP (Transparent Huge Page) swap support. Where a swap c

[PATCH -v3 07/10] mm, THP, swap: Support to add/delete THP to/from swap cache

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> With this patch, a THP (Transparent Huge Page) can be added/deleted to/from the swap cache as a set of sub-pages (512 on x86_64). This will be used for the THP (Transparent Huge Page) swap support. Where one THP may be added/deleted to/from the swap

[PATCH -v3 09/10] mm, THP, swap: Support to split THP in swap cache

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patch enhanced the split_huge_page_to_list() to work properly for the THP (Transparent Huge Page) in the swap cache during swapping out. This is used for delaying splitting the THP during swapping out. Where for a THP to be swapped o

[PATCH -v3 10/10] mm, THP, swap: Delay splitting THP during swap out

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> In this patch, splitting huge page is delayed from almost the first step of swapping out to after allocating the swap space for the THP (Transparent Huge Page) and adding the THP into the swap cache. This will reduce lock acquiring/releasing for the

[PATCH -v3 08/10] mm, THP: Add can_split_huge_page()

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> Separates checking whether we can split the huge page from split_huge_page_to_list() into a function. This will help to check that before splitting the THP (Transparent Huge Page) really. This will be used for delaying splitting THP during swappi

[PATCH -v3 04/10] mm, THP, swap: Add swap cluster allocate/free functions

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> The swap cluster allocation/free functions are added based on the existing swap cluster management mechanism for SSD. These functions don't work for the rotating hard disks because the existing swap cluster management mechanism doesn't work fo

[PATCH -v3 05/10] mm, THP, swap: Add get_huge_swap_page()

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> A variation of get_swap_page(), get_huge_swap_page(), is added to allocate a swap cluster (512 swap slots) based on the swap cluster allocation function. A fairly simple algorithm is used, that is, only the first swap device in the priority list will be

[PATCH -v3 01/10] mm, swap: Make swap cluster size same of THP size on x86_64

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> In this patch, the size of the swap cluster is changed to that of the THP (Transparent Huge Page) on x86_64 architecture (512). This is for the THP swap support on x86_64. Where one swap cluster will be used to hold the contents of each THP swapp

[PATCH -v3 02/10] mm, memcg: Add swap_cgroup_iter iterator

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> The swap cgroup uses a kind of discontinuous array to record the information for the swap entries. lookup_swap_cgroup() provides a good encapsulation to access one element of the discontinuous array. To make it easier to access multiple el

[PATCH 1/2] mm, swap: Use offset of swap entry as key of swap cache

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patch is to improve the performance of swap cache operations when the type of the swap device is not 0. Originally, the whole swap entry value is used as the key of the swap cache, even though there is one radix tree for each swap device. If th
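To illustrate why the choice of key can matter, here is a rough sketch, not the kernel's code. It assumes a hypothetical 64-bit swap entry layout with the device type in the top 5 bits and the offset in the remaining bits, and a tree that consumes 6 index bits per level. All names and bit widths are assumptions made for the example. With a non-zero type, the whole entry value is a very large key, while the offset alone stays small.

```c
#include <stdint.h>

/* Hypothetical layout for illustration only: device type in the top 5 bits. */
#define TYPE_BITS   5
#define TYPE_SHIFT  (64 - TYPE_BITS)
#define OFFSET_MASK ((UINT64_C(1) << TYPE_SHIFT) - 1)

/* Pack a device type and an offset into one entry value. */
static inline uint64_t make_entry(unsigned type, uint64_t offset)
{
    return ((uint64_t)type << TYPE_SHIFT) | (offset & OFFSET_MASK);
}

/* Strip the type bits, leaving only the offset within the device. */
static inline uint64_t entry_offset(uint64_t entry)
{
    return entry & OFFSET_MASK;
}

/* Levels needed to index `key` in a tree using 6 index bits per level. */
static inline unsigned tree_levels(uint64_t key)
{
    unsigned levels = 1;

    while (key >>= 6)
        levels++;
    return levels;
}
```

Under these assumptions, an entry with type 1 and offset 1000 needs 10 levels when the whole value is the key, but only 2 levels when the offset is the key. Since each device already has its own tree, the type bits in the key carry no extra information.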

[PATCH 2/2] mm: Remove page_file_index

2016-09-07 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> After using the offset of the swap entry as the key of the swap cache, the page_index() becomes exactly the same as page_file_index(). So the page_file_index() is removed and the callers are changed to use page_index() instead. Cc: Trond Myk

Re: [LKP] [lkp] [x86/hweight] 65ea11ec6a: will-it-scale.per_process_ops 9.3% improvement

2016-08-17 Thread Huang, Ying
Borislav Petkov <b...@suse.de> writes: > On Wed, Aug 17, 2016 at 03:29:04PM -0700, Huang, Ying wrote: >> branch-miss-rate decreased from ~0.30% to ~0.043%. >> >> So I guess there are some code alignment change, which caused decreased >> branch miss rate.

Re: [RFC 00/11] THP swap: Delay splitting THP during swapping out

2016-08-23 Thread Huang, Ying
> On Mon, Aug 22, 2016 at 02:33:08PM -0700, Huang, Ying wrote: >> Hi, Minchan, >> >> Minchan Kim <minc...@kernel.org> writes: >> > Anyway, I hope [1/11] should be merged regardless of the patchset because >> > I believe anyone doesn't feel comfortable with cl

Re: [RFC 00/11] THP swap: Delay splitting THP during swapping out

2016-08-22 Thread Huang, Ying
for it? Best Regards, Huang, Ying

Re: [RFC][PATCH 0/3] locking/mutex: Rewrite basic mutex

2016-08-25 Thread huang ying
Hi, Peter, Do you have a git tree branch for this patchset? We want to test it in 0day performance test. That will make it a little easier. Best Regards, Huang, Ying

[PATCH] mm: Don't use radix tree writeback tags for pages in swap cache

2016-08-25 Thread Huang, Ying
mgor...@techsingularity.net> Cc: Tejun Heo <t...@kernel.org> Cc: Wu Fengguang <fengguang...@intel.com> Cc: Dave Hansen <dave.han...@intel.com> Signed-off-by: "Huang, Ying" <ying.hu...@intel.com> --- mm/page-writeback.c | 6 -- 1 file changed, 4 insertion

Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression

2016-08-24 Thread Huang, Ying
ap write test case for a ramdisk on a Xeon E5 v3 machine, the swap out throughput improved by 40.4%, from ~0.97GB/s to ~1.36GB/s. What's your plan for this patch? If it can be merged soon, that will be great! I found some issues when using the original patch with the swap cache. Below are my fixes to make

Re: [RFC] mm: Don't use radix tree writeback tags for pages in swap cache

2016-08-24 Thread Huang, Ying
"Huang, Ying" <ying.hu...@intel.com> writes: > Hi, Dave, > > Dave Hansen <dave.han...@intel.com> writes: > >> On 08/09/2016 09:17 AM, Huang, Ying wrote: >>> File pages uses a set of radix tags (DIRTY, TOWRITE, WRITEBACK) to >>> accelerat

Re: [LKP] [lkp] [f2fs] ec795418c4: fsmark.files_per_sec -36.3% regression

2016-08-24 Thread huang ying
Hi, Jaegeuk, On Thu, Aug 11, 2016 at 6:22 PM, Jaegeuk Kim <jaeg...@kernel.org> wrote: > On Thu, Aug 11, 2016 at 03:49:41PM -0700, Huang, Ying wrote: >> Hi, Kim, >> >> "Huang, Ying" <ying.hu...@intel.com> writes: >> >> >> >> [lkp

[PATCH] mm, swap: Add swap_cluster_list

2016-08-24 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This is a code clean up patch without functionality changes. The swap_cluster_list data structure and its operations are introduced to provide some better encapsulation for the free cluster and discard cluster list operations. This avoid som

[PATCH -v4 00/10] THP swap: Delay splitting THP during swapping out

2016-09-29 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patchset is to optimize the performance of Transparent Huge Page (THP) swap. Hi, Andrew, could you help me to check whether the overall design is reasonable? Hi, Hugh, Shaohua, Minchan and Rik, could you help me to review the swa

[PATCH -v4 2/9] mm, memcg: Support to charge/uncharge multiple swap entries

2016-09-29 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patch makes it possible to charge or uncharge a set of continuous swap entries in the swap cgroup. The number of swap entries is specified via an added parameter. This will be used for the THP (Transparent Huge Page) swap support. Where a swap c

[PATCH -v4 8/9] mm, THP, swap: Support to split THP in swap cache

2016-09-29 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patch enhanced the split_huge_page_to_list() to work properly for the THP (Transparent Huge Page) in the swap cache during swapping out. This is used for delaying splitting the THP during swapping out. Where for a THP to be swapped o

[PATCH] Subject: [PATCH -v4] THP swap: Delay splitting THP during swapping out

2016-09-29 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> Johannes suggested that I use two big patches instead of 9 patches, as he feels that is easier for him to review. I am not sure whether this is desirable for other reviewers too. So I sent out both versions for review. If this version is pref

[PATCH -v4 6/9] mm, THP, swap: Support to add/delete THP to/from swap cache

2016-09-29 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> With this patch, a THP (Transparent Huge Page) can be added/deleted to/from the swap cache as a set of (HPAGE_PMD_NR) sub-pages. This will be used for the THP (Transparent Huge Page) swap support. Where one THP may be added/deleted to/from the swap

[PATCH -v4 3/9] mm, THP, swap: Add swap cluster allocate/free functions

2016-09-29 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> The swap cluster allocation/free functions are added based on the existing swap cluster management mechanism for SSD. These functions don't work for the rotating hard disks because the existing swap cluster management mechanism doesn't work fo

[PATCH -v4 1/9] mm, swap: Make swap cluster size same of THP size on x86_64

2016-09-29 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> In this patch, the size of the swap cluster is changed to that of the THP (Transparent Huge Page) on x86_64 architecture (512). This is for the THP swap support on x86_64. Where one swap cluster will be used to hold the contents of each THP swapp

[PATCH -v4 4/9] mm, THP, swap: Add get_huge_swap_page()

2016-09-29 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> A variation of get_swap_page(), get_huge_swap_page(), is added to allocate a swap cluster (HPAGE_PMD_NR swap slots) based on the swap cluster allocation function. A fairly simple algorithm is used, that is, only the first swap device in priorit

[PATCH -v4 5/9] mm, THP, swap: Support to clear SWAP_HAS_CACHE for huge page

2016-09-29 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> __swapcache_free() is added to support to clear the SWAP_HAS_CACHE flag for the huge page. This will free the specified swap cluster now. Because now this function will be called only in the error path to free the swap cluster just allocate

[PATCH -v4 9/9] mm, THP, swap: Delay splitting THP during swap out

2016-09-29 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> In this patch, splitting huge page is delayed from almost the first step of swapping out to after allocating the swap space for the THP (Transparent Huge Page) and adding the THP into the swap cache. This will reduce lock acquiring/releasing for the

[PATCH -v4 7/9] mm, THP: Add can_split_huge_page()

2016-09-29 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> Separates checking whether we can split the huge page from split_huge_page_to_list() into a function. This will help to check that before splitting the THP (Transparent Huge Page) really. This will be used for delaying splitting THP during swappi

Re: [PATCH 2/8] mm/swap: Add cluster lock

2016-09-28 Thread Huang, Ying
goto new_cluster; >> +} >> +ci = lock_cluster(si, tmp); >> +while (tmp < max) { > > In this work tmp is checked to be less than the max value. > Semantic change hoped? Oops! tmp should be checked to be more than the min value. Will fix it in

Re: [LKP] [lkp] [perf powerpc] 18d1796d0b: [No primary change]

2016-10-25 Thread Huang, Ying
n't I still find that? These reports suck! There are no observable changes between the benchmark (will-it-scale) scores. That is what the subject of the mail says: "[No primary change]". But apparently, that is not clear. We will improve that to make it clearer. > The result doesn't make sense, my gcc inlines the function call, the > emitted code is very similar to the old code, with exception of one > extra symbol. > > Are you sure this isn't simple run to run variation? The reported change is perf-stat.branch-miss-rate%, which changed from 0.19% to 0.21%. That is too small. So, please ignore this report. We will be more careful in the future. Best Regards, Huang, Ying

Re: [PATCH v2 2/8] mm/swap: Add cluster lock

2016-10-24 Thread Huang, Ying
Hi, Jonathan, Thanks for review. Jonathan Corbet <cor...@lwn.net> writes: > On Thu, 20 Oct 2016 16:31:41 -0700 > Tim Chen <tim.c.c...@linux.intel.com> wrote: > >> From: "Huang, Ying" <ying.hu...@intel.com> >> >> This patch is to reduce the

Re: [PATCH -v5 0/9] THP swap: Delay splitting THP during swapping out

2016-11-21 Thread Huang, Ying
"Kirill A. Shutemov" <kir...@shutemov.name> writes: > On Wed, Nov 16, 2016 at 11:10:48AM +0800, Huang, Ying wrote: >> From: Huang Ying <ying.hu...@intel.com> >> >> This patchset is to optimize the performance of Transparent Huge Page >> (THP) sw

Re: GHES platform devices

2016-11-16 Thread Huang, Ying
robably puts gunk in sysfs that we don't need. Although other error sources are not platform devices, I think it is generally good to make GHES platform devices. To take advantage of automatic module loading, we can make ghes a module again, but prevent it from unloading. What do you think about that? Best Regards, Huang, Ying

[PATCH -v5 2/9] mm, memcg: Support to charge/uncharge multiple swap entries

2016-11-15 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patch makes it possible to charge or uncharge a set of continuous swap entries in the swap cgroup. The number of swap entries is specified via an added parameter. This will be used for the THP (Transparent Huge Page) swap support. Where a swap c

[PATCH -v5 9/9] mm, THP, swap: Delay splitting THP during swap out

2016-11-15 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> In this patch, splitting huge page is delayed from almost the first step of swapping out to after allocating the swap space for the THP (Transparent Huge Page) and adding the THP into the swap cache. This will reduce lock acquiring/releasing for the

[PATCH -v5 7/9] mm, THP: Add can_split_huge_page()

2016-11-15 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> Separates checking whether we can split the huge page from split_huge_page_to_list() into a function. This will help to check that before splitting the THP (Transparent Huge Page) really. This will be used for delaying splitting THP during swappi

[PATCH -v5 6/9] mm, THP, swap: Support to add/delete THP to/from swap cache

2016-11-15 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> With this patch, a THP (Transparent Huge Page) can be added/deleted to/from the swap cache as a set of (HPAGE_PMD_NR) sub-pages. This will be used for the THP (Transparent Huge Page) swap support. Where one THP may be added/deleted to/from the swap

[PATCH -v5 1/9] mm, swap: Make swap cluster size same of THP size on x86_64

2016-11-15 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> In this patch, the size of the swap cluster is changed to that of the THP (Transparent Huge Page) on x86_64 architecture (512). This is for the THP swap support on x86_64. Where one swap cluster will be used to hold the contents of each THP swapp

[PATCH -v5 0/9] THP swap: Delay splitting THP during swapping out

2016-11-15 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patchset is to optimize the performance of Transparent Huge Page (THP) swap. Hi, Andrew, could you help me to check whether the overall design is reasonable? Hi, Hugh, Shaohua, Minchan and Rik, could you help me to review the swa

[PATCH -v5 4/9] mm, THP, swap: Add get_huge_swap_page()

2016-11-15 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> A variation of get_swap_page(), get_huge_swap_page(), is added to allocate a swap cluster (HPAGE_PMD_NR swap slots) based on the swap cluster allocation function. A fairly simple algorithm is used, that is, only the first swap device in priorit

[PATCH -v5 8/9] mm, THP, swap: Support to split THP in swap cache

2016-11-15 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patch enhances split_huge_page_to_list() to work properly for the THP (Transparent Huge Page) in the swap cache during swapping out. This is used for delaying splitting the THP during swapping out. Where for a THP to be swapped o

[PATCH -v5 3/9] mm, THP, swap: Add swap cluster allocate/free functions

2016-11-15 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> The swap cluster allocation/free functions are added based on the existing swap cluster management mechanism for SSD. These functions don't work for the rotating hard disks because the existing swap cluster management mechanism doesn't work fo

[PATCH -v5 5/9] mm, THP, swap: Support to clear SWAP_HAS_CACHE for huge page

2016-11-15 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> __swapcache_free() is added to support clearing the SWAP_HAS_CACHE flag for the huge page. This will free the specified swap cluster now. Because now this function will be called only in the error path to free the swap cluster just allocate

Re: [PATCH -v4 RESEND 8/9] mm, THP, swap: Support to split THP in swap cache

2016-10-30 Thread Huang, Ying
Hillf Danton <hillf...@alibaba-inc.com> writes: > On Friday, October 28, 2016 1:56 PM Huang, Ying wrote: >> @@ -2016,10 +2021,12 @@ int page_trans_huge_mapcount(struct page *page, int >> *total_mapcount) >> /* Racy check whether the huge page can be split */ >>

[PATCH -v4 RESEND 3/9] mm, THP, swap: Add swap cluster allocate/free functions

2016-10-27 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> The swap cluster allocation/free functions are added based on the existing swap cluster management mechanism for SSD. These functions don't work for the rotating hard disks because the existing swap cluster management mechanism doesn't work fo

[PATCH -v4 RESEND 8/9] mm, THP, swap: Support to split THP in swap cache

2016-10-27 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patch enhances split_huge_page_to_list() to work properly for the THP (Transparent Huge Page) in the swap cache during swapping out. This is used for delaying splitting the THP during swapping out. Where for a THP to be swapped o

[PATCH -v4 RESEND 5/9] mm, THP, swap: Support to clear SWAP_HAS_CACHE for huge page

2016-10-27 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> __swapcache_free() is added to support clearing the SWAP_HAS_CACHE flag for the huge page. This will free the specified swap cluster now. Because now this function will be called only in the error path to free the swap cluster just allocate

[PATCH -v4 RESEND 1/9] mm, swap: Make swap cluster size same of THP size on x86_64

2016-10-27 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> In this patch, the size of the swap cluster is changed to that of the THP (Transparent Huge Page) on x86_64 architecture (512). This is for the THP swap support on x86_64. Where one swap cluster will be used to hold the contents of each THP swapp

[PATCH -v4 RESEND 2/9] mm, memcg: Support to charge/uncharge multiple swap entries

2016-10-28 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patch makes it possible to charge or uncharge a set of contiguous swap entries in the swap cgroup. The number of swap entries is specified via an added parameter. This will be used for the THP (Transparent Huge Page) swap support. Where a swap c

[PATCH -v4 RESEND 0/9] THP swap: Delay splitting THP during swapping out

2016-10-27 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> This patchset is to optimize the performance of Transparent Huge Page (THP) swap. Hi, Andrew, could you help me to check whether the overall design is reasonable? Hi, Hugh, Shaohua, Minchan and Rik, could you help me to review the swa

[PATCH -v4 RESEND 9/9] mm, THP, swap: Delay splitting THP during swap out

2016-10-27 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> In this patch, splitting huge page is delayed from almost the first step of swapping out to after allocating the swap space for the THP (Transparent Huge Page) and adding the THP into the swap cache. This will reduce lock acquiring/releasing for the

[PATCH -v4 RESEND 4/9] mm, THP, swap: Add get_huge_swap_page()

2016-10-27 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> A variation of get_swap_page(), get_huge_swap_page(), is added to allocate a swap cluster (HPAGE_PMD_NR swap slots) based on the swap cluster allocation function. A fairly simple algorithm is used, that is, only the first swap device in priorit

[PATCH -v4 RESEND 7/9] mm, THP: Add can_split_huge_page()

2016-10-27 Thread Huang, Ying
From: Huang Ying <ying.hu...@intel.com> Separate the check of whether the huge page can be split from split_huge_page_to_list() into its own function. This makes it possible to perform the check before actually splitting the THP (Transparent Huge Page). This will be used for delaying splitting THP during swappi
