From: Huang Ying <ying.hu...@intel.com>
__swapcache_free() is added to support clearing the SWAP_HAS_CACHE flag
for a huge page. This will free the specified swap cluster now.
Because now this function will be called only in the error path to free
the swap cluster just allocate
From: Huang Ying <ying.hu...@intel.com>
With this patch, a THP (Transparent Huge Page) can be added/deleted
to/from the swap cache as a set of (HPAGE_PMD_NR) sub-pages.
This will be used for the THP (Transparent Huge Page) swap support.
Where one THP may be added/deleted to/from the swap
From: Huang Ying <ying.hu...@intel.com>
A variation of get_swap_page(), get_huge_swap_page(), is added to
allocate a swap cluster (HPAGE_PMD_NR swap slots) based on the swap
cluster allocation function. A fairly simple algorithm is used, that is,
only the first swap device in priorit
From: Huang Ying <ying.hu...@intel.com>
The swap cluster allocation/free functions are added based on the
existing swap cluster management mechanism for SSD. These functions
don't work for rotating hard disks because the existing swap cluster
management mechanism doesn't work fo
Hi, Matthew,
Matthew Wilcox <wi...@infradead.org> writes:
> On Wed, Apr 05, 2017 at 03:10:58PM +0800, Huang, Ying wrote:
>> In general, kmalloc() will have less memory fragmentation than
>> vmalloc(). From Dave Hansen: For example, we have a two-page data
>> stru
From: Huang Ying <ying.hu...@intel.com>
To reduce the lock contention on swap_info_struct->lock when freeing
swap entries, the freed swap entries will be collected in a per-CPU
buffer first, and be really freed later in batch. During the batch
freeing, if the consecutive swap entries i
From: Huang Ying <ying.hu...@intel.com>
Now vzalloc() is used in swap code to allocate various data
structures, such as swap cache, swap slots cache, cluster info, etc.
Because the size may be too large on some systems, normal
kzalloc() may fail. But using kzalloc() has some adva
Johannes Weiner <han...@cmpxchg.org> writes:
> On Sat, Apr 15, 2017 at 09:17:04AM +0800, Huang, Ying wrote:
>> Hi, Johannes,
>>
>> Johannes Weiner <han...@cmpxchg.org> writes:
>>
>> > Hi Huang,
>> >
>> > I reviewed this patch bas
Minchan Kim <minc...@kernel.org> writes:
> Hi Huang,
>
> On Fri, Apr 07, 2017 at 02:49:01PM +0800, Huang, Ying wrote:
>> From: Huang Ying <ying.hu...@intel.com>
>>
>> To reduce the lock contention of swap_info_struct->lock when freeing
>> swap
Christopher Lameter <c...@linux.com> writes:
> On Wed, 2 Aug 2017, Huang, Ying wrote:
>
>> --- a/include/linux/percpu.h
>> +++ b/include/linux/percpu.h
>> @@ -129,5 +129,8 @@ extern phys_addr_t per_cpu_ptr_to_phys(void *addr);
From: Huang Ying <ying.hu...@intel.com>
struct call_single_data is used in IPI to transfer information between
CPUs. Its size is bigger than sizeof(unsigned long) and less than
cache line size. Now, it is allocated without any alignment
requirement. This makes it possible for all
From: Huang Ying <ying.hu...@intel.com>
To use the newly introduced alloc_percpu_aligned(), which can allocate
cache line size aligned percpu memory dynamically.
Signed-off-by: "Huang, Ying" <ying.hu...@intel.com>
Cc: Joerg Roedel <j...@8bytes.org>
Cc: io..
From: Huang Ying <ying.hu...@intel.com>
struct call_single_data is used in IPI to transfer information between
CPUs. Its size is bigger than sizeof(unsigned long) and less than
cache line size. Now, it is allocated without any alignment
requirement. This makes it possible for all
From: Huang Ying <ying.hu...@intel.com>
To allocate percpu memory that is aligned to the cache line size
dynamically. We can statically allocate percpu memory that is aligned
to the cache line size with DEFINE_PER_CPU_ALIGNED(), but we have no
corresponding API for dynamic allocation.
Sign
Eric Dumazet <eric.duma...@gmail.com> writes:
> On Wed, 2017-08-02 at 16:52 +0800, Huang, Ying wrote:
>> From: Huang Ying <ying.hu...@intel.com>
>>
>> struct call_single_data is used in IPI to transfer information between
>> CPUs. Its size is bigger t
"Huang, Ying" <ying.hu...@intel.com> writes:
> Peter Zijlstra <pet...@infradead.org> writes:
> [snip]
>> diff --git a/include/linux/smp.h b/include/linux/smp.h
>> index 68123c1fe549..8d817cb80a38 100644
>> --- a/include/linux/smp.h
>> +++ b/inc
__aligned(sizeof(struct __call_single_data));
> +
Another requirement of the alignment is that it should be a power of
2. Otherwise, for example, if someone adds a field to the struct so
that the size becomes 40 on x86_64, the alignment should be 64 instead
of 40.
Best Regards,
Huang, Ying
From: Huang Ying <ying.hu...@intel.com>
Huge page helps to reduce TLB miss rate, but it has higher cache
footprint, which may sometimes cause issues. For example, when
clearing a huge page on the x86_64 platform, the cache footprint is 2M. But
on a Xeon E5 v3 2699 CPU, there are 18 cor
Hi, Peter,
"Huang, Ying" <ying.hu...@intel.com> writes:
> Peter Zijlstra <pet...@infradead.org> writes:
>
>> On Sat, Aug 05, 2017 at 08:47:02AM +0800, Huang, Ying wrote:
>>> Yes. That looks good. So you will prepare the final patch? Or you
>>
Dave Hansen <dave.han...@intel.com> writes:
> On 06/29/2017 06:44 PM, Huang, Ying wrote:
>>
>> static atomic_t swapin_readahead_hits = ATOMIC_INIT(4);
>> +static atomic_long_t swapin_readahead_hits_total = ATOMIC_INIT(0);
>> +static atomic_long_t swapi
be onlined
alloc_swap_slot_cache()
mutex_lock(cache[B]->alloc_lock)
mutex_init(cache[B]->alloc_lock) !!!
The cache[B]->alloc_lock will be reinitialized while it is still held.
Best Regards,
Huang, Ying
> Reporte
From: Huang Ying <ying.hu...@intel.com>
When swapping out THP (Transparent Huge Page), instead of swapping out
the THP as a whole, sometimes we have to fall back to splitting the THP
into normal pages before swapping, because no free swap clusters are
available, or cgroup limit is exceede
From: Huang Ying <ying.hu...@intel.com>
This patch makes mem_cgroup_swapout() work for the transparent huge
page (THP), which will move the memory cgroup charge from memory to
swap for a THP.
This will be used for the THP swap support. Where a THP may be
swapped out as a whole to
From: Huang Ying <ying.hu...@intel.com>
After adding swapping out support for THP (Transparent Huge Page), it
is possible that a THP in the swap cache (partly swapped out) needs to be
split. To split such a THP, the swap cluster backing the THP needs to
be split too, that is, the CLUSTER_FLA
From: Huang Ying <ying.hu...@intel.com>
In this patch, splitting transparent huge page (THP) during swapping
out is delayed from after adding the THP into the swap cache to after
swapping out finishes. After the patch, more operations for the
anonymous THP reclaiming, such as writing t
From: Huang Ying <ying.hu...@intel.com>
To support delaying splitting a THP (Transparent Huge Page) until
after it is swapped out, we need to enhance the swap writing code to
write a THP as a whole. This will improve swap write IO performance. As
Ming Lei <ming@redhat.com>
From: Huang Ying <ying.hu...@intel.com>
The .rw_page in struct block_device_operations is used by the swap
subsystem to read/write the page contents from/into the corresponding
swap slot in the swap device. To support the THP (Transparent Huge
Page) swap optimization, the .rw_page is en
From: Huang Ying <ying.hu...@intel.com>
After supporting to delay THP (Transparent Huge Page) splitting after
swapped out, it is possible that some page table mappings of the THP
are turned into swap entries. So reuse_swap_page() needs to check the
swap count in addition to the map
From: Huang Ying <ying.hu...@intel.com>
It's hard to write a whole transparent huge page (THP) to a file
backed swap device during swapping out and the file backed swap device
isn't very popular. So the huge cluster allocation for the file
backed swap device is disabled.
Signed-off-by:
From: Huang Ying <ying.hu...@intel.com>
PTE mapped THP (Transparent Huge Page) will be ignored when moving
memory cgroup charge. But for THP which is in the swap cache, the
memory cgroup charge for the swap of a tail-page may be moved in the
current implementation. That isn't correct, b
From: Huang Ying <ying.hu...@intel.com>
For a THP (Transparent Huge Page), tail_page->mem_cgroup is NULL. So
to check whether the page is charged already, we need to check the
head page. This was not an issue before because it was impossible for a
THP to be in the swap cache before. But
(before core_initcall()?). So, do you think it is possible to use
ftrace to measure secondary CPU bootup time?
Thanks,
Huang, Ying
From: Huang Ying <ying.hu...@intel.com>
The normal swap slot reclaiming can be done when the swap count
reaches SWAP_HAS_CACHE. But for a swap slot which is backing a THP,
all swap slots backing that THP must be reclaimed together, because the
swap slot may be used again when the THP is s
From: Huang Ying <ying.hu...@intel.com>
Previously, swapcache_free_cluster() is used only in the error path of
shrink_page_list() to free the swap cluster just allocated if the
THP (Transparent Huge Page) fails to be split. In this patch, it
is enhanced to clear the swap cach
From: Huang Ying <ying.hu...@intel.com>
Hi, Andrew, could you help me to check whether the overall design is
reasonable?
Hi, Johannes and Minchan, thanks a lot for your review of the first
step of the THP swap optimization! Could you help me to review the
second step in this patchset?
Hi
Andrew Morton <a...@linux-foundation.org> writes:
> On Fri, 23 Jun 2017 15:12:51 +0800 "Huang, Ying" <ying.hu...@intel.com> wrote:
>
>> From: Huang Ying <ying.hu...@intel.com>
>>
>> Hi, Andrew, could you help me to check whether the
Steven Rostedt <rost...@goodmis.org> writes:
> On Mon, 24 Jul 2017 13:46:07 +0800
> "Huang\, Ying" <ying.hu...@intel.com> wrote:
>
>> Hi, Steven,
>>
>> We are working on parallelizing secondary CPU bootup. So we need to
>> measure the bo
From: Huang Ying <ying.hu...@intel.com>
The swap readahead is an important mechanism to reduce the swap-in
latency. Although pure sequential memory access pattern isn't very
popular for anonymous memory, the space locality is still considered
valid.
In the original swap readahead implemen
From: Huang Ying <ying.hu...@intel.com>
VMA based swap readahead will read ahead the virtual pages that are
contiguous in the virtual address space, while the original swap
readahead reads ahead the swap slots that are contiguous in the swap
device. Although VMA based swap readahead i
From: Huang Ying <ying.hu...@intel.com>
The swap cache stats could be obtained only via sysrq, which isn't
convenient in some situations. So a sysfs interface for the swap cache
stats is added. The added sysfs directories/files are as
follows,
/sys/kernel/mm/swap
/sys/kernel/m
From: Huang Ying <ying.hu...@intel.com>
The sysfs interface to control the VMA based swap readahead is added
as follows,
/sys/kernel/mm/swap/vma_ra_enabled
Enable the VMA based swap readahead algorithm, or use the original
global swap readahead algorithm.
/sys/kernel/mm/swap/vma_ra_max
From: Huang Ying <ying.hu...@intel.com>
The statistics for total readahead pages and total readahead hits are
recorded and exported via the following sysfs interface.
/sys/kernel/mm/swap/ra_hits
/sys/kernel/mm/swap/ra_total
With them, the efficiency of the swap readahead could be measur
From: Huang Ying <ying.hu...@intel.com>
In the original implementation, it is possible that the existing pages
in the swap cache (not newly readahead) could be marked as the
readahead pages. This will cause the statistics of swap readahead to be
wrong and influence the swap readahead algorit
is high, shows that the space
locality is still valid in some practical workloads.
Changelogs:
v3:
- Rebased on latest -mm tree
- Use percpu_counter for swap readahead statistics per Dave Hansen's comment.
Best Regards,
Huang, Ying
The swap readahead is an important mechanism to reduce the swap-in
latency. Although pure sequential memory access pattern isn't very
popular for anonymous memory, the space locality is still considered
valid.
In the original swap readahead implementation, the consecutive blocks
in swap device
From: Huang Ying <ying.hu...@intel.com>
The swap cache stats could be obtained only via sysrq, which isn't
convenient in some situations. So a sysfs interface for the swap cache
stats is added. The added sysfs directories/files are as
follows,
/sys/kernel/mm/swap
/sys/kernel/m
From: Huang Ying <ying.hu...@intel.com>
The sysfs interface to control the VMA based swap readahead is added
as follows,
/sys/kernel/mm/swap/vma_ra_enabled
Enable the VMA based swap readahead algorithm, or use the original
global swap readahead algorithm.
/sys/kernel/mm/swap/vma_ra_max
From: Huang Ying <ying.hu...@intel.com>
The swap readahead is an important mechanism to reduce the swap-in
latency. Although pure sequential memory access pattern isn't very
popular for anonymous memory, the space locality is still considered
valid.
In the original swap readahead implemen
From: Huang Ying <ying.hu...@intel.com>
VMA based swap readahead will read ahead the virtual pages that are
contiguous in the virtual address space, while the original swap
readahead reads ahead the swap slots that are contiguous in the swap
device. Although VMA based swap readahead i
From: Huang Ying <ying.hu...@intel.com>
The statistics for total readahead pages and total readahead hits are
recorded and exported via the following sysfs interface.
/sys/kernel/mm/swap/ra_hits
/sys/kernel/mm/swap/ra_total
With them, the efficiency of the swap readahead could be measur
From: Huang Ying <ying.hu...@intel.com>
In the original implementation, it is possible that the existing pages
in the swap cache (not newly readahead) could be marked as the
readahead pages. This will cause the statistics of swap readahead to be
wrong and influence the swap readahead algorit
Hi, Johannes,
Do you have time to take a look at this patchset?
Best Regards,
Huang, Ying
[snip]
Minchan Kim <minc...@kernel.org> writes:
> On Fri, Apr 21, 2017 at 08:29:30PM +0800, Huang, Ying wrote:
>> "Huang, Ying" <ying.hu...@intel.com> writes:
>>
>> > Minchan Kim <minc...@kernel.org> writes:
>> >
>> >> On Wed
Tim Chen <tim.c.c...@linux.intel.com> writes:
>>
>> From 7bd903c42749c448ef6acbbdee8dcbc1c5b498b9 Mon Sep 17 00:00:00 2001
>> From: Huang Ying <ying.hu...@intel.com>
>> Date: Thu, 23 Feb 2017 13:05:20 +0800
>> Subject: [PATCH -v5] mm, swap: Sort swap
Minchan Kim <minc...@kernel.org> writes:
> On Tue, Apr 25, 2017 at 08:56:56PM +0800, Huang, Ying wrote:
>> From: Huang Ying <ying.hu...@intel.com>
>>
>> In this patch, splitting huge page is delayed from almost the first
>> step of swapping out to after a
= 0; i < SWAPFILE_CLUSTER; i++) {
>> > + VM_BUG_ON(map[i] != SWAP_HAS_CACHE);
>> > + map[i] = 0;
>> > + }
>> > + unlock_cluster(ci);
>> > + mem_cgroup_uncharge_swap(entry, SWAPFILE_CLUSTER);
>> > + swap_free_cluster(si, idx);
>> > + spin_unlock(&si->lock);
>> > +}
>> > +#endif /* CONFIG_THP_SWAP */
>> > +
>> > static int swp_entry_cmp(const void *ent1, const void *ent2)
>> > {
>> >    const swp_entry_t *e1 = ent1, *e2 = ent2;
>>
>>
>> This is a massive patch, I presume you've got recommendations to keep it
>> this way?
>
> It used to be split into patches that introduce API and helpers on one
> hand and patches that use these functions on the other hand. That was
> impossible to review, because you had to jump between emails.
>
> If you have ideas about which parts could be split out and be
> stand-alone changes in their own right, I'd be all for that.
Best Regards,
Huang, Ying
From: Huang Ying <ying.hu...@intel.com>
This patchset is to optimize the performance of Transparent Huge Page
(THP) swap.
Recently, the performance of storage devices has improved so fast that
we cannot saturate the disk bandwidth with a single logical CPU when
doing page swap out, even on a hi
From: Huang Ying <ying.hu...@intel.com>
In this patch, splitting the huge page is delayed from almost the first
step of swapping out to after allocating the swap space for the
THP (Transparent Huge Page) and adding the THP into the swap cache.
This will batch the corresponding operation, thus i
From: Huang Ying <ying.hu...@intel.com>
To swap out a THP (Transparent Huge Page), before splitting the THP,
the swap cluster will be allocated and the THP will be added into the
swap cache. But it is possible that the THP cannot be split, so that
we must delete the THP from the swap
From: Huang Ying <ying.hu...@intel.com>
If there is no compound map for a THP (Transparent Huge Page), it is
possible that the map count of some sub-pages of the THP is 0. So it
is better to split the THP before swapping out. In this way, the
sub-pages not mapped will be freed, and we can
Minchan Kim <minc...@kernel.org> writes:
> On Fri, Apr 28, 2017 at 04:05:26PM +0800, Huang, Ying wrote:
>> Minchan Kim <minc...@kernel.org> writes:
>>
>> > On Fri, Apr 28, 2017 at 09:09:53AM +0800, Huang, Ying wrote:
>> >> Minchan Kim <minc..
Minchan Kim <minc...@kernel.org> writes:
> On Thu, Apr 27, 2017 at 03:12:34PM +0800, Huang, Ying wrote:
>> Minchan Kim <minc...@kernel.org> writes:
>>
>> > On Tue, Apr 25, 2017 at 08:56:56PM +0800, Huang, Ying wrote:
>> >> From: Huang Ying &
"Huang, Ying" <ying.hu...@intel.com> writes:
> Minchan Kim <minc...@kernel.org> writes:
>
>> On Fri, Apr 28, 2017 at 04:05:26PM +0800, Huang, Ying wrote:
>>> Minchan Kim <minc...@kernel.org> writes:
>>>
>>> > On Fri, Apr 28,
Minchan Kim <minc...@kernel.org> writes:
> On Wed, Apr 26, 2017 at 08:42:10PM +0800, Huang, Ying wrote:
>> Minchan Kim <minc...@kernel.org> writes:
>>
>> > On Fri, Apr 21, 2017 at 08:29:30PM +0800, Huang, Ying wrote:
>> >> "Huang, Ying"
Minchan Kim <minc...@kernel.org> writes:
> On Fri, Apr 28, 2017 at 09:09:53AM +0800, Huang, Ying wrote:
>> Minchan Kim <minc...@kernel.org> writes:
>>
>> > On Wed, Apr 26, 2017 at 08:42:10PM +0800, Huang, Ying wrote:
>> >> Minchan Kim <minc..
Peter Zijlstra <pet...@infradead.org> writes:
> On Fri, Aug 04, 2017 at 10:05:55AM +0800, Huang, Ying wrote:
>> "Huang, Ying" <ying.hu...@intel.com> writes:
>> > Peter Zijlstra <pet...@infradead.org> writes:
>
>> >>
Matthew Wilcox <wi...@infradead.org> writes:
> On Mon, Aug 07, 2017 at 03:21:31PM +0800, Huang, Ying wrote:
>> @@ -2509,7 +2509,8 @@ enum mf_action_page_type {
>> #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
>> extern void clear_
"Huang, Ying" <ying.hu...@intel.com> writes:
> "Kirill A. Shutemov" <kir...@shutemov.name> writes:
>
>> On Mon, Aug 07, 2017 at 03:21:31PM +0800, Huang, Ying wrote:
>>> From: Huang Ying <ying.hu...@intel.com>
>>>
>>> H
Mike Kravetz <mike.krav...@oracle.com> writes:
> On 08/07/2017 12:21 AM, Huang, Ying wrote:
>> From: Huang Ying <ying.hu...@intel.com>
>>
>> Huge page helps to reduce TLB miss rate, but it has higher cache
>> footprint, sometimes this may cause some iss
Christopher Lameter <c...@linux.com> writes:
> On Mon, 7 Aug 2017, Huang, Ying wrote:
>
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4374,9 +4374,31 @@ void clear_huge_page(struct page *page,
>> }
>>
>> might_sleep()
Peter Zijlstra <pet...@infradead.org> writes:
> On Sat, Aug 05, 2017 at 08:47:02AM +0800, Huang, Ying wrote:
>> Yes. That looks good. So you will prepare the final patch? Or you
>> hope me to do that?
>
> I was hoping you'd do it ;-)
Thanks! Here is the updated
[ 113.336121] ---[ end trace 2cd503b4980b0afc ]---
> [ 113.341281] Kernel panic - not syncing: Fatal exception
> [ 113.347398] Kernel Offset: 0x700 from 0x8100 (relocation
> range: 0x8000-0xbfff)
Thanks for reporting! Did you test it on an HDD? I ca
looks like a false positive report, and it is not reported by my
compiler or the kbuild compiler (gcc-6). But anyway, we should silence it.
Best Regards,
Huang, Ying
-->8--
From 7a7ff76d7bcbd7affda169b29abcf3dafa38052e Mon Sep 17 00:00:00 2001
From: Huang Ying <ying.hu...@in
Hi, Andrew,
Andrew Morton <a...@linux-foundation.org> writes:
> On Mon, 7 Aug 2017 15:21:31 +0800 "Huang, Ying" <ying.hu...@intel.com> wrote:
>
>> From: Huang Ying <ying.hu...@intel.com>
>>
>> Huge page helps to reduce TLB miss rate, but
Andrew Morton <a...@linux-foundation.org> writes:
> On Mon, 7 Aug 2017 13:40:34 +0800 "Huang, Ying" <ying.hu...@intel.com> wrote:
>
>> From: Huang Ying <ying.hu...@intel.com>
>>
>> The statistics for total readahead pages and total readahead
"Kirill A. Shutemov" <kir...@shutemov.name> writes:
> On Mon, Aug 07, 2017 at 03:21:31PM +0800, Huang, Ying wrote:
>> From: Huang Ying <ying.hu...@intel.com>
>>
>> Huge page helps to reduce TLB miss rate, but it has higher cache
>> footprint,
Christopher Lameter <c...@linux.com> writes:
> On Mon, 7 Aug 2017, Huang, Ying wrote:
>
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4374,9 +4374,31 @@ void clear_huge_page(struct page *page,
>> }
>>
>> might_sleep()
Hi, Andrew,
Andrew Morton <a...@linux-foundation.org> writes:
> On Tue, 25 Jul 2017 09:51:51 +0800 "Huang, Ying" <ying.hu...@intel.com> wrote:
>
>> From: Huang Ying <ying.hu...@intel.com>
>>
>> VMA based swap readahead will readahead th
Andrew Morton <a...@linux-foundation.org> writes:
> On Tue, 25 Jul 2017 09:51:46 +0800 "Huang, Ying" <ying.hu...@intel.com> wrote:
>
>> The swap cache stats could be gotten only via sysrq, which isn't
>> convenient in some situation. So the sysfs i
Hi, Rik,
Rik van Riel <r...@redhat.com> writes:
> On Tue, 2017-07-25 at 09:51 +0800, Huang, Ying wrote:
>> From: Huang Ying <ying.hu...@intel.com>
>>
>> The swap cache stats could be gotten only via sysrq, which isn't
>> convenient in some situation
From: Huang Ying <ying.hu...@intel.com>
VMA based swap readahead will read ahead the virtual pages that are
contiguous in the virtual address space, while the original swap
readahead reads ahead the swap slots that are contiguous in the swap
device. Although VMA based swap readahead i
From: Huang Ying <ying.hu...@intel.com>
Huge page helps to reduce TLB miss rate, but it has higher cache
footprint, which may sometimes cause issues. For example, when
clearing a huge page on the x86_64 platform, the cache footprint is 2M. But
on a Xeon E5 v3 2699 CPU, there are 18 cor
From: Huang Ying <ying.hu...@intel.com>
The sysfs interface to control the VMA based swap readahead is added
as follows,
/sys/kernel/mm/swap/vma_ra_enabled
Enable the VMA based swap readahead algorithm, or use the original
global swap readahead algorithm.
/sys/kernel/mm/swap/vma_ra_max
From: Huang Ying <ying.hu...@intel.com>
The statistics for total readahead pages and total readahead hits are
recorded and exported via the following sysfs interface.
/sys/kernel/mm/swap/ra_hits
/sys/kernel/mm/swap/ra_total
With them, the efficiency of the swap readahead could be measur
From: Huang Ying <ying.hu...@intel.com>
The swap readahead is an important mechanism to reduce the swap-in
latency. Although pure sequential memory access pattern isn't very
popular for anonymous memory, the space locality is still considered
valid.
In the original swap readahead implemen
statistics, because that is the
interface used by other similar statistics.
- Add ABI document for newly added sysfs interface.
v3:
- Rebased on latest -mm tree
- Use percpu_counter for swap readahead statistics per Dave Hansen's comment.
Best Regards,
Huang, Ying
From: Huang Ying <ying.hu...@intel.com>
In the original implementation, it is possible that the existing pages
in the swap cache (not newly readahead) could be marked as the
readahead pages. This will cause the statistics of swap readahead to be
wrong and influence the swap readahead algorit
Jan Kara <j...@suse.cz> writes:
> On Mon 07-08-17 15:21:31, Huang, Ying wrote:
>> From: Huang Ying <ying.hu...@intel.com>
>>
>> Huge page helps to reduce TLB miss rate, but it has higher cache
>> footprint, sometimes this may cause some issue. For exampl
ust want to get the latest status.
Best Regards,
Huang, Ying
From: Huang Ying <ying.hu...@intel.com>
In this patch, splitting the huge page is delayed from almost the first
step of swapping out to after allocating the swap space for the
THP (Transparent Huge Page) and adding the THP into the swap cache.
This will batch the corresponding operation, thus i
From: Huang Ying <ying.hu...@intel.com>
If there is no compound map for a THP (Transparent Huge Page), it is
possible that the map count of some sub-pages of the THP is 0. So it
is better to split the THP before swapping out. In this way, the
sub-pages not mapped will be freed, and we can
From: Huang Ying <ying.hu...@intel.com>
This patchset is to optimize the performance of Transparent Huge Page
(THP) swap.
Recently, the performance of storage devices has improved so fast that
we cannot saturate the disk bandwidth with a single logical CPU when
doing page swap out, even on a hi
From: Huang Ying <ying.hu...@intel.com>
To swap out a THP (Transparent Huge Page), before splitting the THP,
the swap cluster will be allocated and the THP will be added into the
swap cache. But it is possible that the THP cannot be split, so that
we must delete the THP from the swap
h base page which is more natural.
Acked-by: Johannes Weiner <han...@cmpxchg.org>
Signed-off-by: Minchan Kim <minc...@kernel.org>
Signed-off-by: "Huang, Ying" <ying.hu...@intel.com>
---
include/linux/swap.h | 4 ++--
mm/swap_state.c | 23 ++--
nction
depending on page's size.
[ying.hu...@intel.com: minor cleanup and fix]
Acked-by: Johannes Weiner <han...@cmpxchg.org>
Signed-off-by: Minchan Kim <minc...@kernel.org>
Signed-off-by: "Huang, Ying" <ying.hu...@intel.com>
---
include/linux/swap.h | 12 ++--
ansHuge(page) &&
>> > +split_huge_page_to_list(page, page_list)) {
>> > + delete_from_swap_cache(page);
>> >goto activate_locked;
>> > + }
>>
>> Pulling this out of add_to_swap() is an improvement for sure. Add an
>> XXX: before that "we don't support THP writes" comment for good
>> measure :)
>
> Sure.
>
> It could be a separate patch which makes add_to_swap clean via
> removing page_list argument but I hope Huang take/fold it when he
> resend it because it would be more important with THP swap.
Sure. I will take this patch as one patch of the THP swap series.
Because the first patch of the THP swap series is a little big, I don't
think it is a good idea to fold this patch into it. Could you update
the patch according to Johannes' comments and resend it?
Best Regards,
Huang, Ying
From: Huang Ying <ying.hu...@intel.com>
The normal swap slot reclaiming can be done when the swap count
reaches SWAP_HAS_CACHE. But for a swap slot which is backing a THP,
all swap slots backing that THP must be reclaimed together, because the
swap slot may be used again when the THP is s
From: Huang Ying <ying.hu...@intel.com>
To support delaying splitting a THP (Transparent Huge Page) until
after it is swapped out, we need to enhance the swap writing code to
write a THP as a whole. This will improve swap write IO performance. As
Ming Lei <ming@redhat.com>
From: Huang Ying <ying.hu...@intel.com>
After supporting to delay THP (Transparent Huge Page) splitting after
swapped out, it is possible that some page table mappings of the THP
are turned into swap entries. So reuse_swap_page() needs to check the
swap count in addition to the map
From: Huang Ying <ying.hu...@intel.com>
Previously, swapcache_free_cluster() is used only in the error path of
shrink_page_list() to free the swap cluster just allocated if the
THP (Transparent Huge Page) fails to be split. In this patch, it
is enhanced to clear the swap cach