Re: [PATCH -v4 RESEND 8/9] mm, THP, swap: Support to split THP in swap cache

2016-10-30 Thread Huang, Ying
Hillf Danton  writes:

> On Friday, October 28, 2016 1:56 PM Huang, Ying wrote: 
>> @@ -2016,10 +2021,12 @@ int page_trans_huge_mapcount(struct page *page, int *total_mapcount)
>>  /* Racy check whether the huge page can be split */
>>  bool can_split_huge_page(struct page *page)
>>  {
>> -int extra_pins = 0;
>> +int extra_pins;
>> 
>>  /* Additional pins from radix tree */
>> -if (!PageAnon(page))
>> +if (PageAnon(page))
>> +extra_pins = PageSwapCache(page) ? HPAGE_PMD_NR : 0;
>> +else
>>  extra_pins = HPAGE_PMD_NR;
>
> extra_pins is computed in this newly added helper.
>
>>  return total_mapcount(page) == page_count(page) - extra_pins - 1;
>>  }
>> @@ -2072,7 +2079,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>>  ret = -EBUSY;
>>  goto out;
>>  }
>> -extra_pins = 0;
>> +extra_pins = PageSwapCache(head) ? HPAGE_PMD_NR : 0;
>
> It is also computed at the call site, so can we fold them into one?

Sounds reasonable.  I will add another argument to can_split_huge_page()
to return extra_pins, so we can avoid the duplicated code and computation.

Best Regards,
Huang, Ying

>>  mapping = NULL;
>>  anon_vma_lock_write(anon_vma);
>>  } else {
>> --
>> 2.9.3


Re: [PATCH -v4 RESEND 8/9] mm, THP, swap: Support to split THP in swap cache

2016-10-28 Thread Hillf Danton
On Friday, October 28, 2016 1:56 PM Huang, Ying wrote: 
> @@ -2016,10 +2021,12 @@ int page_trans_huge_mapcount(struct page *page, int *total_mapcount)
>  /* Racy check whether the huge page can be split */
>  bool can_split_huge_page(struct page *page)
>  {
> - int extra_pins = 0;
> + int extra_pins;
> 
>   /* Additional pins from radix tree */
> - if (!PageAnon(page))
> + if (PageAnon(page))
> + extra_pins = PageSwapCache(page) ? HPAGE_PMD_NR : 0;
> + else
>   extra_pins = HPAGE_PMD_NR;

extra_pins is computed in this newly added helper.

>   return total_mapcount(page) == page_count(page) - extra_pins - 1;
>  }
> @@ -2072,7 +2079,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>   ret = -EBUSY;
>   goto out;
>   }
> - extra_pins = 0;
> + extra_pins = PageSwapCache(head) ? HPAGE_PMD_NR : 0;

It is also computed at the call site, so can we fold them into one?

>   mapping = NULL;
>   anon_vma_lock_write(anon_vma);
>   } else {
> --
> 2.9.3




[PATCH -v4 RESEND 8/9] mm, THP, swap: Support to split THP in swap cache

2016-10-27 Thread Huang, Ying
From: Huang Ying 

This patch enhances split_huge_page_to_list() to work properly for a
THP (Transparent Huge Page) in the swap cache during swap out.

This is used to delay splitting a THP during swap out: to swap out a
THP, we first allocate a swap cluster, add the THP to the swap cache,
and then split it.  The page lock is held throughout this process, so
in any code path other than swap out, if a THP needs to be split,
PageSwapCache(THP) will always be false.

Cc: Andrea Arcangeli 
Cc: Kirill A. Shutemov 
Cc: Ebru Akagunduz 
Signed-off-by: "Huang, Ying" 
---
 mm/huge_memory.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 326b145..199eaba 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1831,7 +1831,7 @@ static void __split_huge_page_tail(struct page *head, int tail,
 * atomic_set() here would be safe on all archs (and not only on x86),
 * it's safer to use atomic_inc()/atomic_add().
 */
-   if (PageAnon(head)) {
+   if (PageAnon(head) && !PageSwapCache(head)) {
page_ref_inc(page_tail);
} else {
/* Additional pin to radix tree */
@@ -1842,6 +1842,7 @@ static void __split_huge_page_tail(struct page *head, int tail,
page_tail->flags |= (head->flags &
((1L << PG_referenced) |
 (1L << PG_swapbacked) |
+(1L << PG_swapcache) |
 (1L << PG_mlocked) |
 (1L << PG_uptodate) |
 (1L << PG_active) |
@@ -1904,7 +1905,11 @@ static void __split_huge_page(struct page *page, struct list_head *list,
ClearPageCompound(head);
/* See comment in __split_huge_page_tail() */
if (PageAnon(head)) {
-   page_ref_inc(head);
+   /* Additional pin to radix tree of swap cache */
+   if (PageSwapCache(head))
+   page_ref_add(head, 2);
+   else
+   page_ref_inc(head);
} else {
/* Additional pin to radix tree */
page_ref_add(head, 2);
@@ -2016,10 +2021,12 @@ int page_trans_huge_mapcount(struct page *page, int *total_mapcount)
 /* Racy check whether the huge page can be split */
 bool can_split_huge_page(struct page *page)
 {
-   int extra_pins = 0;
+   int extra_pins;
 
/* Additional pins from radix tree */
-   if (!PageAnon(page))
+   if (PageAnon(page))
+   extra_pins = PageSwapCache(page) ? HPAGE_PMD_NR : 0;
+   else
extra_pins = HPAGE_PMD_NR;
return total_mapcount(page) == page_count(page) - extra_pins - 1;
 }
@@ -2072,7 +2079,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
ret = -EBUSY;
goto out;
}
-   extra_pins = 0;
+   extra_pins = PageSwapCache(head) ? HPAGE_PMD_NR : 0;
mapping = NULL;
anon_vma_lock_write(anon_vma);
} else {
-- 
2.9.3


