Re: [PATCH v6 4/6] mm/swapcache: support to handle the exceptional entries in swapcache

2020-06-25 Thread Joonsoo Kim
On Fri, Jun 19, 2020 at 10:33 AM, Joonsoo Kim wrote:
>
> On Wed, Jun 17, 2020 at 05:17:17AM -0700, Matthew Wilcox wrote:
> > On Wed, Jun 17, 2020 at 02:26:21PM +0900, js1...@gmail.com wrote:
> > > From: Joonsoo Kim 
> > >
> > > Swapcache doesn't handle the exceptional entries since there is no case
> >
> > Don't call them exceptional entries.
> >
> > The radix tree has/had the concept of exceptional entries.  The swapcache
> > doesn't use the radix tree any more, it uses the XArray.  The XArray
> > has value entries.
> >
> > But you shouldn't call them value entries either; that's an XArray
> > concept.  The swap cache and indeed page cache use value entries to
> > implement shadow entries (they're also used to implement dax entries and
> > swap entries in the page cache).  So just call them shadow entries here.
> >
> > I know there are still places which use the term 'nrexceptional' in
> > the kernel.  I just haven't got round to replacing them yet.  Please
> > don't add more.
>
> Okay! Thanks for commenting.
>
> >
> > > +void clear_shadow_from_swap_cache(int type, unsigned long begin,
> > > +				unsigned long end)
> > > +{
> > > +	unsigned long curr;
> > > +	void *old;
> > > +	swp_entry_t entry = swp_entry(type, begin);
> > > +	struct address_space *address_space = swap_address_space(entry);
> > > +	XA_STATE(xas, &address_space->i_pages, begin);
> > > +
> > > +retry:
> > > +	xa_lock_irq(&address_space->i_pages);
> > > +	for (curr = begin; curr <= end; curr++) {
> > > +		entry = swp_entry(type, curr);
> > > +		if (swap_address_space(entry) != address_space) {
> > > +			xa_unlock_irq(&address_space->i_pages);
> > > +			address_space = swap_address_space(entry);
> > > +			begin = curr;
> > > +			xas_set(&xas, begin);
> > > +			goto retry;
> > > +		}
> > > +
> > > +		old = xas_load(&xas);
> > > +		if (!xa_is_value(old))
> > > +			continue;
> > > +		xas_store(&xas, NULL);
> > > +		address_space->nrexceptional--;
> > > +		xas_next(&xas);
> > > +	}
> > > +	xa_unlock_irq(&address_space->i_pages);
> > > +}
> >
> > This is a very clunky loop.  I'm not sure it's even right, given that
> > you change address space without changing the xas's address space.  How
> > about this?
>
> You are correct. The xas's address space should be changed.
>
>
>	for (;;) {
>		XA_STATE(xas, &address_space->i_pages, begin);
>		unsigned long nr_shadows = 0;
>
>		xas_lock_irq(&xas);
>		xas_for_each(&xas, entry, end) {
>			if (!xa_is_value(entry))
>				continue;
>			xas_store(&xas, NULL);
>			nr_shadows++;
>		}
>		address_space->nrexceptional -= nr_shadows;
>		xas_unlock_irq(&xas);
>
>		if (xas.xa_index >= end)
>			break;
>		entry = swp_entry(type, xas.xa_index);
>		address_space = swap_address_space(entry);
>	}
>
> Thanks for the suggestion.
>
> I made a patch based on your suggestion. IIUC the XArray, after
> running xas_for_each(), xas.xa_index can be less than end if there
> are empty slots in the last portion of the array. My patch also
> handles this case.

Hello, Matthew.

Could you check if the following patch (the XArray part) is correct?
Since the patch is based on your suggestion, I'd like to get your review. :)

Thanks.

> Thanks.
>
> --->8
> From 72e97600ea294372a13ab8e208ebd3c0e1889408 Mon Sep 17 00:00:00 2001
> From: Joonsoo Kim 
> Date: Fri, 15 Nov 2019 09:48:32 +0900
> Subject: [PATCH v6 4/6] mm/swapcache: support to handle the shadow entries
>
> Workingset detection for anonymous pages will be implemented in the
> following patch, and it requires storing shadow entries in the
> swapcache. This patch implements the infrastructure to store shadow
> entries in the swapcache.
>
> Acked-by: Johannes Weiner 
> Signed-off-by: Joonsoo Kim 
> ---
>  include/linux/swap.h | 17 
>  mm/shmem.c   |  3 ++-
>  mm/swap_state.c  | 57 ++--
>  mm/swapfile.c|  2 ++
>  mm/vmscan.c  |  2 +-
>  5 files changed, 69 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index f4f5f94..901da54 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -416,9 +416,13 @@ extern struct address_space *swapper_spaces[];
>  extern unsigned long total_swapcache_pages(void);
>  extern void show_swap_cache_info(void);
>  extern int add_to_swap(struct page *page);
> -extern int add_to_swap_cache(struct page *, swp_entry_t, gfp_t);
> -extern void __delete_from_swap_cache(struct page *, swp_entry_t entry);
> +extern int 

Re: [PATCH v6 4/6] mm/swapcache: support to handle the exceptional entries in swapcache

2020-06-18 Thread Joonsoo Kim
On Wed, Jun 17, 2020 at 05:17:17AM -0700, Matthew Wilcox wrote:
> On Wed, Jun 17, 2020 at 02:26:21PM +0900, js1...@gmail.com wrote:
> > From: Joonsoo Kim 
> > 
> > Swapcache doesn't handle the exceptional entries since there is no case
> 
> Don't call them exceptional entries.
> 
> The radix tree has/had the concept of exceptional entries.  The swapcache
> doesn't use the radix tree any more, it uses the XArray.  The XArray
> has value entries.
> 
> But you shouldn't call them value entries either; that's an XArray
> concept.  The swap cache and indeed page cache use value entries to
> implement shadow entries (they're also used to implement dax entries and
> swap entries in the page cache).  So just call them shadow entries here.
> 
> I know there are still places which use the term 'nrexceptional' in
> the kernel.  I just haven't got round to replacing them yet.  Please
> don't add more.

Okay! Thanks for commenting.

> 
> > +void clear_shadow_from_swap_cache(int type, unsigned long begin,
> > +				unsigned long end)
> > +{
> > +	unsigned long curr;
> > +	void *old;
> > +	swp_entry_t entry = swp_entry(type, begin);
> > +	struct address_space *address_space = swap_address_space(entry);
> > +	XA_STATE(xas, &address_space->i_pages, begin);
> > +
> > +retry:
> > +	xa_lock_irq(&address_space->i_pages);
> > +	for (curr = begin; curr <= end; curr++) {
> > +		entry = swp_entry(type, curr);
> > +		if (swap_address_space(entry) != address_space) {
> > +			xa_unlock_irq(&address_space->i_pages);
> > +			address_space = swap_address_space(entry);
> > +			begin = curr;
> > +			xas_set(&xas, begin);
> > +			goto retry;
> > +		}
> > +
> > +		old = xas_load(&xas);
> > +		if (!xa_is_value(old))
> > +			continue;
> > +		xas_store(&xas, NULL);
> > +		address_space->nrexceptional--;
> > +		xas_next(&xas);
> > +	}
> > +	xa_unlock_irq(&address_space->i_pages);
> > +}
> 
> This is a very clunky loop.  I'm not sure it's even right, given that
> you change address space without changing the xas's address space.  How
> about this?

You are correct. The xas's address space should be changed.


>	for (;;) {
>		XA_STATE(xas, &address_space->i_pages, begin);
>		unsigned long nr_shadows = 0;
>
>		xas_lock_irq(&xas);
>		xas_for_each(&xas, entry, end) {
>			if (!xa_is_value(entry))
>				continue;
>			xas_store(&xas, NULL);
>			nr_shadows++;
>		}
>		address_space->nrexceptional -= nr_shadows;
>		xas_unlock_irq(&xas);
>
>		if (xas.xa_index >= end)
>			break;
>		entry = swp_entry(type, xas.xa_index);
>		address_space = swap_address_space(entry);
>	}

Thanks for the suggestion.

I made a patch based on your suggestion. IIUC the XArray, after
running xas_for_each(), xas.xa_index can be less than end if there
are empty slots in the last portion of the array. My patch also
handles this case.
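
Concretely, the handling in my patch looks roughly like the sketch
below (assuming the swap cache stays split into chunks of
SWAP_ADDRESS_SPACE_PAGES slots): after each xas_for_each() pass, step
curr to the start of the next swap address space explicitly instead of
trusting xas.xa_index alone.

void clear_shadow_from_swap_cache(int type, unsigned long begin,
				unsigned long end)
{
	unsigned long curr = begin;
	void *old;

	for (;;) {
		unsigned long nr_shadows = 0;
		swp_entry_t entry = swp_entry(type, curr);
		struct address_space *address_space = swap_address_space(entry);
		XA_STATE(xas, &address_space->i_pages, curr);

		xa_lock_irq(&address_space->i_pages);
		xas_for_each(&xas, old, end) {
			/* Skip real pages; only shadow (value) entries go away. */
			if (!xa_is_value(old))
				continue;
			xas_store(&xas, NULL);
			nr_shadows++;
		}
		address_space->nrexceptional -= nr_shadows;
		xa_unlock_irq(&address_space->i_pages);

		/* Jump to the first offset of the next swap address space. */
		curr = ((curr >> SWAP_ADDRESS_SPACE_SHIFT) + 1)
				<< SWAP_ADDRESS_SPACE_SHIFT;
		if (curr > end)
			break;
	}
}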

Thanks.

--->8
From 72e97600ea294372a13ab8e208ebd3c0e1889408 Mon Sep 17 00:00:00 2001
From: Joonsoo Kim 
Date: Fri, 15 Nov 2019 09:48:32 +0900
Subject: [PATCH v6 4/6] mm/swapcache: support to handle the shadow entries

Workingset detection for anonymous pages will be implemented in the
following patch, and it requires storing shadow entries in the
swapcache. This patch implements the infrastructure to store shadow
entries in the swapcache.

Acked-by: Johannes Weiner 
Signed-off-by: Joonsoo Kim 
---
 include/linux/swap.h | 17 
 mm/shmem.c   |  3 ++-
 mm/swap_state.c  | 57 ++--
 mm/swapfile.c|  2 ++
 mm/vmscan.c  |  2 +-
 5 files changed, 69 insertions(+), 12 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index f4f5f94..901da54 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -416,9 +416,13 @@ extern struct address_space *swapper_spaces[];
 extern unsigned long total_swapcache_pages(void);
 extern void show_swap_cache_info(void);
 extern int add_to_swap(struct page *page);
-extern int add_to_swap_cache(struct page *, swp_entry_t, gfp_t);
-extern void __delete_from_swap_cache(struct page *, swp_entry_t entry);
+extern int add_to_swap_cache(struct page *page, swp_entry_t entry,
+   gfp_t gfp, void **shadowp);
+extern void __delete_from_swap_cache(struct page *page,
+   swp_entry_t entry, void *shadow);
 extern void delete_from_swap_cache(struct page *);
+extern void clear_shadow_from_swap_cache(int type, unsigned long begin,
+   unsigned long end);
 extern void free_page_and_swap_cache(struct page *);
 extern void 

Re: [PATCH v6 4/6] mm/swapcache: support to handle the exceptional entries in swapcache

2020-06-17 Thread Matthew Wilcox
On Wed, Jun 17, 2020 at 02:26:21PM +0900, js1...@gmail.com wrote:
> From: Joonsoo Kim 
> 
> Swapcache doesn't handle the exceptional entries since there is no case

Don't call them exceptional entries.

The radix tree has/had the concept of exceptional entries.  The swapcache
doesn't use the radix tree any more, it uses the XArray.  The XArray
has value entries.

But you shouldn't call them value entries either; that's an XArray
concept.  The swap cache and indeed page cache use value entries to
implement shadow entries (they're also used to implement dax entries and
swap entries in the page cache).  So just call them shadow entries here.

I know there are still places which use the term 'nrexceptional' in
the kernel.  I just haven't got round to replacing them yet.  Please
don't add more.
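
To make that concrete: a value entry is just a tagged unsigned long
that can never be mistaken for a pointer, and a shadow entry is one of
those carrying eviction information.  A minimal illustration (not from
this patch; 'eviction' is a made-up payload):

	unsigned long eviction = 0x1234;	/* hypothetical payload */
	void *shadow = xa_mk_value(eviction);	/* tag it as a value entry */

	void *entry = xas_load(&xas);
	if (xa_is_value(entry))			/* shadow entry, not a page */
		eviction = xa_to_value(entry);
	else if (entry)
		;				/* a struct page pointer */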

> +void clear_shadow_from_swap_cache(int type, unsigned long begin,
> +				unsigned long end)
> +{
> +	unsigned long curr;
> +	void *old;
> +	swp_entry_t entry = swp_entry(type, begin);
> +	struct address_space *address_space = swap_address_space(entry);
> +	XA_STATE(xas, &address_space->i_pages, begin);
> +
> +retry:
> +	xa_lock_irq(&address_space->i_pages);
> +	for (curr = begin; curr <= end; curr++) {
> +		entry = swp_entry(type, curr);
> +		if (swap_address_space(entry) != address_space) {
> +			xa_unlock_irq(&address_space->i_pages);
> +			address_space = swap_address_space(entry);
> +			begin = curr;
> +			xas_set(&xas, begin);
> +			goto retry;
> +		}
> +
> +		old = xas_load(&xas);
> +		if (!xa_is_value(old))
> +			continue;
> +		xas_store(&xas, NULL);
> +		address_space->nrexceptional--;
> +		xas_next(&xas);
> +	}
> +	xa_unlock_irq(&address_space->i_pages);
> +}

This is a very clunky loop.  I'm not sure it's even right, given that
you change address space without changing the xas's address space.  How
about this?

	for (;;) {
		XA_STATE(xas, &address_space->i_pages, begin);
		unsigned long nr_shadows = 0;

		xas_lock_irq(&xas);
		xas_for_each(&xas, entry, end) {
			if (!xa_is_value(entry))
				continue;
			xas_store(&xas, NULL);
			nr_shadows++;
		}
		address_space->nrexceptional -= nr_shadows;
		xas_unlock_irq(&xas);

		if (xas.xa_index >= end)
			break;
		entry = swp_entry(type, xas.xa_index);
		address_space = swap_address_space(entry);
	}
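
For context on why the loop re-derives address_space at all: the swap
cache is sharded, with one struct address_space per 64MB worth of swap
slots, so a begin..end range can cross shards.  The mapping, roughly as
defined in include/linux/swap.h around this time (quoted from memory,
so treat it as a sketch):

/* One swap cache address_space per 64MB of swap space */
#define SWAP_ADDRESS_SPACE_SHIFT	14
#define SWAP_ADDRESS_SPACE_PAGES	(1 << SWAP_ADDRESS_SPACE_SHIFT)

#define swap_address_space(entry)			    \
	(&swapper_spaces[swp_type(entry)][swp_offset(entry) \
		>> SWAP_ADDRESS_SPACE_SHIFT])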



[PATCH v6 4/6] mm/swapcache: support to handle the exceptional entries in swapcache

2020-06-16 Thread js1304
From: Joonsoo Kim 

Swapcache doesn't handle the exceptional entries since there is no case
using them. In the following patch, workingset detection for anonymous
pages will be implemented, and it stores the shadow entries as
exceptional entries in the swapcache. So we need to handle the
exceptional entries, and this patch implements that.

Acked-by: Johannes Weiner 
Signed-off-by: Joonsoo Kim 
---
 include/linux/swap.h | 17 
 mm/shmem.c   |  3 ++-
 mm/swap_state.c  | 56 ++--
 mm/swapfile.c|  2 ++
 mm/vmscan.c  |  2 +-
 5 files changed, 68 insertions(+), 12 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index f4f5f94..901da54 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -416,9 +416,13 @@ extern struct address_space *swapper_spaces[];
 extern unsigned long total_swapcache_pages(void);
 extern void show_swap_cache_info(void);
 extern int add_to_swap(struct page *page);
-extern int add_to_swap_cache(struct page *, swp_entry_t, gfp_t);
-extern void __delete_from_swap_cache(struct page *, swp_entry_t entry);
+extern int add_to_swap_cache(struct page *page, swp_entry_t entry,
+   gfp_t gfp, void **shadowp);
+extern void __delete_from_swap_cache(struct page *page,
+   swp_entry_t entry, void *shadow);
 extern void delete_from_swap_cache(struct page *);
+extern void clear_shadow_from_swap_cache(int type, unsigned long begin,
+   unsigned long end);
 extern void free_page_and_swap_cache(struct page *);
 extern void free_pages_and_swap_cache(struct page **, int);
 extern struct page *lookup_swap_cache(swp_entry_t entry,
@@ -572,13 +576,13 @@ static inline int add_to_swap(struct page *page)
 }
 
 static inline int add_to_swap_cache(struct page *page, swp_entry_t entry,
-   gfp_t gfp_mask)
+   gfp_t gfp_mask, void **shadowp)
 {
return -1;
 }
 
 static inline void __delete_from_swap_cache(struct page *page,
-   swp_entry_t entry)
+   swp_entry_t entry, void *shadow)
 {
 }
 
@@ -586,6 +590,11 @@ static inline void delete_from_swap_cache(struct page *page)
 {
 }
 
+static inline void clear_shadow_from_swap_cache(int type, unsigned long begin,
+   unsigned long end)
+{
+}
+
 static inline int page_swapcount(struct page *page)
 {
return 0;
diff --git a/mm/shmem.c b/mm/shmem.c
index a0dbe62..e9a99a2 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1374,7 +1374,8 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
list_add(&info->swaplist, &shmem_swaplist);
 
if (add_to_swap_cache(page, swap,
-   __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) {
+   __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN,
+   NULL) == 0) {
spin_lock_irq(&info->lock);
shmem_recalc_inode(inode);
info->swapped++;
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 1050fde..43c4e3a 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -110,12 +110,15 @@ void show_swap_cache_info(void)
  * add_to_swap_cache resembles add_to_page_cache_locked on swapper_space,
  * but sets SwapCache flag and private instead of mapping and index.
  */
-int add_to_swap_cache(struct page *page, swp_entry_t entry, gfp_t gfp)
+int add_to_swap_cache(struct page *page, swp_entry_t entry,
+   gfp_t gfp, void **shadowp)
 {
struct address_space *address_space = swap_address_space(entry);
pgoff_t idx = swp_offset(entry);
XA_STATE_ORDER(xas, &address_space->i_pages, idx, compound_order(page));
unsigned long i, nr = hpage_nr_pages(page);
+   unsigned long nrexceptional = 0;
+   void *old;
 
VM_BUG_ON_PAGE(!PageLocked(page), page);
VM_BUG_ON_PAGE(PageSwapCache(page), page);
@@ -131,10 +134,17 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry, gfp_t gfp)
goto unlock;
for (i = 0; i < nr; i++) {
VM_BUG_ON_PAGE(xas.xa_index != idx + i, page);
+   old = xas_load(&xas);
+   if (xa_is_value(old)) {
+   nrexceptional++;
+   if (shadowp)
+   *shadowp = old;
+   }
set_page_private(page + i, entry.val + i);
xas_store(&xas, page);
xas_next(&xas);
}
+   address_space->nrexceptional -= nrexceptional;
address_space->nrpages += nr;
__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, nr);
ADD_CACHE_INFO(add_total, nr);
@@