Re: [RFC][PATCH v2] splice: Prevent gifting of multipage folios

2023-02-27 Thread Matthew Wilcox
On Mon, Feb 27, 2023 at 03:51:03PM +, David Howells wrote:
> 
> Don't let parts of compound pages/multipage folios be gifted by (vm)splice
> into a pipe as the other end may only be expecting single-page gifts (fuse
> and virtio console for example).
> 
> replace_page_cache_folio(), for example, will do the wrong thing if it
> tries to replace a single-page folio with a multipage folio.
> 
> Try to avoid this by making add_to_pipe() remove the gift flag on multipage
> folios.
> 
> Fixes: 7afa6fd037e5 ("[PATCH] vmsplice: allow user to pass in gift pages")
> Signed-off-by: David Howells 
> cc: Matthew Wilcox 

Reviewed-by: Matthew Wilcox (Oracle) 
Cc: sta...@vger.kernel.org

> cc: Jens Axboe 
> cc: Miklos Szeredi 
> cc: Amit Shah 
> cc: linux-fsde...@vger.kernel.org
> cc: virtualization@lists.linux-foundation.org
> cc: linux...@kvack.org
> ---
>  fs/splice.c |2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/splice.c b/fs/splice.c
> index 2e76dbb81a8f..8bbd7794d9f0 100644
> --- a/fs/splice.c
> +++ b/fs/splice.c
> @@ -240,6 +240,8 @@ ssize_t add_to_pipe(struct pipe_inode_info *pipe, struct pipe_buffer *buf)
>   } else if (pipe_full(head, tail, pipe->max_usage)) {
>   ret = -EAGAIN;
>   } else {
> + if (PageCompound(buf->page))
> + buf->flags &= ~PIPE_BUF_FLAG_GIFT;
>   pipe->bufs[head & mask] = *buf;
>   pipe->head = head + 1;
>   return buf->len;
> 


Re: [RFC][PATCH] splice: Prevent gifting of multipage folios

2023-02-27 Thread Matthew Wilcox
On Mon, Feb 27, 2023 at 02:23:32PM +, David Howells wrote:
> 
> Don't let parts of multipage folios be gifted by (vm)splice into a pipe as
> the other end may only be expecting single-page gifts (fuse and virtio
> console for example).
> 
> replace_page_cache_folio(), for example, will do the wrong thing if it
> tries to replace a single-page folio with a multipage folio.
> 
> Try to avoid this by making add_to_pipe() remove the gift flag on multipage
> folios.
> 
> Signed-off-by: David Howells 

What should the Fixes: here be?  This was already possible with THPs
(both anon and tmpfs backed) long before I introduced folios.

> cc: Matthew Wilcox 
> cc: Miklos Szeredi 
> cc: Amit Shah 
> cc: linux-fsde...@vger.kernel.org
> cc: virtualization@lists.linux-foundation.org
> cc: linux...@kvack.org
> ---
>  fs/splice.c |2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/splice.c b/fs/splice.c
> index 2e76dbb81a8f..33caa28a86e4 100644
> --- a/fs/splice.c
> +++ b/fs/splice.c
> @@ -240,6 +240,8 @@ ssize_t add_to_pipe(struct pipe_inode_info *pipe, struct pipe_buffer *buf)
>   } else if (pipe_full(head, tail, pipe->max_usage)) {
>   ret = -EAGAIN;
>   } else {
> + if (folio_nr_pages(page_folio(buf->page)) > 1)
> + buf->flags &= ~PIPE_BUF_FLAG_GIFT;

	if (PageCompound(buf->page))
		buf->flags &= ~PIPE_BUF_FLAG_GIFT;

would be simpler and more backportable.



Re: [PATCH 01/23] block: factor out a bvec_set_page helper

2023-01-30 Thread Matthew Wilcox
On Tue, Jan 31, 2023 at 05:00:32AM +, Matthew Wilcox wrote:
> On Mon, Jan 30, 2023 at 08:47:58PM -0800, Jakub Kicinski wrote:
> > kinda random thought but since we're touching this area - could we
> > perhaps move the definition of struct bio_vec and trivial helpers 
> > like this into a new header? bvec.h pulls in mm.h which is a right
> > behemoth :S
> 
> I bet we can drop mm.h now.  It was originally added for nth_page()
> in 3d75ca0adef4 but those were all removed by b8753433fc61.
> 
> A quick smoke test on my default testing config doesn't find any
> problems.  Let me send a patch and see if the build bots complain.

Disappointingly, it doesn't really change anything.  1134 files
depend on mm.h both before and after [1].  Looks like it's due to
arch/x86/include/asm/cacheflush.h pulling in linux/mm.h, judging by the
contents of .build_test_kernel-x86_64/net/ipv6/.inet6_hashtables.o.cmd.
But *lots* of header files pull in mm.h, including scatterlist.h,
vt_kern.h, net.h, nfs_fs.h, sunrpc/svc.h and security.h.

I suppose it may cut down on include loops to drop it here, so I'm
still in favour of the patch I posted, but this illustrates how
deeply entangled our headers still are.

[1] find .build_test_kernel-x86_64/ -name '.*.cmd' |xargs grep 'include/linux/mm.h' |wc -l


Re: [PATCH 01/23] block: factor out a bvec_set_page helper

2023-01-30 Thread Matthew Wilcox
On Mon, Jan 30, 2023 at 08:47:58PM -0800, Jakub Kicinski wrote:
> kinda random thought but since we're touching this area - could we
> perhaps move the definition of struct bio_vec and trivial helpers 
> like this into a new header? bvec.h pulls in mm.h which is a right
> behemoth :S

I bet we can drop mm.h now.  It was originally added for nth_page()
in 3d75ca0adef4 but those were all removed by b8753433fc61.

A quick smoke test on my default testing config doesn't find any
problems.  Let me send a patch and see if the build bots complain.


Re: [PATCH v2 1/6] mm: introduce vma->vm_flags modifier functions

2023-01-26 Thread Matthew Wilcox
On Thu, Jan 26, 2023 at 04:50:59PM +0200, Mike Rapoport wrote:
> On Thu, Jan 26, 2023 at 11:17:09AM +0200, Mike Rapoport wrote:
> > On Wed, Jan 25, 2023 at 12:38:46AM -0800, Suren Baghdasaryan wrote:
> > > +/* Use when VMA is not part of the VMA tree and needs no locking */
> > > +static inline void init_vm_flags(struct vm_area_struct *vma,
> > > +  unsigned long flags)
> > 
> > I'd suggest to make it vm_flags_init() etc.
> 
> Thinking more about it, it will be even clearer to name these vma_flags_xyz()

Perhaps vma_VERB_flags()?

vma_init_flags()
vma_reset_flags()
vma_set_flags()
vma_clear_flags()
vma_mod_flags()
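
A minimal sketch of what those helpers could look like, assuming they only wrap direct manipulation of vma->vm_flags; the mod variant taking separate set/clear masks is a guess, not something spelled out in the thread (using vm_flags_t per the type note elsewhere in this series):

	/* Hypothetical shapes for the names proposed above. */
	static inline void vma_init_flags(struct vm_area_struct *vma,
					  vm_flags_t flags)
	{
		vma->vm_flags = flags;	/* VMA not in the tree yet, no locking */
	}

	static inline void vma_set_flags(struct vm_area_struct *vma,
					 vm_flags_t flags)
	{
		vma->vm_flags |= flags;
	}

	static inline void vma_clear_flags(struct vm_area_struct *vma,
					   vm_flags_t flags)
	{
		vma->vm_flags &= ~flags;
	}

	static inline void vma_mod_flags(struct vm_area_struct *vma,
					 vm_flags_t set, vm_flags_t clear)
	{
		vma->vm_flags |= set;
		vma->vm_flags &= ~clear;
	}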



Re: [PATCH v2 1/6] mm: introduce vma->vm_flags modifier functions

2023-01-25 Thread Matthew Wilcox
On Wed, Jan 25, 2023 at 08:49:50AM -0800, Suren Baghdasaryan wrote:
> On Wed, Jan 25, 2023 at 1:10 AM Peter Zijlstra  wrote:
> > > + /*
> > > +  * Flags, see mm.h.
> > > +  * WARNING! Do not modify directly.
> > > +  * Use {init|reset|set|clear|mod}_vm_flags() functions instead.
> > > +  */
> > > + unsigned long vm_flags;
> >
> > We have __private and ACCESS_PRIVATE() to help with enforcing this.
> 
> Thanks for pointing this out, Peter! I guess for that I'll need to
> convert all read accesses and provide get_vm_flags() too? That will
> cause some additional churn (a quick search shows 801 hits over 248
> files) but maybe it's worth it? I think Michal suggested that too in
> another patch. Should I do that while we are at it?

Here's a trick I saw somewhere in the VFS:

	union {
		const vm_flags_t vm_flags;
		vm_flags_t __private __vm_flags;
	};

Now it can be read by anybody but written only by those using
ACCESS_PRIVATE.
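
As a sketch of how that union would be consumed — modify_vm_flags() here is a hypothetical wrapper, not something posted in the thread:

	struct vm_area_struct {
		/* ... */
		union {
			const vm_flags_t vm_flags;	 /* readable by anybody */
			vm_flags_t __private __vm_flags; /* writers only */
		};
	};

	static inline void modify_vm_flags(struct vm_area_struct *vma,
					   vm_flags_t set, vm_flags_t clear)
	{
		/* ACCESS_PRIVATE() strips the __private annotation for sparse */
		ACCESS_PRIVATE(vma, __vm_flags) |= set;
		ACCESS_PRIVATE(vma, __vm_flags) &= ~clear;
	}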


Re: [PATCH v2 1/6] mm: introduce vma->vm_flags modifier functions

2023-01-25 Thread Matthew Wilcox
On Wed, Jan 25, 2023 at 12:38:46AM -0800, Suren Baghdasaryan wrote:
> +/* Use when VMA is not part of the VMA tree and needs no locking */
> +static inline void init_vm_flags(struct vm_area_struct *vma,
> +  unsigned long flags)
> +{
> + vma->vm_flags = flags;

vm_flags are supposed to have type vm_flags_t.  That's not been
fully realised yet, but perhaps we could avoid making it worse?

>   pgprot_t vm_page_prot;
> - unsigned long vm_flags; /* Flags, see mm.h. */
> +
> + /*
> +  * Flags, see mm.h.
> +  * WARNING! Do not modify directly.
> +  * Use {init|reset|set|clear|mod}_vm_flags() functions instead.
> +  */
> + unsigned long vm_flags;

Including changing this line to vm_flags_t


Re: [PATCH] docs: driver-api: virtio: virtio on Linux

2022-08-03 Thread Matthew Wilcox
On Wed, Aug 03, 2022 at 09:24:49AM +0200, Ricardo Cañuelo wrote:
> Hi Matthew,
> 
> On mar, ago 02 2022 at 16:56:48, Matthew Wilcox  wrote:
> > You don't need to use :c:func:`foo`.  You can just write foo() and the
> > tooling will convert it into :c:func:`foo` for you.
> 
> Thanks for the tip. However, I did some tests and the results aren't
> quite the same. For functions with kerneldocs that are referenced in the
> same document (.. kernel-doc::) the tool does effectively link to the
> generated documentation, but for all the other functions using
> c:func:`foo` generates a different formatting than `foo`, which does no
> formatting at all.

I didn't say `foo`, I said foo().  This is handled by
Documentation/sphinx/automarkup.py.  To quote the doc-guide:

Please note that there is no need to use ``c:func:`` to generate cross
references to function documentation.  Due to some Sphinx extension magic,
the documentation build system will automatically turn a reference to
``function()`` into a cross reference if an index entry for the given
function name exists.  If you see ``c:func:`` use in a kernel document,
please feel free to remove it.


Re: [PATCH] docs: driver-api: virtio: virtio on Linux

2022-08-02 Thread Matthew Wilcox
On Tue, Aug 02, 2022 at 02:42:22PM +0200, Ricardo Cañuelo wrote:
> +In this case, when the interrupt arrives :c:func:`vp_interrupt` will be
> +called and it will ultimately lead to a call to
> +:c:func:`vring_interrupt`, which ends up calling the virtqueue callback
> +function::

You don't need to use :c:func:`foo`.  You can just write foo() and the
tooling will convert it into :c:func:`foo` for you.



Re: [PATCH v2 07/19] mm/migrate: Convert expected_page_refs() to folio_expected_refs()

2022-07-07 Thread Matthew Wilcox
On Thu, Jul 07, 2022 at 07:50:17PM -0700, Hugh Dickins wrote:
> On Wed, 8 Jun 2022, Matthew Wilcox (Oracle) wrote:
> 
> > Now that both callers have a folio, convert this function to
> > take a folio & rename it.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) 
> > Reviewed-by: Christoph Hellwig 
> > ---
> >  mm/migrate.c | 19 ---
> >  1 file changed, 12 insertions(+), 7 deletions(-)
> > 
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 2975f0c4d7cf..2e2f41572066 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -336,13 +336,18 @@ void pmd_migration_entry_wait(struct mm_struct *mm, pmd_t *pmd)
> >  }
> >  #endif
> >  
> > -static int expected_page_refs(struct address_space *mapping, struct page *page)
> > +static int folio_expected_refs(struct address_space *mapping,
> > +   struct folio *folio)
> >  {
> > -   int expected_count = 1;
> > +   int refs = 1;
> > +   if (!mapping)
> > +   return refs;
> >  
> > -   if (mapping)
> > -   expected_count += compound_nr(page) + page_has_private(page);
> > -   return expected_count;
> > +   refs += folio_nr_pages(folio);
> > +   if (folio_get_private(folio))
> > +   refs++;
> > +
> > +   return refs;
> >  }
> >  
> >  /*
> > @@ -359,7 +364,7 @@ int folio_migrate_mapping(struct address_space *mapping,
> > XA_STATE(xas, &mapping->i_pages, folio_index(folio));
> > struct zone *oldzone, *newzone;
> > int dirty;
> > -   int expected_count = expected_page_refs(mapping, &folio->page) + extra_count;
> > +   int expected_count = folio_expected_refs(mapping, folio) + extra_count;
> > long nr = folio_nr_pages(folio);
> >  
> > if (!mapping) {
> > @@ -669,7 +674,7 @@ static int __buffer_migrate_folio(struct address_space *mapping,
> > return migrate_page(mapping, &dst->page, &src->page, mode);
> >  
> > /* Check whether page does not have extra refs before we do more work */
> > -   expected_count = expected_page_refs(mapping, &src->page);
> > +   expected_count = folio_expected_refs(mapping, src);
> > if (folio_ref_count(src) != expected_count)
> > return -EAGAIN;
> >  
> > -- 
> > 2.35.1
> 
> This commit (742e89c9e352d38df1a5825fe40c4de73a5d5f7a in pagecache.git
> folio/for-next and recent linux-next) is dangerously wrong, at least
> for swapcache, and probably for some others.
> 
> I say "dangerously" because it tells page migration a swapcache page
> is safe for migration when it certainly is not.
> 
> The fun that typically ensues is kernel BUG at include/linux/mm.h:750!
> put_page_testzero() VM_BUG_ON_PAGE(page_ref_count(page) == 0, page),
> if CONFIG_DEBUG_VM=y (bisecting for that is what brought me to this).
> But I guess you might get silent data corruption too.
> 
> I assumed at first that you'd changed the rules, and were now expecting
> any subsystem that puts a non-zero value into folio->private to raise
> its refcount - whereas the old convention (originating with buffer heads)
> is that setting PG_private says an extra refcount has been taken, please
> call try_to_release_page() to lower it, and maybe that will use data in
> page->private to do so; but page->private free for the subsystem owning
> the page to use as it wishes, no refcount implication.  But that you
> had missed updating swapcache.
> 
> So I got it working okay with the patch below; but before turning it into
> a proper patch, noticed that there were still plenty of other places
> applying the test for PG_private: so now think that maybe you set out
> with intention as above, realized it wouldn't work, but got distracted
> before cleaning up some places you'd already changed.  And patch below
> now goes in the wrong direction.
> 
> Or maybe you didn't intend any change, but the PG_private test just got
> missed in a few places.  I don't know, hope you remember, but current
> linux-next badly inconsistent.
> Over to you, thanks,

Ugh.  The problem I'm trying to solve is that we're short on page flags.
We _seemed_ to have correlation between "page->private != NULL" and
"PG_private is set", and so I thought I could make progress towards
removing PG_private.  But the rule you set out above wasn't written down
anywhere that I was able to find.

I'm about to go to sleep, but I'll think on this some more tomorrow.
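
For reference, the buffer-head convention Hugh describes is what the attach/detach helpers in include/linux/pagemap.h encode (simplified sketch below): PG_private pairs page->private with an extra reference, while a bare page->private — as in swapcache — carries no refcount implication.

	static inline void attach_page_private(struct page *page, void *data)
	{
		get_page(page);			/* PG_private implies one extra ref */
		set_page_private(page, (unsigned long)data);
		SetPagePrivate(page);
	}

	static inline void *detach_page_private(struct page *page)
	{
		void *data = (void *)page_private(page);

		if (!PagePrivate(page))
			return NULL;
		ClearPagePrivate(page);
		set_page_private(page, 0);
		put_page(page);			/* drop the ref taken at attach time */

		return data;
	}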

> Hugh
> 
> --- a/mm/migrate.c	2022-07-06 14:24:44.499941975 -0700
> +++ b/mm/migrate.c	2022-07-06 15:49:25.0 -0700
> @@ -351,6 +351,10 

Re: [PATCH v2 12/19] btrfs: Convert btrfs_migratepage to migrate_folio

2022-06-09 Thread Matthew Wilcox
On Thu, Jun 09, 2022 at 06:33:23PM +0200, David Sterba wrote:
> On Wed, Jun 08, 2022 at 04:02:42PM +0100, Matthew Wilcox (Oracle) wrote:
> > Use filemap_migrate_folio() to do the bulk of the work, and then copy
> > the ordered flag across if needed.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) 
> > Reviewed-by: Christoph Hellwig 
> 
> Acked-by: David Sterba 
> 
> > +static int btrfs_migrate_folio(struct address_space *mapping,
> > +struct folio *dst, struct folio *src,
> >  enum migrate_mode mode)
> >  {
> > -   int ret;
> > +   int ret = filemap_migrate_folio(mapping, dst, src, mode);
> >  
> > -   ret = migrate_page_move_mapping(mapping, newpage, page, 0);
> > if (ret != MIGRATEPAGE_SUCCESS)
> > return ret;
> >  
> > -   if (page_has_private(page))
> > -   attach_page_private(newpage, detach_page_private(page));
> 
> If I'm reading it correctly, the private pointer does not need to be set
> like that anymore because it's done somewhere during the
> filemap_migrate_folio() call.

That's correct.  Everything except moving the ordered flag across is
done for you, and I'm kind of tempted to modify folio_migrate_flags()
to copy the ordered flag across as well.  Then you could just use
filemap_migrate_folio() directly.

> > -
> > -   if (PageOrdered(page)) {
> > -   ClearPageOrdered(page);
> > -   SetPageOrdered(newpage);
> > +   if (folio_test_ordered(src)) {
> > +   folio_clear_ordered(src);
> > +   folio_set_ordered(dst);
> > }
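
If folio_migrate_flags() did learn to copy the ordered flag across, as suggested above, the btrfs conversion would collapse to wiring up the generic helper — a sketch of that end state, not what the posted series does:

	static const struct address_space_operations btrfs_aops = {
		/* ... */
		.migrate_folio	= filemap_migrate_folio,
		/* ... */
	};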


Re: [PATCH v2 03/19] fs: Add aops->migrate_folio

2022-06-09 Thread Matthew Wilcox
On Thu, Jun 09, 2022 at 02:50:20PM +0200, David Hildenbrand wrote:
> On 08.06.22 17:02, Matthew Wilcox (Oracle) wrote:
> > diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
> > index c0fe711f14d3..3d28b23676bd 100644
> > --- a/Documentation/filesystems/locking.rst
> > +++ b/Documentation/filesystems/locking.rst
> > @@ -253,7 +253,8 @@ prototypes::
> > void (*free_folio)(struct folio *);
> > int (*direct_IO)(struct kiocb *, struct iov_iter *iter);
> > bool (*isolate_page) (struct page *, isolate_mode_t);
> > -   int (*migratepage)(struct address_space *, struct page *, struct page *);
> > +   int (*migrate_folio)(struct address_space *, struct folio *dst,
> > +   struct folio *src, enum migrate_mode);
> > void (*putback_page) (struct page *);
> 
> isolate_page/putback_page are leftovers from the previous patch, no?

Argh, right, I completely forgot I needed to update the documentation in
that patch.

> > +++ b/Documentation/vm/page_migration.rst
> > @@ -181,22 +181,23 @@ which are function pointers of struct address_space_operations.
> > Once page is successfully isolated, VM uses page.lru fields so driver
> > shouldn't expect to preserve values in those fields.
> >  
> > -2. ``int (*migratepage) (struct address_space *mapping,``
> > -|  ``struct page *newpage, struct page *oldpage, enum migrate_mode);``
> > -
> > -   After isolation, VM calls migratepage() of driver with the isolated page.
> > -   The function of migratepage() is to move the contents of the old page to the
> > -   new page
> > -   and set up fields of struct page newpage. Keep in mind that you should
> > -   indicate to the VM the oldpage is no longer movable via __ClearPageMovable()
> > -   under page_lock if you migrated the oldpage successfully and returned
> > -   MIGRATEPAGE_SUCCESS. If driver cannot migrate the page at the moment, driver
> > -   can return -EAGAIN. On -EAGAIN, VM will retry page migration in a short time
> > -   because VM interprets -EAGAIN as "temporary migration failure". On returning
> > -   any error except -EAGAIN, VM will give up the page migration without
> > -   retrying.
> > -
> > -   Driver shouldn't touch the page.lru field while in the migratepage() function.
> > +2. ``int (*migrate_folio) (struct address_space *mapping,``
> > +|  ``struct folio *dst, struct folio *src, enum migrate_mode);``
> > +
> > +   After isolation, VM calls the driver's migrate_folio() with the
> > +   isolated folio.  The purpose of migrate_folio() is to move the contents
> > +   of the source folio to the destination folio and set up the fields
> > +   of destination folio.  Keep in mind that you should indicate to the
> > +   VM the source folio is no longer movable via __ClearPageMovable()
> > +   under folio if you migrated the source successfully and returned
> > +   MIGRATEPAGE_SUCCESS.  If driver cannot migrate the folio at the
> > +   moment, driver can return -EAGAIN. On -EAGAIN, VM will retry folio
> > +   migration in a short time because VM interprets -EAGAIN as "temporary
> > +   migration failure".  On returning any error except -EAGAIN, VM will
> > +   give up the folio migration without retrying.
> > +
> > +   Driver shouldn't touch the folio.lru field while in the migrate_folio()
> > +   function.
> >  
> >  3. ``void (*putback_page)(struct page *);``
> 
> Hmm, here it's a bit more complicated now, because we essentially have
> two paths: LRU+migrate_folio or !LRU+movable_ops
> (isolate/migrate/putback page)

Oh ... actually, this is just documenting the driver side of things.
I don't really like how it's written.  Here, have some rewritten
documentation (which is now part of the previous patch):

+++ b/Documentation/vm/page_migration.rst
@@ -152,110 +152,15 @@ Steps:
 Non-LRU page migration
 ======================

-Although migration originally aimed for reducing the latency of memory accesses
-for NUMA, compaction also uses migration to create high-order pages.
+Although migration originally aimed for reducing the latency of memory
+accesses for NUMA, compaction also uses migration to create high-order
+pages.  For compaction purposes, it is also useful to be able to move
+non-LRU pages, such as zsmalloc and virtio-balloon pages.

-Current problem of the implementation is that it is designed to migrate only
-*LRU* pages. However, there are potential non-LRU pages which can be migrated
-in drivers, for example, zsmalloc, virtio-balloon pages.
-
-For virtio-balloon pages,

[PATCH v2 07/19] mm/migrate: Convert expected_page_refs() to folio_expected_refs()

2022-06-08 Thread Matthew Wilcox (Oracle)
Now that both callers have a folio, convert this function to
take a folio & rename it.

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Christoph Hellwig 
---
 mm/migrate.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 2975f0c4d7cf..2e2f41572066 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -336,13 +336,18 @@ void pmd_migration_entry_wait(struct mm_struct *mm, pmd_t *pmd)
 }
 #endif
 
-static int expected_page_refs(struct address_space *mapping, struct page *page)
+static int folio_expected_refs(struct address_space *mapping,
+   struct folio *folio)
 {
-   int expected_count = 1;
+   int refs = 1;
+   if (!mapping)
+   return refs;
 
-   if (mapping)
-   expected_count += compound_nr(page) + page_has_private(page);
-   return expected_count;
+   refs += folio_nr_pages(folio);
+   if (folio_get_private(folio))
+   refs++;
+
+   return refs;
 }
 
 /*
@@ -359,7 +364,7 @@ int folio_migrate_mapping(struct address_space *mapping,
XA_STATE(xas, &mapping->i_pages, folio_index(folio));
struct zone *oldzone, *newzone;
int dirty;
-   int expected_count = expected_page_refs(mapping, &folio->page) + extra_count;
+   int expected_count = folio_expected_refs(mapping, folio) + extra_count;
long nr = folio_nr_pages(folio);
 
if (!mapping) {
@@ -669,7 +674,7 @@ static int __buffer_migrate_folio(struct address_space *mapping,
return migrate_page(mapping, &dst->page, &src->page, mode);
 
/* Check whether page does not have extra refs before we do more work */
-   expected_count = expected_page_refs(mapping, &src->page);
+   expected_count = folio_expected_refs(mapping, src);
if (folio_ref_count(src) != expected_count)
return -EAGAIN;
 
-- 
2.35.1



[PATCH v2 16/19] hugetlb: Convert to migrate_folio

2022-06-08 Thread Matthew Wilcox (Oracle)
This involves converting migrate_huge_page_move_mapping().  We also need a
folio variant of hugetlb_set_page_subpool(), but that's for a later patch.
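
That folio variant would presumably be a thin wrapper; a guess at its shape (the name is an assumption, not part of this series):

	static inline void hugetlb_set_folio_subpool(struct folio *folio,
					struct hugepage_subpool *subpool)
	{
		hugetlb_set_page_subpool(&folio->page, subpool);
	}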

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/hugetlbfs/inode.c| 23 ++-
 include/linux/migrate.h |  6 +++---
 mm/migrate.c| 18 +-
 3 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 14d33f725e05..eca1d0fabd7e 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -954,28 +954,33 @@ static int hugetlbfs_symlink(struct user_namespace *mnt_userns,
return error;
 }
 
-static int hugetlbfs_migrate_page(struct address_space *mapping,
-   struct page *newpage, struct page *page,
+#ifdef CONFIG_MIGRATION
+static int hugetlbfs_migrate_folio(struct address_space *mapping,
+   struct folio *dst, struct folio *src,
enum migrate_mode mode)
 {
int rc;
 
-   rc = migrate_huge_page_move_mapping(mapping, newpage, page);
+   rc = migrate_huge_page_move_mapping(mapping, dst, src);
if (rc != MIGRATEPAGE_SUCCESS)
return rc;
 
-   if (hugetlb_page_subpool(page)) {
-   hugetlb_set_page_subpool(newpage, hugetlb_page_subpool(page));
-   hugetlb_set_page_subpool(page, NULL);
+   if (hugetlb_page_subpool(&src->page)) {
+   hugetlb_set_page_subpool(&dst->page,
+   hugetlb_page_subpool(&src->page));
+   hugetlb_set_page_subpool(&src->page, NULL);
}
 
if (mode != MIGRATE_SYNC_NO_COPY)
-   migrate_page_copy(newpage, page);
+   folio_migrate_copy(dst, src);
else
-   migrate_page_states(newpage, page);
+   folio_migrate_flags(dst, src);
 
return MIGRATEPAGE_SUCCESS;
 }
+#else
+#define hugetlbfs_migrate_folio NULL
+#endif
 
 static int hugetlbfs_error_remove_page(struct address_space *mapping,
struct page *page)
@@ -1142,7 +1147,7 @@ static const struct address_space_operations hugetlbfs_aops = {
.write_begin= hugetlbfs_write_begin,
.write_end  = hugetlbfs_write_end,
.dirty_folio= noop_dirty_folio,
-   .migratepage= hugetlbfs_migrate_page,
+   .migrate_folio  = hugetlbfs_migrate_folio,
.error_remove_page  = hugetlbfs_error_remove_page,
 };
 
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 82f00ad69a54..59d64a1e6b4b 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -42,8 +42,8 @@ extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
 
 extern void migrate_page_states(struct page *newpage, struct page *page);
 extern void migrate_page_copy(struct page *newpage, struct page *page);
-extern int migrate_huge_page_move_mapping(struct address_space *mapping,
- struct page *newpage, struct page *page);
+int migrate_huge_page_move_mapping(struct address_space *mapping,
+   struct folio *dst, struct folio *src);
 extern int migrate_page_move_mapping(struct address_space *mapping,
struct page *newpage, struct page *page, int extra_count);
 void migration_entry_wait_on_locked(swp_entry_t entry, pte_t *ptep,
@@ -74,7 +74,7 @@ static inline void migrate_page_copy(struct page *newpage,
 struct page *page) {}
 
 static inline int migrate_huge_page_move_mapping(struct address_space *mapping,
- struct page *newpage, struct page *page)
+ struct folio *dst, struct folio *src)
 {
return -ENOSYS;
 }
diff --git a/mm/migrate.c b/mm/migrate.c
index 4d8115ca93bb..bed0de86f3ae 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -474,26 +474,26 @@ EXPORT_SYMBOL(folio_migrate_mapping);
  * of folio_migrate_mapping().
  */
 int migrate_huge_page_move_mapping(struct address_space *mapping,
-  struct page *newpage, struct page *page)
+  struct folio *dst, struct folio *src)
 {
-   XA_STATE(xas, &mapping->i_pages, page_index(page));
+   XA_STATE(xas, &mapping->i_pages, folio_index(src));
int expected_count;
 
xas_lock_irq(&xas);
-   expected_count = 2 + page_has_private(page);
-   if (!page_ref_freeze(page, expected_count)) {
+   expected_count = 2 + folio_has_private(src);
+   if (!folio_ref_freeze(src, expected_count)) {
xas_unlock_irq(&xas);
return -EAGAIN;
}
 
-   newpage->index = page->index;
-   newpage->mapping = page->mapping;
+   dst->index = src->index;
+   dst->mapping = src->mapping;
 
-   get_page(newpage);
+   folio_get(dst);
 
-   xas_store(&xas, newpage);
+   xas_store(&xas, dst);
 
-   

[PATCH v2 09/19] nfs: Convert to migrate_folio

2022-06-08 Thread Matthew Wilcox (Oracle)
Use a folio throughout this function.  migrate_page() will be converted
later.

Signed-off-by: Matthew Wilcox (Oracle) 
Acked-by: Anna Schumaker 
Reviewed-by: Christoph Hellwig 
---
 fs/nfs/file.c |  4 +---
 fs/nfs/internal.h |  6 --
 fs/nfs/write.c| 16 
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 2d72b1b7ed74..549baed76351 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -533,9 +533,7 @@ const struct address_space_operations nfs_file_aops = {
.write_end = nfs_write_end,
.invalidate_folio = nfs_invalidate_folio,
.release_folio = nfs_release_folio,
-#ifdef CONFIG_MIGRATION
-   .migratepage = nfs_migrate_page,
-#endif
+   .migrate_folio = nfs_migrate_folio,
.launder_folio = nfs_launder_folio,
.is_dirty_writeback = nfs_check_dirty_writeback,
.error_remove_page = generic_error_remove_page,
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 8f8cd6e2d4db..437ebe544aaf 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -578,8 +578,10 @@ void nfs_clear_pnfs_ds_commit_verifiers(struct pnfs_ds_commit_info *cinfo)
 #endif
 
 #ifdef CONFIG_MIGRATION
-extern int nfs_migrate_page(struct address_space *,
-   struct page *, struct page *, enum migrate_mode);
+int nfs_migrate_folio(struct address_space *, struct folio *dst,
+   struct folio *src, enum migrate_mode);
+#else
+#define nfs_migrate_folio NULL
 #endif
 
 static inline int
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 1c706465d090..649b9e633459 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -2119,27 +2119,27 @@ int nfs_wb_page(struct inode *inode, struct page *page)
 }
 
 #ifdef CONFIG_MIGRATION
-int nfs_migrate_page(struct address_space *mapping, struct page *newpage,
-   struct page *page, enum migrate_mode mode)
+int nfs_migrate_folio(struct address_space *mapping, struct folio *dst,
+   struct folio *src, enum migrate_mode mode)
 {
/*
-* If PagePrivate is set, then the page is currently associated with
+* If the private flag is set, the folio is currently associated with
 * an in-progress read or write request. Don't try to migrate it.
 *
 * FIXME: we could do this in principle, but we'll need a way to ensure
 *that we can safely release the inode reference while holding
-*the page lock.
+*the folio lock.
 */
-   if (PagePrivate(page))
+   if (folio_test_private(src))
return -EBUSY;
 
-   if (PageFsCache(page)) {
+   if (folio_test_fscache(src)) {
if (mode == MIGRATE_ASYNC)
return -EBUSY;
-   wait_on_page_fscache(page);
+   folio_wait_fscache(src);
}
 
-   return migrate_page(mapping, newpage, page, mode);
+   return migrate_page(mapping, &dst->page, &src->page, mode);
 }
 #endif
 
-- 
2.35.1



[PATCH v2 06/19] mm/migrate: Convert buffer_migrate_page() to buffer_migrate_folio()

2022-06-08 Thread Matthew Wilcox (Oracle)
Use a folio throughout __buffer_migrate_folio(), add kernel-doc for
buffer_migrate_folio() and buffer_migrate_folio_norefs(), move their
declarations to buffer.h and switch all filesystems that have wired
them up.

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Christoph Hellwig 
---
 block/fops.c|  2 +-
 fs/ext2/inode.c |  4 +-
 fs/ext4/inode.c |  4 +-
 fs/ntfs/aops.c  |  6 +--
 fs/ocfs2/aops.c |  2 +-
 include/linux/buffer_head.h | 10 +
 include/linux/fs.h  | 12 --
 mm/migrate.c| 76 ++---
 8 files changed, 65 insertions(+), 51 deletions(-)

diff --git a/block/fops.c b/block/fops.c
index d6b3276a6c68..743fc46d0aad 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -417,7 +417,7 @@ const struct address_space_operations def_blk_aops = {
.write_end  = blkdev_write_end,
.writepages = blkdev_writepages,
.direct_IO  = blkdev_direct_IO,
-   .migratepage= buffer_migrate_page_norefs,
+   .migrate_folio  = buffer_migrate_folio_norefs,
.is_dirty_writeback = buffer_check_dirty_writeback,
 };
 
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 360ce3604a2d..84570c6265aa 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -973,7 +973,7 @@ const struct address_space_operations ext2_aops = {
.bmap   = ext2_bmap,
.direct_IO  = ext2_direct_IO,
.writepages = ext2_writepages,
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.is_partially_uptodate  = block_is_partially_uptodate,
.error_remove_page  = generic_error_remove_page,
 };
@@ -989,7 +989,7 @@ const struct address_space_operations ext2_nobh_aops = {
.bmap   = ext2_bmap,
.direct_IO  = ext2_direct_IO,
.writepages = ext2_writepages,
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.error_remove_page  = generic_error_remove_page,
 };
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 1aaea53e67b5..53877ffe3c41 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3633,7 +3633,7 @@ static const struct address_space_operations ext4_aops = {
.invalidate_folio   = ext4_invalidate_folio,
.release_folio  = ext4_release_folio,
.direct_IO  = noop_direct_IO,
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.is_partially_uptodate  = block_is_partially_uptodate,
.error_remove_page  = generic_error_remove_page,
.swap_activate  = ext4_iomap_swap_activate,
@@ -3668,7 +3668,7 @@ static const struct address_space_operations ext4_da_aops = {
.invalidate_folio   = ext4_invalidate_folio,
.release_folio  = ext4_release_folio,
.direct_IO  = noop_direct_IO,
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.is_partially_uptodate  = block_is_partially_uptodate,
.error_remove_page  = generic_error_remove_page,
.swap_activate  = ext4_iomap_swap_activate,
diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
index 9e3964ea2ea0..5f4fb6ca6f2e 100644
--- a/fs/ntfs/aops.c
+++ b/fs/ntfs/aops.c
@@ -1659,7 +1659,7 @@ const struct address_space_operations ntfs_normal_aops = {
.dirty_folio= block_dirty_folio,
 #endif /* NTFS_RW */
.bmap   = ntfs_bmap,
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.is_partially_uptodate = block_is_partially_uptodate,
.error_remove_page = generic_error_remove_page,
 };
@@ -1673,7 +1673,7 @@ const struct address_space_operations ntfs_compressed_aops = {
.writepage  = ntfs_writepage,
.dirty_folio= block_dirty_folio,
 #endif /* NTFS_RW */
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.is_partially_uptodate = block_is_partially_uptodate,
.error_remove_page = generic_error_remove_page,
 };
@@ -1688,7 +1688,7 @@ const struct address_space_operations ntfs_mst_aops = {
.writepage  = ntfs_writepage,   /* Write dirty page to disk. */
.dirty_folio= filemap_dirty_folio,
 #endif /* NTFS_RW */
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.is_partially_uptodate  = block_is_partially_uptodate,
.error_remove_page = generic_error_remove_page,
 };
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 767df51f8657..1d489003f99d 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2462,7 +2462,7 @@ const struct address_space_operations ocfs2_aops = {
.direct_IO

[PATCH v2 14/19] f2fs: Convert to filemap_migrate_folio()

2022-06-08 Thread Matthew Wilcox (Oracle)
filemap_migrate_folio() fits f2fs's needs perfectly.
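
filemap_migrate_folio() itself is added by patch 11 of this series, which isn't quoted in this digest; reconstructed from the conversions around it, it looks roughly like this:

	int filemap_migrate_folio(struct address_space *mapping,
			struct folio *dst, struct folio *src, enum migrate_mode mode)
	{
		int ret;

		ret = folio_migrate_mapping(mapping, dst, src, 0);
		if (ret != MIGRATEPAGE_SUCCESS)
			return ret;

		/* Move any private data across, with its reference */
		if (folio_get_private(src))
			folio_attach_private(dst, folio_detach_private(src));

		if (mode != MIGRATE_SYNC_NO_COPY)
			folio_migrate_copy(dst, src);
		else
			folio_migrate_flags(dst, src);
		return MIGRATEPAGE_SUCCESS;
	}

which is essentially the boilerplate the per-filesystem callbacks below used to duplicate.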

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/f2fs/checkpoint.c |  4 +---
 fs/f2fs/data.c   | 40 +---
 fs/f2fs/f2fs.h   |  4 
 fs/f2fs/node.c   |  4 +---
 4 files changed, 3 insertions(+), 49 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 6d8b2bf14de0..8259e0fa97e1 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -463,9 +463,7 @@ const struct address_space_operations f2fs_meta_aops = {
.dirty_folio= f2fs_dirty_meta_folio,
.invalidate_folio = f2fs_invalidate_folio,
.release_folio  = f2fs_release_folio,
-#ifdef CONFIG_MIGRATION
-   .migratepage= f2fs_migrate_page,
-#endif
+   .migrate_folio  = filemap_migrate_folio,
 };
 
 static void __add_ino_entry(struct f2fs_sb_info *sbi, nid_t ino,
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7fcbcf979737..318a3f91ad74 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -3751,42 +3751,6 @@ static sector_t f2fs_bmap(struct address_space *mapping, sector_t block)
return blknr;
 }
 
-#ifdef CONFIG_MIGRATION
-#include <linux/migrate.h>
-
-int f2fs_migrate_page(struct address_space *mapping,
-   struct page *newpage, struct page *page, enum migrate_mode mode)
-{
-   int rc, extra_count = 0;
-
-   BUG_ON(PageWriteback(page));
-
-   rc = migrate_page_move_mapping(mapping, newpage,
-   page, extra_count);
-   if (rc != MIGRATEPAGE_SUCCESS)
-   return rc;
-
-   /* guarantee to start from no stale private field */
-   set_page_private(newpage, 0);
-   if (PagePrivate(page)) {
-   set_page_private(newpage, page_private(page));
-   SetPagePrivate(newpage);
-   get_page(newpage);
-
-   set_page_private(page, 0);
-   ClearPagePrivate(page);
-   put_page(page);
-   }
-
-   if (mode != MIGRATE_SYNC_NO_COPY)
-   migrate_page_copy(newpage, page);
-   else
-   migrate_page_states(newpage, page);
-
-   return MIGRATEPAGE_SUCCESS;
-}
-#endif
-
 #ifdef CONFIG_SWAP
 static int f2fs_migrate_blocks(struct inode *inode, block_t start_blk,
unsigned int blkcnt)
@@ -4018,15 +3982,13 @@ const struct address_space_operations f2fs_dblock_aops = {
.write_begin= f2fs_write_begin,
.write_end  = f2fs_write_end,
.dirty_folio= f2fs_dirty_data_folio,
+   .migrate_folio  = filemap_migrate_folio,
.invalidate_folio = f2fs_invalidate_folio,
.release_folio  = f2fs_release_folio,
.direct_IO  = noop_direct_IO,
.bmap   = f2fs_bmap,
.swap_activate  = f2fs_swap_activate,
.swap_deactivate = f2fs_swap_deactivate,
-#ifdef CONFIG_MIGRATION
-   .migratepage= f2fs_migrate_page,
-#endif
 };
 
 void f2fs_clear_page_cache_dirty_tag(struct page *page)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index d9bbecd008d2..f258a1b6faed 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3764,10 +3764,6 @@ int f2fs_write_single_data_page(struct page *page, int *submitted,
 void f2fs_write_failed(struct inode *inode, loff_t to);
 void f2fs_invalidate_folio(struct folio *folio, size_t offset, size_t length);
 bool f2fs_release_folio(struct folio *folio, gfp_t wait);
-#ifdef CONFIG_MIGRATION
-int f2fs_migrate_page(struct address_space *mapping, struct page *newpage,
-   struct page *page, enum migrate_mode mode);
-#endif
 bool f2fs_overwrite_io(struct inode *inode, loff_t pos, size_t len);
 void f2fs_clear_page_cache_dirty_tag(struct page *page);
 int f2fs_init_post_read_processing(void);
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 836c79a20afc..ed1cbfb0345f 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -2163,9 +2163,7 @@ const struct address_space_operations f2fs_node_aops = {
.dirty_folio= f2fs_dirty_node_folio,
.invalidate_folio = f2fs_invalidate_folio,
.release_folio  = f2fs_release_folio,
-#ifdef CONFIG_MIGRATION
-   .migratepage= f2fs_migrate_page,
-#endif
+   .migrate_folio  = filemap_migrate_folio,
 };
 
 static struct free_nid *__lookup_free_nid_list(struct f2fs_nm_info *nm_i,
-- 
2.35.1



[PATCH v2 19/19] mm/folio-compat: Remove migration compatibility functions

2022-06-08 Thread Matthew Wilcox (Oracle)
migrate_page_move_mapping(), migrate_page_copy() and migrate_page_states()
are all now unused after converting all the filesystems from
aops->migratepage() to aops->migrate_folio().

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Christoph Hellwig 
---
 include/linux/migrate.h | 11 ---
 mm/folio-compat.c   | 22 --
 mm/ksm.c|  2 +-
 3 files changed, 1 insertion(+), 34 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 59d64a1e6b4b..3e18c7048506 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -40,12 +40,8 @@ extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
 extern struct page *alloc_migration_target(struct page *page, unsigned long private);
 extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
 
-extern void migrate_page_states(struct page *newpage, struct page *page);
-extern void migrate_page_copy(struct page *newpage, struct page *page);
 int migrate_huge_page_move_mapping(struct address_space *mapping,
struct folio *dst, struct folio *src);
-extern int migrate_page_move_mapping(struct address_space *mapping,
-   struct page *newpage, struct page *page, int extra_count);
 void migration_entry_wait_on_locked(swp_entry_t entry, pte_t *ptep,
spinlock_t *ptl);
 void folio_migrate_flags(struct folio *newfolio, struct folio *folio);
@@ -66,13 +62,6 @@ static inline struct page *alloc_migration_target(struct page *page,
 static inline int isolate_movable_page(struct page *page, isolate_mode_t mode)
{ return -EBUSY; }
 
-static inline void migrate_page_states(struct page *newpage, struct page *page)
-{
-}
-
-static inline void migrate_page_copy(struct page *newpage,
-struct page *page) {}
-
 static inline int migrate_huge_page_move_mapping(struct address_space *mapping,
  struct folio *dst, struct folio *src)
 {
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 20bc15b57d93..458618c7302c 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -51,28 +51,6 @@ void mark_page_accessed(struct page *page)
 }
 EXPORT_SYMBOL(mark_page_accessed);
 
-#ifdef CONFIG_MIGRATION
-int migrate_page_move_mapping(struct address_space *mapping,
-   struct page *newpage, struct page *page, int extra_count)
-{
-   return folio_migrate_mapping(mapping, page_folio(newpage),
-   page_folio(page), extra_count);
-}
-EXPORT_SYMBOL(migrate_page_move_mapping);
-
-void migrate_page_states(struct page *newpage, struct page *page)
-{
-   folio_migrate_flags(page_folio(newpage), page_folio(page));
-}
-EXPORT_SYMBOL(migrate_page_states);
-
-void migrate_page_copy(struct page *newpage, struct page *page)
-{
-   folio_migrate_copy(page_folio(newpage), page_folio(page));
-}
-EXPORT_SYMBOL(migrate_page_copy);
-#endif
-
 bool set_page_writeback(struct page *page)
 {
return folio_start_writeback(page_folio(page));
diff --git a/mm/ksm.c b/mm/ksm.c
index 54f78c9eecae..e8f8c1a2bb39 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -712,7 +712,7 @@ static struct page *get_ksm_page(struct stable_node *stable_node,
 * however, it might mean that the page is under page_ref_freeze().
 * The __remove_mapping() case is easy, again the node is now stale;
 * the same is in reuse_ksm_page() case; but if page is swapcache
-* in migrate_page_move_mapping(), it might still be our page,
+* in folio_migrate_mapping(), it might still be our page,
 * in which case it's essential to keep the node.
 */
while (!get_page_unless_zero(page)) {
-- 
2.35.1



[PATCH v2 05/19] mm/migrate: Convert writeout() to take a folio

2022-06-08 Thread Matthew Wilcox (Oracle)
Use a folio throughout this function.

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Christoph Hellwig 
---
 mm/migrate.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 1878de817a01..6b6fec26f4d0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -761,11 +761,10 @@ int buffer_migrate_page_norefs(struct address_space *mapping,
 #endif
 
 /*
- * Writeback a page to clean the dirty state
+ * Writeback a folio to clean the dirty state
  */
-static int writeout(struct address_space *mapping, struct page *page)
+static int writeout(struct address_space *mapping, struct folio *folio)
 {
-   struct folio *folio = page_folio(page);
struct writeback_control wbc = {
.sync_mode = WB_SYNC_NONE,
.nr_to_write = 1,
@@ -779,25 +778,25 @@ static int writeout(struct address_space *mapping, struct page *page)
/* No write method for the address space */
return -EINVAL;
 
-   if (!clear_page_dirty_for_io(page))
+   if (!folio_clear_dirty_for_io(folio))
/* Someone else already triggered a write */
return -EAGAIN;
 
/*
-* A dirty page may imply that the underlying filesystem has
-* the page on some queue. So the page must be clean for
-* migration. Writeout may mean we loose the lock and the
-* page state is no longer what we checked for earlier.
+* A dirty folio may imply that the underlying filesystem has
+* the folio on some queue. So the folio must be clean for
+* migration. Writeout may mean we lose the lock and the
+* folio state is no longer what we checked for earlier.
 * At this point we know that the migration attempt cannot
 * be successful.
 */
remove_migration_ptes(folio, folio, false);
 
-   rc = mapping->a_ops->writepage(page, &wbc);
+   rc = mapping->a_ops->writepage(&folio->page, &wbc);
 
if (rc != AOP_WRITEPAGE_ACTIVATE)
/* unlocked. Relock */
-   lock_page(page);
+   folio_lock(folio);
 
return (rc < 0) ? -EIO : -EAGAIN;
 }
@@ -817,7 +816,7 @@ static int fallback_migrate_folio(struct address_space *mapping,
default:
return -EBUSY;
}
-   return writeout(mapping, &src->page);
+   return writeout(mapping, src);
}
 
/*
-- 
2.35.1



[PATCH v2 02/19] mm: Convert all PageMovable users to movable_operations

2022-06-08 Thread Matthew Wilcox (Oracle)
These drivers are rather uncomfortably hammered into the
address_space_operations hole.  They aren't filesystems and don't behave
like filesystems.  They just need their own movable_operations structure,
which we can point to directly from page->mapping.
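
The resulting operations table is small; roughly the following (reconstructed, since the include/linux/migrate.h hunk isn't shown in full below):

	struct movable_operations {
		bool (*isolate_page)(struct page *, isolate_mode_t);
		int (*migrate_page)(struct page *dst, struct page *src,
				enum migrate_mode);
		void (*putback_page)(struct page *);
	};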

Signed-off-by: Matthew Wilcox (Oracle) 
---
 arch/powerpc/platforms/pseries/cmm.c |  60 +---
 drivers/misc/vmw_balloon.c   |  61 +---
 drivers/virtio/virtio_balloon.c  |  47 +---
 include/linux/balloon_compaction.h   |   6 +-
 include/linux/fs.h   |   2 -
 include/linux/migrate.h  |  26 +--
 include/linux/page-flags.h   |   2 +-
 include/uapi/linux/magic.h   |   4 --
 mm/balloon_compaction.c  |  10 ++-
 mm/compaction.c  |  29 
 mm/migrate.c |  24 +++
 mm/util.c|   4 +-
 mm/z3fold.c  |  82 +++--
 mm/zsmalloc.c| 102 ++-
 14 files changed, 94 insertions(+), 365 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c b/arch/powerpc/platforms/pseries/cmm.c
index 15ed8206c463..5f4037c1d7fe 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -19,9 +19,6 @@
 #include 
 #include 
 #include 
-#include 
-#include 
-#include 
 #include 
 #include 
 #include 
@@ -500,19 +497,6 @@ static struct notifier_block cmm_mem_nb = {
 };
 
 #ifdef CONFIG_BALLOON_COMPACTION
-static struct vfsmount *balloon_mnt;
-
-static int cmm_init_fs_context(struct fs_context *fc)
-{
-   return init_pseudo(fc, PPC_CMM_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type balloon_fs = {
-   .name = "ppc-cmm",
-   .init_fs_context = cmm_init_fs_context,
-   .kill_sb = kill_anon_super,
-};
-
 static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
   struct page *newpage, struct page *page,
   enum migrate_mode mode)
@@ -564,47 +548,13 @@ static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
return MIGRATEPAGE_SUCCESS;
 }
 
-static int cmm_balloon_compaction_init(void)
+static void cmm_balloon_compaction_init(void)
 {
-   int rc;
-
balloon_devinfo_init(&b_dev_info);
b_dev_info.migratepage = cmm_migratepage;
-
-   balloon_mnt = kern_mount(&balloon_fs);
-   if (IS_ERR(balloon_mnt)) {
-   rc = PTR_ERR(balloon_mnt);
-   balloon_mnt = NULL;
-   return rc;
-   }
-
-   b_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
-   if (IS_ERR(b_dev_info.inode)) {
-   rc = PTR_ERR(b_dev_info.inode);
-   b_dev_info.inode = NULL;
-   kern_unmount(balloon_mnt);
-   balloon_mnt = NULL;
-   return rc;
-   }
-
-   b_dev_info.inode->i_mapping->a_ops = &balloon_aops;
-   return 0;
-}
-static void cmm_balloon_compaction_deinit(void)
-{
-   if (b_dev_info.inode)
-   iput(b_dev_info.inode);
-   b_dev_info.inode = NULL;
-   kern_unmount(balloon_mnt);
-   balloon_mnt = NULL;
 }
 #else /* CONFIG_BALLOON_COMPACTION */
-static int cmm_balloon_compaction_init(void)
-{
-   return 0;
-}
-
-static void cmm_balloon_compaction_deinit(void)
+static void cmm_balloon_compaction_init(void)
 {
 }
 #endif /* CONFIG_BALLOON_COMPACTION */
@@ -622,9 +572,7 @@ static int cmm_init(void)
if (!firmware_has_feature(FW_FEATURE_CMO) && !simulate)
return -EOPNOTSUPP;
 
-   rc = cmm_balloon_compaction_init();
-   if (rc)
-   return rc;
+   cmm_balloon_compaction_init();
 
rc = register_oom_notifier(&cmm_oom_nb);
if (rc < 0)
@@ -658,7 +606,6 @@ static int cmm_init(void)
 out_oom_notifier:
unregister_oom_notifier(&cmm_oom_nb);
 out_balloon_compaction:
-   cmm_balloon_compaction_deinit();
return rc;
 }
 
@@ -677,7 +624,6 @@ static void cmm_exit(void)
unregister_memory_notifier(&cmm_mem_nb);
cmm_free_pages(atomic_long_read(&loaned_pages));
cmm_unregister_sysfs(&cmm_dev);
-   cmm_balloon_compaction_deinit();
 }
 
 /**
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 086ce77d9074..85dd6aa33df6 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -29,8 +29,6 @@
 #include 
 #include 
 #include 
-#include 
-#include 
 #include 
 #include 
 #include 
@@ -1730,20 +1728,6 @@ static inline void vmballoon_debugfs_exit(struct vmballoon *b)
 
 
 #ifdef CONFIG_BALLOON_COMPACTION
-
-static int vmballoon_init_fs_context(struct fs_context *fc)
-{
-   return init_pseudo(fc, BALLOON_VMW_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type vmballoon_fs = {
-   .name   = "balloon-vmware",
-   .init_fs_context= vmballoon_init_fs_context,
-   .kill_sb

[PATCH v2 13/19] ubifs: Convert to filemap_migrate_folio()

2022-06-08 Thread Matthew Wilcox (Oracle)
filemap_migrate_folio() is a little more general than ubifs really needs,
but it's better to share the code.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/ubifs/file.c | 29 ++---
 1 file changed, 2 insertions(+), 27 deletions(-)

diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 04ced154960f..f2353dd676ef 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1461,29 +1461,6 @@ static bool ubifs_dirty_folio(struct address_space *mapping,
return ret;
 }
 
-#ifdef CONFIG_MIGRATION
-static int ubifs_migrate_page(struct address_space *mapping,
-   struct page *newpage, struct page *page, enum migrate_mode mode)
-{
-   int rc;
-
-   rc = migrate_page_move_mapping(mapping, newpage, page, 0);
-   if (rc != MIGRATEPAGE_SUCCESS)
-   return rc;
-
-   if (PagePrivate(page)) {
-   detach_page_private(page);
-   attach_page_private(newpage, (void *)1);
-   }
-
-   if (mode != MIGRATE_SYNC_NO_COPY)
-   migrate_page_copy(newpage, page);
-   else
-   migrate_page_states(newpage, page);
-   return MIGRATEPAGE_SUCCESS;
-}
-#endif
-
 static bool ubifs_release_folio(struct folio *folio, gfp_t unused_gfp_flags)
 {
struct inode *inode = folio->mapping->host;
@@ -1649,10 +1626,8 @@ const struct address_space_operations ubifs_file_address_operations = {
.write_end  = ubifs_write_end,
.invalidate_folio = ubifs_invalidate_folio,
.dirty_folio= ubifs_dirty_folio,
-#ifdef CONFIG_MIGRATION
-   .migratepage= ubifs_migrate_page,
-#endif
-   .release_folio= ubifs_release_folio,
+   .migrate_folio  = filemap_migrate_folio,
+   .release_folio  = ubifs_release_folio,
 };
 
 const struct inode_operations ubifs_file_inode_operations = {
-- 
2.35.1



[PATCH v2 08/19] btrfs: Convert btree_migratepage to migrate_folio

2022-06-08 Thread Matthew Wilcox (Oracle)
Use a folio throughout this function.  migrate_page() will be converted
later.

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Christoph Hellwig 
---
 fs/btrfs/disk-io.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 12b11e645c14..9ceb73f683af 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -952,28 +952,28 @@ void btrfs_submit_metadata_bio(struct inode *inode, struct bio *bio, int mirror_num)
 }
 
 #ifdef CONFIG_MIGRATION
-static int btree_migratepage(struct address_space *mapping,
-   struct page *newpage, struct page *page,
-   enum migrate_mode mode)
+static int btree_migrate_folio(struct address_space *mapping,
+   struct folio *dst, struct folio *src, enum migrate_mode mode)
 {
/*
 * we can't safely write a btree page from here,
 * we haven't done the locking hook
 */
-   if (PageDirty(page))
+   if (folio_test_dirty(src))
return -EAGAIN;
/*
 * Buffers may be managed in a filesystem specific way.
 * We must have no buffers or drop them.
 */
-   if (page_has_private(page) &&
-   !try_to_release_page(page, GFP_KERNEL))
+   if (folio_get_private(src) &&
+   !filemap_release_folio(src, GFP_KERNEL))
return -EAGAIN;
-   return migrate_page(mapping, newpage, page, mode);
+   return migrate_page(mapping, &dst->page, &src->page, mode);
 }
+#else
+#define btree_migrate_folio NULL
 #endif
 
-
 static int btree_writepages(struct address_space *mapping,
struct writeback_control *wbc)
 {
@@ -1073,10 +1073,8 @@ static const struct address_space_operations btree_aops = {
.writepages = btree_writepages,
.release_folio  = btree_release_folio,
.invalidate_folio = btree_invalidate_folio,
-#ifdef CONFIG_MIGRATION
-   .migratepage= btree_migratepage,
-#endif
-   .dirty_folio = btree_dirty_folio,
+   .migrate_folio  = btree_migrate_folio,
+   .dirty_folio= btree_dirty_folio,
 };
 
 struct extent_buffer *btrfs_find_create_tree_block(
-- 
2.35.1



[PATCH v2 00/19] Convert aops->migratepage to aops->migrate_folio

2022-06-08 Thread Matthew Wilcox (Oracle)
We're getting to the last aops that take a struct page.  The only
remaining ones are ->writepage, ->write_begin, ->write_end and
->error_remove_page.

Changes from v1:
 - Remove ->isolate_page from secretmem
 - Split the movable_operations from address_space_operations
 - Drop the conversions of balloon, zsmalloc and z3fold
 - Fix the build errors with hugetlbfs
 - Fix the kerneldoc errors
 - Fix the ;; typo

Matthew Wilcox (Oracle) (19):
  secretmem: Remove isolate_page
  mm: Convert all PageMovable users to movable_operations
  fs: Add aops->migrate_folio
  mm/migrate: Convert fallback_migrate_page() to
fallback_migrate_folio()
  mm/migrate: Convert writeout() to take a folio
  mm/migrate: Convert buffer_migrate_page() to buffer_migrate_folio()
  mm/migrate: Convert expected_page_refs() to folio_expected_refs()
  btrfs: Convert btree_migratepage to migrate_folio
  nfs: Convert to migrate_folio
  mm/migrate: Convert migrate_page() to migrate_folio()
  mm/migrate: Add filemap_migrate_folio()
  btrfs: Convert btrfs_migratepage to migrate_folio
  ubifs: Convert to filemap_migrate_folio()
  f2fs: Convert to filemap_migrate_folio()
  aio: Convert to migrate_folio
  hugetlb: Convert to migrate_folio
  secretmem: Convert to migrate_folio
  fs: Remove aops->migratepage()
  mm/folio-compat: Remove migration compatibility functions

 Documentation/filesystems/locking.rst   |   5 +-
 Documentation/filesystems/vfs.rst   |  13 +-
 Documentation/vm/page_migration.rst |  33 +--
 arch/powerpc/platforms/pseries/cmm.c|  60 +
 block/fops.c|   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c |   4 +-
 drivers/misc/vmw_balloon.c  |  61 +
 drivers/virtio/virtio_balloon.c |  47 +---
 fs/aio.c|  36 +--
 fs/btrfs/disk-io.c  |  22 +-
 fs/btrfs/inode.c|  26 +--
 fs/ext2/inode.c |   4 +-
 fs/ext4/inode.c |   4 +-
 fs/f2fs/checkpoint.c|   4 +-
 fs/f2fs/data.c  |  40 +---
 fs/f2fs/f2fs.h  |   4 -
 fs/f2fs/node.c  |   4 +-
 fs/gfs2/aops.c  |   2 +-
 fs/hugetlbfs/inode.c|  23 +-
 fs/iomap/buffered-io.c  |  25 --
 fs/nfs/file.c   |   4 +-
 fs/nfs/internal.h   |   6 +-
 fs/nfs/write.c  |  16 +-
 fs/ntfs/aops.c  |   6 +-
 fs/ocfs2/aops.c |   2 +-
 fs/ubifs/file.c |  29 +--
 fs/xfs/xfs_aops.c   |   2 +-
 fs/zonefs/super.c   |   2 +-
 include/linux/balloon_compaction.h  |   6 +-
 include/linux/buffer_head.h |  10 +
 include/linux/fs.h  |  20 +-
 include/linux/iomap.h   |   6 -
 include/linux/migrate.h |  48 ++--
 include/linux/page-flags.h  |   2 +-
 include/linux/pagemap.h |   6 +
 include/uapi/linux/magic.h  |   4 -
 mm/balloon_compaction.c |  10 +-
 mm/compaction.c |  34 ++-
 mm/folio-compat.c   |  22 --
 mm/ksm.c|   2 +-
 mm/migrate.c| 238 
 mm/migrate_device.c |   3 +-
 mm/secretmem.c  |  13 +-
 mm/shmem.c  |   2 +-
 mm/swap_state.c |   2 +-
 mm/util.c   |   4 +-
 mm/z3fold.c |  82 +--
 mm/zsmalloc.c   | 102 ++---
 48 files changed, 367 insertions(+), 735 deletions(-)

-- 
2.35.1



[PATCH v2 10/19] mm/migrate: Convert migrate_page() to migrate_folio()

2022-06-08 Thread Matthew Wilcox (Oracle)
Convert all callers to pass a folio.  Most have the folio
already available.  Switch all users from aops->migratepage to
aops->migrate_folio.  Also turn the documentation into kerneldoc.

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Christoph Hellwig 
---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c |  4 +--
 fs/btrfs/disk-io.c  |  2 +-
 fs/nfs/write.c  |  2 +-
 include/linux/migrate.h |  5 ++-
 mm/migrate.c| 37 +++--
 mm/migrate_device.c |  3 +-
 mm/shmem.c  |  2 +-
 mm/swap_state.c |  2 +-
 8 files changed, 30 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 094f06b4ce33..8423df021b71 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -216,8 +216,8 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 * However...!
 *
 * The mmu-notifier can be invalidated for a
-* migrate_page, that is alreadying holding the lock
-* on the page. Such a try_to_unmap() will result
+* migrate_folio, that is alreadying holding the lock
+* on the folio. Such a try_to_unmap() will result
 * in us calling put_pages() and so recursively try
 * to lock the page. We avoid that deadlock with
 * a trylock_page() and in exchange we risk missing
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 9ceb73f683af..8e5f1fa1e972 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -968,7 +968,7 @@ static int btree_migrate_folio(struct address_space 
*mapping,
if (folio_get_private(src) &&
!filemap_release_folio(src, GFP_KERNEL))
return -EAGAIN;
-   return migrate_page(mapping, &dst->page, &src->page, mode);
+   return migrate_folio(mapping, dst, src, mode);
 }
 #else
 #define btree_migrate_folio NULL
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 649b9e633459..69569696dde0 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -2139,7 +2139,7 @@ int nfs_migrate_folio(struct address_space *mapping, 
struct folio *dst,
folio_wait_fscache(src);
}
 
-   return migrate_page(mapping, &dst->page, &src->page, mode);
+   return migrate_folio(mapping, dst, src, mode);
 }
 #endif
 
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 48aa4be04108..82f00ad69a54 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -32,9 +32,8 @@ extern const char *migrate_reason_names[MR_TYPES];
 #ifdef CONFIG_MIGRATION
 
 extern void putback_movable_pages(struct list_head *l);
-extern int migrate_page(struct address_space *mapping,
-   struct page *newpage, struct page *page,
-   enum migrate_mode mode);
+int migrate_folio(struct address_space *mapping, struct folio *dst,
+   struct folio *src, enum migrate_mode mode);
 extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
unsigned long private, enum migrate_mode mode, int reason,
unsigned int *ret_succeeded);
diff --git a/mm/migrate.c b/mm/migrate.c
index 2e2f41572066..785e32d0cf1b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -593,34 +593,37 @@ EXPORT_SYMBOL(folio_migrate_copy);
  *Migration functions
  ***/
 
-/*
- * Common logic to directly migrate a single LRU page suitable for
- * pages that do not use PagePrivate/PagePrivate2.
+/**
+ * migrate_folio() - Simple folio migration.
+ * @mapping: The address_space containing the folio.
+ * @dst: The folio to migrate the data to.
+ * @src: The folio containing the current data.
+ * @mode: How to migrate the page.
  *
- * Pages are locked upon entry and exit.
+ * Common logic to directly migrate a single LRU folio suitable for
+ * folios that do not use PagePrivate/PagePrivate2.
+ *
+ * Folios are locked upon entry and exit.
  */
-int migrate_page(struct address_space *mapping,
-   struct page *newpage, struct page *page,
-   enum migrate_mode mode)
+int migrate_folio(struct address_space *mapping, struct folio *dst,
+   struct folio *src, enum migrate_mode mode)
 {
-   struct folio *newfolio = page_folio(newpage);
-   struct folio *folio = page_folio(page);
int rc;
 
-   BUG_ON(folio_test_writeback(folio));/* Writeback must be complete */
+   BUG_ON(folio_test_writeback(src));  /* Writeback must be complete */
 
-   rc = folio_migrate_mapping(mapping, newfolio, folio, 0);
+   rc = folio_migrate_mapping(mapping, dst, src, 0);
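
For an aops with no private data to carry across, the conversion then
amounts to pointing ->migrate_folio at this generic helper; a minimal
sketch of the wiring (mm/shmem.c takes this shape in the series;
"example_aops" is a made-up name for illustration):

	static const struct address_space_operations example_aops = {
		.dirty_folio	= filemap_dirty_folio,
	#ifdef CONFIG_MIGRATION
		.migrate_folio	= migrate_folio,
	#endif
	};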

[PATCH v2 03/19] fs: Add aops->migrate_folio

2022-06-08 Thread Matthew Wilcox (Oracle)
Provide a folio-based replacement for aops->migratepage.  Update the
documentation to document migrate_folio instead of migratepage.

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Christoph Hellwig 
---
 Documentation/filesystems/locking.rst |  5 ++--
 Documentation/filesystems/vfs.rst | 13 ++-
 Documentation/vm/page_migration.rst   | 33 ++-
 include/linux/fs.h|  4 +++-
 mm/compaction.c   |  4 +++-
 mm/migrate.c  | 11 +
 6 files changed, 40 insertions(+), 30 deletions(-)

diff --git a/Documentation/filesystems/locking.rst 
b/Documentation/filesystems/locking.rst
index c0fe711f14d3..3d28b23676bd 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -253,7 +253,8 @@ prototypes::
void (*free_folio)(struct folio *);
int (*direct_IO)(struct kiocb *, struct iov_iter *iter);
bool (*isolate_page) (struct page *, isolate_mode_t);
-   int (*migratepage)(struct address_space *, struct page *, struct page 
*);
+   int (*migrate_folio)(struct address_space *, struct folio *dst,
+   struct folio *src, enum migrate_mode);
void (*putback_page) (struct page *);
int (*launder_folio)(struct folio *);
bool (*is_partially_uptodate)(struct folio *, size_t from, size_t 
count);
@@ -281,7 +282,7 @@ release_folio:  yes
 free_folio:yes
 direct_IO:
 isolate_page:  yes
-migratepage:   yes (both)
+migrate_folio: yes (both)
 putback_page:  yes
 launder_folio: yes
 is_partially_uptodate: yes
diff --git a/Documentation/filesystems/vfs.rst 
b/Documentation/filesystems/vfs.rst
index a08c652467d7..3ae1b039b03f 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -740,7 +740,8 @@ cache in your filesystem.  The following members are 
defined:
/* isolate a page for migration */
bool (*isolate_page) (struct page *, isolate_mode_t);
/* migrate the contents of a page to the specified target */
-   int (*migratepage) (struct page *, struct page *);
+   int (*migrate_folio)(struct mapping *, struct folio *dst,
+   struct folio *src, enum migrate_mode);
/* put migration-failed page back to right list */
void (*putback_page) (struct page *);
int (*launder_folio) (struct folio *);
@@ -935,12 +936,12 @@ cache in your filesystem.  The following members are 
defined:
is successfully isolated, VM marks the page as PG_isolated via
__SetPageIsolated.
 
-``migrate_page``
+``migrate_folio``
This is used to compact the physical memory usage.  If the VM
-   wants to relocate a page (maybe off a memory card that is
-   signalling imminent failure) it will pass a new page and an old
-   page to this function.  migrate_page should transfer any private
-   data across and update any references that it has to the page.
+   wants to relocate a folio (maybe from a memory device that is
+   signalling imminent failure) it will pass a new folio and an old
+   folio to this function.  migrate_folio should transfer any private
+   data across and update any references that it has to the folio.
 
 ``putback_page``
Called by the VM when isolated page's migration fails.
diff --git a/Documentation/vm/page_migration.rst 
b/Documentation/vm/page_migration.rst
index 8c5cb8147e55..e0f73ddfabb1 100644
--- a/Documentation/vm/page_migration.rst
+++ b/Documentation/vm/page_migration.rst
@@ -181,22 +181,23 @@ which are function pointers of struct 
address_space_operations.
Once page is successfully isolated, VM uses page.lru fields so driver
shouldn't expect to preserve values in those fields.
 
-2. ``int (*migratepage) (struct address_space *mapping,``
-|  ``struct page *newpage, struct page *oldpage, enum migrate_mode);``
-
-   After isolation, VM calls migratepage() of driver with the isolated page.
-   The function of migratepage() is to move the contents of the old page to the
-   new page
-   and set up fields of struct page newpage. Keep in mind that you should
-   indicate to the VM the oldpage is no longer movable via __ClearPageMovable()
-   under page_lock if you migrated the oldpage successfully and returned
-   MIGRATEPAGE_SUCCESS. If driver cannot migrate the page at the moment, driver
-   can return -EAGAIN. On -EAGAIN, VM will retry page migration in a short time
-   because VM interprets -EAGAIN as "temporary migration failure". On returning
-   any error except -EAGAIN, VM will give up the page migration without
-   retrying.
-
-   Driver shouldn't touch the page.lru field while in the migratepage() 
function.
+2. ``int (*migrate_folio) (struct address_space *mapping,``
+|  ``struct folio *dst, struct folio *src, enum migrate_mode);``

[PATCH v2 18/19] fs: Remove aops->migratepage()

2022-06-08 Thread Matthew Wilcox (Oracle)
With all users converted to migrate_folio(), remove this operation.

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Christoph Hellwig 
---
 include/linux/fs.h | 2 --
 mm/compaction.c| 5 ++---
 mm/migrate.c   | 3 ---
 3 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9e6b17da4e11..7e06919b8f60 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -367,8 +367,6 @@ struct address_space_operations {
 */
int (*migrate_folio)(struct address_space *, struct folio *dst,
struct folio *src, enum migrate_mode);
-   int (*migratepage) (struct address_space *,
-   struct page *, struct page *, enum migrate_mode);
int (*launder_folio)(struct folio *);
bool (*is_partially_uptodate) (struct folio *, size_t from,
size_t count);
diff --git a/mm/compaction.c b/mm/compaction.c
index 458f49f9ab09..a2c53fcf933e 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1031,7 +1031,7 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
 
/*
 * Only pages without mappings or that have a
-* ->migratepage callback are possible to migrate
+* ->migrate_folio callback are possible to migrate
 * without blocking. However, we can be racing with
 * truncation so it's necessary to lock the page
 * to stabilise the mapping as truncation holds
@@ -1043,8 +1043,7 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
 
mapping = page_mapping(page);
migrate_dirty = !mapping ||
-   mapping->a_ops->migrate_folio ||
-   mapping->a_ops->migratepage;
+   mapping->a_ops->migrate_folio;
unlock_page(page);
if (!migrate_dirty)
goto isolate_fail_put;
diff --git a/mm/migrate.c b/mm/migrate.c
index bed0de86f3ae..767e41800d15 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -909,9 +909,6 @@ static int move_to_new_folio(struct folio *dst, struct 
folio *src,
 */
rc = mapping->a_ops->migrate_folio(mapping, dst, src,
mode);
-   else if (mapping->a_ops->migratepage)
-   rc = mapping->a_ops->migratepage(mapping, &dst->page,
-   &src->page, mode);
else
rc = fallback_migrate_folio(mapping, dst, src, mode);
} else {
-- 
2.35.1



[PATCH v2 12/19] btrfs: Convert btrfs_migratepage to migrate_folio

2022-06-08 Thread Matthew Wilcox (Oracle)
Use filemap_migrate_folio() to do the bulk of the work, and then copy
the ordered flag across if needed.

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Christoph Hellwig 
---
 fs/btrfs/inode.c | 26 +-
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 81737eff92f3..5f41d869c648 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8255,30 +8255,24 @@ static bool btrfs_release_folio(struct folio *folio, 
gfp_t gfp_flags)
 }
 
 #ifdef CONFIG_MIGRATION
-static int btrfs_migratepage(struct address_space *mapping,
-struct page *newpage, struct page *page,
+static int btrfs_migrate_folio(struct address_space *mapping,
+struct folio *dst, struct folio *src,
 enum migrate_mode mode)
 {
-   int ret;
+   int ret = filemap_migrate_folio(mapping, dst, src, mode);
 
-   ret = migrate_page_move_mapping(mapping, newpage, page, 0);
if (ret != MIGRATEPAGE_SUCCESS)
return ret;
 
-   if (page_has_private(page))
-   attach_page_private(newpage, detach_page_private(page));
-
-   if (PageOrdered(page)) {
-   ClearPageOrdered(page);
-   SetPageOrdered(newpage);
+   if (folio_test_ordered(src)) {
+   folio_clear_ordered(src);
+   folio_set_ordered(dst);
}
 
-   if (mode != MIGRATE_SYNC_NO_COPY)
-   migrate_page_copy(newpage, page);
-   else
-   migrate_page_states(newpage, page);
return MIGRATEPAGE_SUCCESS;
 }
+#else
+#define btrfs_migrate_folio NULL
 #endif
 
 static void btrfs_invalidate_folio(struct folio *folio, size_t offset,
@@ -11422,9 +11416,7 @@ static const struct address_space_operations btrfs_aops 
= {
.direct_IO  = noop_direct_IO,
.invalidate_folio = btrfs_invalidate_folio,
.release_folio  = btrfs_release_folio,
-#ifdef CONFIG_MIGRATION
-   .migratepage= btrfs_migratepage,
-#endif
+   .migrate_folio  = btrfs_migrate_folio,
.dirty_folio= filemap_dirty_folio,
.error_remove_page = generic_error_remove_page,
.swap_activate  = btrfs_swap_activate,
-- 
2.35.1



[PATCH v2 15/19] aio: Convert to migrate_folio

2022-06-08 Thread Matthew Wilcox (Oracle)
Use a folio throughout this function.

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Christoph Hellwig 
---
 fs/aio.c | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 3c249b938632..a1911e86859c 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -400,8 +400,8 @@ static const struct file_operations aio_ring_fops = {
 };
 
 #if IS_ENABLED(CONFIG_MIGRATION)
-static int aio_migratepage(struct address_space *mapping, struct page *new,
-   struct page *old, enum migrate_mode mode)
+static int aio_migrate_folio(struct address_space *mapping, struct folio *dst,
+   struct folio *src, enum migrate_mode mode)
 {
struct kioctx *ctx;
unsigned long flags;
@@ -435,10 +435,10 @@ static int aio_migratepage(struct address_space *mapping, 
struct page *new,
goto out;
}
 
-   idx = old->index;
+   idx = src->index;
if (idx < (pgoff_t)ctx->nr_pages) {
-   /* Make sure the old page hasn't already been changed */
-   if (ctx->ring_pages[idx] != old)
+   /* Make sure the old folio hasn't already been changed */
+   if (ctx->ring_pages[idx] != &src->page)
rc = -EAGAIN;
} else
rc = -EINVAL;
@@ -447,27 +447,27 @@ static int aio_migratepage(struct address_space *mapping, 
struct page *new,
goto out_unlock;
 
/* Writeback must be complete */
-   BUG_ON(PageWriteback(old));
-   get_page(new);
+   BUG_ON(folio_test_writeback(src));
+   folio_get(dst);
 
-   rc = migrate_page_move_mapping(mapping, new, old, 1);
+   rc = folio_migrate_mapping(mapping, dst, src, 1);
if (rc != MIGRATEPAGE_SUCCESS) {
-   put_page(new);
+   folio_put(dst);
goto out_unlock;
}
 
/* Take completion_lock to prevent other writes to the ring buffer
-* while the old page is copied to the new.  This prevents new
+* while the old folio is copied to the new.  This prevents new
 * events from being lost.
 */
spin_lock_irqsave(&ctx->completion_lock, flags);
-   migrate_page_copy(new, old);
-   BUG_ON(ctx->ring_pages[idx] != old);
-   ctx->ring_pages[idx] = new;
+   folio_migrate_copy(dst, src);
+   BUG_ON(ctx->ring_pages[idx] != &src->page);
+   ctx->ring_pages[idx] = &dst->page;
spin_unlock_irqrestore(&ctx->completion_lock, flags);
 
-   /* The old page is no longer accessible. */
-   put_page(old);
+   /* The old folio is no longer accessible. */
+   folio_put(src);
 
 out_unlock:
mutex_unlock(>ring_lock);
@@ -475,13 +475,13 @@ static int aio_migratepage(struct address_space *mapping, 
struct page *new,
spin_unlock(&mapping->private_lock);
return rc;
 }
+#else
+#define aio_migrate_folio NULL
 #endif
 
 static const struct address_space_operations aio_ctx_aops = {
.dirty_folio= noop_dirty_folio,
-#if IS_ENABLED(CONFIG_MIGRATION)
-   .migratepage= aio_migratepage,
-#endif
+   .migrate_folio  = aio_migrate_folio,
 };
 
 static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events)
-- 
2.35.1



[PATCH v2 01/19] secretmem: Remove isolate_page

2022-06-08 Thread Matthew Wilcox (Oracle)
The isolate_page operation is never called for filesystems, only
for device drivers which call SetPageMovable.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 mm/secretmem.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/mm/secretmem.c b/mm/secretmem.c
index 206ed6b40c1d..1c7f1775b56e 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -133,11 +133,6 @@ static const struct file_operations secretmem_fops = {
.mmap   = secretmem_mmap,
 };
 
-static bool secretmem_isolate_page(struct page *page, isolate_mode_t mode)
-{
-   return false;
-}
-
 static int secretmem_migratepage(struct address_space *mapping,
 struct page *newpage, struct page *page,
 enum migrate_mode mode)
@@ -155,7 +150,6 @@ const struct address_space_operations secretmem_aops = {
.dirty_folio= noop_dirty_folio,
.free_folio = secretmem_free_folio,
.migratepage= secretmem_migratepage,
-   .isolate_page   = secretmem_isolate_page,
 };
 
 static int secretmem_setattr(struct user_namespace *mnt_userns,
-- 
2.35.1
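
For contrast, the path that does use ->isolate_page is the driver side:
a movable page is tagged by the driver itself, never by a filesystem.
A sketch modelled on balloon_page_insert(), where only the
__SetPageMovable()/__ClearPageMovable() calls are real API and the
example_*() wrappers are hypothetical:

	static void example_mark_movable(struct balloon_dev_info *balloon,
					 struct page *page)
	{
		/* Opt the page in to compaction/migration. */
		__SetPageMovable(page, balloon->inode->i_mapping);
	}

	static void example_unmark_movable(struct page *page)
	{
		/* Opt it back out again; caller holds the page lock. */
		__ClearPageMovable(page);
	}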



[PATCH v2 11/19] mm/migrate: Add filemap_migrate_folio()

2022-06-08 Thread Matthew Wilcox (Oracle)
There is nothing iomap-specific about iomap_migrate_page(), and it fits
a pattern used by several other filesystems, so move it to mm/migrate.c,
convert it to be filemap_migrate_folio() and convert the iomap filesystems
to use it.

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Christoph Hellwig 
---
 fs/gfs2/aops.c  |  2 +-
 fs/iomap/buffered-io.c  | 25 -
 fs/xfs/xfs_aops.c   |  2 +-
 fs/zonefs/super.c   |  2 +-
 include/linux/iomap.h   |  6 --
 include/linux/pagemap.h |  6 ++
 mm/migrate.c| 20 
 7 files changed, 29 insertions(+), 34 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 106e90a36583..57ff883d432c 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -774,7 +774,7 @@ static const struct address_space_operations gfs2_aops = {
.invalidate_folio = iomap_invalidate_folio,
.bmap = gfs2_bmap,
.direct_IO = noop_direct_IO,
-   .migratepage = iomap_migrate_page,
+   .migrate_folio = filemap_migrate_folio,
.is_partially_uptodate = iomap_is_partially_uptodate,
.error_remove_page = generic_error_remove_page,
 };
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 66278a14bfa7..5a91aa1db945 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -489,31 +489,6 @@ void iomap_invalidate_folio(struct folio *folio, size_t 
offset, size_t len)
 }
 EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
 
-#ifdef CONFIG_MIGRATION
-int
-iomap_migrate_page(struct address_space *mapping, struct page *newpage,
-   struct page *page, enum migrate_mode mode)
-{
-   struct folio *folio = page_folio(page);
-   struct folio *newfolio = page_folio(newpage);
-   int ret;
-
-   ret = folio_migrate_mapping(mapping, newfolio, folio, 0);
-   if (ret != MIGRATEPAGE_SUCCESS)
-   return ret;
-
-   if (folio_test_private(folio))
-   folio_attach_private(newfolio, folio_detach_private(folio));
-
-   if (mode != MIGRATE_SYNC_NO_COPY)
-   folio_migrate_copy(newfolio, folio);
-   else
-   folio_migrate_flags(newfolio, folio);
-   return MIGRATEPAGE_SUCCESS;
-}
-EXPORT_SYMBOL_GPL(iomap_migrate_page);
-#endif /* CONFIG_MIGRATION */
-
 static void
 iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
 {
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 8ec38b25187b..5d1a995b15f8 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -570,7 +570,7 @@ const struct address_space_operations 
xfs_address_space_operations = {
.invalidate_folio   = iomap_invalidate_folio,
.bmap   = xfs_vm_bmap,
.direct_IO  = noop_direct_IO,
-   .migratepage= iomap_migrate_page,
+   .migrate_folio  = filemap_migrate_folio,
.is_partially_uptodate  = iomap_is_partially_uptodate,
.error_remove_page  = generic_error_remove_page,
.swap_activate  = xfs_iomap_swapfile_activate,
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index bcb21aea990a..d4c3f28f34ee 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -237,7 +237,7 @@ static const struct address_space_operations 
zonefs_file_aops = {
.dirty_folio= filemap_dirty_folio,
.release_folio  = iomap_release_folio,
.invalidate_folio   = iomap_invalidate_folio,
-   .migratepage= iomap_migrate_page,
+   .migrate_folio  = filemap_migrate_folio,
.is_partially_uptodate  = iomap_is_partially_uptodate,
.error_remove_page  = generic_error_remove_page,
.direct_IO  = noop_direct_IO,
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index e552097c67e0..758a1125e72f 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -231,12 +231,6 @@ void iomap_readahead(struct readahead_control *, const 
struct iomap_ops *ops);
 bool iomap_is_partially_uptodate(struct folio *, size_t from, size_t count);
 bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags);
 void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len);
-#ifdef CONFIG_MIGRATION
-int iomap_migrate_page(struct address_space *mapping, struct page *newpage,
-   struct page *page, enum migrate_mode mode);
-#else
-#define iomap_migrate_page NULL
-#endif
 int iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
const struct iomap_ops *ops);
 int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len,
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 1caccb9f99aa..2a67c0ad7348 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1078,6 +1078,12 @@ static inline int __must_check write_one_page(struct 
page *page)
 int __set_page_dirty_nobuffers(struct page *page);
bool noop_dirty_folio(struct address_space *mapping, struct folio *folio);
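
The helper itself is just the body deleted from fs/iomap/buffered-io.c
above with the types switched over; as a sketch, under the same
MIGRATE_SYNC_NO_COPY handling:

	int filemap_migrate_folio(struct address_space *mapping,
			struct folio *dst, struct folio *src,
			enum migrate_mode mode)
	{
		int ret;

		ret = folio_migrate_mapping(mapping, dst, src, 0);
		if (ret != MIGRATEPAGE_SUCCESS)
			return ret;

		/* Carry any fs-private data (e.g. buffers) across. */
		if (folio_test_private(src))
			folio_attach_private(dst, folio_detach_private(src));

		if (mode != MIGRATE_SYNC_NO_COPY)
			folio_migrate_copy(dst, src);	/* flags + data */
		else
			folio_migrate_flags(dst, src);	/* flags only */
		return MIGRATEPAGE_SUCCESS;
	}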

[PATCH v2 04/19] mm/migrate: Convert fallback_migrate_page() to fallback_migrate_folio()

2022-06-08 Thread Matthew Wilcox (Oracle)
Use a folio throughout.  migrate_page() will be converted to
migrate_folio() later.

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Christoph Hellwig 
---
 mm/migrate.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index e064b998ead0..1878de817a01 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -805,11 +805,11 @@ static int writeout(struct address_space *mapping, struct 
page *page)
 /*
  * Default handling if a filesystem does not provide a migration function.
  */
-static int fallback_migrate_page(struct address_space *mapping,
-   struct page *newpage, struct page *page, enum migrate_mode mode)
+static int fallback_migrate_folio(struct address_space *mapping,
+   struct folio *dst, struct folio *src, enum migrate_mode mode)
 {
-   if (PageDirty(page)) {
-   /* Only writeback pages in full synchronous migration */
+   if (folio_test_dirty(src)) {
+   /* Only writeback folios in full synchronous migration */
switch (mode) {
case MIGRATE_SYNC:
case MIGRATE_SYNC_NO_COPY:
@@ -817,18 +817,18 @@ static int fallback_migrate_page(struct address_space 
*mapping,
default:
return -EBUSY;
}
-   return writeout(mapping, page);
+   return writeout(mapping, &src->page);
}
 
/*
 * Buffers may be managed in a filesystem specific way.
 * We must have no buffers or drop them.
 */
-   if (page_has_private(page) &&
-   !try_to_release_page(page, GFP_KERNEL))
+   if (folio_test_private(src) &&
+   !filemap_release_folio(src, GFP_KERNEL))
return mode == MIGRATE_SYNC ? -EAGAIN : -EBUSY;
 
-   return migrate_page(mapping, newpage, page, mode);
+   return migrate_page(mapping, &dst->page, &src->page, mode);
 }
 
 /*
@@ -870,8 +870,7 @@ static int move_to_new_folio(struct folio *dst, struct 
folio *src,
rc = mapping->a_ops->migratepage(mapping, &dst->page,
&src->page, mode);
else
-   rc = fallback_migrate_page(mapping, &dst->page,
-   &src->page, mode);
+   rc = fallback_migrate_folio(mapping, dst, src, mode);
} else {
const struct movable_operations *mops;
 
-- 
2.35.1



[PATCH v2 17/19] secretmem: Convert to migrate_folio

2022-06-08 Thread Matthew Wilcox (Oracle)
This is little more than changing the types over; there's no real work
being done in this function.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 mm/secretmem.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/secretmem.c b/mm/secretmem.c
index 1c7f1775b56e..658a7486efa9 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -133,9 +133,8 @@ static const struct file_operations secretmem_fops = {
.mmap   = secretmem_mmap,
 };
 
-static int secretmem_migratepage(struct address_space *mapping,
-struct page *newpage, struct page *page,
-enum migrate_mode mode)
+static int secretmem_migrate_folio(struct address_space *mapping,
+   struct folio *dst, struct folio *src, enum migrate_mode mode)
 {
return -EBUSY;
 }
@@ -149,7 +148,7 @@ static void secretmem_free_folio(struct folio *folio)
 const struct address_space_operations secretmem_aops = {
.dirty_folio= noop_dirty_folio,
.free_folio = secretmem_free_folio,
-   .migratepage= secretmem_migratepage,
+   .migrate_folio  = secretmem_migrate_folio,
 };
 
 static int secretmem_setattr(struct user_namespace *mnt_userns,
-- 
2.35.1



Re: [PATCH 15/20] balloon: Convert to migrate_folio

2022-06-07 Thread Matthew Wilcox
On Tue, Jun 07, 2022 at 03:24:15PM +0100, Matthew Wilcox wrote:
> On Tue, Jun 07, 2022 at 09:36:21AM +0200, David Hildenbrand wrote:
> > On 06.06.22 22:40, Matthew Wilcox (Oracle) wrote:
> > >  const struct address_space_operations balloon_aops = {
> > > - .migratepage = balloon_page_migrate,
> > > + .migrate_folio = balloon_migrate_folio,
> > >   .isolate_page = balloon_page_isolate,
> > >   .putback_page = balloon_page_putback,
> > >  };
> > 
> > I assume you're working on conversion of the other callbacks as well,
> > because otherwise, this ends up looking a bit inconsistent and confusing :)
> 
> My intention was to finish converting aops for the next merge window.
> 
> However, it seems to me that we goofed back in 2016 by merging
> commit bda807d44454.  isolate_page() and putback_page() should
> never have been part of address_space_operations.
> 
> I'm about to embark on creating a new migrate_operations struct
> for drivers to use that contains only isolate/putback/migrate.
> No filesystem uses isolate/putback, so those can just be deleted.
> Both migrate_operations & address_space_operations will contain a
> migrate callback.

Well, that went more smoothly than I thought it would.

I can't see a nice way to split this patch up (other than making secretmem
its own patch).  We just don't have enough bits in struct page to support
both ways of handling PageMovable at the same time, so we can't convert
one driver at a time.  The diffstat is pretty compelling.
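
The bit in question lives in page->mapping rather than page->flags: a
movable page stores its operations there, distinguished only by the low
bits of the pointer.  Roughly, following the page-flags.h conventions
(page_is_movable() is a stand-in for the kernel's __PageMovable()):

	#define PAGE_MAPPING_ANON	0x1
	#define PAGE_MAPPING_MOVABLE	0x2
	#define PAGE_MAPPING_FLAGS	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)

	static inline bool page_is_movable(const struct page *page)
	{
		return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) ==
				PAGE_MAPPING_MOVABLE;
	}

There is no spare encoding left over to say "this page still uses the
old aops-based scheme", hence the all-at-once conversion.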

The patch is on top of this patch series; I think it probably makes
sense to shuffle it to be first, to avoid changing these drivers to
folios, then changing them back.

Questions:

Is what I've done with zsmalloc acceptable?  The locking in that
file is rather complex.

Can we now eliminate balloon_mnt / balloon_fs from cmm.c?  I haven't even
compiled that file, but it seems like the filesystem serves no use now.

Similar question for vmw_balloon.c, although I have compiled that.

---

I just spotted a bug with zs_unregister_migration(); it won't compile
without CONFIG_MIGRATION.  I'll fix that up if the general approach is
acceptable.

 arch/powerpc/platforms/pseries/cmm.c |   13 
 drivers/misc/vmw_balloon.c   |   10 --
 include/linux/balloon_compaction.h   |6 +---
 include/linux/fs.h   |2 -
 include/linux/migrate.h  |   27 ++
 include/linux/page-flags.h   |2 -
 mm/balloon_compaction.c  |   18 ++--
 mm/compaction.c  |   29 ---
 mm/migrate.c |   23 ---
 mm/secretmem.c   |6 
 mm/util.c|4 +-
 mm/z3fold.c  |   45 ++
 mm/zsmalloc.c|   52 +--
 13 files changed, 83 insertions(+), 154 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c 
b/arch/powerpc/platforms/pseries/cmm.c
index 15ed8206c463..2ecbab3db723 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -578,23 +578,10 @@ static int cmm_balloon_compaction_init(void)
return rc;
}
 
-   b_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
-   if (IS_ERR(b_dev_info.inode)) {
-   rc = PTR_ERR(b_dev_info.inode);
-   b_dev_info.inode = NULL;
-   kern_unmount(balloon_mnt);
-   balloon_mnt = NULL;
-   return rc;
-   }
-
-   b_dev_info.inode->i_mapping->a_ops = &balloon_aops;
return 0;
 }
 static void cmm_balloon_compaction_deinit(void)
 {
-   if (b_dev_info.inode)
-   iput(b_dev_info.inode);
-   b_dev_info.inode = NULL;
kern_unmount(balloon_mnt);
balloon_mnt = NULL;
 }
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 086ce77d9074..4a6755934bb5 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1869,10 +1869,6 @@ static int vmballoon_migratepage(struct balloon_dev_info 
*b_dev_info,
  */
 static void vmballoon_compaction_deinit(struct vmballoon *b)
 {
-   if (!IS_ERR(b->b_dev_info.inode))
-   iput(b->b_dev_info.inode);
-
-   b->b_dev_info.inode = NULL;
kern_unmount(vmballoon_mnt);
vmballoon_mnt = NULL;
 }
@@ -1895,12 +1891,6 @@ static __init int vmballoon_compaction_init(struct 
vmballoon *b)
return PTR_ERR(vmballoon_mnt);
 
b->b_dev_info.migratepage = vmballoon_migratepage;
-   b->b_dev_info.inode = alloc_anon_inode(vmballoon_mnt->mnt_sb);
-
-   if (IS_ERR(b->b_dev_info.inode))
-   return PTR_ERR(b->b_dev_info.inode);
-
-   b->b_dev_info.inode->i_mapping->a_ops = &balloon_aops;
 

Re: [PATCH 14/20] hugetlb: Convert to migrate_folio

2022-06-07 Thread Matthew Wilcox
On Tue, Jun 07, 2022 at 02:13:26PM +0800, kernel test robot wrote:
> fs/hugetlbfs/inode.c: In function 'hugetlbfs_migrate_folio':
> >> fs/hugetlbfs/inode.c:990:17: error: implicit declaration of function 
> >> 'folio_migrate_copy' [-Werror=implicit-function-declaration]
>  990 | folio_migrate_copy(dst, src);
>  | ^~
> >> fs/hugetlbfs/inode.c:992:17: error: implicit declaration of function 
> >> 'folio_migrate_flags'; did you mean 'folio_mapping_flags'? 
> >> [-Werror=implicit-function-declaration]
>  992 | folio_migrate_flags(dst, src);
>  | ^~~
>  | folio_mapping_flags
> cc1: some warnings being treated as errors

Thanks, fixed.


Re: [PATCH 04/20] mm/migrate: Convert buffer_migrate_page() to buffer_migrate_folio()

2022-06-07 Thread Matthew Wilcox
On Tue, Jun 07, 2022 at 11:37:45AM +0800, kernel test robot wrote:
> All warnings (new ones prefixed by >>):
> 
> >> mm/migrate.c:775: warning: expecting prototype for 
> >> buffer_migrate_folio_noref(). Prototype was for 
> >> buffer_migrate_folio_norefs() instead

No good deed (turning documentation into kerneldoc) goes unpunished ...
thanks, fixed.


Re: [PATCH 15/20] balloon: Convert to migrate_folio

2022-06-07 Thread Matthew Wilcox
On Tue, Jun 07, 2022 at 09:36:21AM +0200, David Hildenbrand wrote:
> On 06.06.22 22:40, Matthew Wilcox (Oracle) wrote:
> >  const struct address_space_operations balloon_aops = {
> > -   .migratepage = balloon_page_migrate,
> > +   .migrate_folio = balloon_migrate_folio,
> > .isolate_page = balloon_page_isolate,
> > .putback_page = balloon_page_putback,
> >  };
> 
> I assume you're working on conversion of the other callbacks as well,
> because otherwise, this ends up looking a bit inconsistent and confusing :)

My intention was to finish converting aops for the next merge window.

However, it seems to me that we goofed back in 2016 by merging
commit bda807d44454.  isolate_page() and putback_page() should
never have been part of address_space_operations.

I'm about to embark on creating a new migrate_operations struct
for drivers to use that contains only isolate/putback/migrate.
No filesystem uses isolate/putback, so those can just be deleted.
Both migrate_operations & address_space_operations will contain a
migrate callback.
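
As a sketch, the new struct carries just those three hooks (the v2
series lands it under the name movable_operations):

	struct movable_operations {
		bool (*isolate_page)(struct page *, isolate_mode_t);
		int (*migrate_page)(struct page *dst, struct page *src,
				enum migrate_mode);
		void (*putback_page)(struct page *);
	};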


Re: [PATCH 05/20] mm/migrate: Convert expected_page_refs() to folio_expected_refs()

2022-06-07 Thread Matthew Wilcox
On Tue, Jun 07, 2022 at 09:41:57AM -0400, Brian Foster wrote:
> On Mon, Jun 06, 2022 at 09:40:35PM +0100, Matthew Wilcox (Oracle) wrote:
> > -static int expected_page_refs(struct address_space *mapping, struct page 
> > *page)
> > +static int folio_expected_refs(struct address_space *mapping,
> > +   struct folio *folio)
> >  {
> > -   int expected_count = 1;
> > +   int refs = 1;
> > +   if (!mapping)
> > +   return refs;
> >  
> > -   if (mapping)
> > -   expected_count += compound_nr(page) + page_has_private(page);
> > -   return expected_count;
> > +   refs += folio_nr_pages(folio);
> > +   if (folio_get_private(folio))
> > +   refs++;
> 
> Why not folio_has_private() (as seems to be used for later
> page_has_private() conversions) here?

We have a horrid confusion that I'm trying to clean up stealthily
without anyone noticing.  I would have gotten away with it too if it
weren't for you pesky kids.

#define PAGE_FLAGS_PRIVATE  \
(1UL << PG_private | 1UL << PG_private_2)

static inline int page_has_private(struct page *page)
{
return !!(page->flags & PAGE_FLAGS_PRIVATE);
}

So what this function is saying is that there is one extra refcount
expected on the struct page if PG_private _or_ PG_private_2 is set.

How are filesystems expected to manage their page's refcount with this
rule?  Increment the refcount when setting PG_private unless
PG_private_2 is already set?  Decrement the refcount when clearing
PG_private_2 unless PG_private is set?

This is garbage.  IMO, PG_private_2 should have no bearing on the page's
refcount.  Only btrfs and the netfs's use private_2 and neither of them
do anything to the refcount when setting/clearing it.  So that's what
I'm implementing here.
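
Under that rule the pairing is carried entirely by PG_private, which the
attach/detach helpers already get right; a sketch of the only refcount
discipline a filesystem needs (the example_*() wrappers are hypothetical,
the folio_*_private() calls are real):

	static void example_attach(struct folio *folio, void *bufs)
	{
		/* Sets PG_private and takes the extra folio reference. */
		folio_attach_private(folio, bufs);
	}

	static void *example_detach(struct folio *folio)
	{
		/* Clears PG_private and drops that reference. */
		return folio_detach_private(folio);
	}

PG_private_2 is set and cleared with no refcount side-effects at all.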

> > +
> > +   return refs;;
> 
> Nit: extra ;

Oh, that's where it went ;-)  I had a compile error due to a missing
semicolon at some point, and thought it was just a typo ...


[PATCH 14/20] hugetlb: Convert to migrate_folio

2022-06-06 Thread Matthew Wilcox (Oracle)
This involves converting migrate_huge_page_move_mapping().  We also need a
folio variant of hugetlb_set_page_subpool(), but that's for a later patch.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/hugetlbfs/inode.c| 19 ++-
 include/linux/migrate.h |  6 +++---
 mm/migrate.c| 18 +-
 3 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 14d33f725e05..583ca3f52c04 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -954,25 +954,26 @@ static int hugetlbfs_symlink(struct user_namespace 
*mnt_userns,
return error;
 }
 
-static int hugetlbfs_migrate_page(struct address_space *mapping,
-   struct page *newpage, struct page *page,
+static int hugetlbfs_migrate_folio(struct address_space *mapping,
+   struct folio *dst, struct folio *src,
enum migrate_mode mode)
 {
int rc;
 
-   rc = migrate_huge_page_move_mapping(mapping, newpage, page);
+   rc = migrate_huge_page_move_mapping(mapping, dst, src);
if (rc != MIGRATEPAGE_SUCCESS)
return rc;
 
-   if (hugetlb_page_subpool(page)) {
-   hugetlb_set_page_subpool(newpage, hugetlb_page_subpool(page));
-   hugetlb_set_page_subpool(page, NULL);
+   if (hugetlb_page_subpool(&src->page)) {
+   hugetlb_set_page_subpool(&dst->page,
+   hugetlb_page_subpool(&src->page));
+   hugetlb_set_page_subpool(&src->page, NULL);
}
 
if (mode != MIGRATE_SYNC_NO_COPY)
-   migrate_page_copy(newpage, page);
+   folio_migrate_copy(dst, src);
else
-   migrate_page_states(newpage, page);
+   folio_migrate_flags(dst, src);
 
return MIGRATEPAGE_SUCCESS;
 }
@@ -1142,7 +1143,7 @@ static const struct address_space_operations 
hugetlbfs_aops = {
.write_begin= hugetlbfs_write_begin,
.write_end  = hugetlbfs_write_end,
.dirty_folio= noop_dirty_folio,
-   .migratepage= hugetlbfs_migrate_page,
+   .migrate_folio  = hugetlbfs_migrate_folio,
.error_remove_page  = hugetlbfs_error_remove_page,
 };
 
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 4ef22806cd8e..088749471485 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -35,8 +35,8 @@ extern int isolate_movable_page(struct page *page, 
isolate_mode_t mode);
 
 extern void migrate_page_states(struct page *newpage, struct page *page);
 extern void migrate_page_copy(struct page *newpage, struct page *page);
-extern int migrate_huge_page_move_mapping(struct address_space *mapping,
- struct page *newpage, struct page *page);
+int migrate_huge_page_move_mapping(struct address_space *mapping,
+   struct folio *dst, struct folio *src);
 extern int migrate_page_move_mapping(struct address_space *mapping,
struct page *newpage, struct page *page, int extra_count);
 void migration_entry_wait_on_locked(swp_entry_t entry, pte_t *ptep,
@@ -67,7 +67,7 @@ static inline void migrate_page_copy(struct page *newpage,
 struct page *page) {}
 
 static inline int migrate_huge_page_move_mapping(struct address_space *mapping,
- struct page *newpage, struct page *page)
+ struct folio *dst, struct folio *src)
 {
return -ENOSYS;
 }
diff --git a/mm/migrate.c b/mm/migrate.c
index 148dd0463dec..a8edd226c72d 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -475,26 +475,26 @@ EXPORT_SYMBOL(folio_migrate_mapping);
  * of folio_migrate_mapping().
  */
 int migrate_huge_page_move_mapping(struct address_space *mapping,
-  struct page *newpage, struct page *page)
+  struct folio *dst, struct folio *src)
 {
-   XA_STATE(xas, &mapping->i_pages, page_index(page));
+   XA_STATE(xas, &mapping->i_pages, folio_index(src));
int expected_count;
 
xas_lock_irq(&xas);
-   expected_count = 2 + page_has_private(page);
-   if (!page_ref_freeze(page, expected_count)) {
+   expected_count = 2 + folio_has_private(src);
+   if (!folio_ref_freeze(src, expected_count)) {
xas_unlock_irq(&xas);
return -EAGAIN;
}
 
-   newpage->index = page->index;
-   newpage->mapping = page->mapping;
+   dst->index = src->index;
+   dst->mapping = src->mapping;
 
-   get_page(newpage);
+   folio_get(dst);
 
-   xas_store(&xas, newpage);
+   xas_store(&xas, dst);
 
-   page_ref_unfreeze(page, expected_count - 1);
+   folio_ref_unfreeze(src, expected_count - 1);
 
xas_unlock_irq(&xas);
 
-- 
2.35.1


[PATCH 05/20] mm/migrate: Convert expected_page_refs() to folio_expected_refs()

2022-06-06 Thread Matthew Wilcox (Oracle)
Now that both callers have a folio, convert this function to
take a folio & rename it.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 mm/migrate.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 77b8c662c9ca..e0a593e5b5f9 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -337,13 +337,18 @@ void pmd_migration_entry_wait(struct mm_struct *mm, pmd_t 
*pmd)
 }
 #endif
 
-static int expected_page_refs(struct address_space *mapping, struct page *page)
+static int folio_expected_refs(struct address_space *mapping,
+   struct folio *folio)
 {
-   int expected_count = 1;
+   int refs = 1;
+   if (!mapping)
+   return refs;
 
-   if (mapping)
-   expected_count += compound_nr(page) + page_has_private(page);
-   return expected_count;
+   refs += folio_nr_pages(folio);
+   if (folio_get_private(folio))
+   refs++;
+
+   return refs;;
 }
 
 /*
@@ -360,7 +365,7 @@ int folio_migrate_mapping(struct address_space *mapping,
XA_STATE(xas, &mapping->i_pages, folio_index(folio));
struct zone *oldzone, *newzone;
int dirty;
-   int expected_count = expected_page_refs(mapping, &folio->page) +
extra_count;
+   int expected_count = folio_expected_refs(mapping, folio) + extra_count;
long nr = folio_nr_pages(folio);
 
if (!mapping) {
@@ -670,7 +675,7 @@ static int __buffer_migrate_folio(struct address_space 
*mapping,
return migrate_page(mapping, &dst->page, &src->page, mode);
 
/* Check whether page does not have extra refs before we do more work */
-   expected_count = expected_page_refs(mapping, &src->page);
+   expected_count = folio_expected_refs(mapping, src);
if (folio_ref_count(src) != expected_count)
return -EAGAIN;
 
-- 
2.35.1



[PATCH 00/20] Convert aops->migratepage to aops->migrate_folio

2022-06-06 Thread Matthew Wilcox (Oracle)
I plan to submit these patches through my pagecache tree in the upcoming
merge window.  I'm pretty happy that most filesystems are now using
common code for ->migrate_folio; it's not something that most filesystem
people want to care about.  I'm running xfstests using xfs against it now,
but it's little more than compile tested for other filesystems.

Matthew Wilcox (Oracle) (20):
  fs: Add aops->migrate_folio
  mm/migrate: Convert fallback_migrate_page() to
fallback_migrate_folio()
  mm/migrate: Convert writeout() to take a folio
  mm/migrate: Convert buffer_migrate_page() to buffer_migrate_folio()
  mm/migrate: Convert expected_page_refs() to folio_expected_refs()
  btrfs: Convert btree_migratepage to migrate_folio
  nfs: Convert to migrate_folio
  mm/migrate: Convert migrate_page() to migrate_folio()
  mm/migrate: Add filemap_migrate_folio()
  btrfs: Convert btrfs_migratepage to migrate_folio
  ubifs: Convert to filemap_migrate_folio()
  f2fs: Convert to filemap_migrate_folio()
  aio: Convert to migrate_folio
  hugetlb: Convert to migrate_folio
  balloon: Convert to migrate_folio
  secretmem: Convert to migrate_folio
  z3fold: Convert to migrate_folio
  zsmalloc: Convert to migrate_folio
  fs: Remove aops->migratepage()
  mm/folio-compat: Remove migration compatibility functions

 Documentation/filesystems/locking.rst   |   5 +-
 Documentation/filesystems/vfs.rst   |  13 +-
 Documentation/vm/page_migration.rst |  33 +--
 block/fops.c|   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c |   4 +-
 fs/aio.c|  36 ++--
 fs/btrfs/disk-io.c  |  22 +-
 fs/btrfs/inode.c|  26 +--
 fs/ext2/inode.c |   4 +-
 fs/ext4/inode.c |   4 +-
 fs/f2fs/checkpoint.c|   4 +-
 fs/f2fs/data.c  |  40 +---
 fs/f2fs/f2fs.h  |   4 -
 fs/f2fs/node.c  |   4 +-
 fs/gfs2/aops.c  |   2 +-
 fs/hugetlbfs/inode.c|  19 +-
 fs/iomap/buffered-io.c  |  25 ---
 fs/nfs/file.c   |   4 +-
 fs/nfs/internal.h   |   6 +-
 fs/nfs/write.c  |  16 +-
 fs/ntfs/aops.c  |   6 +-
 fs/ocfs2/aops.c |   2 +-
 fs/ubifs/file.c |  29 +--
 fs/xfs/xfs_aops.c   |   2 +-
 fs/zonefs/super.c   |   2 +-
 include/linux/buffer_head.h |  10 +
 include/linux/fs.h  |  18 +-
 include/linux/iomap.h   |   6 -
 include/linux/migrate.h |  22 +-
 include/linux/pagemap.h |   6 +
 mm/balloon_compaction.c |  15 +-
 mm/compaction.c |   5 +-
 mm/folio-compat.c   |  22 --
 mm/ksm.c|   2 +-
 mm/migrate.c| 217 
 mm/migrate_device.c |   3 +-
 mm/secretmem.c  |   6 +-
 mm/shmem.c  |   2 +-
 mm/swap_state.c |   2 +-
 mm/z3fold.c |   8 +-
 mm/zsmalloc.c   |   8 +-
 41 files changed, 287 insertions(+), 379 deletions(-)

-- 
2.35.1



[PATCH 07/20] nfs: Convert to migrate_folio

2022-06-06 Thread Matthew Wilcox (Oracle)
Use a folio throughout this function.  migrate_page() will be converted
later.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/nfs/file.c |  4 +---
 fs/nfs/internal.h |  6 --
 fs/nfs/write.c| 16 
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 2d72b1b7ed74..549baed76351 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -533,9 +533,7 @@ const struct address_space_operations nfs_file_aops = {
.write_end = nfs_write_end,
.invalidate_folio = nfs_invalidate_folio,
.release_folio = nfs_release_folio,
-#ifdef CONFIG_MIGRATION
-   .migratepage = nfs_migrate_page,
-#endif
+   .migrate_folio = nfs_migrate_folio,
.launder_folio = nfs_launder_folio,
.is_dirty_writeback = nfs_check_dirty_writeback,
.error_remove_page = generic_error_remove_page,
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 8f8cd6e2d4db..437ebe544aaf 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -578,8 +578,10 @@ void nfs_clear_pnfs_ds_commit_verifiers(struct 
pnfs_ds_commit_info *cinfo)
 #endif
 
 #ifdef CONFIG_MIGRATION
-extern int nfs_migrate_page(struct address_space *,
-   struct page *, struct page *, enum migrate_mode);
+int nfs_migrate_folio(struct address_space *, struct folio *dst,
+   struct folio *src, enum migrate_mode);
+#else
+#define nfs_migrate_folio NULL
 #endif
 
 static inline int
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 1c706465d090..649b9e633459 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -2119,27 +2119,27 @@ int nfs_wb_page(struct inode *inode, struct page *page)
 }
 
 #ifdef CONFIG_MIGRATION
-int nfs_migrate_page(struct address_space *mapping, struct page *newpage,
-   struct page *page, enum migrate_mode mode)
+int nfs_migrate_folio(struct address_space *mapping, struct folio *dst,
+   struct folio *src, enum migrate_mode mode)
 {
/*
-* If PagePrivate is set, then the page is currently associated with
+* If the private flag is set, the folio is currently associated with
 * an in-progress read or write request. Don't try to migrate it.
 *
 * FIXME: we could do this in principle, but we'll need a way to ensure
 *that we can safely release the inode reference while holding
-*the page lock.
+*the folio lock.
 */
-   if (PagePrivate(page))
+   if (folio_test_private(src))
return -EBUSY;
 
-   if (PageFsCache(page)) {
+   if (folio_test_fscache(src)) {
if (mode == MIGRATE_ASYNC)
return -EBUSY;
-   wait_on_page_fscache(page);
+   folio_wait_fscache(src);
}
 
-   return migrate_page(mapping, newpage, page, mode);
+   return migrate_page(mapping, &dst->page, &src->page, mode);
 }
 #endif
 
-- 
2.35.1



[PATCH 16/20] secretmem: Convert to migrate_folio

2022-06-06 Thread Matthew Wilcox (Oracle)
This is little more than changing the types over; there's no real work
being done in this function.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 mm/secretmem.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/secretmem.c b/mm/secretmem.c
index 206ed6b40c1d..9c7f6e3bf3e1 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -138,8 +138,8 @@ static bool secretmem_isolate_page(struct page *page, 
isolate_mode_t mode)
return false;
 }
 
-static int secretmem_migratepage(struct address_space *mapping,
-struct page *newpage, struct page *page,
+static int secretmem_migrate_folio(struct address_space *mapping,
+struct folio *dst, struct folio *src,
 enum migrate_mode mode)
 {
return -EBUSY;
@@ -154,7 +154,7 @@ static void secretmem_free_folio(struct folio *folio)
 const struct address_space_operations secretmem_aops = {
.dirty_folio= noop_dirty_folio,
.free_folio = secretmem_free_folio,
-   .migratepage= secretmem_migratepage,
+   .migrate_folio  = secretmem_migrate_folio,
.isolate_page   = secretmem_isolate_page,
 };
 
-- 
2.35.1



[PATCH 15/20] balloon: Convert to migrate_folio

2022-06-06 Thread Matthew Wilcox (Oracle)
This is little more than changing the types over; there's no real work
being done in this function.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 mm/balloon_compaction.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index 4b8eab4b3f45..3f75b876ad76 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -230,11 +230,10 @@ static void balloon_page_putback(struct page *page)
 
 
 /* move_to_new_page() counterpart for a ballooned page */
-static int balloon_page_migrate(struct address_space *mapping,
-   struct page *newpage, struct page *page,
-   enum migrate_mode mode)
+static int balloon_migrate_folio(struct address_space *mapping,
+   struct folio *dst, struct folio *src, enum migrate_mode mode)
 {
-   struct balloon_dev_info *balloon = balloon_page_device(page);
+   struct balloon_dev_info *balloon = balloon_page_device(&src->page);
 
/*
 * We can not easily support the no copy case here so ignore it as it
@@ -244,14 +243,14 @@ static int balloon_page_migrate(struct address_space 
*mapping,
if (mode == MIGRATE_SYNC_NO_COPY)
return -EINVAL;
 
-   VM_BUG_ON_PAGE(!PageLocked(page), page);
-   VM_BUG_ON_PAGE(!PageLocked(newpage), newpage);
+   VM_BUG_ON_FOLIO(!folio_test_locked(src), src);
+   VM_BUG_ON_FOLIO(!folio_test_locked(dst), dst);
 
-   return balloon->migratepage(balloon, newpage, page, mode);
+   return balloon->migratepage(balloon, &dst->page, &src->page, mode);
 }
 
 const struct address_space_operations balloon_aops = {
-   .migratepage = balloon_page_migrate,
+   .migrate_folio = balloon_migrate_folio,
.isolate_page = balloon_page_isolate,
.putback_page = balloon_page_putback,
 };
-- 
2.35.1



[PATCH 20/20] mm/folio-compat: Remove migration compatibility functions

2022-06-06 Thread Matthew Wilcox (Oracle)
migrate_page_move_mapping(), migrate_page_copy() and migrate_page_states()
are all now unused after converting all the filesystems from
aops->migratepage() to aops->migrate_folio().

Signed-off-by: Matthew Wilcox (Oracle) 
---
 include/linux/migrate.h | 11 ---
 mm/folio-compat.c   | 22 --
 mm/ksm.c|  2 +-
 3 files changed, 1 insertion(+), 34 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 088749471485..4670f3aec232 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -33,12 +33,8 @@ extern int migrate_pages(struct list_head *l, new_page_t 
new, free_page_t free,
 extern struct page *alloc_migration_target(struct page *page, unsigned long 
private);
 extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
 
-extern void migrate_page_states(struct page *newpage, struct page *page);
-extern void migrate_page_copy(struct page *newpage, struct page *page);
 int migrate_huge_page_move_mapping(struct address_space *mapping,
struct folio *dst, struct folio *src);
-extern int migrate_page_move_mapping(struct address_space *mapping,
-   struct page *newpage, struct page *page, int extra_count);
 void migration_entry_wait_on_locked(swp_entry_t entry, pte_t *ptep,
spinlock_t *ptl);
 void folio_migrate_flags(struct folio *newfolio, struct folio *folio);
@@ -59,13 +55,6 @@ static inline struct page *alloc_migration_target(struct 
page *page,
 static inline int isolate_movable_page(struct page *page, isolate_mode_t mode)
{ return -EBUSY; }
 
-static inline void migrate_page_states(struct page *newpage, struct page *page)
-{
-}
-
-static inline void migrate_page_copy(struct page *newpage,
-struct page *page) {}
-
 static inline int migrate_huge_page_move_mapping(struct address_space *mapping,
  struct folio *dst, struct folio *src)
 {
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 20bc15b57d93..458618c7302c 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -51,28 +51,6 @@ void mark_page_accessed(struct page *page)
 }
 EXPORT_SYMBOL(mark_page_accessed);
 
-#ifdef CONFIG_MIGRATION
-int migrate_page_move_mapping(struct address_space *mapping,
-   struct page *newpage, struct page *page, int extra_count)
-{
-   return folio_migrate_mapping(mapping, page_folio(newpage),
-   page_folio(page), extra_count);
-}
-EXPORT_SYMBOL(migrate_page_move_mapping);
-
-void migrate_page_states(struct page *newpage, struct page *page)
-{
-   folio_migrate_flags(page_folio(newpage), page_folio(page));
-}
-EXPORT_SYMBOL(migrate_page_states);
-
-void migrate_page_copy(struct page *newpage, struct page *page)
-{
-   folio_migrate_copy(page_folio(newpage), page_folio(page));
-}
-EXPORT_SYMBOL(migrate_page_copy);
-#endif
-
 bool set_page_writeback(struct page *page)
 {
return folio_start_writeback(page_folio(page));
diff --git a/mm/ksm.c b/mm/ksm.c
index 54f78c9eecae..e8f8c1a2bb39 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -712,7 +712,7 @@ static struct page *get_ksm_page(struct stable_node 
*stable_node,
 * however, it might mean that the page is under page_ref_freeze().
 * The __remove_mapping() case is easy, again the node is now stale;
 * the same is in reuse_ksm_page() case; but if page is swapcache
-* in migrate_page_move_mapping(), it might still be our page,
+* in folio_migrate_mapping(), it might still be our page,
 * in which case it's essential to keep the node.
 */
while (!get_page_unless_zero(page)) {
-- 
2.35.1



[PATCH 06/20] btrfs: Convert btree_migratepage to migrate_folio

2022-06-06 Thread Matthew Wilcox (Oracle)
Use a folio throughout this function.  migrate_page() will be converted
later.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/btrfs/disk-io.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 12b11e645c14..9ceb73f683af 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -952,28 +952,28 @@ void btrfs_submit_metadata_bio(struct inode *inode, 
struct bio *bio, int mirror_
 }
 
 #ifdef CONFIG_MIGRATION
-static int btree_migratepage(struct address_space *mapping,
-   struct page *newpage, struct page *page,
-   enum migrate_mode mode)
+static int btree_migrate_folio(struct address_space *mapping,
+   struct folio *dst, struct folio *src, enum migrate_mode mode)
 {
/*
 * we can't safely write a btree page from here,
 * we haven't done the locking hook
 */
-   if (PageDirty(page))
+   if (folio_test_dirty(src))
return -EAGAIN;
/*
 * Buffers may be managed in a filesystem specific way.
 * We must have no buffers or drop them.
 */
-   if (page_has_private(page) &&
-   !try_to_release_page(page, GFP_KERNEL))
+   if (folio_get_private(src) &&
+   !filemap_release_folio(src, GFP_KERNEL))
return -EAGAIN;
-   return migrate_page(mapping, newpage, page, mode);
+   return migrate_page(mapping, &dst->page, &src->page, mode);
 }
+#else
+#define btree_migrate_folio NULL
 #endif
 
-
 static int btree_writepages(struct address_space *mapping,
struct writeback_control *wbc)
 {
@@ -1073,10 +1073,8 @@ static const struct address_space_operations btree_aops 
= {
.writepages = btree_writepages,
.release_folio  = btree_release_folio,
.invalidate_folio = btree_invalidate_folio,
-#ifdef CONFIG_MIGRATION
-   .migratepage= btree_migratepage,
-#endif
-   .dirty_folio = btree_dirty_folio,
+   .migrate_folio  = btree_migrate_folio,
+   .dirty_folio= btree_dirty_folio,
 };
 
 struct extent_buffer *btrfs_find_create_tree_block(
-- 
2.35.1



[PATCH 08/20] mm/migrate: Convert migrate_page() to migrate_folio()

2022-06-06 Thread Matthew Wilcox (Oracle)
Convert all callers to pass a folio.  Most have the folio
already available.  Switch all users from aops->migratepage to
aops->migrate_folio.  Also turn the documentation into kerneldoc.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c |  4 +--
 fs/btrfs/disk-io.c  |  2 +-
 fs/nfs/write.c  |  2 +-
 include/linux/migrate.h |  5 ++-
 mm/migrate.c| 37 +++--
 mm/migrate_device.c |  3 +-
 mm/shmem.c  |  2 +-
 mm/swap_state.c |  2 +-
 8 files changed, 30 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 094f06b4ce33..8423df021b71 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -216,8 +216,8 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 * However...!
 *
 * The mmu-notifier can be invalidated for a
-* migrate_page, that is alreadying holding the lock
-* on the page. Such a try_to_unmap() will result
+* migrate_folio, that is alreadying holding the lock
+* on the folio. Such a try_to_unmap() will result
 * in us calling put_pages() and so recursively try
 * to lock the page. We avoid that deadlock with
 * a trylock_page() and in exchange we risk missing
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 9ceb73f683af..8e5f1fa1e972 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -968,7 +968,7 @@ static int btree_migrate_folio(struct address_space 
*mapping,
if (folio_get_private(src) &&
!filemap_release_folio(src, GFP_KERNEL))
return -EAGAIN;
-   return migrate_page(mapping, &dst->page, &src->page, mode);
+   return migrate_folio(mapping, dst, src, mode);
 }
 #else
 #define btree_migrate_folio NULL
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 649b9e633459..69569696dde0 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -2139,7 +2139,7 @@ int nfs_migrate_folio(struct address_space *mapping, 
struct folio *dst,
folio_wait_fscache(src);
}
 
-   return migrate_page(mapping, >page, >page, mode);
+   return migrate_folio(mapping, dst, src, mode);
 }
 #endif
 
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 069a89e847f3..4ef22806cd8e 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -25,9 +25,8 @@ extern const char *migrate_reason_names[MR_TYPES];
 #ifdef CONFIG_MIGRATION
 
 extern void putback_movable_pages(struct list_head *l);
-extern int migrate_page(struct address_space *mapping,
-   struct page *newpage, struct page *page,
-   enum migrate_mode mode);
+int migrate_folio(struct address_space *mapping, struct folio *dst,
+   struct folio *src, enum migrate_mode mode);
 extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
unsigned long private, enum migrate_mode mode, int reason,
unsigned int *ret_succeeded);
diff --git a/mm/migrate.c b/mm/migrate.c
index e0a593e5b5f9..6232c291fdb9 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -594,34 +594,37 @@ EXPORT_SYMBOL(folio_migrate_copy);
  *Migration functions
  ***/
 
-/*
- * Common logic to directly migrate a single LRU page suitable for
- * pages that do not use PagePrivate/PagePrivate2.
+/**
+ * migrate_folio() - Simple folio migration.
+ * @mapping: The address_space containing the folio.
+ * @dst: The folio to migrate the data to.
+ * @src: The folio containing the current data.
+ * @mode: How to migrate the page.
  *
- * Pages are locked upon entry and exit.
+ * Common logic to directly migrate a single LRU folio suitable for
+ * folios that do not use PagePrivate/PagePrivate2.
+ *
+ * Folios are locked upon entry and exit.
  */
-int migrate_page(struct address_space *mapping,
-   struct page *newpage, struct page *page,
-   enum migrate_mode mode)
+int migrate_folio(struct address_space *mapping, struct folio *dst,
+   struct folio *src, enum migrate_mode mode)
 {
-   struct folio *newfolio = page_folio(newpage);
-   struct folio *folio = page_folio(page);
int rc;
 
-   BUG_ON(folio_test_writeback(folio));/* Writeback must be complete */
+   BUG_ON(folio_test_writeback(src));  /* Writeback must be complete */
 
-   rc = folio_migrate_mapping(mapping, newfolio, folio, 0);
+   rc = folio_migrate_mapping(mapping, dst, src, 0);
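
The hunk is cut short at this point.  Based on the body being removed
above and on the folio_migrate_copy()/folio_migrate_flags() pattern used
elsewhere in this series, the remainder of the converted function
plausibly reads as follows (a reconstruction, not the verbatim hunk):

	if (rc != MIGRATEPAGE_SUCCESS)
		return rc;

	if (mode != MIGRATE_SYNC_NO_COPY)
		folio_migrate_copy(dst, src);
	else
		folio_migrate_flags(dst, src);
	return MIGRATEPAGE_SUCCESS;
}
EXPORT_SYMBOL(migrate_folio);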

[PATCH 10/20] btrfs: Convert btrfs_migratepage to migrate_folio

2022-06-06 Thread Matthew Wilcox (Oracle)
Use filemap_migrate_folio() to do the bulk of the work, and then copy
the ordered flag across if needed.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/btrfs/inode.c | 26 +-
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 81737eff92f3..5f41d869c648 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8255,30 +8255,24 @@ static bool btrfs_release_folio(struct folio *folio, 
gfp_t gfp_flags)
 }
 
 #ifdef CONFIG_MIGRATION
-static int btrfs_migratepage(struct address_space *mapping,
-struct page *newpage, struct page *page,
+static int btrfs_migrate_folio(struct address_space *mapping,
+struct folio *dst, struct folio *src,
 enum migrate_mode mode)
 {
-   int ret;
+   int ret = filemap_migrate_folio(mapping, dst, src, mode);
 
-   ret = migrate_page_move_mapping(mapping, newpage, page, 0);
if (ret != MIGRATEPAGE_SUCCESS)
return ret;
 
-   if (page_has_private(page))
-   attach_page_private(newpage, detach_page_private(page));
-
-   if (PageOrdered(page)) {
-   ClearPageOrdered(page);
-   SetPageOrdered(newpage);
+   if (folio_test_ordered(src)) {
+   folio_clear_ordered(src);
+   folio_set_ordered(dst);
}
 
-   if (mode != MIGRATE_SYNC_NO_COPY)
-   migrate_page_copy(newpage, page);
-   else
-   migrate_page_states(newpage, page);
return MIGRATEPAGE_SUCCESS;
 }
+#else
+#define btrfs_migrate_folio NULL
 #endif
 
 static void btrfs_invalidate_folio(struct folio *folio, size_t offset,
@@ -11422,9 +11416,7 @@ static const struct address_space_operations btrfs_aops 
= {
.direct_IO  = noop_direct_IO,
.invalidate_folio = btrfs_invalidate_folio,
.release_folio  = btrfs_release_folio,
-#ifdef CONFIG_MIGRATION
-   .migratepage= btrfs_migratepage,
-#endif
+   .migrate_folio  = btrfs_migrate_folio,
.dirty_folio= filemap_dirty_folio,
.error_remove_page = generic_error_remove_page,
.swap_activate  = btrfs_swap_activate,
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 03/20] mm/migrate: Convert writeout() to take a folio

2022-06-06 Thread Matthew Wilcox (Oracle)
Use a folio throughout this function.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 mm/migrate.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index d772ce63d7e2..f19246c12fe9 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -762,11 +762,10 @@ int buffer_migrate_page_norefs(struct address_space 
*mapping,
 #endif
 
 /*
- * Writeback a page to clean the dirty state
+ * Writeback a folio to clean the dirty state
  */
-static int writeout(struct address_space *mapping, struct page *page)
+static int writeout(struct address_space *mapping, struct folio *folio)
 {
-   struct folio *folio = page_folio(page);
struct writeback_control wbc = {
.sync_mode = WB_SYNC_NONE,
.nr_to_write = 1,
@@ -780,25 +779,25 @@ static int writeout(struct address_space *mapping, struct 
page *page)
/* No write method for the address space */
return -EINVAL;
 
-   if (!clear_page_dirty_for_io(page))
+   if (!folio_clear_dirty_for_io(folio))
/* Someone else already triggered a write */
return -EAGAIN;
 
/*
-* A dirty page may imply that the underlying filesystem has
-* the page on some queue. So the page must be clean for
-* migration. Writeout may mean we loose the lock and the
-* page state is no longer what we checked for earlier.
+* A dirty folio may imply that the underlying filesystem has
+* the folio on some queue. So the folio must be clean for
+* migration. Writeout may mean we lose the lock and the
+* folio state is no longer what we checked for earlier.
 * At this point we know that the migration attempt cannot
 * be successful.
 */
remove_migration_ptes(folio, folio, false);
 
-   rc = mapping->a_ops->writepage(page, &wbc);
+   rc = mapping->a_ops->writepage(&folio->page, &wbc);
+   rc = mapping->a_ops->writepage(>page, );
 
if (rc != AOP_WRITEPAGE_ACTIVATE)
/* unlocked. Relock */
-   lock_page(page);
+   folio_lock(folio);
 
return (rc < 0) ? -EIO : -EAGAIN;
 }
@@ -818,7 +817,7 @@ static int fallback_migrate_folio(struct address_space 
*mapping,
default:
return -EBUSY;
}
-   return writeout(mapping, &src->page);
+   return writeout(mapping, src);
}
 
/*
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 02/20] mm/migrate: Convert fallback_migrate_page() to fallback_migrate_folio()

2022-06-06 Thread Matthew Wilcox (Oracle)
Use a folio throughout.  migrate_page() will be converted to
migrate_folio() later.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 mm/migrate.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 75cb6aa38988..d772ce63d7e2 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -806,11 +806,11 @@ static int writeout(struct address_space *mapping, struct 
page *page)
 /*
  * Default handling if a filesystem does not provide a migration function.
  */
-static int fallback_migrate_page(struct address_space *mapping,
-   struct page *newpage, struct page *page, enum migrate_mode mode)
+static int fallback_migrate_folio(struct address_space *mapping,
+   struct folio *dst, struct folio *src, enum migrate_mode mode)
 {
-   if (PageDirty(page)) {
-   /* Only writeback pages in full synchronous migration */
+   if (folio_test_dirty(src)) {
+   /* Only writeback folios in full synchronous migration */
switch (mode) {
case MIGRATE_SYNC:
case MIGRATE_SYNC_NO_COPY:
@@ -818,18 +818,18 @@ static int fallback_migrate_page(struct address_space 
*mapping,
default:
return -EBUSY;
}
-   return writeout(mapping, page);
+   return writeout(mapping, &src->page);
}
 
/*
 * Buffers may be managed in a filesystem specific way.
 * We must have no buffers or drop them.
 */
-   if (page_has_private(page) &&
-   !try_to_release_page(page, GFP_KERNEL))
+   if (folio_test_private(src) &&
+   !filemap_release_folio(src, GFP_KERNEL))
return mode == MIGRATE_SYNC ? -EAGAIN : -EBUSY;
 
-   return migrate_page(mapping, newpage, page, mode);
+   return migrate_page(mapping, &dst->page, &src->page, mode);
 }
 
 /*
@@ -872,8 +872,7 @@ static int move_to_new_folio(struct folio *dst, struct 
folio *src,
rc = mapping->a_ops->migratepage(mapping, &dst->page,
&src->page, mode);
else
-   rc = fallback_migrate_page(mapping, &dst->page,
-   &src->page, mode);
+   rc = fallback_migrate_folio(mapping, dst, src, mode);
} else {
/*
 * In case of non-lru page, it could be released after
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 01/20] fs: Add aops->migrate_folio

2022-06-06 Thread Matthew Wilcox (Oracle)
Provide a folio-based replacement for aops->migratepage.  Update the
documentation to document migrate_folio instead of migratepage.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 Documentation/filesystems/locking.rst |  5 ++--
 Documentation/filesystems/vfs.rst | 13 ++-
 Documentation/vm/page_migration.rst   | 33 ++-
 include/linux/fs.h|  4 +++-
 mm/compaction.c   |  4 +++-
 mm/migrate.c  | 19 ++-
 6 files changed, 46 insertions(+), 32 deletions(-)

diff --git a/Documentation/filesystems/locking.rst 
b/Documentation/filesystems/locking.rst
index c0fe711f14d3..3d28b23676bd 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -253,7 +253,8 @@ prototypes::
void (*free_folio)(struct folio *);
int (*direct_IO)(struct kiocb *, struct iov_iter *iter);
bool (*isolate_page) (struct page *, isolate_mode_t);
-   int (*migratepage)(struct address_space *, struct page *, struct page 
*);
+   int (*migrate_folio)(struct address_space *, struct folio *dst,
+   struct folio *src, enum migrate_mode);
void (*putback_page) (struct page *);
int (*launder_folio)(struct folio *);
bool (*is_partially_uptodate)(struct folio *, size_t from, size_t 
count);
@@ -281,7 +282,7 @@ release_folio:  yes
 free_folio:yes
 direct_IO:
 isolate_page:  yes
-migratepage:   yes (both)
+migrate_folio: yes (both)
 putback_page:  yes
 launder_folio: yes
 is_partially_uptodate: yes
diff --git a/Documentation/filesystems/vfs.rst 
b/Documentation/filesystems/vfs.rst
index a08c652467d7..3ae1b039b03f 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -740,7 +740,8 @@ cache in your filesystem.  The following members are 
defined:
/* isolate a page for migration */
bool (*isolate_page) (struct page *, isolate_mode_t);
/* migrate the contents of a page to the specified target */
-   int (*migratepage) (struct page *, struct page *);
+   int (*migrate_folio)(struct mapping *, struct folio *dst,
+   struct folio *src, enum migrate_mode);
/* put migration-failed page back to right list */
void (*putback_page) (struct page *);
int (*launder_folio) (struct folio *);
@@ -935,12 +936,12 @@ cache in your filesystem.  The following members are 
defined:
is successfully isolated, VM marks the page as PG_isolated via
__SetPageIsolated.
 
-``migrate_page``
+``migrate_folio``
This is used to compact the physical memory usage.  If the VM
-   wants to relocate a page (maybe off a memory card that is
-   signalling imminent failure) it will pass a new page and an old
-   page to this function.  migrate_page should transfer any private
-   data across and update any references that it has to the page.
+   wants to relocate a folio (maybe from a memory device that is
+   signalling imminent failure) it will pass a new folio and an old
+   folio to this function.  migrate_folio should transfer any private
+   data across and update any references that it has to the folio.
 
 ``putback_page``
Called by the VM when isolated page's migration fails.
diff --git a/Documentation/vm/page_migration.rst 
b/Documentation/vm/page_migration.rst
index 8c5cb8147e55..e0f73ddfabb1 100644
--- a/Documentation/vm/page_migration.rst
+++ b/Documentation/vm/page_migration.rst
@@ -181,22 +181,23 @@ which are function pointers of struct 
address_space_operations.
Once page is successfully isolated, VM uses page.lru fields so driver
shouldn't expect to preserve values in those fields.
 
-2. ``int (*migratepage) (struct address_space *mapping,``
-|  ``struct page *newpage, struct page *oldpage, enum migrate_mode);``
-
-   After isolation, VM calls migratepage() of driver with the isolated page.
-   The function of migratepage() is to move the contents of the old page to the
-   new page
-   and set up fields of struct page newpage. Keep in mind that you should
-   indicate to the VM the oldpage is no longer movable via __ClearPageMovable()
-   under page_lock if you migrated the oldpage successfully and returned
-   MIGRATEPAGE_SUCCESS. If driver cannot migrate the page at the moment, driver
-   can return -EAGAIN. On -EAGAIN, VM will retry page migration in a short time
-   because VM interprets -EAGAIN as "temporary migration failure". On returning
-   any error except -EAGAIN, VM will give up the page migration without
-   retrying.
-
-   Driver shouldn't touch the page.lru field while in the migratepage() 
function.
+2. ``int (*migrate_folio) (struct address_space *mapping,``
+|  ``struct folio *dst, struct folio *src, enum migrate_mode);``

[PATCH 04/20] mm/migrate: Convert buffer_migrate_page() to buffer_migrate_folio()

2022-06-06 Thread Matthew Wilcox (Oracle)
Use a folio throughout __buffer_migrate_folio(), add kernel-doc for
buffer_migrate_folio() and buffer_migrate_folio_norefs(), move their
declarations to buffer.h and switch all filesystems that have wired
them up.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 block/fops.c|  2 +-
 fs/ext2/inode.c |  4 +-
 fs/ext4/inode.c |  4 +-
 fs/ntfs/aops.c  |  6 +--
 fs/ocfs2/aops.c |  2 +-
 include/linux/buffer_head.h | 10 +
 include/linux/fs.h  | 12 --
 mm/migrate.c| 76 ++---
 8 files changed, 65 insertions(+), 51 deletions(-)

diff --git a/block/fops.c b/block/fops.c
index d6b3276a6c68..743fc46d0aad 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -417,7 +417,7 @@ const struct address_space_operations def_blk_aops = {
.write_end  = blkdev_write_end,
.writepages = blkdev_writepages,
.direct_IO  = blkdev_direct_IO,
-   .migratepage= buffer_migrate_page_norefs,
+   .migrate_folio  = buffer_migrate_folio_norefs,
.is_dirty_writeback = buffer_check_dirty_writeback,
 };
 
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 360ce3604a2d..84570c6265aa 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -973,7 +973,7 @@ const struct address_space_operations ext2_aops = {
.bmap   = ext2_bmap,
.direct_IO  = ext2_direct_IO,
.writepages = ext2_writepages,
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.is_partially_uptodate  = block_is_partially_uptodate,
.error_remove_page  = generic_error_remove_page,
 };
@@ -989,7 +989,7 @@ const struct address_space_operations ext2_nobh_aops = {
.bmap   = ext2_bmap,
.direct_IO  = ext2_direct_IO,
.writepages = ext2_writepages,
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.error_remove_page  = generic_error_remove_page,
 };
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 1aaea53e67b5..53877ffe3c41 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3633,7 +3633,7 @@ static const struct address_space_operations ext4_aops = {
.invalidate_folio   = ext4_invalidate_folio,
.release_folio  = ext4_release_folio,
.direct_IO  = noop_direct_IO,
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.is_partially_uptodate  = block_is_partially_uptodate,
.error_remove_page  = generic_error_remove_page,
.swap_activate  = ext4_iomap_swap_activate,
@@ -3668,7 +3668,7 @@ static const struct address_space_operations ext4_da_aops 
= {
.invalidate_folio   = ext4_invalidate_folio,
.release_folio  = ext4_release_folio,
.direct_IO  = noop_direct_IO,
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.is_partially_uptodate  = block_is_partially_uptodate,
.error_remove_page  = generic_error_remove_page,
.swap_activate  = ext4_iomap_swap_activate,
diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
index 9e3964ea2ea0..5f4fb6ca6f2e 100644
--- a/fs/ntfs/aops.c
+++ b/fs/ntfs/aops.c
@@ -1659,7 +1659,7 @@ const struct address_space_operations ntfs_normal_aops = {
.dirty_folio= block_dirty_folio,
 #endif /* NTFS_RW */
.bmap   = ntfs_bmap,
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.is_partially_uptodate = block_is_partially_uptodate,
.error_remove_page = generic_error_remove_page,
 };
@@ -1673,7 +1673,7 @@ const struct address_space_operations 
ntfs_compressed_aops = {
.writepage  = ntfs_writepage,
.dirty_folio= block_dirty_folio,
 #endif /* NTFS_RW */
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.is_partially_uptodate = block_is_partially_uptodate,
.error_remove_page = generic_error_remove_page,
 };
@@ -1688,7 +1688,7 @@ const struct address_space_operations ntfs_mst_aops = {
.writepage  = ntfs_writepage,   /* Write dirty page to disk. */
.dirty_folio= filemap_dirty_folio,
 #endif /* NTFS_RW */
-   .migratepage= buffer_migrate_page,
+   .migrate_folio  = buffer_migrate_folio,
.is_partially_uptodate  = block_is_partially_uptodate,
.error_remove_page = generic_error_remove_page,
 };
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 767df51f8657..1d489003f99d 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2462,7 +2462,7 @@ const struct address_space_operations ocfs2_aops = {
.direct_IO  = ocfs2_direct_IO
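
The include/linux/buffer_head.h hunk does not survive in full here.  The
diffstat shows 10 lines added, and per the commit message the moved
declarations follow the same NULL-stub pattern fs.h previously used for
buffer_migrate_page(); a sketch, not the verbatim hunk:

	#ifdef CONFIG_MIGRATION
	int buffer_migrate_folio(struct address_space *mapping,
			struct folio *dst, struct folio *src, enum migrate_mode mode);
	int buffer_migrate_folio_norefs(struct address_space *mapping,
			struct folio *dst, struct folio *src, enum migrate_mode mode);
	#else
	#define buffer_migrate_folio NULL
	#define buffer_migrate_folio_norefs NULL
	#endif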

[PATCH 09/20] mm/migrate: Add filemap_migrate_folio()

2022-06-06 Thread Matthew Wilcox (Oracle)
There is nothing iomap-specific about iomap_migrate_page(), and it fits
a pattern used by several other filesystems, so move it to mm/migrate.c,
convert it to be filemap_migrate_folio() and convert the iomap filesystems
to use it.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/gfs2/aops.c  |  2 +-
 fs/iomap/buffered-io.c  | 25 -
 fs/xfs/xfs_aops.c   |  2 +-
 fs/zonefs/super.c   |  2 +-
 include/linux/iomap.h   |  6 --
 include/linux/pagemap.h |  6 ++
 mm/migrate.c| 20 
 7 files changed, 29 insertions(+), 34 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 106e90a36583..57ff883d432c 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -774,7 +774,7 @@ static const struct address_space_operations gfs2_aops = {
.invalidate_folio = iomap_invalidate_folio,
.bmap = gfs2_bmap,
.direct_IO = noop_direct_IO,
-   .migratepage = iomap_migrate_page,
+   .migrate_folio = filemap_migrate_folio,
.is_partially_uptodate = iomap_is_partially_uptodate,
.error_remove_page = generic_error_remove_page,
 };
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 66278a14bfa7..5a91aa1db945 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -489,31 +489,6 @@ void iomap_invalidate_folio(struct folio *folio, size_t 
offset, size_t len)
 }
 EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
 
-#ifdef CONFIG_MIGRATION
-int
-iomap_migrate_page(struct address_space *mapping, struct page *newpage,
-   struct page *page, enum migrate_mode mode)
-{
-   struct folio *folio = page_folio(page);
-   struct folio *newfolio = page_folio(newpage);
-   int ret;
-
-   ret = folio_migrate_mapping(mapping, newfolio, folio, 0);
-   if (ret != MIGRATEPAGE_SUCCESS)
-   return ret;
-
-   if (folio_test_private(folio))
-   folio_attach_private(newfolio, folio_detach_private(folio));
-
-   if (mode != MIGRATE_SYNC_NO_COPY)
-   folio_migrate_copy(newfolio, folio);
-   else
-   folio_migrate_flags(newfolio, folio);
-   return MIGRATEPAGE_SUCCESS;
-}
-EXPORT_SYMBOL_GPL(iomap_migrate_page);
-#endif /* CONFIG_MIGRATION */
-
 static void
 iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
 {
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 8ec38b25187b..5d1a995b15f8 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -570,7 +570,7 @@ const struct address_space_operations 
xfs_address_space_operations = {
.invalidate_folio   = iomap_invalidate_folio,
.bmap   = xfs_vm_bmap,
.direct_IO  = noop_direct_IO,
-   .migratepage= iomap_migrate_page,
+   .migrate_folio  = filemap_migrate_folio,
.is_partially_uptodate  = iomap_is_partially_uptodate,
.error_remove_page  = generic_error_remove_page,
.swap_activate  = xfs_iomap_swapfile_activate,
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index bcb21aea990a..d4c3f28f34ee 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -237,7 +237,7 @@ static const struct address_space_operations 
zonefs_file_aops = {
.dirty_folio= filemap_dirty_folio,
.release_folio  = iomap_release_folio,
.invalidate_folio   = iomap_invalidate_folio,
-   .migratepage= iomap_migrate_page,
+   .migrate_folio  = filemap_migrate_folio,
.is_partially_uptodate  = iomap_is_partially_uptodate,
.error_remove_page  = generic_error_remove_page,
.direct_IO  = noop_direct_IO,
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index e552097c67e0..758a1125e72f 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -231,12 +231,6 @@ void iomap_readahead(struct readahead_control *, const 
struct iomap_ops *ops);
 bool iomap_is_partially_uptodate(struct folio *, size_t from, size_t count);
 bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags);
 void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len);
-#ifdef CONFIG_MIGRATION
-int iomap_migrate_page(struct address_space *mapping, struct page *newpage,
-   struct page *page, enum migrate_mode mode);
-#else
-#define iomap_migrate_page NULL
-#endif
 int iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
const struct iomap_ops *ops);
 int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len,
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 1caccb9f99aa..2a67c0ad7348 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1078,6 +1078,12 @@ static inline int __must_check write_one_page(struct 
page *page)
 int __set_page_dirty_nobuffers(struct page *page);
 bool noop_dirty_folio(struct address_space *mapping, struct folio *folio);
 
+#ifdef CONFIG_MIGRATION
+int

[PATCH 19/20] fs: Remove aops->migratepage()

2022-06-06 Thread Matthew Wilcox (Oracle)
With all users converted to migrate_folio(), remove this operation.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 include/linux/fs.h |  2 --
 mm/compaction.c|  5 ++---
 mm/migrate.c   | 10 +-
 3 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5737c92ed286..95347cc035ae 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -367,8 +367,6 @@ struct address_space_operations {
 */
int (*migrate_folio)(struct address_space *, struct folio *dst,
struct folio *src, enum migrate_mode);
-   int (*migratepage) (struct address_space *,
-   struct page *, struct page *, enum migrate_mode);
bool (*isolate_page)(struct page *, isolate_mode_t);
void (*putback_page)(struct page *);
int (*launder_folio)(struct folio *);
diff --git a/mm/compaction.c b/mm/compaction.c
index db34b459e5d9..f0dc62159c0e 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1034,7 +1034,7 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
 
/*
 * Only pages without mappings or that have a
-* ->migratepage callback are possible to migrate
+* ->migrate_folio callback are possible to migrate
 * without blocking. However, we can be racing with
 * truncation so it's necessary to lock the page
 * to stabilise the mapping as truncation holds
@@ -1046,8 +1046,7 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
 
mapping = page_mapping(page);
migrate_dirty = !mapping ||
-   mapping->a_ops->migrate_folio ||
-   mapping->a_ops->migratepage;
+   mapping->a_ops->migrate_folio;
unlock_page(page);
if (!migrate_dirty)
goto isolate_fail_put;
diff --git a/mm/migrate.c b/mm/migrate.c
index a8edd226c72d..c5560430dce4 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -911,9 +911,6 @@ static int move_to_new_folio(struct folio *dst, struct 
folio *src,
 */
rc = mapping->a_ops->migrate_folio(mapping, dst, src,
mode);
-   else if (mapping->a_ops->migratepage)
-   rc = mapping->a_ops->migratepage(mapping, &dst->page,
-   &src->page, mode);
else
rc = fallback_migrate_folio(mapping, dst, src, mode);
} else {
@@ -928,12 +925,7 @@ static int move_to_new_folio(struct folio *dst, struct 
folio *src,
goto out;
}
 
-   if (mapping->a_ops->migrate_folio)
-   rc = mapping->a_ops->migrate_folio(mapping, dst, src,
-   mode);
-   else
-   rc = mapping->a_ops->migratepage(mapping, &dst->page,
-   &src->page, mode);
+   rc = mapping->a_ops->migrate_folio(mapping, dst, src, mode);
WARN_ON_ONCE(rc == MIGRATEPAGE_SUCCESS &&
!folio_test_isolated(src));
}
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 12/20] f2fs: Convert to filemap_migrate_folio()

2022-06-06 Thread Matthew Wilcox (Oracle)
filemap_migrate_folio() fits f2fs's needs perfectly.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/f2fs/checkpoint.c |  4 +---
 fs/f2fs/data.c   | 40 +---
 fs/f2fs/f2fs.h   |  4 
 fs/f2fs/node.c   |  4 +---
 4 files changed, 3 insertions(+), 49 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 6d8b2bf14de0..8259e0fa97e1 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -463,9 +463,7 @@ const struct address_space_operations f2fs_meta_aops = {
.dirty_folio= f2fs_dirty_meta_folio,
.invalidate_folio = f2fs_invalidate_folio,
.release_folio  = f2fs_release_folio,
-#ifdef CONFIG_MIGRATION
-   .migratepage= f2fs_migrate_page,
-#endif
+   .migrate_folio  = filemap_migrate_folio,
 };
 
 static void __add_ino_entry(struct f2fs_sb_info *sbi, nid_t ino,
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7fcbcf979737..318a3f91ad74 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -3751,42 +3751,6 @@ static sector_t f2fs_bmap(struct address_space *mapping, 
sector_t block)
return blknr;
 }
 
-#ifdef CONFIG_MIGRATION
-#include 
-
-int f2fs_migrate_page(struct address_space *mapping,
-   struct page *newpage, struct page *page, enum migrate_mode mode)
-{
-   int rc, extra_count = 0;
-
-   BUG_ON(PageWriteback(page));
-
-   rc = migrate_page_move_mapping(mapping, newpage,
-   page, extra_count);
-   if (rc != MIGRATEPAGE_SUCCESS)
-   return rc;
-
-   /* guarantee to start from no stale private field */
-   set_page_private(newpage, 0);
-   if (PagePrivate(page)) {
-   set_page_private(newpage, page_private(page));
-   SetPagePrivate(newpage);
-   get_page(newpage);
-
-   set_page_private(page, 0);
-   ClearPagePrivate(page);
-   put_page(page);
-   }
-
-   if (mode != MIGRATE_SYNC_NO_COPY)
-   migrate_page_copy(newpage, page);
-   else
-   migrate_page_states(newpage, page);
-
-   return MIGRATEPAGE_SUCCESS;
-}
-#endif
-
 #ifdef CONFIG_SWAP
 static int f2fs_migrate_blocks(struct inode *inode, block_t start_blk,
unsigned int blkcnt)
@@ -4018,15 +3982,13 @@ const struct address_space_operations f2fs_dblock_aops 
= {
.write_begin= f2fs_write_begin,
.write_end  = f2fs_write_end,
.dirty_folio= f2fs_dirty_data_folio,
+   .migrate_folio  = filemap_migrate_folio,
.invalidate_folio = f2fs_invalidate_folio,
.release_folio  = f2fs_release_folio,
.direct_IO  = noop_direct_IO,
.bmap   = f2fs_bmap,
.swap_activate  = f2fs_swap_activate,
.swap_deactivate = f2fs_swap_deactivate,
-#ifdef CONFIG_MIGRATION
-   .migratepage= f2fs_migrate_page,
-#endif
 };
 
 void f2fs_clear_page_cache_dirty_tag(struct page *page)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index d9bbecd008d2..f258a1b6faed 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3764,10 +3764,6 @@ int f2fs_write_single_data_page(struct page *page, int 
*submitted,
 void f2fs_write_failed(struct inode *inode, loff_t to);
 void f2fs_invalidate_folio(struct folio *folio, size_t offset, size_t length);
 bool f2fs_release_folio(struct folio *folio, gfp_t wait);
-#ifdef CONFIG_MIGRATION
-int f2fs_migrate_page(struct address_space *mapping, struct page *newpage,
-   struct page *page, enum migrate_mode mode);
-#endif
 bool f2fs_overwrite_io(struct inode *inode, loff_t pos, size_t len);
 void f2fs_clear_page_cache_dirty_tag(struct page *page);
 int f2fs_init_post_read_processing(void);
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 836c79a20afc..ed1cbfb0345f 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -2163,9 +2163,7 @@ const struct address_space_operations f2fs_node_aops = {
.dirty_folio= f2fs_dirty_node_folio,
.invalidate_folio = f2fs_invalidate_folio,
.release_folio  = f2fs_release_folio,
-#ifdef CONFIG_MIGRATION
-   .migratepage= f2fs_migrate_page,
-#endif
+   .migrate_folio  = filemap_migrate_folio,
 };
 
 static struct free_nid *__lookup_free_nid_list(struct f2fs_nm_info *nm_i,
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 17/20] z3fold: Convert to migrate_folio

2022-06-06 Thread Matthew Wilcox (Oracle)
z3fold doesn't really use folios, but it needs to be called like this
in order to migrate an individual page.  Convert from a folio back to
a page until we decide how to handle migration better for z3fold.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 mm/z3fold.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/z3fold.c b/mm/z3fold.c
index f41f8b0d9e9a..5d091c41fb35 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -1554,9 +1554,11 @@ static bool z3fold_page_isolate(struct page *page, 
isolate_mode_t mode)
return false;
 }
 
-static int z3fold_page_migrate(struct address_space *mapping, struct page 
*newpage,
-  struct page *page, enum migrate_mode mode)
+static int z3fold_migrate_folio(struct address_space *mapping,
+   struct folio *dst, struct folio *src, enum migrate_mode mode)
 {
+   struct page *newpage = &dst->page;
+   struct page *page = &src->page;
struct z3fold_header *zhdr, *new_zhdr;
struct z3fold_pool *pool;
struct address_space *new_mapping;
@@ -1644,7 +1646,7 @@ static void z3fold_page_putback(struct page *page)
 
 static const struct address_space_operations z3fold_aops = {
.isolate_page = z3fold_page_isolate,
-   .migratepage = z3fold_page_migrate,
+   .migrate_folio = z3fold_migrate_folio,
.putback_page = z3fold_page_putback,
 };
 
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 13/20] aio: Convert to migrate_folio

2022-06-06 Thread Matthew Wilcox (Oracle)
Use a folio throughout this function.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/aio.c | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 3c249b938632..a1911e86859c 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -400,8 +400,8 @@ static const struct file_operations aio_ring_fops = {
 };
 
 #if IS_ENABLED(CONFIG_MIGRATION)
-static int aio_migratepage(struct address_space *mapping, struct page *new,
-   struct page *old, enum migrate_mode mode)
+static int aio_migrate_folio(struct address_space *mapping, struct folio *dst,
+   struct folio *src, enum migrate_mode mode)
 {
struct kioctx *ctx;
unsigned long flags;
@@ -435,10 +435,10 @@ static int aio_migratepage(struct address_space *mapping, 
struct page *new,
goto out;
}
 
-   idx = old->index;
+   idx = src->index;
if (idx < (pgoff_t)ctx->nr_pages) {
-   /* Make sure the old page hasn't already been changed */
-   if (ctx->ring_pages[idx] != old)
+   /* Make sure the old folio hasn't already been changed */
+   if (ctx->ring_pages[idx] != &src->page)
rc = -EAGAIN;
} else
rc = -EINVAL;
@@ -447,27 +447,27 @@ static int aio_migratepage(struct address_space *mapping, 
struct page *new,
goto out_unlock;
 
/* Writeback must be complete */
-   BUG_ON(PageWriteback(old));
-   get_page(new);
+   BUG_ON(folio_test_writeback(src));
+   folio_get(dst);
 
-   rc = migrate_page_move_mapping(mapping, new, old, 1);
+   rc = folio_migrate_mapping(mapping, dst, src, 1);
if (rc != MIGRATEPAGE_SUCCESS) {
-   put_page(new);
+   folio_put(dst);
goto out_unlock;
}
 
/* Take completion_lock to prevent other writes to the ring buffer
-* while the old page is copied to the new.  This prevents new
+* while the old folio is copied to the new.  This prevents new
 * events from being lost.
 */
spin_lock_irqsave(&ctx->completion_lock, flags);
-   migrate_page_copy(new, old);
-   BUG_ON(ctx->ring_pages[idx] != old);
-   ctx->ring_pages[idx] = new;
+   folio_migrate_copy(dst, src);
+   BUG_ON(ctx->ring_pages[idx] != &src->page);
+   ctx->ring_pages[idx] = &dst->page;
spin_unlock_irqrestore(&ctx->completion_lock, flags);
 
-   /* The old page is no longer accessible. */
-   put_page(old);
+   /* The old folio is no longer accessible. */
+   folio_put(src);
 
 out_unlock:
mutex_unlock(&ctx->ring_lock);
@@ -475,13 +475,13 @@ static int aio_migratepage(struct address_space *mapping, 
struct page *new,
spin_unlock(&mapping->private_lock);
return rc;
 }
+#else
+#define aio_migrate_folio NULL
 #endif
 
 static const struct address_space_operations aio_ctx_aops = {
.dirty_folio= noop_dirty_folio,
-#if IS_ENABLED(CONFIG_MIGRATION)
-   .migratepage= aio_migratepage,
-#endif
+   .migrate_folio  = aio_migrate_folio,
 };
 
 static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events)
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 11/20] ubifs: Convert to filemap_migrate_folio()

2022-06-06 Thread Matthew Wilcox (Oracle)
filemap_migrate_folio() is a little more general than ubifs really needs,
but it's better to share the code.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 fs/ubifs/file.c | 29 ++---
 1 file changed, 2 insertions(+), 27 deletions(-)

diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 04ced154960f..f2353dd676ef 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1461,29 +1461,6 @@ static bool ubifs_dirty_folio(struct address_space 
*mapping,
return ret;
 }
 
-#ifdef CONFIG_MIGRATION
-static int ubifs_migrate_page(struct address_space *mapping,
-   struct page *newpage, struct page *page, enum migrate_mode mode)
-{
-   int rc;
-
-   rc = migrate_page_move_mapping(mapping, newpage, page, 0);
-   if (rc != MIGRATEPAGE_SUCCESS)
-   return rc;
-
-   if (PagePrivate(page)) {
-   detach_page_private(page);
-   attach_page_private(newpage, (void *)1);
-   }
-
-   if (mode != MIGRATE_SYNC_NO_COPY)
-   migrate_page_copy(newpage, page);
-   else
-   migrate_page_states(newpage, page);
-   return MIGRATEPAGE_SUCCESS;
-}
-#endif
-
 static bool ubifs_release_folio(struct folio *folio, gfp_t unused_gfp_flags)
 {
struct inode *inode = folio->mapping->host;
@@ -1649,10 +1626,8 @@ const struct address_space_operations 
ubifs_file_address_operations = {
.write_end  = ubifs_write_end,
.invalidate_folio = ubifs_invalidate_folio,
.dirty_folio= ubifs_dirty_folio,
-#ifdef CONFIG_MIGRATION
-   .migratepage= ubifs_migrate_page,
-#endif
-   .release_folio= ubifs_release_folio,
+   .migrate_folio  = filemap_migrate_folio,
+   .release_folio  = ubifs_release_folio,
 };
 
 const struct inode_operations ubifs_file_inode_operations = {
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 18/20] zsmalloc: Convert to migrate_folio

2022-06-06 Thread Matthew Wilcox (Oracle)
zsmalloc doesn't really use folios, but it needs to be called like this
in order to migrate an individual page.  Convert from a folio back to
a page until we decide how to handle migration better for zsmalloc.

Signed-off-by: Matthew Wilcox (Oracle) 
---
 mm/zsmalloc.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 5d5fc04385b8..8ed79121195a 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1865,9 +1865,11 @@ static bool zs_page_isolate(struct page *page, 
isolate_mode_t mode)
return true;
 }
 
-static int zs_page_migrate(struct address_space *mapping, struct page *newpage,
-   struct page *page, enum migrate_mode mode)
+static int zs_migrate_folio(struct address_space *mapping,
+   struct folio *dst, struct folio *src, enum migrate_mode mode)
 {
+   struct page *newpage = &dst->page;
+   struct page *page = &src->page;
struct zs_pool *pool;
struct size_class *class;
struct zspage *zspage;
@@ -1966,7 +1968,7 @@ static void zs_page_putback(struct page *page)
 
 static const struct address_space_operations zsmalloc_aops = {
.isolate_page = zs_page_isolate,
-   .migratepage = zs_page_migrate,
+   .migrate_folio = zs_migrate_folio,
.putback_page = zs_page_putback,
 };
 
-- 
2.35.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [linux-next:master] BUILD REGRESSION 8cb8311e95e3bb58bd84d6350365f14a718faa6d

2022-05-26 Thread Matthew Wilcox
On Thu, May 26, 2022 at 11:48:32AM +0300, Dan Carpenter wrote:
> On Thu, May 26, 2022 at 02:16:34AM +0100, Matthew Wilcox wrote:
> > Bizarre this started showing up now.  The recent patch was:
> > 
> > -   info->alloced += compound_nr(page);
> > -   inode->i_blocks += BLOCKS_PER_PAGE << compound_order(page);
> > +   info->alloced += folio_nr_pages(folio);
> > +   inode->i_blocks += BLOCKS_PER_PAGE << folio_order(folio);
> > 
> > so it could tell that compound_order() was small, but folio_order()
> > might be large?
> 
> The old code also generates a warning on my test system.  Smatch thinks
> both compound_order() and folio_order() are 0-255.  I guess because of
> the "unsigned char compound_order;" in the struct page.

It'd be nice if we could annotate that as "contains a value between
1 and BITS_PER_LONG - PAGE_SHIFT".  Then be able to optionally enable
a checker that ensures that's true on loads/stores.  Maybe we need a
language that isn't C :-P  Ada can do this ... I don't think Rust can.
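
For what it's worth, the invariant can at least be asserted in C so that
checkers which understand assertions can narrow the range.  A rough
sketch; folio_order_checked() is a hypothetical wrapper, not something in
the tree:

	#include <linux/mm.h>
	#include <linux/mmdebug.h>

	/* Smatch sees the raw unsigned char and assumes 0..255; asserting
	 * the true upper bound documents the real range and, with VM
	 * debugging enabled, enforces it at runtime. */
	static inline unsigned int folio_order_checked(struct folio *folio)
	{
		unsigned int order = folio_order(folio);

		VM_BUG_ON_FOLIO(order > BITS_PER_LONG - PAGE_SHIFT, folio);
		return order;
	}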
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [linux-next:master] BUILD REGRESSION 8cb8311e95e3bb58bd84d6350365f14a718faa6d

2022-05-25 Thread Matthew Wilcox
On Wed, May 25, 2022 at 03:20:06PM -0700, Andrew Morton wrote:
> On Wed, 25 May 2022 23:07:35 +0100 Jessica Clarke  wrote:
> 
> > This is i386, so an unsigned long is 32-bit, but i_blocks is a blkcnt_t
> > i.e. a u64, which makes the shift without a cast of the LHS fishy.
> 
> Ah, of course, thanks.  I remember 32 bits ;)
> 
> --- a/mm/shmem.c~mm-shmemc-suppress-shift-warning
> +++ a/mm/shmem.c
> @@ -1945,7 +1945,7 @@ alloc_nohuge:
>  
>   spin_lock_irq(&info->lock);
>   info->alloced += folio_nr_pages(folio);
> - inode->i_blocks += BLOCKS_PER_PAGE << folio_order(folio);
> + inode->i_blocks += (blkcnt_t)BLOCKS_PER_PAGE << folio_order(folio);

Bizarre this started showing up now.  The recent patch was:

-   info->alloced += compound_nr(page);
-   inode->i_blocks += BLOCKS_PER_PAGE << compound_order(page);
+   info->alloced += folio_nr_pages(folio);
+   inode->i_blocks += BLOCKS_PER_PAGE << folio_order(folio);

so it could tell that compound_order() was small, but folio_order()
might be large?

Silencing the warning is a good thing, but folio_order() can (at the
moment) be at most 9 on i386, so the shifted value is at most
BLOCKS_PER_PAGE << 9 = 8 << 9 = 4096; it isn't actually going to overflow.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] tools/virtio: Test virtual address range detection

2022-03-16 Thread Matthew Wilcox
On Tue, Feb 22, 2022 at 11:18:18PM +, Matthew Wilcox wrote:
> On Tue, Feb 22, 2022 at 07:58:33AM +, David Woodhouse wrote:
> > On Tue, 2022-02-22 at 01:31 -0500, Michael S. Tsirkin wrote:
> > > On Mon, Feb 21, 2022 at 05:18:48PM +, David Woodhouse wrote:
> > > > 
> > > > [dwoodhou@i7 virtio]$ sudo ~/virtio_test
> > > > Detected virtual address range 0x1000-0x7000
> > > > spurious wakeups: 0x0 started=0x10 completed=0x10
> > > > 
> > > > Although in some circumstances I also see a different build failure:
> > > > 
> > > > cc -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. -I../include/ -I 
> > > > ../../usr/include/ -Wno-pointer-sign -fno-strict-overflow 
> > > > -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -include 
> > > > ../../include/linux/kconfig.h   -c -o vringh_test.o vringh_test.c
> 
> Trying to test this myself ...
> 
> $ cd tools/virtio/
> $ make
> ...
> cc -lpthread  virtio_test.o virtio_ring.o   -o virtio_test
> /usr/bin/ld: virtio_ring.o: in function `spin_lock':
> /home/willy/kernel/folio/tools/virtio/./linux/spinlock.h:16: undefined 
> reference to `pthread_spin_lock'
> 
> So this is not the only problem here?

FYI, this fixes it for me.  With `LDFLAGS += -lpthread', make's implicit
link rule puts the library before the object files (exactly as in the
failing command line above), so nothing ever pulls in the pthread
symbols; `-pthread' is the documented flag for building and linking
threaded code and doesn't suffer from the ordering problem:

diff --git a/tools/virtio/Makefile b/tools/virtio/Makefile
index 0d7bbe49359d..83b6a522d0d2 100644
--- a/tools/virtio/Makefile
+++ b/tools/virtio/Makefile
@@ -5,7 +5,7 @@ virtio_test: virtio_ring.o virtio_test.o
 vringh_test: vringh_test.o vringh.o virtio_ring.o

 CFLAGS += -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. -I../include/ -I 
../../usr/include/ -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing 
-fno-common -MMD -U_FORTIFY_SOURCE -include ../../include/linux/kconfig.h
-LDFLAGS += -lpthread
+LDFLAGS += -pthread
 vpath %.c ../../drivers/virtio ../../drivers/vhost
 mod:
${MAKE} -C `pwd`/../.. M=`pwd`/vhost_test V=${V}

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] tools/virtio: Test virtual address range detection

2022-02-25 Thread Matthew Wilcox
On Tue, Feb 22, 2022 at 11:18:18PM +, Matthew Wilcox wrote:
> On Tue, Feb 22, 2022 at 07:58:33AM +, David Woodhouse wrote:
> > On Tue, 2022-02-22 at 01:31 -0500, Michael S. Tsirkin wrote:
> > > On Mon, Feb 21, 2022 at 05:18:48PM +, David Woodhouse wrote:
> > > > 
> > > > [dwoodhou@i7 virtio]$ sudo ~/virtio_test
> > > > Detected virtual address range 0x1000-0x7000
> > > > spurious wakeups: 0x0 started=0x10 completed=0x10
> > > > 
> > > > Although in some circumstances I also see a different build failure:
> > > > 
> > > > cc -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. -I../include/ -I 
> > > > ../../usr/include/ -Wno-pointer-sign -fno-strict-overflow 
> > > > -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -include 
> > > > ../../include/linux/kconfig.h   -c -o vringh_test.o vringh_test.c
> 
> Trying to test this myself ...
> 
> $ cd tools/virtio/
> $ make
> ...
> cc -lpthread  virtio_test.o virtio_ring.o   -o virtio_test
> /usr/bin/ld: virtio_ring.o: in function `spin_lock':
> /home/willy/kernel/folio/tools/virtio/./linux/spinlock.h:16: undefined 
> reference to `pthread_spin_lock'
> 
> So this is not the only problem here?
> 
> > > > In file included from ./linux/uio.h:3,
> > > >  from ./linux/../../../include/linux/vringh.h:15,
> > > >  from ./linux/vringh.h:1,
> > > >  from vringh_test.c:9:
> > > > ./linux/../../../include/linux/uio.h:10:10: fatal error: 
> > > > linux/mm_types.h: No such file or directory
> > > >10 | #include 
> > > >   |  ^~
> > > > compilation terminated.
> > > > make: *** [: vringh_test.o] Error 1
> > > 
> > > Which tree has this build failure? In mine linux/uio.h does not
> > > include linux/mm_types.h.
> > 
> > Strictly it's
> > https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/xen-evtchn-kernel
> > but I'm sure my part isn't relevant; it's just v5.17-rc5.
> > 
> >  $ git blame include/linux/uio.h | grep mm_types.h
> > d9c19d32d86fa (Matthew Wilcox (Oracle) 2021-10-18 10:39:06 -0400  10) 
> > #include 
> >  $ git describe --tags d9c19d32d86fa
> > v5.16-rc4-37-gd9c19d32d86f
> 
> grr.  Originally, I had this doing a typebusting cast, but hch objected,
> so I had to include mm_types.h.  This should fix it ...

ping?  Just noticed this one crop up in a "list of problems".  Should
I submit it myself?

> $ git diff
> diff --git a/tools/virtio/linux/mm_types.h b/tools/virtio/linux/mm_types.h
> new file mode 100644
> index ..3b0fc9bc5b8f
> --- /dev/null
> +++ b/tools/virtio/linux/mm_types.h
> @@ -0,0 +1,3 @@
> +struct folio {
> +   struct page page;
> +};
> 
> At least, it makes it compile for me.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] tools/virtio: Test virtual address range detection

2022-02-22 Thread Matthew Wilcox
On Tue, Feb 22, 2022 at 07:58:33AM +, David Woodhouse wrote:
> On Tue, 2022-02-22 at 01:31 -0500, Michael S. Tsirkin wrote:
> > On Mon, Feb 21, 2022 at 05:18:48PM +, David Woodhouse wrote:
> > > 
> > > [dwoodhou@i7 virtio]$ sudo ~/virtio_test
> > > Detected virtual address range 0x1000-0x7000
> > > spurious wakeups: 0x0 started=0x10 completed=0x10
> > > 
> > > Although in some circumstances I also see a different build failure:
> > > 
> > > cc -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. -I../include/ -I 
> > > ../../usr/include/ -Wno-pointer-sign -fno-strict-overflow 
> > > -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -include 
> > > ../../include/linux/kconfig.h   -c -o vringh_test.o vringh_test.c

Trying to test this myself ...

$ cd tools/virtio/
$ make
...
cc -lpthread  virtio_test.o virtio_ring.o   -o virtio_test
/usr/bin/ld: virtio_ring.o: in function `spin_lock':
/home/willy/kernel/folio/tools/virtio/./linux/spinlock.h:16: undefined 
reference to `pthread_spin_lock'

So this is not the only problem here?

> > > In file included from ./linux/uio.h:3,
> > >  from ./linux/../../../include/linux/vringh.h:15,
> > >  from ./linux/vringh.h:1,
> > >  from vringh_test.c:9:
> > > ./linux/../../../include/linux/uio.h:10:10: fatal error: 
> > > linux/mm_types.h: No such file or directory
> > >10 | #include 
> > >   |  ^~
> > > compilation terminated.
> > > make: *** [: vringh_test.o] Error 1
> > 
> > Which tree has this build failure? In mine linux/uio.h does not
> > include linux/mm_types.h.
> 
> Strictly it's
> https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/xen-evtchn-kernel
> but I'm sure my part isn't relevant; it's just v5.17-rc5.
> 
>  $ git blame include/linux/uio.h | grep mm_types.h
> d9c19d32d86fa (Matthew Wilcox (Oracle) 2021-10-18 10:39:06 -0400  10) 
> #include 
>  $ git describe --tags d9c19d32d86fa
> v5.16-rc4-37-gd9c19d32d86f

grr.  Originally, I had this doing a typebusting cast, but hch objected,
so I had to include mm_types.h.  This should fix it ...

$ git diff
diff --git a/tools/virtio/linux/mm_types.h b/tools/virtio/linux/mm_types.h
new file mode 100644
index ..3b0fc9bc5b8f
--- /dev/null
+++ b/tools/virtio/linux/mm_types.h
@@ -0,0 +1,3 @@
+struct folio {
+   struct page page;
+};

At least, it makes it compile for me.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: futher decouple DAX from block devices

2021-11-04 Thread Matthew Wilcox
On Thu, Nov 04, 2021 at 11:09:19PM -0400, Theodore Ts'o wrote:
> On Thu, Nov 04, 2021 at 12:04:43PM -0700, Darrick J. Wong wrote:
> > > Note that I've avoided implementing read/write fops for dax devices
> > > partly out of concern for not wanting to figure out shared-mmap vs
> > > write coherence issues, but also because of a bet with Dave Hansen
> > > that device-dax not grow features like what happened to hugetlbfs. So
> > > it would seem mkfs would need to switch to mmap I/O, or bite the
> > > bullet and implement read/write fops in the driver.
> > 
> > That ... would require a fair amount of userspace changes, though at
> > least e2fsprogs has pluggable io drivers, which would make mmapping a
> > character device not too awful.
> > 
> > xfsprogs would be another story -- porting the buffer cache might not be
> > too bad, but mkfs and repair seem to issue pread/pwrite calls directly.
> > Note that xfsprogs explicitly screens out chardevs.
> 
> It's not just e2fsprogs and xfsprogs.  There's also udev, blkid,
> potentially systemd unit generators to kick off fsck runs, etc.
> There are probably any number of user scripts which assume that file
> systems are mounted on block devices --- for example, by looking at
> the output of lsblk, etc.
> 
> Also note that block devices have O_EXCL support to provide locking
> against attempts to run mkfs on a mounted file system.  If you move
> dax file systems to be mounted on a character mode device, that would
> have to be replicated as well, etc.  So I suspect that a large number
> of subtle things would break, and I'd strongly recommend against going
> down that path.

Agreed.  There were reasons we decided to present pmem as "block
device with extra functionality" rather than try to cram all the block
layer functionality (eg submitting BIOs for filesystem metadata) into a
character device.  Some of those assumptions might be worth re-examining,
but I haven't seen anything that makes me say "this is obviously better
than what we did at the time".
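
For context on the O_EXCL point: the exclusion in question is the claim
userspace takes on a block device before writing to it.  A minimal
userspace sketch (the device path is made up):

	#include <errno.h>
	#include <fcntl.h>
	#include <stdio.h>

	int main(void)
	{
		/* An O_EXCL open of a block device fails with EBUSY while
		 * the device is mounted or otherwise claimed; this is what
		 * stops mkfs from scribbling on a live filesystem.  A
		 * character device has no equivalent claim mechanism. */
		int fd = open("/dev/sdXN", O_RDWR | O_EXCL);

		if (fd < 0 && errno == EBUSY)
			fprintf(stderr, "device is in use (mounted?)\n");
		return 0;
	}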
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC v1 7/8] mshv: implement in-kernel device framework

2021-07-09 Thread Matthew Wilcox
On Fri, Jul 09, 2021 at 07:14:05PM +, Wei Liu wrote:
> You were not CC'ed on this patch, so presumably you got it via one of
> the mailing lists. I'm not sure why you only got this one patch. Perhaps
> if you wait a bit you will get the rest.

No, I won't.  You only cc'd linux-doc on this one patch and not on any
of the others.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC v1 7/8] mshv: implement in-kernel device framework

2021-07-09 Thread Matthew Wilcox
On Fri, Jul 09, 2021 at 04:27:32PM +, Wei Liu wrote:
> > Then don't define your own structure.  Use theirs.
> 
> I specifically mentioned in the cover letter I didn't do it because I
> was not sure if that would be acceptable. I guess I will find out.

I only got patch 7/8.  You can't blame me for not reading 0/8 if you
didn't send me 0/8.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC v1 7/8] mshv: implement in-kernel device framework

2021-07-09 Thread Matthew Wilcox
On Fri, Jul 09, 2021 at 01:50:13PM +, Wei Liu wrote:
> On Fri, Jul 09, 2021 at 02:02:04PM +0100, Matthew Wilcox wrote:
> > On Fri, Jul 09, 2021 at 11:43:38AM +, Wei Liu wrote:
> > > +static long
> > > +mshv_partition_ioctl_create_device(struct mshv_partition *partition,
> > > + void __user *user_args)
> > > +{
> > [...]
> > > + mshv_partition_get(partition);
> > > + r = anon_inode_getfd(ops->name, &mshv_device_fops, dev, O_RDWR | 
> > > O_CLOEXEC);
> > > + if (r < 0) {
> > > + mshv_partition_put_no_destroy(partition);
> > > + list_del(&dev->partition_node);
> > > + ops->destroy(dev);
> > > + goto out;
> > > + }
> > > +
> > > + cd->fd = r;
> > > + r = 0;
> > 
> > Why return the fd in memory instead of returning the fd as the return
> > value from the ioctl?
> > 
> > > + if (copy_to_user(user_args, &tmp, sizeof(tmp))) {
> > > + r = -EFAULT;
> > > + goto out;
> > > + }
> > 
> > ... this could then disappear.
> 
> Thanks for your comment, Matthew.
> 
> This is intentionally because I didn't want to deviate from KVM's API.
> The fewer differences the better.

Then don't define your own structure.  Use theirs.
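
For reference, the KVM structure in question is declared in
include/uapi/linux/kvm.h along these lines (quoted for context; the
field comments are paraphrased):

	struct kvm_create_device {
		__u32	type;	/* in: KVM_DEV_TYPE_xxx */
		__u32	fd;	/* out: the new device's fd */
		__u32	flags;	/* in: KVM_CREATE_DEVICE_xxx */
	};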
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC v1 7/8] mshv: implement in-kernel device framework

2021-07-09 Thread Matthew Wilcox
On Fri, Jul 09, 2021 at 11:43:38AM +, Wei Liu wrote:
> +static long
> +mshv_partition_ioctl_create_device(struct mshv_partition *partition,
> + void __user *user_args)
> +{
[...]
> + mshv_partition_get(partition);
> + r = anon_inode_getfd(ops->name, &mshv_device_fops, dev, O_RDWR | 
> O_CLOEXEC);
> + if (r < 0) {
> + mshv_partition_put_no_destroy(partition);
> + list_del(&dev->partition_node);
> + ops->destroy(dev);
> + goto out;
> + }
> +
> + cd->fd = r;
> + r = 0;

Why return the fd in memory instead of returning the fd as the return
value from the ioctl?

> + if (copy_to_user(user_args, &tmp, sizeof(tmp))) {
> + r = -EFAULT;
> + goto out;
> + }

... this could then disappear.
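
Concretely, the suggestion is to let the non-negative ioctl return value
itself be the new file descriptor, so the out-parameter and the
copy_to_user() go away.  A sketch against the code quoted above; the
locals and error handling are assumed from the original patch:

	fd = anon_inode_getfd(ops->name, &mshv_device_fops, dev,
			      O_RDWR | O_CLOEXEC);
	if (fd < 0) {
		/* unwind exactly as the original error path does */
		mshv_partition_put_no_destroy(partition);
		list_del(&dev->partition_node);
		ops->destroy(dev);
	}

	return fd;	/* a non-negative return is the new descriptor */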

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: make alloc_anon_inode more useful

2021-03-09 Thread Matthew Wilcox
On Tue, Mar 09, 2021 at 04:53:39PM +0100, Christoph Hellwig wrote:
> this series first renames the existing alloc_anon_inode to
> alloc_anon_inode_sb to clearly mark it as requiring a superblock.
> 
> It then adds a new alloc_anon_inode that works on the anon_inode
> file system super block, thus removing tons of boilerplate code.
> 
> The few remaining callers of alloc_anon_inode_sb all use alloc_file_pseudo
> later, but might also be ripe for some cleanup.

On a somewhat related note, could I get you to look at
drivers/video/fbdev/core/fb_defio.c?

As far as I can tell, there's no need for fb_deferred_io_aops to exist.
We could just set file->f_mapping->a_ops to NULL, and set_page_dirty()
would do the exact same thing this code does (except it would get the
return value correct).

But maybe that would make something else go wrong that distinguishes
between page->mapping being NULL and page->mapping->a_ops->foo being NULL?
Completely untested patch ...

diff --git a/drivers/video/fbdev/core/fb_defio.c 
b/drivers/video/fbdev/core/fb_defio.c
index a591d291b231..441ec31d3e4d 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -151,17 +151,6 @@ static const struct vm_operations_struct 
fb_deferred_io_vm_ops = {
.page_mkwrite   = fb_deferred_io_mkwrite,
 };
 
-static int fb_deferred_io_set_page_dirty(struct page *page)
-{
-   if (!PageDirty(page))
-   SetPageDirty(page);
-   return 0;
-}
-
-static const struct address_space_operations fb_deferred_io_aops = {
-   .set_page_dirty = fb_deferred_io_set_page_dirty,
-};
-
 int fb_deferred_io_mmap(struct fb_info *info, struct vm_area_struct *vma)
 {
vma->vm_ops = &fb_deferred_io_vm_ops;
@@ -212,14 +201,6 @@ void fb_deferred_io_init(struct fb_info *info)
 }
 EXPORT_SYMBOL_GPL(fb_deferred_io_init);
 
-void fb_deferred_io_open(struct fb_info *info,
-struct inode *inode,
-struct file *file)
-{
-   file->f_mapping->a_ops = &fb_deferred_io_aops;
-}
-EXPORT_SYMBOL_GPL(fb_deferred_io_open);
-
 void fb_deferred_io_cleanup(struct fb_info *info)
 {
struct fb_deferred_io *fbdefio = info->fbdefio;
diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
index 06f5805de2de..c4ba76359f22 100644
--- a/drivers/video/fbdev/core/fbmem.c
+++ b/drivers/video/fbdev/core/fbmem.c
@@ -1415,10 +1415,7 @@ __releases(&info->lock)
if (res)
module_put(info->fbops->owner);
}
-#ifdef CONFIG_FB_DEFERRED_IO
-   if (info->fbdefio)
-   fb_deferred_io_open(info, inode, file);
-#endif
+   file->f_mapping->a_ops = NULL;
 out:
unlock_fb_info(info);
if (res)
diff --git a/include/linux/fb.h b/include/linux/fb.h
index ecfbcc0553a5..a8dccd23c249 100644
--- a/include/linux/fb.h
+++ b/include/linux/fb.h
@@ -659,9 +659,6 @@ static inline void __fb_pad_aligned_buffer(u8 *dst, u32 
d_pitch,
 /* drivers/video/fb_defio.c */
 int fb_deferred_io_mmap(struct fb_info *info, struct vm_area_struct *vma);
 extern void fb_deferred_io_init(struct fb_info *info);
-extern void fb_deferred_io_open(struct fb_info *info,
-   struct inode *inode,
-   struct file *file);
 extern void fb_deferred_io_cleanup(struct fb_info *info);
 extern int fb_deferred_io_fsync(struct file *file, loff_t start,
loff_t end, int datasync);

Re: [RFC v2 PATCH 4/4] mm: pre zero out free pages to speed up page allocation for __GFP_ZERO

2021-01-04 Thread Matthew Wilcox
On Mon, Jan 04, 2021 at 11:19:13AM -0800, Dave Hansen wrote:
> On 12/21/20 8:30 AM, Liang Li wrote:
> > --- a/include/linux/page-flags.h
> > +++ b/include/linux/page-flags.h
> > @@ -137,6 +137,9 @@ enum pageflags {
> >  #endif
> >  #ifdef CONFIG_64BIT
> > PG_arch_2,
> > +#endif
> > +#ifdef CONFIG_PREZERO_PAGE
> > +   PG_zero,
> >  #endif
> > __NR_PAGEFLAGS,
> 
> I don't think this is worth a generic page->flags bit.
> 
> There's a ton of space in 'struct page' for pages that are in the
> allocator.  Can't we use some of that space?

I was going to object to that too, but I think the entire approach is
flawed and needs to be thrown out.  It just nukes the caches in extremely
subtle and hard to measure ways, lowering overall system performance.

Re: [RFC v2 PATCH 0/4] speed up page allocation for __GFP_ZERO

2020-12-22 Thread Matthew Wilcox
On Mon, Dec 21, 2020 at 11:25:22AM -0500, Liang Li wrote:
> Creating a VM [64G RAM, 32 CPUs] with GPU passthrough
> =
> QEMU use 4K pages, THP is off
>   round1  round2  round3
> w/o this patch:23.5s   24.7s   24.6s
> w/ this patch: 10.2s   10.3s   11.2s
> 
> QEMU use 4K pages, THP is on
>   round1  round2  round3
> w/o this patch:17.9s   14.8s   14.9s
> w/ this patch: 1.9s1.8s1.9s
> =

The cost of zeroing pages has to be paid somewhere.  You've successfully
moved it out of this path that you can measure.  So now you've put it
somewhere that you're not measuring.  Why is this a win?

> Speed up kernel routine
> ===
> This can’t be guaranteed because we don’t pre zero out all the free pages,
> but is true for most case. It can help to speed up some important system
> call just like fork, which will allocate zero pages for building page
> table. And speed up the process of page fault, especially for huge page
> fault. The POC of Hugetlb free page pre zero out has been done.

Try kernbench with and without your patch.

Re: [PATCH v2 1/8] mm: slab: provide krealloc_array()

2020-11-02 Thread Matthew Wilcox
On Mon, Nov 02, 2020 at 04:20:30PM +0100, Bartosz Golaszewski wrote:
> +Chunks allocated with `kmalloc` can be resized with `krealloc`. Similarly
> +to `kmalloc_array`: a helper for resising arrays is provided in the form of
> +`krealloc_array`.

Is there any reason you chose to `do_this` instead of do_this()?  The
automarkup script turns do_this() into a nice link to the documentation
which you're adding below.

Typo: 'resising' should be 'resizing'.

Re: [RFC 1/4] mm: export zap_page_range() for driver use

2020-10-19 Thread Matthew Wilcox
On Mon, Oct 19, 2020 at 10:56:20PM +0800, Xie Yongji wrote:
> Export zap_page_range() for use in VDUSE.

I think you're missing a lot of MMU notifier work by calling this
directly.  It probably works in every scenario you've tested, but won't
work for others.  I see you're using VM_MIXEDMAP -- would it make sense
to use VM_PFNMAP instead and use zap_vma_ptes()?  Or would it make sense
to change zap_vma_ptes() to handle VM_MIXEDMAP as well as VM_PFNMAP?
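
For reference, zap_vma_ptes() as it stands is roughly this (mm/memory.c),
which is why it only accepts VM_PFNMAP mappings today:

	void zap_vma_ptes(struct vm_area_struct *vma, unsigned long address,
			  unsigned long size)
	{
		if (address < vma->vm_start || address + size > vma->vm_end ||
		    !(vma->vm_flags & VM_PFNMAP))
			return;

		zap_page_range_single(vma, address, size, NULL);
	}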


Re: [Ocfs2-devel] [RFC] treewide: cleanup unreachable breaks

2020-10-18 Thread Matthew Wilcox
On Sun, Oct 18, 2020 at 12:13:35PM -0700, James Bottomley wrote:
> On Sun, 2020-10-18 at 19:59 +0100, Matthew Wilcox wrote:
> > On Sat, Oct 17, 2020 at 09:09:28AM -0700, t...@redhat.com wrote:
> > > clang has a number of useful, new warnings see
> > > https://urldefense.com/v3/__https://clang.llvm.org/docs/DiagnosticsReference.html__;!!GqivPVa7Brio!Krxz78O3RKcB9JBMVo_F98FupVhj_jxX60ddN6tKGEbv_cnooXc1nnBmchm-e_O9ieGnyQ$
> > >  
> > 
> > Please get your IT department to remove that stupidity.  If you
> > can't, please send email from a non-Red Hat email address.
> 
> Actually, the problem is at Oracle's end somewhere in the ocfs2 list
> ... if you could fix it, that would be great.  The usual real mailing
> lists didn't get this transformation
> 
> https://lore.kernel.org/bpf/20201017160928.12698-1-t...@redhat.com/
> 
> but the ocfs2 list archive did:
> 
> https://oss.oracle.com/pipermail/ocfs2-devel/2020-October/015330.html
> 
> I bet Oracle IT has put some spam filter on the list that mangles URLs
> this way.

*sigh*.  I'm sure there's a way.  I've raised it with someone who should
be able to fix it.

Re: [Ocfs2-devel] [RFC] treewide: cleanup unreachable breaks

2020-10-18 Thread Matthew Wilcox
On Sat, Oct 17, 2020 at 09:09:28AM -0700, t...@redhat.com wrote:
> clang has a number of useful, new warnings see
> https://urldefense.com/v3/__https://clang.llvm.org/docs/DiagnosticsReference.html__;!!GqivPVa7Brio!Krxz78O3RKcB9JBMVo_F98FupVhj_jxX60ddN6tKGEbv_cnooXc1nnBmchm-e_O9ieGnyQ$
>  

Please get your IT department to remove that stupidity.  If you can't,
please send email from a non-Red Hat email address.

I don't understand why this is a useful warning to fix.  What actual
problem is caused by the code below?

> return and break
> 
>   switch (c->x86_vendor) {
>   case X86_VENDOR_INTEL:
>   intel_p5_mcheck_init(c);
>   return 1;
> - break;

Sure, it's unnecessary, but it's not masking a bug.  It's not unclear.
Why do we want to enable this warning?


Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-17 Thread Matthew Wilcox
On Wed, Jun 17, 2020 at 01:31:57PM +0200, Michal Hocko wrote:
> On Wed 17-06-20 04:08:20, Matthew Wilcox wrote:
> > If you call vfree() under
> > a spinlock, you're in trouble.  in_atomic() only knows if we hold a
> > spinlock for CONFIG_PREEMPT, so it's not safe to check for in_atomic()
> > in __vfree().  So we need the warning in order that preempt people can
> > tell those without that there is a bug here.
> 
> ... Unless I am missing something in_interrupt depends on preempt_count() as
> well so neither of the two is reliable without PREEMPT_COUNT configured.

preempt_count() always tracks whether we're in interrupt context,
regardless of CONFIG_PREEMPT.  The difference is that CONFIG_PREEMPT
will track spinlock acquisitions as well.
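
Roughly, from the preempt headers:

	#define in_interrupt()	(irq_count())	/* HARDIRQ/SOFTIRQ/NMI bits
						 * of preempt_count() */
	#define in_atomic()	(preempt_count() != 0)

	/* Without CONFIG_PREEMPT_COUNT, preempt_disable() -- and hence
	 * spin_lock() -- never bumps preempt_count(), so only the IRQ
	 * bits above are tracked unconditionally. */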

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-17 Thread Matthew Wilcox
On Wed, Jun 17, 2020 at 09:12:12AM +0200, Michal Hocko wrote:
> On Tue 16-06-20 17:37:11, Matthew Wilcox wrote:
> > Not just performance critical, but correctness critical.  Since kvfree()
> > may allocate from the vmalloc allocator, I really think that kvfree()
> > should assert that it's !in_atomic().  Otherwise we can get into trouble
> > if we end up calling vfree() and have to take the mutex.
> 
> FWIW __vfree already checks for atomic context and put the work into a
> deferred context. So this should be safe. It should be used as a last
> resort, though.

Actually, it only checks for in_interrupt().  If you call vfree() under
a spinlock, you're in trouble.  in_atomic() only knows if we hold a
spinlock for CONFIG_PREEMPT, so it's not safe to check for in_atomic()
in __vfree().  So we need the warning in order that preempt people can
tell those without that there is a bug here.

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-16 Thread Matthew Wilcox
On Wed, Jun 17, 2020 at 01:01:30AM +0200, David Sterba wrote:
> On Tue, Jun 16, 2020 at 11:53:50AM -0700, Joe Perches wrote:
> > On Mon, 2020-06-15 at 21:57 -0400, Waiman Long wrote:
> > >  v4:
> > >   - Break out the memzero_explicit() change as suggested by Dan Carpenter
> > > so that it can be backported to stable.
> > >   - Drop the "crypto: Remove unnecessary memzero_explicit()" patch for
> > > now as there can be a bit more discussion on what is best. It will be
> > > introduced as a separate patch later on after this one is merged.
> > 
> > To this larger audience and last week without reply:
> > https://lore.kernel.org/lkml/573b3fbd5927c643920e1364230c296b23e7584d.ca...@perches.com/
> > 
> > Are there _any_ fastpath uses of kfree or vfree?
> 
> I'd consider kfree performance critical for cases where it is called
> under locks. If possible the kfree is moved outside of the critical
> section, but we have rbtrees or lists that get deleted under locks and
> restructuring the code to do eg. splice and free it outside of the lock
> is not always possible.

Not just performance critical, but correctness critical.  Since kvfree()
may allocate from the vmalloc allocator, I really think that kvfree()
should assert that it's !in_atomic().  Otherwise we can get into trouble
if we end up calling vfree() and have to take the mutex.
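
For reference, kvfree() itself is just an address-based dispatch
(mm/util.c), which is how a vfree() -- and its potential sleep -- can
sneak in:

	void kvfree(const void *addr)
	{
		if (is_vmalloc_addr(addr))
			vfree(addr);
		else
			kfree(addr);
	}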

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-16 Thread Matthew Wilcox
On Tue, Jun 16, 2020 at 11:53:50AM -0700, Joe Perches wrote:
> To this larger audience and last week without reply:
> https://lore.kernel.org/lkml/573b3fbd5927c643920e1364230c296b23e7584d.ca...@perches.com/
> 
> Are there _any_ fastpath uses of kfree or vfree?

I worked on adding a 'free' a couple of years ago.  That was capable
of freeing percpu, vmalloc, kmalloc and alloc_pages memory.  I ran into
trouble when I tried to free kmem_cache_alloc memory -- it works for slab
and slub, but not slob (because slob needs the size from the kmem_cache).

My motivation for this was to change kfree_rcu() to just free_rcu().

> To eliminate these mispairings at a runtime cost of four
> comparisons, should the kfree/vfree/kvfree/kfree_const
> functions be consolidated into a single kfree?

I would say to leave kfree() alone and just introduce free() as a new
default.  There's some weird places in the kernel that have a 'free'
symbol of their own, but those should be renamed anyway.
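
A hypothetical sketch of such a dispatching free() -- not the actual
patch, and the percpu case is omitted (it would need an
is_kernel_percpu_address() check up front):

	void free(const void *addr)
	{
		struct page *page;

		if (!addr)
			return;
		if (is_vmalloc_addr(addr)) {
			vfree(addr);
			return;
		}
		page = virt_to_head_page(addr);
		if (PageSlab(page))
			kfree(addr);	/* kmalloc; also kmem_cache_alloc on
					 * slab/slub, but not slob */
		else
			__free_pages(page, compound_order(page));
	}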

Re: [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a "case 5"

2020-06-12 Thread Matthew Wilcox
On Fri, May 29, 2020 at 04:43:08PM -0700, John Hubbard wrote:
> +CASE 5: Pinning in order to write to the data within the page
> +-
> +Even though neither DMA nor Direct IO is involved, just a simple case of 
> "pin,
> +access page's data, unpin" can cause a problem. Case 5 may be considered a
> +superset of Case 1, plus Case 2, plus anything that invokes that pattern. In
> +other words, if the code is neither Case 1 nor Case 2, it may still require
> +FOLL_PIN, for patterns like this:
> +
> +Correct (uses FOLL_PIN calls):
> +pin_user_pages()
> +access the data within the pages
> +set_page_dirty_lock()
> +unpin_user_pages()
> +
> +INCORRECT (uses FOLL_GET calls):
> +get_user_pages()
> +access the data within the pages
> +set_page_dirty_lock()
> +put_page()

Why does this case need to pin?  Why can't it just do ...

get_user_pages()
lock_page(page);
... modify the data ...
set_page_dirty(page);
unlock_page(page);


Re: improve use_mm / unuse_mm v2

2020-04-16 Thread Matthew Wilcox
On Thu, Apr 16, 2020 at 07:31:55AM +0200, Christoph Hellwig wrote:
> this series improves the use_mm / unuse_mm interface by better
> documenting the assumptions, and my taking the set_fs manipulations
> spread over the callers into the core API.

I appreciate all the work you're doing here.

Do you have plans to introduce a better-named API than set_fs() / get_fs()?

Also, having set_fs() return the previous value of 'fs' would simplify
a lot of the callers.
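
That is, a sketch of the simplification being asked for:

	/* today */
	mm_segment_t old_fs = get_fs();
	set_fs(KERNEL_DS);
	/* ... access kernel buffers through the user-copy routines ... */
	set_fs(old_fs);

	/* with a value-returning set_fs() */
	mm_segment_t old_fs = set_fs(KERNEL_DS);
	/* ... */
	set_fs(old_fs);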

Re: DANGER WILL ROBINSON, DANGER

2019-08-13 Thread Matthew Wilcox
On Tue, Aug 13, 2019 at 11:29:07AM +0200, Paolo Bonzini wrote:
> On 09/08/19 18:24, Matthew Wilcox wrote:
> > On Fri, Aug 09, 2019 at 07:00:26PM +0300, Adalbert Lazăr wrote:
> >> +++ b/include/linux/page-flags.h
> >> @@ -417,8 +417,10 @@ PAGEFLAG(Idle, idle, PF_ANY)
> >>   */
> >>  #define PAGE_MAPPING_ANON 0x1
> >>  #define PAGE_MAPPING_MOVABLE  0x2
> >> +#define PAGE_MAPPING_REMOTE   0x4
> > Uh.  How do you know page->mapping would otherwise have bit 2 clear?
> > Who's guaranteeing that?
> > 
> > This is an awfully big patch to the memory management code, buried in
> > the middle of a gigantic series which almost guarantees nobody would
> > look at it.  I call shenanigans.
> 
> Are you calling shenanigans on the patch submitter (which is gratuitous)
> or on the KVM maintainers/reviewers?

On the patch submitter, of course.  How can I possibly be criticising you
for something you didn't do?


DANGER WILL ROBINSON, DANGER

2019-08-09 Thread Matthew Wilcox
On Fri, Aug 09, 2019 at 07:00:26PM +0300, Adalbert Lazăr wrote:
> +++ b/include/linux/page-flags.h
> @@ -417,8 +417,10 @@ PAGEFLAG(Idle, idle, PF_ANY)
>   */
>  #define PAGE_MAPPING_ANON0x1
>  #define PAGE_MAPPING_MOVABLE 0x2
> +#define PAGE_MAPPING_REMOTE  0x4

Uh.  How do you know page->mapping would otherwise have bit 2 clear?
Who's guaranteeing that?

This is an awfully big patch to the memory management code, buried in
the middle of a gigantic series which almost guarantees nobody would
look at it.  I call shenanigans.

> @@ -1021,7 +1022,7 @@ void page_move_anon_rmap(struct page *page, struct 
> vm_area_struct *vma)
>   * __page_set_anon_rmap - set up new anonymous rmap
>   * @page:Page or Hugepage to add to rmap
>   * @vma: VM area to add page to.
> - * @address: User virtual address of the mapping 
> + * @address: User virtual address of the mapping

And mixing in fluff changes like this is a real no-no.  Try again.


Re: [PATCH v3 0/5] kvm "virtio pmem" device

2019-01-16 Thread Matthew Wilcox
On Mon, Jan 14, 2019 at 10:29:02AM +1100, Dave Chinner wrote:
> Until you have images (and hence host page cache) shared between
> multiple guests. People will want to do this, because it means they
> only need a single set of pages in host memory for executable
> binaries rather than a set of pages per guest. Then you have
> multiple guests being able to detect residency of the same set of
> pages. If the guests can then, in any way, control eviction of the
> pages from the host cache, then we have a guest-to-guest information
> leak channel.

I don't think we should ever be considering something that would allow a
guest to evict page's from the host's pagecache [1].  The guest should
be able to kick its own references to the host's pagecache out of its
own pagecache, but not be able to influence whether the host or another
guest has a read-only mapping cached.

[1] Unless the guest is allowed to modify the host's file; obviously
truncation, holepunching, etc are going to evict pages from the host's
page cache.

Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency

2019-01-02 Thread Matthew Wilcox
On Wed, Jan 02, 2019 at 03:57:58PM -0500, Michael S. Tsirkin wrote:
> @@ -875,6 +893,8 @@ to the CPU containing it.  See the section on "Multicopy 
> atomicity"
>  for more information.
>  
>  
> +
> +
>  In summary:
>  
>(*) Control dependencies can order prior loads against later stores.

Was this hunk intentional?

[PATCH v2 2/2] drm/virtio: Use IDAs more efficiently

2018-10-31 Thread Matthew Wilcox
0-based IDAs are more efficient than any other base.  Convert the
1-based IDAs to be 0-based.

Signed-off-by: Matthew Wilcox 
---
 drivers/gpu/drm/virtio/virtgpu_kms.c| 5 +++--
 drivers/gpu/drm/virtio/virtgpu_object.c | 6 +++---
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
b/drivers/gpu/drm/virtio/virtgpu_kms.c
index bf609dcae224..8118f10fde4a 100644
--- a/drivers/gpu/drm/virtio/virtgpu_kms.c
+++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
@@ -55,10 +55,11 @@ static void virtio_gpu_config_changed_work_func(struct 
work_struct *work)
 static int virtio_gpu_context_create(struct virtio_gpu_device *vgdev,
  uint32_t nlen, const char *name)
 {
-   int handle = ida_alloc_min(&vgdev->ctx_id_ida, 1, GFP_KERNEL);
+   int handle = ida_alloc(&vgdev->ctx_id_ida, GFP_KERNEL);
 
if (handle < 0)
return handle;
+   handle += 1;
virtio_gpu_cmd_context_create(vgdev, handle, nlen, name);
return handle;
 }
@@ -67,7 +68,7 @@ static void virtio_gpu_context_destroy(struct 
virtio_gpu_device *vgdev,
  uint32_t ctx_id)
 {
virtio_gpu_cmd_context_destroy(vgdev, ctx_id);
-   ida_free(&vgdev->ctx_id_ida, ctx_id);
+   ida_free(&vgdev->ctx_id_ida, ctx_id - 1);
 }
 
 static void virtio_gpu_init_vq(struct virtio_gpu_queue *vgvq,
diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c 
b/drivers/gpu/drm/virtio/virtgpu_object.c
index 5ac42dded217..f39a183d59c2 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -28,18 +28,18 @@
 static int virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev,
   uint32_t *resid)
 {
-   int handle = ida_alloc_min(&vgdev->resource_ida, 1, GFP_KERNEL);
+   int handle = ida_alloc(&vgdev->resource_ida, GFP_KERNEL);
 
if (handle < 0)
return handle;
 
-   *resid = handle;
+   *resid = handle + 1;
return 0;
 }
 
 static void virtio_gpu_resource_id_put(struct virtio_gpu_device *vgdev, 
uint32_t id)
 {
-   ida_free(&vgdev->resource_ida, id);
+   ida_free(&vgdev->resource_ida, id - 1);
 }
 
 static void virtio_gpu_ttm_bo_destroy(struct ttm_buffer_object *tbo)
-- 
2.19.1


[PATCH v2 1/2] drm/virtio: Handle error from virtio_gpu_resource_id_get

2018-10-31 Thread Matthew Wilcox
ida_alloc() can return -ENOMEM in the highly unlikely case we run out
of memory.  The current code creates an object with an invalid ID.

Signed-off-by: Matthew Wilcox 
---
 drivers/gpu/drm/virtio/virtgpu_object.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c 
b/drivers/gpu/drm/virtio/virtgpu_object.c
index 77eac4eb06b1..5ac42dded217 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -25,11 +25,16 @@
 
 #include "virtgpu_drv.h"
 
-static void virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev,
+static int virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev,
   uint32_t *resid)
 {
	int handle = ida_alloc_min(&vgdev->resource_ida, 1, GFP_KERNEL);
+
+   if (handle < 0)
+   return handle;
+
*resid = handle;
+   return 0;
 }
 
 static void virtio_gpu_resource_id_put(struct virtio_gpu_device *vgdev, 
uint32_t id)
@@ -94,7 +99,11 @@ int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,
bo = kzalloc(sizeof(struct virtio_gpu_object), GFP_KERNEL);
if (bo == NULL)
return -ENOMEM;
-   virtio_gpu_resource_id_get(vgdev, &bo->hw_res_handle);
+   ret = virtio_gpu_resource_id_get(vgdev, &bo->hw_res_handle);
+   if (ret < 0) {
+   kfree(bo);
+   return ret;
+   }
size = roundup(size, PAGE_SIZE);
	ret = drm_gem_object_init(vgdev->ddev, &bo->gem_base, size);
if (ret != 0) {
-- 
2.19.1


Re: [PATCH 0/4] Improve virtio ID allocation

2018-10-30 Thread Matthew Wilcox
On Mon, Oct 29, 2018 at 10:53:39PM +0100, Gerd Hoffmann wrote:
> On Wed, Sep 26, 2018 at 09:00:27AM -0700, Matthew Wilcox wrote:
> > I noticed you were using IDRs where you could be using the more efficient
> > IDAs, then while fixing that I noticed the lack of error handling,
> > and I decided to follow that up with an efficiency improvement.
> > 
> > There's probably a v2 of this to follow because I couldn't figure
> > out how to properly handle one of the error cases ... see the comment
> > embedded in one of the patches.
> 
> #1 + #2 pushed to drm-misc-next now.
> #3 should not be needed any more.
> waiting for v2 of #4 ...

Thanks!  I think we do still need a small part of #3.  Patches in
replies to this email.

Re: [PATCH 4/4] drm/virtio: Use IDAs more efficiently

2018-10-02 Thread Matthew Wilcox
On Tue, Oct 02, 2018 at 01:43:28PM +0200, Gerd Hoffmann wrote:
> On Wed, Sep 26, 2018 at 09:04:55AM -0700, Matthew Wilcox wrote:
> > On Wed, Sep 26, 2018 at 09:00:31AM -0700, Matthew Wilcox wrote:
> > > @@ -59,6 +59,7 @@ static int virtio_gpu_context_create(struct 
> > > virtio_gpu_device *vgdev,
> > >  
> > >   if (handle < 0)
> > >   return handle;
> > > + handle++;
> > >   virtio_gpu_cmd_context_create(vgdev, handle, nlen, name);
> > >   return handle;
> > >  }
> > 
> > Uh.  This line is missing.
> > 
> > -   int handle = ida_alloc_min(&vgdev->ctx_id_ida, 1, GFP_KERNEL);
> > +   int handle = ida_alloc(&vgdev->ctx_id_ida, GFP_KERNEL);
> > 
> > It'll be there in v2 ;-)
> 
> I've touched the resource/object id handling too, see my "drm/virtio:
> rework ttm resource handling" patch series
> (https://patchwork.freedesktop.org/series/50382/).  Which still needs a
> review btw.

Um, according to patchwork, you only posted it yesterday.  Does DRM
normally expect a review within 24 hours?

> I think that series obsoletes patch 3/4 (object id fixes) of your
> series.  The other patches should rebase without too much trouble, you
> could do that as well when preparing v2 ...

It seems a little odd to me to expect a drive-by contributor (i.e. me) to
rebase their patches on top of a patch series which wasn't even posted
at the time they contributed their original patch.  If it was already
in -next, that'd be a reasonable request.

Re: [PATCH 4/4] drm/virtio: Use IDAs more efficiently

2018-09-26 Thread Matthew Wilcox
On Wed, Sep 26, 2018 at 09:00:31AM -0700, Matthew Wilcox wrote:
> @@ -59,6 +59,7 @@ static int virtio_gpu_context_create(struct 
> virtio_gpu_device *vgdev,
>  
>   if (handle < 0)
>   return handle;
> + handle++;
>   virtio_gpu_cmd_context_create(vgdev, handle, nlen, name);
>   return handle;
>  }

Uh.  This line is missing.

-   int handle = ida_alloc_min(&vgdev->ctx_id_ida, 1, GFP_KERNEL);
+   int handle = ida_alloc(&vgdev->ctx_id_ida, GFP_KERNEL);

It'll be there in v2 ;-)

[PATCH 4/4] drm/virtio: Use IDAs more efficiently

2018-09-26 Thread Matthew Wilcox
0-based IDAs are more efficient than any other base.  Convert the
1-based IDAs to be 0-based.

Signed-off-by: Matthew Wilcox 
---
 drivers/gpu/drm/virtio/virtgpu_kms.c | 3 ++-
 drivers/gpu/drm/virtio/virtgpu_vq.c  | 7 +--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
b/drivers/gpu/drm/virtio/virtgpu_kms.c
index bf609dcae224..b576c9ef6323 100644
--- a/drivers/gpu/drm/virtio/virtgpu_kms.c
+++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
@@ -59,6 +59,7 @@ static int virtio_gpu_context_create(struct virtio_gpu_device 
*vgdev,
 
if (handle < 0)
return handle;
+   handle++;
virtio_gpu_cmd_context_create(vgdev, handle, nlen, name);
return handle;
 }
@@ -67,7 +68,7 @@ static void virtio_gpu_context_destroy(struct 
virtio_gpu_device *vgdev,
  uint32_t ctx_id)
 {
virtio_gpu_cmd_context_destroy(vgdev, ctx_id);
-   ida_free(&vgdev->ctx_id_ida, ctx_id);
+   ida_free(&vgdev->ctx_id_ida, ctx_id - 1);
 }
 
 static void virtio_gpu_init_vq(struct virtio_gpu_queue *vgvq,
diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index 387951c971d4..81297fe0147d 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -40,12 +40,15 @@
 
 int virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev)
 {
-   return ida_alloc_min(&vgdev->resource_ida, 1, GFP_KERNEL);
+   int handle = ida_alloc(&vgdev->resource_ida, GFP_KERNEL);
+   if (handle < 0)
+   return handle;
+   return handle + 1;
 }
 
 void virtio_gpu_resource_id_put(struct virtio_gpu_device *vgdev, uint32_t id)
 {
-   ida_free(&vgdev->resource_ida, id);
+   ida_free(&vgdev->resource_ida, id - 1);
 }
 
 void virtio_gpu_ctrl_ack(struct virtqueue *vq)
-- 
2.19.0


[PATCH 3/4] drm/virtio: Handle object ID allocation errors

2018-09-26 Thread Matthew Wilcox
It is possible to run out of memory while allocating IDs.  The current
code would create an object with an invalid ID; change it to return
-ENOMEM to the caller.

Signed-off-by: Matthew Wilcox 
---
 drivers/gpu/drm/virtio/virtgpu_drv.h   |  3 +--
 drivers/gpu/drm/virtio/virtgpu_fb.c| 10 --
 drivers/gpu/drm/virtio/virtgpu_gem.c   | 10 --
 drivers/gpu/drm/virtio/virtgpu_ioctl.c |  5 -
 drivers/gpu/drm/virtio/virtgpu_vq.c|  6 ++
 5 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index c4468a4e454e..0a3392f2cda3 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -247,8 +247,7 @@ int virtio_gpu_surface_dirty(struct virtio_gpu_framebuffer 
*qfb,
 /* virtio vg */
 int virtio_gpu_alloc_vbufs(struct virtio_gpu_device *vgdev);
 void virtio_gpu_free_vbufs(struct virtio_gpu_device *vgdev);
-void virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev,
-  uint32_t *resid);
+int virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev);
 void virtio_gpu_resource_id_put(struct virtio_gpu_device *vgdev, uint32_t id);
 void virtio_gpu_cmd_create_resource(struct virtio_gpu_device *vgdev,
uint32_t resource_id,
diff --git a/drivers/gpu/drm/virtio/virtgpu_fb.c 
b/drivers/gpu/drm/virtio/virtgpu_fb.c
index a121b1c79522..74d815483487 100644
--- a/drivers/gpu/drm/virtio/virtgpu_fb.c
+++ b/drivers/gpu/drm/virtio/virtgpu_fb.c
@@ -244,14 +244,17 @@ static int virtio_gpufb_create(struct drm_fb_helper 
*helper,
if (IS_ERR(obj))
return PTR_ERR(obj);
 
-   virtio_gpu_resource_id_get(vgdev, &resid);
+   ret = virtio_gpu_resource_id_get(vgdev);
+   if (ret < 0)
+   goto err_obj_vmap;
+   resid = ret;
virtio_gpu_cmd_create_resource(vgdev, resid, format,
   mode_cmd.width, mode_cmd.height);
 
ret = virtio_gpu_vmap_fb(vgdev, obj);
if (ret) {
DRM_ERROR("failed to vmap fb %d\n", ret);
-   goto err_obj_vmap;
+   goto err_obj_id;
}
 
/* attach the object to the resource */
@@ -293,8 +296,11 @@ static int virtio_gpufb_create(struct drm_fb_helper 
*helper,
 err_fb_alloc:
virtio_gpu_cmd_resource_inval_backing(vgdev, resid);
 err_obj_attach:
+err_obj_id:
+   virtio_gpu_resource_id_put(vgdev, resid);
 err_obj_vmap:
	virtio_gpu_gem_free_object(&obj->gem_base);
+
return ret;
 }
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c 
b/drivers/gpu/drm/virtio/virtgpu_gem.c
index 0f2768eacaee..9e3af1ec26db 100644
--- a/drivers/gpu/drm/virtio/virtgpu_gem.c
+++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
@@ -100,7 +100,10 @@ int virtio_gpu_mode_dumb_create(struct drm_file *file_priv,
goto fail;
 
format = virtio_gpu_translate_format(DRM_FORMAT_XRGB);
-   virtio_gpu_resource_id_get(vgdev, &resid);
+   ret = virtio_gpu_resource_id_get(vgdev);
+   if (ret < 0)
+   goto fail;
+   resid = ret;
virtio_gpu_cmd_create_resource(vgdev, resid, format,
   args->width, args->height);
 
@@ -108,13 +111,16 @@ int virtio_gpu_mode_dumb_create(struct drm_file 
*file_priv,
obj = gem_to_virtio_gpu_obj(gobj);
ret = virtio_gpu_object_attach(vgdev, obj, resid, NULL);
if (ret)
-   goto fail;
+   goto fail_id;
 
obj->dumb = true;
args->pitch = pitch;
return ret;
 
+fail_id:
+   virtio_gpu_resource_id_put(vgdev, resid);
 fail:
+   /* Shouldn't we undo virtio_gpu_gem_create()? */
return ret;
 }
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c 
b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index 7bdf6f0e58a5..eec9f09f01f0 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -244,7 +244,10 @@ static int virtio_gpu_resource_create_ioctl(struct 
drm_device *dev, void *data,
	INIT_LIST_HEAD(&validate_list);
	memset(&mainbuf, 0, sizeof(struct ttm_validate_buffer));
 
-   virtio_gpu_resource_id_get(vgdev, &res_id);
+   ret = virtio_gpu_resource_id_get(vgdev);
+   if (ret < 0)
+   return ret;
+   res_id = ret;
 
size = rc->size;
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index 58be09d2eed6..387951c971d4 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -38,11 +38,9 @@
   + MAX_INLINE_CMD_SIZE \
   + MAX_INLINE_RESP_SIZE)
 
-void virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev,
-   uint32_t *resid)
+int virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev)
 {
-   int handle = ida_a

[PATCH 0/4] Improve virtio ID allocation

2018-09-26 Thread Matthew Wilcox
I noticed you were using IDRs where you could be using the more efficient
IDAs, then while fixing that I noticed the lack of error handling,
and I decided to follow that up with an efficiency improvement.

There's probably a v2 of this to follow because I couldn't figure
out how to properly handle one of the error cases ... see the comment
embedded in one of the patches.

Matthew Wilcox (4):
  drm/virtio: Replace IDRs with IDAs
  drm/virtio: Handle context ID allocation errors
  drm/virtio: Handle object ID allocation errors
  drm/virtio: Use IDAs more efficiently

 drivers/gpu/drm/virtio/virtgpu_drv.h   |  9 ++---
 drivers/gpu/drm/virtio/virtgpu_fb.c| 10 --
 drivers/gpu/drm/virtio/virtgpu_gem.c   | 10 --
 drivers/gpu/drm/virtio/virtgpu_ioctl.c |  5 ++-
 drivers/gpu/drm/virtio/virtgpu_kms.c   | 46 +-
 drivers/gpu/drm/virtio/virtgpu_vq.c| 19 ---
 6 files changed, 44 insertions(+), 55 deletions(-)

-- 
2.19.0


[PATCH 2/4] drm/virtio: Handle context ID allocation errors

2018-09-26 Thread Matthew Wilcox
It is possible to run out of memory while allocating IDs.  The current
code would create a context with an invalid ID; change it to return
-ENOMEM to userspace.

Signed-off-by: Matthew Wilcox 
---
 drivers/gpu/drm/virtio/virtgpu_kms.c | 29 +++-
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
b/drivers/gpu/drm/virtio/virtgpu_kms.c
index e2604fe1b4ae..bf609dcae224 100644
--- a/drivers/gpu/drm/virtio/virtgpu_kms.c
+++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
@@ -52,31 +52,22 @@ static void virtio_gpu_config_changed_work_func(struct 
work_struct *work)
  events_clear, &events_clear);
 }
 
-static void virtio_gpu_ctx_id_get(struct virtio_gpu_device *vgdev,
- uint32_t *resid)
+static int virtio_gpu_context_create(struct virtio_gpu_device *vgdev,
+ uint32_t nlen, const char *name)
 {
	int handle = ida_alloc_min(&vgdev->ctx_id_ida, 1, GFP_KERNEL);
-   *resid = handle;
-}
 
-static void virtio_gpu_ctx_id_put(struct virtio_gpu_device *vgdev, uint32_t id)
-{
-   ida_free(&vgdev->ctx_id_ida, id);
-}
-
-static void virtio_gpu_context_create(struct virtio_gpu_device *vgdev,
- uint32_t nlen, const char *name,
- uint32_t *ctx_id)
-{
-   virtio_gpu_ctx_id_get(vgdev, ctx_id);
-   virtio_gpu_cmd_context_create(vgdev, *ctx_id, nlen, name);
+   if (handle < 0)
+   return handle;
+   virtio_gpu_cmd_context_create(vgdev, handle, nlen, name);
+   return handle;
 }
 
 static void virtio_gpu_context_destroy(struct virtio_gpu_device *vgdev,
  uint32_t ctx_id)
 {
virtio_gpu_cmd_context_destroy(vgdev, ctx_id);
-   virtio_gpu_ctx_id_put(vgdev, ctx_id);
+   ida_free(&vgdev->ctx_id_ida, ctx_id);
 }
 
 static void virtio_gpu_init_vq(struct virtio_gpu_queue *vgvq,
@@ -261,7 +252,7 @@ int virtio_gpu_driver_open(struct drm_device *dev, struct 
drm_file *file)
 {
struct virtio_gpu_device *vgdev = dev->dev_private;
struct virtio_gpu_fpriv *vfpriv;
-   uint32_t id;
+   int id;
char dbgname[TASK_COMM_LEN];
 
/* can't create contexts without 3d renderer */
@@ -274,7 +265,9 @@ int virtio_gpu_driver_open(struct drm_device *dev, struct 
drm_file *file)
return -ENOMEM;
 
get_task_comm(dbgname, current);
-   virtio_gpu_context_create(vgdev, strlen(dbgname), dbgname, &id);
+   id = virtio_gpu_context_create(vgdev, strlen(dbgname), dbgname);
+   if (id < 0)
+   return id;
 
vfpriv->ctx_id = id;
file->driver_priv = vfpriv;
-- 
2.19.0


[PATCH 1/4] drm/virtio: Replace IDRs with IDAs

2018-09-26 Thread Matthew Wilcox
These IDRs were only being used to allocate unique numbers, not to look
up pointers, so they can use the more space-efficient IDA instead.

Signed-off-by: Matthew Wilcox 
---
 drivers/gpu/drm/virtio/virtgpu_drv.h |  6 ++
 drivers/gpu/drm/virtio/virtgpu_kms.c | 18 --
 drivers/gpu/drm/virtio/virtgpu_vq.c  | 12 ++--
 3 files changed, 8 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index 65605e207bbe..c4468a4e454e 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -180,8 +180,7 @@ struct virtio_gpu_device {
struct kmem_cache *vbufs;
bool vqs_ready;
 
-   struct idr  resource_idr;
-   spinlock_t resource_idr_lock;
+   struct ida  resource_ida;
 
wait_queue_head_t resp_wq;
/* current display info */
@@ -190,8 +189,7 @@ struct virtio_gpu_device {
 
struct virtio_gpu_fence_driver fence_drv;
 
-   struct idr  ctx_id_idr;
-   spinlock_t ctx_id_idr_lock;
+   struct ida  ctx_id_ida;
 
bool has_virgl_3d;
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
b/drivers/gpu/drm/virtio/virtgpu_kms.c
index 65060c08522d..e2604fe1b4ae 100644
--- a/drivers/gpu/drm/virtio/virtgpu_kms.c
+++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
@@ -55,21 +55,13 @@ static void virtio_gpu_config_changed_work_func(struct 
work_struct *work)
 static void virtio_gpu_ctx_id_get(struct virtio_gpu_device *vgdev,
  uint32_t *resid)
 {
-   int handle;
-
-   idr_preload(GFP_KERNEL);
-   spin_lock(&vgdev->ctx_id_idr_lock);
-   handle = idr_alloc(&vgdev->ctx_id_idr, NULL, 1, 0, 0);
-   spin_unlock(&vgdev->ctx_id_idr_lock);
-   idr_preload_end();
+   int handle = ida_alloc_min(&vgdev->ctx_id_ida, 1, GFP_KERNEL);
*resid = handle;
 }
 
 static void virtio_gpu_ctx_id_put(struct virtio_gpu_device *vgdev, uint32_t id)
 {
-   spin_lock(&vgdev->ctx_id_idr_lock);
-   idr_remove(&vgdev->ctx_id_idr, id);
-   spin_unlock(&vgdev->ctx_id_idr_lock);
+   ida_free(&vgdev->ctx_id_ida, id);
 }
 
 static void virtio_gpu_context_create(struct virtio_gpu_device *vgdev,
@@ -151,10 +143,8 @@ int virtio_gpu_driver_load(struct drm_device *dev, 
unsigned long flags)
vgdev->dev = dev->dev;
 
	spin_lock_init(&vgdev->display_info_lock);
-   spin_lock_init(&vgdev->ctx_id_idr_lock);
-   idr_init(&vgdev->ctx_id_idr);
-   spin_lock_init(&vgdev->resource_idr_lock);
-   idr_init(&vgdev->resource_idr);
+   ida_init(&vgdev->ctx_id_ida);
+   ida_init(&vgdev->resource_ida);
	init_waitqueue_head(&vgdev->resp_wq);
	virtio_gpu_init_vq(&vgdev->ctrlq, virtio_gpu_dequeue_ctrl_func);
	virtio_gpu_init_vq(&vgdev->cursorq, virtio_gpu_dequeue_cursor_func);
diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index 020070d483d3..58be09d2eed6 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -41,21 +41,13 @@
 void virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev,
uint32_t *resid)
 {
-   int handle;
-
-   idr_preload(GFP_KERNEL);
-   spin_lock(&vgdev->resource_idr_lock);
-   handle = idr_alloc(&vgdev->resource_idr, NULL, 1, 0, GFP_NOWAIT);
-   spin_unlock(&vgdev->resource_idr_lock);
-   idr_preload_end();
+   int handle = ida_alloc_min(&vgdev->resource_ida, 1, GFP_KERNEL);
*resid = handle;
 }
 
 void virtio_gpu_resource_id_put(struct virtio_gpu_device *vgdev, uint32_t id)
 {
-   spin_lock(&vgdev->resource_idr_lock);
-   idr_remove(&vgdev->resource_idr, id);
-   spin_unlock(&vgdev->resource_idr_lock);
+   ida_free(&vgdev->resource_ida, id);
 }
 
 void virtio_gpu_ctrl_ack(struct virtqueue *vq)
-- 
2.19.0


Re: [PATCH v33 1/4] mm: add a function to get free page blocks

2018-06-15 Thread Matthew Wilcox
On Fri, Jun 15, 2018 at 12:43:10PM +0800, Wei Wang wrote:
> +/**
> + * get_from_free_page_list - get free page blocks from a free page list
> + * @order: the order of the free page list to check
> + * @buf: the array to store the physical addresses of the free page blocks
> + * @size: the array size
> + *
> + * This function offers hints about free pages. There is no guarantee that
> + * the obtained free pages are still on the free page list after the function
> + * returns. pfn_to_page on the obtained free pages is strongly discouraged
> + * and if there is an absolute need for that, make sure to contact MM people
> + * to discuss potential problems.
> + *
> + * The addresses are currently stored to the array in little endian. This
> + * avoids the overhead of converting endianness by the caller who needs data
> + * in the little endian format. Big endian support can be added on demand in
> + * the future.
> + *
> + * Return the number of free page blocks obtained from the free page list.
> + * The maximum number of free page blocks that can be obtained is limited to
> + * the caller's array size.
> + */

Please use:

 * Return: The number of free page blocks obtained from the free page list.

Also, please include a

 * Context: Any context.

or

 * Context: Process context.

or whatever other context this function can be called from.  Since you're
taking the lock irqsafe, I assume this can be called from any context, but
I wonder if it makes sense to have this function callable from interrupt
context.  Maybe this should be callable from process context only.
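
Putting the two suggestions together, the tail of the kernel-doc comment
would presumably end up as:

 * Context: Process context.
 * Return: The number of free page blocks obtained from the free page list.
 */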

> +uint32_t get_from_free_page_list(int order, __le64 buf[], uint32_t size)
> +{
> + struct zone *zone;
> + enum migratetype mt;
> + struct page *page;
> + struct list_head *list;
> + unsigned long addr, flags;
> + uint32_t index = 0;
> +
> + for_each_populated_zone(zone) {
> + spin_lock_irqsave(&zone->lock, flags);
> + for (mt = 0; mt < MIGRATE_TYPES; mt++) {
> + list = &zone->free_area[order].free_list[mt];
> + list_for_each_entry(page, list, lru) {
> + addr = page_to_pfn(page) << PAGE_SHIFT;
> + if (likely(index < size)) {
> + buf[index++] = cpu_to_le64(addr);
> + } else {
> + spin_unlock_irqrestore(&zone->lock,
> +flags);
> + return index;
> + }
> + }
> + }
> + spin_unlock_irqrestore(&zone->lock, flags);
> + }
> +
> + return index;
> +}

I wonder if (to address Michael's concern) you shouldn't instead use
the first free chunk of pages to return the addresses of all the pages,
i.e. something like this:

__le64 *ret = NULL;
unsigned int max = (PAGE_SIZE << order) / sizeof(__le64);

for_each_populated_zone(zone) {
spin_lock_irq(&zone->lock);
for (mt = 0; mt < MIGRATE_TYPES; mt++) {
list = &zone->free_area[order].free_list[mt];
list_for_each_entry_safe(page, list, lru, ...) {
if (index == size)
break;
addr = page_to_pfn(page) << PAGE_SHIFT;
if (!ret) {
list_del(...);
ret = addr;
}
ret[index++] = cpu_to_le64(addr);
}
}
spin_unlock_irq(&zone->lock);
}

return ret;
}

You'll need to return the page to the freelist afterwards, but free_pages()
should take care of that.

Re: [PATCH 0/3] Use sbitmap instead of percpu_ida

2018-06-14 Thread Matthew Wilcox
On Thu, Jun 14, 2018 at 10:06:58PM -0400, Martin K. Petersen wrote:
> 
> Matthew,
> 
> > Removing the percpu_ida code nets over 400 lines of removal.  It's not
> > as spectacular as deleting an entire architecture, but it's still a
> > worthy reduction in lines of code.
> 
> Since most of the changes are in scsi or target, should I take this
> series through my tree?

I'd welcome that.  Nick seems to be inactive as target maintainer;
his tree on kernel.org hasn't seen any updates in five months.

Thanks!

[PATCH 3/3] Remove percpu_ida

2018-06-12 Thread Matthew Wilcox
With its one user gone, remove the library code.

Signed-off-by: Matthew Wilcox 
---
 include/linux/percpu_ida.h |  83 -
 lib/Makefile   |   2 +-
 lib/percpu_ida.c   | 370 -
 3 files changed, 1 insertion(+), 454 deletions(-)
 delete mode 100644 include/linux/percpu_ida.h
 delete mode 100644 lib/percpu_ida.c

diff --git a/include/linux/percpu_ida.h b/include/linux/percpu_ida.h
deleted file mode 100644
index 07d78e4653bc..
--- a/include/linux/percpu_ida.h
+++ /dev/null
@@ -1,83 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __PERCPU_IDA_H__
-#define __PERCPU_IDA_H__
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-struct percpu_ida_cpu;
-
-struct percpu_ida {
-   /*
-* number of tags available to be allocated, as passed to
-* percpu_ida_init()
-*/
-   unsignednr_tags;
-   unsignedpercpu_max_size;
-   unsignedpercpu_batch_size;
-
-   struct percpu_ida_cpu __percpu  *tag_cpu;
-
-   /*
-* Bitmap of cpus that (may) have tags on their percpu freelists:
-* steal_tags() uses this to decide when to steal tags, and which cpus
-* to try stealing from.
-*
-* It's ok for a freelist to be empty when its bit is set - steal_tags()
-* will just keep looking - but the bitmap _must_ be set whenever a
-* percpu freelist does have tags.
-*/
-   cpumask_t   cpus_have_tags;
-
-   struct {
-   spinlock_t  lock;
-   /*
-* When we go to steal tags from another cpu (see steal_tags()),
-* we want to pick a cpu at random. Cycling through them every
-* time we steal is a bit easier and more or less equivalent:
-*/
-   unsignedcpu_last_stolen;
-
-   /* For sleeping on allocation failure */
-   wait_queue_head_t   wait;
-
-   /*
-* Global freelist - it's a stack where nr_free points to the
-* top
-*/
-   unsignednr_free;
-   unsigned*freelist;
-   } cacheline_aligned_in_smp;
-};
-
-/*
- * Number of tags we move between the percpu freelist and the global freelist 
at
- * a time
- */
-#define IDA_DEFAULT_PCPU_BATCH_MOVE32U
-/* Max size of percpu freelist, */
-#define IDA_DEFAULT_PCPU_SIZE  ((IDA_DEFAULT_PCPU_BATCH_MOVE * 3) / 2)
-
-int percpu_ida_alloc(struct percpu_ida *pool, int state);
-void percpu_ida_free(struct percpu_ida *pool, unsigned tag);
-
-void percpu_ida_destroy(struct percpu_ida *pool);
-int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
-   unsigned long max_size, unsigned long batch_size);
-static inline int percpu_ida_init(struct percpu_ida *pool, unsigned long 
nr_tags)
-{
-   return __percpu_ida_init(pool, nr_tags, IDA_DEFAULT_PCPU_SIZE,
-   IDA_DEFAULT_PCPU_BATCH_MOVE);
-}
-
-typedef int (*percpu_ida_cb)(unsigned, void *);
-int percpu_ida_for_each_free(struct percpu_ida *pool, percpu_ida_cb fn,
-   void *data);
-
-unsigned percpu_ida_free_tags(struct percpu_ida *pool, int cpu);
-#endif /* __PERCPU_IDA_H__ */
diff --git a/lib/Makefile b/lib/Makefile
index 84c6dcb31fbb..f4722a7fa62c 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -40,7 +40,7 @@ obj-y += bcd.o div64.o sort.o parser.o debug_locks.o 
random32.o \
 bust_spinlocks.o kasprintf.o bitmap.o scatterlist.o \
 gcd.o lcm.o list_sort.o uuid.o flex_array.o iov_iter.o clz_ctz.o \
 bsearch.o find_bit.o llist.o memweight.o kfifo.o \
-percpu-refcount.o percpu_ida.o rhashtable.o reciprocal_div.o \
+percpu-refcount.o rhashtable.o reciprocal_div.o \
 once.o refcount.o usercopy.o errseq.o bucket_locks.o
 obj-$(CONFIG_STRING_SELFTEST) += test_string.o
 obj-y += string_helpers.o
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
deleted file mode 100644
index 9bbd9c5d375a..
--- a/lib/percpu_ida.c
+++ /dev/null
@@ -1,370 +0,0 @@
-/*
- * Percpu IDA library
- *
- * Copyright (C) 2013 Datera, Inc. Kent Overstreet
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation; either version 2, or (at
- * your option) any later version.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-struct percpu_ida_cpu

[PATCH 1/3] target: Abstract tag freeing

2018-06-12 Thread Matthew Wilcox
Introduce target_free_tag() and convert all drivers to use it.

Signed-off-by: Matthew Wilcox 
---
 drivers/scsi/qla2xxx/qla_target.c| 4 ++--
 drivers/target/iscsi/iscsi_target_util.c | 2 +-
 drivers/target/sbp/sbp_target.c  | 2 +-
 drivers/target/tcm_fc/tfc_cmd.c  | 4 ++--
 drivers/usb/gadget/function/f_tcm.c  | 2 +-
 drivers/vhost/scsi.c | 2 +-
 drivers/xen/xen-scsiback.c   | 4 +---
 include/target/target_core_base.h| 5 +
 8 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_target.c 
b/drivers/scsi/qla2xxx/qla_target.c
index b85c833099ff..05290966e630 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -3783,7 +3783,7 @@ void qlt_free_cmd(struct qla_tgt_cmd *cmd)
return;
}
cmd->jiffies_at_free = get_jiffies_64();
-   percpu_ida_free(&sess->se_sess->sess_tag_pool, cmd->se_cmd.map_tag);
+   target_free_tag(sess->se_sess, &cmd->se_cmd);
 }
 EXPORT_SYMBOL(qlt_free_cmd);
 
@@ -4146,7 +4146,7 @@ static void __qlt_do_work(struct qla_tgt_cmd *cmd)
qlt_send_term_exchange(qpair, NULL, >atio, 1, 0);
 
qlt_decr_num_pend_cmds(vha);
-   percpu_ida_free(&sess->se_sess->sess_tag_pool, cmd->se_cmd.map_tag);
+   target_free_tag(sess->se_sess, &cmd->se_cmd);
spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
 
	spin_lock_irqsave(&ha->tgt.sess_lock, flags);
diff --git a/drivers/target/iscsi/iscsi_target_util.c 
b/drivers/target/iscsi/iscsi_target_util.c
index 4435bf374d2d..7e98697cfb8e 100644
--- a/drivers/target/iscsi/iscsi_target_util.c
+++ b/drivers/target/iscsi/iscsi_target_util.c
@@ -711,7 +711,7 @@ void iscsit_release_cmd(struct iscsi_cmd *cmd)
kfree(cmd->iov_data);
kfree(cmd->text_in_ptr);
 
-   percpu_ida_free(&sess->se_sess->sess_tag_pool, se_cmd->map_tag);
+   target_free_tag(sess->se_sess, se_cmd);
 }
 EXPORT_SYMBOL(iscsit_release_cmd);
 
diff --git a/drivers/target/sbp/sbp_target.c b/drivers/target/sbp/sbp_target.c
index fb1003921d85..679ae29d25ab 100644
--- a/drivers/target/sbp/sbp_target.c
+++ b/drivers/target/sbp/sbp_target.c
@@ -1460,7 +1460,7 @@ static void sbp_free_request(struct sbp_target_request 
*req)
kfree(req->pg_tbl);
kfree(req->cmd_buf);
 
-   percpu_ida_free(&se_sess->sess_tag_pool, se_cmd->map_tag);
+   target_free_tag(se_sess, se_cmd);
 }
 
 static void sbp_mgt_agent_process(struct work_struct *work)
diff --git a/drivers/target/tcm_fc/tfc_cmd.c b/drivers/target/tcm_fc/tfc_cmd.c
index ec372860106f..13e4efbe1ce7 100644
--- a/drivers/target/tcm_fc/tfc_cmd.c
+++ b/drivers/target/tcm_fc/tfc_cmd.c
@@ -92,7 +92,7 @@ static void ft_free_cmd(struct ft_cmd *cmd)
if (fr_seq(fp))
fc_seq_release(fr_seq(fp));
fc_frame_free(fp);
-   percpu_ida_free(&sess->se_sess->sess_tag_pool, cmd->se_cmd.map_tag);
+   target_free_tag(sess->se_sess, &cmd->se_cmd);
ft_sess_put(sess);  /* undo get from lookup at recv */
 }
 
@@ -461,7 +461,7 @@ static void ft_recv_cmd(struct ft_sess *sess, struct 
fc_frame *fp)
cmd->sess = sess;
cmd->seq = fc_seq_assign(lport, fp);
if (!cmd->seq) {
-   percpu_ida_free(&se_sess->sess_tag_pool, tag);
+   target_free_tag(se_sess, &cmd->se_cmd);
goto busy;
}
cmd->req_frame = fp;/* hold frame during cmd */
diff --git a/drivers/usb/gadget/function/f_tcm.c 
b/drivers/usb/gadget/function/f_tcm.c
index d78dbb73bde8..9f670d9224b9 100644
--- a/drivers/usb/gadget/function/f_tcm.c
+++ b/drivers/usb/gadget/function/f_tcm.c
@@ -1288,7 +1288,7 @@ static void usbg_release_cmd(struct se_cmd *se_cmd)
struct se_session *se_sess = se_cmd->se_sess;
 
kfree(cmd->data_buf);
-   percpu_ida_free(&se_sess->sess_tag_pool, se_cmd->map_tag);
+   target_free_tag(se_sess, se_cmd);
 }
 
 static u32 usbg_sess_get_index(struct se_session *se_sess)
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 7ad57094d736..70d35e696533 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -324,7 +324,7 @@ static void vhost_scsi_release_cmd(struct se_cmd *se_cmd)
}
 
vhost_scsi_put_inflight(tv_cmd->inflight);
-   percpu_ida_free(&se_sess->sess_tag_pool, se_cmd->map_tag);
+   target_free_tag(se_sess, se_cmd);
 }
 
 static u32 vhost_scsi_sess_get_index(struct se_session *se_sess)
diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c
index 7bc88fd43cfc..ec6635258ed8 100644
--- a/drivers/xen/xen-scsiback.c
+++ b/drivers/xen/xen-scsiback.c
@@ -1377,9 +1377,7 @@ static int scsiback_check_stop_free(struct se_cmd *se_cmd)
 
 static void scsiback_release_cmd(struct se_cmd *se_cmd)
 {
-   struct se_session *se_sess = se_cmd->se_sess;
-
-   percpu_ida_free(&se_sess->sess_tag_
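
The include/target/target_core_base.h hunk is truncated above; at this
point in the series, target_free_tag() is presumably just a thin inline
wrapper over the old call:

	static inline void target_free_tag(struct se_session *sess,
					   struct se_cmd *cmd)
	{
		percpu_ida_free(&sess->sess_tag_pool, cmd->map_tag);
	}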

Re: [PATCH 1/2] Convert target drivers to use sbitmap

2018-06-12 Thread Matthew Wilcox
On Tue, Jun 12, 2018 at 04:32:03PM +, Bart Van Assche wrote:
> On Tue, 2018-06-12 at 09:15 -0700, Matthew Wilcox wrote:
> > On Tue, Jun 12, 2018 at 03:22:42PM +, Bart Van Assche wrote:
> > > Please introduce functions in the target core for allocating and freeing 
> > > a tag
> > > instead of spreading the knowledge of how to allocate and free tags over 
> > > all
> > > target drivers.
> > 
> > I can't without doing an unreasonably large amount of work on drivers that
> > I have no way to test.  Some of the drivers have the se_cmd already; some
> > of them don't.  I'd be happy to introduce a common function for freeing
> > a tag.
> 
> Which target drivers are you referring to? If you are referring to the sbp 
> driver:
> I think that driver is dead and can be removed from the kernel tree. I even 
> don't
> know whether that driver ever has had any users other than the developer of 
> that
> driver.

For example tcm_fc:

tag = sbitmap_queue_get(&se_sess->sess_tag_pool, &cpu);
if (tag < 0)
goto busy;

cmd = &((struct ft_cmd *)se_sess->sess_cmd_map)[tag];

or qla2xxx:

tag = sbitmap_queue_get(&se_sess->sess_tag_pool, &cpu);
if (tag < 0)
return NULL;

cmd = &((struct qla_tgt_cmd *)se_sess->sess_cmd_map)[tag];

The core doesn't know at what offset from the pointer to store the tag
& cpu.  Only the individual drivers know their cmd layout.
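
A hypothetical common allocation helper shows the problem -- the core
could hand back the command slot, but the tag and cpu still have to be
stored in the se_cmd embedded at a driver-private offset (names here are
illustrative, not proposed code):

	void *target_alloc_tag(struct se_session *se_sess, size_t cmd_size)
	{
		int cpu;
		int tag = sbitmap_queue_get(&se_sess->sess_tag_pool, &cpu);

		if (tag < 0)
			return NULL;
		/* caller must still record tag/cpu in its embedded se_cmd */
		return se_sess->sess_cmd_map + (size_t)tag * cmd_size;
	}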

  1   2   >