Re: [PATCH v3 2/2] secretmem: optimize page_is_secretmem()

2021-05-07 Thread Matthew Wilcox
On Tue, Apr 20, 2021 at 06:00:49PM +0300, Mike Rapoport wrote: > + mapping = (struct address_space *) > + ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS); > + > + if (mapping != page->mapping) > + return false; > + > + return page->mapping->a_ops == &secretmem_aops;
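For readers unfamiliar with the idiom in the quoted hunk, here is a minimal userspace sketch of the low-bit pointer masking it relies on. All names (toy_page, toy_mapping, TOY_MAPPING_FLAGS, toy_secret_ops) are hypothetical stand-ins rather than the kernel's definitions, the two-bit flag mask is an assumption, and the NULL check is an addition that does not appear in the quoted code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct toy_ops { int id; };
struct toy_mapping { const struct toy_ops *a_ops; };
struct toy_page { void *mapping; /* low bits may carry tag flags */ };

/* Assumption: the two lowest pointer bits are used as flags. */
#define TOY_MAPPING_FLAGS ((uintptr_t)0x3)

static const struct toy_ops toy_secret_ops = { 1 };

/* True only for an untagged, non-NULL mapping that uses toy_secret_ops. */
static bool toy_page_is_secret(const struct toy_page *page)
{
	uintptr_t raw = (uintptr_t)page->mapping;
	const struct toy_mapping *m =
		(const struct toy_mapping *)(raw & ~TOY_MAPPING_FLAGS);

	/* If masking changed the value, a flag bit was set: not a plain mapping. */
	if ((uintptr_t)m != raw)
		return false;
	/* Added for the sketch: avoid dereferencing a NULL mapping. */
	if (!m)
		return false;
	return m->a_ops == &toy_secret_ops;
}
```

The point of the comparison is that one mask and one compare reject every tagged pointer without consulting any other per-page state.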

Re: [PATCH v1 04/11] mm/memremap: add ZONE_DEVICE support for compound pages

2021-05-06 Thread Matthew Wilcox
On Thu, May 06, 2021 at 11:23:25AM +0100, Joao Martins wrote: > >> I think it is ok for dax/nvdimm to continue to maintain their align > >> value because it should be ok to have 4MB align if the device really > >> wanted. However, when it goes to map that alignment with > >> memremap_pages() it

Re: [PATCH v1 04/11] mm/memremap: add ZONE_DEVICE support for compound pages

2021-05-05 Thread Matthew Wilcox
On Wed, May 05, 2021 at 11:44:29AM -0700, Dan Williams wrote: > > @@ -6285,6 +6285,8 @@ void __ref memmap_init_zone_device(struct zone *zone, > > unsigned long pfn, end_pfn = start_pfn + nr_pages; > > struct pglist_data *pgdat = zone->zone_pgdat; > > struct vmem_altmap

Re: [PATCH v4 1/3] dax: Add an enum for specifying dax wakup mode

2021-04-26 Thread Matthew Wilcox
On Mon, Apr 26, 2021 at 01:52:17PM -0400, Vivek Goyal wrote: > On Mon, Apr 26, 2021 at 02:46:32PM +0100, Matthew Wilcox wrote: > > On Fri, Apr 23, 2021 at 09:07:21AM -0400, Vivek Goyal wrote: > > > +enum dax_wake_mode { > > > + WAKE_NEXT, > > > + WAKE_ALL

Re: [PATCH v4 1/3] dax: Add an enum for specifying dax wakup mode

2021-04-26 Thread Matthew Wilcox
On Fri, Apr 23, 2021 at 09:07:21AM -0400, Vivek Goyal wrote: > +enum dax_wake_mode { > + WAKE_NEXT, > + WAKE_ALL, > +}; Why define them in this order when ... > @@ -196,7 +207,7 @@ static void dax_wake_entry(struct xa_state *xas, void > *entry, bool wake_all) >* must be in the

Re: [PATCH v3 1/3] dax: Add an enum for specifying dax wakup mode

2021-04-21 Thread Matthew Wilcox
On Wed, Apr 21, 2021 at 11:56:31AM -0400, Vivek Goyal wrote: > +/** > + * enum dax_entry_wake_mode: waitqueue wakeup toggle s/toggle/behaviour/ ? > + * @WAKE_NEXT: wake only the first waiter in the waitqueue > + * @WAKE_ALL: wake all waiters in the waitqueue > + */ > +enum dax_entry_wake_mode {

Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread Matthew Wilcox
On Mon, Apr 19, 2021 at 02:56:17PM +0300, Mike Rapoport wrote: > On Mon, Apr 19, 2021 at 12:23:02PM +0100, Matthew Wilcox wrote: > > So you're calling page_is_secretmem() on a struct page without having > > a refcount on it. That is definitely not allowed. secretmem seems

Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread Matthew Wilcox
On Mon, Apr 19, 2021 at 12:36:19PM +0300, Mike Rapoport wrote: > Well, most if the -4.2% of the performance regression kbuild reported were > due to repeated compount_head(page) in page_mapping(). So the whole point > of this patch is to avoid calling page_mapping(). It's quite ludicrous how many

Re: [PATCH] secretmem: optimize page_is_secretmem()

2021-04-19 Thread Matthew Wilcox
On Mon, Apr 19, 2021 at 11:42:18AM +0300, Mike Rapoport wrote: > The perf profile of the test indicated that the regression is caused by > page_is_secretmem() called from gup_pte_range() (inlined by gup_pgd_range): Uhh ... you're calling it in the wrong place!

Re: [PATCH 1/3] fsdax: Factor helpers to simplify dax fault code

2021-04-07 Thread Matthew Wilcox
On Wed, Apr 07, 2021 at 02:32:05PM +0800, Shiyang Ruan wrote: > +static int dax_fault_cow_page(struct vm_fault *vmf, struct iomap *iomap, > + loff_t pos, vm_fault_t *ret) > +{ > + int error = 0; > + unsigned long vaddr = vmf->address; > + sector_t sector =

Re: BUG_ON(!mapping_empty(&inode->i_data))

2021-04-02 Thread Matthew Wilcox
OK, more competent testing, and that previous bug now detected and fixed. I have a reasonable amount of confidence this will solve your problem. If you do apply this patch, don't enable CONFIG_TEST_XARRAY as the new tests assume that attempting to allocate with a GFP flags of 0 will definitely

Re: BUG_ON(!mapping_empty(&inode->i_data))

2021-04-02 Thread Matthew Wilcox
On Fri, Apr 02, 2021 at 04:13:05AM +0100, Matthew Wilcox wrote: > + for (;;) { > + xas_load(xas); > + if (!xas_is_node(xas)) > + break; > + xas_delete_node(xas); > + xas->xa_index -= XA_CHUNK_SIZE;

Re: BUG_ON(!mapping_empty(&inode->i_data))

2021-04-01 Thread Matthew Wilcox
On Thu, Apr 01, 2021 at 06:06:15PM +0100, Matthew Wilcox wrote: > On Wed, Mar 31, 2021 at 02:58:12PM -0700, Hugh Dickins wrote: > > I suspect there's a bug in the XArray handling in collapse_file(), > > which sometimes leaves empty nodes behind. > > Urp, yes, t

Re: BUG_ON(!mapping_empty(&inode->i_data))

2021-04-01 Thread Matthew Wilcox
On Wed, Mar 31, 2021 at 02:58:12PM -0700, Hugh Dickins wrote: > I suspect there's a bug in the XArray handling in collapse_file(), > which sometimes leaves empty nodes behind. Urp, yes, that can easily happen. /* This will be less messy when we use multi-index entries */ do {

Re: BUG_ON(!mapping_empty(&inode->i_data))

2021-03-30 Thread Matthew Wilcox
On Tue, Mar 30, 2021 at 06:30:22PM -0700, Hugh Dickins wrote: > Running my usual tmpfs kernel builds swapping load, on Sunday's rc4-mm1 > mmotm (I never got to try rc3-mm1 but presume it behaved the same way), > I hit clear_inode()'s BUG_ON(!mapping_empty(&inode->i_data)); on two > machines, within an

Re: [PATCH v2 0/4] Remove nrexceptional tracking

2021-03-12 Thread Matthew Wilcox
Ping? On Thu, Jan 21, 2021 at 06:43:34PM +0000, Matthew Wilcox wrote: > Ping? These patches still apply to next-20210121. > > On Mon, Oct 26, 2020 at 03:18:45PM +0000, Matthew Wilcox (Oracle) wrote: > > We actually use nrexceptional for very little these days. It's a minor

Re: [PATCH v2 00/10] fsdax,xfs: Add reflink support for fsdax

2021-03-10 Thread Matthew Wilcox
On Wed, Mar 10, 2021 at 08:21:59AM -0600, Goldwyn Rodrigues wrote: > On 13:02 10/03, Matthew Wilcox wrote: > > On Wed, Mar 10, 2021 at 07:30:41AM -0500, Neal Gompa wrote: > > > Forgive my ignorance, but is there a reason why this isn't wired up to > > > Btrfs at the same

Re: [PATCH v2 00/10] fsdax,xfs: Add reflink support for fsdax

2021-03-10 Thread Matthew Wilcox
On Wed, Mar 10, 2021 at 08:36:06AM -0500, Neal Gompa wrote: > On Wed, Mar 10, 2021 at 8:02 AM Matthew Wilcox wrote: > > > > On Wed, Mar 10, 2021 at 07:30:41AM -0500, Neal Gompa wrote: > > > Forgive my ignorance, but is there a reason why this isn't wired up to >

Re: [PATCH v2 00/10] fsdax,xfs: Add reflink support for fsdax

2021-03-10 Thread Matthew Wilcox
On Wed, Mar 10, 2021 at 07:30:41AM -0500, Neal Gompa wrote: > Forgive my ignorance, but is there a reason why this isn't wired up to > Btrfs at the same time? It seems weird to me that adding a feature btrfs doesn't support DAX. only ext2, ext4, XFS and FUSE have DAX support. If you think about

[PATCH v2] include: Remove pagemap.h from blkdev.h

2021-03-09 Thread Matthew Wilcox (Oracle)
, but there may be implicit include problems on other architectures. Signed-off-by: Matthew Wilcox (Oracle) --- v2: Fix CONFIG_SWAP=n implicit use of pagemap.h by swap.h. Increases the number of files from 240, but that's still a big win -- 68% reduction instead of 77%. block/blk-settings.c

[PATCH] include: Remove pagemap.h from blkdev.h

2021-03-09 Thread Matthew Wilcox (Oracle)
, but there may be implicit include problems on other architectures. Signed-off-by: Matthew Wilcox (Oracle) --- block/blk-settings.c | 1 + drivers/block/brd.c | 1 + drivers/block/loop.c | 1 + drivers/md/bcache/super.c | 1 + drivers/nvdimm/btt.c | 1 + drivers/nvdimm/pmem.c

Re: [PATCH v16 08/11] secretmem: add memcg accounting

2021-01-26 Thread Matthew Wilcox
On Mon, Jan 25, 2021 at 11:38:17PM +0200, Mike Rapoport wrote: > I cannot use __GFP_ACCOUNT because cma_alloc() does not use gfp. > Besides, kmem accounting with __GFP_ACCOUNT does not seem > to update stats and there was an explicit request for statistics: > >

Re: [PATCH v16 08/11] secretmem: add memcg accounting

2021-01-25 Thread Matthew Wilcox
Hildenbrand > Cc: Elena Reshetova > Cc: Hagen Paul Pfeifer > Cc: "H. Peter Anvin" > Cc: Ingo Molnar > Cc: James Bottomley > Cc: "Kirill A. Shutemov" > Cc: Mark Rutland > Cc: Matthew Wilcox > Cc: Michael Kerrisk > Cc: Palmer Dabbelt >

Re: [PATCH v2 1/4] mm: Introduce and use mapping_empty

2021-01-21 Thread Matthew Wilcox
On Thu, Jan 21, 2021 at 03:42:31PM -0500, Johannes Weiner wrote: > On Mon, Oct 26, 2020 at 03:18:46PM +0000, Matthew Wilcox (Oracle) wrote: > > Instead of checking the two counters (nrpages and nrexceptional), we > > can just check whether i_pages is empty. > > > > Si

Re: Expense of read_iter

2021-01-21 Thread Matthew Wilcox
On Wed, Jan 20, 2021 at 10:12:01AM -0500, Mikulas Patocka wrote: > Do you have some idea how to optimize the generic code that calls > ->read_iter? Yes. > It might be better to maintain an f_iocb_flags in the > struct file and just copy that unconditionally. We'd need to remember > to update

Re: [PATCH v15 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation

2021-01-20 Thread Matthew Wilcox
On Wed, Jan 20, 2021 at 08:06:08PM +0200, Mike Rapoport wrote: > +static int secretmem_pool_increase(struct secretmem_ctx *ctx, gfp_t gfp) > { > + unsigned long nr_pages = (1 << PMD_PAGE_ORDER); > + struct gen_pool *pool = ctx->pool; > + unsigned long addr; > + struct page *page;

Re: [PATCH v15 06/11] mm: introduce memfd_secret system call to create "secret" memory areas

2021-01-20 Thread Matthew Wilcox
On Wed, Jan 20, 2021 at 08:06:07PM +0200, Mike Rapoport wrote: > +static struct page *secretmem_alloc_page(gfp_t gfp) > +{ > + /* > + * FIXME: use a cache of large pages to reduce the direct map > + * fragmentation > + */ > + return alloc_page(gfp); > +} > + > +static

Re: [PATCH v14 05/10] mm: introduce memfd_secret system call to create "secret" memory areas

2021-01-20 Thread Matthew Wilcox
On Wed, Jan 20, 2021 at 05:05:10PM +0200, Mike Rapoport wrote: > On Tue, Jan 19, 2021 at 08:22:13PM +0000, Matthew Wilcox wrote: > > On Thu, Dec 03, 2020 at 08:29:44AM +0200, Mike Rapoport wrote: > > > +static vm_fault_t secretmem_fault(struct vm_fault *vmf) > > > +{

Re: [PATCH v14 05/10] mm: introduce memfd_secret system call to create "secret" memory areas

2021-01-19 Thread Matthew Wilcox
On Thu, Dec 03, 2020 at 08:29:44AM +0200, Mike Rapoport wrote: > +static vm_fault_t secretmem_fault(struct vm_fault *vmf) > +{ > + struct address_space *mapping = vmf->vma->vm_file->f_mapping; > + struct inode *inode = file_inode(vmf->vma->vm_file); > + pgoff_t offset = vmf->pgoff; > +

Re: Expense of read_iter

2021-01-10 Thread Matthew Wilcox
On Sun, Jan 10, 2021 at 04:19:15PM -0500, Mikulas Patocka wrote: > I put counters into vfs_read and vfs_readv. > > After a fresh boot of the virtual machine, the counters show "13385 4". > After a kernel compilation they show "4475220 8". > > So, the readv path is almost unused. > > My

Re: Expense of read_iter

2021-01-09 Thread Matthew Wilcox
On Thu, Jan 07, 2021 at 01:59:01PM -0500, Mikulas Patocka wrote: > On Thu, 7 Jan 2021, Matthew Wilcox wrote: > > On Thu, Jan 07, 2021 at 08:15:41AM -0500, Mikulas Patocka wrote: > > > I'd like to ask about this piece of code in __kernel_read: > > > if (unlikely(!fi

Expense of read_iter

2021-01-07 Thread Matthew Wilcox
On Thu, Jan 07, 2021 at 08:15:41AM -0500, Mikulas Patocka wrote: > I'd like to ask about this piece of code in __kernel_read: > if (unlikely(!file->f_op->read_iter || file->f_op->read)) > return warn_unsupported... > and __kernel_write: > if

Re: [PATCH v2] fs/dax: include to fix build error on ARC

2021-01-04 Thread Matthew Wilcox
On Mon, Jan 04, 2021 at 12:13:02PM -0800, Dan Williams wrote: > On Thu, Dec 31, 2020 at 8:29 PM Randy Dunlap wrote: > > +++ lnx-511-rc1/fs/dax.c > > @@ -25,6 +25,7 @@ > > #include > > #include > > #include > > +#include > > I would expect this to come from one of the linux/ includes like

Re: [PATCH RFC 6/9] mm/gup: Grab head page refcount once for group of subpages

2020-12-09 Thread Matthew Wilcox
On Wed, Dec 09, 2020 at 12:24:38PM -0400, Jason Gunthorpe wrote: > On Wed, Dec 09, 2020 at 04:02:05PM +0000, Joao Martins wrote: > > > Today (without the series) struct pages are not represented the way they > > are expressed in the page tables, which is what I am hoping to fix in this > > series

Re: [PATCH RFC 1/9] memremap: add ZONE_DEVICE support for compound pages

2020-12-08 Thread Matthew Wilcox
On Tue, Dec 08, 2020 at 09:59:19PM -0800, John Hubbard wrote: > On 12/8/20 9:28 AM, Joao Martins wrote: > > Add a new flag for struct dev_pagemap which designates that a a pagemap > > a a > > > is described as a set of compound pages or in other words, that how > > pages are grouped together in

Re: PATCH] fs/dax: fix compile problem on parisc and mips

2020-12-04 Thread Matthew Wilcox
On Fri, Dec 04, 2020 at 08:28:47AM -0500, John David Anglin wrote: > (.mlocate): page allocation failure: order:5, > mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0 >  [<4035416c>] __kmalloc+0x5e4/0x740 >  [<040ddbe8>] nfsd_reply_cache_init+0x1d0/0x360

Re: PATCH] fs/dax: fix compile problem on parisc and mips

2020-12-04 Thread Matthew Wilcox
On Fri, Dec 04, 2020 at 08:57:37AM +0100, Helge Deller wrote: > On 12/4/20 4:48 AM, Matthew Wilcox wrote: > > On Thu, Dec 03, 2020 at 04:33:10PM -0800, James Bottomley wrote: > >> These platforms define PMD_ORDER in asm/pgtable.h > > > > I think that's the real pr

Re: PATCH] fs/dax: fix compile problem on parisc and mips

2020-12-03 Thread Matthew Wilcox
On Thu, Dec 03, 2020 at 04:33:10PM -0800, James Bottomley wrote: > These platforms define PMD_ORDER in asm/pgtable.h I think that's the real problem, though. #define PGD_ORDER 1 /* Number of pages per pgd */ #define PMD_ORDER 1 /* Number of pages per pmd */ #define PGD_ALLOC_ORDER (2

Re: mapcount corruption regression

2020-12-01 Thread Matthew Wilcox
On Tue, Dec 01, 2020 at 06:28:45PM -0800, Dan Williams wrote: > On Tue, Dec 1, 2020 at 12:49 PM Matthew Wilcox wrote: > > > > On Tue, Dec 01, 2020 at 12:42:39PM -0800, Dan Williams wrote: > > > On Mon, Nov 30, 2020 at 6:24 PM Matthew Wilcox > > > wrote: >

Re: mapcount corruption regression

2020-12-01 Thread Matthew Wilcox
On Tue, Dec 01, 2020 at 12:42:39PM -0800, Dan Williams wrote: > On Mon, Nov 30, 2020 at 6:24 PM Matthew Wilcox wrote: > > > > On Mon, Nov 30, 2020 at 05:20:25PM -0800, Dan Williams wrote: > > > Kirill, Willy, compound page experts, > > > > > > I am se

Re: mapcount corruption regression

2020-11-30 Thread Matthew Wilcox
On Mon, Nov 30, 2020 at 05:20:25PM -0800, Dan Williams wrote: > Kirill, Willy, compound page experts, > > I am seeking some debug ideas about the following splat: > > BUG: Bad page state in process lt-pmem-ns pfn:121a12 > page:51ef73f7 refcount:0 mapcount:-1024 >

Re: [RFC PATCH 1/3] fs: dax.c: move fs hole signifier from DAX_ZERO_PAGE to XA_ZERO_ENTRY

2020-11-30 Thread Matthew Wilcox
On Mon, Nov 30, 2020 at 04:09:23PM +0100, Jan Kara wrote: > On Mon 30-11-20 06:22:42, Amy Parker wrote: > > > > +/* > > > > + * A zero entry, XA_ZERO_ENTRY, is used to represent a zero page. This > > > > + * definition helps with checking if an entry is a PMD size. > > > > + */ > > > > +#define

Re: [PATCH v8 4/9] mm: introduce memfd_secret system call to create "secret" memory areas

2020-11-13 Thread Matthew Wilcox
On Tue, Nov 10, 2020 at 05:14:39PM +0200, Mike Rapoport wrote: > diff --git a/mm/Kconfig b/mm/Kconfig > index c89c5444924b..d8d170fa5210 100644 > --- a/mm/Kconfig > +++ b/mm/Kconfig > @@ -884,4 +884,7 @@ config ARCH_HAS_HUGEPD > config MAPPING_DIRTY_HELPERS > bool > > +config SECRETMEM

Re: [PATCH v8 4/9] mm: introduce memfd_secret system call to create "secret" memory areas

2020-11-13 Thread Matthew Wilcox
On Tue, Nov 10, 2020 at 05:14:39PM +0200, Mike Rapoport wrote: > +static vm_fault_t secretmem_fault(struct vm_fault *vmf) > +{ > + struct address_space *mapping = vmf->vma->vm_file->f_mapping; > + struct inode *inode = file_inode(vmf->vma->vm_file); > + pgoff_t offset = vmf->pgoff; > +

Re: [PATCH 2/2] mm: simplify follow_pte{,pmd}

2020-11-10 Thread Matthew Wilcox
lwig Reviewed-by: Matthew Wilcox (Oracle) I'm not entirely convinced this is the right interface, but your patch makes things better, so I approve. ___ Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org To unsubscribe send an email to linux-nvdimm-le...@lists.01.org

Re: [PATCH 1/2] mm: unexport follow_pte_pmd

2020-11-10 Thread Matthew Wilcox
On Thu, Oct 29, 2020 at 11:14:31AM +0100, Christoph Hellwig wrote: > follow_pte_pmd is only used by the DAX code, which can't be modular. > > Signed-off-by: Christoph Hellwig Reviewed-by: Matthew Wilcox (Oracle)

Re: Best solution for shifting DAX_ZERO_PAGE to XA_ZERO_ENTRY

2020-11-08 Thread Matthew Wilcox
On Sun, Nov 08, 2020 at 05:54:14PM -0800, Amy Parker wrote: > On Sun, Nov 8, 2020 at 5:50 PM Matthew Wilcox wrote: > > > > On Sun, Nov 08, 2020 at 05:33:22PM -0800, Darrick J. Wong wrote: > > > On Sun, Nov 08, 2020 at 05:15:55PM -0800, Amy Parker wrote: > > >

Re: Best solution for shifting DAX_ZERO_PAGE to XA_ZERO_ENTRY

2020-11-08 Thread Matthew Wilcox
On Sun, Nov 08, 2020 at 05:33:22PM -0800, Darrick J. Wong wrote: > On Sun, Nov 08, 2020 at 05:15:55PM -0800, Amy Parker wrote: > > I've been writing a patch to migrate the defined DAX_ZERO_PAGE > > to XA_ZERO_ENTRY for representing holes in files. > > Why? IIRC XA_ZERO_ENTRY ("no mapping in the

[PATCH v2 0/4] Remove nrexceptional tracking

2020-10-26 Thread Matthew Wilcox (Oracle)
boundary, so it doesn't save any memory for ext4. Matthew Wilcox (Oracle) (4): mm: Introduce and use mapping_empty mm: Stop accounting shadow entries dax: Account DAX entries as nrpages mm: Remove nrexceptional from inode fs/block_dev.c | 2 +- fs/dax.c| 8

[PATCH v2 2/4] mm: Stop accounting shadow entries

2020-10-26 Thread Matthew Wilcox (Oracle)
We no longer need to keep track of how many shadow entries are present in a mapping. This saves a few writes to the inode and memory barriers. Signed-off-by: Matthew Wilcox (Oracle) Tested-by: Vishal Verma --- mm/filemap.c| 13 - mm/swap_state.c | 4 mm/truncate.c | 1

[PATCH v2 4/4] mm: Remove nrexceptional from inode

2020-10-26 Thread Matthew Wilcox (Oracle)
We no longer track anything in nrexceptional, so remove it, saving 8 bytes per inode. Signed-off-by: Matthew Wilcox (Oracle) Tested-by: Vishal Verma --- fs/inode.c | 2 +- include/linux/fs.h | 2 -- 2 files changed, 1 insertion(+), 3 deletions(-) diff --git a/fs/inode.c b/fs/inode.c

[PATCH v2 1/4] mm: Introduce and use mapping_empty

2020-10-26 Thread Matthew Wilcox (Oracle)
Instead of checking the two counters (nrpages and nrexceptional), we can just check whether i_pages is empty. Signed-off-by: Matthew Wilcox (Oracle) Tested-by: Vishal Verma --- fs/block_dev.c | 2 +- fs/dax.c| 2 +- fs/gfs2/glock.c | 3 +-- include/linux
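As an illustration of the change this patch describes, the following userspace sketch contrasts the two ways of asking whether a mapping holds anything. The struct and function names are hypothetical, and the single head pointer is only an assumed stand-in for the i_pages tree; it is not the kernel's XArray.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an address_space; not the kernel structure. */
struct toy_mapping {
	void *i_pages_head;		/* assumed: NULL when the tree holds no entries */
	unsigned long nrpages;		/* count of ordinary page entries */
	unsigned long nrexceptional;	/* count of shadow/DAX-style entries */
};

/* Old style: two counters that must both be kept in sync with the tree. */
static bool toy_mapping_empty_counters(const struct toy_mapping *m)
{
	return m->nrpages == 0 && m->nrexceptional == 0;
}

/* New style: ask the tree itself, so no counter can drift out of date. */
static bool toy_mapping_empty(const struct toy_mapping *m)
{
	return m->i_pages_head == NULL;
}
```

Once every caller uses the second form, nothing reads the second counter for emptiness checks any more, which is what makes dropping it later in the series possible.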

[PATCH v2 3/4] dax: Account DAX entries as nrpages

2020-10-26 Thread Matthew Wilcox (Oracle)
Simplify mapping_needs_writeback() by accounting DAX entries as pages instead of exceptional entries. Signed-off-by: Matthew Wilcox (Oracle) Tested-by: Vishal Verma --- fs/dax.c | 6 +++--- mm/filemap.c | 3 --- 2 files changed, 3 insertions(+), 6 deletions(-) diff --git a/fs/dax.c b/fs

Re: [Ocfs2-devel] [RFC] treewide: cleanup unreachable breaks

2020-10-18 Thread Matthew Wilcox
On Sun, Oct 18, 2020 at 12:13:35PM -0700, James Bottomley wrote: > On Sun, 2020-10-18 at 19:59 +0100, Matthew Wilcox wrote: > > On Sat, Oct 17, 2020 at 09:09:28AM -0700, t...@redhat.com wrote: > > > clang has a number of useful, new warnings see > > > https:

Re: [Ocfs2-devel] [RFC] treewide: cleanup unreachable breaks

2020-10-18 Thread Matthew Wilcox
On Sat, Oct 17, 2020 at 09:09:28AM -0700, t...@redhat.com wrote: > clang has a number of useful, new warnings see > https://clang.llvm.org/docs/DiagnosticsReference.html > Please

Re: [PATCH RFC PKS/PMEM 33/58] fs/cramfs: Utilize new kmap_thread()

2020-10-13 Thread Matthew Wilcox
On Tue, Oct 13, 2020 at 11:44:29AM -0700, Dan Williams wrote: > On Fri, Oct 9, 2020 at 12:52 PM wrote: > > > > From: Ira Weiny > > > > The kmap() calls in this FS are localized to a single thread. To avoid > > the over head of global PKRS updates use the new kmap_thread() call. > > > > Cc:

Re: [PATCH RFC PKS/PMEM 22/58] fs/f2fs: Utilize new kmap_thread()

2020-10-12 Thread Matthew Wilcox
On Mon, Oct 12, 2020 at 12:53:54PM -0700, Ira Weiny wrote: > On Mon, Oct 12, 2020 at 05:44:38PM +0100, Matthew Wilcox wrote: > > On Mon, Oct 12, 2020 at 09:28:29AM -0700, Dave Hansen wrote: > > > kmap_atomic() is always preferred over kmap()/kmap_thread(). > > > k

Re: [PATCH RFC PKS/PMEM 22/58] fs/f2fs: Utilize new kmap_thread()

2020-10-12 Thread Matthew Wilcox
On Mon, Oct 12, 2020 at 09:28:29AM -0700, Dave Hansen wrote: > kmap_atomic() is always preferred over kmap()/kmap_thread(). > kmap_atomic() is _much_ more lightweight since its TLB invalidation is > always CPU-local and never broadcast. > > So, basically, unless you *must* sleep while the mapping

Re: [PATCH RFC PKS/PMEM 22/58] fs/f2fs: Utilize new kmap_thread()

2020-10-09 Thread Matthew Wilcox
On Fri, Oct 09, 2020 at 02:34:34PM -0700, Eric Biggers wrote: > On Fri, Oct 09, 2020 at 12:49:57PM -0700, ira.we...@intel.com wrote: > > The kmap() calls in this FS are localized to a single thread. To avoid > > the over head of global PKRS updates use the new kmap_thread() call. > > > > @@

Re: [PATCH 0/4] Remove nrexceptional tracking

2020-10-08 Thread Matthew Wilcox
On Thu, Aug 06, 2020 at 08:16:02PM +0000, Verma, Vishal L wrote: > On Thu, 2020-08-06 at 19:44 +0000, Verma, Vishal L wrote: > > > > > > I'm running xfstests on this patchset right now. If one of the DAX > > > people could try it out, that'd be fantastic. > >

Re: [PATCH v6 5/6] mm: secretmem: use PMD-size pages to amortize direct map fragmentation

2020-09-30 Thread Matthew Wilcox
On Wed, Sep 30, 2020 at 01:27:45PM +0300, Mike Rapoport wrote: > On Tue, Sep 29, 2020 at 05:15:52PM +0200, Peter Zijlstra wrote: > > On Tue, Sep 29, 2020 at 05:58:13PM +0300, Mike Rapoport wrote: > > > On Tue, Sep 29, 2020 at 04:12:16PM +0200, Peter Zijlstra wrote: > > > > > > It will drop them

Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page

2020-09-23 Thread Matthew Wilcox
On Tue, Sep 22, 2020 at 06:05:26PM +0100, Matthew Wilcox wrote: > On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote: > > On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote: > > > Size the uptodate array dynamically to support larger pages in the > >

Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page

2020-09-23 Thread Matthew Wilcox
On Tue, Sep 22, 2020 at 10:00:01PM -0700, Darrick J. Wong wrote: > On Wed, Sep 23, 2020 at 03:48:59AM +0100, Matthew Wilcox wrote: > > On Tue, Sep 22, 2020 at 09:06:03PM -0400, Qian Cai wrote: > > > On Tue, 2020-09-22 at 18:05 +0100, Matthew Wilcox wrote: > > > > On T

Re: NVFS XFS metadata (was: [PATCH] pmem: export the symbols __copy_user_flushcache and __copy_from_user_flushcache)

2020-09-23 Thread Matthew Wilcox
On Wed, Sep 23, 2020 at 09:11:43AM -0400, Mikulas Patocka wrote: > I also don't know how to implement journling on persistent memory :) On > EXT4 or XFS you can pin dirty buffers in memory until the journal is > flushed. This is obviously impossible on persistent memory. So, I'm > considering

Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page

2020-09-22 Thread Matthew Wilcox
On Tue, Sep 22, 2020 at 09:06:03PM -0400, Qian Cai wrote: > On Tue, 2020-09-22 at 18:05 +0100, Matthew Wilcox wrote: > > On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote: > > > On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote: > > > > Size

Re: NVFS XFS metadata (was: [PATCH] pmem: export the symbols __copy_user_flushcache and __copy_from_user_flushcache)

2020-09-22 Thread Matthew Wilcox
On Tue, Sep 22, 2020 at 12:46:05PM -0400, Mikulas Patocka wrote: > I agree that the b+tree were a good choice for XFS. > > In RAM-based maps, red-black trees or avl trees are used often. In > disk-based maps, btrees or b+trees are used. That's because in RAM, you > are optimizing for the number

Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page

2020-09-22 Thread Matthew Wilcox
On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote: > On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote: > > Size the uptodate array dynamically to support larger pages in the > > page cache. With a 64kB page, we're only saving 8 bytes per page today, >

Re: NVFS XFS metadata (was: [PATCH] pmem: export the symbols __copy_user_flushcache and __copy_from_user_flushcache)

2020-09-22 Thread Matthew Wilcox
On Mon, Sep 21, 2020 at 12:20:42PM -0400, Mikulas Patocka wrote: > The same for directories - NVFS hashes the file name and uses radix-tree > to locate a directory page where the directory entry is located. XFS > b+trees would result in much more accesses than the radix-tree. What? Radix trees

Re: [PATCH v2 9/9] iomap: Change calling convention for zeroing

2020-09-17 Thread Matthew Wilcox
On Thu, Sep 17, 2020 at 03:05:00PM -0700, Darrick J. Wong wrote: > > -static loff_t > > -iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count, > > - void *data, struct iomap *iomap, struct iomap *srcmap) > > +static loff_t iomap_zero_range_actor(struct inode *inode,

Re: [PATCH v2 2/9] fs: Introduce i_blocks_per_page

2020-09-15 Thread Matthew Wilcox
On Tue, Sep 15, 2020 at 03:40:52PM +0000, David Laight wrote: > > @@ -147,7 +147,7 @@ iomap_iop_set_range_uptodate(struct page *page, > > unsigned off, unsigned len) > > unsigned int i; > > > > spin_lock_irqsave(&iop->uptodate_lock, flags); > > - for (i = 0; i < PAGE_SIZE /

Re: [RFC] nvfs: a filesystem for persistent memory

2020-09-15 Thread Matthew Wilcox
On Tue, Sep 15, 2020 at 08:34:41AM -0400, Mikulas Patocka wrote: > - when the fsck.nvfs tool mmaps the device /dev/pmem0, the kernel uses > buffer cache for the mapping. The buffer cache slows does fsck by a factor > of 5 to 10. Could it be possible to change the kernel so that it maps DAX >

[PATCH v2 2/9] fs: Introduce i_blocks_per_page

2020-09-10 Thread Matthew Wilcox (Oracle)
This helper is useful for both THPs and for supporting block size larger than page size. Convert all users that I could find (we have a few different ways of writing this idiom, and I may have missed some). Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Christoph Hellwig Reviewed-by: Dave
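The idiom this helper replaces is a shift of the page size by the filesystem's block-size bits. A minimal sketch follows, with a hypothetical function name and plain integers in place of the inode and page arguments the real helper takes.

```c
#include <stddef.h>

/*
 * Hypothetical userspace version: number of filesystem blocks that fit in
 * a page of page_bytes bytes when each block is (1 << blkbits) bytes.
 */
static inline unsigned int toy_blocks_per_page(size_t page_bytes,
					       unsigned int blkbits)
{
	return (unsigned int)(page_bytes >> blkbits);
}
```

Taking the page size as an input, instead of hard-coding PAGE_SIZE, is what lets the same expression cover both a 4kB page and a 2MB THP.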

[PATCH v2 7/9] iomap: Convert write_count to write_bytes_pending

2020-09-10 Thread Matthew Wilcox (Oracle)
Instead of counting bio segments, count the number of bytes submitted. This insulates us from the block layer's definition of what a 'same page' is, which is not necessarily clear once THPs are involved. Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Christoph Hellwig --- fs/iomap
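A rough sketch of byte-based completion accounting, using C11 atomics in userspace. The struct and function names are hypothetical, and this only models the counting; it says nothing about how the real code ties the counter to bio completion.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state; only the pending-byte counter is modelled. */
struct toy_page_state {
	atomic_size_t write_bytes_pending;
};

/* Record that `bytes` more bytes of this page have been submitted for I/O. */
static void toy_submit(struct toy_page_state *s, size_t bytes)
{
	atomic_fetch_add(&s->write_bytes_pending, bytes);
}

/*
 * Record completion of `bytes` bytes. Returns true when this call brought
 * the counter to zero, i.e. it was the last outstanding completion.
 */
static bool toy_complete(struct toy_page_state *s, size_t bytes)
{
	/* atomic_fetch_sub returns the value held before the subtraction. */
	return atomic_fetch_sub(&s->write_bytes_pending, bytes) == bytes;
}
```

Because the unit is bytes, the result does not depend on how the submitter chose to split or merge the I/O into segments.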

[PATCH v2 5/9] iomap: Support arbitrarily many blocks per page

2020-09-10 Thread Matthew Wilcox (Oracle)
Size the uptodate array dynamically to support larger pages in the page cache. With a 64kB page, we're only saving 8 bytes per page today, but with a 2MB maximum page size, we'd have to allocate more than 4kB per page. Add a few debugging assertions. Signed-off-by: Matthew Wilcox (Oracle

[PATCH v2 4/9] iomap: Use bitmap ops to set uptodate bits

2020-09-10 Thread Matthew Wilcox (Oracle)
Now that the bitmap is protected by a spinlock, we can use the more efficient bitmap ops instead of individual test/set bit ops. Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Christoph Hellwig Reviewed-by: Dave Chinner Reviewed-by: Darrick J. Wong --- fs/iomap/buffered-io.c | 12
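To show why a range operation can beat per-bit calls once a lock already serialises access, here is a userspace sketch that sets a run of bits with whole-word masks and, for comparison, one bit at a time. The function names and the fixed 64-bit word are assumptions for the example; they are not the kernel's bitmap API.

```c
#include <stdint.h>

#define TOY_WORD_BITS 64u

/* Reference version: one read-modify-write per bit. */
static void toy_set_bits_slow(uint64_t *map, unsigned int first, unsigned int n)
{
	for (unsigned int i = first; i < first + n; i++)
		map[i / TOY_WORD_BITS] |= UINT64_C(1) << (i % TOY_WORD_BITS);
}

/* Range version: at most one read-modify-write per 64-bit word touched. */
static void toy_bitmap_set(uint64_t *map, unsigned int first, unsigned int n)
{
	while (n > 0) {
		unsigned int off = first % TOY_WORD_BITS;
		unsigned int room = TOY_WORD_BITS - off;
		unsigned int take = n < room ? n : room;
		/* take == 64 only happens when off == 0, so no shift overflows. */
		uint64_t mask = (take == TOY_WORD_BITS)
			? ~UINT64_C(0)
			: ((UINT64_C(1) << take) - 1) << off;

		map[first / TOY_WORD_BITS] |= mask;
		first += take;
		n -= take;
	}
}
```

Neither version is atomic, which is acceptable only because the caller is assumed to hold the lock that protects the bitmap.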

[PATCH v2 0/9] THP iomap patches for 5.10

2020-09-10 Thread Matthew Wilcox (Oracle)
to flush_dcache_page (Christoph) - Clarify comments (Darrick) - Rename read_count to read_bytes_pending (Christoph) - Rename write_count to write_bytes_pending (Christoph) - Restructure iomap_readpage_actor() (Christoph) - Change return type of the zeroing functions from loff_t to s64 Matthew Wilcox (Oracle

[PATCH v2 9/9] iomap: Change calling convention for zeroing

2020-09-10 Thread Matthew Wilcox (Oracle)
Pass the full length to iomap_zero() and dax_iomap_zero(), and have them return how many bytes they actually handled. This is preparatory work for handling THP, although it looks like DAX could actually take advantage of it if there's a larger contiguous area. Signed-off-by: Matthew Wilcox
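The calling convention described here, where the callee receives the full remaining length and reports how much it handled, can be sketched in userspace as follows. The names and the 4096-byte chunk size are assumptions for the example; the real functions operate on pages and DAX mappings, not a flat buffer.

```c
#include <stddef.h>
#include <string.h>

#define TOY_CHUNK 4096u	/* assumed size of the unit the callee works in */

/*
 * Zero starting at pos, but never past the end of the chunk containing pos.
 * The caller passes the full remaining length; the return value says how
 * many bytes were actually handled.
 */
static size_t toy_zero(unsigned char *buf, size_t pos, size_t len)
{
	size_t room = TOY_CHUNK - (pos % TOY_CHUNK);
	size_t bytes = len < room ? len : room;

	memset(buf + pos, 0, bytes);
	return bytes;
}

/* Caller side: keep going until the whole range has been handled. */
static void toy_zero_range(unsigned char *buf, size_t pos, size_t len)
{
	while (len > 0) {
		size_t done = toy_zero(buf, pos, len);

		pos += done;
		len -= done;
	}
}
```

With this shape the caller no longer needs to know the callee's chunk size, so a callee that can handle a larger contiguous area in one call simply returns a larger number.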

[PATCH v2 6/9] iomap: Convert read_count to read_bytes_pending

2020-09-10 Thread Matthew Wilcox (Oracle)
Instead of counting bio segments, count the number of bytes submitted. This insulates us from the block layer's definition of what a 'same page' is, which is not necessarily clear once THPs are involved. Signed-off-by: Matthew Wilcox (Oracle) --- fs/iomap/buffered-io.c | 41

[PATCH v2 3/9] iomap: Use kzalloc to allocate iomap_page

2020-09-10 Thread Matthew Wilcox (Oracle)
We can skip most of the initialisation, although spinlocks still need explicit initialisation as architectures may use a non-zero value to indicate unlocked. The comment is no longer useful as attach_page_private() handles the refcount now. Signed-off-by: Matthew Wilcox (Oracle) Reviewed

[PATCH v2 8/9] iomap: Convert iomap_write_end types

2020-09-10 Thread Matthew Wilcox (Oracle)
iomap_write_end cannot return an error, so switch it to return size_t instead of int and remove the error checking from the callers. Also convert the arguments to size_t from unsigned int, in case anyone ever wants to support a page size larger than 2GB. Signed-off-by: Matthew Wilcox (Oracle

[PATCH v2 1/9] iomap: Fix misplaced page flushing

2020-09-10 Thread Matthew Wilcox (Oracle)
-by: Matthew Wilcox (Oracle) Reviewed-by: Dave Chinner Reviewed-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- fs/iomap/buffered-io.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 897ab9a26a74..d81a9a86c5aa 100644

Re: [PATCH 5/9] iomap: Support arbitrarily many blocks per page

2020-08-25 Thread Matthew Wilcox
On Tue, Aug 25, 2020 at 02:02:03PM -0700, Darrick J. Wong wrote: > > /* > > - * Structure allocated for each page when block size < PAGE_SIZE to track > > + * Structure allocated for each page when block size < page size to track > > * sub-page uptodate status and I/O completions. > > "for

Re: [PATCH 9/9] iomap: Change calling convention for zeroing

2020-08-25 Thread Matthew Wilcox
On Tue, Aug 25, 2020 at 02:27:11PM +1000, Dave Chinner wrote: > On Mon, Aug 24, 2020 at 09:35:59PM -0600, Andreas Dilger wrote: > > On Aug 24, 2020, at 9:26 PM, Matthew Wilcox wrote: > > > > > > On Tue, Aug 25, 2020 at 10:27:35AM +1000, Dave Chinne

Re: [PATCH 9/9] iomap: Change calling convention for zeroing

2020-08-24 Thread Matthew Wilcox
On Tue, Aug 25, 2020 at 10:27:35AM +1000, Dave Chinner wrote: > > do { > > - unsigned offset, bytes; > > - > > - offset = offset_in_page(pos); > > - bytes = min_t(loff_t, PAGE_SIZE - offset, count); > > + loff_t bytes; > > > > if

Re: [PATCH 8/9] iomap: Convert iomap_write_end types

2020-08-24 Thread Matthew Wilcox
On Tue, Aug 25, 2020 at 10:12:23AM +1000, Dave Chinner wrote: > > -static int > > -__iomap_write_end(struct inode *inode, loff_t pos, unsigned len, > > - unsigned copied, struct page *page) > > +static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t > > len, > > +

Re: [PATCH 5/9] iomap: Support arbitrarily many blocks per page

2020-08-24 Thread Matthew Wilcox
On Tue, Aug 25, 2020 at 09:59:18AM +1000, Dave Chinner wrote: > On Mon, Aug 24, 2020 at 03:55:06PM +0100, Matthew Wilcox (Oracle) wrote: > > static inline struct iomap_page *to_iomap_page(struct page *page) > > { > > + VM_BUG_ON_PGFLAGS(PageTail(page), page); > >

[PATCH 0/9] THP iomap patches for 5.10

2020-08-24 Thread Matthew Wilcox (Oracle)
today which are the changes to iomap which don't pay their own way until we actually have THPs in the page cache. I would like those to be reviewed with an eye to merging them into 5.11. Matthew Wilcox (Oracle) (9): iomap: Fix misplaced page flushing fs: Introduce i_blocks_per_page iomap

[PATCH 1/9] iomap: Fix misplaced page flushing

2020-08-24 Thread Matthew Wilcox (Oracle)
-by: Matthew Wilcox (Oracle) --- fs/iomap/buffered-io.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index bcfc288dba3f..cffd575e57b6 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -715,6 +715,7

[PATCH 3/9] iomap: Use kzalloc to allocate iomap_page

2020-08-24 Thread Matthew Wilcox (Oracle)
We can skip most of the initialisation, although spinlocks still need explicit initialisation as architectures may use a non-zero value to indicate unlocked. The comment is no longer useful as attach_page_private() handles the refcount now. Signed-off-by: Matthew Wilcox (Oracle) Reviewed
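A minimal userspace sketch of the point made above, assuming a hypothetical lock type whose unlocked state is a non-zero value. `calloc` stands in for kzalloc here, and `kz_lock`, `kz_iomap_page` and `kz_iop_alloc` are illustrative names, not kernel API. The zeroing allocation covers the plain counters and the bitmap, but the lock still gets its own init call.

```c
#include <stdlib.h>

/* Hypothetical lock whose "unlocked" state is NOT all-zero bits. */
#define KZ_LOCK_UNLOCKED 1u

struct kz_lock {
    unsigned int val;
};

static void kz_lock_init(struct kz_lock *lock)
{
    lock->val = KZ_LOCK_UNLOCKED;
}

/* Illustrative per-page structure; field names are assumptions. */
struct kz_iomap_page {
    unsigned int read_count;
    unsigned int write_count;
    struct kz_lock uptodate_lock;
    unsigned long uptodate[2];
};

static struct kz_iomap_page *kz_iop_alloc(void)
{
    /* calloc zeroes every field, so the counters and bitmap need no setup. */
    struct kz_iomap_page *iop = calloc(1, sizeof(*iop));

    if (!iop)
        return NULL;
    /* A zeroed lock would read as "locked" here, so initialise it explicitly. */
    kz_lock_init(&iop->uptodate_lock);
    return iop;
}
```

The caller owns the returned structure and releases it with `free()`.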

[PATCH 9/9] iomap: Change calling convention for zeroing

2020-08-24 Thread Matthew Wilcox (Oracle)
Pass the full length to iomap_zero() and dax_iomap_zero(), and have them return how many bytes they actually handled. This is preparatory work for handling THP, although it looks like DAX could actually take advantage of it if there's a larger contiguous area. Signed-off-by: Matthew Wilcox
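The calling convention described above can be sketched in userspace roughly as follows. This is an illustration only: `zr_zero` and `zr_zero_range` are hypothetical names, the 4096-byte page size is an assumption, and error returns are left out. The caller hands over the whole remaining length and the callee reports how much it handled, so a callee that can cover more than one page per call needs no change in the caller.

```c
#include <stddef.h>
#include <string.h>

#define ZR_PAGE_SIZE 4096u

/*
 * Zero as much of [pos, pos + length) as this call can handle and
 * return the number of bytes done. This version stops at the end of
 * the current page.
 */
static size_t zr_zero(unsigned char *buf, size_t pos, size_t length)
{
    size_t offset = pos % ZR_PAGE_SIZE;
    size_t bytes = ZR_PAGE_SIZE - offset;

    if (bytes > length)
        bytes = length;
    memset(buf + pos, 0, bytes);
    return bytes;
}

/* The caller no longer computes per-page chunks; it just loops. */
static size_t zr_zero_range(unsigned char *buf, size_t pos, size_t count)
{
    size_t written = 0;

    while (count > 0) {
        size_t bytes = zr_zero(buf, pos, count);

        pos += bytes;
        count -= bytes;
        written += bytes;
    }
    return written;
}
```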

[PATCH 6/9] iomap: Convert read_count to byte count

2020-08-24 Thread Matthew Wilcox (Oracle)
Instead of counting bio segments, count the number of bytes submitted. This insulates us from the block layer's definition of what a 'same page' is, which is not necessarily clear once THPs are involved. Signed-off-by: Matthew Wilcox (Oracle) --- fs/iomap/buffered-io.c | 29
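One way to picture byte-based accounting is the sketch below, written with C11 atomics rather than kernel primitives. `bc_page_io`, `bc_submit` and `bc_complete` are hypothetical names. Each submission adds its length to a pending counter and each completion subtracts the length it finished, so the result does not depend on how the I/O was split into segments.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative per-page I/O state; the field name is an assumption. */
struct bc_page_io {
    atomic_size_t read_bytes_pending;
};

/* Account for len bytes that have been submitted for reading. */
static void bc_submit(struct bc_page_io *io, size_t len)
{
    atomic_fetch_add(&io->read_bytes_pending, len);
}

/*
 * Account for len bytes that completed. Returns true when this
 * completion brought the pending count to zero, i.e. the page is done.
 */
static bool bc_complete(struct bc_page_io *io, size_t len)
{
    return atomic_fetch_sub(&io->read_bytes_pending, len) == len;
}
```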

[PATCH 4/9] iomap: Use bitmap ops to set uptodate bits

2020-08-24 Thread Matthew Wilcox (Oracle)
Now that the bitmap is protected by a spinlock, we can use the more efficient bitmap ops instead of individual test/set bit ops. Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Christoph Hellwig --- fs/iomap/buffered-io.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions
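To illustrate the difference between per-bit and range operations, here is a userspace sketch. The helpers are hypothetical stand-ins written for this example, not the kernel's bitmap API, and they assume the caller already holds whatever lock protects the bitmap. The range version touches each word once instead of once per bit.

```c
#include <limits.h>
#include <stdbool.h>

#define BM_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Per-bit version: one read-modify-write for every bit. */
static void bm_set_bit(unsigned long *map, unsigned int nr)
{
    map[nr / BM_BITS_PER_LONG] |= 1UL << (nr % BM_BITS_PER_LONG);
}

/* Range version: set bits [start, start + len) one word at a time. */
static void bm_set_range(unsigned long *map, unsigned int start,
                         unsigned int len)
{
    unsigned int end = start + len;

    while (start < end) {
        unsigned int word = start / BM_BITS_PER_LONG;
        unsigned int bit = start % BM_BITS_PER_LONG;
        unsigned int n = BM_BITS_PER_LONG - bit;
        unsigned long mask;

        if (n > end - start)
            n = end - start;
        /* A full-word shift would be undefined, so handle it apart. */
        if (n == BM_BITS_PER_LONG)
            mask = ~0UL;
        else
            mask = ((1UL << n) - 1) << bit;
        map[word] |= mask;
        start += n;
    }
}

/* True if the first nbits bits are all set. */
static bool bm_full(const unsigned long *map, unsigned int nbits)
{
    for (unsigned int i = 0; i < nbits; i++)
        if (!(map[i / BM_BITS_PER_LONG] & (1UL << (i % BM_BITS_PER_LONG))))
            return false;
    return true;
}
```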

[PATCH 8/9] iomap: Convert iomap_write_end types

2020-08-24 Thread Matthew Wilcox (Oracle)
iomap_write_end cannot return an error, so switch it to return size_t instead of int and remove the error checking from the callers. Also convert the arguments to size_t from unsigned int, in case anyone ever wants to support a page size larger than 2GB. Signed-off-by: Matthew Wilcox (Oracle

[PATCH 7/9] iomap: Convert write_count to byte count

2020-08-24 Thread Matthew Wilcox (Oracle)
Instead of counting bio segments, count the number of bytes submitted. This insulates us from the block layer's definition of what a 'same page' is, which is not necessarily clear once THPs are involved. Signed-off-by: Matthew Wilcox (Oracle) --- fs/iomap/buffered-io.c | 11 ++- 1 file

[PATCH 5/9] iomap: Support arbitrarily many blocks per page

2020-08-24 Thread Matthew Wilcox (Oracle)
Size the uptodate array dynamically to support larger pages in the page cache. With a 64kB page, we're only saving 8 bytes per page today, but with a 2MB maximum page size, we'd have to allocate more than 4kB per page. Add a few debugging assertions. Signed-off-by: Matthew Wilcox (Oracle
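Dynamic sizing of this kind is commonly done with a C flexible array member. The sketch below is a userspace illustration under that assumption; `dyn_iomap_page`, `dyn_iop_size` and `dyn_iop_create` are hypothetical names and the patch itself may be structured differently. The bitmap gets one bit per block, rounded up to whole longs.

```c
#include <limits.h>
#include <stdlib.h>

#define DYN_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DYN_BITS_TO_LONGS(n) \
    (((n) + DYN_BITS_PER_LONG - 1) / DYN_BITS_PER_LONG)

/* Illustrative structure; the trailing array is sized at allocation time. */
struct dyn_iomap_page {
    unsigned int nr_blocks;
    unsigned long uptodate[];   /* flexible array member */
};

/* Bytes needed for the header plus one bit per block. */
static size_t dyn_iop_size(unsigned int nr_blocks)
{
    return sizeof(struct dyn_iomap_page) +
           DYN_BITS_TO_LONGS(nr_blocks) * sizeof(unsigned long);
}

/* Allocate a zeroed structure with room for nr_blocks uptodate bits. */
static struct dyn_iomap_page *dyn_iop_create(unsigned int nr_blocks)
{
    struct dyn_iomap_page *iop = calloc(1, dyn_iop_size(nr_blocks));

    if (iop)
        iop->nr_blocks = nr_blocks;
    return iop;
}
```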

[PATCH 2/9] fs: Introduce i_blocks_per_page

2020-08-24 Thread Matthew Wilcox (Oracle)
This helper is useful for both THPs and for supporting block size larger than page size. Convert all users that I could find (we have a few different ways of writing this idiom, and I may have missed some). Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Christoph Hellwig --- fs/iomap
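The idiom such a helper replaces is a shift of the page size by the inode's block-size bits. A userspace sketch follows, with hypothetical stand-in types: `bpp_inode`, `bpp_page` and the 4kB base page size are assumptions made for this example, not the kernel's definitions.

```c
/* Assumed base page size of 4kB (shift of 12). */
#define BPP_PAGE_SHIFT 12

/* Illustrative stand-ins for the inode and page. */
struct bpp_inode {
    unsigned int i_blkbits;   /* log2 of the block size in bytes */
};

struct bpp_page {
    unsigned int order;       /* log2 of the number of base pages */
};

/* Size in bytes of a (possibly compound) page. */
static unsigned long bpp_page_size(const struct bpp_page *page)
{
    return 1UL << (BPP_PAGE_SHIFT + page->order);
}

/* Number of filesystem blocks that fit in this page. */
static unsigned int bpp_blocks_per_page(const struct bpp_inode *inode,
                                        const struct bpp_page *page)
{
    return (unsigned int)(bpp_page_size(page) >> inode->i_blkbits);
}
```

With 512-byte blocks and an order-0 page this gives 8 blocks; with 4kB blocks and an order-9 (2MB) page it gives 512.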

Re: [PATCH 1/4] mm: Introduce and use page_cache_empty

2020-08-15 Thread Matthew Wilcox
On Fri, Aug 07, 2020 at 02:24:00AM +0300, Kirill A. Shutemov wrote: > On Tue, Aug 04, 2020 at 05:17:52PM +0100, Matthew Wilcox (Oracle) wrote: > > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h > > index 484a36185bb5..a474a92a2a72 100644 > > --- a/include/linu

Re: [RFC PATCH 0/8] fsdax: introduce FS query interface to support reflink

2020-08-10 Thread Matthew Wilcox
On Mon, Aug 10, 2020 at 04:15:50PM +0800, Ruan Shiyang wrote: > > > On 2020/8/7 9:38 PM, Matthew Wilcox wrote: > > On Fri, Aug 07, 2020 at 09:13:28PM +0800, Shiyang Ruan wrote: > > > This patchset is a try to resolve the problem of tracking shared page > > > for f

Re: [RFC PATCH 0/8] fsdax: introduce FS query interface to support reflink

2020-08-07 Thread Matthew Wilcox
On Fri, Aug 07, 2020 at 09:13:28PM +0800, Shiyang Ruan wrote: > This patchset is a try to resolve the problem of tracking shared page > for fsdax. > > Instead of per-page tracking method, this patchset introduces a query > interface: get_shared_files(), which is implemented by each FS, to >

[PATCH 1/4] mm: Introduce and use page_cache_empty

2020-08-04 Thread Matthew Wilcox (Oracle)
Instead of checking the two counters (nrpages and nrexceptional), we can just check whether i_pages is empty. Signed-off-by: Matthew Wilcox (Oracle) --- fs/block_dev.c | 2 +- fs/dax.c| 2 +- include/linux/pagemap.h | 5 + mm/truncate.c | 18
