On Tue, Apr 20, 2021 at 06:00:49PM +0300, Mike Rapoport wrote:
> + mapping = (struct address_space *)
> + ((unsigned long)page->mapping & ~PAGE_MAPPING_FLAGS);
> +
> + if (mapping != page->mapping)
> + return false;
> +
> + return page->mapping->a_ops == _aops;
On Thu, May 06, 2021 at 11:23:25AM +0100, Joao Martins wrote:
> >> I think it is ok for dax/nvdimm to continue to maintain their align
> >> value because it should be ok to have 4MB align if the device really
> >> wanted. However, when it goes to map that alignment with
> >> memremap_pages() it
On Wed, May 05, 2021 at 11:44:29AM -0700, Dan Williams wrote:
> > @@ -6285,6 +6285,8 @@ void __ref memmap_init_zone_device(struct zone *zone,
> > unsigned long pfn, end_pfn = start_pfn + nr_pages;
> > struct pglist_data *pgdat = zone->zone_pgdat;
> > struct vmem_altmap
On Mon, Apr 26, 2021 at 01:52:17PM -0400, Vivek Goyal wrote:
> On Mon, Apr 26, 2021 at 02:46:32PM +0100, Matthew Wilcox wrote:
> > On Fri, Apr 23, 2021 at 09:07:21AM -0400, Vivek Goyal wrote:
> > > +enum dax_wake_mode {
> > > + WAKE_NEXT,
> > > + WAKE_ALL
On Fri, Apr 23, 2021 at 09:07:21AM -0400, Vivek Goyal wrote:
> +enum dax_wake_mode {
> + WAKE_NEXT,
> + WAKE_ALL,
> +};
Why define them in this order when ...
> @@ -196,7 +207,7 @@ static void dax_wake_entry(struct xa_state *xas, void
> *entry, bool wake_all)
> * must be in the
On Wed, Apr 21, 2021 at 11:56:31AM -0400, Vivek Goyal wrote:
> +/**
> + * enum dax_entry_wake_mode: waitqueue wakeup toggle
s/toggle/behaviour/ ?
> + * @WAKE_NEXT: wake only the first waiter in the waitqueue
> + * @WAKE_ALL: wake all waiters in the waitqueue
> + */
> +enum dax_entry_wake_mode {
On Mon, Apr 19, 2021 at 02:56:17PM +0300, Mike Rapoport wrote:
> On Mon, Apr 19, 2021 at 12:23:02PM +0100, Matthew Wilcox wrote:
> > So you're calling page_is_secretmem() on a struct page without having
> > a refcount on it. That is definitely not allowed. secretmem seems
On Mon, Apr 19, 2021 at 12:36:19PM +0300, Mike Rapoport wrote:
> Well, most of the -4.2% of the performance regression kbuild reported were
> due to repeated compound_head(page) in page_mapping(). So the whole point
> of this patch is to avoid calling page_mapping().
It's quite ludicrous how many
On Mon, Apr 19, 2021 at 11:42:18AM +0300, Mike Rapoport wrote:
> The perf profile of the test indicated that the regression is caused by
> page_is_secretmem() called from gup_pte_range() (inlined by gup_pgd_range):
Uhh ... you're calling it in the wrong place!
On Wed, Apr 07, 2021 at 02:32:05PM +0800, Shiyang Ruan wrote:
> +static int dax_fault_cow_page(struct vm_fault *vmf, struct iomap *iomap,
> + loff_t pos, vm_fault_t *ret)
> +{
> + int error = 0;
> + unsigned long vaddr = vmf->address;
> + sector_t sector =
OK, more competent testing, and that previous bug now detected and fixed.
I have a reasonable amount of confidence this will solve your problem.
If you do apply this patch, don't enable CONFIG_TEST_XARRAY as the new
tests assume that attempting to allocate with a GFP flags of 0 will
definitely
On Fri, Apr 02, 2021 at 04:13:05AM +0100, Matthew Wilcox wrote:
> + for (;;) {
> + xas_load(xas);
> + if (!xas_is_node(xas))
> + break;
> + xas_delete_node(xas);
> + xas->xa_index -= XA_CHUNK_SIZE;
>
On Thu, Apr 01, 2021 at 06:06:15PM +0100, Matthew Wilcox wrote:
> On Wed, Mar 31, 2021 at 02:58:12PM -0700, Hugh Dickins wrote:
> > I suspect there's a bug in the XArray handling in collapse_file(),
> > which sometimes leaves empty nodes behind.
>
> Urp, yes, t
On Wed, Mar 31, 2021 at 02:58:12PM -0700, Hugh Dickins wrote:
> I suspect there's a bug in the XArray handling in collapse_file(),
> which sometimes leaves empty nodes behind.
Urp, yes, that can easily happen.
/* This will be less messy when we use multi-index entries */
do {
On Tue, Mar 30, 2021 at 06:30:22PM -0700, Hugh Dickins wrote:
> Running my usual tmpfs kernel builds swapping load, on Sunday's rc4-mm1
> mmotm (I never got to try rc3-mm1 but presume it behaved the same way),
> I hit clear_inode()'s BUG_ON(!mapping_empty(&inode->i_data)); on two
> machines, within an
Ping?
On Thu, Jan 21, 2021 at 06:43:34PM +0000, Matthew Wilcox wrote:
> Ping? These patches still apply to next-20210121.
>
> On Mon, Oct 26, 2020 at 03:18:45PM +0000, Matthew Wilcox (Oracle) wrote:
> > We actually use nrexceptional for very little these days. It's a minor
>
On Wed, Mar 10, 2021 at 08:21:59AM -0600, Goldwyn Rodrigues wrote:
> On 13:02 10/03, Matthew Wilcox wrote:
> > On Wed, Mar 10, 2021 at 07:30:41AM -0500, Neal Gompa wrote:
> > > Forgive my ignorance, but is there a reason why this isn't wired up to
> > > Btrfs at the same
On Wed, Mar 10, 2021 at 08:36:06AM -0500, Neal Gompa wrote:
> On Wed, Mar 10, 2021 at 8:02 AM Matthew Wilcox wrote:
> >
> > On Wed, Mar 10, 2021 at 07:30:41AM -0500, Neal Gompa wrote:
> > > Forgive my ignorance, but is there a reason why this isn't wired up to
>
On Wed, Mar 10, 2021 at 07:30:41AM -0500, Neal Gompa wrote:
> Forgive my ignorance, but is there a reason why this isn't wired up to
> Btrfs at the same time? It seems weird to me that adding a feature
btrfs doesn't support DAX. only ext2, ext4, XFS and FUSE have DAX support.
If you think about
, but there may be implicit include problems
on other architectures.
Signed-off-by: Matthew Wilcox (Oracle)
---
v2: Fix CONFIG_SWAP=n implicit use of pagemap.h by swap.h. Increases
the number of files from 240, but that's still a big win -- 68%
reduction instead of 77%.
block/blk-settings.c
, but there may be implicit include problems
on other architectures.
Signed-off-by: Matthew Wilcox (Oracle)
---
block/blk-settings.c | 1 +
drivers/block/brd.c | 1 +
drivers/block/loop.c | 1 +
drivers/md/bcache/super.c | 1 +
drivers/nvdimm/btt.c | 1 +
drivers/nvdimm/pmem.c
On Mon, Jan 25, 2021 at 11:38:17PM +0200, Mike Rapoport wrote:
> I cannot use __GFP_ACCOUNT because cma_alloc() does not use gfp.
> Besides, kmem accounting with __GFP_ACCOUNT does not seem
> to update stats and there was an explicit request for statistics:
>
>
Hildenbrand
> Cc: Elena Reshetova
> Cc: Hagen Paul Pfeifer
> Cc: "H. Peter Anvin"
> Cc: Ingo Molnar
> Cc: James Bottomley
> Cc: "Kirill A. Shutemov"
> Cc: Mark Rutland
> Cc: Matthew Wilcox
> Cc: Michael Kerrisk
> Cc: Palmer Dabbelt
>
On Thu, Jan 21, 2021 at 03:42:31PM -0500, Johannes Weiner wrote:
> On Mon, Oct 26, 2020 at 03:18:46PM +0000, Matthew Wilcox (Oracle) wrote:
> > Instead of checking the two counters (nrpages and nrexceptional), we
> > can just check whether i_pages is empty.
> >
> > Si
On Wed, Jan 20, 2021 at 10:12:01AM -0500, Mikulas Patocka wrote:
> Do you have some idea how to optimize the generic code that calls
> ->read_iter?
Yes.
> It might be better to maintain an f_iocb_flags in the
> struct file and just copy that unconditionally. We'd need to remember
> to update
On Wed, Jan 20, 2021 at 08:06:08PM +0200, Mike Rapoport wrote:
> +static int secretmem_pool_increase(struct secretmem_ctx *ctx, gfp_t gfp)
> {
> + unsigned long nr_pages = (1 << PMD_PAGE_ORDER);
> + struct gen_pool *pool = ctx->pool;
> + unsigned long addr;
> + struct page *page;
On Wed, Jan 20, 2021 at 08:06:07PM +0200, Mike Rapoport wrote:
> +static struct page *secretmem_alloc_page(gfp_t gfp)
> +{
> + /*
> + * FIXME: use a cache of large pages to reduce the direct map
> + * fragmentation
> + */
> + return alloc_page(gfp);
> +}
> +
> +static
On Wed, Jan 20, 2021 at 05:05:10PM +0200, Mike Rapoport wrote:
> On Tue, Jan 19, 2021 at 08:22:13PM +0000, Matthew Wilcox wrote:
> > On Thu, Dec 03, 2020 at 08:29:44AM +0200, Mike Rapoport wrote:
> > > +static vm_fault_t secretmem_fault(struct vm_fault *vmf)
> > > +{
>
On Thu, Dec 03, 2020 at 08:29:44AM +0200, Mike Rapoport wrote:
> +static vm_fault_t secretmem_fault(struct vm_fault *vmf)
> +{
> + struct address_space *mapping = vmf->vma->vm_file->f_mapping;
> + struct inode *inode = file_inode(vmf->vma->vm_file);
> + pgoff_t offset = vmf->pgoff;
> +
On Sun, Jan 10, 2021 at 04:19:15PM -0500, Mikulas Patocka wrote:
> I put counters into vfs_read and vfs_readv.
>
> After a fresh boot of the virtual machine, the counters show "13385 4".
> After a kernel compilation they show "4475220 8".
>
> So, the readv path is almost unused.
>
> My
On Thu, Jan 07, 2021 at 01:59:01PM -0500, Mikulas Patocka wrote:
> On Thu, 7 Jan 2021, Matthew Wilcox wrote:
> > On Thu, Jan 07, 2021 at 08:15:41AM -0500, Mikulas Patocka wrote:
> > > I'd like to ask about this piece of code in __kernel_read:
> > > if (unlikely(!fi
On Thu, Jan 07, 2021 at 08:15:41AM -0500, Mikulas Patocka wrote:
> I'd like to ask about this piece of code in __kernel_read:
> if (unlikely(!file->f_op->read_iter || file->f_op->read))
> return warn_unsupported...
> and __kernel_write:
> if
On Mon, Jan 04, 2021 at 12:13:02PM -0800, Dan Williams wrote:
> On Thu, Dec 31, 2020 at 8:29 PM Randy Dunlap wrote:
> > +++ lnx-511-rc1/fs/dax.c
> > @@ -25,6 +25,7 @@
> > #include
> > #include
> > #include
> > +#include
>
> I would expect this to come from one of the linux/ includes like
On Wed, Dec 09, 2020 at 12:24:38PM -0400, Jason Gunthorpe wrote:
> On Wed, Dec 09, 2020 at 04:02:05PM +0000, Joao Martins wrote:
>
> > Today (without the series) struct pages are not represented the way they
> > are expressed in the page tables, which is what I am hoping to fix in this
> > series
On Tue, Dec 08, 2020 at 09:59:19PM -0800, John Hubbard wrote:
> On 12/8/20 9:28 AM, Joao Martins wrote:
> > Add a new flag for struct dev_pagemap which designates that a a pagemap
>
> a a
>
> > is described as a set of compound pages or in other words, that how
> > pages are grouped together in
On Fri, Dec 04, 2020 at 08:28:47AM -0500, John David Anglin wrote:
> (.mlocate): page allocation failure: order:5,
> mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
> [<4035416c>] __kmalloc+0x5e4/0x740
> [<040ddbe8>] nfsd_reply_cache_init+0x1d0/0x360
On Fri, Dec 04, 2020 at 08:57:37AM +0100, Helge Deller wrote:
> On 12/4/20 4:48 AM, Matthew Wilcox wrote:
> > On Thu, Dec 03, 2020 at 04:33:10PM -0800, James Bottomley wrote:
> >> These platforms define PMD_ORDER in asm/pgtable.h
> >
> > I think that's the real pr
On Thu, Dec 03, 2020 at 04:33:10PM -0800, James Bottomley wrote:
> These platforms define PMD_ORDER in asm/pgtable.h
I think that's the real problem, though.
#define PGD_ORDER 1 /* Number of pages per pgd */
#define PMD_ORDER 1 /* Number of pages per pmd */
#define PGD_ALLOC_ORDER (2
On Tue, Dec 01, 2020 at 06:28:45PM -0800, Dan Williams wrote:
> On Tue, Dec 1, 2020 at 12:49 PM Matthew Wilcox wrote:
> >
> > On Tue, Dec 01, 2020 at 12:42:39PM -0800, Dan Williams wrote:
> > > On Mon, Nov 30, 2020 at 6:24 PM Matthew Wilcox
> > > wrote:
>
On Tue, Dec 01, 2020 at 12:42:39PM -0800, Dan Williams wrote:
> On Mon, Nov 30, 2020 at 6:24 PM Matthew Wilcox wrote:
> >
> > On Mon, Nov 30, 2020 at 05:20:25PM -0800, Dan Williams wrote:
> > > Kirill, Willy, compound page experts,
> > >
> > > I am se
On Mon, Nov 30, 2020 at 05:20:25PM -0800, Dan Williams wrote:
> Kirill, Willy, compound page experts,
>
> I am seeking some debug ideas about the following splat:
>
> BUG: Bad page state in process lt-pmem-ns pfn:121a12
> page:51ef73f7 refcount:0 mapcount:-1024
>
On Mon, Nov 30, 2020 at 04:09:23PM +0100, Jan Kara wrote:
> On Mon 30-11-20 06:22:42, Amy Parker wrote:
> > > > +/*
> > > > + * A zero entry, XA_ZERO_ENTRY, is used to represent a zero page. This
> > > > + * definition helps with checking if an entry is a PMD size.
> > > > + */
> > > > +#define
On Tue, Nov 10, 2020 at 05:14:39PM +0200, Mike Rapoport wrote:
> diff --git a/mm/Kconfig b/mm/Kconfig
> index c89c5444924b..d8d170fa5210 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -884,4 +884,7 @@ config ARCH_HAS_HUGEPD
> config MAPPING_DIRTY_HELPERS
> bool
>
> +config SECRETMEM
On Tue, Nov 10, 2020 at 05:14:39PM +0200, Mike Rapoport wrote:
> +static vm_fault_t secretmem_fault(struct vm_fault *vmf)
> +{
> + struct address_space *mapping = vmf->vma->vm_file->f_mapping;
> + struct inode *inode = file_inode(vmf->vma->vm_file);
> + pgoff_t offset = vmf->pgoff;
> +
lwig
Reviewed-by: Matthew Wilcox (Oracle)
I'm not entirely convinced this is the right interface, but your patch
makes things better, so I approve.
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org
On Thu, Oct 29, 2020 at 11:14:31AM +0100, Christoph Hellwig wrote:
> follow_pte_pmd is only used by the DAX code, which can't be modular.
>
> Signed-off-by: Christoph Hellwig
Reviewed-by: Matthew Wilcox (Oracle)
On Sun, Nov 08, 2020 at 05:54:14PM -0800, Amy Parker wrote:
> On Sun, Nov 8, 2020 at 5:50 PM Matthew Wilcox wrote:
> >
> > On Sun, Nov 08, 2020 at 05:33:22PM -0800, Darrick J. Wong wrote:
> > > On Sun, Nov 08, 2020 at 05:15:55PM -0800, Amy Parker wrote:
> > > >
On Sun, Nov 08, 2020 at 05:33:22PM -0800, Darrick J. Wong wrote:
> On Sun, Nov 08, 2020 at 05:15:55PM -0800, Amy Parker wrote:
> > I've been writing a patch to migrate the defined DAX_ZERO_PAGE
> > to XA_ZERO_ENTRY for representing holes in files.
>
> Why? IIRC XA_ZERO_ENTRY ("no mapping in the
boundary, so it doesn't save
any memory for ext4.
Matthew Wilcox (Oracle) (4):
mm: Introduce and use mapping_empty
mm: Stop accounting shadow entries
dax: Account DAX entries as nrpages
mm: Remove nrexceptional from inode
fs/block_dev.c | 2 +-
fs/dax.c | 8
We no longer need to keep track of how many shadow entries are
present in a mapping. This saves a few writes to the inode and
memory barriers.
Signed-off-by: Matthew Wilcox (Oracle)
Tested-by: Vishal Verma
---
mm/filemap.c | 13 -
mm/swap_state.c | 4
mm/truncate.c | 1
We no longer track anything in nrexceptional, so remove it, saving 8
bytes per inode.
Signed-off-by: Matthew Wilcox (Oracle)
Tested-by: Vishal Verma
---
fs/inode.c | 2 +-
include/linux/fs.h | 2 --
2 files changed, 1 insertion(+), 3 deletions(-)
diff --git a/fs/inode.c b/fs/inode.c
Instead of checking the two counters (nrpages and nrexceptional), we
can just check whether i_pages is empty.
Signed-off-by: Matthew Wilcox (Oracle)
Tested-by: Vishal Verma
---
fs/block_dev.c | 2 +-
fs/dax.c | 2 +-
fs/gfs2/glock.c | 3 +--
include/linux
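To illustrate the idea in the commit message above, here is a minimal userspace sketch, not the kernel code. The names toy_xarray, toy_mapping, toy_mapping_empty_counters and toy_mapping_empty are hypothetical stand-ins invented for this example; the only assumption carried over from the message is that an empty i_pages tree implies there is nothing left in the mapping.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the tree behind i_pages: for an emptiness
 * check only the head pointer matters. */
struct toy_xarray {
	void *head;			/* NULL when nothing is stored */
};

/* Hypothetical stand-in for an address space with the two old counters. */
struct toy_mapping {
	struct toy_xarray i_pages;
	unsigned long nrpages;		/* number of pages */
	unsigned long nrexceptional;	/* number of shadow/DAX entries */
};

/* Old style: the mapping is empty only if both counters are zero. */
static bool toy_mapping_empty_counters(const struct toy_mapping *m)
{
	return m->nrpages == 0 && m->nrexceptional == 0;
}

/* New style: ask the tree itself, with no counters involved. */
static bool toy_mapping_empty(const struct toy_mapping *m)
{
	return m->i_pages.head == NULL;
}
```

With the tree as the single source of truth, the second counter no longer has to be kept in sync on every insert and delete.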
Simplify mapping_needs_writeback() by accounting DAX entries as
pages instead of exceptional entries.
Signed-off-by: Matthew Wilcox (Oracle)
Tested-by: Vishal Verma
---
fs/dax.c | 6 +++---
mm/filemap.c | 3 ---
2 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/fs/dax.c b/fs
On Sun, Oct 18, 2020 at 12:13:35PM -0700, James Bottomley wrote:
> On Sun, 2020-10-18 at 19:59 +0100, Matthew Wilcox wrote:
> > On Sat, Oct 17, 2020 at 09:09:28AM -0700, t...@redhat.com wrote:
> > > clang has a number of useful, new warnings see
> > > https:
On Sat, Oct 17, 2020 at 09:09:28AM -0700, t...@redhat.com wrote:
> clang has a number of useful, new warnings see
> https://urldefense.com/v3/__https://clang.llvm.org/docs/DiagnosticsReference.html__;!!GqivPVa7Brio!Krxz78O3RKcB9JBMVo_F98FupVhj_jxX60ddN6tKGEbv_cnooXc1nnBmchm-e_O9ieGnyQ$
>
Please
On Tue, Oct 13, 2020 at 11:44:29AM -0700, Dan Williams wrote:
> On Fri, Oct 9, 2020 at 12:52 PM wrote:
> >
> > From: Ira Weiny
> >
> > The kmap() calls in this FS are localized to a single thread. To avoid
> > the overhead of global PKRS updates use the new kmap_thread() call.
> >
> > Cc:
On Mon, Oct 12, 2020 at 12:53:54PM -0700, Ira Weiny wrote:
> On Mon, Oct 12, 2020 at 05:44:38PM +0100, Matthew Wilcox wrote:
> > On Mon, Oct 12, 2020 at 09:28:29AM -0700, Dave Hansen wrote:
> > > kmap_atomic() is always preferred over kmap()/kmap_thread().
> > > k
On Mon, Oct 12, 2020 at 09:28:29AM -0700, Dave Hansen wrote:
> kmap_atomic() is always preferred over kmap()/kmap_thread().
> kmap_atomic() is _much_ more lightweight since its TLB invalidation is
> always CPU-local and never broadcast.
>
> So, basically, unless you *must* sleep while the mapping
On Fri, Oct 09, 2020 at 02:34:34PM -0700, Eric Biggers wrote:
> On Fri, Oct 09, 2020 at 12:49:57PM -0700, ira.we...@intel.com wrote:
> > The kmap() calls in this FS are localized to a single thread. To avoid
> > the overhead of global PKRS updates use the new kmap_thread() call.
> >
> > @@
On Thu, Aug 06, 2020 at 08:16:02PM +0000, Verma, Vishal L wrote:
> On Thu, 2020-08-06 at 19:44 +0000, Verma, Vishal L wrote:
> > >
> > > I'm running xfstests on this patchset right now. If one of the DAX
> > > people could try it out, that'd be fantastic.
> >
On Wed, Sep 30, 2020 at 01:27:45PM +0300, Mike Rapoport wrote:
> On Tue, Sep 29, 2020 at 05:15:52PM +0200, Peter Zijlstra wrote:
> > On Tue, Sep 29, 2020 at 05:58:13PM +0300, Mike Rapoport wrote:
> > > On Tue, Sep 29, 2020 at 04:12:16PM +0200, Peter Zijlstra wrote:
> >
> > > > It will drop them
On Tue, Sep 22, 2020 at 06:05:26PM +0100, Matthew Wilcox wrote:
> On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote:
> > On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote:
> > > Size the uptodate array dynamically to support larger pages in the
> > >
On Tue, Sep 22, 2020 at 10:00:01PM -0700, Darrick J. Wong wrote:
> On Wed, Sep 23, 2020 at 03:48:59AM +0100, Matthew Wilcox wrote:
> > On Tue, Sep 22, 2020 at 09:06:03PM -0400, Qian Cai wrote:
> > > On Tue, 2020-09-22 at 18:05 +0100, Matthew Wilcox wrote:
> > > > On T
On Wed, Sep 23, 2020 at 09:11:43AM -0400, Mikulas Patocka wrote:
> I also don't know how to implement journaling on persistent memory :) On
> EXT4 or XFS you can pin dirty buffers in memory until the journal is
> flushed. This is obviously impossible on persistent memory. So, I'm
> considering
On Tue, Sep 22, 2020 at 09:06:03PM -0400, Qian Cai wrote:
> On Tue, 2020-09-22 at 18:05 +0100, Matthew Wilcox wrote:
> > On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote:
> > > On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote:
> > > > Size
On Tue, Sep 22, 2020 at 12:46:05PM -0400, Mikulas Patocka wrote:
> I agree that the b+tree were a good choice for XFS.
>
> In RAM-based maps, red-black trees or avl trees are used often. In
> disk-based maps, btrees or b+trees are used. That's because in RAM, you
> are optimizing for the number
On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote:
> On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote:
> > Size the uptodate array dynamically to support larger pages in the
> > page cache. With a 64kB page, we're only saving 8 bytes per page today,
>
On Mon, Sep 21, 2020 at 12:20:42PM -0400, Mikulas Patocka wrote:
> The same for directories - NVFS hashes the file name and uses radix-tree
> to locate a directory page where the directory entry is located. XFS
> b+trees would result in much more accesses than the radix-tree.
What? Radix trees
On Thu, Sep 17, 2020 at 03:05:00PM -0700, Darrick J. Wong wrote:
> > -static loff_t
> > -iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count,
> > - void *data, struct iomap *iomap, struct iomap *srcmap)
> > +static loff_t iomap_zero_range_actor(struct inode *inode,
On Tue, Sep 15, 2020 at 03:40:52PM +0000, David Laight wrote:
> > @@ -147,7 +147,7 @@ iomap_iop_set_range_uptodate(struct page *page,
> > unsigned off, unsigned len)
> > unsigned int i;
> >
> > spin_lock_irqsave(&iop->uptodate_lock, flags);
> > - for (i = 0; i < PAGE_SIZE /
On Tue, Sep 15, 2020 at 08:34:41AM -0400, Mikulas Patocka wrote:
> - when the fsck.nvfs tool mmaps the device /dev/pmem0, the kernel uses
> buffer cache for the mapping. The buffer cache slows down fsck by a factor
> of 5 to 10. Could it be possible to change the kernel so that it maps DAX
>
This helper is useful for both THPs and for supporting block size larger
than page size. Convert all users that I could find (we have a few
different ways of writing this idiom, and I may have missed some).
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Christoph Hellwig
Reviewed-by: Dave
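As a rough illustration of what such a helper computes, here is a simplified userspace sketch. The types toy_inode and toy_page and the function toy_blocks_per_page are hypothetical; the real helper's body is not shown in this listing, so the shift by the inode's block-size bits is an assumption about its shape.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel structures. */
struct toy_inode {
	unsigned int i_blkbits;	/* log2 of the filesystem block size */
};

struct toy_page {
	size_t size;		/* bytes covered by this (possibly huge) page */
};

/* Number of filesystem blocks covered by one page: page size divided by
 * block size, done as a shift because the block size is a power of two. */
static unsigned int toy_blocks_per_page(const struct toy_inode *inode,
					const struct toy_page *page)
{
	return (unsigned int)(page->size >> inode->i_blkbits);
}
```

Because the page size is read from the page rather than taken from a compile-time constant, the same call gives 4 for a 4kB page with 1kB blocks and 512 for a 2MB page with 4kB blocks.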
Instead of counting bio segments, count the number of bytes submitted.
This insulates us from the block layer's definition of what a 'same page'
is, which is not necessarily clear once THPs are involved.
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Christoph Hellwig
---
fs/iomap
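The byte-counting approach described above can be sketched in userspace roughly as follows. This is only a model under stated assumptions: toy_iomap_page, toy_submit_read and toy_complete_read are hypothetical names, and C11 atomics stand in for whatever the kernel patch actually uses. The field name read_bytes_pending is taken from the changelog elsewhere in this listing.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state: bytes of read I/O still in flight. */
struct toy_iomap_page {
	atomic_size_t read_bytes_pending;
};

/* Account for len bytes being submitted for this page. */
static void toy_submit_read(struct toy_iomap_page *iop, size_t len)
{
	atomic_fetch_add(&iop->read_bytes_pending, len);
}

/* Account for len bytes completing. Returns true if this completion
 * brought the pending count to zero, i.e. the page can be unlocked. */
static bool toy_complete_read(struct toy_iomap_page *iop, size_t len)
{
	/* atomic_fetch_sub returns the value before the subtraction. */
	return atomic_fetch_sub(&iop->read_bytes_pending, len) == len;
}
```

Counting bytes means the result does not depend on how the block layer chooses to merge or split segments; only the total submitted and the total completed matter.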
Size the uptodate array dynamically to support larger pages in the
page cache. With a 64kB page, we're only saving 8 bytes per page today,
but with a 2MB maximum page size, we'd have to allocate more than 4kB
per page. Add a few debugging assertions.
Signed-off-by: Matthew Wilcox (Oracle
Now that the bitmap is protected by a spinlock, we can use the
more efficient bitmap ops instead of individual test/set bit ops.
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Darrick J. Wong
---
fs/iomap/buffered-io.c | 12
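The difference between per-bit operations and range-based bitmap operations can be sketched as below. This is a userspace illustration, not the patch itself: toy_set_bits_slow and toy_bitmap_set are hypothetical helpers written for this example, and the sketch ignores atomicity on the assumption, stated in the message above, that a lock already serialises access.

```c
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* One read-modify-write per bit. */
static void toy_set_bits_slow(unsigned long *map, size_t start, size_t n)
{
	for (size_t i = start; i < start + n; i++)
		map[i / TOY_BITS_PER_LONG] |= 1UL << (i % TOY_BITS_PER_LONG);
}

/* One read-modify-write per word touched by the range. */
static void toy_bitmap_set(unsigned long *map, size_t start, size_t n)
{
	size_t end = start + n;		/* one past the last bit to set */

	while (start < end) {
		size_t word = start / TOY_BITS_PER_LONG;
		size_t bit = start % TOY_BITS_PER_LONG;
		size_t span = TOY_BITS_PER_LONG - bit;	/* bits left in this word */
		unsigned long mask = ~0UL << bit;

		if (span > end - start) {
			/* The range ends inside this word: trim the mask so it
			 * covers only 'span' bits starting at 'bit'. */
			span = end - start;
			mask &= ~0UL >> (TOY_BITS_PER_LONG - bit - span);
		}
		map[word] |= mask;
		start += span;
	}
}
```

Both functions produce the same bitmap; the second simply does far fewer memory updates when the range is long.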
to flush_dcache_page (Christoph)
- Clarify comments (Darrick)
- Rename read_count to read_bytes_pending (Christoph)
- Rename write_count to write_bytes_pending (Christoph)
- Restructure iomap_readpage_actor() (Christoph)
- Change return type of the zeroing functions from loff_t to s64
Matthew Wilcox (Oracle
Pass the full length to iomap_zero() and dax_iomap_zero(), and have
them return how many bytes they actually handled. This is preparatory
work for handling THP, although it looks like DAX could actually take
advantage of it if there's a larger contiguous area.
Signed-off-by: Matthew Wilcox
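The calling convention described above, where the caller passes the full remaining length and the callee reports how much it handled, might look like this in a userspace sketch. The functions toy_zero and toy_zero_range and the constant TOY_PAGE_SIZE are hypothetical; here the callee happens to stop at a 4kB boundary, but the caller would work unchanged if it handled a larger contiguous area.

```c
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u

/* Zero bytes starting at pos, stopping at the end of the current page or
 * after length bytes, whichever comes first. Returns the bytes handled. */
static size_t toy_zero(unsigned char *buf, size_t pos, size_t length)
{
	size_t offset = pos % TOY_PAGE_SIZE;
	size_t bytes = TOY_PAGE_SIZE - offset;

	if (bytes > length)
		bytes = length;
	memset(buf + pos, 0, bytes);
	return bytes;
}

/* The caller does no per-page arithmetic: it passes everything that is
 * left and advances by whatever the callee reports. Returns the number of
 * calls made, purely so the behaviour can be observed. */
static size_t toy_zero_range(unsigned char *buf, size_t pos, size_t count)
{
	size_t calls = 0;

	while (count > 0) {
		size_t bytes = toy_zero(buf, pos, count);

		pos += bytes;
		count -= bytes;
		calls++;
	}
	return calls;
}
```

Keeping the chunking decision inside the callee is what lets a later implementation handle more than one base page per call without touching the loop.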
Instead of counting bio segments, count the number of bytes submitted.
This insulates us from the block layer's definition of what a 'same page'
is, which is not necessarily clear once THPs are involved.
Signed-off-by: Matthew Wilcox (Oracle)
---
fs/iomap/buffered-io.c | 41
We can skip most of the initialisation, although spinlocks still
need explicit initialisation as architectures may use a non-zero
value to indicate unlocked. The comment is no longer useful as
attach_page_private() handles the refcount now.
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed
iomap_write_end cannot return an error, so switch it to return
size_t instead of int and remove the error checking from the callers.
Also convert the arguments to size_t from unsigned int, in case anyone
ever wants to support a page size larger than 2GB.
Signed-off-by: Matthew Wilcox (Oracle
-by: Matthew Wilcox (Oracle)
Reviewed-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
fs/iomap/buffered-io.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 897ab9a26a74..d81a9a86c5aa 100644
On Tue, Aug 25, 2020 at 02:02:03PM -0700, Darrick J. Wong wrote:
> > /*
> > - * Structure allocated for each page when block size < PAGE_SIZE to track
> > + * Structure allocated for each page when block size < page size to track
> > * sub-page uptodate status and I/O completions.
>
> "for
On Tue, Aug 25, 2020 at 02:27:11PM +1000, Dave Chinner wrote:
> On Mon, Aug 24, 2020 at 09:35:59PM -0600, Andreas Dilger wrote:
> > On Aug 24, 2020, at 9:26 PM, Matthew Wilcox wrote:
> > >
> > > On Tue, Aug 25, 2020 at 10:27:35AM +1000, Dave Chinne
On Tue, Aug 25, 2020 at 10:27:35AM +1000, Dave Chinner wrote:
> > do {
> > - unsigned offset, bytes;
> > -
> > - offset = offset_in_page(pos);
> > - bytes = min_t(loff_t, PAGE_SIZE - offset, count);
> > + loff_t bytes;
> >
> > if
On Tue, Aug 25, 2020 at 10:12:23AM +1000, Dave Chinner wrote:
> > -static int
> > -__iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
> > - unsigned copied, struct page *page)
> > +static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t
> > len,
> > +
On Tue, Aug 25, 2020 at 09:59:18AM +1000, Dave Chinner wrote:
> On Mon, Aug 24, 2020 at 03:55:06PM +0100, Matthew Wilcox (Oracle) wrote:
> > static inline struct iomap_page *to_iomap_page(struct page *page)
> > {
> > + VM_BUG_ON_PGFLAGS(PageTail(page), page);
> >
today which are the changes to
iomap which don't pay their own way until we actually have THPs in the
page cache. I would like those to be reviewed with an eye to merging
them into 5.11.
Matthew Wilcox (Oracle) (9):
iomap: Fix misplaced page flushing
fs: Introduce i_blocks_per_page
iomap
-by: Matthew Wilcox (Oracle)
---
fs/iomap/buffered-io.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index bcfc288dba3f..cffd575e57b6 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -715,6 +715,7
We can skip most of the initialisation, although spinlocks still
need explicit initialisation as architectures may use a non-zero
value to indicate unlocked. The comment is no longer useful as
attach_page_private() handles the refcount now.
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed
Pass the full length to iomap_zero() and dax_iomap_zero(), and have
them return how many bytes they actually handled. This is preparatory
work for handling THP, although it looks like DAX could actually take
advantage of it if there's a larger contiguous area.
Signed-off-by: Matthew Wilcox
Instead of counting bio segments, count the number of bytes submitted.
This insulates us from the block layer's definition of what a 'same page'
is, which is not necessarily clear once THPs are involved.
Signed-off-by: Matthew Wilcox (Oracle)
---
fs/iomap/buffered-io.c | 29
Now that the bitmap is protected by a spinlock, we can use the
more efficient bitmap ops instead of individual test/set bit ops.
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Christoph Hellwig
---
fs/iomap/buffered-io.c | 12 ++--
1 file changed, 2 insertions(+), 10 deletions
iomap_write_end cannot return an error, so switch it to return
size_t instead of int and remove the error checking from the callers.
Also convert the arguments to size_t from unsigned int, in case anyone
ever wants to support a page size larger than 2GB.
Signed-off-by: Matthew Wilcox (Oracle
Instead of counting bio segments, count the number of bytes submitted.
This insulates us from the block layer's definition of what a 'same page'
is, which is not necessarily clear once THPs are involved.
Signed-off-by: Matthew Wilcox (Oracle)
---
fs/iomap/buffered-io.c | 11 ++-
1 file
Size the uptodate array dynamically to support larger pages in the
page cache. With a 64kB page, we're only saving 8 bytes per page today,
but with a 2MB maximum page size, we'd have to allocate more than 4kB
per page. Add a few debugging assertions.
Signed-off-by: Matthew Wilcox (Oracle
This helper is useful for both THPs and for supporting block size larger
than page size. Convert all users that I could find (we have a few
different ways of writing this idiom, and I may have missed some).
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Christoph Hellwig
---
fs/iomap
On Fri, Aug 07, 2020 at 02:24:00AM +0300, Kirill A. Shutemov wrote:
> On Tue, Aug 04, 2020 at 05:17:52PM +0100, Matthew Wilcox (Oracle) wrote:
> > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> > index 484a36185bb5..a474a92a2a72 100644
> > --- a/include/linu
On Mon, Aug 10, 2020 at 04:15:50PM +0800, Ruan Shiyang wrote:
>
>
> On 2020/8/7 下午9:38, Matthew Wilcox wrote:
> > On Fri, Aug 07, 2020 at 09:13:28PM +0800, Shiyang Ruan wrote:
> > > This patchset is a try to resolve the problem of tracking shared page
> > > for f
On Fri, Aug 07, 2020 at 09:13:28PM +0800, Shiyang Ruan wrote:
> This patchset is a try to resolve the problem of tracking shared page
> for fsdax.
>
> Instead of per-page tracking method, this patchset introduces a query
> interface: get_shared_files(), which is implemented by each FS, to
>
Instead of checking the two counters (nrpages and nrexceptional), we
can just check whether i_pages is empty.
Signed-off-by: Matthew Wilcox (Oracle)
---
fs/block_dev.c | 2 +-
fs/dax.c | 2 +-
include/linux/pagemap.h | 5 +
mm/truncate.c | 18