Re: [PATCH v2 9/9] iomap: Change calling convention for zeroing
On Fri, Sep 11, 2020 at 12:47:07AM +0100, Matthew Wilcox (Oracle) wrote:
> Pass the full length to iomap_zero() and dax_iomap_zero(), and have
> them return how many bytes they actually handled.  This is preparatory
> work for handling THP, although it looks like DAX could actually take
> advantage of it if there's a larger contiguous area.

Looks good,

Reviewed-by: Christoph Hellwig
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org
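For anyone reading the archive without the tree at hand, the calling convention quoted above (pass the full remaining length, get back the number of bytes actually handled) can be sketched in userspace C. This is only an illustration under stated assumptions: sketch_zero(), sketch_zero_range() and SKETCH_PAGE_SIZE are made-up names, nothing is actually zeroed, and the helper simply stops at the end of the page containing pos, as the per-page helpers in the patch do.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration only */

/*
 * Hypothetical stand-in for iomap_zero()/dax_iomap_zero(): it is handed
 * the full remaining length and reports how many bytes it handled, which
 * is at most the distance to the end of the page containing pos.
 */
static int64_t sketch_zero(uint64_t pos, uint64_t length)
{
	uint64_t offset = pos % SKETCH_PAGE_SIZE;
	uint64_t room = SKETCH_PAGE_SIZE - offset;

	return (int64_t)(length < room ? length : room);
}

/*
 * Caller loop in the style of iomap_zero_range_actor(): advance by
 * whatever the helper reports until nothing is left. The calls counter
 * exists only so the behaviour can be observed.
 */
static int64_t sketch_zero_range(uint64_t pos, uint64_t length,
				 unsigned int *calls)
{
	int64_t written = 0;

	*calls = 0;
	while (length > 0) {
		int64_t bytes = sketch_zero(pos, length);

		if (bytes < 0)
			return bytes;	/* negative values are errors */
		pos += (uint64_t)bytes;
		length -= (uint64_t)bytes;
		written += bytes;
		(*calls)++;
	}
	return written;
}
```

With this shape the caller no longer needs to know the page size, so a helper that can handle a larger contiguous area may return more per call without any change to the loop.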
Re: [PATCH v2 6/9] iomap: Convert read_count to read_bytes_pending
On Fri, Sep 11, 2020 at 12:47:04AM +0100, Matthew Wilcox (Oracle) wrote:
> Instead of counting bio segments, count the number of bytes submitted.
> This insulates us from the block layer's definition of what a 'same page'
> is, which is not necessarily clear once THPs are involved.
>
> Signed-off-by: Matthew Wilcox (Oracle)

Looks good,

Reviewed-by: Christoph Hellwig
Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
On Fri, Sep 11, 2020 at 12:47:03AM +0100, Matthew Wilcox (Oracle) wrote:
> Size the uptodate array dynamically to support larger pages in the
> page cache.  With a 64kB page, we're only saving 8 bytes per page today,
> but with a 2MB maximum page size, we'd have to allocate more than 4kB
> per page.  Add a few debugging assertions.
>
> Signed-off-by: Matthew Wilcox (Oracle)
> Reviewed-by: Dave Chinner

Looks good,

Reviewed-by: Christoph Hellwig
Re: [PATCH v3 3/7] mm/memory_hotplug: prepare passing flags to add_memory() and friends
Hi David,

I love your patch! Yet something to improve:

[auto build test ERROR on next-20200909]
[cannot apply to mmotm/master hnaz-linux-mm/master xen-tip/linux-next powerpc/next linus/master v5.9-rc4 v5.9-rc3 v5.9-rc2 v5.9-rc4]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/David-Hildenbrand/mm-memory_hotplug-selective-merging-of-system-ram-resources/20200910-171630
base:   7204eaa2c1f509066486f488c9dcb065d7484494
config: x86_64-randconfig-a016-20200909 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 0a5dc7effb191eff740e0e7ae7bd8e1f6bdb3ad9)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

All errors (new ones prefixed by >>):

   WARNING: unmet direct dependencies detected for PHY_SAMSUNG_UFS
   Depends on OF && (ARCH_EXYNOS || COMPILE_TEST
   Selected by
   - SCSI_UFS_EXYNOS && SCSI_LOWLEVEL && SCSI && SCSI_UFSHCD_PLATFORM && (ARCH_EXYNOS || COMPILE_TEST
   In file included from arch/x86/kernel/asm-offsets.c:9:
   In file included from include/linux/crypto.h:20:
   In file included from include/linux/slab.h:15:
   In file included from include/linux/gfp.h:6:
   In file included from include/linux/mmzone.h:853:
>> include/linux/memory_hotplug.h:354:55: error: unknown type name 'mhp_t'
   extern int __add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
                                                         ^
   include/linux/memory_hotplug.h:355:53: error: unknown type name 'mhp_t'
   extern int add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
                                                       ^
   include/linux/memory_hotplug.h:357:11: error: unknown type name 'mhp_t'
   mhp_t mhp_flags);
   ^
   include/linux/memory_hotplug.h:360:10: error: unknown type name 'mhp_t'
   mhp_t mhp_flags);
   ^
   4 errors generated.
   *** [scripts/Makefile.build:117: arch/x86/kernel/asm-offsets.s] Error 1
   Target '__build' not remade because of errors.
   *** [Makefile:1196: prepare0] Error 2
   Target 'prepare' not remade because of errors.
   make: *** [Makefile:185: __sub-make] Error 2
   make: Target 'prepare' not remade because of errors.

# https://github.com/0day-ci/linux/commit/d88270d1c0783a7f99f24a85692be90fd2ae0d7d
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review David-Hildenbrand/mm-memory_hotplug-selective-merging-of-system-ram-resources/20200910-171630
git checkout d88270d1c0783a7f99f24a85692be90fd2ae0d7d
vim +/mhp_t +354 include/linux/memory_hotplug.h

   352
   353	extern void __ref free_area_init_core_hotplug(int nid);
 > 354	extern int __add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [PATCH] dax: fix for do not print error message for non-persistent memory block device
On 2020/9/11 04:29, John Pittman wrote:
> But it should be moved prior to the two bdev_dax_pgoff() checks right?
> Else a misaligned partition on a dax unsupported block device can
> print the below messages.
>
> kernel: sda1: error: unaligned partition for dax
> kernel: sda2: error: unaligned partition for dax
> kernel: sda3: error: unaligned partition for dax
>

Aha, yes you are right, I agree with you.

Coly Li

> Reviewed-by: John Pittman
>
> On Thu, Sep 3, 2020 at 12:12 PM Coly Li wrote:
>>
>> On 2020/9/4 00:06, Ira Weiny wrote:
>>> On Thu, Sep 03, 2020 at 07:55:49PM +0800, Coly Li wrote:
>>>> When calling __generic_fsdax_supported(), a dax-unsupported device may
>>>> not have dax_dev as NULL, e.g. the dax related code block is not enabled
>>>> by Kconfig.
>>>>
>>>> Therefore in __generic_fsdax_supported(), to check whether a device
>>>> supports DAX or not, the following order should be performed,
>>>> - If dax_dev pointer is NULL, it means the device driver explicitly
>>>>   announce it doesn't support DAX. Then it is OK to directly return
>>>>   false from __generic_fsdax_supported().
>>>> - If dax_dev pointer is NOT NULL, it might be because the driver doesn't
>>>>   support DAX and not explicitly initialize related data structure. Then
>>>>   bdev_dax_supported() should be called for further check.
>>>>
>>>> IMHO if device driver doesn't explicitly set its dax_dev pointer to NULL,
>>>> this is not a bug. Calling bdev_dax_supported() makes sure they can be
>>>> recognized as dax-unsupported eventually.
>>>>
>>>> This patch does the following change for the above purpose,
>>>> -	if (!dax_dev && !bdev_dax_supported(bdev, blocksize)) {
>>>> +	if (!dax_dev || !bdev_dax_supported(bdev, blocksize)) {
>>>>
>>>>
>>>> Fixes: c2affe920b0e ("dax: do not print error message for non-persistent
>>>> memory block device")
>>>> Signed-off-by: Coly Li
>>>
>>> I hate to do this because I realize this is a bug which people really need
>>> fixed.
>>>
>>> However, shouldn't we also check (!dax_dev || !bdev_dax_supported()) as the
>>> _first_ check in __generic_fsdax_supported()?
>>>
>>> It seems like the other pr_info's could also be called when DAX is not
>>> supported and we probably don't want them to be?
>>>
>>> Perhaps that should be a follow on patch though.  So...
>>
>> I am not author of c2affe920b0e, but I guess it was because
>> bdev_dax_supported() needed blocksize, so blocksize should pass previous
>> checks firstly to make sure bdev_dax_supported() has a correct blocksize
>> to check.
>>
>>>
>>> As a direct fix to c2affe920b0e
>>>
>>> Reviewed-by: Ira Weiny
>>
>> Thanks.
>>
>> Coly Li
>> [snipped]
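To make the effect of the one-character change in this thread (&& to ||) easier to see, here is a small userspace sketch of the two conditions. It is an illustration only: the function names are invented, and the two booleans stand in for "dax_dev is non-NULL" and "bdev_dax_supported() returned true".

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Condition before the fix: the device is rejected only when dax_dev is
 * NULL *and* the bdev check fails, so a non-NULL dax_dev on a device
 * without DAX support slips through.
 */
static bool sketch_rejected_before(bool have_dax_dev, bool bdev_supported)
{
	return !have_dax_dev && !bdev_supported;
}

/*
 * Condition after the fix: either a NULL dax_dev or a failed bdev check
 * is enough to reject the device. Thanks to short-circuit evaluation the
 * second check is skipped when dax_dev is NULL.
 */
static bool sketch_rejected_after(bool have_dax_dev, bool bdev_supported)
{
	return !have_dax_dev || !bdev_supported;
}
```

The only inputs on which the two differ are the mixed cases, for example a non-NULL dax_dev together with a failing bdev check, which is the situation the commit message describes.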
[PATCH v2 2/9] fs: Introduce i_blocks_per_page
This helper is useful for both THPs and for supporting block size larger
than page size.  Convert all users that I could find (we have a few
different ways of writing this idiom, and I may have missed some).

Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Darrick J. Wong
---
 fs/iomap/buffered-io.c  |  8
 fs/jfs/jfs_metapage.c   |  2 +-
 fs/xfs/xfs_aops.c       |  2 +-
 include/linux/pagemap.h | 16
 4 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index d81a9a86c5aa..330f86b825d7 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -46,7 +46,7 @@ iomap_page_create(struct inode *inode, struct page *page)
 {
 	struct iomap_page *iop = to_iomap_page(page);
 
-	if (iop || i_blocksize(inode) == PAGE_SIZE)
+	if (iop || i_blocks_per_page(inode, page) <= 1)
 		return iop;
 
 	iop = kmalloc(sizeof(*iop), GFP_NOFS | __GFP_NOFAIL);
@@ -147,7 +147,7 @@ iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 	unsigned int i;
 
 	spin_lock_irqsave(&iop->uptodate_lock, flags);
-	for (i = 0; i < PAGE_SIZE / i_blocksize(inode); i++) {
+	for (i = 0; i < i_blocks_per_page(inode, page); i++) {
 		if (i >= first && i <= last)
 			set_bit(i, iop->uptodate);
 		else if (!test_bit(i, iop->uptodate))
@@ -1077,7 +1077,7 @@ iomap_finish_page_writeback(struct inode *inode, struct page *page,
 		mapping_set_error(inode->i_mapping, -EIO);
 	}
 
-	WARN_ON_ONCE(i_blocksize(inode) < PAGE_SIZE && !iop);
+	WARN_ON_ONCE(i_blocks_per_page(inode, page) > 1 && !iop);
 	WARN_ON_ONCE(iop && atomic_read(&iop->write_count) <= 0);
 
 	if (!iop || atomic_dec_and_test(&iop->write_count))
@@ -1373,7 +1373,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	int error = 0, count = 0, i;
 	LIST_HEAD(submit_list);
 
-	WARN_ON_ONCE(i_blocksize(inode) < PAGE_SIZE && !iop);
+	WARN_ON_ONCE(i_blocks_per_page(inode, page) > 1 && !iop);
 	WARN_ON_ONCE(iop && atomic_read(&iop->write_count) != 0);
 
 	/*
diff --git a/fs/jfs/jfs_metapage.c b/fs/jfs/jfs_metapage.c
index a2f5338a5ea1..176580f54af9 100644
--- a/fs/jfs/jfs_metapage.c
+++ b/fs/jfs/jfs_metapage.c
@@ -473,7 +473,7 @@ static int metapage_readpage(struct file *fp, struct page *page)
 	struct inode *inode = page->mapping->host;
 	struct bio *bio = NULL;
 	int block_offset;
-	int blocks_per_page = PAGE_SIZE >> inode->i_blkbits;
+	int blocks_per_page = i_blocks_per_page(inode, page);
 	sector_t page_start;	/* address of page in fs blocks */
 	sector_t pblock;
 	int xlen;
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index b35611882ff9..55d126d4e096 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -544,7 +544,7 @@ xfs_discard_page(
 			page, ip->i_ino, offset);
 
 	error = xfs_bmap_punch_delalloc_range(ip, start_fsb,
-			PAGE_SIZE / i_blocksize(inode));
+			i_blocks_per_page(inode, page));
 	if (error && !XFS_FORCED_SHUTDOWN(mp))
 		xfs_alert(mp, "page discard unable to remove delalloc mapping.");
 out_invalidate:
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 50d2c39b47ab..f7f602040913 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -975,4 +975,20 @@ static inline int page_mkwrite_check_truncate(struct page *page,
 	return offset;
 }
 
+/**
+ * i_blocks_per_page - How many blocks fit in this page.
+ * @inode: The inode which contains the blocks.
+ * @page: The page (head page if the page is a THP).
+ *
+ * If the block size is larger than the size of this page, return zero.
+ *
+ * Context: The caller should hold a refcount on the page to prevent it
+ * from being split.
+ * Return: The number of filesystem blocks covered by this page.
+ */
+static inline
+unsigned int i_blocks_per_page(struct inode *inode, struct page *page)
+{
+	return thp_size(page) >> inode->i_blkbits;
+}
 #endif /* _LINUX_PAGEMAP_H */
-- 
2.28.0
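For readers following along outside the kernel tree, the arithmetic behind the helper in this patch can be tried in plain userspace C. This is a sketch, not kernel code: sketch_blocks_per_page() is an invented name, page_size stands in for thp_size(page), and blkbits stands in for inode->i_blkbits.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace sketch of the i_blocks_per_page() arithmetic.
 * page_size models thp_size(page); blkbits models inode->i_blkbits
 * (log2 of the filesystem block size). Both are illustration-only
 * stand-ins.
 */
static unsigned int sketch_blocks_per_page(size_t page_size,
					   unsigned int blkbits)
{
	/* A block larger than the page shifts down to zero. */
	return (unsigned int)(page_size >> blkbits);
}
```

This also shows why the converted callers test "<= 1" and "> 1" rather than comparing against PAGE_SIZE: the zero case (block size larger than the page) falls on the same side as the one-block case.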
[PATCH v2 7/9] iomap: Convert write_count to write_bytes_pending
Instead of counting bio segments, count the number of bytes submitted.
This insulates us from the block layer's definition of what a 'same page'
is, which is not necessarily clear once THPs are involved.

Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Christoph Hellwig
---
 fs/iomap/buffered-io.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 1cf976a8e55c..64a5cb383f30 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -27,7 +27,7 @@
  */
 struct iomap_page {
 	atomic_t		read_bytes_pending;
-	atomic_t		write_count;
+	atomic_t		write_bytes_pending;
 	spinlock_t		uptodate_lock;
 	unsigned long		uptodate[];
 };
@@ -73,7 +73,7 @@ iomap_page_release(struct page *page)
 	if (!iop)
 		return;
 	WARN_ON_ONCE(atomic_read(&iop->read_bytes_pending));
-	WARN_ON_ONCE(atomic_read(&iop->write_count));
+	WARN_ON_ONCE(atomic_read(&iop->write_bytes_pending));
 	WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) !=
 			PageUptodate(page));
 	kfree(iop);
@@ -1047,7 +1047,7 @@ EXPORT_SYMBOL_GPL(iomap_page_mkwrite);
 
 static void
 iomap_finish_page_writeback(struct inode *inode, struct page *page,
-		int error)
+		int error, unsigned int len)
 {
 	struct iomap_page *iop = to_iomap_page(page);
 
@@ -1057,9 +1057,9 @@ iomap_finish_page_writeback(struct inode *inode, struct page *page,
 	}
 
 	WARN_ON_ONCE(i_blocks_per_page(inode, page) > 1 && !iop);
-	WARN_ON_ONCE(iop && atomic_read(&iop->write_count) <= 0);
+	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) <= 0);
 
-	if (!iop || atomic_dec_and_test(&iop->write_count))
+	if (!iop || atomic_sub_and_test(len, &iop->write_bytes_pending))
 		end_page_writeback(page);
 }
 
@@ -1093,7 +1093,8 @@ iomap_finish_ioend(struct iomap_ioend *ioend, int error)
 
 		/* walk each page on bio, ending page IO on them */
 		bio_for_each_segment_all(bv, bio, iter_all)
-			iomap_finish_page_writeback(inode, bv->bv_page, error);
+			iomap_finish_page_writeback(inode, bv->bv_page, error,
+					bv->bv_len);
 		bio_put(bio);
 	}
 	/* The ioend has been freed by bio_put() */
@@ -1309,8 +1310,8 @@ iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
 
 	merged = __bio_try_merge_page(wpc->ioend->io_bio, page, len, poff,
 			&same_page);
-	if (iop && !same_page)
-		atomic_inc(&iop->write_count);
+	if (iop)
+		atomic_add(len, &iop->write_bytes_pending);
 
 	if (!merged) {
 		if (bio_full(wpc->ioend->io_bio, len)) {
@@ -1353,7 +1354,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	LIST_HEAD(submit_list);
 
 	WARN_ON_ONCE(i_blocks_per_page(inode, page) > 1 && !iop);
-	WARN_ON_ONCE(iop && atomic_read(&iop->write_count) != 0);
+	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);
 
 	/*
 	 * Walk through the page to find areas to write back. If we run off the
-- 
2.28.0
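The accounting scheme in this patch (add the byte count at submission, subtract it at completion, and finish the page when the counter reaches zero) can be modelled in userspace with C11 atomics. This is a rough sketch under assumptions: the struct and function names are invented, and atomic_fetch_sub() plays the role that atomic_sub_and_test() plays in the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustration-only model of the per-page write accounting. */
struct sketch_page_state {
	atomic_uint write_bytes_pending;
};

/* Submission side: account for len bytes queued for writeback. */
static void sketch_submit(struct sketch_page_state *st, unsigned int len)
{
	atomic_fetch_add(&st->write_bytes_pending, len);
}

/*
 * Completion side: subtract len and report whether the counter dropped
 * to zero, which is the point where the kernel code would call
 * end_page_writeback(). atomic_fetch_sub() returns the previous value,
 * so "previous == len" means the new value is zero.
 */
static bool sketch_complete(struct sketch_page_state *st, unsigned int len)
{
	return atomic_fetch_sub(&st->write_bytes_pending, len) == len;
}
```

Because the counter is in bytes, it does not matter how the block layer groups the submitted ranges into segments; completions only need to add up to what was submitted.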
[PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
Size the uptodate array dynamically to support larger pages in the
page cache.  With a 64kB page, we're only saving 8 bytes per page today,
but with a 2MB maximum page size, we'd have to allocate more than 4kB
per page.  Add a few debugging assertions.

Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Dave Chinner
---
 fs/iomap/buffered-io.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 7fc0e02d27b0..9670c096b83e 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -22,18 +22,25 @@
 #include "../internal.h"
 
 /*
- * Structure allocated for each page when block size < PAGE_SIZE to track
- * sub-page uptodate status and I/O completions.
+ * Structure allocated for each page or THP when block size < page size
+ * to track sub-page uptodate status and I/O completions.
  */
 struct iomap_page {
 	atomic_t		read_count;
 	atomic_t		write_count;
 	spinlock_t		uptodate_lock;
-	DECLARE_BITMAP(uptodate, PAGE_SIZE / 512);
+	unsigned long		uptodate[];
 };
 
 static inline struct iomap_page *to_iomap_page(struct page *page)
 {
+	/*
+	 * per-block data is stored in the head page.  Callers should
+	 * not be dealing with tail pages (and if they are, they can
+	 * call thp_head() first.
+	 */
+	VM_BUG_ON_PGFLAGS(PageTail(page), page);
+
 	if (page_has_private(page))
 		return (struct iomap_page *)page_private(page);
 	return NULL;
@@ -45,11 +52,13 @@ static struct iomap_page *
 iomap_page_create(struct inode *inode, struct page *page)
 {
 	struct iomap_page *iop = to_iomap_page(page);
+	unsigned int nr_blocks = i_blocks_per_page(inode, page);
 
-	if (iop || i_blocks_per_page(inode, page) <= 1)
+	if (iop || nr_blocks <= 1)
 		return iop;
 
-	iop = kzalloc(sizeof(*iop), GFP_NOFS | __GFP_NOFAIL);
+	iop = kzalloc(struct_size(iop, uptodate, BITS_TO_LONGS(nr_blocks)),
+			GFP_NOFS | __GFP_NOFAIL);
 	spin_lock_init(&iop->uptodate_lock);
 	attach_page_private(page, iop);
 	return iop;
@@ -59,11 +68,14 @@ static void
 iomap_page_release(struct page *page)
 {
 	struct iomap_page *iop = detach_page_private(page);
+	unsigned int nr_blocks = i_blocks_per_page(page->mapping->host, page);
 
 	if (!iop)
 		return;
 	WARN_ON_ONCE(atomic_read(&iop->read_count));
 	WARN_ON_ONCE(atomic_read(&iop->write_count));
+	WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) !=
+			PageUptodate(page));
 	kfree(iop);
 }
 
-- 
2.28.0
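The switch from a fixed-size bitmap to a flexible array member sized at allocation time can be illustrated in userspace C. This is a sketch under assumptions, not the kernel implementation: the sketch_ names and macros are invented, plain ints replace atomic_t, calloc() stands in for kzalloc(), and the size computation omits the overflow protection that the kernel's struct_size() provides.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
/* Number of unsigned longs needed to hold n bits (rounds up). */
#define SKETCH_BITS_TO_LONGS(n) \
	(((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Illustration-only model of struct iomap_page after the patch. */
struct sketch_iomap_page {
	int		read_count;
	int		write_count;
	unsigned long	uptodate[];	/* one bit per block, sized at runtime */
};

/* Header plus just enough bitmap words for nr_blocks bits. */
static size_t sketch_iop_size(unsigned int nr_blocks)
{
	return sizeof(struct sketch_iomap_page) +
	       SKETCH_BITS_TO_LONGS(nr_blocks) * sizeof(unsigned long);
}

/* Zeroed allocation, so every block starts out not uptodate. */
static struct sketch_iomap_page *sketch_iop_alloc(unsigned int nr_blocks)
{
	return calloc(1, sketch_iop_size(nr_blocks));
}
```

The point of the design is that the bitmap cost now scales with the number of blocks actually covered by the page instead of with the largest page the structure might ever describe.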
[PATCH v2 4/9] iomap: Use bitmap ops to set uptodate bits
Now that the bitmap is protected by a spinlock, we can use the
more efficient bitmap ops instead of individual test/set bit ops.

Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Darrick J. Wong
---
 fs/iomap/buffered-io.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 58a1fd83f2a4..7fc0e02d27b0 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -134,19 +134,11 @@ iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 	struct inode *inode = page->mapping->host;
 	unsigned first = off >> inode->i_blkbits;
 	unsigned last = (off + len - 1) >> inode->i_blkbits;
-	bool uptodate = true;
 	unsigned long flags;
-	unsigned int i;
 
 	spin_lock_irqsave(&iop->uptodate_lock, flags);
-	for (i = 0; i < i_blocks_per_page(inode, page); i++) {
-		if (i >= first && i <= last)
-			set_bit(i, iop->uptodate);
-		else if (!test_bit(i, iop->uptodate))
-			uptodate = false;
-	}
-
-	if (uptodate)
+	bitmap_set(iop->uptodate, first, last - first + 1);
+	if (bitmap_full(iop->uptodate, i_blocks_per_page(inode, page)))
 		SetPageUptodate(page);
 	spin_unlock_irqrestore(&iop->uptodate_lock, flags);
 }
-- 
2.28.0
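The semantics of the new code path (set a range of bits, then ask whether the whole bitmap is full) can be modelled in userspace C. This is a sketch only: the sketch_ helpers are simple bit-at-a-time stand-ins written for this example, so they show what the kernel's bitmap_set() and bitmap_full() compute, not how efficiently the kernel computes it, and no locking is modelled.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Set nbits bits starting at bit 'start' (stand-in for bitmap_set()). */
static void sketch_bitmap_set(unsigned long *map, unsigned int start,
			      unsigned int nbits)
{
	for (unsigned int i = start; i < start + nbits; i++)
		map[i / SKETCH_BITS_PER_WORD] |= 1UL << (i % SKETCH_BITS_PER_WORD);
}

/* True if the first nbits bits are all set (stand-in for bitmap_full()). */
static bool sketch_bitmap_full(const unsigned long *map, unsigned int nbits)
{
	for (unsigned int i = 0; i < nbits; i++)
		if (!(map[i / SKETCH_BITS_PER_WORD] &
		      (1UL << (i % SKETCH_BITS_PER_WORD))))
			return false;
	return true;
}

/*
 * Mark the blocks covering bytes [off, off + len) uptodate and report
 * whether every block of the page is now uptodate. Assumes len > 0.
 */
static bool sketch_set_range_uptodate(unsigned long *uptodate,
				      unsigned int nr_blocks,
				      unsigned int off, unsigned int len,
				      unsigned int blkbits)
{
	unsigned int first = off >> blkbits;
	unsigned int last = (off + len - 1) >> blkbits;

	sketch_bitmap_set(uptodate, first, last - first + 1);
	return sketch_bitmap_full(uptodate, nr_blocks);
}
```

For a 4096-byte page with 512-byte blocks, marking the first half uptodate leaves the page not fully uptodate, and marking the second half afterwards completes it.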
[PATCH v2 0/9] THP iomap patches for 5.10
These patches are carefully plucked from the THP series.  I would like
them to hit 5.10 to make the THP patchset merge easier.  Some of these
are just generic improvements that make sense on their own terms, but
the overall intent is to support THPs in iomap.

v2:
 - Move the call to flush_dcache_page (Christoph)
 - Clarify comments (Darrick)
 - Rename read_count to read_bytes_pending (Christoph)
 - Rename write_count to write_bytes_pending (Christoph)
 - Restructure iomap_readpage_actor() (Christoph)
 - Change return type of the zeroing functions from loff_t to s64

Matthew Wilcox (Oracle) (9):
  iomap: Fix misplaced page flushing
  fs: Introduce i_blocks_per_page
  iomap: Use kzalloc to allocate iomap_page
  iomap: Use bitmap ops to set uptodate bits
  iomap: Support arbitrarily many blocks per page
  iomap: Convert read_count to read_bytes_pending
  iomap: Convert write_count to write_bytes_pending
  iomap: Convert iomap_write_end types
  iomap: Change calling convention for zeroing

 fs/dax.c                |  13 ++-
 fs/iomap/buffered-io.c  | 173 +---
 fs/jfs/jfs_metapage.c   |   2 +-
 fs/xfs/xfs_aops.c       |   2 +-
 include/linux/dax.h     |   3 +-
 include/linux/pagemap.h |  16
 6 files changed, 96 insertions(+), 113 deletions(-)

-- 
2.28.0
[PATCH v2 9/9] iomap: Change calling convention for zeroing
Pass the full length to iomap_zero() and dax_iomap_zero(), and have
them return how many bytes they actually handled.  This is preparatory
work for handling THP, although it looks like DAX could actually take
advantage of it if there's a larger contiguous area.

Signed-off-by: Matthew Wilcox (Oracle)
---
 fs/dax.c               | 13 ++---
 fs/iomap/buffered-io.c | 33 +++--
 include/linux/dax.h    |  3 +--
 3 files changed, 22 insertions(+), 27 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 994ab66a9907..6ad346352a8c 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1037,18 +1037,18 @@ static vm_fault_t dax_load_hole(struct xa_state *xas,
 	return ret;
 }
 
-int dax_iomap_zero(loff_t pos, unsigned offset, unsigned size,
-		   struct iomap *iomap)
+s64 dax_iomap_zero(loff_t pos, u64 length, struct iomap *iomap)
 {
 	sector_t sector = iomap_sector(iomap, pos & PAGE_MASK);
 	pgoff_t pgoff;
 	long rc, id;
 	void *kaddr;
 	bool page_aligned = false;
-
+	unsigned offset = offset_in_page(pos);
+	unsigned size = min_t(u64, PAGE_SIZE - offset, length);
 
 	if (IS_ALIGNED(sector << SECTOR_SHIFT, PAGE_SIZE) &&
-	    IS_ALIGNED(size, PAGE_SIZE))
+	    (size == PAGE_SIZE))
 		page_aligned = true;
 
 	rc = bdev_dax_pgoff(iomap->bdev, sector, PAGE_SIZE, &pgoff);
@@ -1058,8 +1058,7 @@ int dax_iomap_zero(loff_t pos, unsigned offset, unsigned size,
 	id = dax_read_lock();
 
 	if (page_aligned)
-		rc = dax_zero_page_range(iomap->dax_dev, pgoff,
-					 size >> PAGE_SHIFT);
+		rc = dax_zero_page_range(iomap->dax_dev, pgoff, 1);
 	else
 		rc = dax_direct_access(iomap->dax_dev, pgoff, 1, &kaddr, NULL);
 	if (rc < 0) {
@@ -1072,7 +1071,7 @@ int dax_iomap_zero(loff_t pos, unsigned offset, unsigned size,
 		dax_flush(iomap->dax_dev, kaddr + offset, size);
 	}
 	dax_read_unlock(id);
-	return 0;
+	return size;
 }
 
 static loff_t
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index cb25a7b70401..3e1eb40a73fd 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -898,11 +898,13 @@ iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
 }
 EXPORT_SYMBOL_GPL(iomap_file_unshare);
 
-static int iomap_zero(struct inode *inode, loff_t pos, unsigned offset,
-		unsigned bytes, struct iomap *iomap, struct iomap *srcmap)
+static s64 iomap_zero(struct inode *inode, loff_t pos, u64 length,
+		struct iomap *iomap, struct iomap *srcmap)
 {
 	struct page *page;
 	int status;
+	unsigned offset = offset_in_page(pos);
+	unsigned bytes = min_t(u64, PAGE_SIZE - offset, length);
 
 	status = iomap_write_begin(inode, pos, bytes, 0, &page, iomap, srcmap);
 	if (status)
@@ -914,38 +916,33 @@ static int iomap_zero(struct inode *inode, loff_t pos, unsigned offset,
 	return iomap_write_end(inode, pos, bytes, bytes, page, iomap, srcmap);
 }
 
-static loff_t
-iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count,
-		void *data, struct iomap *iomap, struct iomap *srcmap)
+static loff_t iomap_zero_range_actor(struct inode *inode, loff_t pos,
+		loff_t length, void *data, struct iomap *iomap,
+		struct iomap *srcmap)
 {
 	bool *did_zero = data;
 	loff_t written = 0;
-	int status;
 
 	/* already zeroed?  we're done. */
 	if (srcmap->type == IOMAP_HOLE || srcmap->type == IOMAP_UNWRITTEN)
-		return count;
+		return length;
 
 	do {
-		unsigned offset, bytes;
-
-		offset = offset_in_page(pos);
-		bytes = min_t(loff_t, PAGE_SIZE - offset, count);
+		s64 bytes;
 
 		if (IS_DAX(inode))
-			status = dax_iomap_zero(pos, offset, bytes, iomap);
+			bytes = dax_iomap_zero(pos, length, iomap);
 		else
-			status = iomap_zero(inode, pos, offset, bytes, iomap,
-					srcmap);
-		if (status < 0)
-			return status;
+			bytes = iomap_zero(inode, pos, length, iomap, srcmap);
+		if (bytes < 0)
+			return bytes;
 
 		pos += bytes;
-		count -= bytes;
+		length -= bytes;
 		written += bytes;
 		if (did_zero)
 			*did_zero = true;
-	} while (count > 0);
+	} while (length > 0);
 
 	return written;
 }
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 6904d4e0b2e0..951a851a0481 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -214,8 +214,7 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf,
 int dax_delete_mapping_entry(struct ad
[PATCH v2 1/9] iomap: Fix misplaced page flushing
If iomap_unshare_actor() unshares to an inline iomap, the page was
not being flushed.  block_write_end() and __iomap_write_end() already
contain flushes, so adding it to iomap_write_end_inline() seems like
the best place.  That means we can remove it from iomap_write_actor().

Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
 fs/iomap/buffered-io.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 897ab9a26a74..d81a9a86c5aa 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -717,6 +717,7 @@ iomap_write_end_inline(struct inode *inode, struct page *page,
 	WARN_ON_ONCE(!PageUptodate(page));
 	BUG_ON(pos + copied > PAGE_SIZE - offset_in_page(iomap->inline_data));
 
+	flush_dcache_page(page);
 	addr = kmap_atomic(page);
 	memcpy(iomap->inline_data + pos, addr + pos, copied);
 	kunmap_atomic(addr);
@@ -810,8 +811,6 @@ iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 
 		copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
 
-		flush_dcache_page(page);
-
 		status = iomap_write_end(inode, pos, bytes, copied, page, iomap,
 				srcmap);
 		if (unlikely(status < 0))
-- 
2.28.0
[PATCH v2 3/9] iomap: Use kzalloc to allocate iomap_page
We can skip most of the initialisation, although spinlocks still
need explicit initialisation as architectures may use a non-zero
value to indicate unlocked.  The comment is no longer useful as
attach_page_private() handles the refcount now.

Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Darrick J. Wong
---
 fs/iomap/buffered-io.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 330f86b825d7..58a1fd83f2a4 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -49,16 +49,8 @@ iomap_page_create(struct inode *inode, struct page *page)
 	if (iop || i_blocks_per_page(inode, page) <= 1)
 		return iop;
 
-	iop = kmalloc(sizeof(*iop), GFP_NOFS | __GFP_NOFAIL);
-	atomic_set(&iop->read_count, 0);
-	atomic_set(&iop->write_count, 0);
+	iop = kzalloc(sizeof(*iop), GFP_NOFS | __GFP_NOFAIL);
 	spin_lock_init(&iop->uptodate_lock);
-	bitmap_zero(iop->uptodate, PAGE_SIZE / SECTOR_SIZE);
-
-	/*
-	 * migrate_page_move_mapping() assumes that pages with private data have
-	 * their count elevated by 1.
-	 */
 	attach_page_private(page, iop);
 	return iop;
 }
-- 
2.28.0
[PATCH v2 8/9] iomap: Convert iomap_write_end types
iomap_write_end cannot return an error, so switch it to return
size_t instead of int and remove the error checking from the callers.
Also convert the arguments to size_t from unsigned int, in case anyone
ever wants to support a page size larger than 2GB.

Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Christoph Hellwig
---
 fs/iomap/buffered-io.c | 31 ---
 1 file changed, 12 insertions(+), 19 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 64a5cb383f30..cb25a7b70401 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -663,9 +663,8 @@ iomap_set_page_dirty(struct page *page)
 }
 EXPORT_SYMBOL_GPL(iomap_set_page_dirty);
 
-static int
-__iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
-		unsigned copied, struct page *page)
+static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
+		size_t copied, struct page *page)
 {
 	flush_dcache_page(page);
 
@@ -687,9 +686,8 @@ __iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
 	return copied;
 }
 
-static int
-iomap_write_end_inline(struct inode *inode, struct page *page,
-		struct iomap *iomap, loff_t pos, unsigned copied)
+static size_t iomap_write_end_inline(struct inode *inode, struct page *page,
+		struct iomap *iomap, loff_t pos, size_t copied)
 {
 	void *addr;
 
@@ -705,13 +703,14 @@ iomap_write_end_inline(struct inode *inode, struct page *page,
 	return copied;
 }
 
-static int
-iomap_write_end(struct inode *inode, loff_t pos, unsigned len, unsigned copied,
-		struct page *page, struct iomap *iomap, struct iomap *srcmap)
+/* Returns the number of bytes copied.  May be 0.  Cannot be an errno. */
+static size_t iomap_write_end(struct inode *inode, loff_t pos, size_t len,
+		size_t copied, struct page *page, struct iomap *iomap,
+		struct iomap *srcmap)
 {
 	const struct iomap_page_ops *page_ops = iomap->page_ops;
 	loff_t old_size = inode->i_size;
-	int ret;
+	size_t ret;
 
 	if (srcmap->type == IOMAP_INLINE) {
 		ret = iomap_write_end_inline(inode, page, iomap, pos, copied);
@@ -790,11 +789,8 @@ iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 
 		copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
 
-		status = iomap_write_end(inode, pos, bytes, copied, page, iomap,
+		copied = iomap_write_end(inode, pos, bytes, copied, page, iomap,
 				srcmap);
-		if (unlikely(status < 0))
-			break;
-		copied = status;
 
 		cond_resched();
 
@@ -868,11 +864,8 @@ iomap_unshare_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 
 		status = iomap_write_end(inode, pos, bytes, bytes, page, iomap,
 				srcmap);
-		if (unlikely(status <= 0)) {
-			if (WARN_ON_ONCE(status == 0))
-				return -EIO;
-			return status;
-		}
+		if (WARN_ON_ONCE(status == 0))
+			return -EIO;
 
 		cond_resched();
 
-- 
2.28.0
[PATCH v2 6/9] iomap: Convert read_count to read_bytes_pending
Instead of counting bio segments, count the number of bytes submitted.
This insulates us from the block layer's definition of what a 'same page'
is, which is not necessarily clear once THPs are involved.

Signed-off-by: Matthew Wilcox (Oracle)
---
 fs/iomap/buffered-io.c | 41 ++++++++++++-----------------------------
 1 file changed, 12 insertions(+), 29 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 9670c096b83e..1cf976a8e55c 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -26,7 +26,7 @@
  * to track sub-page uptodate status and I/O completions.
  */
 struct iomap_page {
-	atomic_t		read_count;
+	atomic_t		read_bytes_pending;
 	atomic_t		write_count;
 	spinlock_t		uptodate_lock;
 	unsigned long		uptodate[];
@@ -72,7 +72,7 @@ iomap_page_release(struct page *page)
 
 	if (!iop)
 		return;
-	WARN_ON_ONCE(atomic_read(&iop->read_count));
+	WARN_ON_ONCE(atomic_read(&iop->read_bytes_pending));
 	WARN_ON_ONCE(atomic_read(&iop->write_count));
 	WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) !=
 			PageUptodate(page));
@@ -167,13 +167,6 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 		SetPageUptodate(page);
 }
 
-static void
-iomap_read_finish(struct iomap_page *iop, struct page *page)
-{
-	if (!iop || atomic_dec_and_test(&iop->read_count))
-		unlock_page(page);
-}
-
 static void
 iomap_read_page_end_io(struct bio_vec *bvec, int error)
 {
@@ -187,7 +180,8 @@ iomap_read_page_end_io(struct bio_vec *bvec, int error)
 		iomap_set_range_uptodate(page, bvec->bv_offset, bvec->bv_len);
 	}
 
-	iomap_read_finish(iop, page);
+	if (!iop || atomic_sub_and_test(bvec->bv_len, &iop->read_bytes_pending))
+		unlock_page(page);
 }
 
 static void
@@ -267,30 +261,19 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	}
 
 	ctx->cur_page_in_bio = true;
+	if (iop)
+		atomic_add(plen, &iop->read_bytes_pending);
 
-	/*
-	 * Try to merge into a previous segment if we can.
-	 */
+	/* Try to merge into a previous segment if we can */
 	sector = iomap_sector(iomap, pos);
-	if (ctx->bio && bio_end_sector(ctx->bio) == sector)
+	if (ctx->bio && bio_end_sector(ctx->bio) == sector) {
+		if (__bio_try_merge_page(ctx->bio, page, plen, poff,
+				&same_page))
+			goto done;
 		is_contig = true;
-
-	if (is_contig &&
-	    __bio_try_merge_page(ctx->bio, page, plen, poff, &same_page)) {
-		if (!same_page && iop)
-			atomic_inc(&iop->read_count);
-		goto done;
 	}
 
-	/*
-	 * If we start a new segment we need to increase the read count, and we
-	 * need to do so before submitting any previous full bio to make sure
-	 * that we don't prematurely unlock the page.
-	 */
-	if (iop)
-		atomic_inc(&iop->read_count);
-
-	if (!ctx->bio || !is_contig || bio_full(ctx->bio, plen)) {
+	if (!is_contig || bio_full(ctx->bio, plen)) {
 		gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
 		gfp_t orig_gfp = gfp;
 		int nr_vecs = (length + PAGE_SIZE - 1) >> PAGE_SHIFT;
-- 
2.28.0
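
For readers following the accounting change, here is a minimal userspace sketch of byte-based completion tracking, assuming C11 atomics in place of the kernel's atomic_t helpers. struct page_state, submit_range() and complete_range() are hypothetical names, not part of the patch. The point is only that the page is unlocked when the pending byte count returns to zero, however the completions happen to be split.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct iomap_page; only the read counter is modeled. */
struct page_state {
	atomic_uint read_bytes_pending;	/* bytes submitted but not yet completed */
	bool locked;			/* models the page lock */
};

/* Submission side: account for plen bytes before any I/O can complete. */
void submit_range(struct page_state *ps, unsigned int plen)
{
	atomic_fetch_add(&ps->read_bytes_pending, plen);
}

/*
 * Completion side: subtract the completed length and unlock the page when
 * the counter reaches zero. Returns true if this call unlocked the page.
 */
bool complete_range(struct page_state *ps, unsigned int len)
{
	/* atomic_fetch_sub() returns the old value: old == len means new == 0 */
	if (atomic_fetch_sub(&ps->read_bytes_pending, len) == len) {
		ps->locked = false;
		return true;
	}
	return false;
}
```

Because the counter is in bytes, it does not matter whether a range completes as one segment or several; only the total has to add up.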
Re: [PATCH] dax: fix for do not print error message for non-persistent memory block device
But it should be moved prior to the two bdev_dax_pgoff() checks, right?
Otherwise a misaligned partition on a dax-unsupported block device can
print the messages below.

kernel: sda1: error: unaligned partition for dax
kernel: sda2: error: unaligned partition for dax
kernel: sda3: error: unaligned partition for dax

Reviewed-by: John Pittman

On Thu, Sep 3, 2020 at 12:12 PM Coly Li wrote:
>
> On 2020/9/4 00:06, Ira Weiny wrote:
> > On Thu, Sep 03, 2020 at 07:55:49PM +0800, Coly Li wrote:
> >> When calling __generic_fsdax_supported(), a dax-unsupported device may
> >> not have dax_dev as NULL, e.g. the dax related code block is not enabled
> >> by Kconfig.
> >>
> >> Therefore in __generic_fsdax_supported(), to check whether a device
> >> supports DAX or not, the following order should be performed,
> >> - If dax_dev pointer is NULL, it means the device driver explicitly
> >>   announce it doesn't support DAX. Then it is OK to directly return
> >>   false from __generic_fsdax_supported().
> >> - If dax_dev pointer is NOT NULL, it might be because the driver doesn't
> >>   support DAX and not explicitly initialize related data structure. Then
> >>   bdev_dax_supported() should be called for further check.
> >>
> >> IMHO if device driver doesn't explicitly set its dax_dev pointer to NULL,
> >> this is not a bug. Calling bdev_dax_supported() makes sure they can be
> >> recognized as dax-unsupported eventually.
> >>
> >> This patch does the following change for the above purpose,
> >> -	if (!dax_dev && !bdev_dax_supported(bdev, blocksize)) {
> >> +	if (!dax_dev || !bdev_dax_supported(bdev, blocksize)) {
> >>
> >>
> >> Fixes: c2affe920b0e ("dax: do not print error message for non-persistent
> >> memory block device")
> >> Signed-off-by: Coly Li
> >
> > I hate to do this because I realize this is a bug which people really need
> > fixed.
> >
> > However, shouldn't we also check (!dax_dev || !bdev_dax_supported()) as the
> > _first_ check in __generic_fsdax_supported()?
> >
> > It seems like the other pr_info's could also be called when DAX is not
> > supported and we probably don't want them to be?
> >
> > Perhaps that should be a follow on patch though. So...
>
> I am not author of c2affe920b0e, but I guess it was because
> bdev_dax_supported() needed blocksize, so blocksize should pass previous
> checks firstly to make sure bdev_dax_supported() has a correct blocksize
> to check.
>
> >
> > As a direct fix to c2affe920b0e
> >
> > Reviewed-by: Ira Weiny
>
> Thanks.
>
> Coly Li
>
> >
> >
> >> Cc: Adrian Huang
> >> Cc: Ira Weiny
> >> Cc: Jan Kara
> >> Cc: Mike Snitzer
> >> Cc: Pankaj Gupta
> >> Cc: Vishal Verma
> >> ---
> >>  drivers/dax/super.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> >> index 32642634c1bb..e5767c83ea23 100644
> >> --- a/drivers/dax/super.c
> >> +++ b/drivers/dax/super.c
> >> @@ -100,7 +100,7 @@ bool __generic_fsdax_supported(struct dax_device
> >> *dax_dev,
> >>  		return false;
> >>  	}
> >>
> >> -	if (!dax_dev && !bdev_dax_supported(bdev, blocksize)) {
> >> +	if (!dax_dev || !bdev_dax_supported(bdev, blocksize)) {
> >>  		pr_debug("%s: error: dax unsupported by block device\n",
> >>  				bdevname(bdev, buf));
> >>  		return false;
> >> --
> >> 2.26.2
> >>
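
The effect of the one-character change in the quoted patch can be checked with a small truth-table sketch. rejected_before() and rejected_after() are hypothetical helpers that model only the condition, with booleans standing in for the dax_dev pointer and the bdev_dax_supported() result; they are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition before the fix: the device is rejected only when BOTH checks fail. */
bool rejected_before(bool has_dax_dev, bool bdev_supported)
{
	return !has_dax_dev && !bdev_supported;
}

/* Condition after the fix: the device is rejected when EITHER check fails. */
bool rejected_after(bool has_dax_dev, bool bdev_supported)
{
	return !has_dax_dev || !bdev_supported;
}
```

With a non-NULL dax_dev and a block device that does not support DAX, the old condition lets the device through, while the new one rejects it. That is the case the commit message describes.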
Re: [PATCH] powerpc/papr_scm: Fix warning triggered by perf_stats_show()
On Thu, Sep 10, 2020 at 02:52:12PM +0530, Vaibhav Jain wrote:
> A warning is reported by the kernel in case perf_stats_show() returns
> an error code. The warning is of the form below:
>
> papr_scm ibm,persistent-memory:ibm,pmemory@4411:
>   Failed to query performance stats, Err:-10
> dev_attr_show: perf_stats_show+0x0/0x1c0 [papr_scm] returned bad count
> fill_read_buffer: dev_attr_show+0x0/0xb0 returned bad count
>
> On investigation it looks like that the compiler is silently truncating the
> return value of drc_pmem_query_stats() from 'long' to 'int', since the
> variable used to store the return code 'rc' is an 'int'. This
> truncated value is then returned back as a 'ssize_t' back from
> perf_stats_show() to 'dev_attr_show()' which thinks of it as a large
> unsigned number and triggers this warning..
>
> To fix this we update the type of variable 'rc' from 'int' to
> 'ssize_t' that prevents the compiler from truncating the return value
> of drc_pmem_query_stats() and returning correct signed value back from
> perf_stats_show().
>
> Fixes: 2d02bf835e573 ('powerpc/papr_scm: Fetch nvdimm performance
>        stats from PHYP')
> Signed-off-by: Vaibhav Jain
> ---
>  arch/powerpc/platforms/pseries/papr_scm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c
> b/arch/powerpc/platforms/pseries/papr_scm.c
> index a88a707a608aa..9f00b61676ab9 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -785,7 +785,8 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor
> *nd_desc,
>  static ssize_t perf_stats_show(struct device *dev,
>  			       struct device_attribute *attr, char *buf)
>  {
> -	int index, rc;
> +	int index;
> +	ssize_t rc;

I'm not sure this is really fixing everything here.

drc_pmem_query_stats() can return negative errno's. Why are those not checked
somewhere in perf_stats_show()?

It seems like all this fix is handling is a > 0 return value: 'ret[0]' from
line 289 in papr_scm.c... Or something?

Worse yet drc_pmem_query_stats() is returning ssize_t which is a signed value.
Therefore, it should not be returning -errno. I'm surprised the static
checkers did not catch that.

I believe I caught similar errors with a patch series before which did not pay
attention to variable types.

Please audit this code for these types of errors and ensure you are really
doing the correct thing when using the sysfs interface. I'm pretty sure bad
things will eventually happen (if they are not already) if you return some
really big number to the sysfs core from *_show().

Ira

>  	struct seq_buf s;
>  	struct papr_scm_perf_stat *stat;
>  	struct papr_scm_perf_stats *stats;
> --
> 2.26.2
>
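
As a rough illustration of the sysfs concern above, the following sketch models the kind of check that the 'returned bad count' lines in the log point to, assuming it flags any show() result of at least one page. MODEL_PAGE_SIZE and show_count_is_bad() are hypothetical names, and the 4 KiB page size is an assumption. The sketch does not settle which value actually reached the sysfs core in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>	/* ssize_t */

#define MODEL_PAGE_SIZE 4096	/* assumption: 4 KiB pages */

/*
 * Model of a "bad count" check: a show() callback is expected to return
 * either a negative errno or the number of bytes written into a one-page
 * buffer, so anything at or above the page size is suspicious.
 */
bool show_count_is_bad(ssize_t ret)
{
	return ret >= (ssize_t)MODEL_PAGE_SIZE;
}
```

Under this model a negative errno and a small byte count both pass, so only a large positive value would be reported as a bad count.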
Re: [PATCH v2] powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute
On Mon, 7 Sep 2020 16:35:40 +0530, Vaibhav Jain wrote:
> The newly introduced 'perf_stats' attribute uses the default access
> mode of 0444 letting non-root users access performance stats of an
> nvdimm and potentially force the kernel into issuing large number of
> expensive HCALLs. Since the information exposed by this attribute
> cannot be cached hence its better to ward of access to this attribute
> from users who don't need to access these performance statistics.
>
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute
      https://git.kernel.org/powerpc/c/0460534b532e5518c657c7d6492b9337d975eaa3

cheers
Re: [PATCH v3 5/7] virtio-mem: try to merge system ram resources
Reviewed-by: Pankaj Gupta
Re: [PATCH v3 4/7] mm/memory_hotplug: MEMHP_MERGE_RESOURCE to specify merging of System RAM resources
Looks good to me.

Reviewed-by: Pankaj Gupta
Re: [PATCH v3 3/7] mm/memory_hotplug: prepare passing flags to add_memory() and friends
> We soon want to pass flags, e.g., to mark added System RAM resources. > mergeable. Prepare for that. > > This patch is based on a similar patch by Oscar Salvador: > > https://lkml.kernel.org/r/20190625075227.15193-3-osalva...@suse.de > > Acked-by: Wei Liu > Reviewed-by: Juergen Gross # Xen related part > Cc: Andrew Morton > Cc: Michal Hocko > Cc: Dan Williams > Cc: Jason Gunthorpe > Cc: Pankaj Gupta > Cc: Baoquan He > Cc: Wei Yang > Cc: Michael Ellerman > Cc: Benjamin Herrenschmidt > Cc: Paul Mackerras > Cc: "Rafael J. Wysocki" > Cc: Len Brown > Cc: Greg Kroah-Hartman > Cc: Vishal Verma > Cc: Dave Jiang > Cc: "K. Y. Srinivasan" > Cc: Haiyang Zhang > Cc: Stephen Hemminger > Cc: Wei Liu > Cc: Heiko Carstens > Cc: Vasily Gorbik > Cc: Christian Borntraeger > Cc: David Hildenbrand > Cc: "Michael S. Tsirkin" > Cc: Jason Wang > Cc: Boris Ostrovsky > Cc: Juergen Gross > Cc: Stefano Stabellini > Cc: "Oliver O'Halloran" > Cc: Pingfan Liu > Cc: Nathan Lynch > Cc: Libor Pechacek > Cc: Anton Blanchard > Cc: Leonardo Bras > Cc: linuxppc-...@lists.ozlabs.org > Cc: linux-a...@vger.kernel.org > Cc: linux-nvdimm@lists.01.org > Cc: linux-hyp...@vger.kernel.org > Cc: linux-s...@vger.kernel.org > Cc: virtualizat...@lists.linux-foundation.org > Cc: xen-de...@lists.xenproject.org > Signed-off-by: David Hildenbrand > --- > arch/powerpc/platforms/powernv/memtrace.c | 2 +- > arch/powerpc/platforms/pseries/hotplug-memory.c | 2 +- > drivers/acpi/acpi_memhotplug.c | 3 ++- > drivers/base/memory.c | 3 ++- > drivers/dax/kmem.c | 2 +- > drivers/hv/hv_balloon.c | 2 +- > drivers/s390/char/sclp_cmd.c| 2 +- > drivers/virtio/virtio_mem.c | 2 +- > drivers/xen/balloon.c | 2 +- > include/linux/memory_hotplug.h | 16 > mm/memory_hotplug.c | 14 +++--- > 11 files changed, 30 insertions(+), 20 deletions(-) > > diff --git a/arch/powerpc/platforms/powernv/memtrace.c > b/arch/powerpc/platforms/powernv/memtrace.c > index 13b369d2cc454..6828108486f83 100644 > --- a/arch/powerpc/platforms/powernv/memtrace.c > +++ 
b/arch/powerpc/platforms/powernv/memtrace.c > @@ -224,7 +224,7 @@ static int memtrace_online(void) > ent->mem = 0; > } > > - if (add_memory(ent->nid, ent->start, ent->size)) { > + if (add_memory(ent->nid, ent->start, ent->size, MHP_NONE)) { > pr_err("Failed to add trace memory to node %d\n", > ent->nid); > ret += 1; > diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c > b/arch/powerpc/platforms/pseries/hotplug-memory.c > index 0ea976d1cac47..e1c9fa0d730f5 100644 > --- a/arch/powerpc/platforms/pseries/hotplug-memory.c > +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c > @@ -615,7 +615,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb) > nid = memory_add_physaddr_to_nid(lmb->base_addr); > > /* Add the memory */ > - rc = __add_memory(nid, lmb->base_addr, block_sz); > + rc = __add_memory(nid, lmb->base_addr, block_sz, MHP_NONE); > if (rc) { > invalidate_lmb_associativity_index(lmb); > return rc; > diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c > index e294f44a78504..2067c3bc55763 100644 > --- a/drivers/acpi/acpi_memhotplug.c > +++ b/drivers/acpi/acpi_memhotplug.c > @@ -207,7 +207,8 @@ static int acpi_memory_enable_device(struct > acpi_memory_device *mem_device) > if (node < 0) > node = memory_add_physaddr_to_nid(info->start_addr); > > - result = __add_memory(node, info->start_addr, info->length); > + result = __add_memory(node, info->start_addr, info->length, > + MHP_NONE); > > /* > * If the memory block has been used by the kernel, > add_memory() > diff --git a/drivers/base/memory.c b/drivers/base/memory.c > index 4db3c660de831..b4c297dd04755 100644 > --- a/drivers/base/memory.c > +++ b/drivers/base/memory.c > @@ -432,7 +432,8 @@ static ssize_t probe_store(struct device *dev, struct > device_attribute *attr, > > nid = memory_add_physaddr_to_nid(phys_addr); > ret = __add_memory(nid, phys_addr, > - MIN_MEMORY_BLOCK_SIZE * sections_per_block); > + MIN_MEMORY_BLOCK_SIZE * sections_per_block, > + MHP_NONE); > > if (ret) > goto 
out; > diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c > index 7dcb2902e9b1b..896cb9444e727 100644 > --- a/drivers/dax/kmem.c > +++ b/drivers/dax/kmem.c > @@ -95,7 +95,7 @@ int dev_dax_kmem_probe(struct dev_dax *dev_dax) >
[PATCH] powerpc/papr_scm: Fix warning triggered by perf_stats_show()
A warning is reported by the kernel in case perf_stats_show() returns
an error code. The warning is of the form below:

 papr_scm ibm,persistent-memory:ibm,pmemory@4411:
	  Failed to query performance stats, Err:-10
 dev_attr_show: perf_stats_show+0x0/0x1c0 [papr_scm] returned bad count
 fill_read_buffer: dev_attr_show+0x0/0xb0 returned bad count

On investigation it looks like the compiler is silently truncating the
return value of drc_pmem_query_stats() from 'long' to 'int', since the
variable used to store the return code, 'rc', is an 'int'. This
truncated value is then returned as a 'ssize_t' from perf_stats_show()
to dev_attr_show(), which treats it as a large unsigned number and
triggers this warning.

To fix this we change the type of variable 'rc' from 'int' to
'ssize_t', which prevents the compiler from truncating the return value
of drc_pmem_query_stats() and returns the correct signed value from
perf_stats_show().

Fixes: 2d02bf835e573 ('powerpc/papr_scm: Fetch nvdimm performance
       stats from PHYP')
Signed-off-by: Vaibhav Jain
---
 arch/powerpc/platforms/pseries/papr_scm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index a88a707a608aa..9f00b61676ab9 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -785,7 +785,8 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 static ssize_t perf_stats_show(struct device *dev,
 			       struct device_attribute *attr, char *buf)
 {
-	int index, rc;
+	int index;
+	ssize_t rc;
 	struct seq_buf s;
 	struct papr_scm_perf_stat *stat;
 	struct papr_scm_perf_stats *stats;
-- 
2.26.2
[PATCH v3 5/7] virtio-mem: try to merge system ram resources
virtio-mem adds memory in memory block granularity, to be able to
remove it in the same granularity again later, and to grow slowly on
demand. This, however, results in quite a lot of resources when
adding a lot of memory. Resources are effectively stored in a list-based
tree. Having a lot of resources not only wastes memory, it also makes
traversing that tree more expensive, and makes /proc/iomem explode in
size (e.g., requiring kexec-tools to manually merge resources later
when e.g., trying to create a kdump header).

Before this patch, we get (/proc/iomem) when hotplugging 2G via virtio-mem
on x86-64:
        [...]
        100000000-13fffffff : System RAM
        140000000-33fffffff : virtio0
          140000000-147ffffff : System RAM (virtio_mem)
          148000000-14fffffff : System RAM (virtio_mem)
          150000000-157ffffff : System RAM (virtio_mem)
          158000000-15fffffff : System RAM (virtio_mem)
          160000000-167ffffff : System RAM (virtio_mem)
          168000000-16fffffff : System RAM (virtio_mem)
          170000000-177ffffff : System RAM (virtio_mem)
          178000000-17fffffff : System RAM (virtio_mem)
          180000000-187ffffff : System RAM (virtio_mem)
          188000000-18fffffff : System RAM (virtio_mem)
          190000000-197ffffff : System RAM (virtio_mem)
          198000000-19fffffff : System RAM (virtio_mem)
          1a0000000-1a7ffffff : System RAM (virtio_mem)
          1a8000000-1afffffff : System RAM (virtio_mem)
          1b0000000-1b7ffffff : System RAM (virtio_mem)
          1b8000000-1bfffffff : System RAM (virtio_mem)
        3280000000-32ffffffff : PCI Bus 0000:00

With this patch, we get (/proc/iomem):
        [...]
        fffc0000-ffffffff : Reserved
        100000000-13fffffff : System RAM
        140000000-33fffffff : virtio0
          140000000-1bfffffff : System RAM (virtio_mem)
        3280000000-32ffffffff : PCI Bus 0000:00

Of course, with more hotplugged memory, it gets worse. When unplugging
memory blocks again, try_remove_memory() (via
offline_and_remove_memory()) will properly split the resource up again.

Cc: Andrew Morton
Cc: Michal Hocko
Cc: Dan Williams
Cc: Michael S. Tsirkin
Cc: Jason Wang
Cc: Pankaj Gupta
Cc: Baoquan He
Cc: Wei Yang
Signed-off-by: David Hildenbrand
---
 drivers/virtio/virtio_mem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index ed99e43354010..ba4de598f6636 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -424,7 +424,8 @@ static int virtio_mem_mb_add(struct virtio_mem *vm, unsigned long mb_id)
 
 	dev_dbg(&vm->vdev->dev, "adding memory block: %lu\n", mb_id);
 	return add_memory_driver_managed(nid, addr, memory_block_size_bytes(),
-					 vm->resource_name, MHP_NONE);
+					 vm->resource_name,
+					 MEMHP_MERGE_RESOURCE);
 }
 
 /*
-- 
2.26.2
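
The merging described in this patch can be modeled in userspace with a flat, sorted array. This is only a sketch: struct range, ranges_mergeable() and merge_ranges() are hypothetical, the kernel keeps resources in a tree rather than an array, and the model compares names with strcmp() instead of comparing name pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat model of a resource entry. */
struct range {
	uint64_t start;
	uint64_t end;		/* inclusive, as in /proc/iomem */
	const char *name;
	unsigned long flags;
};

/* Two entries can be merged if they match and the second directly follows the first. */
bool ranges_mergeable(const struct range *a, const struct range *b)
{
	return a->flags == b->flags && a->end + 1 == b->start &&
	       strcmp(a->name, b->name) == 0;
}

/* Merge adjacent mergeable entries of a sorted array in place; returns the new count. */
size_t merge_ranges(struct range *r, size_t n)
{
	size_t out = 0;

	for (size_t i = 0; i < n; i++) {
		if (out > 0 && ranges_mergeable(&r[out - 1], &r[i]))
			r[out - 1].end = r[i].end;	/* extend the previous entry */
		else
			r[out++] = r[i];		/* keep as a separate entry */
	}
	return out;
}
```

Fed with sixteen contiguous, equally named blocks, the model collapses them into a single entry, which corresponds to the before and after listings in the commit message.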
[PATCH v3 6/7] xen/balloon: try to merge system ram resources
Let's try to merge system ram resources we add, to minimize the number
of resources in /proc/iomem. We don't care about the boundaries of
individual chunks we added.

Reviewed-by: Juergen Gross
Cc: Andrew Morton
Cc: Michal Hocko
Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: Roger Pau Monné
Cc: Julien Grall
Cc: Pankaj Gupta
Cc: Baoquan He
Cc: Wei Yang
Signed-off-by: David Hildenbrand
---
 drivers/xen/balloon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 9f40a294d398d..b57b2067ecbfb 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -331,7 +331,7 @@ static enum bp_state reserve_additional_memory(void)
 	mutex_unlock(&balloon_mutex);
 	/* add_memory_resource() requires the device_hotplug lock */
 	lock_device_hotplug();
-	rc = add_memory_resource(nid, resource, MHP_NONE);
+	rc = add_memory_resource(nid, resource, MEMHP_MERGE_RESOURCE);
 	unlock_device_hotplug();
 	mutex_lock(&balloon_mutex);
 
-- 
2.26.2
[PATCH v3 4/7] mm/memory_hotplug: MEMHP_MERGE_RESOURCE to specify merging of System RAM resources
Some add_memory*() users add memory in small, contiguous memory blocks.
Examples include virtio-mem, hyper-v balloon, and the XEN balloon.

This can quickly result in a lot of memory resources, whereby the actual
resource boundaries are not of interest (e.g., it might be relevant for
DIMMs, exposed via /proc/iomem to user space). We really want to merge
added resources in this scenario where possible.

Let's provide a flag (MEMHP_MERGE_RESOURCE) to specify that a resource
either created within add_memory*() or passed via add_memory_resource()
shall be marked mergeable and merged with applicable siblings.

To implement that, we need a kernel/resource interface to mark selected
System RAM resources mergeable (IORESOURCE_SYSRAM_MERGEABLE) and trigger
merging.

Note: We really want to merge after the whole operation succeeded, not
directly when adding a resource to the resource tree (it would break
add_memory_resource() and require splitting resources again when the
operation failed - e.g., due to -ENOMEM).

Cc: Andrew Morton
Cc: Michal Hocko
Cc: Dan Williams
Cc: Jason Gunthorpe
Cc: Kees Cook
Cc: Ard Biesheuvel
Cc: Thomas Gleixner
Cc: "K. Y. Srinivasan"
Cc: Haiyang Zhang
Cc: Stephen Hemminger
Cc: Wei Liu
Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: Roger Pau Monné
Cc: Julien Grall
Cc: Pankaj Gupta
Cc: Baoquan He
Cc: Wei Yang
Signed-off-by: David Hildenbrand
---
 include/linux/ioport.h         |  4 +++
 include/linux/memory_hotplug.h |  7 
 kernel/resource.c              | 60 ++
 mm/memory_hotplug.c            |  7 
 4 files changed, 78 insertions(+)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index d7620d7c941a0..7e61389dcb017 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -60,6 +60,7 @@ struct resource {
 
 /* IORESOURCE_SYSRAM specific bits. */
 #define IORESOURCE_SYSRAM_DRIVER_MANAGED	0x02000000 /* Always detected via a driver. */
+#define IORESOURCE_SYSRAM_MERGEABLE		0x04000000 /* Resource can be merged. */
 
 #define IORESOURCE_EXCLUSIVE	0x08000000	/* Userland may not map this resource */
 
@@ -253,6 +254,9 @@ extern void __release_region(struct resource *, resource_size_t,
 extern void release_mem_region_adjustable(struct resource *, resource_size_t,
 					  resource_size_t);
 #endif
+#ifdef CONFIG_MEMORY_HOTPLUG
+extern void merge_system_ram_resource(struct resource *res);
+#endif
 
 /* Wrappers for managed devices */
 struct device;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index e53d1058f3443..869a59006cd8e 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -62,6 +62,13 @@ typedef int __bitwise mhp_t;
 
 /* No special request */
 #define MHP_NONE		((__force mhp_t)0)
+/*
+ * Allow merging of the added System RAM resource with adjacent,
+ * mergeable resources. After a successful call to add_memory_resource()
+ * with this flag set, the resource pointer must no longer be used as it
+ * might be stale, or the resource might have changed.
+ */
+#define MEMHP_MERGE_RESOURCE	((__force mhp_t)BIT(0))
 
 /*
  * Extended parameters for memory hotplug:
diff --git a/kernel/resource.c b/kernel/resource.c
index 36b3552210120..7a91b935f4c20 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -1363,6 +1363,66 @@ void release_mem_region_adjustable(struct resource *parent,
 }
 #endif	/* CONFIG_MEMORY_HOTREMOVE */
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+static bool system_ram_resources_mergeable(struct resource *r1,
+					   struct resource *r2)
+{
+	/* We assume either r1 or r2 is IORESOURCE_SYSRAM_MERGEABLE. */
+	return r1->flags == r2->flags && r1->end + 1 == r2->start &&
+	       r1->name == r2->name && r1->desc == r2->desc &&
+	       !r1->child && !r2->child;
+}
+
+/*
+ * merge_system_ram_resource - mark the System RAM resource mergeable and try to
+ *	merge it with adjacent, mergeable resources
+ * @res: resource descriptor
+ *
+ * This interface is intended for memory hotplug, whereby lots of contiguous
+ * system ram resources are added (e.g., via add_memory*()) by a driver, and
+ * the actual resource boundaries are not of interest (e.g., it might be
+ * relevant for DIMMs). Only resources that are marked mergeable, that have the
+ * same parent, and that don't have any children are considered. All mergeable
+ * resources must be immutable during the request.
+ *
+ * Note:
+ * - The caller has to make sure that no pointers to resources that are
+ *   marked mergeable are used anymore after this call - the resource might
+ *   be freed and the pointer might be stale!
+ * - release_mem_region_adjustable() will split on demand on memory hotunplug
+ */
+void merge_system_ram_resource(struct resource *res)
+{
+	const unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	struct
[PATCH v3 3/7] mm/memory_hotplug: prepare passing flags to add_memory() and friends
We soon want to pass flags, e.g., to mark added System RAM resources
mergeable. Prepare for that.

This patch is based on a similar patch by Oscar Salvador:

https://lkml.kernel.org/r/20190625075227.15193-3-osalva...@suse.de

Acked-by: Wei Liu
Reviewed-by: Juergen Gross # Xen related part
Cc: Andrew Morton
Cc: Michal Hocko
Cc: Dan Williams
Cc: Jason Gunthorpe
Cc: Pankaj Gupta
Cc: Baoquan He
Cc: Wei Yang
Cc: Michael Ellerman
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: "Rafael J. Wysocki"
Cc: Len Brown
Cc: Greg Kroah-Hartman
Cc: Vishal Verma
Cc: Dave Jiang
Cc: "K. Y. Srinivasan"
Cc: Haiyang Zhang
Cc: Stephen Hemminger
Cc: Wei Liu
Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: Christian Borntraeger
Cc: David Hildenbrand
Cc: "Michael S. Tsirkin"
Cc: Jason Wang
Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: "Oliver O'Halloran"
Cc: Pingfan Liu
Cc: Nathan Lynch
Cc: Libor Pechacek
Cc: Anton Blanchard
Cc: Leonardo Bras
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-a...@vger.kernel.org
Cc: linux-nvdimm@lists.01.org
Cc: linux-hyp...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: virtualizat...@lists.linux-foundation.org
Cc: xen-de...@lists.xenproject.org
Signed-off-by: David Hildenbrand
---
 arch/powerpc/platforms/powernv/memtrace.c       |  2 +-
 arch/powerpc/platforms/pseries/hotplug-memory.c |  2 +-
 drivers/acpi/acpi_memhotplug.c                  |  3 ++-
 drivers/base/memory.c                           |  3 ++-
 drivers/dax/kmem.c                              |  2 +-
 drivers/hv/hv_balloon.c                         |  2 +-
 drivers/s390/char/sclp_cmd.c                    |  2 +-
 drivers/virtio/virtio_mem.c                     |  2 +-
 drivers/xen/balloon.c                           |  2 +-
 include/linux/memory_hotplug.h                  | 16 
 mm/memory_hotplug.c                             | 14 +++--
 11 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index 13b369d2cc454..6828108486f83 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -224,7 +224,7 @@ static int memtrace_online(void)
 			ent->mem = 0;
 		}
 
-		if (add_memory(ent->nid, ent->start, ent->size)) {
+		if (add_memory(ent->nid, ent->start, ent->size, MHP_NONE)) {
 			pr_err("Failed to add trace memory to node %d\n",
 				ent->nid);
 			ret += 1;
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 0ea976d1cac47..e1c9fa0d730f5 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -615,7 +615,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
 	nid = memory_add_physaddr_to_nid(lmb->base_addr);
 
 	/* Add the memory */
-	rc = __add_memory(nid, lmb->base_addr, block_sz);
+	rc = __add_memory(nid, lmb->base_addr, block_sz, MHP_NONE);
 	if (rc) {
 		invalidate_lmb_associativity_index(lmb);
 		return rc;
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index e294f44a78504..2067c3bc55763 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -207,7 +207,8 @@ static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
 		if (node < 0)
 			node = memory_add_physaddr_to_nid(info->start_addr);
 
-		result = __add_memory(node, info->start_addr, info->length);
+		result = __add_memory(node, info->start_addr, info->length,
+				      MHP_NONE);
 
 		/*
 		 * If the memory block has been used by the kernel, add_memory()
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 4db3c660de831..b4c297dd04755 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -432,7 +432,8 @@ static ssize_t probe_store(struct device *dev, struct device_attribute *attr,
 
 	nid = memory_add_physaddr_to_nid(phys_addr);
 	ret = __add_memory(nid, phys_addr,
-			   MIN_MEMORY_BLOCK_SIZE * sections_per_block);
+			   MIN_MEMORY_BLOCK_SIZE * sections_per_block,
+			   MHP_NONE);
 
 	if (ret)
 		goto out;
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 7dcb2902e9b1b..896cb9444e727 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -95,7 +95,7 @@ int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 	 * this as RAM automatically.
 	 */
 	rc = add_memory_driver_managed(numa_node, range.start,
-				       range_len(&range), kmem_name);
+				       range_len(&range
[PATCH v3 7/7] hv_balloon: try to merge system ram resources
Let's try to merge system ram resources we add, to minimize the number
of resources in /proc/iomem. We don't care about the boundaries of
individual chunks we added.

Reviewed-by: Wei Liu
Cc: Andrew Morton
Cc: Michal Hocko
Cc: "K. Y. Srinivasan"
Cc: Haiyang Zhang
Cc: Stephen Hemminger
Cc: Wei Liu
Cc: Pankaj Gupta
Cc: Baoquan He
Cc: Wei Yang
Signed-off-by: David Hildenbrand
---
 drivers/hv/hv_balloon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 3c0d52e244520..b64d2efbefe71 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -726,7 +726,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
 
 		nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn));
 		ret = add_memory(nid, PFN_PHYS((start_pfn)),
-				(HA_CHUNK << PAGE_SHIFT), MHP_NONE);
+				(HA_CHUNK << PAGE_SHIFT), MEMHP_MERGE_RESOURCE);
 
 		if (ret) {
 			pr_err("hot_add memory failed error is %d\n", ret);
-- 
2.26.2
[PATCH v3 2/7] kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED
IORESOURCE_MEM_DRIVER_MANAGED currently uses an unused PnP bit, which is
always set to 0 by hardware. This is far from beautiful (and confusing),
and the bit only applies to SYSRAM. So let's move it out of the
bus-specific (PnP) defined bits.

We'll add another SYSRAM specific bit soon. If we ever need more bits for
other purposes, we can steal some from "desc", or reshuffle/regroup what we
have.

Cc: Andrew Morton
Cc: Michal Hocko
Cc: Dan Williams
Cc: Jason Gunthorpe
Cc: Kees Cook
Cc: Ard Biesheuvel
Cc: Pankaj Gupta
Cc: Baoquan He
Cc: Wei Yang
Cc: Eric Biederman
Cc: Thomas Gleixner
Cc: Greg Kroah-Hartman
Cc: ke...@lists.infradead.org
Signed-off-by: David Hildenbrand
---
 include/linux/ioport.h | 4 +++-
 kernel/kexec_file.c    | 2 +-
 mm/memory_hotplug.c    | 4 ++--
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 52a91f5fa1a36..d7620d7c941a0 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -58,6 +58,9 @@ struct resource {
 #define IORESOURCE_EXT_TYPE_BITS 0x01000000	/* Resource extended types */
 #define IORESOURCE_SYSRAM	0x01000000	/* System RAM (modifier) */
 
+/* IORESOURCE_SYSRAM specific bits. */
+#define IORESOURCE_SYSRAM_DRIVER_MANAGED 0x02000000 /* Always detected via a driver. */
+
 #define IORESOURCE_EXCLUSIVE	0x08000000	/* Userland may not map this resource */
 
 #define IORESOURCE_DISABLED	0x10000000
@@ -103,7 +106,6 @@ struct resource {
 #define IORESOURCE_MEM_32BIT		(3<<3)
 #define IORESOURCE_MEM_SHADOWABLE	(1<<5)	/* dup: IORESOURCE_SHADOWABLE */
 #define IORESOURCE_MEM_EXPANSIONROM	(1<<6)
-#define IORESOURCE_MEM_DRIVER_MANAGED	(1<<7)
 
 /* PnP I/O specific bits (IORESOURCE_BITS) */
 #define IORESOURCE_IO_16BIT_ADDR	(1<<0)
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index ca40bef75a616..dfeeed1aed084 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -520,7 +520,7 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
 	/* Returning 0 will take to next memory range */
 
 	/* Don't use memory that will be detected and handled by a driver. */
-	if (res->flags & IORESOURCE_MEM_DRIVER_MANAGED)
+	if (res->flags & IORESOURCE_SYSRAM_DRIVER_MANAGED)
 		return 0;
 
 	if (sz < kbuf->memsz)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 4c47b68a9f4b5..8e1cd18b5cf14 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -105,7 +105,7 @@ static struct resource *register_memory_resource(u64 start, u64 size,
 	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 
 	if (strcmp(resource_name, "System RAM"))
-		flags |= IORESOURCE_MEM_DRIVER_MANAGED;
+		flags |= IORESOURCE_SYSRAM_DRIVER_MANAGED;
 
 	/*
 	 * Make sure value parsed from 'mem=' only restricts memory adding
@@ -1160,7 +1160,7 @@ EXPORT_SYMBOL_GPL(add_memory);
  *
  * For this memory, no entries in /sys/firmware/memmap ("raw firmware-provided
  * memory map") are created. Also, the created memory resource is flagged
- * with IORESOURCE_MEM_DRIVER_MANAGED, so in-kernel users can special-case
+ * with IORESOURCE_SYSRAM_DRIVER_MANAGED, so in-kernel users can special-case
  * this memory as well (esp., not place kexec images onto it).
  *
  * The resource_name (visible via /proc/iomem) has to have the format
-- 
2.26.2
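For readers following along outside the kernel tree, the effect of the renamed flag can be sketched as a small user-space model. This is only an illustration: `struct toy_resource` and `usable_for_kexec()` are hypothetical names, and the two flag values are assumed to match upstream include/linux/ioport.h (0x01000000 and 0x02000000).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match include/linux/ioport.h with this patch applied. */
#define IORESOURCE_SYSRAM                0x01000000UL
#define IORESOURCE_SYSRAM_DRIVER_MANAGED 0x02000000UL

/* Hypothetical stand-in for struct resource; only the flags matter here. */
struct toy_resource {
	uint64_t start;
	uint64_t end;		/* inclusive */
	unsigned long flags;
};

/*
 * Loosely modeled on the check in locate_mem_hole_callback(): System RAM
 * that a driver will detect and manage is skipped when looking for a place
 * to put a kexec image. The extra IORESOURCE_SYSRAM test is an addition of
 * this sketch, since the modifier bit is only meaningful for System RAM.
 */
static bool usable_for_kexec(const struct toy_resource *res)
{
	if (!(res->flags & IORESOURCE_SYSRAM))
		return false;
	return !(res->flags & IORESOURCE_SYSRAM_DRIVER_MANAGED);
}
```

A plain System RAM range passes the check, while a range carrying both IORESOURCE_SYSRAM and IORESOURCE_SYSRAM_DRIVER_MANAGED is rejected.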
[PATCH v3 0/7] mm/memory_hotplug: selective merging of system ram resources
Some add_memory*() users add memory in small, contiguous memory blocks.
Examples include virtio-mem, hyper-v balloon, and the XEN balloon.

This can quickly result in a lot of memory resources, whereby the actual
resource boundaries are not of interest (e.g., it might be relevant for
DIMMs, exposed via /proc/iomem to user space). We really want to merge
added resources in this scenario where possible.

Resources are effectively stored in a list-based tree. Having a lot of
resources not only wastes memory, it also makes traversing that tree more
expensive, and makes /proc/iomem explode in size (e.g., requiring
kexec-tools to manually merge resources when creating a kdump header. The
current kexec-tools resource count limit does not allow for more than
~100GB of memory with a memory block size of 128MB on x86-64).

Let's allow selectively merging system ram resources by specifying a new
flag for add_memory*(). Patch #5 contains a /proc/iomem example. Only
tested with virtio-mem.

v2 -> v3:
- "mm/memory_hotplug: prepare passing flags to add_memory() and friends"
-- Use proper __bitwise type for flags
-- Use "MHP_NONE" for empty flags
- Rebased to latest -next, added rb's

v1 -> v2:
- I had another look at v1 after vacation and didn't like it - it felt like
  a hack. So I went forward and added a proper flag to add_memory*(), and
  introduced a clean (non-racy) way to mark System RAM resources mergeable.
- "kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED"
-- Clean that flag up, felt wrong in the PnP section
- "mm/memory_hotplug: prepare passing flags to add_memory() and friends"
-- Previously sent in other context - decided to keep Wei's ack
- "mm/memory_hotplug: MEMHP_MERGE_RESOURCE to specify merging of System RAM
  resources"
-- Cleaner approach to get the job done by using proper flags and only
   merging the single, specified resource
- "virtio-mem: try to merge system ram resources"
  "xen/balloon: try to merge system ram resources"
  "hv_balloon: try to merge system ram resources"
-- Use the new flag MEMHP_MERGE_RESOURCE, much cleaner

RFC -> v1:
- Switch from rather generic "merge_child_mem_resources()" where a resource
  name has to be specified to "merge_system_ram_resources()".
- Smaller comment/documentation/patch description changes/fixes

David Hildenbrand (7):
  kernel/resource: make release_mem_region_adjustable() never fail
  kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED
  mm/memory_hotplug: prepare passing flags to add_memory() and friends
  mm/memory_hotplug: MEMHP_MERGE_RESOURCE to specify merging of System
    RAM resources
  virtio-mem: try to merge system ram resources
  xen/balloon: try to merge system ram resources
  hv_balloon: try to merge system ram resources

 arch/powerpc/platforms/powernv/memtrace.c     |   2 +-
 .../platforms/pseries/hotplug-memory.c        |   2 +-
 drivers/acpi/acpi_memhotplug.c                |   3 +-
 drivers/base/memory.c                         |   3 +-
 drivers/dax/kmem.c                            |   2 +-
 drivers/hv/hv_balloon.c                       |   2 +-
 drivers/s390/char/sclp_cmd.c                  |   2 +-
 drivers/virtio/virtio_mem.c                   |   3 +-
 drivers/xen/balloon.c                         |   2 +-
 include/linux/ioport.h                        |  12 +-
 include/linux/memory_hotplug.h                |  23 +++-
 kernel/kexec_file.c                           |   2 +-
 kernel/resource.c                             | 109 ++
 mm/memory_hotplug.c                           |  47 +++-
 14 files changed, 146 insertions(+), 68 deletions(-)

-- 
2.26.2
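To make the motivation concrete, here is a minimal user-space sketch of what merging adjacent resources in a sorted sibling list could look like. It is a toy model under stated assumptions, not the code from this series: `struct toy_resource`, `toy_mergeable()` and `toy_merge_siblings()` are hypothetical names, and unlike the series (which merges only resources explicitly marked as mergeable) the toy folds together any adjacent siblings with identical name and flags.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct resource. */
struct toy_resource {
	uint64_t start;
	uint64_t end;			/* inclusive */
	const char *name;
	unsigned long flags;
	struct toy_resource *sibling;	/* next entry, sorted by address */
};

/* Two siblings can merge if they are adjacent and otherwise identical. */
static int toy_mergeable(const struct toy_resource *a,
			 const struct toy_resource *b)
{
	return a->end + 1 == b->start && a->flags == b->flags &&
	       strcmp(a->name, b->name) == 0;
}

/*
 * Walk a sorted sibling list and fold each mergeable successor into its
 * predecessor. Returns the number of entries left in the list. Storage of
 * unlinked entries stays with the caller in this sketch.
 */
static size_t toy_merge_siblings(struct toy_resource *head)
{
	size_t count = 0;
	struct toy_resource *cur = head;

	while (cur) {
		struct toy_resource *next = cur->sibling;

		if (next && toy_mergeable(cur, next)) {
			cur->end = next->end;
			cur->sibling = next->sibling;
			continue;	/* re-check cur against its new sibling */
		}
		count++;
		cur = next;
	}
	return count;
}
```

For scale: at a 128MB block size, 100GB of separately added memory corresponds to 800 individual entries (100 * 1024 / 128), all of which would collapse into one entry if they were contiguous and mergeable.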
[PATCH v3 1/7] kernel/resource: make release_mem_region_adjustable() never fail
Let's make sure splitting a resource on memory hotunplug will never fail.
This will become more relevant once we merge selected System RAM
resources - then, we'll trigger that case more often on memory hotunplug.

In general, this function is already unlikely to fail. When we remove
memory, we free up quite a lot of metadata (memmap, page tables, memory
block device, etc.). The only reason it could really fail would be when
injecting allocation errors.

All other error cases inside release_mem_region_adjustable() seem to be
sanity checks if the function would be abused in different context -
let's add WARN_ON_ONCE() in these cases so we can catch them.

Cc: Andrew Morton
Cc: Michal Hocko
Cc: Dan Williams
Cc: Jason Gunthorpe
Cc: Kees Cook
Cc: Ard Biesheuvel
Cc: Pankaj Gupta
Cc: Baoquan He
Cc: Wei Yang
Signed-off-by: David Hildenbrand
---
 include/linux/ioport.h |  4 ++--
 kernel/resource.c      | 49 --
 mm/memory_hotplug.c    | 22 +--
 3 files changed, 31 insertions(+), 44 deletions(-)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 6c2b06fe8beb7..52a91f5fa1a36 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -248,8 +248,8 @@ extern struct resource * __request_region(struct resource *,
 extern void __release_region(struct resource *, resource_size_t,
 				resource_size_t);
 #ifdef CONFIG_MEMORY_HOTREMOVE
-extern int release_mem_region_adjustable(struct resource *, resource_size_t,
-				resource_size_t);
+extern void release_mem_region_adjustable(struct resource *, resource_size_t,
+					  resource_size_t);
 #endif
 
 /* Wrappers for managed devices */
diff --git a/kernel/resource.c b/kernel/resource.c
index f1175ce93a1d5..36b3552210120 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -1258,21 +1258,28 @@ EXPORT_SYMBOL(__release_region);
  * assumes that all children remain in the lower address entry for
  * simplicity. Enhance this logic when necessary.
  */
-int release_mem_region_adjustable(struct resource *parent,
-				  resource_size_t start, resource_size_t size)
+void release_mem_region_adjustable(struct resource *parent,
+				   resource_size_t start, resource_size_t size)
 {
+	struct resource *new_res = NULL;
+	bool alloc_nofail = false;
 	struct resource **p;
 	struct resource *res;
-	struct resource *new_res;
 	resource_size_t end;
-	int ret = -EINVAL;
 
 	end = start + size - 1;
-	if ((start < parent->start) || (end > parent->end))
-		return ret;
+	if (WARN_ON_ONCE((start < parent->start) || (end > parent->end)))
+		return;
 
-	/* The alloc_resource() result gets checked later */
-	new_res = alloc_resource(GFP_KERNEL);
+	/*
+	 * We free up quite a lot of memory on memory hotunplug (esp., memmap),
+	 * just before releasing the region. This is highly unlikely to
+	 * fail - let's play safe and make it never fail as the caller cannot
+	 * perform any error handling (e.g., trying to re-add memory will fail
+	 * similarly).
+	 */
+retry:
+	new_res = alloc_resource(GFP_KERNEL | (alloc_nofail ? __GFP_NOFAIL : 0));
 
 	p = &parent->child;
 	write_lock(&resource_lock);
@@ -1298,7 +1305,6 @@ int release_mem_region_adjustable(struct resource *parent,
 		 * so if we are dealing with them, let us just back off here.
 		 */
 		if (!(res->flags & IORESOURCE_SYSRAM)) {
-			ret = 0;
 			break;
 		}
 
@@ -1315,20 +1321,23 @@ int release_mem_region_adjustable(struct resource *parent,
 			/* free the whole entry */
 			*p = res->sibling;
 			free_resource(res);
-			ret = 0;
 		} else if (res->start == start && res->end != end) {
 			/* adjust the start */
-			ret = __adjust_resource(res, end + 1,
-						res->end - end);
+			WARN_ON_ONCE(__adjust_resource(res, end + 1,
+						       res->end - end));
 		} else if (res->start != start && res->end == end) {
 			/* adjust the end */
-			ret = __adjust_resource(res, res->start,
-						start - res->start);
+			WARN_ON_ONCE(__adjust_resource(res, res->start,
+						       start - res->start));
 		} else {
-			/* split into two entries */
+			/* split into two entries - we need a new resource */
 			if (!new_res) {
-
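The four branches visible in the hunk above (free the whole entry, adjust the start, adjust the end, split) can be summarized with a small user-space sketch. The enum and `toy_classify_release()` are hypothetical names introduced purely for illustration; the real function additionally walks the resource tree under resource_lock and modifies the entries in place.

```c
#include <stdint.h>

/* Hypothetical labels for the cases handled when releasing a range. */
enum toy_release_case {
	TOY_RELEASE_OUTSIDE,	/* range is not fully inside the resource */
	TOY_RELEASE_WHOLE,	/* free the whole entry */
	TOY_RELEASE_TRIM_START,	/* adjust the start */
	TOY_RELEASE_TRIM_END,	/* adjust the end */
	TOY_RELEASE_SPLIT,	/* split into two entries */
};

/*
 * Classify releasing [start, start + size - 1] from the resource
 * [res_start, res_end] (both inclusive). Assumes size > 0.
 */
static enum toy_release_case
toy_classify_release(uint64_t res_start, uint64_t res_end,
		     uint64_t start, uint64_t size)
{
	uint64_t end = start + size - 1;

	if (start < res_start || end > res_end)
		return TOY_RELEASE_OUTSIDE;
	if (res_start == start && res_end == end)
		return TOY_RELEASE_WHOLE;
	if (res_start == start)
		return TOY_RELEASE_TRIM_START;
	if (res_end == end)
		return TOY_RELEASE_TRIM_END;
	return TOY_RELEASE_SPLIT;
}
```

Only the split case needs a second resource entry, which is why the allocation done up front matters for that branch and no other.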