Re: [PATCH 05/26] block: Allow using bio_attempt_back_merge() internally
On 2/1/24 23:30, Damien Le Moal wrote: Remove the static definition of bio_attempt_back_merge() to allow using To me this suggests that the function definition is removed entirely but that is not what this patch does ... Otherwise this patch looks good to me. Thanks, Bart.
Re: [PATCH 03/26] block: Introduce bio_straddle_zones() and bio_offset_from_zone_start()
On 2/1/24 23:30, Damien Le Moal wrote: Implement the inline helper functions bio_straddle_zones() and bio_offset_from_zone_start() to respectively test if a BIO crosses a zone boundary (the start sector and last sector belong to different zones) and to obtain the oofset from a zone starting sector of a BIO. oofset -> offset +static inline bool bio_straddle_zones(struct bio *bio) +{ + return bio_zone_no(bio) != + disk_zone_no(bio->bi_bdev->bd_disk, bio_end_sector(bio) - 1); +} It seems to me that the above code is not intended to handle the case where bi_size == 0, as is the case for an empty flush request. Should a comment be added above this function or do we perhaps need to add a WARN_ON_ONCE() statement? Thanks, Bart.
Re: [PATCH v5 00/14] dm-raid/md/raid: fix v6.7 regressions
On Thu, Feb 01, 2024 at 05:25:45PM +0800, Yu Kuai wrote: > From: Yu Kuai > I apply this patchset on top of v6.8-rc1, and run lvm2 tests suite with > folling cmd for 24 round(for about 2 days): > > for t in `ls test/shell`; do > if cat test/shell/$t | grep raid &> /dev/null; then > make check T=shell/$t > fi > done > > failed count failed test > 1 ### failed: [ndev-vanilla] shell/dmsecuretest.sh > 1 ### failed: [ndev-vanilla] shell/dmsetup-integrity-keys.sh > 1 ### failed: [ndev-vanilla] shell/dmsetup-keyring.sh > 5 ### failed: [ndev-vanilla] shell/duplicate-pvs-md0.sh > 1 ### failed: [ndev-vanilla] shell/duplicate-vgid.sh > 2 ### failed: [ndev-vanilla] shell/duplicate-vgnames.sh > 1 ### failed: [ndev-vanilla] shell/fsadm-crypt.sh > 1 ### failed: [ndev-vanilla] shell/integrity.sh > 6 ### failed: [ndev-vanilla] shell/lvchange-raid1-writemostly.sh > 2 ### failed: [ndev-vanilla] shell/lvchange-rebuild-raid.sh > 5 ### failed: [ndev-vanilla] > shell/lvconvert-raid-reshape-stripes-load-reload.sh > 4 ### failed: [ndev-vanilla] > shell/lvconvert-raid-restripe-linear.sh > 1 ### failed: [ndev-vanilla] > shell/lvconvert-raid1-split-trackchanges.sh > 20 ### failed: [ndev-vanilla] shell/lvconvert-repair-raid.sh > 20 ### failed: [ndev-vanilla] shell/lvcreate-large-raid.sh > 24 ### failed: [ndev-vanilla] shell/lvextend-raid.sh > > And I ramdomly pick some tests verified by hand that these test will > fail in v6.6 as well(not all tests): > > shell/lvextend-raid.sh > shell/lvcreate-large-raid.sh > shell/lvconvert-repair-raid.sh > shell/lvchange-rebuild-raid.sh > shell/lvchange-raid1-writemostly.sh In my testing with this patchset on top of the head of linus's tree (5c24e4e9e708) I am seeing failures in shell/lvconvert-raid-reshape-stripes-load-reload.sh and shell/lvconvert-repair-raid.sh in about 20% of my runs. I have never seen either of these these fail running on the 6.6 kernel (ffc253263a13). lvconvert-repair-raid.sh creates a raid array and then disables one if its drives before there's enough time to finish the initial sync and tries to repair it. This is supposed to fail (it uses dm-delay devices to slow down the sync). When the test succeeds, I see things like this: [ 0:13.469] #lvconvert-repair-raid.sh:161+ lvcreate --type raid10 -m 1 -i 2 -L 64 -n LV1 LVMTEST191946vg /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv1 /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv2 /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv3 /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv4 [ 0:13.469] Using default stripesize 64.00 KiB. [ 0:13.483] Logical volume "LV1" created. [ 0:14.042] 6,8908,1194343108,-;device-mapper: raid: Superblocks created for new raid set [ 0:14.042] 5,8909,1194348704,-;md/raid10:mdX: not clean -- starting background reconstruction [ 0:14.042] 6,8910,1194349443,-;md/raid10:mdX: active with 4 out of 4 devices [ 0:14.042] 4,8911,1194459161,-;mdX: bitmap file is out of date, doing full recovery [ 0:14.042] 6,8912,1194563810,-;md: resync of RAID array mdX [ 0:14.042] WARNING: This metadata update is NOT backed up. [ 0:14.042] aux disable_dev "$dev4" [ 0:14.058] #lvconvert-repair-raid.sh:163+ aux disable_dev /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv4 [ 0:14.058] Disabling device /tmp/LVMTEST191946.ImUMG6dyqB/dev/mapper/LVMTEST191946pv4 (253:5) [ 0:14.101] not lvconvert -y --repair $vg/$lv1 When it fails, I see: [ 0:13.831] #lvconvert-repair-raid.sh:161+ lvcreate --type raid10 -m 1 -i 2 -L 64 -n LV1 LVMTEST192248vg /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv1 /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv2 /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv3 /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192248pv4 [ 0:13.831] Using default stripesize 64.00 KiB. [ 0:13.847] Logical volume "LV1" created. [ 0:14.499] WARNING: This metadata update is NOT backed up. [ 0:14.499] 6,8925,1187444256,-;device-mapper: raid: Superblocks created for new raid set [ 0:14.499] 5,8926,1187449525,-;md/raid10:mdX: not clean -- starting background reconstruction [ 0:14.499] 6,8927,1187450148,-;md/raid10:mdX: active with 4 out of 4 devices [ 0:14.499] 6,8928,1187452472,-;md: resync of RAID array mdX [ 0:14.499] 6,8929,1187453016,-;md: mdX: resync done. [ 0:14.499] 4,8930,1187555486,-;mdX: bitmap file is out of date, doing full recovery [ 0:14.499] aux disable_dev "$dev4" [ 0:14.515] #lvconvert-repair-raid.sh:163+ aux disable_dev /tmp/LVMTEST192248.AT cecgSGfE/dev/mapper/LVMTEST192248pv4 [ 0:14.515] Disabling device /tmp/LVMTEST192248.ATcecgSGfE/dev/mapper/LVMTEST192 248pv4 (253:5) [ 0:14.554] not lvconvert -y --repair $vg/$lv1 To me the important looking difference (and I admi
[RFC PATCH v4 11/12] dcssblk: Cleanup alloc_dax() error handling
Now that alloc_dax() returns ERR_PTR(-EOPNOTSUPP) rather than NULL, the callers do not have to handle NULL return values anymore. Signed-off-by: Mathieu Desnoyers Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev Cc: linux-s...@vger.kernel.org --- drivers/s390/block/dcssblk.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c index a3010849bfed..f363c1d51d9a 100644 --- a/drivers/s390/block/dcssblk.c +++ b/drivers/s390/block/dcssblk.c @@ -679,8 +679,8 @@ dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char goto put_dev; dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops); - if (IS_ERR_OR_NULL(dax_dev)) { - rc = IS_ERR(dax_dev) ? PTR_ERR(dax_dev) : -EOPNOTSUPP; + if (IS_ERR(dax_dev)) { + rc = PTR_ERR(dax_dev); goto put_dev; } set_dax_synchronous(dax_dev); -- 2.39.2
[RFC PATCH v4 12/12] virtio: Cleanup alloc_dax() error handling
Now that alloc_dax() returns ERR_PTR(-EOPNOTSUPP) rather than NULL, the callers do not have to handle NULL return values anymore. Signed-off-by: Mathieu Desnoyers Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev Cc: linux-s...@vger.kernel.org --- fs/fuse/virtio_fs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index 621b1bca2d55..a28466c2da71 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -809,8 +809,8 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs) return 0; dax_dev = alloc_dax(fs, &virtio_fs_dax_ops); - if (IS_ERR_OR_NULL(dax_dev)) { - int rc = IS_ERR(dax_dev) ? PTR_ERR(dax_dev) : -EOPNOTSUPP; + if (IS_ERR(dax_dev)) { + int rc = PTR_ERR(dax_dev); return rc == -EOPNOTSUPP ? 0 : rc; } -- 2.39.2
[RFC PATCH v4 09/12] nvdimm/pmem: Cleanup alloc_dax() error handling
Now that alloc_dax() returns ERR_PTR(-EOPNOTSUPP) rather than NULL, the callers do not have to handle NULL return values anymore. Signed-off-by: Mathieu Desnoyers Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- drivers/nvdimm/pmem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index f1d9f5c6dbac..e9898457a7bd 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -558,8 +558,8 @@ static int pmem_attach_disk(struct device *dev, disk->bb = &pmem->bb; dax_dev = alloc_dax(pmem, &pmem_dax_ops); - if (IS_ERR_OR_NULL(dax_dev)) { - rc = IS_ERR(dax_dev) ? PTR_ERR(dax_dev) : -EOPNOTSUPP; + if (IS_ERR(dax_dev)) { + rc = PTR_ERR(dax_dev); if (rc != -EOPNOTSUPP) goto out; } else { -- 2.39.2
[RFC PATCH v4 10/12] dm: Cleanup alloc_dax() error handling
Now that alloc_dax() returns ERR_PTR(-EOPNOTSUPP) rather than NULL, the callers do not have to handle NULL return values anymore. Signed-off-by: Mathieu Desnoyers Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- drivers/md/dm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 2fc22cae9089..acdc00bc05be 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -2124,8 +2124,8 @@ static struct mapped_device *alloc_dev(int minor) sprintf(md->disk->disk_name, "dm-%d", minor); dax_dev = alloc_dax(md, &dm_dax_ops); - if (IS_ERR_OR_NULL(dax_dev)) { - if (IS_ERR(dax_dev) && PTR_ERR(dax_dev) != -EOPNOTSUPP) + if (IS_ERR(dax_dev)) { + if (PTR_ERR(dax_dev) != -EOPNOTSUPP) goto bad; } else { set_dax_nocache(dax_dev); -- 2.39.2
[RFC PATCH v4 07/12] Introduce cpu_dcache_is_aliasing() across all architectures
Introduce a generic way to query whether the data cache is virtually aliased on all architectures. Its purpose is to ensure that subsystems which are incompatible with virtually aliased data caches (e.g. FS_DAX) can reliably query this. For data cache aliasing, there are three scenarios dependending on the architecture. Here is a breakdown based on my understanding: A) The data cache is always aliasing: * arc * csky * m68k (note: shared memory mappings are incoherent ? SHMLBA is missing there.) * sh * parisc B) The data cache aliasing is statically known or depends on querying CPU state at runtime: * arm (cache_is_vivt() || cache_is_vipt_aliasing()) * mips (cpu_has_dc_aliases) * nios2 (NIOS2_DCACHE_SIZE > PAGE_SIZE) * sparc32 (vac_cache_size > PAGE_SIZE) * sparc64 (L1DCACHE_SIZE > PAGE_SIZE) * xtensa (DCACHE_WAY_SIZE > PAGE_SIZE) C) The data cache is never aliasing: * alpha * arm64 (aarch64) * hexagon * loongarch (but with incoherent write buffers, which are disabled since commit d23b7795 ("LoongArch: Change SHMLBA from SZ_64K to PAGE_SIZE")) * microblaze * openrisc * powerpc * riscv * s390 * um * x86 Require architectures in A) and B) to select ARCH_HAS_CPU_CACHE_ALIASING and implement "cpu_dcache_is_aliasing()". Architectures in C) don't select ARCH_HAS_CPU_CACHE_ALIASING, and thus cpu_dcache_is_aliasing() simply evaluates to "false". Note that this leaves "cpu_icache_is_aliasing()" to be implemented as future work. This would be useful to gate features like XIP on architectures which have aliasing CPU dcache-icache but not CPU dcache-dcache. Use "cpu_dcache" and "cpu_cache" rather than just "dcache" and "cache" to clarify that we really mean "CPU data cache" and "CPU cache" to eliminate any possible confusion with VFS "dentry cache" and "page cache". Link: https://lore.kernel.org/lkml/20030910210416.ga24...@mail.jlokier.co.uk/ Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Signed-off-by: Mathieu Desnoyers Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- arch/arc/Kconfig| 1 + arch/arc/include/asm/cachetype.h| 9 + arch/arm/Kconfig| 1 + arch/arm/include/asm/cachetype.h| 2 ++ arch/csky/Kconfig | 1 + arch/csky/include/asm/cachetype.h | 9 + arch/m68k/Kconfig | 1 + arch/m68k/include/asm/cachetype.h | 9 + arch/mips/Kconfig | 1 + arch/mips/include/asm/cachetype.h | 9 + arch/nios2/Kconfig | 1 + arch/nios2/include/asm/cachetype.h | 10 ++ arch/parisc/Kconfig | 1 + arch/parisc/include/asm/cachetype.h | 9 + arch/sh/Kconfig | 1 + arch/sh/include/asm/cachetype.h | 9 + arch/sparc/Kconfig | 1 + arch/sparc/include/asm/cachetype.h | 14 ++ arch/xtensa/Kconfig | 1 + arch/xtensa/include/asm/cachetype.h | 10 ++ include/linux/cacheinfo.h | 6 ++ mm/Kconfig | 6 ++ 22 files changed, 112 insertions(+) create mode 100644 arch/arc/include/asm/cachetype.h create mode 100644 arch/csky/include/asm/cachetype.h create mode 100644 arch/m68k/include/asm/cachetype.h create mode 100644 arch/mips/include/asm/cachetype.h create mode 100644 arch/nios2/include/asm/cachetype.h create mode 100644 arch/parisc/include/asm/cachetype.h create mode 100644 arch/sh/include/asm/cachetype.h create mode 100644 arch/sparc/include/asm/cachetype.h create mode 100644 arch/xtensa/include/asm/cachetype.h diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig index 1b0483c51cc1..7d294a3242a4 100644 --- a/arch/arc/Kconfig +++ b/arch/arc/Kconfig @@ -6,6 +6,7 @@ config ARC def_bool y select ARC_TIMERS + select ARCH_HAS_CPU_CACHE_ALIASING select ARCH_HAS_CACHE_LINE_SIZE select ARCH_HAS_DEBUG_VM_PGTABLE select ARCH_HAS_DMA_PREP_COHERENT diff --git a/arch/arc/include/asm/cachetype.h b/arch/arc/include/asm/cachetype.h new file mode 100644 index ..05fc7ed59712 --- /dev/null +++ b/arch/arc/include/asm/cachetype.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ASM_ARC_CACHETYPE_H +#define __ASM_ARC_CACHETYPE_H + +#include + +#define cpu_dcache_is_aliasing() true + +#endif diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index f8567e95f98b..cd13b1788973 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -5,6 +5,7 @@ config ARM select ARCH_32BIT_OFF_T select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE if HAVE_KRETPROBES && FRAME_POINTER && !ARM_UNWIND
[RFC PATCH v4 08/12] dax: Fix incorrect list of data cache aliasing architectures
commit d92576f1167c ("dax: does not work correctly with virtual aliasing caches") prevents DAX from building on architectures with virtually aliased dcache with: depends on !(ARM || MIPS || SPARC) This check is too broad (e.g. recent ARMv7 don't have virtually aliased dcaches), and also misses many other architectures with virtually aliased data cache. This is a regression introduced in the v4.0 Linux kernel where the dax mount option is removed for 32-bit ARMv7 boards which have no data cache aliasing, and therefore should work fine with FS_DAX. This was turned into the following check in alloc_dax() by a preparatory change: if (ops && (IS_ENABLED(CONFIG_ARM) || IS_ENABLED(CONFIG_MIPS) || IS_ENABLED(CONFIG_SPARC))) return NULL; Use cpu_dcache_is_aliasing() instead to figure out whether the environment has aliasing data caches. Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Signed-off-by: Mathieu Desnoyers Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- drivers/dax/super.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/dax/super.c b/drivers/dax/super.c index ce5bffa86bba..a21a7c262382 100644 --- a/drivers/dax/super.c +++ b/drivers/dax/super.c @@ -13,6 +13,7 @@ #include #include #include +#include #include "dax-private.h" /** @@ -455,9 +456,7 @@ struct dax_device *alloc_dax(void *private, const struct dax_operations *ops) * except for device-dax (NULL operations pointer), which does * not use aliased mappings from the kernel. */ - if (ops && (IS_ENABLED(CONFIG_ARM) || - IS_ENABLED(CONFIG_MIPS) || - IS_ENABLED(CONFIG_SPARC))) + if (ops && cpu_dcache_is_aliasing()) return ERR_PTR(-EOPNOTSUPP); if (WARN_ON_ONCE(ops && !ops->zero_page_range)) -- 2.39.2
[RFC PATCH v4 05/12] virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
In preparation for checking whether the architecture has data cache aliasing within alloc_dax(), modify the error handling of virtio virtio_fs_setup_dax() to treat alloc_dax() -EOPNOTSUPP failure as non-fatal. For the transition, consider that alloc_dax() returning NULL is the same as returning -EOPNOTSUPP. Co-developed-by: Dan Williams Signed-off-by: Dan Williams Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Signed-off-by: Mathieu Desnoyers Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- fs/fuse/virtio_fs.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index 5f1be1da92ce..621b1bca2d55 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include "fuse_i.h" @@ -795,8 +796,11 @@ static void virtio_fs_cleanup_dax(void *data) put_dax(dax_dev); } +DEFINE_FREE(cleanup_dax, struct dax_dev *, if (!IS_ERR_OR_NULL(_T)) virtio_fs_cleanup_dax(_T)) + static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs) { + struct dax_device *dax_dev __free(cleanup_dax) = NULL; struct virtio_shm_region cache_reg; struct dev_pagemap *pgmap; bool have_cache; @@ -804,6 +808,12 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs) if (!IS_ENABLED(CONFIG_FUSE_DAX)) return 0; + dax_dev = alloc_dax(fs, &virtio_fs_dax_ops); + if (IS_ERR_OR_NULL(dax_dev)) { + int rc = IS_ERR(dax_dev) ? PTR_ERR(dax_dev) : -EOPNOTSUPP; + return rc == -EOPNOTSUPP ? 0 : rc; + } + /* Get cache region */ have_cache = virtio_get_shm_region(vdev, &cache_reg, (u8)VIRTIO_FS_SHMCAP_ID_CACHE); @@ -849,10 +859,7 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs) dev_dbg(&vdev->dev, "%s: window kaddr 0x%px phys_addr 0x%llx len 0x%llx\n", __func__, fs->window_kaddr, cache_reg.addr, cache_reg.len); - fs->dax_dev = alloc_dax(fs, &virtio_fs_dax_ops); - if (IS_ERR(fs->dax_dev)) - return PTR_ERR(fs->dax_dev); - + fs->dax_dev = no_free_ptr(dax_dev); return devm_add_action_or_reset(&vdev->dev, virtio_fs_cleanup_dax, fs->dax_dev); } -- 2.39.2
[RFC PATCH v4 06/12] dax: Check for data cache aliasing at runtime
Replace the following fs/Kconfig:FS_DAX dependency: depends on !(ARM || MIPS || SPARC) By a runtime check within alloc_dax(). This runtime check returns ERR_PTR(-EOPNOTSUPP) if the @ops parameter is non-NULL (which means the kernel is using an aliased mapping) on an architecture which has data cache aliasing. Change the return value from NULL to PTR_ERR(-EOPNOTSUPP) for CONFIG_DAX=n for consistency. This is done in preparation for using cpu_dcache_is_aliasing() in a following change which will properly support architectures which detect data cache aliasing at runtime. Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Signed-off-by: Mathieu Desnoyers Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- drivers/dax/super.c | 15 +++ fs/Kconfig | 1 - include/linux/dax.h | 6 +- 3 files changed, 16 insertions(+), 6 deletions(-) diff --git a/drivers/dax/super.c b/drivers/dax/super.c index 0da9232ea175..ce5bffa86bba 100644 --- a/drivers/dax/super.c +++ b/drivers/dax/super.c @@ -319,6 +319,11 @@ EXPORT_SYMBOL_GPL(dax_alive); * that any fault handlers or operations that might have seen * dax_alive(), have completed. Any operations that start after * synchronize_srcu() has run will abort upon seeing !dax_alive(). + * + * Note, because alloc_dax() returns an ERR_PTR() on error, callers + * typically store its result into a local variable in order to check + * the result. Therefore, care must be taken to populate the struct + * device dax_dev field make sure the dax_dev is not leaked. */ void kill_dax(struct dax_device *dax_dev) { @@ -445,6 +450,16 @@ struct dax_device *alloc_dax(void *private, const struct dax_operations *ops) dev_t devt; int minor; + /* +* Unavailable on architectures with virtually aliased data caches, +* except for device-dax (NULL operations pointer), which does +* not use aliased mappings from the kernel. +*/ + if (ops && (IS_ENABLED(CONFIG_ARM) || + IS_ENABLED(CONFIG_MIPS) || + IS_ENABLED(CONFIG_SPARC))) + return ERR_PTR(-EOPNOTSUPP); + if (WARN_ON_ONCE(ops && !ops->zero_page_range)) return ERR_PTR(-EINVAL); diff --git a/fs/Kconfig b/fs/Kconfig index 42837617a55b..e5efdb3b276b 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -56,7 +56,6 @@ endif # BLOCK config FS_DAX bool "File system based Direct Access (DAX) support" depends on MMU - depends on !(ARM || MIPS || SPARC) depends on ZONE_DEVICE || FS_DAX_LIMITED select FS_IOMAP select DAX diff --git a/include/linux/dax.h b/include/linux/dax.h index b463502b16e1..df2d52b8a245 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -86,11 +86,7 @@ static inline void *dax_holder(struct dax_device *dax_dev) static inline struct dax_device *alloc_dax(void *private, const struct dax_operations *ops) { - /* -* Callers should check IS_ENABLED(CONFIG_DAX) to know if this -* NULL is an error or expected. -*/ - return NULL; + return ERR_PTR(-EOPNOTSUPP); } static inline void put_dax(struct dax_device *dax_dev) { -- 2.39.2
[RFC PATCH v4 04/12] dcssblk: Handle alloc_dax() -EOPNOTSUPP failure
In preparation for checking whether the architecture has data cache aliasing within alloc_dax(), modify the error handling of dcssblk dcssblk_add_store() to handle alloc_dax() -EOPNOTSUPP failures. Considering that s390 is not a data cache aliasing architecture, and considering that DCSSBLK selects DAX, a return value of -EOPNOTSUPP from alloc_dax() should make dcssblk_add_store() fail. For the transition, consider that alloc_dax() returning NULL is the same as returning -EOPNOTSUPP. Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Signed-off-by: Mathieu Desnoyers Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev Cc: linux-s...@vger.kernel.org --- drivers/s390/block/dcssblk.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c index 4b7ecd4fd431..a3010849bfed 100644 --- a/drivers/s390/block/dcssblk.c +++ b/drivers/s390/block/dcssblk.c @@ -549,6 +549,7 @@ dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char int rc, i, j, num_of_segments; struct dcssblk_dev_info *dev_info; struct segment_info *seg_info, *temp; + struct dax_device *dax_dev; char *local_buf; unsigned long seg_byte_size; @@ -677,13 +678,13 @@ dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char if (rc) goto put_dev; - dev_info->dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops); - if (IS_ERR(dev_info->dax_dev)) { - rc = PTR_ERR(dev_info->dax_dev); - dev_info->dax_dev = NULL; + dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops); + if (IS_ERR_OR_NULL(dax_dev)) { + rc = IS_ERR(dax_dev) ? PTR_ERR(dax_dev) : -EOPNOTSUPP; goto put_dev; } - set_dax_synchronous(dev_info->dax_dev); + set_dax_synchronous(dax_dev); + dev_info->dax_dev = dax_dev; rc = dax_add_host(dev_info->dax_dev, dev_info->gd); if (rc) goto out_dax; -- 2.39.2
[RFC PATCH v4 00/12] Introduce cpu_dcache_is_aliasing() to fix DAX regression
This commit introduced in v4.0 prevents building FS_DAX on 32-bit ARM, even on ARMv7 which does not have virtually aliased data caches: commit d92576f1167c ("dax: does not work correctly with virtual aliasing caches") It used to work fine before: I have customers using DAX over pmem on ARMv7, but this regression will likely prevent them from upgrading their kernel. The root of the issue here is the fact that DAX was never designed to handle virtually aliasing data caches (VIVT and VIPT with aliasing data cache). It touches the pages through their linear mapping, which is not consistent with the userspace mappings with virtually aliasing data caches. This patch series introduces cpu_dcache_is_aliasing() with the new Kconfig option ARCH_HAS_CPU_CACHE_ALIASING and implements it for all architectures. The implementation of cpu_dcache_is_aliasing() is either evaluated to a constant at compile-time or a runtime check, which is what is needed on ARM. With this we can basically narrow down the list of architectures which are unsupported by DAX to those which are really affected. Testing done so far: - Compile allyesconfig on x86-64, - Compile allyesconfig on x86-64, with FS_DAX=n. - Compile allyesconfig on x86-64, with DAX=n. - Boot test after modifying alloc_dax() to force returning -EOPNOTSUPP even on x86-64, thus simulating the behavior expected on an architecture with data cache aliasing. There are many more axes to test however. I would welcome Tested-by for: - affected architectures, - affected drivers, - affected filesytems. Thanks, Mathieu Changes since v3: - Fix a leak on dax_add_host() failure in nvdimm/pmem. - Split the series into a bissectable sequence of changes. - Ensure that device-dax use-cases still works on data cache aliasing architectures. Changes since v2: - Move DAX supported runtime check to alloc_dax(), - Modify DM to handle alloc_dax() error as non-fatal, - Remove all filesystem modifications, since the check is now done by alloc_dax(), - rename "dcache" and "cache" to "cpu dcache" and "cpu cache" to eliminate confusion with VFS terminology. Changes since v1: - The order of the series was completely changed based on the feedback received on v1, - cache_is_aliasing() is renamed to dcache_is_aliasing(), - ARCH_HAS_CACHE_ALIASING_DYNAMIC is gone, - dcache_is_aliasing() vs ARCH_HAS_CACHE_ALIASING relationship is simplified, - the dax_is_supported() check was moved to its rightful place in all filesystems. Cc: Andrew Morton Cc: Linus Torvalds Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev Cc: linux-s...@vger.kernel.org Mathieu Desnoyers (12): nvdimm/pmem: Fix leak on dax_add_host() failure nvdimm/pmem: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal dm: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal dcssblk: Handle alloc_dax() -EOPNOTSUPP failure virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal dax: Check for data cache aliasing at runtime Introduce cpu_dcache_is_aliasing() across all architectures dax: Fix incorrect list of data cache aliasing architectures nvdimm/pmem: Cleanup alloc_dax() error handling dm: Cleanup alloc_dax() error handling dcssblk: Cleanup alloc_dax() error handling virtio: Cleanup alloc_dax() error handling arch/arc/Kconfig| 1 + arch/arc/include/asm/cachetype.h| 9 + arch/arm/Kconfig| 1 + arch/arm/include/asm/cachetype.h| 2 ++ arch/csky/Kconfig | 1 + arch/csky/include/asm/cachetype.h | 9 + arch/m68k/Kconfig | 1 + arch/m68k/include/asm/cachetype.h | 9 + arch/mips/Kconfig | 1 + arch/mips/include/asm/cachetype.h | 9 + arch/nios2/Kconfig | 1 + arch/nios2/include/asm/cachetype.h | 10 ++ arch/parisc/Kconfig | 1 + arch/parisc/include/asm/cachetype.h | 9 + arch/sh/Kconfig | 1 + arch/sh/include/asm/cachetype.h | 9 + arch/sparc/Kconfig | 1 + arch/sparc/include/asm/cachetype.h | 14 ++ arch/xtensa/Kconfig | 1 + arch/xtensa/include/asm/cachetype.h | 10 ++ drivers/dax/super.c | 14 ++ drivers/md/dm.c | 17 + drivers/nvdimm/pmem.c | 23 --- drivers/s390/block/dcssblk.c| 11 ++- fs/Kconfig | 1 - fs/fuse/virtio_fs.c | 15 +++ include/linux/cacheinfo.h | 6 ++ include/linux/dax.h | 6 +- mm/Kconfig | 6 ++ 29 files changed, 165 insertions(+), 34 deleti
[RFC PATCH v4 03/12] dm: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
In preparation for checking whether the architecture has data cache aliasing within alloc_dax(), modify the error handling of dm alloc_dev() to treat alloc_dax() -EOPNOTSUPP failure as non-fatal. For the transition, consider that alloc_dax() returning NULL is the same as returning -EOPNOTSUPP. Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Suggested-by: Dan Williams Signed-off-by: Mathieu Desnoyers Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- drivers/md/dm.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 23c32cd1f1d8..2fc22cae9089 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -2054,6 +2054,7 @@ static void cleanup_mapped_device(struct mapped_device *md) static struct mapped_device *alloc_dev(int minor) { int r, numa_node_id = dm_get_numa_node(); + struct dax_device *dax_dev; struct mapped_device *md; void *old_md; @@ -2122,15 +2123,15 @@ static struct mapped_device *alloc_dev(int minor) md->disk->private_data = md; sprintf(md->disk->disk_name, "dm-%d", minor); - if (IS_ENABLED(CONFIG_FS_DAX)) { - md->dax_dev = alloc_dax(md, &dm_dax_ops); - if (IS_ERR(md->dax_dev)) { - md->dax_dev = NULL; + dax_dev = alloc_dax(md, &dm_dax_ops); + if (IS_ERR_OR_NULL(dax_dev)) { + if (IS_ERR(dax_dev) && PTR_ERR(dax_dev) != -EOPNOTSUPP) goto bad; - } - set_dax_nocache(md->dax_dev); - set_dax_nomc(md->dax_dev); - if (dax_add_host(md->dax_dev, md->disk)) + } else { + set_dax_nocache(dax_dev); + set_dax_nomc(dax_dev); + md->dax_dev = dax_dev; + if (dax_add_host(dax_dev, md->disk)) goto bad; } -- 2.39.2
[RFC PATCH v4 02/12] nvdimm/pmem: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
In preparation for checking whether the architecture has data cache aliasing within alloc_dax(), modify the error handling of nvdimm/pmem pmem_attach_disk() to treat alloc_dax() -EOPNOTSUPP failure as non-fatal. For the transition, consider that alloc_dax() returning NULL is the same as returning -EOPNOTSUPP. Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Signed-off-by: Mathieu Desnoyers Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- drivers/nvdimm/pmem.c | 26 ++ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index 9fe358090720..f1d9f5c6dbac 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -558,19 +558,21 @@ static int pmem_attach_disk(struct device *dev, disk->bb = &pmem->bb; dax_dev = alloc_dax(pmem, &pmem_dax_ops); - if (IS_ERR(dax_dev)) { - rc = PTR_ERR(dax_dev); - goto out; + if (IS_ERR_OR_NULL(dax_dev)) { + rc = IS_ERR(dax_dev) ? PTR_ERR(dax_dev) : -EOPNOTSUPP; + if (rc != -EOPNOTSUPP) + goto out; + } else { + set_dax_nocache(dax_dev); + set_dax_nomc(dax_dev); + if (is_nvdimm_sync(nd_region)) + set_dax_synchronous(dax_dev); + pmem->dax_dev = dax_dev; + rc = dax_add_host(dax_dev, disk); + if (rc) + goto out_cleanup_dax; + dax_write_cache(dax_dev, nvdimm_has_cache(nd_region)); } - set_dax_nocache(dax_dev); - set_dax_nomc(dax_dev); - if (is_nvdimm_sync(nd_region)) - set_dax_synchronous(dax_dev); - pmem->dax_dev = dax_dev; - rc = dax_add_host(dax_dev, disk); - if (rc) - goto out_cleanup_dax; - dax_write_cache(dax_dev, nvdimm_has_cache(nd_region)); rc = device_add_disk(dev, disk, pmem_attribute_groups); if (rc) goto out_remove_host; -- 2.39.2
[RFC PATCH v4 01/12] nvdimm/pmem: Fix leak on dax_add_host() failure
Fix a leak on dax_add_host() error, where "goto out_cleanup_dax" is done before setting pmem->dax_dev, which therefore issues the two following calls on NULL pointers: out_cleanup_dax: kill_dax(pmem->dax_dev); put_dax(pmem->dax_dev); Signed-off-by: Mathieu Desnoyers Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- drivers/nvdimm/pmem.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index 4e8fdcb3f1c8..9fe358090720 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -566,12 +566,11 @@ static int pmem_attach_disk(struct device *dev, set_dax_nomc(dax_dev); if (is_nvdimm_sync(nd_region)) set_dax_synchronous(dax_dev); + pmem->dax_dev = dax_dev; rc = dax_add_host(dax_dev, disk); if (rc) goto out_cleanup_dax; dax_write_cache(dax_dev, nvdimm_has_cache(nd_region)); - pmem->dax_dev = dax_dev; - rc = device_add_disk(dev, disk, pmem_attribute_groups); if (rc) goto out_remove_host; -- 2.39.2
Re: [git pull] device mapper fixes for 6.8-rc3
The pull request you sent on Fri, 2 Feb 2024 13:00:38 -0500: > git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git > tags/for-6.8/dm-fixes has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/6897cea7183762b4bc97b0ed1b75274ece9d518b Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [RFC PATCH v3 2/4] dax: Check for data cache aliasing at runtime
On 2024-02-02 15:14, Dan Williams wrote: Mathieu Desnoyers wrote: [..] Thanks for that. All of those need to be done before the fs goes live later in virtio_device_ready(), but before that point nothing should be calling into virtio_fs_dax_ops, so as far as I can see it is safe to change the order. Sounds good, I'll do that. I will soon be ready to send out a RFC v4, which is still only compiled-tested. Do you happen to have some kind of test suite you can use to automate some of the runtime testing ? There is a test suite for the pmem, dm, and dax changes (https://github.com/pmem/ndctl?tab=readme-ov-file#unit-tests), but not automated unfortunately. The NVDIMM maintainer team will run that before pushing patches out to the fixes branch if you just want to lean on that. For the rest I think we will need to depend on tested-by's from s390 + virtio_fs folks, and / or sufficient soak time in linux-next. I suspect this will be necessary. There are just so many combinations of architectures, drivers and filesystems involved here that I don't think it is realistic to try to do all this testing manually on my own. I prefer to voice this up front, so there are no misplaced expectations about testing. Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com
Re: [RFC PATCH v3 2/4] dax: Check for data cache aliasing at runtime
Mathieu Desnoyers wrote: [..] > > Thanks for that. All of those need to be done before the fs goes live > > later in virtio_device_ready(), but before that point nothing should be > > calling into virtio_fs_dax_ops, so as far as I can see it is safe to > > change the order. > > Sounds good, I'll do that. > > I will soon be ready to send out a RFC v4, which is still only > compiled-tested. Do you happen to have some kind of test suite > you can use to automate some of the runtime testing ? There is a test suite for the pmem, dm, and dax changes (https://github.com/pmem/ndctl?tab=readme-ov-file#unit-tests), but not automated unfortunately. The NVDIMM maintainer team will run that before pushing patches out to the fixes branch if you just want to lean on that. For the rest I think we will need to depend on tested-by's from s390 + virtio_fs folks, and / or sufficient soak time in linux-next.
Re: [RFC PATCH v3 2/4] dax: Check for data cache aliasing at runtime
On 2024-02-02 14:41, Dan Williams wrote: Mathieu Desnoyers wrote: On 2024-02-02 12:37, Dan Williams wrote: Mathieu Desnoyers wrote: [...] The alternative route I intend to take is to audit all callers of alloc_dax() and make sure they all save the alloc_dax() return value in a struct dax_device * local variable first for the sake of checking for IS_ERR(). This will leave the xyz->dax_dev pointer initialized to NULL in the error case and simplify the rest of error checking. I could maybe get on board with that, but it needs a comment somewhere about the asymmetric subtlety. Is this "somewhere" at every alloc_dax() call site, or do you have something else in mind ? At least kill_dax() should mention the asymmetry I think. Here is what I intend to add: * Note, because alloc_dax() returns an ERR_PTR() on error, callers * typically store its result into a local variable in order to check * the result. Therefore, care must be taken to populate the struct * device dax_dev field make sure the dax_dev is not leaked. The real question is what to do about device-dax. I *think* it is not affected by cpu_dcache aliasing because it never accesses user mappings through a kernel alias. I doubt device-dax is in use on these platforms, but we might need another fixup for that if someone screams about the alloc_dax() behavior change making them lose device-dax access. By "device-dax", I understand you mean drivers/dax/Kconfig:DEV_DAX. Based on your analysis, is alloc_dax() still the right spot where to place this runtime check ? Which call sites are responsible for invoking alloc_dax() for device-dax ? That is in devm_create_dev_dax(). If we know which call sites do not intend to use the kernel linear mapping, we could introduce a flag (or a new variant of the alloc_dax() API) that would either enforce or skip the check. Hmmm, it looks like there is already a natural flag for that. If alloc_dax() is passed a NULL operations pointer it means there are no kernel usages of the aliased mapping. That actually fits rather nicely. Good, I was reaching the same conclusion when I received your reply. I'll do that. It ends up being: /* * Unavailable on architectures with virtually aliased data caches, * except for device-dax (NULL operations pointer), which does * not use aliased mappings from the kernel. */ if (ops && cpu_dcache_is_aliasing()) return ERR_PTR(-EOPNOTSUPP); [..] @@ -804,6 +808,15 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs) if (!IS_ENABLED(CONFIG_FUSE_DAX)) return 0; + dax_dev = alloc_dax(fs, &virtio_fs_dax_ops); + if (IS_ERR(dax_dev)) { + int rc = PTR_ERR(dax_dev); + + if (rc == -EOPNOTSUPP) + return 0; + return rc; + } What is gained by moving this allocation here ? The gain is to fail early in virtio_fs_setup_dax() since the fundamental dependency of alloc_dax() success is not met. For example why let the setup progress to devm_memremap_pages() when alloc_dax() is going to return ERR_PTR(-EOPNOTSUPP). What I don't know is whether there is a dependency requiring to do devm_request_mem_region(), devm_kzalloc(), devm_memremap_pages() before calling alloc_dax() ? Those 3 calls are used to populate: fs->window_phys_addr = (phys_addr_t) cache_reg.addr; fs->window_len = (phys_addr_t) cache_reg.len; and then alloc_dax() takes "fs" as private data parameter. So it's unclear to me whether we can swap the invocation order. I suspect that it is not an issue because it is only used to populate dax_dev->private, but I prefer to confirm this with you just to be on the safe side. Thanks for that. All of those need to be done before the fs goes live later in virtio_device_ready(), but before that point nothing should be calling into virtio_fs_dax_ops, so as far as I can see it is safe to change the order. Sounds good, I'll do that. I will soon be ready to send out a RFC v4, which is still only compiled-tested. Do you happen to have some kind of test suite you can use to automate some of the runtime testing ? Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com
Re: [RFC PATCH v3 2/4] dax: Check for data cache aliasing at runtime
Mathieu Desnoyers wrote: > On 2024-02-02 12:37, Dan Williams wrote: > > Mathieu Desnoyers wrote: > [...] > >> > > > >> The alternative route I intend to take is to audit all callers > >> of alloc_dax() and make sure they all save the alloc_dax() return > >> value in a struct dax_device * local variable first for the sake > >> of checking for IS_ERR(). This will leave the xyz->dax_dev pointer > >> initialized to NULL in the error case and simplify the rest of > >> error checking. > > > > I could maybe get on board with that, but it needs a comment somewhere > > about the asymmetric subtlety. > > Is this "somewhere" at every alloc_dax() call site, or do you have > something else in mind ? At least kill_dax() should mention the asymmetry I think. > > > > >> > >> > >>> return; > >>> > >>> if (dax_dev->holder_data != NULL) > >>> diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c > >>> index 4e8fdcb3f1c8..b69c9e442cf4 100644 > >>> --- a/drivers/nvdimm/pmem.c > >>> +++ b/drivers/nvdimm/pmem.c > >>> @@ -560,17 +560,19 @@ static int pmem_attach_disk(struct device *dev, > >>> dax_dev = alloc_dax(pmem, &pmem_dax_ops); > >>> if (IS_ERR(dax_dev)) { > >>> rc = PTR_ERR(dax_dev); > >>> - goto out; > >>> + if (rc != -EOPNOTSUPP) > >>> + goto out; > >> > >> If I compare the before / after this change, if previously > >> pmem_attach_disk() was called in a configuration with FS_DAX=n, it would > >> result in a NULL pointer dereference. > > > > No, alloc_dax() only returns NULL CONFIG_DAX=n case, not the > > CONFIG_FS_DAX=n case. > > Indeed, I was wrong there. > > > So that means that pmem devices on ARM have been > > possible without FS_DAX. So, in order for alloc_dax() returning > > ERR_PTR(-EOPNOTSUPP) to not regress pmem device availability this error > > path needs to be changed. > Good point. We're moving the depends on !(ARM || MIPS |PARC) from FS_DAX > Kconfig to a runtime check in alloc_dax(), which is used whenever DAX=y, > which includes configurations that had FS_DAX=n previously. > > I'll change the error path in pmem_attack_disk to treat -EOPNOTSUPP > alloc_dax() return value as non-fatal. > > > > >> This would be an error handling fix all by itself. Do we really want > >> to return successfully if dax is unsupported, or should we return > >> an error here ? > > > > Per above, there is no error handling fix, and pmem block device > > available should not depend on alloc_dax() succeeding. > > I agree on treating alloc_dax() failure as non-fatal. There is > however one error handling fix to nvdimm/pmem which I plan to > introduce as an initial patch before this change: > > nvdimm/pmem: Fix leak on dax_add_host() failure > > Fix a leak on dax_add_host() error, where "goto out_cleanup_dax" is done > before setting pmem->dax_dev, which therefore issues the two following > calls on NULL pointers: > > out_cleanup_dax: > kill_dax(pmem->dax_dev); > put_dax(pmem->dax_dev); > > Signed-off-by: Mathieu Desnoyers > > diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c > index 4e8fdcb3f1c8..9fe358090720 100644 > --- a/drivers/nvdimm/pmem.c > +++ b/drivers/nvdimm/pmem.c > @@ -566,12 +566,11 @@ static int pmem_attach_disk(struct device *dev, > set_dax_nomc(dax_dev); > if (is_nvdimm_sync(nd_region)) > set_dax_synchronous(dax_dev); > + pmem->dax_dev = dax_dev; > rc = dax_add_host(dax_dev, disk); > if (rc) > goto out_cleanup_dax; > dax_write_cache(dax_dev, nvdimm_has_cache(nd_region)); > - pmem->dax_dev = dax_dev; > - > rc = device_add_disk(dev, disk, pmem_attribute_groups); > if (rc) > goto out_remove_host; Yup, looks good. > > The real question is what to do about device-dax. I *think* it is not > > affected by cpu_dcache aliasing because it never accesses user mappings > > through a kernel alias. I doubt device-dax is in use on these platforms, > > but we might need another fixup for that if someone screams about the > > alloc_dax() behavior change making them lose device-dax access. > > By "device-dax", I understand you mean drivers/dax/Kconfig:DEV_DAX. > > Based on your analysis, is alloc_dax() still the right spot where > to place this runtime check ? Which call sites are responsible > for invoking alloc_dax() for device-dax ? That is in devm_create_dev_dax(). > If we know which call sites do not intend to use the kernel linear > mapping, we could introduce a flag (or a new variant of the alloc_dax() > API) that would either enforce or skip the check. Hmmm, it looks like there is already a natural flag for that. If alloc_dax() is passed a NULL operations pointer it means there are no kernel usages of the aliased mapping. That actually fits rather nicely. [..] > >>> @@ -804,6 +808,15 @@ static int virtio_fs_setup_dax(struct
Re: [RFC PATCH v3 2/4] dax: Check for data cache aliasing at runtime
On 2024-02-02 12:37, Dan Williams wrote: Mathieu Desnoyers wrote: [...] The alternative route I intend to take is to audit all callers of alloc_dax() and make sure they all save the alloc_dax() return value in a struct dax_device * local variable first for the sake of checking for IS_ERR(). This will leave the xyz->dax_dev pointer initialized to NULL in the error case and simplify the rest of error checking. I could maybe get on board with that, but it needs a comment somewhere about the asymmetric subtlety. Is this "somewhere" at every alloc_dax() call site, or do you have something else in mind ? return; if (dax_dev->holder_data != NULL) diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index 4e8fdcb3f1c8..b69c9e442cf4 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -560,17 +560,19 @@ static int pmem_attach_disk(struct device *dev, dax_dev = alloc_dax(pmem, &pmem_dax_ops); if (IS_ERR(dax_dev)) { rc = PTR_ERR(dax_dev); - goto out; + if (rc != -EOPNOTSUPP) + goto out; If I compare the before / after this change, if previously pmem_attach_disk() was called in a configuration with FS_DAX=n, it would result in a NULL pointer dereference. No, alloc_dax() only returns NULL CONFIG_DAX=n case, not the CONFIG_FS_DAX=n case. Indeed, I was wrong there. So that means that pmem devices on ARM have been possible without FS_DAX. So, in order for alloc_dax() returning ERR_PTR(-EOPNOTSUPP) to not regress pmem device availability this error path needs to be changed. Good point. We're moving the depends on !(ARM || MIPS |PARC) from FS_DAX Kconfig to a runtime check in alloc_dax(), which is used whenever DAX=y, which includes configurations that had FS_DAX=n previously. I'll change the error path in pmem_attack_disk to treat -EOPNOTSUPP alloc_dax() return value as non-fatal. This would be an error handling fix all by itself. Do we really want to return successfully if dax is unsupported, or should we return an error here ? Per above, there is no error handling fix, and pmem block device available should not depend on alloc_dax() succeeding. I agree on treating alloc_dax() failure as non-fatal. There is however one error handling fix to nvdimm/pmem which I plan to introduce as an initial patch before this change: nvdimm/pmem: Fix leak on dax_add_host() failure Fix a leak on dax_add_host() error, where "goto out_cleanup_dax" is done before setting pmem->dax_dev, which therefore issues the two following calls on NULL pointers: out_cleanup_dax: kill_dax(pmem->dax_dev); put_dax(pmem->dax_dev); Signed-off-by: Mathieu Desnoyers diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index 4e8fdcb3f1c8..9fe358090720 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -566,12 +566,11 @@ static int pmem_attach_disk(struct device *dev, set_dax_nomc(dax_dev); if (is_nvdimm_sync(nd_region)) set_dax_synchronous(dax_dev); + pmem->dax_dev = dax_dev; rc = dax_add_host(dax_dev, disk); if (rc) goto out_cleanup_dax; dax_write_cache(dax_dev, nvdimm_has_cache(nd_region)); - pmem->dax_dev = dax_dev; - rc = device_add_disk(dev, disk, pmem_attribute_groups); if (rc) goto out_remove_host; The real question is what to do about device-dax. I *think* it is not affected by cpu_dcache aliasing because it never accesses user mappings through a kernel alias. I doubt device-dax is in use on these platforms, but we might need another fixup for that if someone screams about the alloc_dax() behavior change making them lose device-dax access. By "device-dax", I understand you mean drivers/dax/Kconfig:DEV_DAX. Based on your analysis, is alloc_dax() still the right spot where to place this runtime check ? Which call sites are responsible for invoking alloc_dax() for device-dax ? If we know which call sites do not intend to use the kernel linear mapping, we could introduce a flag (or a new variant of the alloc_dax() API) that would either enforce or skip the check. [...] Here what I'm seeing so far: - devm_release_mem_region() is never called after devm_request_mem_region(). Not on error, neither on teardown, devm_release_mem_region() is called from virtio_fs_probe() context. That I guess you mean "devm_request_mem_region()" here. means that when virtio_fs_probe() returns an error the driver core will automatically call devm_request_mem_region(). And "devm_release_mem_region()" here. - pgmap is never freed on error after devm_kzalloc. That is what the "devm_" in devm_kzalloc() does, free the memory on driver-probe failure, or after the driver remove callback is invoked. Got it. { + struct dax_device *dax_dev __free(cleanup_dax) = NULL; struct vi
[git pull] device mapper fixes for 6.8-rc3
Hi Linus, The following changes since commit 41bccc98fb7931d63d03f326a746ac4d429c1dd3: Linux 6.8-rc2 (2024-01-28 17:01:12 -0800) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git tags/for-6.8/dm-fixes for you to fetch changes up to 0a9bab391e336489169b95cb0d4553d921302189: dm-crypt, dm-verity: disable tasklets (2024-02-02 12:33:50 -0500) Please pull, thanks. Mike - Fix DM ioctl interface to avoid INT_MAX overflow warnings from kvmalloc by limiting the number of targets and parameter size area. - Fix DM stats to avoid INT_MAX overflow warnings from kvmalloc by limiting the number of entries supported. - Fix DM writecache to support mapping devices larger than 1 TiB by switching from using kvmalloc_array to vmalloc_array -- which avoids INT_MAX overflow in kvmalloc_node and associated warnings. - Remove the (ab)use of tasklets from both the DM crypt and verity targets. They will be converted to use BH workqueue in future. -BEGIN PGP SIGNATURE- iQEzBAABCAAdFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAmW9LCsACgkQxSPxCi2d A1qILwgAmL9XOtavSKJ/8o9scJygutYpNSLE0f6mdkdCgJB2nknJ1vR38bXyDpNr X3s6QC5TqKTtG7DtRTfnZc8zgtBHajjUZTFBu1NUF9kgNQcrjG3jW+quZ51pxkV0 1rvzOiYts6ca8csbFViMPS9FJVq1h3PnAyrkhI0SUS7+jEvDZy/QIX4DP20ye9SX wKguOSK544haSLHPNYuZqqCEoTBF+Vh1k1gDkxr594NwjsIJJK0+HGelamjzN/96 /jr88P4bm/6OIVdvwTUefnpIhNIum1Dfa8QWciKOzuct0jqsub65+SUSoTLmoiY4 /3AZDvp0ZMEwpMAvCIyvHnm81K72MA== =sioN -END PGP SIGNATURE- Mikulas Patocka (4): dm: limit the number of targets and parameter size area dm stats: limit the number of entries dm writecache: allow allocations larger than 2GiB dm-crypt, dm-verity: disable tasklets drivers/md/dm-core.h | 2 ++ drivers/md/dm-crypt.c | 38 ++ drivers/md/dm-ioctl.c | 3 ++- drivers/md/dm-stats.c | 9 + drivers/md/dm-table.c | 9 +++-- drivers/md/dm-verity-target.c | 26 ++ drivers/md/dm-verity.h| 1 - drivers/md/dm-writecache.c| 8 8 files changed, 28 insertions(+), 68 deletions(-)
Re: [RFC PATCH v3 2/4] dax: Check for data cache aliasing at runtime
Mathieu Desnoyers wrote: [..] > The change for -EOPNOTSUPP in header and implementation would look like > this (more questions below): > > diff --git a/include/linux/dax.h b/include/linux/dax.h > index b463502b16e1..df2d52b8a245 100644 > --- a/include/linux/dax.h > +++ b/include/linux/dax.h > @@ -86,11 +86,7 @@ static inline void *dax_holder(struct dax_device *dax_dev) > static inline struct dax_device *alloc_dax(void *private, > const struct dax_operations *ops) > { > - /* > - * Callers should check IS_ENABLED(CONFIG_DAX) to know if this > - * NULL is an error or expected. > - */ > - return NULL; > + return ERR_PTR(-EOPNOTSUPP); > } > static inline void put_dax(struct dax_device *dax_dev) > { > diff --git a/drivers/dax/super.c b/drivers/dax/super.c > index 8c3a6e8e6334..c1cf6f0bbe12 100644 > --- a/drivers/dax/super.c > +++ b/drivers/dax/super.c > @@ -448,7 +448,7 @@ struct dax_device *alloc_dax(void *private, const struct > dax_operations *ops) > > /* Unavailable on architectures with virtually aliased data caches. */ > if (cpu_dcache_is_aliasing()) > - return NULL; > + return ERR_PTR(-EOPNOTSUPP); > > if (WARN_ON_ONCE(ops && !ops->zero_page_range)) > return ERR_PTR(-EINVAL); > > > > > diff --git a/drivers/dax/super.c b/drivers/dax/super.c > > index f4b635526345..254d3b1e420e 100644 > > --- a/drivers/dax/super.c > > +++ b/drivers/dax/super.c > > @@ -322,7 +322,7 @@ EXPORT_SYMBOL_GPL(dax_alive); > >*/ > > void kill_dax(struct dax_device *dax_dev) > > { > > - if (!dax_dev) > > + if (IS_ERR_OR_NULL(dax_dev)) > > I am tempted to just leave the "if (!dax_dev)" check here, because > many other functions of this API are only protected by a NULL > pointer check. I would hate to forget to convert one check in this > change, and I don't think it simplifies anything. It's not that it simplifies anything, but mistakes fail safely as a memory leak rather than a crash. > The alternative route I intend to take is to audit all callers > of alloc_dax() and make sure they all save the alloc_dax() return > value in a struct dax_device * local variable first for the sake > of checking for IS_ERR(). This will leave the xyz->dax_dev pointer > initialized to NULL in the error case and simplify the rest of > error checking. I could maybe get on board with that, but it needs a comment somewhere about the asymmetric subtlety. > > > > return; > > > > if (dax_dev->holder_data != NULL) > > diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c > > index 4e8fdcb3f1c8..b69c9e442cf4 100644 > > --- a/drivers/nvdimm/pmem.c > > +++ b/drivers/nvdimm/pmem.c > > @@ -560,17 +560,19 @@ static int pmem_attach_disk(struct device *dev, > > dax_dev = alloc_dax(pmem, &pmem_dax_ops); > > if (IS_ERR(dax_dev)) { > > rc = PTR_ERR(dax_dev); > > - goto out; > > + if (rc != -EOPNOTSUPP) > > + goto out; > > If I compare the before / after this change, if previously > pmem_attach_disk() was called in a configuration with FS_DAX=n, it would > result in a NULL pointer dereference. No, alloc_dax() only returns NULL CONFIG_DAX=n case, not the CONFIG_FS_DAX=n case. So that means that pmem devices on ARM have been possible without FS_DAX. So, in order for alloc_dax() returning ERR_PTR(-EOPNOTSUPP) to not regress pmem device availability this error path needs to be changed. > This would be an error handling fix all by itself. Do we really want > to return successfully if dax is unsupported, or should we return > an error here ? Per above, there is no error handling fix, and pmem block device available should not depend on alloc_dax() succeeding. The real question is what to do about device-dax. I *think* it is not affected by cpu_dcache aliasing because it never accesses user mappings through a kernel alias. I doubt device-dax is in use on these platforms, but we might need another fixup for that if someone screams about the alloc_dax() behavior change making them lose device-dax access. > > + } else { > > + set_dax_nocache(dax_dev); > > + set_dax_nomc(dax_dev); > > + if (is_nvdimm_sync(nd_region)) > > + set_dax_synchronous(dax_dev); > > + rc = dax_add_host(dax_dev, disk); > > + if (rc) > > + goto out_cleanup_dax; > > + dax_write_cache(dax_dev, nvdimm_has_cache(nd_region)); > > + pmem->dax_dev = dax_dev; > > } > > - set_dax_nocache(dax_dev); > > - set_dax_nomc(dax_dev); > > - if (is_nvdimm_sync(nd_region)) > > - set_dax_synchronous(dax_dev); > > - rc = dax_add_host(dax_dev, disk); > > - if (rc) > > - goto out_cleanup_dax; > > - dax_write_cache(dax_dev, nvdimm_has_cache(nd_region)); > > - pmem->dax_dev = dax_dev; > > > > rc = device_add_disk(dev, disk, pmem_attribute_groups); > > if (rc) > > d
Re: [RFC PATCH v3 2/4] dax: Check for data cache aliasing at runtime
On 2024-02-01 10:44, Mathieu Desnoyers wrote: On 2024-01-31 17:18, Dan Williams wrote: [...] diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index 5f1be1da92ce..11053a70f5ab 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include "fuse_i.h" @@ -795,8 +796,11 @@ static void virtio_fs_cleanup_dax(void *data) put_dax(dax_dev); } +DEFINE_FREE(cleanup_dax, struct dax_dev *, if (!IS_ERR_OR_NULL(_T)) virtio_fs_cleanup_dax(_T)) + static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs) So either I'm completely missing how ownership works in this function, or we should be really concerned about the fact that it does no actual cleanup of anything on any error. [...] Here what I'm seeing so far: - devm_release_mem_region() is never called after devm_request_mem_region(). Not on error, neither on teardown, - pgmap is never freed on error after devm_kzalloc. I was indeed missing something: the devm_ family of functions keeps ownership at the device level, so we would not need explicit teardown. { + struct dax_device *dax_dev __free(cleanup_dax) = NULL; struct virtio_shm_region cache_reg; struct dev_pagemap *pgmap; bool have_cache; @@ -804,6 +808,15 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs) if (!IS_ENABLED(CONFIG_FUSE_DAX)) return 0; + dax_dev = alloc_dax(fs, &virtio_fs_dax_ops); + if (IS_ERR(dax_dev)) { + int rc = PTR_ERR(dax_dev); + + if (rc == -EOPNOTSUPP) + return 0; + return rc; + } What is gained by moving this allocation here ? I'm still concerned about moving the call to alloc_dax() before the setup of the memory region it will use. Are those completely independent ? + /* Get cache region */ have_cache = virtio_get_shm_region(vdev, &cache_reg, (u8)VIRTIO_FS_SHMCAP_ID_CACHE); @@ -849,10 +862,7 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs) dev_dbg(&vdev->dev, "%s: window kaddr 0x%px phys_addr 0x%llx len 0x%llx\n", __func__, fs->window_kaddr, cache_reg.addr, cache_reg.len); - fs->dax_dev = alloc_dax(fs, &virtio_fs_dax_ops); - if (IS_ERR(fs->dax_dev)) - return PTR_ERR(fs->dax_dev); - + fs->dax_dev = no_free_ptr(dax_dev); return devm_add_action_or_reset(&vdev->dev, virtio_fs_cleanup_dax, fs->dax_dev); } [...] Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com
Re: [RFC PATCH v3 2/4] dax: Check for data cache aliasing at runtime
On 2024-02-01 10:44, Mathieu Desnoyers wrote: [...] diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index 4e8fdcb3f1c8..b69c9e442cf4 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -560,17 +560,19 @@ static int pmem_attach_disk(struct device *dev, dax_dev = alloc_dax(pmem, &pmem_dax_ops); if (IS_ERR(dax_dev)) { rc = PTR_ERR(dax_dev); - goto out; + if (rc != -EOPNOTSUPP) + goto out; If I compare the before / after this change, if previously pmem_attach_disk() was called in a configuration with FS_DAX=n, it would result in a NULL pointer dereference. I was wrong. drivers/nvdimm/Kconfig has: config BLK_DEV_PMEM select DAX and drivers/nvdimm/Makefile has: obj-$(CONFIG_BLK_DEV_PMEM) += nd_pmem.o nd_pmem-y := pmem.o which means that anything in pmem.c can assume that alloc_dax() is implemented. [...] diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c index 4b7ecd4fd431..f911e58a24dd 100644 --- a/drivers/s390/block/dcssblk.c +++ b/drivers/s390/block/dcssblk.c @@ -681,12 +681,14 @@ dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char if (IS_ERR(dev_info->dax_dev)) { rc = PTR_ERR(dev_info->dax_dev); dev_info->dax_dev = NULL; - goto put_dev; + if (rc != -EOPNOTSUPP) + goto put_dev; config DCSSBLK selects FS_DAX_LIMITED and DAX. I'm not sure what selecting DAX is trying to achieve here, because the Kconfig option is "FS_DAX". So depending on the real motivation behind this select, we may want to consider failure rather than success in the -EOPNOTSUPP case. I missed that alloc_dax() is implemented as not supported based on CONFIG_DAX (not CONFIG_FS_DAX). Therefore DCSSBLK Kconfig does the right thing and always selects DAX, and thus an implemented version of alloc_dax(). This takes care of two of my open questions at least. :) Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com