Re: [PATCH v5 01/14] md: don't ignore suspended array in md_check_recovery()
On Thu, Feb 1, 2024 at 5:30 PM Yu Kuai wrote:
>
> From: Yu Kuai
>
> mddev_suspend() never stop sync_thread, hence it doesn't make sense to
> ignore suspended array in md_check_recovery(), which might cause
> sync_thread can't be unregistered.
>
> After commit f52f5c71f3d4 ("md: fix stopping sync thread"), following
> hang can be triggered by test shell/integrity-caching.sh:

Hi Kuai

After applying this patch, the test is still stuck at mddev_suspend.
Maybe the deadlock is fixed by other patches in the set, but this patch
on its own can't fix the issue. If so, the comment is not right.

>
> 1) suspend the array:
> raid_postsuspend
>  mddev_suspend
>
> 2) stop the array:
> raid_dtr
>  md_stop
>   __md_stop_writes
>    stop_sync_thread
>     set_bit(MD_RECOVERY_INTR, &mddev->recovery);
>     md_wakeup_thread_directly(mddev->sync_thread);
>     wait_event(..., !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
>
> 3) sync thread done:
> md_do_sync
>  set_bit(MD_RECOVERY_DONE, &mddev->recovery);
>  md_wakeup_thread(mddev->thread);
>
> 4) daemon thread can't unregister sync thread:
> md_check_recovery
>  if (mddev->suspended)
>    return; -> return directly
>  md_read_sync_thread
>  clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
>  -> MD_RECOVERY_RUNNING can't be cleared, hence step 2 hang;

I added some debug logs while stopping dm-raid with the lvremove
command. The steps you mention run sequentially, not asynchronously.
The call path is:

dev_remove -> dm_destroy -> __dm_destroy ->
  dm_table_postsuspend_targets (raid_postsuspend) ->
  dm_table_destroy (raid_dtr)

It looks like mddev_suspend is waiting for active_io to drop to zero.

Best Regards
Xiao

> This problem is not just related to dm-raid, fix it by ignoring
> suspended array in md_check_recovery(). And follow up patches will
> improve dm-raid better to frozen sync thread during suspend.
>
> Reported-by: Mikulas Patocka
> Closes:
> https://lore.kernel.org/all/8fb335e-6d2c-dbb5-d7-ded8db51...@redhat.com/
> Fixes: 68866e425be2 ("MD: no sync IO while suspended")
> Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
> Signed-off-by: Yu Kuai
> ---
>  drivers/md/md.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 2266358d8074..07b80278eaa5 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -9469,9 +9469,6 @@ static void md_start_sync(struct work_struct *ws)
>   */
>  void md_check_recovery(struct mddev *mddev)
>  {
> -	if (READ_ONCE(mddev->suspended))
> -		return;
> -
>  	if (mddev->bitmap)
>  		md_bitmap_daemon_work(mddev);
>
> --
> 2.39.2
>
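For readers following the four steps in the quoted commit message, step 4 can be sketched in userspace as below. This is not kernel code: struct model_mddev, the MODEL_* flags and both functions are hypothetical stand-ins invented for illustration, and the sketch models only the early return in md_check_recovery(), not the active_io wait that Xiao reports.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the MD_RECOVERY_* bits. */
enum {
	MODEL_RECOVERY_RUNNING = 1 << 0,
	MODEL_RECOVERY_DONE    = 1 << 1,
};

struct model_mddev {
	bool suspended;
	unsigned int recovery;
};

/*
 * Models one pass of the daemon thread. With skip_if_suspended set it
 * mimics the three lines the patch deletes; without it, the patched
 * behaviour.
 */
static void model_check_recovery(struct model_mddev *m, bool skip_if_suspended)
{
	if (skip_if_suspended && m->suspended)
		return;

	/* The sync thread reported completion: reap it, clear RUNNING. */
	if (m->recovery & MODEL_RECOVERY_DONE)
		m->recovery &= ~(unsigned int)(MODEL_RECOVERY_RUNNING |
					       MODEL_RECOVERY_DONE);
}

/* The waiter in step 2 can only proceed once RUNNING is cleared. */
static bool model_stop_would_finish(const struct model_mddev *m)
{
	return !(m->recovery & MODEL_RECOVERY_RUNNING);
}
```

With a suspended array whose sync thread has already set DONE, the first variant leaves RUNNING set and the waiter never finishes; the second clears it. Whether that is the hang actually seen in the test is exactly what the reply above questions.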
Re: [PATCH v5 00/14] dm-raid/md/raid: fix v6.7 regressions
On Thu, Feb 15, 2024 at 02:24:34PM -0800, Song Liu wrote:
> On Thu, Feb 1, 2024 at 1:30 AM Yu Kuai wrote:
> >
> [...]
> >
> > [1]
> > https://lore.kernel.org/all/CALTww29QO5kzmN6Vd+jT=-8w5f52tjjhksgrfuc1z1zaerk...@mail.gmail.com/
> >
> > Yu Kuai (14):
> >   md: don't ignore suspended array in md_check_recovery()
> >   md: don't ignore read-only array in md_check_recovery()
> >   md: make sure md_do_sync() will set MD_RECOVERY_DONE
> >   md: don't register sync_thread for reshape directly
> >   md: don't suspend the array for interrupted reshape
> >   md: fix missing release of 'active_io' for flush
>
> Applied 1/14-5/14 to md-6.8 branch (6/14 was applied earlier).
>
> Thanks,
> Song

I'm still seeing new failures that I can't reproduce in the 6.6 kernel,
specifically:

lvconvert-raid-reshape-stripes-load-reload.sh
lvconvert-repair-raid.sh

With lvconvert-raid-reshape-stripes-load-reload.sh, patch 12/14
("md/raid456: fix a deadlock for dm-raid456 while io concurrent with
reshape") is changing a hang to a corruption. The issue is that we
can't simply fail IO that crosses the reshape position. I assume that
the correct thing to do is have dm-raid reissue it after the suspend,
when the reshape can make progress again. Perhaps something like this,
only less naive (although this patch does make the test pass for me).
Heinz, any thoughts on this? Otherwise, I'll look into this a little
more and post an RFC patch.

=========================================================
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index ed8c28952b14..ff481d494b04 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3345,6 +3345,14 @@ static int raid_map(struct dm_target *ti, struct bio *bio)
 	return DM_MAPIO_SUBMITTED;
 }
 
+static int raid_end_io(struct dm_target *ti, struct bio *bio,
+		       blk_status_t *error)
+{
+	if (*error != BLK_STS_IOERR || !dm_noflush_suspending(ti))
+		return DM_ENDIO_DONE;
+	return DM_ENDIO_REQUEUE;
+}
+
 /* Return sync state string for @state */
 enum sync_state { st_frozen, st_reshape, st_resync, st_check, st_repair, st_recover, st_idle };
 static const char *sync_str(enum sync_state state)
@@ -4100,6 +4108,7 @@ static struct target_type raid_target = {
 	.ctr = raid_ctr,
 	.dtr = raid_dtr,
 	.map = raid_map,
+	.end_io = raid_end_io,
 	.status = raid_status,
 	.message = raid_message,
 	.iterate_devices = raid_iterate_devices,
=========================================================

> >   md: export helpers to stop sync_thread
> >   md: export helper md_is_rdwr()
> >   dm-raid: really frozen sync_thread during suspend
> >   md/dm-raid: don't call md_reap_sync_thread() directly
> >   dm-raid: add a new helper prepare_suspend() in md_personality
> >   md/raid456: fix a deadlock for dm-raid456 while io concurrent with
> >     reshape
> >   dm-raid: fix lockdep waring in "pers->hot_add_disk"
> >   dm-raid: remove mddev_suspend/resume()
> >
> >  drivers/md/dm-raid.c | 78 +++
> >  drivers/md/md.c      | 126 +--
> >  drivers/md/md.h      | 16 ++
> >  drivers/md/raid10.c  | 16 +-
> >  drivers/md/raid5.c   | 61 +++--
> >  5 files changed, 192 insertions(+), 105 deletions(-)
> >
> > --
> > 2.39.2
> >
> >
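The decision made by the proposed raid_end_io() hook can be restated as a small truth table. The sketch below is a userspace model only: the MODEL_* enums and model_raid_end_io() are hypothetical names, and the real BLK_STS_* and DM_ENDIO_* values live in kernel headers that are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for blk_status_t values and DM_ENDIO_* codes. */
enum model_status { MODEL_STS_OK, MODEL_STS_IOERR, MODEL_STS_OTHER };
enum model_endio { MODEL_ENDIO_DONE, MODEL_ENDIO_REQUEUE };

/*
 * Mirrors the condition in the RFC hunk: requeue only when the bio
 * failed with an I/O error AND the target is in a noflush suspend.
 * Every other combination completes the bio as-is.
 */
static enum model_endio model_raid_end_io(enum model_status error,
					  bool noflush_suspending)
{
	if (error != MODEL_STS_IOERR || !noflush_suspending)
		return MODEL_ENDIO_DONE;
	return MODEL_ENDIO_REQUEUE;
}
```

Only one of the six input combinations requeues, which is presumably why the author calls the approach naive: any I/O error during a noflush suspend is retried, not just the ones caused by crossing the reshape position.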
Re: [PATCH 6/8] net: tcp: tsq: Convert from tasklet to BH workqueue
Hello,

On Mon, Jan 29, 2024 at 11:11:53PM -1000, Tejun Heo wrote:
> The only generic interface to execute asynchronously in the BH context is
> tasklet; however, it's marked deprecated and has some design flaws. To
> replace tasklets, BH workqueue support was recently added. A BH workqueue
> behaves similarly to regular workqueues except that the queued work items
> are executed in the BH context.
>
> This patch converts TCP Small Queues implementation from tasklet to BH
> workqueue.
>
> Semantically, this is an equivalent conversion and there shouldn't be any
> user-visible behavior changes. While workqueue's queueing and execution
> paths are a bit heavier than tasklet's, unless the work item is being queued
> every packet, the difference hopefully shouldn't matter.
>
> My experience with the networking stack is very limited and this patch
> definitely needs attention from someone who actually understands networking.

On Jakub's recommendation, I asked David Wei to run a production
memcache benchmark on the backported conversion patch. There was no
discernible difference before and after. Given that this is likely as
hot as the path gets on a real workload, the conversion hopefully
shouldn't have a noticeable performance impact.

Jakub, I'd really appreciate it if you could ack. David, would it be
okay if I add your Tested-by?

Thanks.

--
tejun
Re: About DM_UDEV_DISABLE_OTHER_RULES_FLAG and DM_NOSCAN
On Mon, Feb 12, 2024 at 03:16:27PM +0100, Martin Wilck wrote:
> On Mon, 2024-02-12 at 13:32 +0100, Peter Rajnoha wrote:
> > On 2/12/24 12:09, Martin Wilck wrote:
> > >
> >
> > Right, underneath within DM/DM-subsystem, we should be able to keep and
> > restore those reasons for why it has been flagged with
> > DM_UDEV_DISABLE_OTHER_RULES_FLAG till now and when the state is good
> > enough that we can drop it, we would do it transparently for higher
> > (non-dm and non-dm-subsystem) layers. So if there's a case that is not
> > currently handled by 10-dm.rules or 11-dm-.rules, we can fix
> > that there. If it's a generic rule that applies to all DM, not just
> > subsystem, then yes, we can move that to 10-dm.rules (will have a look at
> > your patch [1]).
>
> I don't think that patch can be used as-is for DM. For multipath, I'm
> not aware of any situation where DM_UDEV_DISABLE_OTHER_RULES_FLAG would
> be set in the udev cookie, therefore DM_UDEV_DISABLE_OTHER_RULES_FLAG
> is essentially an alias for DM_SUSPENDED before 11-dm-mpath.rules
> changes it. That's not the case for other targets, in particular LVM.
>
> > >
> > > Well, IIUC the main reason that systemd couldn't use
> > > DM_UDEV_DISABLE_OTHER_RULES_FLAG was at least in part due to the fact
> > > that the multipath rules used it in a special way that was inconsistent
> > > with the rest of DM ;-)
> > >
> >
> > Aha! Well, honestly, I don't remember the exact details and context of
> > that fix, but I know we haven't found a better way...
> >
> > >
> > > I think there are 3 variants of "unusable":
> > >
> > > a) temporarily unusable (just for this event), ignore this uevent and
> > > restore previous properties from db.
> > > b) unusable, avoid IO, don't scan, don't activate (this is how we use
> > > DM_UDEV_DISABLE_OTHER_RULES_FLAG). Upper layers will usually load saved
> > > properties from udev db in this case, too.
> > > c) like b), but also try to unmount / unconfigure if already used. This
> > > is SYSTEMD_READY=0. I don't think DM has a flag with these semantics at
> > > this time. I can imagine such a flag being set if a device was reloaded
> > > with an incompatible table, but that's rather a corner case.
> > >
> > > It's an honorable goal to condense everything into a single variable
> > > for consumer rules, but I think it doesn't work if we want the upper
> > > layers to be able to distinguish these. We can merge a) and b) I think,
> > > because their meaning for upper layers is practically the same, if we
> > > get the saving and restoring right.
> > >
> >
> > The c) case - well, it's questionable what should be done in that case,
> > because that means we have literally cut off the underlying device while
> > it was still in use. Any further IO from higher layers will return IO
> > errors, will queue IOs or, in the worse case, issue IOs to a device that
> > we don't want to anymore.
> >
> > Anyway, if I understand correctly, we simply need to signal higher
> > layers either:
> >
> > A) device is unusable, forget it and clear all your current extra
> > records you have about this device (including removing any custom
> > symlinks for udev). That would also map to SYSTEMD_READY=0.
> >
> > B) device is unusable temporarily, restore any records you need, wait
> > for the DM_UDEV_DISABLE_OTHER_RULES_FLAG=1 to drop to 0 (or being
> > unset).
> >
> > What do you think about keeping a single
> > DM_UDEV_DISABLE_OTHER_RULES_FLAG for this, just having a different
> > value, say "2" to denote the B case? Otherwise, we need 2 distinct
> > variables (which is harder for others to accept I bet).
>
> Yes, that could work, if the save / restore is implemented cleanly.

What if we never read DM_UDEV_DISABLE_OTHER_RULES_FLAG from the
database? Instead, how about this: if DM_UDEV_DISABLE_OTHER_RULES_FLAG
is set by "dmsetup udevflags", we save it as something like
DM_IGNORE_DEVICE. Otherwise, if it's a spurious event, we read
DM_IGNORE_DEVICE from the database. After "dm_flags_done", if
DM_IGNORE_DEVICE is set, we set DM_UDEV_DISABLE_OTHER_RULES_FLAG. This
leaves the other rules free to mess with
DM_UDEV_DISABLE_OTHER_RULES_FLAG all they want.

> > > > The DM_NOSCAN was actually just a helper and a more human readable
> > > > name for "DM_SUBSYSTEM_UDEV_FLAG0" within LVM subsystem *only* at
> > > > first. It is used during LV initialization - the wiping/zeroing of
> > > > the LV before it is pronounced as usable - that's why, when it is
> > > > set, we signal to "others" the DM_UDEV_DISABLE_OTHER_RULES_FLAG
> > > > based on this flag. However, since we have the 13-dm-disk.rules
> > > > which manages the blkid call for DM devices (and which is ours -
> > > > owned by DM), we need to signal these rules to avoid calling blkid
> > > > (as it could see uninitialized/stale
Re: [PATCH v5 07/14] md: export helpers to stop sync_thread
On Thu, Feb 1, 2024 at 1:30 AM Yu Kuai wrote:
>
[...]
> +
>  static void idle_sync_thread(struct mddev *mddev)
>  {
>  	mutex_lock(&mddev->sync_mutex);
> -	clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>
>  	if (mddev_lock(mddev)) {
>  		mutex_unlock(&mddev->sync_mutex);
>  		return;
>  	}
>
> +	clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>  	stop_sync_thread(mddev, false, true);
>  	mutex_unlock(&mddev->sync_mutex);
>  }
> @@ -4936,13 +4965,13 @@ static void idle_sync_thread(struct mddev *mddev)
>  static void frozen_sync_thread(struct mddev *mddev)
>  {
>  	mutex_lock(&mddev->sync_mutex);
> -	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>
>  	if (mddev_lock(mddev)) {
>  		mutex_unlock(&mddev->sync_mutex);
>  		return;
>  	}
>
> +	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>  	stop_sync_thread(mddev, false, false);
>  	mutex_unlock(&mddev->sync_mutex);
>  }

The two changes above (moving set_bit) don't seem to belong to this
patch. If they are still needed, please submit a separate patch.

Thanks,
Song
Re: [PATCH v5 00/14] dm-raid/md/raid: fix v6.7 regressions
On Thu, Feb 1, 2024 at 1:30 AM Yu Kuai wrote:
>
[...]
>
> [1]
> https://lore.kernel.org/all/CALTww29QO5kzmN6Vd+jT=-8w5f52tjjhksgrfuc1z1zaerk...@mail.gmail.com/
>
> Yu Kuai (14):
>   md: don't ignore suspended array in md_check_recovery()
>   md: don't ignore read-only array in md_check_recovery()
>   md: make sure md_do_sync() will set MD_RECOVERY_DONE
>   md: don't register sync_thread for reshape directly
>   md: don't suspend the array for interrupted reshape
>   md: fix missing release of 'active_io' for flush

Applied 1/14-5/14 to md-6.8 branch (6/14 was applied earlier).

Thanks,
Song

>   md: export helpers to stop sync_thread
>   md: export helper md_is_rdwr()
>   dm-raid: really frozen sync_thread during suspend
>   md/dm-raid: don't call md_reap_sync_thread() directly
>   dm-raid: add a new helper prepare_suspend() in md_personality
>   md/raid456: fix a deadlock for dm-raid456 while io concurrent with
>     reshape
>   dm-raid: fix lockdep waring in "pers->hot_add_disk"
>   dm-raid: remove mddev_suspend/resume()
>
>  drivers/md/dm-raid.c | 78 +++
>  drivers/md/md.c      | 126 +--
>  drivers/md/md.h      | 16 ++
>  drivers/md/raid10.c  | 16 +-
>  drivers/md/raid5.c   | 61 +++--
>  5 files changed, 192 insertions(+), 105 deletions(-)
>
> --
> 2.39.2
>
>
+ introduce-cpu_dcache_is_aliasing-across-all-architectures.patch added to mm-unstable branch
The patch titled
     Subject: Introduce cpu_dcache_is_aliasing() across all architectures
has been added to the -mm mm-unstable branch.  Its filename is
     introduce-cpu_dcache_is_aliasing-across-all-architectures.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/introduce-cpu_dcache_is_aliasing-across-all-architectures.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Mathieu Desnoyers
Subject: Introduce cpu_dcache_is_aliasing() across all architectures
Date: Thu, 15 Feb 2024 09:46:32 -0500

Introduce a generic way to query whether the data cache is virtually
aliased on all architectures. Its purpose is to ensure that subsystems
which are incompatible with virtually aliased data caches (e.g. FS_DAX)
can reliably query this.

For data cache aliasing, there are three scenarios depending on the
architecture. Here is a breakdown based on my understanding:

A) The data cache is always aliasing:

* arc
* csky
* m68k (note: shared memory mappings are incoherent ? SHMLBA is missing there.)
* sh
* parisc

B) The data cache aliasing is statically known or depends on querying CPU
   state at runtime:

* arm (cache_is_vivt() || cache_is_vipt_aliasing())
* mips (cpu_has_dc_aliases)
* nios2 (NIOS2_DCACHE_SIZE > PAGE_SIZE)
* sparc32 (vac_cache_size > PAGE_SIZE)
* sparc64 (L1DCACHE_SIZE > PAGE_SIZE)
* xtensa (DCACHE_WAY_SIZE > PAGE_SIZE)

C) The data cache is never aliasing:

* alpha
* arm64 (aarch64)
* hexagon
* loongarch (but with incoherent write buffers, which are disabled since
             commit d23b7795 ("LoongArch: Change SHMLBA from SZ_64K to PAGE_SIZE"))
* microblaze
* openrisc
* powerpc
* riscv
* s390
* um
* x86

Require architectures in A) and B) to select ARCH_HAS_CPU_CACHE_ALIASING and
implement "cpu_dcache_is_aliasing()".

Architectures in C) don't select ARCH_HAS_CPU_CACHE_ALIASING, and thus
cpu_dcache_is_aliasing() simply evaluates to "false".

Note that this leaves "cpu_icache_is_aliasing()" to be implemented as future
work. This would be useful to gate features like XIP on architectures
which have aliasing CPU dcache-icache but not CPU dcache-dcache.

Use "cpu_dcache" and "cpu_cache" rather than just "dcache" and "cache"
to clarify that we really mean "CPU data cache" and "CPU cache" to
eliminate any possible confusion with VFS "dentry cache" and "page
cache".

Link: https://lore.kernel.org/lkml/20030910210416.ga24...@mail.jlokier.co.uk/
Link: https://lkml.kernel.org/r/20240215144633.96437-9-mathieu.desnoy...@efficios.com
Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers
Cc: Dan Williams
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Matthew Wilcox
Cc: Arnd Bergmann
Cc: Russell King
Cc: Alasdair Kergon
Cc: Christoph Hellwig
Cc: Dave Chinner
Cc: Heiko Carstens
Cc: kernel test robot
Cc: Michael Sclafani
Cc: Mike Snitzer
Cc: Mikulas Patocka
Signed-off-by: Andrew Morton
---

 arch/arc/Kconfig                    |    1 +
 arch/arc/include/asm/cachetype.h    |    9 +
 arch/arm/Kconfig                    |    1 +
 arch/arm/include/asm/cachetype.h    |    2 ++
 arch/csky/Kconfig                   |    1 +
 arch/csky/include/asm/cachetype.h   |    9 +
 arch/m68k/Kconfig                   |    1 +
 arch/m68k/include/asm/cachetype.h   |    9 +
 arch/mips/Kconfig                   |    1 +
 arch/mips/include/asm/cachetype.h   |    9 +
 arch/nios2/Kconfig                  |    1 +
 arch/nios2/include/asm/cachetype.h  |   10 ++
 arch/parisc/Kconfig                 |    1 +
 arch/parisc/include/asm/cachetype.h |    9 +
 arch/sh/Kconfig                     |    1 +
 arch/sh/include/asm/cachetype.h     |    9 +
 arch/sparc/Kconfig                  |    1 +
 arch/sparc/include/asm/cachetype.h  |   14 ++
 arch/xtensa/Kconfig                 |    1 +
 arch/xtensa/include/asm/cachetype.h |   10 ++
 include/linux/cacheinfo.h           |    6 ++
 mm/Kconfig                          |    6 ++
 22 files changed, 112 insertions(+)

--- /dev/null
+++ a/arch/arc/include/asm/cachetype.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_ARC_CACHETYPE_H
+#define __ASM_ARC_CACHETYPE_H
+
+#include
+
+#define
+ dax-fix-incorrect-list-of-data-cache-aliasing-architectures.patch added to mm-unstable branch
The patch titled
     Subject: dax: Fix incorrect list of data cache aliasing architectures
has been added to the -mm mm-unstable branch.  Its filename is
     dax-fix-incorrect-list-of-data-cache-aliasing-architectures.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/dax-fix-incorrect-list-of-data-cache-aliasing-architectures.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Mathieu Desnoyers
Subject: dax: Fix incorrect list of data cache aliasing architectures
Date: Thu, 15 Feb 2024 09:46:33 -0500

commit d92576f1167c ("dax: does not work correctly with virtual aliasing
caches") prevents DAX from building on architectures with virtually
aliased dcache with:

  depends on !(ARM || MIPS || SPARC)

This check is too broad (e.g. recent ARMv7 don't have virtually aliased
dcaches), and also misses many other architectures with virtually aliased
data cache.

This is a regression introduced in the v4.0 Linux kernel where the dax
mount option is removed for 32-bit ARMv7 boards which have no data cache
aliasing, and therefore should work fine with FS_DAX.

This was turned into the following check in alloc_dax() by a preparatory
change:

        if (ops && (IS_ENABLED(CONFIG_ARM) ||
            IS_ENABLED(CONFIG_MIPS) ||
            IS_ENABLED(CONFIG_SPARC)))
                return ERR_PTR(-EOPNOTSUPP);

Use cpu_dcache_is_aliasing() instead to figure out whether the environment
has aliasing data caches.

Link: https://lkml.kernel.org/r/20240215144633.96437-10-mathieu.desnoy...@efficios.com
Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers
Reviewed-by: Dan Williams
Cc: Dan Williams
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Matthew Wilcox
Cc: Arnd Bergmann
Cc: Russell King
Cc: Alasdair Kergon
Cc: Christoph Hellwig
Cc: Dave Chinner
Cc: Heiko Carstens
Cc: kernel test robot
Cc: Michael Sclafani
Cc: Mike Snitzer
Cc: Mikulas Patocka
Signed-off-by: Andrew Morton
---

 drivers/dax/super.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/drivers/dax/super.c~dax-fix-incorrect-list-of-data-cache-aliasing-architectures
+++ a/drivers/dax/super.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include "dax-private.h"
 
 /**
@@ -456,9 +457,7 @@ struct dax_device *alloc_dax(void *priva
 	 * except for device-dax (NULL operations pointer), which does
 	 * not use aliased mappings from the kernel.
 	 */
-	if (ops && (IS_ENABLED(CONFIG_ARM) ||
-		    IS_ENABLED(CONFIG_MIPS) ||
-		    IS_ENABLED(CONFIG_SPARC)))
+	if (ops && cpu_dcache_is_aliasing())
 		return ERR_PTR(-EOPNOTSUPP);
 
 	if (WARN_ON_ONCE(ops && !ops->zero_page_range))
_

Patches currently in -mm which might be from mathieu.desnoy...@efficios.com are

nvdimm-pmem-fix-leak-on-dax_add_host-failure.patch
dax-add-empty-static-inline-for-config_dax=n.patch
dax-alloc_dax-return-err_ptr-eopnotsupp-for-config_dax=n.patch
nvdimm-pmem-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch
dm-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch
dcssblk-handle-alloc_dax-eopnotsupp-failure.patch
virtio-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch
dax-check-for-data-cache-aliasing-at-runtime.patch
introduce-cpu_dcache_is_aliasing-across-all-architectures.patch
dax-fix-incorrect-list-of-data-cache-aliasing-architectures.patch
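The effect of swapping the fixed architecture list for a runtime query can be illustrated with a small userspace model. Everything here is hypothetical: struct model_dcache, both functions and MODEL_EOPNOTSUPP are invented for the sketch, and the "way size larger than page size" test is borrowed from the category B entries in the companion patch's description (for example xtensa's DCACHE_WAY_SIZE > PAGE_SIZE), not from the actual cpu_dcache_is_aliasing() implementations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EOPNOTSUPP 95	/* stand-in for the kernel's EOPNOTSUPP */

/* Hypothetical description of one CPU's data cache. */
struct model_dcache {
	size_t way_size;	/* bytes covered by one cache way */
	size_t page_size;	/* MMU page size in bytes */
};

/* Category B style query: a way larger than a page can alias. */
static bool model_dcache_is_aliasing(const struct model_dcache *c)
{
	return c->way_size > c->page_size;
}

/*
 * Models the gate at the top of alloc_dax(): refuse only when
 * operations are supplied (has_ops) and the data cache can alias.
 * Returns 0 when allocation may proceed, a negative errno otherwise.
 */
static int model_alloc_dax_gate(bool has_ops, const struct model_dcache *c)
{
	if (has_ops && model_dcache_is_aliasing(c))
		return -MODEL_EOPNOTSUPP;
	return 0;
}
```

Under this model a CPU with a 4 KiB way and 4 KiB pages passes the gate regardless of which architecture it belongs to, which is the ARMv7 case the commit message says was wrongly excluded.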
+ virtio-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch added to mm-unstable branch
The patch titled
     Subject: virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
has been added to the -mm mm-unstable branch.  Its filename is
     virtio-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/virtio-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Mathieu Desnoyers
Subject: virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
Date: Thu, 15 Feb 2024 09:46:30 -0500

In preparation for checking whether the architecture has data cache
aliasing within alloc_dax(), modify the error handling of virtio
virtio_fs_setup_dax() to treat alloc_dax() -EOPNOTSUPP failure as
non-fatal.

Link: https://lkml.kernel.org/r/20240215144633.96437-7-mathieu.desnoy...@efficios.com
Co-developed-by: Dan Williams
Signed-off-by: Dan Williams
Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers
Cc: Alasdair Kergon
Cc: Mike Snitzer
Cc: Mikulas Patocka
Cc: Dan Williams
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Matthew Wilcox
Cc: Arnd Bergmann
Cc: Russell King
Cc: Christoph Hellwig
Cc: Dave Chinner
Cc: Heiko Carstens
Cc: kernel test robot
Cc: Michael Sclafani
Signed-off-by: Andrew Morton
---

 fs/fuse/virtio_fs.c |   15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

--- a/fs/fuse/virtio_fs.c~virtio-treat-alloc_dax-eopnotsupp-failure-as-non-fatal
+++ a/fs/fuse/virtio_fs.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 #include "fuse_i.h"
 
@@ -795,8 +796,11 @@ static void virtio_fs_cleanup_dax(void *
 	put_dax(dax_dev);
 }
 
+DEFINE_FREE(cleanup_dax, struct dax_dev *, if (!IS_ERR_OR_NULL(_T)) virtio_fs_cleanup_dax(_T))
+
 static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
 {
+	struct dax_device *dax_dev __free(cleanup_dax) = NULL;
 	struct virtio_shm_region cache_reg;
 	struct dev_pagemap *pgmap;
 	bool have_cache;
@@ -804,6 +808,12 @@ static int virtio_fs_setup_dax(struct vi
 	if (!IS_ENABLED(CONFIG_FUSE_DAX))
 		return 0;
 
+	dax_dev = alloc_dax(fs, &virtio_fs_dax_ops);
+	if (IS_ERR(dax_dev)) {
+		int rc = PTR_ERR(dax_dev);
+		return rc == -EOPNOTSUPP ? 0 : rc;
+	}
+
 	/* Get cache region */
 	have_cache = virtio_get_shm_region(vdev, &cache_reg,
 					   (u8)VIRTIO_FS_SHMCAP_ID_CACHE);
@@ -849,10 +859,7 @@ static int virtio_fs_setup_dax(struct vi
 	dev_dbg(&vdev->dev, "%s: window kaddr 0x%px phys_addr 0x%llx len 0x%llx\n",
 		__func__, fs->window_kaddr, cache_reg.addr, cache_reg.len);
 
-	fs->dax_dev = alloc_dax(fs, &virtio_fs_dax_ops);
-	if (IS_ERR(fs->dax_dev))
-		return PTR_ERR(fs->dax_dev);
-
+	fs->dax_dev = no_free_ptr(dax_dev);
 	return devm_add_action_or_reset(&vdev->dev, virtio_fs_cleanup_dax,
 					fs->dax_dev);
 }
_
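The error mapping added to virtio_fs_setup_dax() relies on the kernel convention of encoding small negative errno values in pointers. A minimal userspace model of that convention and of the "-EOPNOTSUPP is not fatal" mapping might look like the following. All model_* names and the MODEL_* constants are hypothetical stand-ins; the real helpers are ERR_PTR(), PTR_ERR() and IS_ERR() in the kernel's err.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO  4095	/* highest errno encodable in a pointer */
#define MODEL_ENOMEM     12
#define MODEL_EOPNOTSUPP 95

/* Encode a negative errno as a pointer value, and decode it again. */
static void *model_err_ptr(long err) { return (void *)(intptr_t)err; }
static long model_ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Error pointers occupy the top MODEL_MAX_ERRNO addresses. */
static bool model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/*
 * Maps an alloc_dax()-style result to the setup function's return
 * code: "not supported" means DAX is simply skipped (0), any other
 * error is passed through, and a valid pointer lets setup continue.
 */
static int model_setup_dax_rc(const void *dax_dev)
{
	if (model_is_err(dax_dev)) {
		long rc = model_ptr_err(dax_dev);

		return rc == -MODEL_EOPNOTSUPP ? 0 : (int)rc;
	}
	return 0;	/* the real function would go on to map the window */
}
```

The design point the patch makes is that a filesystem can still mount without DAX when the architecture cannot support it, while genuine failures such as an out-of-memory error still abort setup.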
+ dax-check-for-data-cache-aliasing-at-runtime.patch added to mm-unstable branch
The patch titled
     Subject: dax: Check for data cache aliasing at runtime
has been added to the -mm mm-unstable branch.  Its filename is
     dax-check-for-data-cache-aliasing-at-runtime.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/dax-check-for-data-cache-aliasing-at-runtime.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Mathieu Desnoyers
Subject: dax: Check for data cache aliasing at runtime
Date: Thu, 15 Feb 2024 09:46:31 -0500

Replace the following fs/Kconfig:FS_DAX dependency:

  depends on !(ARM || MIPS || SPARC)

with a runtime check within alloc_dax(). This runtime check returns
ERR_PTR(-EOPNOTSUPP) if the @ops parameter is non-NULL (which means
the kernel is using an aliased mapping) on an architecture which
has data cache aliasing.

Change the return value from NULL to ERR_PTR(-EOPNOTSUPP) for
CONFIG_DAX=n for consistency.

This is done in preparation for using cpu_dcache_is_aliasing() in a
following change which will properly support architectures which detect
data cache aliasing at runtime.

Link: https://lkml.kernel.org/r/20240215144633.96437-8-mathieu.desnoy...@efficios.com
Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers
Reviewed-by: Dan Williams
Cc: Dan Williams
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Matthew Wilcox
Cc: Arnd Bergmann
Cc: Russell King
Cc: Alasdair Kergon
Cc: Christoph Hellwig
Cc: Dave Chinner
Cc: Heiko Carstens
Cc: kernel test robot
Cc: Michael Sclafani
Cc: Mike Snitzer
Cc: Mikulas Patocka
Signed-off-by: Andrew Morton
---

 drivers/dax/super.c |   10 ++
 fs/Kconfig          |    1 -
 2 files changed, 10 insertions(+), 1 deletion(-)

--- a/drivers/dax/super.c~dax-check-for-data-cache-aliasing-at-runtime
+++ a/drivers/dax/super.c
@@ -451,6 +451,16 @@ struct dax_device *alloc_dax(void *priva
 	dev_t devt;
 	int minor;
 
+	/*
+	 * Unavailable on architectures with virtually aliased data caches,
+	 * except for device-dax (NULL operations pointer), which does
+	 * not use aliased mappings from the kernel.
+	 */
+	if (ops && (IS_ENABLED(CONFIG_ARM) ||
+		    IS_ENABLED(CONFIG_MIPS) ||
+		    IS_ENABLED(CONFIG_SPARC)))
+		return ERR_PTR(-EOPNOTSUPP);
+
 	if (WARN_ON_ONCE(ops && !ops->zero_page_range))
 		return ERR_PTR(-EINVAL);
 
--- a/fs/Kconfig~dax-check-for-data-cache-aliasing-at-runtime
+++ a/fs/Kconfig
@@ -60,7 +60,6 @@ endif # BLOCK
 config FS_DAX
 	bool "File system based Direct Access (DAX) support"
 	depends on MMU
-	depends on !(ARM || MIPS || SPARC)
 	depends on ZONE_DEVICE || FS_DAX_LIMITED
 	select FS_IOMAP
 	select DAX
_
+ dcssblk-handle-alloc_dax-eopnotsupp-failure.patch added to mm-unstable branch
The patch titled
     Subject: dcssblk: Handle alloc_dax() -EOPNOTSUPP failure
has been added to the -mm mm-unstable branch.  Its filename is
     dcssblk-handle-alloc_dax-eopnotsupp-failure.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/dcssblk-handle-alloc_dax-eopnotsupp-failure.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Mathieu Desnoyers
Subject: dcssblk: Handle alloc_dax() -EOPNOTSUPP failure
Date: Thu, 15 Feb 2024 09:46:29 -0500

In preparation for checking whether the architecture has data cache
aliasing within alloc_dax(), modify the error handling of dcssblk
dcssblk_add_store() to handle alloc_dax() -EOPNOTSUPP failures.

Considering that s390 is not a data cache aliasing architecture,
and considering that DCSSBLK selects DAX, a return value of -EOPNOTSUPP
from alloc_dax() should make dcssblk_add_store() fail.

Link: https://lkml.kernel.org/r/20240215144633.96437-6-mathieu.desnoy...@efficios.com
Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers
Reviewed-by: Dan Williams
Acked-by: Heiko Carstens
Cc: Alasdair Kergon
Cc: Mike Snitzer
Cc: Mikulas Patocka
Cc: Dan Williams
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Matthew Wilcox
Cc: Arnd Bergmann
Cc: Russell King
Cc: Christoph Hellwig
Cc: Dave Chinner
Cc: kernel test robot
Cc: Michael Sclafani
Signed-off-by: Andrew Morton
---

 drivers/s390/block/dcssblk.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

--- a/drivers/s390/block/dcssblk.c~dcssblk-handle-alloc_dax-eopnotsupp-failure
+++ a/drivers/s390/block/dcssblk.c
@@ -549,6 +549,7 @@ dcssblk_add_store(struct device *dev, st
 	int rc, i, j, num_of_segments;
 	struct dcssblk_dev_info *dev_info;
 	struct segment_info *seg_info, *temp;
+	struct dax_device *dax_dev;
 	char *local_buf;
 	unsigned long seg_byte_size;
 
@@ -677,13 +678,13 @@ dcssblk_add_store(struct device *dev, st
 	if (rc)
 		goto put_dev;
 
-	dev_info->dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops);
-	if (IS_ERR(dev_info->dax_dev)) {
-		rc = PTR_ERR(dev_info->dax_dev);
-		dev_info->dax_dev = NULL;
+	dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops);
+	if (IS_ERR(dax_dev)) {
+		rc = PTR_ERR(dax_dev);
 		goto put_dev;
 	}
-	set_dax_synchronous(dev_info->dax_dev);
+	set_dax_synchronous(dax_dev);
+	dev_info->dax_dev = dax_dev;
 	rc = dax_add_host(dev_info->dax_dev, dev_info->gd);
 	if (rc)
 		goto out_dax;
_
+ dm-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch added to mm-unstable branch
The patch titled
     Subject: dm: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
has been added to the -mm mm-unstable branch.  Its filename is
     dm-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/dm-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

--
From: Mathieu Desnoyers
Subject: dm: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
Date: Thu, 15 Feb 2024 09:46:28 -0500

In preparation for checking whether the architecture has data cache aliasing within alloc_dax(), modify the error handling of dm alloc_dev() to treat alloc_dax() -EOPNOTSUPP failure as non-fatal.
Link: https://lkml.kernel.org/r/20240215144633.96437-5-mathieu.desnoy...@efficios.com Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Suggested-by: Dan Williams Signed-off-by: Mathieu Desnoyers Reviewed-by: Dan Williams Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: Christoph Hellwig Cc: Dave Chinner Cc: Heiko Carstens Cc: kernel test robot Cc: Michael Sclafani Signed-off-by: Andrew Morton --- drivers/md/dm.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) --- a/drivers/md/dm.c~dm-treat-alloc_dax-eopnotsupp-failure-as-non-fatal +++ a/drivers/md/dm.c @@ -2054,6 +2054,7 @@ static void cleanup_mapped_device(struct static struct mapped_device *alloc_dev(int minor) { int r, numa_node_id = dm_get_numa_node(); + struct dax_device *dax_dev; struct mapped_device *md; void *old_md; @@ -2122,15 +2123,15 @@ static struct mapped_device *alloc_dev(i md->disk->private_data = md; sprintf(md->disk->disk_name, "dm-%d", minor); - if (IS_ENABLED(CONFIG_FS_DAX)) { - md->dax_dev = alloc_dax(md, _dax_ops); - if (IS_ERR(md->dax_dev)) { - md->dax_dev = NULL; + dax_dev = alloc_dax(md, _dax_ops); + if (IS_ERR(dax_dev)) { + if (PTR_ERR(dax_dev) != -EOPNOTSUPP) goto bad; - } - set_dax_nocache(md->dax_dev); - set_dax_nomc(md->dax_dev); - if (dax_add_host(md->dax_dev, md->disk)) + } else { + set_dax_nocache(dax_dev); + set_dax_nomc(dax_dev); + md->dax_dev = dax_dev; + if (dax_add_host(dax_dev, md->disk)) goto bad; } _ Patches currently in -mm which might be from mathieu.desnoy...@efficios.com are nvdimm-pmem-fix-leak-on-dax_add_host-failure.patch dax-add-empty-static-inline-for-config_dax=n.patch dax-alloc_dax-return-err_ptr-eopnotsupp-for-config_dax=n.patch nvdimm-pmem-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch dm-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch dcssblk-handle-alloc_dax-eopnotsupp-failure.patch 
virtio-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch dax-check-for-data-cache-aliasing-at-runtime.patch introduce-cpu_dcache_is_aliasing-across-all-architectures.patch dax-fix-incorrect-list-of-data-cache-aliasing-architectures.patch
+ nvdimm-pmem-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch added to mm-unstable branch
The patch titled
     Subject: nvdimm/pmem: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
has been added to the -mm mm-unstable branch.  Its filename is
     nvdimm-pmem-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/nvdimm-pmem-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

--
From: Mathieu Desnoyers
Subject: nvdimm/pmem: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
Date: Thu, 15 Feb 2024 09:46:27 -0500

In preparation for checking whether the architecture has data cache aliasing within alloc_dax(), modify the error handling of nvdimm/pmem pmem_attach_disk() to treat alloc_dax() -EOPNOTSUPP failure as non-fatal.

[ Based on commit "nvdimm/pmem: Fix leak on dax_add_host() failure".
] Link: https://lkml.kernel.org/r/20240215144633.96437-4-mathieu.desnoy...@efficios.com Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Signed-off-by: Mathieu Desnoyers Reviewed-by: Dan Williams Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: Christoph Hellwig Cc: Dave Chinner Cc: Heiko Carstens Cc: kernel test robot Cc: Michael Sclafani Signed-off-by: Andrew Morton --- drivers/nvdimm/pmem.c | 22 -- 1 file changed, 12 insertions(+), 10 deletions(-) --- a/drivers/nvdimm/pmem.c~nvdimm-pmem-treat-alloc_dax-eopnotsupp-failure-as-non-fatal +++ a/drivers/nvdimm/pmem.c @@ -560,17 +560,19 @@ static int pmem_attach_disk(struct devic dax_dev = alloc_dax(pmem, _dax_ops); if (IS_ERR(dax_dev)) { rc = PTR_ERR(dax_dev); - goto out; + if (rc != -EOPNOTSUPP) + goto out; + } else { + set_dax_nocache(dax_dev); + set_dax_nomc(dax_dev); + if (is_nvdimm_sync(nd_region)) + set_dax_synchronous(dax_dev); + pmem->dax_dev = dax_dev; + rc = dax_add_host(dax_dev, disk); + if (rc) + goto out_cleanup_dax; + dax_write_cache(dax_dev, nvdimm_has_cache(nd_region)); } - set_dax_nocache(dax_dev); - set_dax_nomc(dax_dev); - if (is_nvdimm_sync(nd_region)) - set_dax_synchronous(dax_dev); - pmem->dax_dev = dax_dev; - rc = dax_add_host(dax_dev, disk); - if (rc) - goto out_cleanup_dax; - dax_write_cache(dax_dev, nvdimm_has_cache(nd_region)); rc = device_add_disk(dev, disk, pmem_attribute_groups); if (rc) goto out_remove_host; _ Patches currently in -mm which might be from mathieu.desnoy...@efficios.com are nvdimm-pmem-fix-leak-on-dax_add_host-failure.patch dax-add-empty-static-inline-for-config_dax=n.patch dax-alloc_dax-return-err_ptr-eopnotsupp-for-config_dax=n.patch nvdimm-pmem-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch dm-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch dcssblk-handle-alloc_dax-eopnotsupp-failure.patch 
virtio-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch dax-check-for-data-cache-aliasing-at-runtime.patch introduce-cpu_dcache_is_aliasing-across-all-architectures.patch dax-fix-incorrect-list-of-data-cache-aliasing-architectures.patch
+ dax-alloc_dax-return-err_ptr-eopnotsupp-for-config_dax=n.patch added to mm-unstable branch
The patch titled
     Subject: dax: alloc_dax() return ERR_PTR(-EOPNOTSUPP) for CONFIG_DAX=n
has been added to the -mm mm-unstable branch.  Its filename is
     dax-alloc_dax-return-err_ptr-eopnotsupp-for-config_dax=n.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/dax-alloc_dax-return-err_ptr-eopnotsupp-for-config_dax=n.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

--
From: Mathieu Desnoyers
Subject: dax: alloc_dax() return ERR_PTR(-EOPNOTSUPP) for CONFIG_DAX=n
Date: Thu, 15 Feb 2024 09:46:26 -0500

Change the return value from NULL to ERR_PTR(-EOPNOTSUPP) for CONFIG_DAX=n to be consistent with the fact that CONFIG_DAX=y never returns NULL.

This is done in preparation for using cpu_dcache_is_aliasing() in a following change which will properly support architectures which detect data cache aliasing at runtime.
Link: https://lkml.kernel.org/r/20240215144633.96437-3-mathieu.desnoy...@efficios.com Fixes: 4e4ced93794a ("dax: Move mandatory ->zero_page_range() check in alloc_dax()") Signed-off-by: Mathieu Desnoyers Reviewed-by: Dan Williams Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: Alasdair Kergon Cc: Christoph Hellwig Cc: Dave Chinner Cc: Heiko Carstens Cc: kernel test robot Cc: Michael Sclafani Cc: Mike Snitzer Cc: Mikulas Patocka Signed-off-by: Andrew Morton --- drivers/dax/super.c |5 + include/linux/dax.h |6 +- 2 files changed, 6 insertions(+), 5 deletions(-) --- a/drivers/dax/super.c~dax-alloc_dax-return-err_ptr-eopnotsupp-for-config_dax=n +++ a/drivers/dax/super.c @@ -319,6 +319,11 @@ EXPORT_SYMBOL_GPL(dax_alive); * that any fault handlers or operations that might have seen * dax_alive(), have completed. Any operations that start after * synchronize_srcu() has run will abort upon seeing !dax_alive(). + * + * Note, because alloc_dax() returns an ERR_PTR() on error, callers + * typically store its result into a local variable in order to check + * the result. Therefore, care must be taken to populate the struct + * device dax_dev field make sure the dax_dev is not leaked. */ void kill_dax(struct dax_device *dax_dev) { --- a/include/linux/dax.h~dax-alloc_dax-return-err_ptr-eopnotsupp-for-config_dax=n +++ a/include/linux/dax.h @@ -88,11 +88,7 @@ static inline void *dax_holder(struct da static inline struct dax_device *alloc_dax(void *private, const struct dax_operations *ops) { - /* -* Callers should check IS_ENABLED(CONFIG_DAX) to know if this -* NULL is an error or expected. 
-*/ - return NULL; + return ERR_PTR(-EOPNOTSUPP); } static inline void put_dax(struct dax_device *dax_dev) { _ Patches currently in -mm which might be from mathieu.desnoy...@efficios.com are nvdimm-pmem-fix-leak-on-dax_add_host-failure.patch dax-add-empty-static-inline-for-config_dax=n.patch dax-alloc_dax-return-err_ptr-eopnotsupp-for-config_dax=n.patch nvdimm-pmem-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch dm-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch dcssblk-handle-alloc_dax-eopnotsupp-failure.patch virtio-treat-alloc_dax-eopnotsupp-failure-as-non-fatal.patch dax-check-for-data-cache-aliasing-at-runtime.patch introduce-cpu_dcache_is_aliasing-across-all-architectures.patch dax-fix-incorrect-list-of-data-cache-aliasing-architectures.patch
+ dax-add-empty-static-inline-for-config_dax=n.patch added to mm-unstable branch
The patch titled
     Subject: dax: add empty static inline for CONFIG_DAX=n
has been added to the -mm mm-unstable branch.  Its filename is
     dax-add-empty-static-inline-for-config_dax=n.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/dax-add-empty-static-inline-for-config_dax=n.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

--
From: Mathieu Desnoyers
Subject: dax: add empty static inline for CONFIG_DAX=n
Date: Thu, 15 Feb 2024 09:46:25 -0500

Patch series "Introduce cpu_dcache_is_aliasing() to fix DAX regression", v6.

The following commit, introduced in v4.0, prevents building FS_DAX on 32-bit ARM, even on ARMv7, which does not have virtually aliased data caches, even though it used to work fine before:

commit d92576f1167c ("dax: does not work correctly with virtual aliasing caches")

The root of the issue here is the fact that DAX was never designed to handle virtually aliasing data caches (VIVT and VIPT with aliasing data cache). It touches the pages through their linear mapping, which is not consistent with the userspace mappings with virtually aliasing data caches.

This patch series introduces cpu_dcache_is_aliasing() with the new Kconfig option ARCH_HAS_CPU_CACHE_ALIASING and implements it for all architectures.
The implementation of cpu_dcache_is_aliasing() is either evaluated to a constant at compile-time or a runtime check, which is what is needed on ARM. With this we can basically narrow down the list of architectures which are unsupported by DAX to those which are really affected. This patch (of 9): When building a kernel with CONFIG_DAX=n, all uses of set_dax_nocache() and set_dax_nomc() need to be either within regions of code or compile units which are explicitly not compiled, or they need to rely on compiler optimizations to eliminate calls to those undefined symbols. It appears that at least the openrisc and loongarch architectures don't end up eliminating those undefined symbols even if they are provably within code which is eliminated due to conditional branches depending on constants. Implement empty static inline functions for set_dax_nocache() and set_dax_nomc() in CONFIG_DAX=n to ensure those undefined references are removed. Link: https://lkml.kernel.org/r/20240215144633.96437-1-mathieu.desnoy...@efficios.com Link: https://lkml.kernel.org/r/20240215144633.96437-2-mathieu.desnoy...@efficios.com Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202402140037.wgfa1kqx-...@intel.com/ Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202402131351.a0fzogeg-...@intel.com/ Fixes: 7ac5360cd4d0 ("dax: remove the copy_from_iter and copy_to_iter methods") Signed-off-by: Mathieu Desnoyers Cc: Christoph Hellwig Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: Dave Chinner Cc: Michael Sclafani Cc: Alasdair Kergon Cc: Heiko Carstens Cc: Mike Snitzer Cc: Mikulas Patocka Signed-off-by: Andrew Morton --- include/linux/dax.h | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) --- a/include/linux/dax.h~dax-add-empty-static-inline-for-config_dax=n +++ a/include/linux/dax.h @@ -63,6 +63,8 @@ void kill_dax(struct dax_device *dax_dev void dax_write_cache(struct 
dax_device *dax_dev, bool wc); bool dax_write_cache_enabled(struct dax_device *dax_dev); bool dax_synchronous(struct dax_device *dax_dev); +void set_dax_nocache(struct dax_device *dax_dev); +void set_dax_nomc(struct dax_device *dax_dev); void set_dax_synchronous(struct dax_device *dax_dev); size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff, void *addr, size_t bytes, struct iov_iter *i); @@ -109,6 +111,12 @@ static inline bool dax_synchronous(struc { return true; } +static inline void set_dax_nocache(struct dax_device *dax_dev) +{ +} +static inline void set_dax_nomc(struct dax_device *dax_dev) +{ +} static inline void set_dax_synchronous(struct dax_device *dax_dev) { } @@ -124,9 +132,6 @@ static inline size_t dax_recovery_write( } #endif -void set_dax_nocache(struct dax_device *dax_dev); -void set_dax_nomc(struct dax_device *dax_dev); - struct writeback_control; #if defined(CONFIG_BLOCK) && defined(CONFIG_FS_DAX) int
[PATCH v6 8/9] Introduce cpu_dcache_is_aliasing() across all architectures
Introduce a generic way to query whether the data cache is virtually aliased on all architectures. Its purpose is to ensure that subsystems which are incompatible with virtually aliased data caches (e.g. FS_DAX) can reliably query this.

For data cache aliasing, there are three scenarios depending on the architecture. Here is a breakdown based on my understanding:

A) The data cache is always aliasing:

* arc
* csky
* m68k (note: shared memory mappings are incoherent ? SHMLBA is missing there.)
* sh
* parisc

B) The data cache aliasing is statically known or depends on querying CPU state at runtime:

* arm (cache_is_vivt() || cache_is_vipt_aliasing())
* mips (cpu_has_dc_aliases)
* nios2 (NIOS2_DCACHE_SIZE > PAGE_SIZE)
* sparc32 (vac_cache_size > PAGE_SIZE)
* sparc64 (L1DCACHE_SIZE > PAGE_SIZE)
* xtensa (DCACHE_WAY_SIZE > PAGE_SIZE)

C) The data cache is never aliasing:

* alpha
* arm64 (aarch64)
* hexagon
* loongarch (but with incoherent write buffers, which are disabled since commit d23b7795 ("LoongArch: Change SHMLBA from SZ_64K to PAGE_SIZE"))
* microblaze
* openrisc
* powerpc
* riscv
* s390
* um
* x86

Require architectures in A) and B) to select ARCH_HAS_CPU_CACHE_ALIASING and implement "cpu_dcache_is_aliasing()".

Architectures in C) don't select ARCH_HAS_CPU_CACHE_ALIASING, and thus cpu_dcache_is_aliasing() simply evaluates to "false".

Note that this leaves "cpu_icache_is_aliasing()" to be implemented as future work. This would be useful to gate features like XIP on architectures which have aliasing CPU dcache-icache but not CPU dcache-dcache.

Use "cpu_dcache" and "cpu_cache" rather than just "dcache" and "cache" to clarify that we really mean "CPU data cache" and "CPU cache" to eliminate any possible confusion with VFS "dentry cache" and "page cache".
Link: https://lore.kernel.org/lkml/20030910210416.ga24...@mail.jlokier.co.uk/ Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Signed-off-by: Mathieu Desnoyers Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- arch/arc/Kconfig| 1 + arch/arc/include/asm/cachetype.h| 9 + arch/arm/Kconfig| 1 + arch/arm/include/asm/cachetype.h| 2 ++ arch/csky/Kconfig | 1 + arch/csky/include/asm/cachetype.h | 9 + arch/m68k/Kconfig | 1 + arch/m68k/include/asm/cachetype.h | 9 + arch/mips/Kconfig | 1 + arch/mips/include/asm/cachetype.h | 9 + arch/nios2/Kconfig | 1 + arch/nios2/include/asm/cachetype.h | 10 ++ arch/parisc/Kconfig | 1 + arch/parisc/include/asm/cachetype.h | 9 + arch/sh/Kconfig | 1 + arch/sh/include/asm/cachetype.h | 9 + arch/sparc/Kconfig | 1 + arch/sparc/include/asm/cachetype.h | 14 ++ arch/xtensa/Kconfig | 1 + arch/xtensa/include/asm/cachetype.h | 10 ++ include/linux/cacheinfo.h | 6 ++ mm/Kconfig | 6 ++ 22 files changed, 112 insertions(+) create mode 100644 arch/arc/include/asm/cachetype.h create mode 100644 arch/csky/include/asm/cachetype.h create mode 100644 arch/m68k/include/asm/cachetype.h create mode 100644 arch/mips/include/asm/cachetype.h create mode 100644 arch/nios2/include/asm/cachetype.h create mode 100644 arch/parisc/include/asm/cachetype.h create mode 100644 arch/sh/include/asm/cachetype.h create mode 100644 arch/sparc/include/asm/cachetype.h create mode 100644 arch/xtensa/include/asm/cachetype.h diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig index 1b0483c51cc1..7d294a3242a4 100644 --- a/arch/arc/Kconfig +++ b/arch/arc/Kconfig @@ -6,6 +6,7 @@ config ARC def_bool y select ARC_TIMERS + select ARCH_HAS_CPU_CACHE_ALIASING select 
ARCH_HAS_CACHE_LINE_SIZE select ARCH_HAS_DEBUG_VM_PGTABLE select ARCH_HAS_DMA_PREP_COHERENT diff --git a/arch/arc/include/asm/cachetype.h b/arch/arc/include/asm/cachetype.h new file mode 100644 index ..05fc7ed59712 --- /dev/null +++ b/arch/arc/include/asm/cachetype.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ASM_ARC_CACHETYPE_H +#define __ASM_ARC_CACHETYPE_H + +#include + +#define cpu_dcache_is_aliasing() true + +#endif diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index f8567e95f98b..cd13b1788973 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -5,6 +5,7 @@ config ARM select ARCH_32BIT_OFF_T select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE if HAVE_KRETPROBES && FRAME_POINTER && !ARM_UNWIND
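The selection scheme described in the 8/9 commit message (scenarios A/B provide their own cpu_dcache_is_aliasing(), scenario C falls back to a constant false) can be sketched in plain userspace C. This is only an illustration under stated assumptions: ARCH_HAS_CPU_CACHE_ALIASING_SKETCH, sketch_probed_dcache_aliasing and sketch_dax_supported() are hypothetical names, not kernel symbols, and the generic-header part of the real patch is not shown in this excerpt.

```c
#include <stdbool.h>

/*
 * Assumption: ARCH_HAS_CPU_CACHE_ALIASING_SKETCH stands in for the Kconfig
 * option named in the commit message. It is left undefined here, which
 * models a scenario C architecture.
 */
#ifdef ARCH_HAS_CPU_CACHE_ALIASING_SKETCH
/* Scenarios A/B: the architecture supplies the answer (hypothetical probe). */
static bool sketch_probed_dcache_aliasing;
#define cpu_dcache_is_aliasing()	(sketch_probed_dcache_aliasing)
#else
/* Scenario C: never aliasing, so the query folds to a compile-time constant. */
#define cpu_dcache_is_aliasing()	false
#endif

/* A subsystem that cannot cope with aliasing gates itself on the query. */
static bool sketch_dax_supported(void)
{
	return !cpu_dcache_is_aliasing();
}
```

Because the fallback is a constant, a compiler can drop the unsupported branch entirely on architectures that never alias, while architectures in scenario B pay for one runtime test.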
[PATCH v6 9/9] dax: Fix incorrect list of data cache aliasing architectures
commit d92576f1167c ("dax: does not work correctly with virtual aliasing caches") prevents DAX from building on architectures with virtually aliased dcache with:

  depends on !(ARM || MIPS || SPARC)

This check is too broad (e.g. recent ARMv7 don't have virtually aliased dcaches), and also misses many other architectures with virtually aliased data cache.

This is a regression introduced in the v4.0 Linux kernel where the dax mount option is removed for 32-bit ARMv7 boards which have no data cache aliasing, and therefore should work fine with FS_DAX.

This was turned into the following check in alloc_dax() by a preparatory change:

        if (ops && (IS_ENABLED(CONFIG_ARM) ||
            IS_ENABLED(CONFIG_MIPS) ||
            IS_ENABLED(CONFIG_SPARC)))
                return ERR_PTR(-EOPNOTSUPP);

Use cpu_dcache_is_aliasing() instead to figure out whether the environment has aliasing data caches.

Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers
Reviewed-by: Dan Williams
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Dan Williams
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Matthew Wilcox
Cc: Arnd Bergmann
Cc: Russell King
Cc: linux-a...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux-...@vger.kernel.org
Cc: dm-devel@lists.linux.dev
Cc: nvd...@lists.linux.dev
---
 drivers/dax/super.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index ce5bffa86bba..a21a7c262382 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include "dax-private.h"
 
 /**
@@ -455,9 +456,7 @@ struct dax_device *alloc_dax(void *private, const struct dax_operations *ops)
 	 * except for device-dax (NULL operations pointer), which does
 	 * not use aliased mappings from the kernel.
 	 */
-	if (ops && (IS_ENABLED(CONFIG_ARM) ||
-	    IS_ENABLED(CONFIG_MIPS) ||
-	    IS_ENABLED(CONFIG_SPARC)))
+	if (ops && cpu_dcache_is_aliasing())
 		return ERR_PTR(-EOPNOTSUPP);
 
 	if (WARN_ON_ONCE(ops && !ops->zero_page_range))
-- 
2.39.2
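To make the control flow of the gate in this patch easier to follow, here is a minimal userspace sketch of it. Everything in it is an assumption made for illustration: the ERR_PTR()/PTR_ERR()/IS_ERR() helpers are simplified stand-ins for the kernel's, sketch_dcache_aliasing replaces the real cpu_dcache_is_aliasing() query, and sketch_alloc_dax() models only the two early checks visible in the diff, not the rest of alloc_dax().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel's error-pointer helpers (assumed). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct dax_operations { int (*zero_page_range)(void); };
struct dax_device { const struct dax_operations *ops; };

/* Hypothetical knob replacing the real cpu_dcache_is_aliasing() query. */
static bool sketch_dcache_aliasing;

static struct dax_device *sketch_alloc_dax(const struct dax_operations *ops)
{
	struct dax_device *dax_dev;

	/* device-dax passes ops == NULL and is exempt from the aliasing check. */
	if (ops && sketch_dcache_aliasing)
		return ERR_PTR(-EOPNOTSUPP);
	/* Mirrors the mandatory ->zero_page_range() check from the diff. */
	if (ops && !ops->zero_page_range)
		return ERR_PTR(-EINVAL);

	dax_dev = calloc(1, sizeof(*dax_dev));
	if (!dax_dev)
		return ERR_PTR(-ENOMEM);
	dax_dev->ops = ops;
	return dax_dev;
}
```

The point of the ordering is that -EOPNOTSUPP is reported before any other validation, so callers can tell "DAX is unavailable on this CPU" apart from a genuine setup error.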
[PATCH v6 7/9] dax: Check for data cache aliasing at runtime
Replace the following fs/Kconfig:FS_DAX dependency:

  depends on !(ARM || MIPS || SPARC)

By a runtime check within alloc_dax(). This runtime check returns ERR_PTR(-EOPNOTSUPP) if the @ops parameter is non-NULL (which means the kernel is using an aliased mapping) on an architecture which has data cache aliasing.

Change the return value from NULL to ERR_PTR(-EOPNOTSUPP) for CONFIG_DAX=n for consistency.

This is done in preparation for using cpu_dcache_is_aliasing() in a following change which will properly support architectures which detect data cache aliasing at runtime.

Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers
Reviewed-by: Dan Williams
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Dan Williams
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Matthew Wilcox
Cc: Arnd Bergmann
Cc: Russell King
Cc: linux-a...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux-...@vger.kernel.org
Cc: dm-devel@lists.linux.dev
Cc: nvd...@lists.linux.dev
---
 drivers/dax/super.c | 10 ++
 fs/Kconfig | 1 -
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 205b888d45bf..ce5bffa86bba 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -450,6 +450,16 @@ struct dax_device *alloc_dax(void *private, const struct dax_operations *ops)
 	dev_t devt;
 	int minor;
 
+	/*
+	 * Unavailable on architectures with virtually aliased data caches,
+	 * except for device-dax (NULL operations pointer), which does
+	 * not use aliased mappings from the kernel.
+	 */
+	if (ops && (IS_ENABLED(CONFIG_ARM) ||
+	    IS_ENABLED(CONFIG_MIPS) ||
+	    IS_ENABLED(CONFIG_SPARC)))
+		return ERR_PTR(-EOPNOTSUPP);
+
 	if (WARN_ON_ONCE(ops && !ops->zero_page_range))
 		return ERR_PTR(-EINVAL);
 
diff --git a/fs/Kconfig b/fs/Kconfig
index 42837617a55b..e5efdb3b276b 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -56,7 +56,6 @@ endif # BLOCK
 config FS_DAX
 	bool "File system based Direct Access (DAX) support"
 	depends on MMU
-	depends on !(ARM || MIPS || SPARC)
 	depends on ZONE_DEVICE || FS_DAX_LIMITED
 	select FS_IOMAP
 	select DAX
-- 
2.39.2
[PATCH v6 5/9] dcssblk: Handle alloc_dax() -EOPNOTSUPP failure
In preparation for checking whether the architecture has data cache aliasing within alloc_dax(), modify the error handling of dcssblk dcssblk_add_store() to handle alloc_dax() -EOPNOTSUPP failures. Considering that s390 is not a data cache aliasing architecture, and considering that DCSSBLK selects DAX, a return value of -EOPNOTSUPP from alloc_dax() should make dcssblk_add_store() fail. Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Signed-off-by: Mathieu Desnoyers Reviewed-by: Dan Williams Acked-by: Heiko Carstens Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev Cc: linux-s...@vger.kernel.org --- drivers/s390/block/dcssblk.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c index 4b7ecd4fd431..f363c1d51d9a 100644 --- a/drivers/s390/block/dcssblk.c +++ b/drivers/s390/block/dcssblk.c @@ -549,6 +549,7 @@ dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char int rc, i, j, num_of_segments; struct dcssblk_dev_info *dev_info; struct segment_info *seg_info, *temp; + struct dax_device *dax_dev; char *local_buf; unsigned long seg_byte_size; @@ -677,13 +678,13 @@ dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char if (rc) goto put_dev; - dev_info->dax_dev = alloc_dax(dev_info, _dax_ops); - if (IS_ERR(dev_info->dax_dev)) { - rc = PTR_ERR(dev_info->dax_dev); - dev_info->dax_dev = NULL; + dax_dev = alloc_dax(dev_info, _dax_ops); + if (IS_ERR(dax_dev)) { + rc = PTR_ERR(dax_dev); goto put_dev; } - set_dax_synchronous(dev_info->dax_dev); + set_dax_synchronous(dax_dev); 
+ dev_info->dax_dev = dax_dev; rc = dax_add_host(dev_info->dax_dev, dev_info->gd); if (rc) goto out_dax; -- 2.39.2
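The dcssblk change above keeps every alloc_dax() failure fatal, but stores the result in a local variable so that the device structure never holds an error pointer. A minimal userspace sketch of that pattern follows. The names dev_info_sketch and sketch_add_store() are hypothetical, the error-pointer helpers are simplified stand-ins for the kernel's, and the allocation result is passed in as a parameter purely to keep the example testable.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's error-pointer helpers (assumed). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct dax_device { bool synchronous; };
struct dev_info_sketch { struct dax_device *dax_dev; };

/* alloc_result models what alloc_dax() returned to the caller. */
static int sketch_add_store(struct dev_info_sketch *info,
			    struct dax_device *alloc_result)
{
	struct dax_device *dax_dev = alloc_result;

	/* Every failure is fatal here, -EOPNOTSUPP included. */
	if (IS_ERR(dax_dev))
		return (int)PTR_ERR(dax_dev);

	dax_dev->synchronous = true;
	/* Publish the pointer only once it is known to be valid. */
	info->dax_dev = dax_dev;
	return 0;
}
```

With this shape the error path no longer needs to reset the field to NULL, which is the cleanup the original code had to do by hand.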
[PATCH v6 6/9] virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
In preparation for checking whether the architecture has data cache aliasing within alloc_dax(), modify the error handling of virtio virtio_fs_setup_dax() to treat alloc_dax() -EOPNOTSUPP failure as non-fatal. Co-developed-by: Dan Williams Signed-off-by: Dan Williams Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Signed-off-by: Mathieu Desnoyers Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- fs/fuse/virtio_fs.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index 5f1be1da92ce..a28466c2da71 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include "fuse_i.h" @@ -795,8 +796,11 @@ static void virtio_fs_cleanup_dax(void *data) put_dax(dax_dev); } +DEFINE_FREE(cleanup_dax, struct dax_dev *, if (!IS_ERR_OR_NULL(_T)) virtio_fs_cleanup_dax(_T)) + static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs) { + struct dax_device *dax_dev __free(cleanup_dax) = NULL; struct virtio_shm_region cache_reg; struct dev_pagemap *pgmap; bool have_cache; @@ -804,6 +808,12 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs) if (!IS_ENABLED(CONFIG_FUSE_DAX)) return 0; + dax_dev = alloc_dax(fs, _fs_dax_ops); + if (IS_ERR(dax_dev)) { + int rc = PTR_ERR(dax_dev); + return rc == -EOPNOTSUPP ? 
0 : rc; + } + /* Get cache region */ have_cache = virtio_get_shm_region(vdev, _reg, (u8)VIRTIO_FS_SHMCAP_ID_CACHE); @@ -849,10 +859,7 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs) dev_dbg(>dev, "%s: window kaddr 0x%px phys_addr 0x%llx len 0x%llx\n", __func__, fs->window_kaddr, cache_reg.addr, cache_reg.len); - fs->dax_dev = alloc_dax(fs, _fs_dax_ops); - if (IS_ERR(fs->dax_dev)) - return PTR_ERR(fs->dax_dev); - + fs->dax_dev = no_free_ptr(dax_dev); return devm_add_action_or_reset(>dev, virtio_fs_cleanup_dax, fs->dax_dev); } -- 2.39.2
[PATCH v6 3/9] nvdimm/pmem: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
In preparation for checking whether the architecture has data cache aliasing within alloc_dax(), modify the error handling of nvdimm/pmem pmem_attach_disk() to treat alloc_dax() -EOPNOTSUPP failure as non-fatal. [ Based on commit "nvdimm/pmem: Fix leak on dax_add_host() failure". ] Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Signed-off-by: Mathieu Desnoyers Reviewed-by: Dan Williams Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- drivers/nvdimm/pmem.c | 22 -- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index 9fe358090720..e9898457a7bd 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -560,17 +560,19 @@ static int pmem_attach_disk(struct device *dev, dax_dev = alloc_dax(pmem, _dax_ops); if (IS_ERR(dax_dev)) { rc = PTR_ERR(dax_dev); - goto out; + if (rc != -EOPNOTSUPP) + goto out; + } else { + set_dax_nocache(dax_dev); + set_dax_nomc(dax_dev); + if (is_nvdimm_sync(nd_region)) + set_dax_synchronous(dax_dev); + pmem->dax_dev = dax_dev; + rc = dax_add_host(dax_dev, disk); + if (rc) + goto out_cleanup_dax; + dax_write_cache(dax_dev, nvdimm_has_cache(nd_region)); } - set_dax_nocache(dax_dev); - set_dax_nomc(dax_dev); - if (is_nvdimm_sync(nd_region)) - set_dax_synchronous(dax_dev); - pmem->dax_dev = dax_dev; - rc = dax_add_host(dax_dev, disk); - if (rc) - goto out_cleanup_dax; - dax_write_cache(dax_dev, nvdimm_has_cache(nd_region)); rc = device_add_disk(dev, disk, pmem_attribute_groups); if (rc) goto out_remove_host; -- 2.39.2
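The error-handling shape introduced by this patch can be sketched in userspace as follows. This is only an illustration under stated assumptions: ERR_PTR(), PTR_ERR() and IS_ERR() are re-implemented locally to mimic the kernel helpers, and fake_dax, fake_alloc_dax() and attach() are hypothetical stand-ins, not the driver code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Local re-implementations that mimic the kernel's err.h helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct fake_dax { int nocache; };
static struct fake_dax the_dax;

/* Hypothetical allocator: mode 0 succeeds, 1 is unsupported, 2 is fatal. */
static struct fake_dax *fake_alloc_dax(int mode)
{
	if (mode == 1)
		return ERR_PTR(-EOPNOTSUPP);
	if (mode == 2)
		return ERR_PTR(-ENOMEM);
	the_dax.nocache = 0;
	return &the_dax;
}

/* Mirrors the patched flow: -EOPNOTSUPP means "carry on without DAX". */
static int attach(int mode, struct fake_dax **slot)
{
	struct fake_dax *dax = fake_alloc_dax(mode);

	*slot = NULL;
	if (IS_ERR(dax)) {
		long rc = PTR_ERR(dax);

		if (rc != -EOPNOTSUPP)
			return (int)rc;	/* any other error stays fatal */
	} else {
		dax->nocache = 1;	/* configure only a valid device */
		*slot = dax;
	}
	return 0;			/* the rest of the setup continues */
}
```

The point of the else branch is that none of the DAX configuration calls ever see an error pointer; the device slot simply stays NULL when DAX is unsupported.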
[PATCH v6 4/9] dm: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
In preparation for checking whether the architecture has data cache aliasing within alloc_dax(), modify the error handling of dm alloc_dev() to treat alloc_dax() -EOPNOTSUPP failure as non-fatal. Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches") Suggested-by: Dan Williams Signed-off-by: Mathieu Desnoyers Reviewed-by: Dan Williams Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- drivers/md/dm.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 23c32cd1f1d8..acdc00bc05be 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -2054,6 +2054,7 @@ static void cleanup_mapped_device(struct mapped_device *md) static struct mapped_device *alloc_dev(int minor) { int r, numa_node_id = dm_get_numa_node(); + struct dax_device *dax_dev; struct mapped_device *md; void *old_md; @@ -2122,15 +2123,15 @@ static struct mapped_device *alloc_dev(int minor) md->disk->private_data = md; sprintf(md->disk->disk_name, "dm-%d", minor); - if (IS_ENABLED(CONFIG_FS_DAX)) { - md->dax_dev = alloc_dax(md, _dax_ops); - if (IS_ERR(md->dax_dev)) { - md->dax_dev = NULL; + dax_dev = alloc_dax(md, _dax_ops); + if (IS_ERR(dax_dev)) { + if (PTR_ERR(dax_dev) != -EOPNOTSUPP) goto bad; - } - set_dax_nocache(md->dax_dev); - set_dax_nomc(md->dax_dev); - if (dax_add_host(md->dax_dev, md->disk)) + } else { + set_dax_nocache(dax_dev); + set_dax_nomc(dax_dev); + md->dax_dev = dax_dev; + if (dax_add_host(dax_dev, md->disk)) goto bad; } -- 2.39.2
[PATCH v6 2/9] dax: alloc_dax() return ERR_PTR(-EOPNOTSUPP) for CONFIG_DAX=n
Change the return value from NULL to ERR_PTR(-EOPNOTSUPP) for CONFIG_DAX=n to be consistent with the fact that CONFIG_DAX=y never returns NULL. This is done in preparation for using cpu_dcache_is_aliasing() in a following change which will properly support architectures which detect data cache aliasing at runtime. Fixes: 4e4ced93794a ("dax: Move mandatory ->zero_page_range() check in alloc_dax()") Signed-off-by: Mathieu Desnoyers Reviewed-by: Dan Williams Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- drivers/dax/super.c | 5 + include/linux/dax.h | 6 +- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/dax/super.c b/drivers/dax/super.c index 0da9232ea175..205b888d45bf 100644 --- a/drivers/dax/super.c +++ b/drivers/dax/super.c @@ -319,6 +319,11 @@ EXPORT_SYMBOL_GPL(dax_alive); * that any fault handlers or operations that might have seen * dax_alive(), have completed. Any operations that start after * synchronize_srcu() has run will abort upon seeing !dax_alive(). + * + * Note, because alloc_dax() returns an ERR_PTR() on error, callers + * typically store its result into a local variable in order to check + * the result. Therefore, care must be taken to populate the struct + * device dax_dev field make sure the dax_dev is not leaked. 
*/ void kill_dax(struct dax_device *dax_dev) { diff --git a/include/linux/dax.h b/include/linux/dax.h index e3ffe7c7f01d..9d3e3327af4c 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -88,11 +88,7 @@ static inline void *dax_holder(struct dax_device *dax_dev) static inline struct dax_device *alloc_dax(void *private, const struct dax_operations *ops) { - /* -* Callers should check IS_ENABLED(CONFIG_DAX) to know if this -* NULL is an error or expected. -*/ - return NULL; + return ERR_PTR(-EOPNOTSUPP); } static inline void put_dax(struct dax_device *dax_dev) { -- 2.39.2
[PATCH v6 1/9] dax: add empty static inline for CONFIG_DAX=n
When building a kernel with CONFIG_DAX=n, all uses of set_dax_nocache() and set_dax_nomc() need to be either within regions of code or compile units which are explicitly not compiled, or they need to rely on compiler optimizations to eliminate calls to those undefined symbols. It appears that at least the openrisc and loongarch architectures don't end up eliminating those undefined symbols even if they are provably within code which is eliminated due to conditional branches depending on constants. Implement empty static inline functions for set_dax_nocache() and set_dax_nomc() in CONFIG_DAX=n to ensure those undefined references are removed. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202402140037.wgfa1kqx-...@intel.com/ Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202402131351.a0fzogeg-...@intel.com/ Fixes: 7ac5360cd4d0 ("dax: remove the copy_from_iter and copy_to_iter methods") Signed-off-by: Mathieu Desnoyers Cc: Christoph Hellwig Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- include/linux/dax.h | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/include/linux/dax.h b/include/linux/dax.h index b463502b16e1..e3ffe7c7f01d 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -63,6 +63,8 @@ void kill_dax(struct dax_device *dax_dev); void dax_write_cache(struct dax_device *dax_dev, bool wc); bool dax_write_cache_enabled(struct dax_device *dax_dev); bool dax_synchronous(struct dax_device *dax_dev); +void set_dax_nocache(struct dax_device *dax_dev); +void set_dax_nomc(struct dax_device *dax_dev); void set_dax_synchronous(struct dax_device *dax_dev); size_t dax_recovery_write(struct 
dax_device *dax_dev, pgoff_t pgoff, void *addr, size_t bytes, struct iov_iter *i); @@ -109,6 +111,12 @@ static inline bool dax_synchronous(struct dax_device *dax_dev) { return true; } +static inline void set_dax_nocache(struct dax_device *dax_dev) +{ +} +static inline void set_dax_nomc(struct dax_device *dax_dev) +{ +} static inline void set_dax_synchronous(struct dax_device *dax_dev) { } @@ -124,9 +132,6 @@ static inline size_t dax_recovery_write(struct dax_device *dax_dev, } #endif -void set_dax_nocache(struct dax_device *dax_dev); -void set_dax_nomc(struct dax_device *dax_dev); - struct writeback_control; #if defined(CONFIG_BLOCK) && defined(CONFIG_FS_DAX) int dax_add_host(struct dax_device *dax_dev, struct gendisk *disk); -- 2.39.2
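The stub pattern described in this commit message can be sketched outside the kernel as below. FEATURE_DAX is a hypothetical stand-in for CONFIG_DAX (left undefined here), and dev, dev_set_nocache() and probe() are illustrative names only.

```c
/* FEATURE_DAX stands in for CONFIG_DAX; it is deliberately left undefined. */
struct dev { int flags; };

#ifdef FEATURE_DAX
/* Feature built in: the real definition lives in another compile unit. */
void dev_set_nocache(struct dev *d);
#else
/*
 * Feature compiled out: an empty static inline keeps callers building and
 * leaves no undefined symbol behind, whatever the optimizer decides to do
 * with the surrounding dead code.
 */
static inline void dev_set_nocache(struct dev *d)
{
	(void)d;
}
#endif

/* Callers need no #ifdef of their own around the call. */
static int probe(struct dev *d)
{
	dev_set_nocache(d);
	return d->flags;
}
```

With only a declaration and no stub, the same caller would link only if the compiler happened to eliminate the call, which is the fragility reported on openrisc and loongarch.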
[PATCH v6 0/9] Introduce cpu_dcache_is_aliasing() to fix DAX regression
The following commit, introduced in v4.0, prevents building FS_DAX on 32-bit ARM, even on ARMv7, which does not have virtually aliased data caches, and even though it used to work fine before:

commit d92576f1167c ("dax: does not work correctly with virtual aliasing caches")

The root of the issue here is the fact that DAX was never designed to handle virtually aliasing data caches (VIVT and VIPT with aliasing data cache). It touches the pages through their linear mapping, which is not consistent with the userspace mappings with virtually aliasing data caches.

This patch series introduces cpu_dcache_is_aliasing() with the new Kconfig option ARCH_HAS_CPU_CACHE_ALIASING and implements it for all architectures. The implementation of cpu_dcache_is_aliasing() either evaluates to a constant at compile time or performs a runtime check, which is what is needed on ARM.

With this we can narrow down the list of architectures which are unsupported by DAX to those which are really affected.

Testing done so far:

- Compile allyesconfig on x86-64,
- Compile allyesconfig on x86-64, with FS_DAX=n,
- Compile allyesconfig on x86-64, with DAX=n,
- Boot test after modifying alloc_dax() to force returning -EOPNOTSUPP even on x86-64, thus simulating the behavior expected on an architecture with data cache aliasing.

There are many more axes to test however. I would welcome Tested-by for:

- affected architectures,
- affected drivers,
- affected filesystems.

[ Based on commit "nvdimm/pmem: Fix leak on dax_add_host() failure". ]

Thanks,

Mathieu

Changes since v5:
- Add empty static inline set_dax_nocache() and set_dax_nomc() for CONFIG_DAX=n.
- Update "Fixes" tag for "dax: alloc_dax() return ERR_PTR(-EOPNOTSUPP) for CONFIG_DAX=n".
- Check IS_ERR_OR_NULL() before calling virtio_fs_cleanup_dax() within virtio_fs_setup_dax().
Changes since v4:
- Move the change which makes alloc_dax() return ERR_PTR(-EOPNOTSUPP) when CONFIG_DAX=n earlier in the series.
- Fold driver cleanup patches into their respective per-driver changes.
- Move "nvdimm/pmem: Fix leak on dax_add_host() failure" outside of this series.

Changes since v3:
- Fix a leak on dax_add_host() failure in nvdimm/pmem.
- Split the series into a bisectable sequence of changes.
- Ensure that device-dax use-cases still work on data cache aliasing architectures.

Changes since v2:
- Move DAX supported runtime check to alloc_dax().
- Modify DM to handle alloc_dax() error as non-fatal.
- Remove all filesystem modifications, since the check is now done by alloc_dax().
- Rename "dcache" and "cache" to "cpu dcache" and "cpu cache" to eliminate confusion with VFS terminology.

Changes since v1:
- The order of the series was completely changed based on the feedback received on v1.
- cache_is_aliasing() is renamed to dcache_is_aliasing().
- ARCH_HAS_CACHE_ALIASING_DYNAMIC is gone.
- dcache_is_aliasing() vs ARCH_HAS_CACHE_ALIASING relationship is simplified.
- The dax_is_supported() check was moved to its rightful place in all filesystems.
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Matthew Wilcox
Cc: Russell King
Cc: linux-a...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux-...@vger.kernel.org
Cc: dm-devel@lists.linux.dev
Cc: nvd...@lists.linux.dev
Cc: linux-s...@vger.kernel.org

Mathieu Desnoyers (9):
  dax: add empty static inline for CONFIG_DAX=n
  dax: alloc_dax() return ERR_PTR(-EOPNOTSUPP) for CONFIG_DAX=n
  nvdimm/pmem: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
  dm: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
  dcssblk: Handle alloc_dax() -EOPNOTSUPP failure
  virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
  dax: Check for data cache aliasing at runtime
  Introduce cpu_dcache_is_aliasing() across all architectures
  dax: Fix incorrect list of data cache aliasing architectures

 arch/arc/Kconfig | 1 +
 arch/arc/include/asm/cachetype.h | 9 +
 arch/arm/Kconfig | 1 +
 arch/arm/include/asm/cachetype.h | 2 ++
 arch/csky/Kconfig | 1 +
 arch/csky/include/asm/cachetype.h | 9 +
 arch/m68k/Kconfig | 1 +
 arch/m68k/include/asm/cachetype.h | 9 +
 arch/mips/Kconfig | 1 +
 arch/mips/include/asm/cachetype.h | 9 +
 arch/nios2/Kconfig | 1 +
 arch/nios2/include/asm/cachetype.h | 10 ++
 arch/parisc/Kconfig | 1 +
 arch/parisc/include/asm/cachetype.h | 9 +
 arch/sh/Kconfig | 1 +
 arch/sh/include/asm/cachetype.h | 9 +
 arch/sparc/Kconfig | 1 +
 arch/sparc/include/asm/cachetype.h | 14 ++
 arch/xtensa/Kconfig | 1 +
 arch/xtensa/include/asm/cachetype.h | 10 ++
 drivers/dax/super.c | 14 ++
 drivers/md/dm.c | 17
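As a rough illustration of the "compile-time constant or runtime check" design described in this cover letter, the sketch below shows one way such a predicate could be structured in plain C. It is not the series' implementation: HAS_CACHE_ALIASING, cache_kind, detected_cache, dcache_is_aliasing() and alloc_check() are all hypothetical names chosen for this example.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an ARCH_HAS_CPU_CACHE_ALIASING-style switch. */
#define HAS_CACHE_ALIASING 1

#if !HAS_CACHE_ALIASING
/* Architectures that never alias: the predicate folds to a constant. */
#define dcache_is_aliasing() false
#else
/* Architectures that may alias: decide from what was detected at boot. */
enum cache_kind {
	CACHE_PIPT,
	CACHE_VIPT_NONALIASING,
	CACHE_VIPT_ALIASING,
	CACHE_VIVT,
};

static enum cache_kind detected_cache = CACHE_PIPT;

static inline bool dcache_is_aliasing(void)
{
	return detected_cache == CACHE_VIVT ||
	       detected_cache == CACHE_VIPT_ALIASING;
}
#endif

/* An allocator-side check, in the spirit of doing the test in one place. */
static int alloc_check(void)
{
	return dcache_is_aliasing() ? -EOPNOTSUPP : 0;
}
```

Keeping the check in the allocator, rather than in every filesystem, matches the v2 changelog entry about moving the runtime check to alloc_dax(); callers then only need to treat -EOPNOTSUPP as non-fatal.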
[PATCH v2] nvdimm/pmem: Fix leak on dax_add_host() failure
Fix a leak on dax_add_host() error, where "goto out_cleanup_dax" is done before setting pmem->dax_dev, which therefore issues the two following calls on NULL pointers: out_cleanup_dax: kill_dax(pmem->dax_dev); put_dax(pmem->dax_dev); Signed-off-by: Mathieu Desnoyers Reviewed-by: Dan Williams Reviewed-by: Dave Jiang Reviewed-by: Fan Ni Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Mikulas Patocka Cc: Andrew Morton Cc: Linus Torvalds Cc: Dan Williams Cc: Vishal Verma Cc: Dave Jiang Cc: Matthew Wilcox Cc: Arnd Bergmann Cc: Russell King Cc: linux-a...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org Cc: linux-...@vger.kernel.org Cc: dm-devel@lists.linux.dev Cc: nvd...@lists.linux.dev --- Changes since v1: - Add Reviewed-by tags. --- drivers/nvdimm/pmem.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index 4e8fdcb3f1c8..9fe358090720 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -566,12 +566,11 @@ static int pmem_attach_disk(struct device *dev, set_dax_nomc(dax_dev); if (is_nvdimm_sync(nd_region)) set_dax_synchronous(dax_dev); + pmem->dax_dev = dax_dev; rc = dax_add_host(dax_dev, disk); if (rc) goto out_cleanup_dax; dax_write_cache(dax_dev, nvdimm_has_cache(nd_region)); - pmem->dax_dev = dax_dev; - rc = device_add_disk(dev, disk, pmem_attribute_groups); if (rc) goto out_remove_host; -- 2.39.2