Re: [PATCH 11/11] sysctl: treewide: constify the ctl_table argument of handlers

2024-03-27 Thread Dave Chinner
ency_record latency_record[MAXLR]; > int latencytop_enabled; > > #ifdef CONFIG_SYSCTL > -static int sysctl_latencytop(struct ctl_table *table, int write, void > *buffer, > - size_t *lenp, loff_t *ppos) > +static int sysctl_latencytop(const struct ctl_table *table, int w

Re: [PATCH 11/11] sysctl: treewide: constify the ctl_table argument of handlers

2024-03-15 Thread Dave Chinner
ency_record latency_record[MAXLR]; > int latencytop_enabled; > > #ifdef CONFIG_SYSCTL > -static int sysctl_latencytop(struct ctl_table *table, int write, void > *buffer, > - size_t *lenp, loff_t *ppos) > +static int sysctl_latencytop(const struct ctl_table *table, int write, > + void *buffer, > + size_t *lenp, loff_t *ppos) > { > int err; > And this. I could go on, but there are so many examples of this in the patch that I think that it needs to be tossed away and regenerated in a way that doesn't trash the existing function parameter formatting. -Dave. -- Dave Chinner da...@fromorbit.com
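
For readers unfamiliar with what constifying the handler argument buys, here is a minimal userspace sketch. The struct and function names are hypothetical stand-ins, not the kernel's struct ctl_table or proc handler API. It only illustrates that a handler taking a const table pointer can still read the table and update the value it points at, while any write to the table itself is rejected by the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct ctl_table. */
struct demo_table {
	const char *procname;
	int *data;
	int maxval;
};

/*
 * Handler taking a const table pointer: it may read the table and
 * update the value the table points at, but not the table itself.
 */
int demo_handler(const struct demo_table *table, int write, int *buffer)
{
	assert(table != NULL && buffer != NULL);

	if (!write) {
		*buffer = *table->data;
		return 0;
	}
	if (*buffer < 0 || *buffer > table->maxval)
		return -1;
	*table->data = *buffer;
	/* "table->maxval = 0;" would not compile here: *table is const. */
	return 0;
}
```

The only change at each call site is the added const qualifier, which is why such a conversion can in principle leave the existing parameter layout untouched.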

Re: [PATCH v2] statx: stx_subvol

2024-03-10 Thread Dave Chinner
t, truncate, clear rt, copy data back into data dev". It's still the same inode, and may have exactly the same data, so why should change stx_vol and make it appear to userspace as being a different inode? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 3/4] xattr: Use dedicated slab buckets for setxattr()

2024-03-04 Thread Dave Chinner
ng vector that almost no-one will ever see for a far more frequent -ENOMEM denial of service that will be seen on production systems where large xattrs are used. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v3 25/35] xfs: Memory allocation profiling fixups

2024-02-14 Thread Dave Chinner
> - return kmem_alloc(size, flags | KM_ZERO); > -} > +#define kmem_zalloc(_size, _flags) kmem_alloc((_size), (_flags) | KM_ZERO) > > /* > * Zone interfaces > -- > 2.43.0.687.g38aa6559b0-goog These changes can be dropped - the fs/xfs/kmem.[ch] stuff is now gone in linux-xfs/for-next. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC 05/18] pkernfs: add file mmap callback

2024-02-05 Thread Dave Chinner
t happens when this file is truncated whilst it is mmap()d by an application? Ain't that just a great big UAF waiting to be exploited? -Dave. -- Dave Chinner da...@fromorbit.com ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec

Re: [RFC PATCH v2 7/8] Introduce dcache_is_aliasing() across all architectures

2024-01-31 Thread Dave Chinner
On Wed, Jan 31, 2024 at 09:58:21AM -0500, Mathieu Desnoyers wrote: > On 2024-01-30 21:48, Dave Chinner wrote: > > On Tue, Jan 30, 2024 at 11:52:54AM -0500, Mathieu Desnoyers wrote: > > > Introduce a generic way to query whether the dcache is virtually aliased > > >

Re: [RFC PATCH v2 8/8] dax: Fix incorrect list of dcache aliasing architectures

2024-01-30 Thread Dave Chinner
ner should go into fs_dax_get_by_bdev(), similar to the blk_queue_dax() check at the start of the function. I also noticed that device mapper uses fs_dax_get_by_bdev() to determine if it can support DAX, but this patch set does not address that case. Hence it really seems to me like fs_dax_get_by_bdev() is the right place to put this check. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH v2 7/8] Introduce dcache_is_aliasing() across all architectures

2024-01-30 Thread Dave Chinner
tions with the VFS dentry cache aliasing when we read this code? Something like cpu_dcache_is_aliased(), perhaps? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH v2 1/8] dax: Introduce dax_is_supported()

2024-01-30 Thread Dave Chinner
h currently returns NULL if CONFIG_FS_DAX=n and so should be changed to return NULL if any of these platform configs is enabled. Then I don't think you need to change a single line of filesystem code - they'll all just do what they do now if the block device doesn't support DAX -Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH 7/7] xfs: Use dax_is_supported()

2024-01-29 Thread Dave Chinner
che - if the inode has a flag that says "use DAX" and dax is supportable by the hardware, then we turn on DAX for that inode. Otherwise we just use the normal non-dax IO paths. Again, we don't error out the filesystem if DAX is not supported, we just don't turn it on. This check is done in xfs_inode_should_enable_dax() and I think all you need to do is replace the IS_ENABLED(CONFIG_FS_DAX) with a dax_is_supported() call... -Dave. -- Dave Chinner da...@fromorbit.com
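
The decision described above can be sketched in userspace C roughly as follows. This is an illustration under stated assumptions, not the XFS code: struct demo_inode and both functions are hypothetical, and demo_dax_is_supported() merely stands in for the proposed dax_is_supported() platform query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-inode state; not the real struct xfs_inode. */
struct demo_inode {
	bool wants_dax;		/* the "use DAX" flag is set on the inode */
	bool dax_enabled;	/* result of the decision below */
};

/* Stand-in for a platform query such as the proposed dax_is_supported(). */
bool demo_dax_is_supported(bool hw_can_dax)
{
	return hw_can_dax;
}

/*
 * Turn DAX on only when the inode asks for it and the platform can do
 * it. Unsupported hardware is not an error: the inode simply uses the
 * normal non-DAX IO path.
 */
void demo_setup_inode(struct demo_inode *ip, bool hw_can_dax)
{
	assert(ip != NULL);
	ip->dax_enabled = ip->wants_dax && demo_dax_is_supported(hw_can_dax);
}
```

Note that there is no failure return at all: the lack of DAX support only changes which IO path is used.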

Re: [f2fs-dev] [PATCH 1/5] zonefs: pass GFP_KERNEL to blkdev_zone_mgmt() call

2024-01-23 Thread Dave Chinner via Linux-f2fs-devel
(_I(inode)->i_truncate_mutex); - so, this > function is called with the mutex held - could it happen that the > GFP_KERNEL allocation recurses into the filesystem and attempts to take > i_truncate_mutex as well? > > i.e. GFP_KERNEL -> iomap_do_writ

Re: [PATCH 1/5] zonefs: pass GFP_KERNEL to blkdev_zone_mgmt() call

2024-01-23 Thread Dave Chinner
(_I(inode)->i_truncate_mutex); - so, this > function is called with the mutex held - could it happen that the > GFP_KERNEL allocation recurses into the filesystem and attempts to take > i_truncate_mutex as well? > > i.e. GFP_KERNEL -> iomap_do_writepage -> zonefs_write_map_blocks -> > zonefs_write_iomap_begin -> mutex_lock(>i_truncate_mutex) zonefs doesn't have a ->writepage method, so writeback can't be called from memory reclaim like this. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] bcachefs: fix incorrect usage of REQ_OP_FLUSH

2024-01-22 Thread Dave Chinner
t corpse to isolate the write groups where the consistency failure occurs when doing work to optimise flushes being issued by the XFS journal checkpoint writes. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: file handle in statx

2023-12-12 Thread Dave Chinner
On Tue, Dec 12, 2023 at 05:39:27PM -0500, Kent Overstreet wrote: > On Wed, Dec 13, 2023 at 09:23:18AM +1100, Dave Chinner wrote: > > On Wed, Dec 13, 2023 at 08:57:43AM +1100, NeilBrown wrote: > > > On Wed, 13 Dec 2023, Dave Chinner wrote: > > > > On Tue, Dec 12, 2

Re: file handle in statx

2023-12-12 Thread Dave Chinner
On Tue, Dec 12, 2023 at 09:15:29AM -0800, Frank Filz wrote: > > On Tue, Dec 12, 2023 at 10:10:23AM +0100, Donald Buczek wrote: > > > On 12/12/23 06:53, Dave Chinner wrote: > > > > > > > So can someone please explain to me why we need to try to re-invent

Re: file handle in statx (was: Re: How to cope with subvolumes and snapshots on muti-user systems?)

2023-12-12 Thread Dave Chinner
On Tue, Dec 12, 2023 at 10:21:53AM -0500, Kent Overstreet wrote: > On Tue, Dec 12, 2023 at 04:53:28PM +1100, Dave Chinner wrote: > > Doesn't anyone else see or hear the elephant trumpeting loudly in > > the middle of the room? > > > > I mean, we already have name_t

Re: file handle in statx (was: Re: How to cope with subvolumes and snapshots on muti-user systems?)

2023-12-11 Thread Dave Chinner
andle to determine what subvol/snapshot the inode belongs to when the handle is passed back to it (e.g. from open_by_handle_at()) then nothing else needs to care how it is encoded. So can someone please explain to me why we need to try to re-invent a generic filehandle concept in statx when we already have a working and widely supported user API that provides exactly this functionality? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: About the conflict between XFS inode recycle and VFS rcu-walk

2023-12-07 Thread Dave Chinner
any particular urgency to address it. > Are there any recommended workarounds until an elegant and efficient solution > can be proposed? After all, causing a crash is extremely unacceptable in a > production environment. What crashes are you seeing in your production environment? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 04/11] lib/dlock-list: Make sibling CPUs share the same linked list

2023-12-06 Thread Dave Chinner
On Thu, Dec 07, 2023 at 12:42:59AM -0500, Kent Overstreet wrote: > On Wed, Dec 06, 2023 at 05:05:33PM +1100, Dave Chinner wrote: > > From: Waiman Long > > > > The dlock list needs one list for each of the CPUs available. However, > > for sibling CPUs, they are sha

Re: [PATCH 08/11] vfs: inode cache conversion to hash-bl

2023-12-06 Thread Dave Chinner
On Wed, Dec 06, 2023 at 11:58:44PM -0500, Kent Overstreet wrote: > On Wed, Dec 06, 2023 at 05:05:37PM +1100, Dave Chinner wrote: > > From: Dave Chinner > > > > Scalability of the global inode_hash_lock really sucks for > > filesystems that use the vfs inode cac

Re: [PATCH 03/11] vfs: Use dlock list for superblock's inode list

2023-12-06 Thread Dave Chinner
On Thu, Dec 07, 2023 at 02:40:24AM +, Al Viro wrote: > On Wed, Dec 06, 2023 at 05:05:32PM +1100, Dave Chinner wrote: > > > @@ -303,6 +303,7 @@ static void destroy_unused_super(struct super_block *s) > > super_unlock_excl(s); > > list_l

Re: [PATCH 10/11] list_bl: don't use bit locks for PREEMPT_RT or lockdep

2023-12-06 Thread Dave Chinner
On Wed, Dec 06, 2023 at 11:16:50PM -0500, Kent Overstreet wrote: > On Wed, Dec 06, 2023 at 05:05:39PM +1100, Dave Chinner wrote: > > From: Dave Chinner > > > > hash-bl nests spinlocks inside the bit locks. This causes problems > > for CONFIG_PREEMPT_RT which converts s

Re: [PATCH 05/11] selinux: use dlist for isec inode list

2023-12-06 Thread Dave Chinner
On Wed, Dec 06, 2023 at 04:52:42PM -0500, Paul Moore wrote: > On Wed, Dec 6, 2023 at 1:07 AM Dave Chinner wrote: > > > > From: Dave Chinner > > > > Because it's a horrible point of lock contention under heavily > > concurrent directory traversals...

[PATCH 03/11] vfs: Use dlock list for superblock's inode list

2023-12-05 Thread Dave Chinner
unlock 0.67% __raw_callee_save___pv_queued_spin_unlock Signed-off-by: Waiman Long Signed-off-by: Dave Chinner --- block/bdev.c | 24 fs/drop_caches.c | 9 - fs/gfs2/ops_fstype.c | 21 +++-- fs/inode.c

[PATCH 10/11] list_bl: don't use bit locks for PREEMPT_RT or lockdep

2023-12-05 Thread Dave Chinner
From: Dave Chinner hash-bl nests spinlocks inside the bit locks. This causes problems for CONFIG_PREEMPT_RT which converts spin locks to sleeping locks, and we're not allowed to sleep while holding a spinning lock. Further, lockdep does not support bit locks, so we lose lockdep coverage
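
As background on what a bit lock is, here is a minimal userspace sketch using C11 atomics. It assumes, as hlist_bl does to my understanding, that the lock lives in bit 0 of the list head pointer word. All names are hypothetical and this is not the kernel implementation. Because the whole lock is a single bit, there is nowhere to keep extra state for lock debugging or to substitute a sleeping lock, which is the constraint the patch description refers to.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical list head: bit 0 of the word is the lock, the rest is the pointer. */
struct demo_bl_head {
	atomic_uintptr_t first;
};

#define DEMO_BL_LOCK ((uintptr_t)1)

void demo_bl_lock(struct demo_bl_head *h)
{
	/* Spin until we are the one who flips bit 0 from clear to set. */
	while (atomic_fetch_or(&h->first, DEMO_BL_LOCK) & DEMO_BL_LOCK)
		;
}

void demo_bl_unlock(struct demo_bl_head *h)
{
	uintptr_t old = atomic_fetch_and(&h->first, ~DEMO_BL_LOCK);

	assert(old & DEMO_BL_LOCK);	/* must have been locked */
	(void)old;
}

bool demo_bl_is_locked(struct demo_bl_head *h)
{
	return (atomic_load(&h->first) & DEMO_BL_LOCK) != 0;
}

void *demo_bl_first(struct demo_bl_head *h)
{
	/* Mask the lock bit off to recover the real pointer. */
	return (void *)(atomic_load(&h->first) & ~DEMO_BL_LOCK);
}
```

The sketch relies on list nodes being at least 2-byte aligned so that bit 0 of a valid pointer is always clear.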

[PATCH 08/11] vfs: inode cache conversion to hash-bl

2023-12-05 Thread Dave Chinner
From: Dave Chinner Scalability of the global inode_hash_lock really sucks for filesystems that use the vfs inode cache (i.e. everything but XFS). Profiles of a 32-way concurrent sharded directory walk (no contended directories) on a couple of different filesystems. All numbers from a 6.7-rc4

[PATCH 01/11] lib/dlock-list: Distributed and lock-protected lists

2023-12-05 Thread Dave Chinner
From: Waiman Long Linked list is used everywhere in the Linux kernel. However, if many threads are trying to add or delete entries into the same linked list, it can create a performance bottleneck. This patch introduces a new list APIs that provide a set of distributed lists (one per CPU), each
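
The idea of a set of distributed lists, each with its own lock, can be sketched in userspace C as below. This is a toy under stated assumptions, not the lib/dlock-list API: the names, the fixed DEMO_NR_CPUS, the singly linked nodes and the simple spin lock are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4	/* assumed fixed CPU count for the sketch */

struct demo_node {
	struct demo_node *next;
	int value;
};

/* One list plus one lock per CPU, so adders on different CPUs never contend. */
struct demo_dlock_head {
	atomic_flag lock;
	struct demo_node *first;
};

struct demo_dlock_list {
	struct demo_dlock_head heads[DEMO_NR_CPUS];
};

void demo_dlock_init(struct demo_dlock_list *dl)
{
	for (int i = 0; i < DEMO_NR_CPUS; i++) {
		atomic_flag_clear(&dl->heads[i].lock);
		dl->heads[i].first = NULL;
	}
}

/* Add a node to the list owned by @cpu, taking only that list's lock. */
void demo_dlock_add(struct demo_dlock_list *dl, int cpu, struct demo_node *n)
{
	struct demo_dlock_head *h;

	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	h = &dl->heads[cpu];
	while (atomic_flag_test_and_set(&h->lock))
		;	/* spin */
	n->next = h->first;
	h->first = n;
	atomic_flag_clear(&h->lock);
}

/* Walk every per-CPU list in turn, locking one list at a time. */
int demo_dlock_sum(struct demo_dlock_list *dl)
{
	int sum = 0;

	for (int i = 0; i < DEMO_NR_CPUS; i++) {
		struct demo_dlock_head *h = &dl->heads[i];

		while (atomic_flag_test_and_set(&h->lock))
			;	/* spin */
		for (struct demo_node *n = h->first; n; n = n->next)
			sum += n->value;
		atomic_flag_clear(&h->lock);
	}
	return sum;
}
```

The trade-off is visible in demo_dlock_sum(): adds and removes become cheap and local, while a full traversal has to visit every per-CPU list.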

[PATCH 04/11] lib/dlock-list: Make sibling CPUs share the same linked list

2023-12-05 Thread Dave Chinner
From: Waiman Long The dlock list needs one list for each of the CPUs available. However, for sibling CPUs, they are sharing the L2 and probably L1 caches too. As a result, there is not much to gain in term of avoiding cacheline contention while increasing the cacheline footprint of the L1/L2
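
One way to picture sibling CPUs sharing a list is a CPU-to-list index map, sketched below. The function and its first_sibling[] input are hypothetical and assume the caller already knows the lowest-numbered CPU of each sibling group. How the actual patch derives this is not shown in the excerpt.

```c
#include <assert.h>

/*
 * Build a CPU -> list index map in which sibling CPUs share one list.
 * first_sibling[cpu] is assumed to hold the lowest-numbered CPU in
 * cpu's sibling group (hypothetical input, not a kernel API).
 * Returns the number of lists needed.
 */
int demo_build_cpu2idx(const int *first_sibling, int ncpus, int *cpu2idx)
{
	int nlists = 0;

	for (int cpu = 0; cpu < ncpus; cpu++) {
		int leader = first_sibling[cpu];

		assert(leader >= 0 && leader <= cpu);
		if (leader == cpu)
			cpu2idx[cpu] = nlists++;	/* new group, new list */
		else
			cpu2idx[cpu] = cpu2idx[leader];	/* reuse the leader's list */
	}
	return nlists;
}
```

With four CPUs where 0/2 and 1/3 are sibling pairs, this yields two lists instead of four, halving the number of list heads and locks that have to be kept in cache.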

[PATCH 11/11] hlist-bl: introduced nested locking for dm-snap

2023-12-05 Thread Dave Chinner
From: Dave Chinner Testing with lockdep enabled threw this warning from generic/081 in fstests: [ 2369.724151] [ 2369.725805] WARNING: possible recursive locking detected [ 2369.727125] 6.7.0-rc2-dgc+ #1952 Not tainted [ 2369.728647

[PATCH 09/11] hash-bl: explicitly initialise hash-bl heads

2023-12-05 Thread Dave Chinner
From: Dave Chinner Because we are going to change how the structure is laid out to support RTPREEMPT and LOCKDEP, just assuming that the hash table is allocated as zeroed memory is no longer sufficient to initialise a hash-bl table. Signed-off-by: Dave Chinner --- fs/dcache.c | 21

[PATCH 05/11] selinux: use dlist for isec inode list

2023-12-05 Thread Dave Chinner
From: Dave Chinner Because it's a horrible point of lock contention under heavily concurrent directory traversals... - 12.14% d_instantiate - 12.06% security_d_instantiate - 12.13% selinux_d_instantiate - 12.16% inode_doinit_with_dentry - 15.45

[PATCH 0/11] vfs: inode cache scalability improvements

2023-12-05 Thread Dave Chinner
We all know that the global inode_hash_lock and the per-fs global sb->s_inode_list_lock locks are contention points in filesystem workloads that stream inodes through memory, so it's about time we addressed these limitations. The first part of the patchset address the sb->s_inode_list_lock. This

[PATCH 02/11] vfs: Remove unnecessary list_for_each_entry_safe() variants

2023-12-05 Thread Dave Chinner
From: Jan Kara evict_inodes() and invalidate_inodes() use list_for_each_entry_safe() to iterate sb->s_inodes list. However, since we use i_lru list entry for our local temporary list of inodes to destroy, the inode is guaranteed to stay in sb->s_inodes list while we hold sb->s_inode_list_lock.

[PATCH 07/11] hlist-bl: add hlist_bl_fake()

2023-12-05 Thread Dave Chinner
From: Dave Chinner In preparation for switching the VFS inode cache over to hlist_bl lists, we need to be able to fake a list node that looks like it is hashed for correct operation of filesystems that don't directly use the VFS inode cache. Signed-off-by: Dave Chinner --- include/linux

[PATCH 06/11] vfs: factor out inode hash head calculation

2023-12-05 Thread Dave Chinner
From: Dave Chinner In preparation for changing the inode hash table implementation. Signed-off-by: Dave Chinner --- fs/inode.c | 44 +--- 1 file changed, 25 insertions(+), 19 deletions(-) diff --git a/fs/inode.c b/fs/inode.c index 3426691fa305

Re: [PATCH 2/5] pstore: inode: Convert mutex usage to guard(mutex)

2023-12-04 Thread Dave Chinner
lock); > + guard(mutex)(_list_lock); > INIT_LIST_HEAD(_list); > - mutex_unlock(_list_lock); > - > - mutex_unlock(_sb_lock); > } And this worries me, because guard() makes it harder to see where locks are nested and the scope they apply to. At least with lock/unlock pairs the scope of the critical sections and the nestings are obvious. So, yeah, I see that there is a bit less code with these fancy new macros, but I don't think it's made the code easier to read and maintain at all. Just my 2c worth... -Dave. -- Dave Chinner da...@fromorbit.com
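
To make the readability concern concrete, here is a small userspace sketch contrasting explicit lock/unlock pairs with a scope-based guard. It assumes the GCC/Clang cleanup attribute and uses a toy counting lock. DEMO_GUARD and every other name here are hypothetical and are not the kernel's guard() machinery.

```c
#include <assert.h>

/* Toy lock that just counts, so the sketch can show when unlock runs. */
struct demo_lock {
	int held;
	int unlock_count;
};

void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = 0;
	l->unlock_count++;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
void demo_guard_exit(struct demo_lock **lp)
{
	demo_lock_release(*lp);
}

/* Hypothetical guard macro built on the GCC/Clang cleanup attribute. */
#define DEMO_GUARD(name, lockp) \
	struct demo_lock *name __attribute__((cleanup(demo_guard_exit))) = \
		(demo_lock_acquire(lockp), (lockp))

/* Explicit pairing: the critical section is visible in the source. */
int demo_explicit(struct demo_lock *l, int v)
{
	int ret;

	demo_lock_acquire(l);
	ret = v * 2;
	demo_lock_release(l);
	return ret;
}

/* Guard style: the unlock happens implicitly at every exit from scope. */
int demo_guarded(struct demo_lock *l, int v)
{
	DEMO_GUARD(g, l);

	if (v < 0)
		return -1;	/* lock dropped here, with no unlock in sight */
	return v * 2;		/* and here */
}
```

Both functions release the lock on every path. The difference is only in where a reader has to look to find the end of the critical section, and with nested guards the unlock order is implied by declaration order rather than written out.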

Re: [PATCH 2/3] MAINTAINERS: Require kvm-xfstests smoke for ext4

2023-11-22 Thread Dave Chinner
sive, long running tests after code has been integrated into the tree. Forcing individual developers to run this sort of testing just isn't an efficient use of resources > For /new features/, the developer(s) ought to come up with a testing > plan and run that by the community. Eventually those will merge into > fstests or ktest or wherever. That's how it already works, isn't it? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v9 0/3] [PATCH v9 0/3] Introduce provisioning primitives

2023-11-20 Thread Dave Chinner
On Mon, Nov 13, 2023 at 01:26:51PM -0800, Sarthak Kukreti wrote: > On Fri, Nov 10, 2023 at 4:56 PM Dave Chinner wrote: > > > > On Thu, Nov 09, 2023 at 05:01:35PM -0800, Sarthak Kukreti wrote: > > > Hi, > > > > > > This patch series is version 9 of the p

Re: [PATCH v9 0/3] [PATCH v9 0/3] Introduce provisioning primitives

2023-11-10 Thread Dave Chinner
e() operations through XFS? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: (subset) [PATCH 22/32] vfs: inode cache conversion to hash-bl

2023-11-04 Thread Dave Chinner
s because RT makes spinlocks sleeping locks. - There's been additions for lockless RCU inode hash lookups from AFS and ext4 in weird, uncommon corner cases and I have no idea how to validate they still work correctly with hash-bl. I suspect they should just go away with hash-bl, but There's more, but these are the big ones. -Dave. -- Dave Chinner da...@fromorbit.com

Re: (subset) [PATCH 22/32] vfs: inode cache conversion to hash-bl

2023-10-22 Thread Dave Chinner
On Fri, Oct 20, 2023 at 07:49:18PM +0200, Mateusz Guzik wrote: > On 10/20/23, Dave Chinner wrote: > > On Thu, Oct 19, 2023 at 05:59:58PM +0200, Mateusz Guzik wrote: > >> > To be clear there is no urgency as far as I'm concerned, but I did run > >> > into somethin

Re: (subset) [PATCH 22/32] vfs: inode cache conversion to hash-bl

2023-10-20 Thread Dave Chinner
Hence there's no urgency to "fix" these lock contention problems despite the ease with which micro-benchmarks can reproduce it... I've kept the patches current for years, even though there hasn't been a pressing need for them. The last "vfs-scale" version I did some validation on is here: https://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs.git/log/?h=vfs-scale 5.17 was the last kernel I did any serious validation and measurement against, and that all needs to be repeated before proposing it for inclusion because lots of stuff has changed since I last did some serious multi-filesystem a/b testing of this code -Dave. -- Dave Chinner da...@fromorbit.com

Re: [powerpc] kernel BUG fs/xfs/xfs_message.c:102! [4k block]

2023-10-12 Thread Dave Chinner
sis yet. I suspect the fix may well be to use xfs_trans_buf_get() in the xfs_inode_item_precommit() path if XFS_ISTALE is already set on the inode we are trying to log. We don't need a populated cluster buffer to read data out of or write data into in this path - all we need to do is attach the inode to the buffer so that when the buffer invalidation is committed to the journal it will also correctly finish the stale inode log item. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [dm-devel] [PATCH v8 0/5] Introduce provisioning primitives

2023-10-10 Thread Dave Chinner
On Tue, Oct 10, 2023 at 03:42:53PM -0700, Sarthak Kukreti wrote: > On Sun, Oct 8, 2023 at 4:50 PM Dave Chinner wrote: > > > > On Fri, Oct 06, 2023 at 06:28:12PM -0700, Sarthak Kukreti wrote: > > > Hi, > > > > > > This patch series is version 8 of the p

Re: [dm-devel] [PATCH v8 5/5] block: Pass unshare intent via REQ_OP_PROVISION

2023-10-10 Thread Dave Chinner
On Tue, Oct 10, 2023 at 03:42:39PM -0700, Sarthak Kukreti wrote: > On Sun, Oct 8, 2023 at 4:27 PM Dave Chinner wrote: > > > > On Fri, Oct 06, 2023 at 06:28:17PM -0700, Sarthak Kukreti wrote: > > > Allow REQ_OP_PROVISION to pass in an extra REQ_UNSHARE bit to > >

Re: [dm-devel] [PATCH v8 3/5] loop: Add support for provision requests

2023-10-10 Thread Dave Chinner
On Tue, Oct 10, 2023 at 03:43:10PM -0700, Sarthak Kukreti wrote: > On Sun, Oct 8, 2023 at 4:37 PM Dave Chinner wrote: > > > > On Fri, Oct 06, 2023 at 06:28:15PM -0700, Sarthak Kukreti wrote: > > > Add support for provision requests to loopback devices. > > > Loop

Re: [dm-devel] [PATCH v8 0/5] Introduce provisioning primitives

2023-10-08 Thread Dave Chinner
eallocation gets propagated by the filesystem down to the backing device correctly and that subsequent IO to the file then does the right thing (e.g. fio testing using fallocate() to set up the files being written to) -Dave. -- Dave Chinner da...@fromorbit.com -- dm-devel mailing list dm-devel@

[dm-devel] [RFC PATCH 7/5] xfs: add block device provisioning for fallocate

2023-10-08 Thread Dave Chinner
From: Dave Chinner Provision space in the block device for preallocated file space when userspace asks for it. Make sure to do this outside of transaction context so it can fail without causing a filesystem shutdown. XXX: async provisioning submission/completion interface would be really useful
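
For context, the userspace request that would lead to such provisioning is a file preallocation. The sketch below uses posix_fallocate(), which on Linux with glibc normally maps to fallocate(). The helper names and the /tmp scratch path are assumptions of the sketch, and nothing here shows what the filesystem or block device does underneath.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Preallocate @len bytes at the start of @fd.
 * Returns 0 on success or a positive errno value on failure.
 */
int demo_preallocate(int fd, off_t len)
{
	assert(fd >= 0 && len > 0);
	/* posix_fallocate() returns the error number directly, not -1. */
	return posix_fallocate(fd, 0, len);
}

/*
 * Create a scratch file, preallocate it and return its size in bytes,
 * or -1 if any step fails. The /tmp path is an assumption of the sketch.
 */
long long demo_prealloc_scratch(off_t len)
{
	char path[] = "/tmp/demo_prealloc_XXXXXX";
	struct stat st;
	long long size = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	if (demo_preallocate(fd, len) == 0 && fstat(fd, &st) == 0)
		size = (long long)st.st_size;
	close(fd);
	unlink(path);
	return size;
}
```

Because the caller gets an error code back from the preallocation call, a failure to provision can be reported to the application at this point, which fits the stated goal of letting it fail without shutting the filesystem down.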

[dm-devel] [RFC PATCH 6/5] xfs: detect block devices requiring provisioning

2023-10-08 Thread Dave Chinner
From: Dave Chinner Block device provisioning detection infrastructure. Signed-off-by: Dave Chinner --- fs/xfs/xfs_buf.c | 2 ++ fs/xfs/xfs_buf.h | 1 + fs/xfs/xfs_mount.h | 11 ++- fs/xfs/xfs_super.c | 4 4 files changed, 17 insertions(+), 1 deletion(-) diff --git a/fs

Re: [dm-devel] [PATCH v8 3/5] loop: Add support for provision requests

2023-10-08 Thread Dave Chinner
ONE At minimum, this set of implementation constraints needs to be documented somewhere... -Dave. -- Dave Chinner da...@fromorbit.com -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel

Re: [dm-devel] [PATCH v8 5/5] block: Pass unshare intent via REQ_OP_PROVISION

2023-10-08 Thread Dave Chinner
hould be blkdev_issue_unshare() rather than optional behaviour to _provision() which - before this patch - had clear and well defined meaning Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] xfs: drop experimental warning for FSDAX

2023-09-26 Thread Dave Chinner
emory failure (2 pages) > Inject poison... > -Process is killed by signal: 7 > +Memory failure didn't kill the process > > (yes, rmap is enabled) Yes, I see the same failures, too. I've just been ignoring them because I thought that all the memory failure code was still not complete -Dave. -- Dave Chinner da...@fromorbit.com

Re: [f2fs-dev] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-09-10 Thread Dave Chinner via Linux-f2fs-devel
On Fri, Sep 08, 2023 at 01:29:55AM +0100, Pavel Begunkov wrote: > On 9/3/23 23:30, Dave Chinner wrote: > > On Wed, Aug 30, 2023 at 02:11:31PM +0800, Hao Xu wrote: > > > On 8/29/23 19:53, Matthew Wilcox wrote: > > > > On Tue, Aug 29, 2023 at 03:46:13PM +0800, Hao Xu

Re: [Cluster-devel] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-09-10 Thread Dave Chinner
On Fri, Sep 08, 2023 at 01:29:55AM +0100, Pavel Begunkov wrote: > On 9/3/23 23:30, Dave Chinner wrote: > > On Wed, Aug 30, 2023 at 02:11:31PM +0800, Hao Xu wrote: > > > On 8/29/23 19:53, Matthew Wilcox wrote: > > > > On Tue, Aug 29, 2023 at 03:46:13PM +0800, Hao Xu

Re: [Linux-cachefs] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-09-10 Thread Dave Chinner
On Fri, Sep 08, 2023 at 01:29:55AM +0100, Pavel Begunkov wrote: > On 9/3/23 23:30, Dave Chinner wrote: > > On Wed, Aug 30, 2023 at 02:11:31PM +0800, Hao Xu wrote: > > > On 8/29/23 19:53, Matthew Wilcox wrote: > > > > On Tue, Aug 29, 2023 at 03:46:13PM +0800, Hao Xu

Re: [f2fs-dev] [PATCH 02/11] xfs: add NOWAIT semantics for readdir

2023-09-03 Thread Dave Chinner via Linux-f2fs-devel
On Sun, Aug 27, 2023 at 09:28:26PM +0800, Hao Xu wrote: > From: Hao Xu > > Implement NOWAIT semantics for readdir. Return EAGAIN error to the > caller if it would block, like failing to get locks, or going to > do IO. > > Co-developed-by: Dave Chinner Not really. "C

Re: [Linux-cachefs] [PATCH 02/11] xfs: add NOWAIT semantics for readdir

2023-09-03 Thread Dave Chinner
On Sun, Aug 27, 2023 at 09:28:26PM +0800, Hao Xu wrote: > From: Hao Xu > > Implement NOWAIT semantics for readdir. Return EAGAIN error to the > caller if it would block, like failing to get locks, or going to > do IO. > > Co-developed-by: Dave Chinner Not really. "C

Re: [Cluster-devel] [PATCH 02/11] xfs: add NOWAIT semantics for readdir

2023-09-03 Thread Dave Chinner
On Sun, Aug 27, 2023 at 09:28:26PM +0800, Hao Xu wrote: > From: Hao Xu > > Implement NOWAIT semantics for readdir. Return EAGAIN error to the > caller if it would block, like failing to get locks, or going to > do IO. > > Co-developed-by: Dave Chinner Not really. "C

Re: [f2fs-dev] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-09-03 Thread Dave Chinner via Linux-f2fs-devel
> > Hi Matthew, > The previous discussion shows this does cause issues in real > producations: > https://lore.kernel.org/io-uring/2785f009-2ebb-028d-8250-d5f3a3051...@gmail.com/#:~:text=fwiw%2C%20we%27ve%20just%20recently%20had%20similar%20problems%20with%20io_ur

Re: [Linux-cachefs] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-09-03 Thread Dave Chinner
> > Hi Matthew, > The previous discussion shows this does cause issues in real > producations: > https://lore.kernel.org/io-uring/2785f009-2ebb-028d-8250-d5f3a3051...@gmail.com/#:~:text=fwiw%2C%20we%27ve%20just%20recently%20had%20similar%20problems%20with%20io_uring%20read/write > Then se

Re: [Cluster-devel] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-09-03 Thread Dave Chinner
> > Hi Matthew, > The previous discussion shows this does cause issues in real > production: > https://lore.kernel.org/io-uring/2785f009-2ebb-028d-8250-d5f3a3051...@gmail.com/#:~:text=fwiw%2C%20we%27ve%20just%20recently%20had%20similar%20problems%20with%20io_uring%20read/write > Then separate it out into its own patch set so we can have a discussion on the merits of requiring using noatime, relatime or lazytime for really latency sensitive IO applications. Changing code is not always the right solution... -Dave. -- Dave Chinner da...@fromorbit.com

Re: [f2fs-dev] [PATCH RFC v5 00/29] io_uring getdents

2023-08-25 Thread Dave Chinner via Linux-f2fs-devel
tarted with and don't try to solve every little blocking problem that might exist in the VFS and filesystems... -Dave -- Dave Chinner da...@fromorbit.com ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

Re: [Cluster-devel] [PATCH RFC v5 00/29] io_uring getdents

2023-08-25 Thread Dave Chinner
tarted with and don't try to solve every little blocking problem that might exist in the VFS and filesystems... -Dave -- Dave Chinner da...@fromorbit.com

Re: [Linux-cachefs] [PATCH RFC v5 00/29] io_uring getdents

2023-08-25 Thread Dave Chinner
tarted with and don't try to solve every little blocking problem that might exist in the VFS and filesystems... -Dave -- Dave Chinner da...@fromorbit.com -- Linux-cachefs mailing list Linux-cachefs@redhat.com https://listman.redhat.com/mailman/listinfo/linux-cachefs

Re: [f2fs-dev] [PATCH 25/29] xfs: support nowait for xfs_buf_item_init()

2023-08-25 Thread Dave Chinner via Linux-f2fs-devel
ven worse than that - once we have committed intents, the whole chain of intent processing must be run to completion. Hence we can't tolerate backing out of that deferred processing chain half way through because we might have to block. Until we can roll back partial dirty transactions and partially completed deferred intent chains at any random point of completion, XFS_TRANS_NOWAIT will not work. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [Linux-cachefs] [PATCH 25/29] xfs: support nowait for xfs_buf_item_init()

2023-08-25 Thread Dave Chinner
ven worse than that - once we have committed intents, the whole chain of intent processing must be run to completion. Hence we can't tolerate backing out of that deferred processing chain half way through because we might have to block. Until we can roll back partial dirty transactions and partially completed deferred intent chains at any random point of completion, XFS_TRANS_NOWAIT will not work. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [Cluster-devel] [PATCH 25/29] xfs: support nowait for xfs_buf_item_init()

2023-08-25 Thread Dave Chinner
ven worse than that - once we have committed intents, the whole chain of intent processing must be run to completion. Hence we can't tolerate backing out of that deferred processing chain half way through because we might have to block. Until we can roll back partial dirty transactions and partially completed deferred intent chains at any random point of completion, XFS_TRANS_NOWAIT will not work. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [f2fs-dev] [PATCH 24/29] xfs: support nowait for xfs_buf_read_map()

2023-08-25 Thread Dave Chinner via Linux-f2fs-devel
to my first comments that XBF_TRYLOCK cannot simply be replaced with XBF_NOWAIT semantics... -Dave. -- Dave Chinner da...@fromorbit.com

Re: [f2fs-dev] [PATCH 28/29] xfs: support nowait semantics for xc_ctx_lock in xlog_cil_commit()

2023-08-25 Thread Dave Chinner via Linux-f2fs-devel
tem shutdown as we cannot back out from failure with dirty log items gracefully at this point. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [Linux-cachefs] [PATCH 28/29] xfs: support nowait semantics for xc_ctx_lock in xlog_cil_commit()

2023-08-25 Thread Dave Chinner
tem shutdown as we cannot back out from failure with dirty log items gracefully at this point. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [Cluster-devel] [PATCH 28/29] xfs: support nowait semantics for xc_ctx_lock in xlog_cil_commit()

2023-08-25 Thread Dave Chinner
tem shutdown as we cannot back out from failure with dirty log items gracefully at this point. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [f2fs-dev] [PATCH 26/29] xfs: return -EAGAIN when nowait meets sync in transaction commit

2023-08-25 Thread Dave Chinner via Linux-f2fs-devel
is point with shutting down the filesystem. This points to XFS_TRANS_NOWAIT being completely broken, too, because we don't call xfs_trans_set_sync() until just before we commit the transaction. At this point, it is -too late- for nowait+sync to be handled gracefully, and it will *always* go bad. IOWs,

Re: [Cluster-devel] [PATCH 26/29] xfs: return -EAGAIN when nowait meets sync in transaction commit

2023-08-25 Thread Dave Chinner
is point with shutting down the filesystem. This points to XFS_TRANS_NOWAIT being completely broken, too, because we don't call xfs_trans_set_sync() until just before we commit the transaction. At this point, it is -too late- for nowait+sync to be handled gracefully, and it will *always* go bad. IOWs,

Re: [Linux-cachefs] [PATCH 26/29] xfs: return -EAGAIN when nowait meets sync in transaction commit

2023-08-25 Thread Dave Chinner
is point with shutting down the filesystem. This points to XFS_TRANS_NOWAIT being completely broken, too, because we don't call xfs_trans_set_sync() until just before we commit the transaction. At this point, it is -too late- for nowait+sync to be handled gracefully, and it will *always* go bad. IOWs,

Re: [Linux-cachefs] [PATCH 24/29] xfs: support nowait for xfs_buf_read_map()

2023-08-25 Thread Dave Chinner
to my first comments that XBF_TRYLOCK cannot simply be replaced with XBF_NOWAIT semantics... -Dave. -- Dave Chinner da...@fromorbit.com

Re: [Cluster-devel] [PATCH 24/29] xfs: support nowait for xfs_buf_read_map()

2023-08-25 Thread Dave Chinner
to my first comments that XBF_TRYLOCK cannot simply be replaced with XBF_NOWAIT semantics... -Dave. -- Dave Chinner da...@fromorbit.com

Re: [f2fs-dev] [PATCH 02/29] xfs: rename XBF_TRYLOCK to XBF_NOWAIT

2023-08-25 Thread Dave Chinner via Linux-f2fs-devel
XBF_INCORE (1u << 29) /* lookup only, return if found in cache */ > -#define XBF_TRYLOCK (1u << 30) /* lock requested, but do not wait */ > +#define XBF_NOWAIT (1u << 30) /* mem/lock requested, but do not wait */ That's now a really poor comment. It doesn't describe the semantics or constraints that NOWAIT might imply. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [Cluster-devel] [PATCH 02/29] xfs: rename XBF_TRYLOCK to XBF_NOWAIT

2023-08-25 Thread Dave Chinner
XBF_INCORE (1u << 29) /* lookup only, return if found in cache */ > -#define XBF_TRYLOCK (1u << 30) /* lock requested, but do not wait */ > +#define XBF_NOWAIT (1u << 30) /* mem/lock requested, but do not wait */ That's now a really poor comment. It doesn't describe the semantics or constraints that NOWAIT might imply. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [Linux-cachefs] [PATCH 02/29] xfs: rename XBF_TRYLOCK to XBF_NOWAIT

2023-08-25 Thread Dave Chinner
XBF_INCORE (1u << 29) /* lookup only, return if found in cache */ > -#define XBF_TRYLOCK (1u << 30) /* lock requested, but do not wait */ > +#define XBF_NOWAIT (1u << 30) /* mem/lock requested, but do not wait */ That's now a really poor comment. It doesn't describe the semantics or constraints that NOWAIT might imply. -Dave. -- Dave Chinner da...@fromorbit.com

Re: [f2fs-dev] [PATCH v4 46/48] mm: shrinker: make memcg slab shrink lockless

2023-08-07 Thread Dave Chinner via Linux-f2fs-devel
ropping the refcount to zero and freeing occurring in a different context... > + /* > + * We have already exited the read-side of rcu critical section > + * before calling do_shrink_slab(), the shrinker_info may be > + * re

Re: [PATCH v4 46/48] mm: shrinker: make memcg slab shrink lockless

2023-08-07 Thread Dave Chinner via Linux-erofs
ropping the refcount to zero and freeing occurring in a different context... > + /* > + * We have already exited the read-side of rcu critical section > + * before calling do_shrink_slab(), the shrinker_info may be > + * released in expand_one_shrinker_info(), so reacquire the > + * shrinker_info. > + */ > + index++; > + goto again; With that, what makes the use of shrinker_info in xchg_nr_deferred_memcg() in do_shrink_slab() coherent and valid? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [Cluster-devel] [PATCH v4 46/48] mm: shrinker: make memcg slab shrink lockless

2023-08-07 Thread Dave Chinner
ropping the refcount to zero and freeing occurring in a different context... > + /* > + * We have already exited the read-side of rcu critical section > + * before calling do_shrink_slab(), the shrinker_info may be > + * released in expand_one_shrinker_info(), so reacquire the > + * shrinker_info. > + */ > + index++; > + goto again; With that, what makes the use of shrinker_info in xchg_nr_deferred_memcg() in do_shrink_slab() coherent and valid? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [dm-devel] [PATCH v4 46/48] mm: shrinker: make memcg slab shrink lockless

2023-08-07 Thread Dave Chinner
ropping the refcount to zero and freeing occurring in a different context... > + /* > + * We have already exited the read-side of rcu critical section > + * before calling do_shrink_slab(), the shrinker_info may be > + * released in expand_one_shrinker_info(), so reacquire the > + * shrinker_info. > + */ > + index++; > + goto again; With that, what makes the use of shrinker_info in xchg_nr_deferred_memcg() in do_shrink_slab() coherent and valid? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v4 46/48] mm: shrinker: make memcg slab shrink lockless

2023-08-07 Thread Dave Chinner
ropping the refcount to zero and freeing occurring in a different context... > + /* > + * We have already exited the read-side of rcu critical section > + * before calling do_shrink_slab(), the shrinker_info may be > + * released in expand_one_shrinker_info(), so reacquire the > + * shrinker_info. > + */ > + index++; > + goto again; With that, what makes the use of shrinker_info in xchg_nr_deferred_memcg() in do_shrink_slab() coherent and valid? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v4 46/48] mm: shrinker: make memcg slab shrink lockless

2023-08-07 Thread Dave Chinner via Virtualization
ropping the refcount to zero and freeing occurring in a different context... > + /* > + * We have already exited the read-side of rcu critical section > + * before calling do_shrink_slab(), the shrinker_info may be > + * r

Re: [PATCH v4 46/48] mm: shrinker: make memcg slab shrink lockless

2023-08-07 Thread Dave Chinner
ropping the refcount to zero and freeing occurring in a different context... > + /* > + * We have already exited the read-side of rcu critical section > + * before calling do_shrink_slab(), the shrinker_info may be > + * released in expand_one_shrinker_info(), so reacquire the > + * shrinker_info. > + */ > + index++; > + goto again; With that, what makes the use of shrinker_info in xchg_nr_deferred_memcg() in do_shrink_slab() coherent and valid? -Dave. -- Dave Chinner da...@fromorbit.com

Re: [f2fs-dev] [PATCH v4 45/48] mm: shrinker: make global slab shrink lockless

2023-08-07 Thread Dave Chinner via Linux-f2fs-devel
ker_put(shrinker); > + wait_for_completion(&shrinker->done); > + } Needs a comment explaining why we need to wait here... > + > down_write(&shrinker_rwsem); > if (shrinker->flags & SHRINKER_REGISTERED) { > - list_del(&shrinker->list); > + /*

Re: [dm-devel] [PATCH v4 45/48] mm: shrinker: make global slab shrink lockless

2023-08-07 Thread Dave Chinner
ker_put(shrinker); > + wait_for_completion(&shrinker->done); > + } Needs a comment explaining why we need to wait here... > + > down_write(&shrinker_rwsem); > if (shrinker->flags & SHRINKER_REGISTERED) { > - list_del(&shrinker->list); > +

Re: [Cluster-devel] [PATCH v4 45/48] mm: shrinker: make global slab shrink lockless

2023-08-07 Thread Dave Chinner
ker_put(shrinker); > + wait_for_completion(&shrinker->done); > + } Needs a comment explaining why we need to wait here... > + > down_write(&shrinker_rwsem); > if (shrinker->flags & SHRINKER_REGISTERED) { > - list_del(&shrinker->list); > + /* > + * Lookups on the shrinker are over and will fail in the future, > + * so we can now remove it from the lists and free it. > + */ rather than here after the wait has been done and provided the guarantee that no shrinker is running or will run again... -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v4 45/48] mm: shrinker: make global slab shrink lockless

2023-08-07 Thread Dave Chinner via Linux-erofs
ker_put(shrinker); > + wait_for_completion(&shrinker->done); > + } Needs a comment explaining why we need to wait here... > + > down_write(&shrinker_rwsem); > if (shrinker->flags & SHRINKER_REGISTERED) { > - list_del(&shrinker->list); > + /* > + * Lookups on the shrinker are over and will fail in the future, > + * so we can now remove it from the lists and free it. > + */ rather than here after the wait has been done and provided the guarantee that no shrinker is running or will run again... -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v4 45/48] mm: shrinker: make global slab shrink lockless

2023-08-07 Thread Dave Chinner
ker_put(shrinker); > + wait_for_completion(&shrinker->done); > + } Needs a comment explaining why we need to wait here... > + > down_write(&shrinker_rwsem); > if (shrinker->flags & SHRINKER_REGISTERED) { > - list_del(&shrinker->list); > + /* > + * Lookups on the shrinker are over and will fail in the future, > + * so we can now remove it from the lists and free it. > + */ rather than here after the wait has been done and provided the guarantee that no shrinker is running or will run again... -Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH v4 45/48] mm: shrinker: make global slab shrink lockless

2023-08-07 Thread Dave Chinner via Virtualization
ker_put(shrinker); > + wait_for_completion(&shrinker->done); > + } Needs a comment explaining why we need to wait here... > + > down_write(&shrinker_rwsem); > if (shrinker->flags & SHRINKER_REGISTERED) { > - list_del(&shrinker->list); > + /

Re: [PATCH v4 45/48] mm: shrinker: make global slab shrink lockless

2023-08-07 Thread Dave Chinner
ker_put(shrinker); > + wait_for_completion(&shrinker->done); > + } Needs a comment explaining why we need to wait here... > + > down_write(&shrinker_rwsem); > if (shrinker->flags & SHRINKER_REGISTERED) { > - list_del(&shrinker->list); > + /* > + * Lookups on the shrinker are over and will fail in the future, > + * so we can now remove it from the lists and free it. > + */ rather than here after the wait has been done and provided the guarantee that no shrinker is running or will run again... -Dave. -- Dave Chinner da...@fromorbit.com

Re: [f2fs-dev] [PATCH v4 44/48] mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}

2023-08-07 Thread Dave Chinner via Linux-f2fs-devel
> unsigned long ret, freed = 0; > - int i; > + int offset, index = 0; > > if (!mem_cgroup_online(memcg)) > return 0; > @@ -419,56 +470,63 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, > int nid, > if (unlikely(!inf

Re: [PATCH v4 44/48] mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}

2023-08-07 Thread Dave Chinner via Linux-erofs
> unsigned long ret, freed = 0; > - int i; > + int offset, index = 0; > > if (!mem_cgroup_online(memcg)) > return 0; > @@ -419,56 +470,63 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, > int nid, > if (unlikely(!inf

Re: [PATCH v4 44/48] mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}

2023-08-07 Thread Dave Chinner via Virtualization
> unsigned long ret, freed = 0; > - int i; > + int offset, index = 0; > > if (!mem_cgroup_online(memcg)) > return 0; > @@ -419,56 +470,63 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, > int nid, > if (unlikely(!inf

Re: [Cluster-devel] [PATCH v4 44/48] mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}

2023-08-07 Thread Dave Chinner
> unsigned long ret, freed = 0; > - int i; > + int offset, index = 0; > > if (!mem_cgroup_online(memcg)) > return 0; > @@ -419,56 +470,63 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, > int nid, > if (unlikely(!inf

Re: [PATCH v4 44/48] mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}

2023-08-07 Thread Dave Chinner
> unsigned long ret, freed = 0; > - int i; > + int offset, index = 0; > > if (!mem_cgroup_online(memcg)) > return 0; > @@ -419,56 +470,63 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, > int nid, > if (unlikely(!inf
