Re: [Cluster-devel] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-09-10 Thread Dave Chinner
On Fri, Sep 08, 2023 at 01:29:55AM +0100, Pavel Begunkov wrote:
> On 9/3/23 23:30, Dave Chinner wrote:
> > On Wed, Aug 30, 2023 at 02:11:31PM +0800, Hao Xu wrote:
> > > On 8/29/23 19:53, Matthew Wilcox wrote:
> > > > On Tue, Aug 29, 2023 at 03:46:13PM +0800, Hao Xu wrote:
> > > > > On 8/28/23 05:32, Matthew Wilcox wrote:
> > > > > > On Sun, Aug 27, 2023 at 09:28:31PM +0800, Hao Xu wrote:
> > > > > > > From: Hao Xu 
> > > > > > > 
> > > > > > > Add a boolean parameter for file_accessed() to support nowait 
> > > > > > > semantics.
> > > > > > > Currently it is true only with io_uring as its initial caller.
> > > > > > 
> > > > > > So why do we need to do this as part of this series?  Apparently it
> > > > > > hasn't caused any problems for filemap_read().
> > > > > > 
> > > > > 
> > > > > We need this parameter to indicate if nowait semantics should be 
> > > > > enforced in
> > > > > touch_atime(), There are locks and maybe IOs in it.
> > > > 
> > > > That's not my point.  We currently call file_accessed() and
> > > > touch_atime() for nowait reads and nowait writes.  You haven't done
> > > > anything to fix those.
> > > > 
> > > > I suspect you can trim this patchset down significantly by avoiding
> > > > fixing the file_accessed() problem.  And then come back with a later
> > > > patchset that fixes it for all nowait i/o.  Or do a separate prep series
> > > 
> > > I'm ok to do that.
> > > 
> > > > first that fixes it for the existing nowait users, and then a second
> > > > series to do all the directory stuff.
> > > > 
> > > > I'd do the first thing.  Just ignore the problem.  Directory atime
> > > > updates cause I/O so rarely that you can afford to ignore it.  Almost
> > > > everyone uses relatime or nodiratime.
> > > 
> > > Hi Matthew,
> > > The previous discussion shows this does cause issues in real
> > > producations: 
> > > https://lore.kernel.org/io-uring/2785f009-2ebb-028d-8250-d5f3a3051...@gmail.com/#:~:text=fwiw%2C%20we%27ve%20just%20recently%20had%20similar%20problems%20with%20io_uring%20read/write
> > > 
> > 
> > Then separate it out into it's own patch set so we can have a
> > discussion on the merits of requiring using noatime, relatime or
> > lazytime for really latency sensitive IO applications. Changing code
> > is not always the right solution...
> 
> Separation sounds reasonable, but it can hardly be said that only
> latency sensitive apps would care about >1s nowait/async submission
> delays. Presumably, btrfs can improve on that, but it still looks
> like it's perfectly legit for filesystems do heavy stuff in
> timestamping like waiting for IO. Right?

Yes, it is, no-one is denying that. And some filesystems are worse
than others, but none of that means it has to be fixed so getdents
can be converted to NOWAIT semantics.

ie. this patchset is about the getdents NOWAIT machinery, and
fiddling around with timestamps has much, much wider scope than just
NOWAIT getdents machinery. We'll have this discussion about NOWAIT
timestamp updates when a RFC is proposed to address the wider
problem of how timestamp updates should behave in NOWAIT context.

-Dave.
-- 
Dave Chinner
da...@fromorbit.com



Re: [Cluster-devel] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-09-08 Thread Pavel Begunkov

On 9/3/23 23:30, Dave Chinner wrote:

On Wed, Aug 30, 2023 at 02:11:31PM +0800, Hao Xu wrote:

On 8/29/23 19:53, Matthew Wilcox wrote:

On Tue, Aug 29, 2023 at 03:46:13PM +0800, Hao Xu wrote:

On 8/28/23 05:32, Matthew Wilcox wrote:

On Sun, Aug 27, 2023 at 09:28:31PM +0800, Hao Xu wrote:

From: Hao Xu 

Add a boolean parameter for file_accessed() to support nowait semantics.
Currently it is true only with io_uring as its initial caller.


So why do we need to do this as part of this series?  Apparently it
hasn't caused any problems for filemap_read().



We need this parameter to indicate if nowait semantics should be enforced in
touch_atime(), There are locks and maybe IOs in it.


That's not my point.  We currently call file_accessed() and
touch_atime() for nowait reads and nowait writes.  You haven't done
anything to fix those.

I suspect you can trim this patchset down significantly by avoiding
fixing the file_accessed() problem.  And then come back with a later
patchset that fixes it for all nowait i/o.  Or do a separate prep series


I'm ok to do that.


first that fixes it for the existing nowait users, and then a second
series to do all the directory stuff.

I'd do the first thing.  Just ignore the problem.  Directory atime
updates cause I/O so rarely that you can afford to ignore it.  Almost
everyone uses relatime or nodiratime.


Hi Matthew,
The previous discussion shows this does cause issues in real
producations: 
https://lore.kernel.org/io-uring/2785f009-2ebb-028d-8250-d5f3a3051...@gmail.com/#:~:text=fwiw%2C%20we%27ve%20just%20recently%20had%20similar%20problems%20with%20io_uring%20read/write



Then separate it out into it's own patch set so we can have a
discussion on the merits of requiring using noatime, relatime or
lazytime for really latency sensitive IO applications. Changing code
is not always the right solution...


Separation sounds reasonable, but it can hardly be said that only
latency sensitive apps would care about >1s nowait/async submission
delays. Presumably, btrfs can improve on that, but it still looks
like it's perfectly legit for filesystems do heavy stuff in
timestamping like waiting for IO. Right?

--
Pavel Begunkov



Re: [Cluster-devel] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-09-04 Thread Christian Brauner
On Sun, Aug 27, 2023 at 09:28:31PM +0800, Hao Xu wrote:
> From: Hao Xu 
> 
> Add a boolean parameter for file_accessed() to support nowait semantics.
> Currently it is true only with io_uring as its initial caller.
> 
> Signed-off-by: Hao Xu 
> ---
>  arch/s390/hypfs/inode.c | 2 +-
>  block/fops.c| 2 +-
>  fs/btrfs/file.c | 2 +-
>  fs/btrfs/inode.c| 2 +-
>  fs/coda/dir.c   | 4 ++--
>  fs/ext2/file.c  | 4 ++--
>  fs/ext4/file.c  | 6 +++---
>  fs/f2fs/file.c  | 4 ++--
>  fs/fuse/dax.c   | 2 +-
>  fs/fuse/file.c  | 4 ++--
>  fs/gfs2/file.c  | 2 +-
>  fs/hugetlbfs/inode.c| 2 +-
>  fs/nilfs2/file.c| 2 +-
>  fs/orangefs/file.c  | 2 +-
>  fs/orangefs/inode.c | 2 +-
>  fs/pipe.c   | 2 +-
>  fs/ramfs/file-nommu.c   | 2 +-
>  fs/readdir.c| 2 +-
>  fs/smb/client/cifsfs.c  | 2 +-
>  fs/splice.c | 2 +-
>  fs/ubifs/file.c | 2 +-
>  fs/udf/file.c   | 2 +-
>  fs/xfs/xfs_file.c   | 6 +++---
>  fs/zonefs/file.c| 4 ++--
>  include/linux/fs.h  | 5 +++--
>  mm/filemap.c| 8 
>  mm/shmem.c  | 6 +++---
>  27 files changed, 43 insertions(+), 42 deletions(-)
> 
> diff --git a/arch/s390/hypfs/inode.c b/arch/s390/hypfs/inode.c
> index ee919bfc8186..55f562027c4f 100644
> --- a/arch/s390/hypfs/inode.c
> +++ b/arch/s390/hypfs/inode.c
> @@ -157,7 +157,7 @@ static ssize_t hypfs_read_iter(struct kiocb *iocb, struct 
> iov_iter *to)
>   if (!count)
>   return -EFAULT;
>   iocb->ki_pos = pos + count;
> - file_accessed(file);
> + file_accessed(file, false);

Why? If all you do is skip atime update anyway then just add something
like:

bool file_needs_atime(struct file *file)
{
   return !(file->f_flags & O_NOATIME) &&
  atime_needs_update(>f_path, d_inode(path->dentry));
}

and then

if (file_needs_atime(file) && IOURING_WANTS_ASYNC)
return -EAGAIN;

instead of touching all this code.



Re: [Cluster-devel] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-09-03 Thread Dave Chinner
On Wed, Aug 30, 2023 at 02:11:31PM +0800, Hao Xu wrote:
> On 8/29/23 19:53, Matthew Wilcox wrote:
> > On Tue, Aug 29, 2023 at 03:46:13PM +0800, Hao Xu wrote:
> > > On 8/28/23 05:32, Matthew Wilcox wrote:
> > > > On Sun, Aug 27, 2023 at 09:28:31PM +0800, Hao Xu wrote:
> > > > > From: Hao Xu 
> > > > > 
> > > > > Add a boolean parameter for file_accessed() to support nowait 
> > > > > semantics.
> > > > > Currently it is true only with io_uring as its initial caller.
> > > > 
> > > > So why do we need to do this as part of this series?  Apparently it
> > > > hasn't caused any problems for filemap_read().
> > > > 
> > > 
> > > We need this parameter to indicate if nowait semantics should be enforced 
> > > in
> > > touch_atime(), There are locks and maybe IOs in it.
> > 
> > That's not my point.  We currently call file_accessed() and
> > touch_atime() for nowait reads and nowait writes.  You haven't done
> > anything to fix those.
> > 
> > I suspect you can trim this patchset down significantly by avoiding
> > fixing the file_accessed() problem.  And then come back with a later
> > patchset that fixes it for all nowait i/o.  Or do a separate prep series
> 
> I'm ok to do that.
> 
> > first that fixes it for the existing nowait users, and then a second
> > series to do all the directory stuff.
> > 
> > I'd do the first thing.  Just ignore the problem.  Directory atime
> > updates cause I/O so rarely that you can afford to ignore it.  Almost
> > everyone uses relatime or nodiratime.
> 
> Hi Matthew,
> The previous discussion shows this does cause issues in real
> producations: 
> https://lore.kernel.org/io-uring/2785f009-2ebb-028d-8250-d5f3a3051...@gmail.com/#:~:text=fwiw%2C%20we%27ve%20just%20recently%20had%20similar%20problems%20with%20io_uring%20read/write
> 

Then separate it out into it's own patch set so we can have a
discussion on the merits of requiring using noatime, relatime or
lazytime for really latency sensitive IO applications. Changing code
is not always the right solution...

-Dave.
-- 
Dave Chinner
da...@fromorbit.com



Re: [Cluster-devel] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-08-30 Thread Hao Xu

On 8/29/23 19:53, Matthew Wilcox wrote:

On Tue, Aug 29, 2023 at 03:46:13PM +0800, Hao Xu wrote:

On 8/28/23 05:32, Matthew Wilcox wrote:

On Sun, Aug 27, 2023 at 09:28:31PM +0800, Hao Xu wrote:

From: Hao Xu 

Add a boolean parameter for file_accessed() to support nowait semantics.
Currently it is true only with io_uring as its initial caller.


So why do we need to do this as part of this series?  Apparently it
hasn't caused any problems for filemap_read().



We need this parameter to indicate if nowait semantics should be enforced in
touch_atime(), There are locks and maybe IOs in it.


That's not my point.  We currently call file_accessed() and
touch_atime() for nowait reads and nowait writes.  You haven't done
anything to fix those.

I suspect you can trim this patchset down significantly by avoiding
fixing the file_accessed() problem.  And then come back with a later
patchset that fixes it for all nowait i/o.  Or do a separate prep series


I'm ok to do that.


first that fixes it for the existing nowait users, and then a second
series to do all the directory stuff.

I'd do the first thing.  Just ignore the problem.  Directory atime
updates cause I/O so rarely that you can afford to ignore it.  Almost
everyone uses relatime or nodiratime.


Hi Matthew,
The previous discussion shows this does cause issues in real
producations: 
https://lore.kernel.org/io-uring/2785f009-2ebb-028d-8250-d5f3a3051...@gmail.com/#:~:text=fwiw%2C%20we%27ve%20just%20recently%20had%20similar%20problems%20with%20io_uring%20read/write







Re: [Cluster-devel] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-08-29 Thread Matthew Wilcox
On Tue, Aug 29, 2023 at 03:46:13PM +0800, Hao Xu wrote:
> On 8/28/23 05:32, Matthew Wilcox wrote:
> > On Sun, Aug 27, 2023 at 09:28:31PM +0800, Hao Xu wrote:
> > > From: Hao Xu 
> > > 
> > > Add a boolean parameter for file_accessed() to support nowait semantics.
> > > Currently it is true only with io_uring as its initial caller.
> > 
> > So why do we need to do this as part of this series?  Apparently it
> > hasn't caused any problems for filemap_read().
> > 
> 
> We need this parameter to indicate if nowait semantics should be enforced in
> touch_atime(), There are locks and maybe IOs in it.

That's not my point.  We currently call file_accessed() and
touch_atime() for nowait reads and nowait writes.  You haven't done
anything to fix those.

I suspect you can trim this patchset down significantly by avoiding
fixing the file_accessed() problem.  And then come back with a later
patchset that fixes it for all nowait i/o.  Or do a separate prep series
first that fixes it for the existing nowait users, and then a second
series to do all the directory stuff.

I'd do the first thing.  Just ignore the problem.  Directory atime
updates cause I/O so rarely that you can afford to ignore it.  Almost
everyone uses relatime or nodiratime.



[Cluster-devel] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-08-29 Thread Hao Xu
From: Hao Xu 

Add a boolean parameter for file_accessed() to support nowait semantics.
Currently it is true only with io_uring as its initial caller.

Signed-off-by: Hao Xu 
---
 arch/s390/hypfs/inode.c | 2 +-
 block/fops.c| 2 +-
 fs/btrfs/file.c | 2 +-
 fs/btrfs/inode.c| 2 +-
 fs/coda/dir.c   | 4 ++--
 fs/ext2/file.c  | 4 ++--
 fs/ext4/file.c  | 6 +++---
 fs/f2fs/file.c  | 4 ++--
 fs/fuse/dax.c   | 2 +-
 fs/fuse/file.c  | 4 ++--
 fs/gfs2/file.c  | 2 +-
 fs/hugetlbfs/inode.c| 2 +-
 fs/nilfs2/file.c| 2 +-
 fs/orangefs/file.c  | 2 +-
 fs/orangefs/inode.c | 2 +-
 fs/pipe.c   | 2 +-
 fs/ramfs/file-nommu.c   | 2 +-
 fs/readdir.c| 2 +-
 fs/smb/client/cifsfs.c  | 2 +-
 fs/splice.c | 2 +-
 fs/ubifs/file.c | 2 +-
 fs/udf/file.c   | 2 +-
 fs/xfs/xfs_file.c   | 6 +++---
 fs/zonefs/file.c| 4 ++--
 include/linux/fs.h  | 5 +++--
 mm/filemap.c| 8 
 mm/shmem.c  | 6 +++---
 27 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/arch/s390/hypfs/inode.c b/arch/s390/hypfs/inode.c
index ee919bfc8186..55f562027c4f 100644
--- a/arch/s390/hypfs/inode.c
+++ b/arch/s390/hypfs/inode.c
@@ -157,7 +157,7 @@ static ssize_t hypfs_read_iter(struct kiocb *iocb, struct 
iov_iter *to)
if (!count)
return -EFAULT;
iocb->ki_pos = pos + count;
-   file_accessed(file);
+   file_accessed(file, false);
return count;
 }
 
diff --git a/block/fops.c b/block/fops.c
index a286bf3325c5..546ecd3c8084 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -601,7 +601,7 @@ static ssize_t blkdev_read_iter(struct kiocb *iocb, struct 
iov_iter *to)
ret = kiocb_write_and_wait(iocb, count);
if (ret < 0)
goto reexpand;
-   file_accessed(iocb->ki_filp);
+   file_accessed(iocb->ki_filp, false);
 
ret = blkdev_direct_IO(iocb, to);
if (ret >= 0) {
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index fd03e689a6be..24c0bf3818a6 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2013,7 +2013,7 @@ static int btrfs_file_mmap(struct file*filp, struct 
vm_area_struct *vma)
if (!mapping->a_ops->read_folio)
return -ENOEXEC;
 
-   file_accessed(filp);
+   file_accessed(filp, false);
vma->vm_ops = _file_vm_ops;
 
return 0;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index dbbb67293e34..50e9ae8c388c 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -10153,7 +10153,7 @@ ssize_t btrfs_encoded_read(struct kiocb *iocb, struct 
iov_iter *iter,
struct extent_map *em;
bool unlocked = false;
 
-   file_accessed(iocb->ki_filp);
+   file_accessed(iocb->ki_filp, false);
 
btrfs_inode_lock(inode, BTRFS_ILOCK_SHARED);
 
diff --git a/fs/coda/dir.c b/fs/coda/dir.c
index 8450b1bd354b..1d94c013ac88 100644
--- a/fs/coda/dir.c
+++ b/fs/coda/dir.c
@@ -436,12 +436,12 @@ static int coda_readdir(struct file *coda_file, struct 
dir_context *ctx)
if (host_file->f_op->iterate_shared) {
inode_lock_shared(host_inode);
ret = 
host_file->f_op->iterate_shared(host_file, ctx);
-   file_accessed(host_file);
+   file_accessed(host_file, false);
inode_unlock_shared(host_inode);
} else {
inode_lock(host_inode);
ret = host_file->f_op->iterate(host_file, ctx);
-   file_accessed(host_file);
+   file_accessed(host_file, false);
inode_unlock(host_inode);
}
}
diff --git a/fs/ext2/file.c b/fs/ext2/file.c
index 0b4c91c62e1f..dc059cae50a4 100644
--- a/fs/ext2/file.c
+++ b/fs/ext2/file.c
@@ -44,7 +44,7 @@ static ssize_t ext2_dax_read_iter(struct kiocb *iocb, struct 
iov_iter *to)
ret = dax_iomap_rw(iocb, to, _iomap_ops);
inode_unlock_shared(inode);
 
-   file_accessed(iocb->ki_filp);
+   file_accessed(iocb->ki_filp, false);
return ret;
 }
 
@@ -127,7 +127,7 @@ static int ext2_file_mmap(struct file *file, struct 
vm_area_struct *vma)
if (!IS_DAX(file_inode(file)))
return generic_file_mmap(file, vma);
 
-   file_accessed(file);
+   file_accessed(file, false);
vma->vm_ops = _dax_vm_ops;
return 0;
 }
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index c457c8517f0f..2ab790a668a8 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -94,7 +94,7 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct 
iov_iter *to)
ret = iomap_dio_rw(iocb, to, _iomap_ops, NULL, 0, NULL, 0);

Re: [Cluster-devel] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-08-29 Thread Hao Xu

On 8/28/23 05:32, Matthew Wilcox wrote:

On Sun, Aug 27, 2023 at 09:28:31PM +0800, Hao Xu wrote:

From: Hao Xu 

Add a boolean parameter for file_accessed() to support nowait semantics.
Currently it is true only with io_uring as its initial caller.


So why do we need to do this as part of this series?  Apparently it
hasn't caused any problems for filemap_read().



We need this parameter to indicate if nowait semantics should be 
enforced in touch_atime(), There are locks and maybe IOs in it.




Re: [Cluster-devel] [PATCH 07/11] vfs: add nowait parameter for file_accessed()

2023-08-27 Thread Matthew Wilcox
On Sun, Aug 27, 2023 at 09:28:31PM +0800, Hao Xu wrote:
> From: Hao Xu 
> 
> Add a boolean parameter for file_accessed() to support nowait semantics.
> Currently it is true only with io_uring as its initial caller.

So why do we need to do this as part of this series?  Apparently it
hasn't caused any problems for filemap_read().

> +++ b/mm/filemap.c
> @@ -2723,7 +2723,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct 
> iov_iter *iter,
>   folio_batch_init();
>   } while (iov_iter_count(iter) && iocb->ki_pos < isize && !error);
>  
> - file_accessed(filp);
> + file_accessed(filp, false);
>  
>   return already_read ? already_read : error;
>  }
> @@ -2809,7 +2809,7 @@ generic_file_read_iter(struct kiocb *iocb, struct 
> iov_iter *iter)
>   retval = kiocb_write_and_wait(iocb, count);
>   if (retval < 0)
>   return retval;
> - file_accessed(file);
> + file_accessed(file, false);
>  
>   retval = mapping->a_ops->direct_IO(iocb, iter);
>   if (retval >= 0) {
> @@ -2978,7 +2978,7 @@ ssize_t filemap_splice_read(struct file *in, loff_t 
> *ppos,
>  
>  out:
>   folio_batch_release();
> - file_accessed(in);
> + file_accessed(in, false);
>  
>   return total_spliced ? total_spliced : error;
>  }