Re: [Cluster-devel] [PATCH v3 6/6] gfs2: Replace kmap_atomic() by kmap_local_page() in gfs2_write_buf_to_page

2023-08-10 Thread Deepak R Varma
On Sat, Jul 01, 2023 at 03:54:06PM +0200, Fabio M. De Francesco wrote:
> On Thursday, 29 June 2023 at 23:52:27 CEST Deepak R Varma wrote:
> > kmap_atomic() is deprecated in favor of kmap_local_{folio,page}().
>
> Deepak,
>
> Again, please refer to the documentation and/or Ira's deprecation patch.
> The reasons why are in one of my previous messages.
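> For reference, the conversion being asked for is roughly this (a minimal
> sketch, not the actual gfs2_write_buf_to_page() hunk; the buffer, offset,
> and length names are assumptions):
>
> 	/* before: kmap_atomic() implicitly disables pagefaults/preemption */
> 	void *kaddr = kmap_atomic(page);
> 	memcpy(kaddr + offset, buf, len);
> 	kunmap_atomic(kaddr);
>
> 	/* after: kmap_local_page() keeps the context preemptible; the
> 	 * mapping is only valid on the local CPU */
> 	void *kaddr = kmap_local_page(page);
> 	memcpy(kaddr + offset, buf, len);
> 	kunmap_local(kaddr);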

Hi Fabio,
This change has already been made by Andreas, so my patchset can be dropped.
However, I agree with and accept your feedback on the individual patches, and
I will keep your suggestions in mind when I submit future patches.

Thank you :)

Deepak.

>
> > Therefore, replace kmap_atomic() with kmap_local_page() in
> > gfs2_write_buf_to_page().
> > --
> > 2.34.1
>




Re: [Cluster-devel] [RFC v6.5-rc2 3/3] fs: lockd: introduce safe async lock op

2023-08-10 Thread Alexander Aring
Hi,

On Fri, Jul 21, 2023 at 1:46 PM Jeff Layton  wrote:
>
> On Thu, 2023-07-20 at 08:58 -0400, Alexander Aring wrote:
> > This patch mostly reverts commit 40595cdc93ed ("nfs: block notification
> > on fs with its own ->lock") and introduces an EXPORT_OP_SAFE_ASYNC_LOCK
> > export flag to signal that the "own ->lock" implementation supports
> > async lock requests. The main user is DLM, which is used by the GFS2
> > and OCFS2 filesystems. Both implement their own lock() operation and
> > return FILE_LOCK_DEFERRED. The DLM implementation was never updated
> > after commit 40595cdc93ed ("nfs: block notification on fs with its own
> > ->lock"). This patch prepares for DLM to set the
> > EXPORT_OP_SAFE_ASYNC_LOCK export flag and for updating its plock
> > implementation accordingly.
> >
> > Signed-off-by: Alexander Aring 
> > ---
> >  fs/lockd/svclock.c   |  5 ++---
> >  fs/nfsd/nfs4state.c  | 11 ---
> >  include/linux/exportfs.h |  1 +
> >  3 files changed, 11 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
> > index 62ef27a69a9e..54a67bd33843 100644
> > --- a/fs/lockd/svclock.c
> > +++ b/fs/lockd/svclock.c
> > @@ -483,9 +483,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> >   struct nlm_host *host, struct nlm_lock *lock, int wait,
> >   struct nlm_cookie *cookie, int reclaim)
> >  {
> > -#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
> >   struct inode*inode = nlmsvc_file_inode(file);
> > -#endif
> >   struct nlm_block*block = NULL;
> >   int error;
> >   int mode;
> > @@ -499,7 +497,8 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> >   (long long)lock->fl.fl_end,
> >   wait);
> >
> > - if (nlmsvc_file_file(file)->f_op->lock) {
> > + if (!(inode->i_sb->s_export_op->flags & EXPORT_OP_SAFE_ASYNC_LOCK) &&
> > + nlmsvc_file_file(file)->f_op->lock) {
> >   async_block = wait;
> >   wait = 0;
> >   }
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 6e61fa3acaf1..efcea229d640 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -7432,6 +7432,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >   struct nfsd4_blocked_lock *nbl = NULL;
> >   struct file_lock *file_lock = NULL;
> >   struct file_lock *conflock = NULL;
> > + struct super_block *sb;
> >   __be32 status = 0;
> >   int lkflg;
> >   int err;
> > @@ -7453,6 +7454,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >   dprintk("NFSD: nfsd4_lock: permission denied!\n");
> >   return status;
> >   }
> > + sb = cstate->current_fh.fh_dentry->d_sb;
> >
> >   if (lock->lk_is_new) {
> >   if (nfsd4_has_session(cstate))
> > @@ -7504,7 +7506,8 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >   fp = lock_stp->st_stid.sc_file;
> >   switch (lock->lk_type) {
> >   case NFS4_READW_LT:
> > - if (nfsd4_has_session(cstate))
> > + if (sb->s_export_op->flags & EXPORT_OP_SAFE_ASYNC_LOCK &&
>
> This will break existing filesystems that don't set the new flag. Maybe
> you also need to test for the filesystem's ->lock operation here?
>

yes.

> This might be more nicely expressed in a helper function.

ok.
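
Something like this, perhaps (a hypothetical sketch; the helper name and
placement are my assumptions, not the merged code):

	/* True when the filesystem provides its own ->lock but has not
	 * declared it safe for asynchronous (deferred) lock requests. */
	static inline bool nlm_need_sync_lock(struct file *filp)
	{
		struct inode *inode = file_inode(filp);

		return filp->f_op->lock &&
		       !(inode->i_sb->s_export_op->flags &
			 EXPORT_OP_SAFE_ASYNC_LOCK);
	}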

- Alex



Re: [Cluster-devel] [RFC v6.5-rc2 2/3] fs: lockd: fix race in async lock request handling

2023-08-10 Thread Alexander Aring
Hi,

On Fri, Jul 21, 2023 at 11:45 AM Jeff Layton  wrote:
>
> On Thu, 2023-07-20 at 08:58 -0400, Alexander Aring wrote:
> > This patch fixes a race in async lock request handling between adding
> > the relevant struct nlm_block to the nlm_blocked list after the request
> > has been sent by vfs_lock_file(), and the lookup of that nlm_block in
> > the nlm_blocked list by nlmsvc_grant_deferred(). The async request may
> > complete before the nlm_block has been added to the list, which ends in
> > a -ENOENT and a kernel log message of "lockd: grant for unknown block".
> >
> > To solve this issue we add the nlm_block before the vfs_lock_file()
> > call, so that it is guaranteed to be on the list when a possible
> > nlmsvc_grant_deferred() is called. If vfs_lock_file() results in a case
> > in which the block should not be on the nlm_blocked list, the nlm_block
> > struct is removed from the list again.
> >
> > Signed-off-by: Alexander Aring 
> > ---
> >  fs/lockd/svclock.c  | 80 +++--
> >  include/linux/lockd/lockd.h |  1 +
> >  2 files changed, 60 insertions(+), 21 deletions(-)
> >
> > diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
> > index 28abec5c451d..62ef27a69a9e 100644
> > --- a/fs/lockd/svclock.c
> > +++ b/fs/lockd/svclock.c
> > @@ -297,6 +297,8 @@ static void nlmsvc_free_block(struct kref *kref)
> >
> >   dprintk("lockd: freeing block %p...\n", block);
> >
> > + WARN_ON_ONCE(block->b_flags & B_PENDING_CALLBACK);
> > +
> >   /* Remove block from file's list of blocks */
> >   list_del_init(&block->b_flist);
> >   mutex_unlock(&file->f_mutex);
> > @@ -543,6 +545,12 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> >   goto out;
> >   }
> >
> > + if (block->b_flags & B_PENDING_CALLBACK)
> > + goto pending_request;
> > +
> > + /* Append to list of blocked */
> > + nlmsvc_insert_block(block, NLM_NEVER);
> > +
> >   if (!wait)
> >   lock->fl.fl_flags &= ~FL_SLEEP;
> >   mode = lock_to_openmode(&lock->fl);
> > @@ -552,9 +560,13 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> >   dprintk("lockd: vfs_lock_file returned %d\n", error);
> >   switch (error) {
> >   case 0:
> > + nlmsvc_remove_block(block);
> >   ret = nlm_granted;
> >   goto out;
> >   case -EAGAIN:
> > + if (!wait)
> > + nlmsvc_remove_block(block);
> > +pending_request:
> >   /*
> >* If this is a blocking request for an
> >* already pending lock request then we need
> > @@ -565,6 +577,8 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> >   ret = async_block ? nlm_lck_blocked : nlm_lck_denied;
> >   goto out;
> >   case FILE_LOCK_DEFERRED:
> > + block->b_flags |= B_PENDING_CALLBACK;
> > +
> >   if (wait)
> >   break;
> >   /* Filesystem lock operation is in progress
> > @@ -572,17 +586,16 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> >   ret = nlmsvc_defer_lock_rqst(rqstp, block);
>
> When the above function is called, it's going to end up reinserting the
> block into the list. I think you probably also need to remove the call
> to nlmsvc_insert_block from nlmsvc_defer_lock_rqst since it could have
> been granted before that occurs.
>

It cannot be granted during this time because f_mutex is held. We insert
the block up front so that the lookup works even when a lm_grant() comes
back very fast. lm_grant() will look up the block and then needs to take
f_mutex, so it can only proceed when nobody is inside this critical
section (on a per-nlm_file basis).

There is a difference between inserting with NLM_NEVER and with
NLM_TIMEOUT: when nlmsvc_defer_lock_rqst() calls nlmsvc_insert_block()
again, it will just update the timeout value. I am not sure about the
consequences of it doing a nlmsvc_insert_block() with NLM_NEVER instead
of NLM_TIMEOUT. But as I said, the block cannot be granted while
f_mutex is held.
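
To illustrate the ordering this relies on (a sketch of the argument
above, not actual code):

	nlmsvc_lock()                     lm_grant() callback path
	------------------------------    ------------------------------
	mutex_lock(&file->f_mutex)
	nlmsvc_insert_block(block, ...)   nlmsvc_grant_deferred() finds the
	vfs_lock_file()                   block on nlm_blocked, but must
	...                               take f_mutex before granting, so
	mutex_unlock(&file->f_mutex)      it waits until we leave here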

> >   goto out;
> >   case -EDEADLK:
> > + nlmsvc_remove_block(block);
> >   ret = nlm_deadlock;
> >   goto out;
> >   default:/* includes ENOLCK */
> > + nlmsvc_remove_block(block);
> >   ret = nlm_lck_denied_nolocks;
> >   goto out;
> >   }
> >
> >   ret = nlm_lck_blocked;
> > -
> > - /* Append to list of blocked */
> > - nlmsvc_insert_block(block, NLM_NEVER);
> >  out:
> >   mutex_unlock(&file->f_mutex);
> >   nlmsvc_release_block(block);

Re: [Cluster-devel] [RFC v6.5-rc2 2/3] fs: lockd: fix race in async lock request handling

2023-08-10 Thread Alexander Aring
Hi,

On Fri, Jul 21, 2023 at 12:43 PM Jeff Layton  wrote:
>
> On Fri, 2023-07-21 at 09:09 -0400, Alexander Aring wrote:
> > Hi,
> >
> > On Thu, Jul 20, 2023 at 8:58 AM Alexander Aring  wrote:
> > >
> > > This patch fixes a race in async lock request handling between
> > > adding the relevant struct nlm_block to the nlm_blocked list after
> > > the request has been sent by vfs_lock_file(), and the lookup of that
> > > nlm_block in the nlm_blocked list by nlmsvc_grant_deferred(). The
> > > async request may complete before the nlm_block has been added to
> > > the list, which ends in a -ENOENT and a kernel log message of
> > > "lockd: grant for unknown block".
> > >
> > > To solve this issue we add the nlm_block before the vfs_lock_file()
> > > call, so that it is guaranteed to be on the list when a possible
> > > nlmsvc_grant_deferred() is called. If vfs_lock_file() results in a
> > > case in which the block should not be on the nlm_blocked list, the
> > > nlm_block struct is removed from the list again.
> > >
> > > Signed-off-by: Alexander Aring 
> > > ---
> > >  fs/lockd/svclock.c  | 80 +++--
> > >  include/linux/lockd/lockd.h |  1 +
> > >  2 files changed, 60 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
> > > index 28abec5c451d..62ef27a69a9e 100644
> > > --- a/fs/lockd/svclock.c
> > > +++ b/fs/lockd/svclock.c
> > > @@ -297,6 +297,8 @@ static void nlmsvc_free_block(struct kref *kref)
> > >
> > > dprintk("lockd: freeing block %p...\n", block);
> > >
> > > +   WARN_ON_ONCE(block->b_flags & B_PENDING_CALLBACK);
> > > +
> > > /* Remove block from file's list of blocks */
> > > list_del_init(&block->b_flist);
> > > mutex_unlock(&file->f_mutex);
> > > @@ -543,6 +545,12 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> > > goto out;
> > > }
> > >
> > > +   if (block->b_flags & B_PENDING_CALLBACK)
> > > +   goto pending_request;
> > > +
> > > +   /* Append to list of blocked */
> > > +   nlmsvc_insert_block(block, NLM_NEVER);
> > > +
> > > if (!wait)
> > > lock->fl.fl_flags &= ~FL_SLEEP;
> > > mode = lock_to_openmode(&lock->fl);
> > > @@ -552,9 +560,13 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> > > dprintk("lockd: vfs_lock_file returned %d\n", error);
> > > switch (error) {
> > > case 0:
> > > +   nlmsvc_remove_block(block);
> >
> > Reacting here with nlmsvc_remove_block() assumes that the block was
> > not already on the nlm_blocked list before nlmsvc_insert_block() was
> > called. I am not sure that is always the case here.
> >
> > Does somebody see a problem with that?
> >
>
> The scenario is that we already have a block on the list, and now
> another lock request comes in for the same thing: the client decided to
> just re-poll for the lock. That's a plausible scenario; I think the
> Linux NLM client will poll for locks periodically.
>
> In this case though, the lock request was granted by the filesystem, so
> this is likely racing with (and winning vs.) a lm_grant callback. Given
> that the client decided to repoll for it, we're probably safe to just
> dequeue the block and respond here, and not worry about sending a grant
> callback.
>
> Ditto for the other cases where the block is removed.
>

ok.

> > > ret = nlm_granted;
> > > goto out;
> > > case -EAGAIN:
> > > +   if (!wait)
> > > +   nlmsvc_remove_block(block);
>
> I was thinking that it would be best to not insert a block at all in the
> !wait case, but it looks like DLM just returns DEFERRED and almost
> always does a callback, even when it's not a blocking lock request?
>
> Anyway, I think we probably do have to handle this like you are.
>

I would prefer not to defer !wait requests at all. We don't even do that
in DLM; it causes problems with cancellation, because a cancel will only
do something (at least in DLM) when there is a waiter, i.e. a lock
request waiting to be granted, and that is only the case for wait lock
requests.

A !wait request is only a trylock; the answer should come back more or
less immediately. It also makes no sense to me to make trylocks async,
because we would have the same problems with cancellation/unlock, and
those operations are not offered an asynchronous path. lockd also issues
operations like unlock, and those are handled synchronously, so this
async optimization would apply to !wait lock requests only. That
suggests we probably don't care about this optimization for trylocks;
wait lock requests are a different story.
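
Concretely, the FILE_LOCK_DEFERRED case would then look roughly like
this (a sketch of the proposal, not merged code):

	case FILE_LOCK_DEFERRED:
		/* Proposal: only blocking (wait) requests may take the
		 * async path; a trylock must be answered synchronously
		 * by the filesystem. */
		if (!wait) {
			nlmsvc_remove_block(block);
			ret = nlm_lck_denied_nolocks;
			goto out;
		}
		block->b_flags |= B_PENDING_CALLBACK;
		break;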

We should update the documentation and do async lock requests for wait
requests only. Is this okay?

- Alex