Re: [Devel] [PATCH v3] nfs: protect callback execution against per-net callback thread shutdown
Please, see my comments below. 07.11.2017 12:30, Kirill Tkhai пишет: > On 07.11.2017 13:39, Stanislav Kinsburskiy wrote: >> From: Stanislav Kinsburskiy >> >> Here is the race: >> >> CPU #0 CPU#1 >> >> cleanup_mnt nfs41_callback_svc (get xprt from the list) >> nfs_callback_down... >> ... ... >> svc_close_net... >> ... ... >> svc_xprt_free... >> svc_bc_sock_free bc_svc_process >> kfree(xprt) svc_process_common >> rqstp->rq_xprt->xpt_ops (use after free) >> >> The problem is that per-net SUNRPC transports shutdown is done regardless >> current callback execution. This is a race leading to transport >> use-after-free >> in callback handler. >> This patch fixes it in stright-forward way. I.e. it protects callback >> execution with the same mutex used for per-net data creation and destruction. >> Hopefully, it won't slow down NFS client significantly. >> >> https://jira.sw.ru/browse/PSBM-75751 >> >> v3: Fix mutex deadlock, when shutdown callback waits for thread to exit (with >> mutex taken), while thread wait for the mutex to take. >> The idea is to simply check if thread has to exit, if mutex lock has failed. >> This is a busy loop, but it shouldn't happend often and for long. >> >> Signed-off-by: Stanislav Kinsburskiy >> --- >> fs/nfs/callback.c |6 ++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c >> index 0beb275..bbb07e4 100644 >> --- a/fs/nfs/callback.c >> +++ b/fs/nfs/callback.c >> @@ -117,7 +117,11 @@ nfs41_callback_svc(void *vrqstp) >> if (try_to_freeze()) >> continue; >> >> +if (!mutex_trylock(&nfs_callback_mutex)) >> + continue; > > This looks like a busy loop (that especially bad, because mutex-owner may > sleep > at the moment we are looping). > Well, yes. It a busy loop. And that's not good. But I have to mention, that this busy loop can happen only on umount request. > It seems the solution may be to change nfs_callback_down() function. > Can't we flush pending request in nfs_callback_down()? Well, this already happens. The problem is in race between transport usage (unconditional) and its destruction (also unconditional). IOW, there should be either reference counting or some critical section. Looks like the former will need a way more code and thus more error-prone, the latter is uglier, but code should be simpler at first sight. > Or just delete it from sv_cb_list without handling (does nfs proto allow > that)? > Well, this looks like a promising idea. But yet again, there should be some protection against transport usage in shutdown helper. And it's not yet clear, how to implement it without significant code change... ___ Devel mailing list Devel@openvz.org https://lists.openvz.org/mailman/listinfo/devel
Re: [Devel] [PATCH v3] nfs: protect callback execution against per-net callback thread shutdown
On 07.11.2017 13:39, Stanislav Kinsburskiy wrote: > From: Stanislav Kinsburskiy > > Here is the race: > > CPU #0CPU#1 > > cleanup_mnt nfs41_callback_svc (get xprt from the list) > nfs_callback_down ... > ... ... > svc_close_net ... > ... ... > svc_xprt_free ... > svc_bc_sock_free bc_svc_process > kfree(xprt) svc_process_common > rqstp->rq_xprt->xpt_ops (use after free) > > The problem is that per-net SUNRPC transports shutdown is done regardless > current callback execution. This is a race leading to transport use-after-free > in callback handler. > This patch fixes it in stright-forward way. I.e. it protects callback > execution with the same mutex used for per-net data creation and destruction. > Hopefully, it won't slow down NFS client significantly. > > https://jira.sw.ru/browse/PSBM-75751 > > v3: Fix mutex deadlock, when shutdown callback waits for thread to exit (with > mutex taken), while thread wait for the mutex to take. > The idea is to simply check if thread has to exit, if mutex lock has failed. > This is a busy loop, but it shouldn't happend often and for long. > > Signed-off-by: Stanislav Kinsburskiy > --- > fs/nfs/callback.c |6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c > index 0beb275..bbb07e4 100644 > --- a/fs/nfs/callback.c > +++ b/fs/nfs/callback.c > @@ -117,7 +117,11 @@ nfs41_callback_svc(void *vrqstp) > if (try_to_freeze()) > continue; > > + if (!mutex_trylock(&nfs_callback_mutex)) > +continue; This looks like a busy loop (that especially bad, because mutex-owner may sleep at the moment we are looping). It seems the solution may be to change nfs_callback_down() function. Can't we flush pending request in nfs_callback_down()? Or just delete it from sv_cb_list without handling (does nfs proto allow that)? Also, one more additional argument for flush is a suspicion there may be more than one pending request in the list. Can it be? > + > prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_INTERRUPTIBLE); > + > spin_lock_bh(&serv->sv_cb_lock); > if (!list_empty(&serv->sv_cb_list)) { > req = list_first_entry(&serv->sv_cb_list, > @@ -129,8 +133,10 @@ nfs41_callback_svc(void *vrqstp) > error = bc_svc_process(serv, req, rqstp); > dprintk("bc_svc_process() returned w/ error code= %d\n", > error); > + mutex_unlock(&nfs_callback_mutex); > } else { > spin_unlock_bh(&serv->sv_cb_lock); > + mutex_unlock(&nfs_callback_mutex); > schedule(); > finish_wait(&serv->sv_cb_waitq, &wq); > } > ___ Devel mailing list Devel@openvz.org https://lists.openvz.org/mailman/listinfo/devel
[Devel] [PATCH v3] nfs: protect callback execution against per-net callback thread shutdown
From: Stanislav Kinsburskiy Here is the race: CPU #0 CPU#1 cleanup_mnt nfs41_callback_svc (get xprt from the list) nfs_callback_down ... ... ... svc_close_net ... ... ... svc_xprt_free ... svc_bc_sock_freebc_svc_process kfree(xprt) svc_process_common rqstp->rq_xprt->xpt_ops (use after free) The problem is that per-net SUNRPC transports shutdown is done regardless current callback execution. This is a race leading to transport use-after-free in callback handler. This patch fixes it in stright-forward way. I.e. it protects callback execution with the same mutex used for per-net data creation and destruction. Hopefully, it won't slow down NFS client significantly. https://jira.sw.ru/browse/PSBM-75751 v3: Fix mutex deadlock, when shutdown callback waits for thread to exit (with mutex taken), while thread wait for the mutex to take. The idea is to simply check if thread has to exit, if mutex lock has failed. This is a busy loop, but it shouldn't happend often and for long. Signed-off-by: Stanislav Kinsburskiy --- fs/nfs/callback.c |6 ++ 1 file changed, 6 insertions(+) diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c index 0beb275..bbb07e4 100644 --- a/fs/nfs/callback.c +++ b/fs/nfs/callback.c @@ -117,7 +117,11 @@ nfs41_callback_svc(void *vrqstp) if (try_to_freeze()) continue; + if (!mutex_trylock(&nfs_callback_mutex)) + continue; + prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_INTERRUPTIBLE); + spin_lock_bh(&serv->sv_cb_lock); if (!list_empty(&serv->sv_cb_list)) { req = list_first_entry(&serv->sv_cb_list, @@ -129,8 +133,10 @@ nfs41_callback_svc(void *vrqstp) error = bc_svc_process(serv, req, rqstp); dprintk("bc_svc_process() returned w/ error code= %d\n", error); + mutex_unlock(&nfs_callback_mutex); } else { spin_unlock_bh(&serv->sv_cb_lock); + mutex_unlock(&nfs_callback_mutex); schedule(); finish_wait(&serv->sv_cb_waitq, &wq); } ___ Devel mailing list Devel@openvz.org https://lists.openvz.org/mailman/listinfo/devel