On 08.11.2017 20:23, Stanislav Kinsburskiy wrote: > From: Stanislav Kinsburskiy <skinsbur...@parallels.com> > > Here is the race: > > CPU #0 CPU#1 > > cleanup_mnt nfs41_callback_svc (get xprt from the list) > nfs_callback_down ... > ... ... > svc_close_net ... > ... ... > svc_xprt_free ... > svc_bc_sock_free bc_svc_process > kfree(xprt) svc_process_common > rqstp->rq_xprt->xpt_ops (use after free) > > The problem is that per-net SUNRPC transports shutdown is done regardless > current callback execution. This is a race leading to transport use-after-free > in callback handler. > This patch fixes it in stright-forward way. I.e. it protects callback > execution with the same mutex used for per-net data creation and destruction. > Hopefully, it won't slow down NFS client significantly. > > https://jira.sw.ru/browse/PSBM-75751 > > v5: destroy all per-net backchannel requests before transports on in > nfs_callback_down_net > > v4: use another mutex to protect callback execution agains per-net transports > shutdown. > This guarantees, that transports won't be destroyed by shutdown callback while > execution is in progress and vice versa. > > v3: Fix mutex deadlock, when shutdown callback waits for thread to exit (with > mutex taken), while thread wait for the mutex to take. > The idea is to simply check if thread has to exit, if mutex lock has failed. > This is a busy loop, but it shouldn't happend often and for long. > > Signed-off-by: Stanislav Kinsburskiy <skinsbur...@virtuozzo.com> > --- > fs/nfs/callback.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c > index 0beb275..e18d774 100644 > --- a/fs/nfs/callback.c > +++ b/fs/nfs/callback.c > @@ -99,6 +99,8 @@ nfs4_callback_up(struct svc_serv *serv) > } > > #if defined(CONFIG_NFS_V4_1) > +static DEFINE_MUTEX(nfs41_callback_mutex); > + > /* > * The callback service for NFSv4.1 callbacks > */ > @@ -117,6 +119,12 @@ nfs41_callback_svc(void *vrqstp) > if (try_to_freeze()) > continue; > > + mutex_lock(&nfs41_callback_mutex); > + if (kthread_should_stop()) { > + mutex_unlock(&nfs41_callback_mutex); > + return 0; > + } > + > prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_INTERRUPTIBLE); > spin_lock_bh(&serv->sv_cb_lock); > if (!list_empty(&serv->sv_cb_list)) { > @@ -129,8 +137,10 @@ nfs41_callback_svc(void *vrqstp) > error = bc_svc_process(serv, req, rqstp); > dprintk("bc_svc_process() returned w/ error code= %d\n", > error); > + mutex_unlock(&nfs41_callback_mutex); > } else { > spin_unlock_bh(&serv->sv_cb_lock); > + mutex_unlock(&nfs41_callback_mutex); > schedule(); > finish_wait(&serv->sv_cb_waitq, &wq); > } > @@ -242,6 +252,7 @@ static void nfs_callback_down_net(u32 minorversion, > struct svc_serv *serv, struc > return; > > dprintk("NFS: destroy per-net callback data; net=%p\n", net); > + bc_svc_flush_queue_net(serv, net); > svc_shutdown_net(serv, net); > } > > @@ -377,7 +388,13 @@ void nfs_callback_down(int minorversion, struct net *net) > struct nfs_callback_data *cb_info = &nfs_callback_info[minorversion]; > > mutex_lock(&nfs_callback_mutex); > +#if defined(CONFIG_NFS_V4_1) > + mutex_lock(&nfs41_callback_mutex); > + nfs_callback_down_net(minorversion, cb_info->serv, net); > + mutex_unlock(&nfs41_callback_mutex); > +#else > nfs_callback_down_net(minorversion, cb_info->serv, net); > +#endif > cb_info->users--; > if (cb_info->users == 0 && cb_info->task != NULL) { > kthread_stop(cb_info->task); >
Looks good for me, except there is one more call of nfs_callback_down_net() in nfs_callback_up(). It's not clear for me, we shouldn't protect it via the mutex too. Should we? _______________________________________________ Devel mailing list Devel@openvz.org https://lists.openvz.org/mailman/listinfo/devel