Re: [Devel] [PATCH v5 2/2] nfs: protect callback execution against per-net callback thread shutdown
09.11.2017 10:33, Kirill Tkhai wrote:
> Looks good to me, except there is one more call of nfs_callback_down_net()
> in nfs_callback_up().
> It's not clear to me whether we should protect it via the mutex too. Should we?

Looks like no: the mount is not ready yet and there is no connection to the
server, so no request from the server can arrive at this point, and we don't
need any additional protection here.

_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel
Re: [Devel] [PATCH v5 2/2] nfs: protect callback execution against per-net callback thread shutdown
On 08.11.2017 20:23, Stanislav Kinsburskiy wrote:
> From: Stanislav Kinsburskiy
>
> Here is the race:
>
> CPU #0                          CPU #1
>
> cleanup_mnt                     nfs41_callback_svc (get xprt from the list)
> nfs_callback_down               ...
> ...                             ...
> svc_close_net                   ...
> ...                             ...
> svc_xprt_free                   ...
> svc_bc_sock_free                bc_svc_process
> kfree(xprt)                     svc_process_common
>                                 rqstp->rq_xprt->xpt_ops (use after free)
>
> The problem is that per-net SUNRPC transports shutdown is done regardless of
> current callback execution. This is a race leading to transport use-after-free
> in the callback handler.
> This patch fixes it in a straightforward way, i.e. it protects callback
> execution with the same mutex used for per-net data creation and destruction.
> Hopefully, it won't slow down the NFS client significantly.
>
> https://jira.sw.ru/browse/PSBM-75751
>
> v5: destroy all per-net backchannel requests before transports in
> nfs_callback_down_net
>
> v4: use another mutex to protect callback execution against per-net transports
> shutdown.
> This guarantees that transports won't be destroyed by the shutdown callback
> while execution is in progress and vice versa.
>
> v3: fix a mutex deadlock, when the shutdown callback waits for the thread to
> exit (with the mutex taken), while the thread waits to take the mutex.
> The idea is to simply check whether the thread has to exit if the mutex lock
> has failed.
> This is a busy loop, but it shouldn't happen often or for long.
>
> Signed-off-by: Stanislav Kinsburskiy
> ---
>  fs/nfs/callback.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
> index 0beb275..e18d774 100644
> --- a/fs/nfs/callback.c
> +++ b/fs/nfs/callback.c
> @@ -99,6 +99,8 @@ nfs4_callback_up(struct svc_serv *serv)
>  }
>  
>  #if defined(CONFIG_NFS_V4_1)
> +static DEFINE_MUTEX(nfs41_callback_mutex);
> +
>  /*
>   * The callback service for NFSv4.1 callbacks
>   */
> @@ -117,6 +119,12 @@ nfs41_callback_svc(void *vrqstp)
>  		if (try_to_freeze())
>  			continue;
>  
> +		mutex_lock(&nfs41_callback_mutex);
> +		if (kthread_should_stop()) {
> +			mutex_unlock(&nfs41_callback_mutex);
> +			return 0;
> +		}
> +
>  		prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_INTERRUPTIBLE);
>  		spin_lock_bh(&serv->sv_cb_lock);
>  		if (!list_empty(&serv->sv_cb_list)) {
> @@ -129,8 +137,10 @@ nfs41_callback_svc(void *vrqstp)
>  			error = bc_svc_process(serv, req, rqstp);
>  			dprintk("bc_svc_process() returned w/ error code= %d\n",
>  				error);
> +			mutex_unlock(&nfs41_callback_mutex);
>  		} else {
>  			spin_unlock_bh(&serv->sv_cb_lock);
> +			mutex_unlock(&nfs41_callback_mutex);
>  			schedule();
>  			finish_wait(&serv->sv_cb_waitq, &wq);
>  		}
> @@ -242,6 +252,7 @@ static void nfs_callback_down_net(u32 minorversion, struct svc_serv *serv, struc
>  		return;
>  
>  	dprintk("NFS: destroy per-net callback data; net=%p\n", net);
> +	bc_svc_flush_queue_net(serv, net);
>  	svc_shutdown_net(serv, net);
>  }
>  
> @@ -377,7 +388,13 @@ void nfs_callback_down(int minorversion, struct net *net)
>  	struct nfs_callback_data *cb_info = &nfs_callback_info[minorversion];
>  
>  	mutex_lock(&nfs_callback_mutex);
> +#if defined(CONFIG_NFS_V4_1)
> +	mutex_lock(&nfs41_callback_mutex);
> +	nfs_callback_down_net(minorversion, cb_info->serv, net);
> +	mutex_unlock(&nfs41_callback_mutex);
> +#else
>  	nfs_callback_down_net(minorversion, cb_info->serv, net);
> +#endif
>  	cb_info->users--;
>  	if (cb_info->users == 0 && cb_info->task != NULL) {
>  		kthread_stop(cb_info->task);

Looks good to me, except there is one more call of nfs_callback_down_net() in
nfs_callback_up(). It's not clear to me whether we should protect it via the
mutex too. Should we?
[Devel] [PATCH v5 2/2] nfs: protect callback execution against per-net callback thread shutdown
From: Stanislav Kinsburskiy

Here is the race:

CPU #0                          CPU #1

cleanup_mnt                     nfs41_callback_svc (get xprt from the list)
nfs_callback_down               ...
...                             ...
svc_close_net                   ...
...                             ...
svc_xprt_free                   ...
svc_bc_sock_free                bc_svc_process
kfree(xprt)                     svc_process_common
                                rqstp->rq_xprt->xpt_ops (use after free)

The problem is that per-net SUNRPC transports shutdown is done regardless of
current callback execution. This is a race leading to transport use-after-free
in the callback handler.
This patch fixes it in a straightforward way, i.e. it protects callback
execution with the same mutex used for per-net data creation and destruction.
Hopefully, it won't slow down the NFS client significantly.

https://jira.sw.ru/browse/PSBM-75751

v5: destroy all per-net backchannel requests before transports in
nfs_callback_down_net

v4: use another mutex to protect callback execution against per-net transports
shutdown.
This guarantees that transports won't be destroyed by the shutdown callback
while execution is in progress and vice versa.

v3: fix a mutex deadlock, when the shutdown callback waits for the thread to
exit (with the mutex taken), while the thread waits to take the mutex.
The idea is to simply check whether the thread has to exit if the mutex lock
has failed.
This is a busy loop, but it shouldn't happen often or for long.

Signed-off-by: Stanislav Kinsburskiy
---
 fs/nfs/callback.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index 0beb275..e18d774 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -99,6 +99,8 @@ nfs4_callback_up(struct svc_serv *serv)
 }
 
 #if defined(CONFIG_NFS_V4_1)
+static DEFINE_MUTEX(nfs41_callback_mutex);
+
 /*
  * The callback service for NFSv4.1 callbacks
  */
@@ -117,6 +119,12 @@ nfs41_callback_svc(void *vrqstp)
 		if (try_to_freeze())
 			continue;
 
+		mutex_lock(&nfs41_callback_mutex);
+		if (kthread_should_stop()) {
+			mutex_unlock(&nfs41_callback_mutex);
+			return 0;
+		}
+
 		prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_INTERRUPTIBLE);
 		spin_lock_bh(&serv->sv_cb_lock);
 		if (!list_empty(&serv->sv_cb_list)) {
@@ -129,8 +137,10 @@ nfs41_callback_svc(void *vrqstp)
 			error = bc_svc_process(serv, req, rqstp);
 			dprintk("bc_svc_process() returned w/ error code= %d\n",
 				error);
+			mutex_unlock(&nfs41_callback_mutex);
 		} else {
 			spin_unlock_bh(&serv->sv_cb_lock);
+			mutex_unlock(&nfs41_callback_mutex);
 			schedule();
 			finish_wait(&serv->sv_cb_waitq, &wq);
 		}
@@ -242,6 +252,7 @@ static void nfs_callback_down_net(u32 minorversion, struct svc_serv *serv, struc
 		return;
 
 	dprintk("NFS: destroy per-net callback data; net=%p\n", net);
+	bc_svc_flush_queue_net(serv, net);
 	svc_shutdown_net(serv, net);
 }
 
@@ -377,7 +388,13 @@ void nfs_callback_down(int minorversion, struct net *net)
 	struct nfs_callback_data *cb_info = &nfs_callback_info[minorversion];
 
 	mutex_lock(&nfs_callback_mutex);
+#if defined(CONFIG_NFS_V4_1)
+	mutex_lock(&nfs41_callback_mutex);
+	nfs_callback_down_net(minorversion, cb_info->serv, net);
+	mutex_unlock(&nfs41_callback_mutex);
+#else
 	nfs_callback_down_net(minorversion, cb_info->serv, net);
+#endif
 	cb_info->users--;
 	if (cb_info->users == 0 && cb_info->task != NULL) {
 		kthread_stop(cb_info->task);