From: Santosh Shilimkar <santosh.shilim...@oracle.com> Date: Wed, 24 Jan 2018 13:32:20 -0800
> On 1/24/2018 1:03 PM, Sowmini Varadhan wrote: >> An rds_connection can get added during netns deletion between lines >> 528 >> and 529 of >> 506 static void rds_tcp_kill_sock(struct net *net) >> : >> /* code to pull out all the rds_connections that should be destroyed >> */ >> : >> 528 spin_unlock_irq(&rds_tcp_conn_lock); >> 529 list_for_each_entry_safe(tc, _tc, &tmp_list, t_tcp_node) >> 530 rds_conn_destroy(tc->t_cpath->cp_conn); >> Such an rds_connection would miss out the rds_conn_destroy() >> loop (that cancels all pending work) and (if it was scheduled >> after netns deletion) could trigger the use-after-free. >> A similar race-window exists for the module unload path >> in rds_tcp_exit -> rds_tcp_destroy_conns >> To avoid the addition of new rds_connections during kill_sock >> or netns_delete, this patch introduces a per-netns flag, >> RTN_DELETE_PENDING, that will cause RDS connection creation to fail. >> RCU is used to make sure that we wait for the critical >> section of __rds_conn_create threads (that may have started before >> the setting of RTN_DELETE_PENDING) to complete before starting >> the connection destruction. >> Reported-by: syzbot+bbd8e9a06452cc480...@syzkaller.appspotmail.com >> Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> >> --- >> net/rds/connection.c | 3 ++ >> net/rds/tcp.c | 82 ++++++++++++++++++++++++++++++++----------------- >> net/rds/tcp.h | 1 + >> 3 files changed, 57 insertions(+), 29 deletions(-) >> > FWIW, > Acked-by: Santosh Shilimkar <santosh.shilim...@oracle.com> > > Just for archives, just summarizing off-list discussion. Netns > destroy making use of conn_destroy now which in past was used for only > module unload is racy. > > Its not possible to make it race free with just flags alone and needs > rcu sync kind of mechanism. RDS being sensitive to brownouts on > reconnects, rcu usage was has been minimised. Netns delete > is expected to be non-frequent operation and hence usage of rcu as > done in this patch is probably ok. If needed it will be revisited in > future for optimization. Please have a look at: commit 4ee806d51176ba7b8ff1efd81f271d7252e03a1d Author: Dan Streetman <ddstr...@ieee.org> Date: Thu Jan 18 16:14:26 2018 -0500 net: tcp: close sock if net namespace is exiting in the 'net' tree, it adds a helper 'check_net()' that you can probably use to guard RDS connection creation in order to avoid these races, and therefore without the RDS specific RTN_PENDING_DELETE flag. Thank you.