Commit 333f796235a527 ("tipc: fix a race condition leading to subscriber refcnt bug") reveals a soft lockup while acquiring nametbl_lock.
Before commit 333f796235a527, we call tipc_conn_shutdown() from tipc_close_conn() in the context of tipc_topsrv_stop(). In that context, we are allowed to grab the nametbl_lock. In commit 333f796235a527, i moved the tipc_conn_release (renamed from tipc_conn_shutdown) to the connection refcount cleanup. This allows either tipc_nametbl_withdraw() or tipc_topsrv_stop() to perform tipc_sock_release(). Since tipc_exit_net() first calls tipc_topsrv_stop() and then tipc_nametble_withdraw() increases the chances for the later to perform the connection cleanup. The soft lockup occurs in the call chain of tipc_nametbl_withdraw(), when it performs the tipc_conn_kref_release() as it tries to grab nametbl_lock again while holding it already. tipc_nametbl_withdraw() grabs nametbl_lock tipc_nametbl_remove_publ() tipc_subscrp_report_overlap() tipc_subscrp_send_event() tipc_conn_sendmsg() << if (con->flags != CF_CONNECTED) we do conn_put(), triggering the cleanup as refcount=0. >> tipc_conn_kref_release tipc_sock_release tipc_conn_release tipc_subscrb_delete tipc_subscrp_delete tipc_nametbl_unsubscribe << Soft Lockup >> Until now, tipc_server_stop() grabs and releases the idr_lock twice for every connection. Once to find the connection and second to unset connection flag, remove it from list and decrement the refcount. The above lockup occurs when tipc_nametbl_withdraw() grabs the connection in between the two, thus owning the connection and triggering the cleanup while decrementing the refcount later. In this commit, we perform: - call tipc_nametbl_withdraw() before tipc_topsrv_stop() to avoid: 1. soft lockup 2. its too late to actually notify the subscribers, as the topology server might already have started shutting down. - In tipc_server_stop(), we remove all the connections from connection list in the scope of the idr_lock to prevent any other thread finding a connection which has unset CF_CONNECTED in its flags. Fixes: 333f796235a52727 ("tipc: fix a race condition leading to subscriber refcnt bug") Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvara...@ericsson.com> --- v2: commit message update. cleanup tipc_server_stop() as per Ying Xue. --- net/tipc/core.c | 4 ++++ net/tipc/net.c | 2 -- net/tipc/server.c | 13 ++++++++----- 3 files changed, 12 insertions(+), 7 deletions(-) diff --git a/net/tipc/core.c b/net/tipc/core.c index 236b043a4156..3ea1452e2f06 100644 --- a/net/tipc/core.c +++ b/net/tipc/core.c @@ -93,6 +93,10 @@ static int __net_init tipc_init_net(struct net *net) static void __net_exit tipc_exit_net(struct net *net) { + struct tipc_net *tn = net_generic(net, tipc_net_id); + + tipc_nametbl_withdraw(net, TIPC_CFG_SRV, tn->own_addr, 0, + tn->own_addr); tipc_topsrv_stop(net); tipc_net_stop(net); tipc_bcast_stop(net); diff --git a/net/tipc/net.c b/net/tipc/net.c index 28bf4feeb81c..92696cc6e763 100644 --- a/net/tipc/net.c +++ b/net/tipc/net.c @@ -130,8 +130,6 @@ void tipc_net_stop(struct net *net) if (!tn->own_addr) return; - tipc_nametbl_withdraw(net, TIPC_CFG_SRV, tn->own_addr, 0, - tn->own_addr); rtnl_lock(); tipc_bearer_stop(net); tipc_node_stop(net); diff --git a/net/tipc/server.c b/net/tipc/server.c index 215849ce453d..54c23ecccd26 100644 --- a/net/tipc/server.c +++ b/net/tipc/server.c @@ -623,16 +623,19 @@ int tipc_server_start(struct tipc_server *s) void tipc_server_stop(struct tipc_server *s) { struct tipc_conn *con; - int total = 0; int id; spin_lock_bh(&s->idr_lock); - for (id = 0; total < s->idr_in_use; id++) { + for (id = 0; s->idr_in_use; id++) { con = idr_find(&s->conn_idr, id); - if (con) { - total++; + if (con && test_and_clear_bit(CF_CONNECTED, &con->flags)) { + idr_remove(&s->conn_idr, con->conid); + s->idr_in_use--; + /* release spinlock as kernel_sock_shutdown can sleep. + */ spin_unlock_bh(&s->idr_lock); - tipc_close_conn(con); + kernel_sock_shutdown(con->sock, SHUT_RDWR); + conn_put(con); spin_lock_bh(&s->idr_lock); } } -- 2.1.4 ------------------------------------------------------------------------------ _______________________________________________ tipc-discussion mailing list tipc-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/tipc-discussion