On Feb 22, 2016 11:22 PM, "Xue, Ying" <ying....@windriver.com> wrote: > > Hi Jon, > > I think the scenario described below is not true. This is because del_timer() doesn't return 1 while node_timeout() is being called. Please take a look at __run_timers() defined in kernel/time/timer.c. When a timer is expired, __run_timers() will call the timeout function attached to the timer. But before __run_timers() runs timeout function, it first detaches the expired timer,
Hi Ying, But the timeout function here do mod_timer() which attach itself again? and then performs its timeout function. That means del_timer() will definitely return 0 while node_timeout() is under execution. So, the following scenario cannot happen at all. > > Regards, > Ying > > -----Original Message----- > From: Jon Maloy [mailto:jon.ma...@ericsson.com] > Sent: 2016年2月20日 9:17 > To: tipc-discussion@lists.sourceforge.net; parthasarathy.bhuvara...@ericsson.com; Xue, Ying; richard.a...@ericsson.com; jon.ma...@ericsson.com; huzhiji...@gmail.com > Cc: ma...@donjonn.com > Subject: [PATCH net-next v3 1/1] tipc: fix crash during node removal > > When the TIPC module is unloaded, we have identified a race condition that allows a node reference counter to go to zero and the node instance being freed before the node timer is finished with accessing it. This leads to occasional crashes, especially in multi-namespace environments. > > The scenario goes as follows: > > CPU0:(node_stop) CPU1:(node_timeout) // ref == 2 > > 1: if(!mod_timer()) > 2: if (del_timer()) > 3: tipc_node_put() // ref -> 1 > 4: tipc_node_put() // ref -> 0 > 5: kfree_rcu(node); > 6: tipc_node_get(node) > 7: // BOOM! > > We now clean up this functionality as follows: > > 1) We remove the node pointer from the node lookup table before we > attempt deactivating the timer. This way, we reduce the risk that > tipc_node_find() may obtain a valid pointer to an instance marked > for deletion; a harmless but undesirable situation. > > 2) We use del_timer_sync() instead of del_timer() to safely deactivate > the node timer without any risk that it might be reactivated by the > timeout handler. There is no risk of deadlock here, since the two > functions never touch the same spinlocks. > > 3: We remove a pointless tipc_node_get() + tipc_node_put() from the > timeout handler. > > v3: - changed to del_timer_sync() > - don't test for return value of mod_timer() in tipc_node_timeout(), > and don't touch kref. > > Reported-by: Zhijiang Hu <huzhiji...@gmail.com> > Signed-off-by: Jon Maloy <jon.ma...@ericsson.com> > --- > net/tipc/node.c | 22 ++++++++++------------ > 1 file changed, 10 insertions(+), 12 deletions(-) > > diff --git a/net/tipc/node.c b/net/tipc/node.c index f310e931..1fdaed0 100644 > --- a/net/tipc/node.c > +++ b/net/tipc/node.c > @@ -225,9 +225,10 @@ static unsigned int tipc_hashfn(u32 addr) > > static void tipc_node_kref_release(struct kref *kref) { > - struct tipc_node *node = container_of(kref, struct tipc_node, kref); > + struct tipc_node *n = container_of(kref, struct tipc_node, kref); > > - tipc_node_delete(node); > + kfree(n->bc_entry.link); > + kfree_rcu(n, rcu); > } > > static void tipc_node_put(struct tipc_node *node) @@ -395,8 +396,10 @@ static void tipc_node_delete(struct tipc_node *node) { > list_del_rcu(&node->list); > hlist_del_rcu(&node->hash); > - kfree(node->bc_entry.link); > - kfree_rcu(node, rcu); > + tipc_node_put(node); > + > + del_timer_sync(&node->timer); > + tipc_node_put(node); > } > > void tipc_node_stop(struct net *net) > @@ -405,11 +408,8 @@ void tipc_node_stop(struct net *net) > struct tipc_node *node, *t_node; > > spin_lock_bh(&tn->node_list_lock); > - list_for_each_entry_safe(node, t_node, &tn->node_list, list) { > - if (del_timer(&node->timer)) > - tipc_node_put(node); > - tipc_node_put(node); > - } > + list_for_each_entry_safe(node, t_node, &tn->node_list, list) > + tipc_node_delete(node); > spin_unlock_bh(&tn->node_list_lock); > } > > @@ -530,9 +530,7 @@ static void tipc_node_timeout(unsigned long data) > if (rc & TIPC_LINK_DOWN_EVT) > tipc_node_link_down(n, bearer_id, false); > } > - if (!mod_timer(&n->timer, jiffies + n->keepalive_intv)) > - tipc_node_get(n); > - tipc_node_put(n); > + mod_timer(&n->timer, jiffies + n->keepalive_intv); > } > > /** > -- > 1.9.1 > ------------------------------------------------------------------------------ Site24x7 APM Insight: Get Deep Visibility into Application Performance APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month Monitor end-to-end web transactions and take corrective actions now Troubleshoot faster and improve end-user experience. Signup Now! http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 _______________________________________________ tipc-discussion mailing list tipc-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/tipc-discussion