From: Jon Maloy <jon.ma...@ericsson.com>
Date: Wed, 24 Feb 2016 11:10:48 -0500

> We have identified a race condition during TIPC module unload that
> allows a node's reference counter to drop to zero, and the node instance
> to be freed, before the node timer has finished accessing it. This
> leads to occasional crashes, especially in multi-namespace environments.
> 
> The scenario goes as follows:
> 
> CPU0:(node_stop)                       CPU1:(node_timeout)  // ref == 2
> 
> 1:                                          if(!mod_timer())
> 2: if (del_timer())
> 3:   tipc_node_put()                                        // ref -> 1
> 4: tipc_node_put()                                          // ref -> 0
> 5:   kfree_rcu(node);
> 6:                                               tipc_node_get(node)
> 7:                                               // BOOM!
> 
> We now clean up this functionality as follows:
> 
> 1) We remove the node pointer from the node lookup table before we
>    attempt deactivating the timer. This way, we reduce the risk that
>    tipc_node_find() may obtain a valid pointer to an instance marked
>    for deletion; a harmless but undesirable situation.
> 
> 2) We use del_timer_sync() instead of del_timer() to safely deactivate
>    the node timer without any risk that it might be reactivated by the
>    timeout handler. There is no risk of deadlock here, since the two
>    functions never touch the same spinlocks.
> 
> 3) We remove a pointless tipc_node_get() + tipc_node_put() from the
>    timeout handler.
> 
> Reported-by: Zhijiang Hu <huzhiji...@gmail.com>
> Acked-by: Ying Xue <ying....@windriver.com>
> Signed-off-by: Jon Maloy <jon.ma...@ericsson.com>

Applied.
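
For readers following the scenario in the commit message, here is a minimal
userspace sketch that replays the two orderings deterministically in a single
thread. It is not the kernel code: struct node_model, node_get(), node_put(),
replay_old_teardown() and replay_fixed_teardown() are hypothetical names made
up for illustration, and the timer is reduced to a single "pending" flag. It
assumes, as in the diagram, that one reference belongs to the lookup table and
one to the armed timer, and it models del_timer_sync() simply as "the handler
runs to completion before the timer is deleted".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a node: NOT the kernel's struct tipc_node. */
struct node_model {
    int  refcnt;          /* models the node reference counter        */
    bool timer_pending;   /* models "timer is armed"                  */
    bool freed;           /* set when the last reference is dropped   */
    bool use_after_free;  /* set if a freed node is touched           */
};

static void node_get(struct node_model *n)
{
    if (n->freed)
        n->use_after_free = true;   /* step 7 in the diagram: BOOM */
    else
        n->refcnt++;
}

static void node_put(struct node_model *n)
{
    if (--n->refcnt == 0)
        n->freed = true;            /* stands in for kfree_rcu() */
}

/* Old ordering: del_timer() does not wait for the running handler. */
struct node_model replay_old_teardown(void)
{
    struct node_model n = { .refcnt = 2 };

    /* CPU1, step 1: handler re-arms an inactive timer (mod_timer() == 0). */
    bool was_pending = n.timer_pending;
    n.timer_pending = true;

    /* CPU0, steps 2-3: del_timer() finds the timer pending, drops its ref. */
    if (n.timer_pending) {
        n.timer_pending = false;
        node_put(&n);               /* ref -> 1 */
    }
    /* CPU0, steps 4-5: drop the table's reference, node is freed. */
    node_put(&n);                   /* ref -> 0 */

    /* CPU1, step 6: handler resumes and takes a reference on a dead node. */
    if (!was_pending)
        node_get(&n);
    return n;
}

/* Fixed ordering: the handler finishes first and takes no extra reference. */
struct node_model replay_fixed_teardown(void)
{
    struct node_model n = { .refcnt = 2 };

    /* CPU1: handler re-arms the timer and returns, with no get/put. */
    n.timer_pending = true;

    /* CPU0: modelled del_timer_sync() runs only after the handler is done. */
    bool was_pending = n.timer_pending;
    n.timer_pending = false;
    if (was_pending)
        node_put(&n);               /* timer's reference, ref -> 1 */
    node_put(&n);                   /* table's reference, ref -> 0 */
    return n;
}
```

Calling replay_old_teardown() leaves use_after_free set, while
replay_fixed_teardown() ends with the node freed exactly once and no late
access. The model says nothing about real concurrency or RCU grace periods;
it only shows why ordering the final put after the handler matters.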
