Re: [tipc-discussion] [PATCH net-next v5 2/6] tipc: add subscription refcount to avoid invalid delete

2017-01-18 Thread Parthasarathy Bhuvaragan
On 01/18/2017 01:07 PM, Parthasarathy Bhuvaragan wrote: > On 01/18/2017 11:06 AM, Ying Xue wrote: >> On 01/16/2017 11:41 PM, Parthasarathy Bhuvaragan wrote: >>> Until now, the subscribers keep track of the subscriptions using >>> reference count at subscriber level. At subscription cancel or >>>

[tipc-discussion] [net-next 2/4] tipc: add functionality to lookup multicast destination nodes

2017-01-18 Thread Jon Maloy
As a further preparation for the upcoming 'replicast' functionality, we add some necessary structs and functions for looking up and returning a list of all nodes that host destinations for a given multicast message. Reviewed-by: Parthasarathy Bhuvaragan

[tipc-discussion] [net-next 0/4] tipc: emulate multicast through replication

2017-01-18 Thread Jon Maloy
TIPC multicast messages are currently distributed via L2 broadcast or IP multicast to all nodes in the cluster, irrespective of the number of real destinations of the message. In this series we introduce an option to transport messages via replication ("replicast") across a selected number of

[tipc-discussion] [net-next 1/4] tipc: add function for checking broadcast support in bearer

2017-01-18 Thread Jon Maloy
As a preparation for the 'replicast' functionality we are going to introduce in the next commits, we need the broadcast base structure to store whether bearer broadcast is available at all from the currently used bearer or bearers. We do this by adding a new function tipc_bearer_bcast_support()

[tipc-discussion] [net-next 4/4] tipc: make replicast a user selectable option

2017-01-18 Thread Jon Maloy
If the bearer carrying multicast messages supports broadcast, those messages will be sent to all cluster nodes, irrespective of whether these nodes host any actual destination sockets or not. This is clearly wasteful if the cluster is large and there are only a few real destinations for the

Re: [tipc-discussion] [PATCH net-next v4 0/6] topology server fixes for nametable soft lockup

2017-01-18 Thread Parthasarathy Bhuvaragan
On 01/18/2017 11:30 AM, Xue, Ying wrote: > Hi John, > > Thank you for the testing. > > I think your suggestion is reasonable. But we need to find out its exact > scenario. Regarding the following message, after one object refcnt is > decreased to zero by one thread, another thread tries to

[tipc-discussion] [PATCH net-next v6 2/6] tipc: add subscription refcount to avoid invalid delete

2017-01-18 Thread Parthasarathy Bhuvaragan
Until now, the subscribers keep track of the subscriptions using reference count at subscriber level. At subscription cancel or subscriber delete, we delete the subscription by checking for pending timer using del_timer(). del_timer() is not SMP safe: if on CPU0 the check for pending timer returns

[tipc-discussion] [PATCH net-next v6 6/6] tipc: fix cleanup at module unload

2017-01-18 Thread Parthasarathy Bhuvaragan
In tipc_server_stop(), we iterate over the connections with the server's idr_in_use as the limiting factor. We ignore the fact that this variable is decremented in tipc_close_conn(), leading to premature exit. In this commit, we iterate until we have no connections left. Acked-by: Ying Xue
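
The premature exit described above can be illustrated with a small userspace sketch. This is not the kernel code: struct server, close_conn(), stop_bounded() and stop_until_empty() are hypothetical names, and a plain array stands in for the server's idr. It only shows why a loop bounded by a counter that the loop body decrements stops early, and how looping until the counter reaches zero avoids that.

```c
#include <stddef.h>

#define MAX_CONNS 8

/* Hypothetical stand-in for the server's connection table. */
struct server {
    int conns[MAX_CONNS]; /* nonzero means the slot holds a live connection */
    int in_use;           /* plays the role of idr_in_use */
};

/* Closing a connection also decrements the in-use counter. */
void close_conn(struct server *s, int id)
{
    if (s->conns[id]) {
        s->conns[id] = 0;
        s->in_use--;
    }
}

/* Problematic pattern: the loop bound shrinks while we close connections,
 * so the higher ids are never visited. Returns the connections left. */
int stop_bounded(struct server *s)
{
    for (int id = 0; id < s->in_use; id++)
        close_conn(s, id);
    return s->in_use;
}

/* Pattern described in the commit message: keep scanning ids until no
 * connections are left. Returns the connections left. */
int stop_until_empty(struct server *s)
{
    for (int id = 0; s->in_use > 0 && id < MAX_CONNS; id++)
        close_conn(s, id);
    return s->in_use;
}
```

With four live connections in slots 0 to 3, the bounded loop stops after closing two of them, while the second loop closes all four.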

[tipc-discussion] [PATCH net-next v6 4/6] tipc: fix nametbl_lock soft lockup at module exit

2017-01-18 Thread Parthasarathy Bhuvaragan
Commit 333f796235a527 ("tipc: fix a race condition leading to subscriber refcnt bug") reveals a soft lockup while acquiring nametbl_lock. Before commit 333f796235a527, we call tipc_conn_shutdown() from tipc_close_conn() in the context of tipc_topsrv_stop(). In that context, we are allowed to grab

[tipc-discussion] [PATCH net-next v6 5/6] tipc: ignore requests when the connection state is not CONNECTED

2017-01-18 Thread Parthasarathy Bhuvaragan
In tipc_conn_sendmsg(), we first queue the request to the outqueue and only then check the connection state. If the connection is not connected, we should not queue this message. In this commit, we reject the messages if the connection state is not CF_CONNECTED. Acked-by: Ying Xue
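
A minimal userspace sketch of the ordering described above, checking the connection state before anything is queued. The struct conn type, the conn_sendmsg() function, the fixed-size queue and the choice of -EPIPE and -ENOBUFS as error codes are all assumptions made for illustration; the patch itself is not shown in this preview.

```c
#include <errno.h>
#include <stdbool.h>

#define OUTQ_CAP 4

/* Hypothetical connection object. */
struct conn {
    bool connected;      /* stands in for the CF_CONNECTED flag */
    int outq[OUTQ_CAP];  /* simplified outqueue of message ids */
    int outq_len;
};

/* Reject the message up front when the connection is not connected,
 * so that nothing is ever queued on a dead connection. */
int conn_sendmsg(struct conn *c, int msg)
{
    if (!c->connected)
        return -EPIPE;   /* assumed error code */
    if (c->outq_len == OUTQ_CAP)
        return -ENOBUFS; /* assumed error code for a full queue */
    c->outq[c->outq_len++] = msg;
    return 0;
}
```

Doing the check first means a rejected message leaves the outqueue untouched, so there is nothing to clean up afterwards.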

[tipc-discussion] [PATCH net-next v6 1/6] tipc: fix nametbl_lock soft lockup at node/link events

2017-01-18 Thread Parthasarathy Bhuvaragan
We trigger a soft lockup as we grab nametbl_lock twice if the node has a pending node up/down or link up/down event while: - we process an incoming named message in tipc_named_rcv() and perform a tipc_update_nametbl(). - we have pending backlog items in the name distributor queue during a

[tipc-discussion] [PATCH net-next v6 0/6] topology server fixes for nametable soft lockup

2017-01-18 Thread Parthasarathy Bhuvaragan
In this series, we revert the commit 333f796235a527 ("tipc: fix a race condition leading to subscriber refcnt bug") and provide an alternate solution to fix the race conditions in commits 2-4. We have to do this as the above commit introduced a nametbl soft lockup at module exit as described by

Re: [tipc-discussion] [PATCH net-next v5 2/6] tipc: add subscription refcount to avoid invalid delete

2017-01-18 Thread Parthasarathy Bhuvaragan
On 01/18/2017 11:06 AM, Ying Xue wrote: > On 01/16/2017 11:41 PM, Parthasarathy Bhuvaragan wrote: >> Until now, the subscribers keep track of the subscriptions using >> reference count at subscriber level. At subscription cancel or >> subscriber delete, we delete the subscription by checking for

Re: [tipc-discussion] [PATCH net-next v4 0/6] topology server fixes for nametable soft lockup

2017-01-18 Thread Xue, Ying
Hi John, Thank you for the testing. I think your suggestion is reasonable. But we need to find out its exact scenario. Regarding the following message, after one object refcnt is decreased to zero by one thread, another thread tries to increment its refcnt, which means that we have a race

Re: [tipc-discussion] [PATCH net-next v5 2/6] tipc: add subscription refcount to avoid invalid delete

2017-01-18 Thread Ying Xue
On 01/16/2017 11:41 PM, Parthasarathy Bhuvaragan wrote: > Until now, the subscribers keep track of the subscriptions using > reference count at subscriber level. At subscription cancel or > subscriber delete, we delete the subscription by checking for > pending timer using del_timer(). del_timer()