In a dual bearer configuration, if the second tipc link becomes
active while the first link still has pending nametable "bulk"
updates, it randomly leads to a reset of the second link.

We have seen this occurring frequently in cloud environments
while scaling up nodes. One trigger is a node that is fully
configured w.r.t. tipc while the underlying media is still not up.

Consider the following scenario between 2 nodes.
On node 1:
1. configure tipc for 2 bearers when the media is up.
On node 2:
1. enable two tipc bearers when the media is down
2. start an application that publishes 10K services in the nametable
   (a sketch of such a publisher follows this list).
3. bring up the media for the first bearer.
4. bring up the media for the second bearer.
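
For reference, a minimal sketch of such a publisher. The service type
(18888), the single SOCK_RDM socket and the repeated bind() calls are
just one way to create the publications, not something this patch
depends on:

  #include <string.h>
  #include <stdio.h>
  #include <unistd.h>
  #include <sys/socket.h>
  #include <linux/tipc.h>

  int main(void)
  {
      struct sockaddr_tipc addr;
      int sd, i;

      sd = socket(AF_TIPC, SOCK_RDM, 0);
      if (sd < 0) {
          perror("socket");
          return 1;
      }
      memset(&addr, 0, sizeof(addr));
      addr.family = AF_TIPC;
      addr.addrtype = TIPC_ADDR_NAMESEQ;
      addr.scope = TIPC_CLUSTER_SCOPE;
      addr.addr.nameseq.type = 18888;      /* arbitrary service type */

      /* each successful bind() adds one more publication */
      for (i = 0; i < 10000; i++) {
          addr.addr.nameseq.lower = i;
          addr.addr.nameseq.upper = i;
          if (bind(sd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
              perror("bind");
              return 1;
          }
      }
      pause();                             /* keep the publications alive */
      return 0;
  }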

On node 2:
- Between 3 & 4 above, it may happen that when establishing the second
  link the first link queue contains pending nametable "bulk" updates.
- The pending bulk updates are now sent also via the second link using
  the TUNNEL_PROTOCOL. The synch point for the second link is based on
  the message count of the tunnel packets i.e it includes even the
  NAME_DISTRIBUTOR messages in the transmit and backlog link queue.
- As the first link contains several publications, the skb's are filled
  to the link mtu. Thus while adding the tunnel header on these skb's,
  they exceed the link mtu and message transmission fails (-EMSGSIZE).
- All skb's except the last one which has a smaller size are transmitted.
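
To put rough numbers on this (INT_H_SIZE is 40 bytes; the mtu depends
on the bearer, ~1500 bytes is assumed here purely for illustration):

    NAME_DISTRIBUTOR skb:     ~ mtu          (payload sized from the node mtu)
  + TUNNEL_PROTOCOL header:   + INT_H_SIZE   (40 bytes)
  -------------------------------------------------------------------
    tunnel packet:            > link mtu     => rejected with -EMSGSIZE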

On node 1:
- The link fsm waits for the sync point, which is never reached.
- Thus the links time out, and all the messages queued to be sent on
  the newly established link are delayed considerably (see the rough
  numbers below).
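
Rough numbers, again assuming ~1500 byte bulk skbs (each distr_item is
ITEM_SIZE = 20 bytes):

  10000 publications * 20 bytes  ~= 200 KB of bulk payload
                                 ~= 130+ mtu-sized NAME_DISTRIBUTOR skbs

The sync point accounts for all of these tunnel packets, but only the
last, smaller skb is ever delivered over the second link, so the sync
point is never reached before the link times out.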

When the first link is established, named_distribute() fills each skb,
based on the node mtu (which allows room for the TUNNEL_PROTOCOL
header), with a NAME_DISTRIBUTOR message carrying one item per
PUBLICATION. However, named_prepare_buf() then allocates the buffer
with INT_H_SIZE added on top of the node mtu (to hold the
NAME_DISTRIBUTOR header). This consumes the space allocated for
TUNNEL_PROTOCOL.
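
With the same illustrative numbers (mtu = 1500, INT_H_SIZE = 40,
ITEM_SIZE = 20), the sizing before and after this patch works out as:

  before: msg_dsz   = (1500 / 20) * 20           = 1500
          named msg = INT_H_SIZE + msg_dsz       = 1540
          (the headroom meant for TUNNEL_PROTOCOL is consumed)

  after:  msg_dsz   = rounddown(1500 - 40, 20)   = 1460
          named msg = INT_H_SIZE + msg_dsz       = 1500
          (within the node mtu, so a tunnel header can still be added)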

In this commit, we adjust the size of the name distributor messages so
that they can be tunnelled.

Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvara...@ericsson.com>
---
 net/tipc/name_distr.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c
index 6b626a64b517..ec455b84f389 100644
--- a/net/tipc/name_distr.c
+++ b/net/tipc/name_distr.c
@@ -62,6 +62,8 @@ static void publ_to_item(struct distr_item *i, struct publication *p)
 
 /**
  * named_prepare_buf - allocate & initialize a publication message
+ *
+ * The buffer returned is of size INT_H_SIZE + payload size
  */
 static struct sk_buff *named_prepare_buf(struct net *net, u32 type, u32 size,
                                         u32 dest)
@@ -141,8 +143,8 @@ static void named_distribute(struct net *net, struct sk_buff_head *list,
        struct publication *publ;
        struct sk_buff *skb = NULL;
        struct distr_item *item = NULL;
-       uint msg_dsz = (tipc_node_get_mtu(net, dnode, 0) / ITEM_SIZE) *
-                       ITEM_SIZE;
+       uint max_named_msg_sz = tipc_node_get_mtu(net, dnode, 0) - INT_H_SIZE;
+       uint msg_dsz = rounddown(max_named_msg_sz, ITEM_SIZE);
        uint msg_rem = msg_dsz;
 
        list_for_each_entry(publ, pls, local_list) {
-- 
2.1.4

