> -----Original Message-----
> From: Parthasarathy Bhuvaragan
> Sent: Friday, 29 April, 2016 07:45
> To: Jon Maloy; tipc-discussion@lists.sourceforge.net; Ying Xue
> Cc: Richard Alpe; ma...@donjonn.com
> Subject: Re: [PATCH net-next v3 1/1] tipc: add neighbor monitoring framework
> 
> Hi Jon,
> 
> Thanks for the review and feedback.
> 
> Can we bump the protocol minor version to 2.1, as we are extending the
> protocol by introducing a new link supervision algorithm and improved socket
> layer flow control in subsequent series?

I agree with that. But I do have at least one more new "capability" in the
pipe, so maybe we should wait until then, i.e., a couple of months from now.

> 
> The protocol specification needs to be updated with the new supervision
> and flow control algorithms, as the current version explicitly describes
> full mesh and packet based flow control.

Agreed again. I will try to find time for this. (Unless you volunteer ;) )

///jon

> 
> This will give more control on when these features were introduced.
> 
> regards
> Partha
> 
> On 04/28/2016 11:29 PM, Jon Maloy wrote:
> > Hi Partha,
> > See below.
> >
> >
> >> -----Original Message-----
> >> From: Parthasarathy Bhuvaragan
> >> Sent: Thursday, 28 April, 2016 10:16
> >> To: Jon Maloy; tipc-discussion@lists.sourceforge.net; Ying Xue; Richard Alpe
> >> Cc: ma...@donjonn.com
> >> Subject: Re: [PATCH net-next v3 1/1] tipc: add neighbor monitoring framework
> >>
> >> Hi Jon,
> >>
> >> I added my interpretation of members corresponding to your description
> >> along with some comments. Please confirm if they are correct.
> >>
> >>  Description of the printout:
> >>  ----------------------------
> >>
> >>  Indented line: This peer is not actively monitored from the local node.
> >> [partha] tipc_peer->is_head || tipc_peer->is_local
> > Yes.
> >
> >>  Node: the identity of a peer node
> >> [partha] tipc_peer->addr
> > Yes.
> >
> >>  Reported members: The number of members reported by the peer in its
> >>  domain record
> >> [partha] tipc_mon_domain->member_cnt
> > Yes.
> >
> >>  Applied members: The number of peers in the monitor list that match
> >>  the contents of this peer's domain record. The peer node may see fewer
> >>  or more nodes than this node, so the lists may not match exactly. A
> >>  received domain record is matched member by member against the own list
> >>  until the end, until there is a mismatch, or until it encounters the
> >>  list's own node. In the latter two cases this number will differ from
> >>  "Reported Domain Members".
> >> [partha] tipc_peer->monitoring
> > Yes.
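
The matching rule described under "Applied members" can be sketched roughly
as below. This is a simplified illustration rather than the patch's code: it
models the monitor list as a sorted array treated as circular, and
applied_members() and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count how many leading entries of a peer's reported
 * member list line up with the nodes that follow that peer in our own
 * sorted, circular monitor list. Counting stops at the end of the record,
 * at the first mismatch, or when the record reaches our own address.
 */
static unsigned int applied_members(const uint32_t *list, size_t n,
				    size_t peer_idx,
				    const uint32_t *members,
				    unsigned int member_cnt,
				    uint32_t own_addr)
{
	unsigned int applied = 0;
	unsigned int i;

	assert(n > 0 && peer_idx < n);
	for (i = 0; i < member_cnt; i++) {
		uint32_t expected = list[(peer_idx + 1 + i) % n];

		if (members[i] != expected)
			break;		/* the two lists diverge here */
		if (members[i] == own_addr)
			break;		/* reached the list's own node */
		applied++;
	}
	return applied;
}
```

With a list of {1, 2, 3, 4, 5}, own address 1 and the peer at index 1, a
record of {3, 4, 9} stops at the mismatch, while {3, 4, 5, 1} stops at the
own node; in both cases the result is lower than the reported member count.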
> >
> >>  Domain Generation: An integer which is stepped every time something
> >>  changes (peer add/remove/up/down) in a node's local domain. This helps
> >>  the receiver of a domain record to know if it can ignore the record, or
> >>  if it must process it.
> >> [partha] tipc_mon_domain->gen
> > Yes.
> >
> >>  Domain State Map: A bit map showing a peer's view of its domain
> >>  members' up/down state. Bit 0 corresponds to its view of the first
> >>  domain member, bit 1 to the next one etc.
> >> [partha] tipc_mon_domain->up_map
> > Yes.
> >
> >>  Head State Map: A bit map showing how the preceding peers in the list
> >>  see the up/down state of this peer node. Position 0 corresponds to
> >>  the view of the immediately preceding node in the list, position 1 to
> >>  the one before that, etc.
> >> [partha] tipc_peer->head_map
> > Yes.
> >
> >> If the Head State Map differs from the Domain State Map (i.e., the preceding
> >> peers' view differs from the peer's own view for a given peer), is that an
> >> error, or are there cases where they can differ?
> > They will of course differ, but since head_map is generated bit-by-bit from
> > the preceding up_map, they must match for the applied bits. Also, if "applied
> > members" is smaller than "reported members" according to the previous, the
> > up_map of that peer will contain bits which are not reflected in the
> > corresponding head_maps. You will be able to see this by using "full mode",
> > where you will see that the last members of the domain record are not present
> > in the list, or that, in the case where the own node is a member of a domain,
> > the applied domain is truncated there.
> >
> >> From a tipc user's perspective, how does he ensure that after enabling the
> >> monitor everything is "OK"?
> >> It will be of great benefit to the user to present these attributes in a
> >> format which performs the above.
> > The closest I can imagine is if we present the bitmaps in some sort of
> > binary view, e.g.: uuuddduuuddddu etc. for both maps, but that takes space
> > (in a 1000-node cluster we will have 32 bits set in each map), and some
> > "hands-on" calculation will still be needed. An easy comparison of
> > correctness would be provided if we could present them in a matrix, but that
> > would take even more space.
> > But your suggestion is good and valid. I will try to spend some time
> > pondering on this.
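
A compact u/d rendering along those lines could look roughly like this.
map_to_string() is a hypothetical helper, not something in the patch or in
the tipc tool; it assumes bit 0 is printed first, matching the description
of the Domain State Map.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write one 'u' (up) or 'd' (down) per member into
 * buf, starting with bit 0. buf must have room for cnt + 1 bytes.
 */
static void map_to_string(uint64_t map, unsigned int cnt, char *buf)
{
	unsigned int i;

	assert(cnt <= 64);
	for (i = 0; i < cnt; i++)
		buf[i] = ((map >> i) & 1) ? 'u' : 'd';
	buf[cnt] = '\0';
}
```

Printing the up_map and the head_map of the same peer this way, one above
the other, would at least make a visual comparison of the applied bits
possible, at the cost of the extra space mentioned above.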
> >
> >
> >>  Domain Member List: The actual contents of a peer's domain record, i.e.,
> >>  its reported member list. The suffix ":u" means it considers the member
> >>  to be "up", the suffix ":d" that it considers it to be down.
> >> [partha] For each of the peers listed in tipc_mon_domain->members, the
> >> peer states are derived from the tipc_mon_domain->up_map.
> >> But in full listing, we seem to miss the head state map. Or is this
> >> implicitly covered?
> > Since it is generated from the former, yes. But if we don't see it we can't
> > verify if it is correct or not. That's why I added it in the "compact" view.
> > Also, remember that the head_map is an implementation detail; we could do
> > without it at the cost of some more computing.
> > But as already said, some combined view would be desirable.
> >
> >>  List Generation: An integer which is stepped every time something changes
> >>  in the local monitor list, including changes in the local domain. When a
> >>  link timer asks the monitor whether it should send a probe message or
> >>  not, it also hands it the value of this field as it was at the previous
> >>  request. By comparing this cached value with the current list generation,
> >>  the monitor can know if it can respond with a cached copy of the
> >>  previous response, or if it has to consult the monitor list to obtain
> >>  an updated response.
> >> [partha] tipc_monitor->list_gen
> > Yes.
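
The cached-response idea behind the list generation can be sketched as
follows. struct probe_cache, cache_is_current() and cache_store() are
made-up names for illustration only; the patch keeps comparable per-link
state in struct tipc_mon_state, whose layout is not shown in this part of
the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-link cookie: the list generation seen at the previous
 * request, and the answer that was given then.
 */
struct probe_cache {
	uint16_t list_gen;
	bool probed;
	bool valid;
};

/* True if the cached answer may be reused, i.e. nothing in the monitor
 * list has changed since it was stored.
 */
static bool cache_is_current(const struct probe_cache *c, uint16_t cur_gen)
{
	assert(c);
	return c->valid && c->list_gen == cur_gen;
}

/* Remember a freshly computed answer together with its generation. */
static void cache_store(struct probe_cache *c, uint16_t cur_gen, bool probed)
{
	assert(c);
	c->list_gen = cur_gen;
	c->probed = probed;
	c->valid = true;
}
```

A caller would check cache_is_current() first and only walk the monitor
list, followed by cache_store(), when it returns false.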
> >
> > Regards
> > ///jon
> >
> >> regards
> >> Partha
> >>
> >> On 04/26/2016 11:02 PM, Jon Maloy wrote:
> >>> Hi Partha,
> >>> See attachment. I spent a little time both on suggesting a readable format
> >>> and on explaining what the different attributes mean. I hope this helps.
> >>> BR
> >>> ///jon
> >>>
> >>>> -----Original Message-----
> >>>> From: Parthasarathy Bhuvaragan
> >>>> Sent: Tuesday, 26 April, 2016 12:12
> >>>> To: Jon Maloy; tipc-discussion@lists.sourceforge.net; Ying Xue; Richard Alpe
> >>>> Cc: ma...@donjonn.com
> >>>> Subject: Re: [PATCH net-next v3 1/1] tipc: add neighbor monitoring framework
> >>>> Hi Jon,
> >>>>
> >>>> I find it hard to express the monitoring attributes in tipc tool.
> >>>> Can you provide a short description for the required parameters?
> >>>>
> >>>> regards
> >>>> Partha
> >>>>
> >>>> On 2016-04-20 18:22, Jon Maloy wrote:
> >>>>> TIPC based clusters are by default set up with full-mesh link
> >>>>> connectivity between all nodes. Those links are expected to provide
> >>>>> a short failure detection time, by default set to 1500 ms. Because
> >>>>> of this, the background load for neighbor monitoring in an N-node
> >>>>> cluster increases by a factor of N on each node, while the overall
> >>>>> monitoring traffic through the network infrastructure increases at
> >>>>> a ~(N * (N - 1)) rate. Experience has shown that such clusters don't
> >>>>> scale well beyond ~100 nodes unless we significantly increase failure
> >>>>> discovery tolerance.
> >>>>>
> >>>>> This commit introduces a framework and an algorithm that drastically
> >>>>> reduces this background load, while basically maintaining the original
> >>>>> failure detection times across the whole cluster. Using this algorithm,
> >>>>> background load will now grow at a rate of ~(2 * sqrt(N)) per node, and
> >>>>> at ~(2 * N * sqrt(N)) in traffic overhead. As an example, each node will
> >>>>> now have to actively monitor 38 neighbors in a 400-node cluster, instead
> >>>>> of 399 as before.
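
The 38-versus-399 figure can be reproduced with a few lines of C. This is
only a sketch: the helper names are invented here, ceil_sqrt() uses the same
integer loop as dom_size() in the patch but without the TIPC_MAX_MON_DOMAIN
cap, and the per-node count is assumed to be 2 * (sqrt(N) - 1), i.e.
sqrt(N) - 1 local domain members plus about as many remote heads.

```c
#include <assert.h>

/* Smallest i with i * i >= n, i.e. the integer ceiling of sqrt(n). */
static unsigned int ceil_sqrt(unsigned int n)
{
	unsigned int i = 0;

	while (i * i < n)
		i++;
	return i;
}

/* Full mesh: every node actively monitors all other nodes. */
static unsigned int full_mesh_monitored(unsigned int n)
{
	assert(n > 0);
	return n - 1;
}

/* Overlapping ring, under the assumption stated above: local domain
 * members plus roughly the same number of remote domain heads.
 */
static unsigned int ring_monitored(unsigned int n)
{
	assert(n > 0);
	return 2 * (ceil_sqrt(n) - 1);
}
```

For N = 400 this gives ring_monitored() = 38 and full_mesh_monitored() = 399,
matching the numbers in the commit message.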
> >>>>>
> >>>>> This "Overlapping Ring Supervision Algorithm" is completely distributed
> >>>>> and employs no centralized or coordinated state. It goes as follows:
> >>>>>
> >>>>> - Each node makes up a linearly ascending, circular list of all its N
> >>>>>   known neighbors, based on their TIPC node identity. This algorithm
> >>>>>   must be the same on all nodes.
> >>>>>
> >>>>> - The node then selects the next M = sqrt(N) - 1 nodes downstream from
> >>>>>   itself in the list, and chooses to actively monitor those. This is
> >>>>>   called its "local monitoring domain".
> >>>>>
> >>>>> - It creates a domain record describing the monitoring domain, and
> >>>>>   piggy-backs this in the data area of all neighbor monitoring messages
> >>>>>   (LINK_PROTOCOL/STATE) leaving that node. This means that all nodes in
> >>>>>   the cluster eventually (default within 400 ms) will learn about
> >>>>>   its monitoring domain.
> >>>>>
> >>>>> - Whenever a node discovers a change in its local domain, e.g., a node
> >>>>>   has been added or has gone down, it creates and sends out a new
> >>>>>   version of its domain record to inform all neighbors about the change.
> >>>>>
> >>>>> - A node receiving a domain record from anybody outside its local domain
> >>>>>   matches this against its own list (which may not look the same), and
> >>>>>   chooses to not actively monitor those members of the received domain
> >>>>>   record that are also present in its own list. Instead, it relies on
> >>>>>   indications from the direct monitoring nodes if an indirectly
> >>>>>   monitored node has gone up or down. If a node is indicated lost, the
> >>>>>   receiving node temporarily activates its own direct monitoring towards
> >>>>>   that node in order to confirm, or not, that it is actually gone.
> >>>>>
> >>>>> - Since each node is actively monitoring sqrt(N) downstream neighbors,
> >>>>>   each node is also actively monitored by the same number of upstream
> >>>>>   neighbors. This means that all non-direct monitoring nodes normally
> >>>>>   will receive sqrt(N) indications that a node is gone.
> >>>>>
> >>>>> - A major drawback with ring monitoring is how it handles failures that
> >>>>>   cause massive network partitionings. If both a lost node and all its
> >>>>>   direct monitoring neighbors are inside the lost partition, the nodes in
> >>>>>   the remaining partition will never receive indications about the loss.
> >>>>>   To overcome this, each node also chooses to actively monitor some
> >>>>>   nodes outside its local domain. Those nodes are called remote domain
> >>>>>   "heads", and are selected in such a way that no node in the cluster
> >>>>>   will be more than two direct monitoring hops away. Because of this,
> >>>>>   each node, apart from monitoring the members of its local domain, will
> >>>>>   also typically monitor sqrt(N) remote head nodes.
> >>>>>
> >>>>> - As an optimization, local list status, domain status and domain
> >>>>>   records are marked with a generation number. This saves senders from
> >>>>>   unnecessarily conveying unaltered domain records, and receivers from
> >>>>>   performing unneeded re-adaptations of their node monitoring list, such
> >>>>>   as re-assigning domain heads.
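
The first two steps of the list above (a sorted circular list, then the
next M nodes downstream) can be sketched as follows. This is an
illustration over a plain sorted array, not the linked-list code from the
patch, and select_local_domain() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy the m nodes that follow list[self_idx] in a
 * sorted list treated as circular into domain[]. Returns the number of
 * entries copied, which is capped at the number of other nodes.
 */
static size_t select_local_domain(const uint32_t *list, size_t n,
				  size_t self_idx, size_t m,
				  uint32_t *domain)
{
	size_t i;

	assert(n > 0 && self_idx < n);
	if (m > n - 1)
		m = n - 1;	/* cannot monitor more peers than exist */
	for (i = 0; i < m; i++)
		domain[i] = list[(self_idx + 1 + i) % n];
	return m;
}
```

In a 9-node list 1001..1009, the node at index 7 (1008) with m = 2 selects
1009 and then wraps around to 1001. Because every node sorts by the same
key, each node's domain overlaps with those of its upstream neighbors,
which is where the "overlapping ring" name comes from.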
> >>>>>
> >>>>> v2: - Updated according to comments from Richard Alpe. His proposal
> >>>>>       for the peer_xxx() functions didn't work out well, but for the
> >>>>>       rest it was ok.
> >>>>>     - Added a breakpoint (cluster size) where monitoring algo should
> >>>>>       switch from "full mesh" to "overlapping ring". Default value
> >>>>>       32, but I even tested to set it to 100 without any problems.
> >>>>>       This should be the variable Partha should make configurable,
> >>>>>       instead of just toggling as I suggested originally. By setting
> >>>>>       the breakpoint to 0 ring monitoring is always on, by setting it
> >>>>>       to some high number it is always off.
> >>>>>
> >>>>> v3: - Renamed "mon_breakpoint" to "mon_threshold"
> >>>>>     - Added code to revert to full-mesh monitoring when a node is
> >>>>>       removed and we reach the monitor threshold from above.
> >>>>>
> >>>>> Signed-off-by: Jon Maloy <jon.ma...@ericsson.com>
> >>>>> ---
> >>>>>  net/tipc/Makefile  |   2 +-
> >>>>>  net/tipc/addr.h    |   1 +
> >>>>>  net/tipc/bearer.c  |   8 +-
> >>>>>  net/tipc/bearer.h  |   2 +-
> >>>>>  net/tipc/core.c    |   1 +
> >>>>>  net/tipc/core.h    |  15 +-
> >>>>>  net/tipc/link.c    |  33 ++-
> >>>>>  net/tipc/monitor.c | 590 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>  net/tipc/monitor.h |  72 +++++++
> >>>>>  net/tipc/node.c    |  25 ++-
> >>>>>  10 files changed, 724 insertions(+), 25 deletions(-)
> >>>>>  create mode 100644 net/tipc/monitor.c
> >>>>>  create mode 100644 net/tipc/monitor.h
> >>>>>
> >>>>> diff --git a/net/tipc/Makefile b/net/tipc/Makefile
> >>>>> index 57e460b..31b9f9c 100644
> >>>>> --- a/net/tipc/Makefile
> >>>>> +++ b/net/tipc/Makefile
> >>>>> @@ -6,7 +6,7 @@ obj-$(CONFIG_TIPC) := tipc.o
> >>>>>
> >>>>>  tipc-y += addr.o bcast.o bearer.o \
> >>>>>            core.o link.o discover.o msg.o  \
> >>>>> -          name_distr.o  subscr.o name_table.o net.o  \
> >>>>> +          name_distr.o  subscr.o monitor.o name_table.o net.o  \
> >>>>>            netlink.o netlink_compat.o node.o socket.o eth_media.o \
> >>>>>            server.o socket.o
> >>>>>
> >>>>> diff --git a/net/tipc/addr.h b/net/tipc/addr.h
> >>>>> index 93f7c98..64f4004 100644
> >>>>> --- a/net/tipc/addr.h
> >>>>> +++ b/net/tipc/addr.h
> >>>>> @@ -73,4 +73,5 @@ int tipc_addr_node_valid(u32 addr);
> >>>>>  int tipc_in_scope(u32 domain, u32 addr);
> >>>>>  int tipc_addr_scope(u32 domain);
> >>>>>  char *tipc_addr_string_fill(char *string, u32 addr);
> >>>>> +
> >>>>>  #endif
> >>>>> diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
> >>>>> index 6f11c62..9a70e1d 100644
> >>>>> --- a/net/tipc/bearer.c
> >>>>> +++ b/net/tipc/bearer.c
> >>>>> @@ -1,7 +1,7 @@
> >>>>>  /*
> >>>>>   * net/tipc/bearer.c: TIPC bearer code
> >>>>>   *
> >>>>> - * Copyright (c) 1996-2006, 2013-2014, Ericsson AB
> >>>>> + * Copyright (c) 1996-2006, 2013-2016, Ericsson AB
> >>>>>   * Copyright (c) 2004-2006, 2010-2013, Wind River Systems
> >>>>>   * All rights reserved.
> >>>>>   *
> >>>>> @@ -39,6 +39,7 @@
> >>>>>  #include "bearer.h"
> >>>>>  #include "link.h"
> >>>>>  #include "discover.h"
> >>>>> +#include "monitor.h"
> >>>>>  #include "bcast.h"
> >>>>>  #include "netlink.h"
> >>>>>
> >>>>> @@ -313,6 +314,10 @@ restart:
> >>>>>         rcu_assign_pointer(tn->bearer_list[bearer_id], b);
> >>>>>         if (skb)
> >>>>>                 tipc_bearer_xmit_skb(net, bearer_id, skb, &b->bcast_addr);
> >>>>> +
> >>>>> +       if (tipc_mon_create(net, bearer_id))
> >>>>> +               return -ENOMEM;
> >>>>> +
> >>>>>         pr_info("Enabled bearer <%s>, discovery domain %s, priority %u\n",
> >>>>>                 name,
> >>>>>                 tipc_addr_string_fill(addr_string, disc_domain), priority);
> >>>>> @@ -348,6 +353,7 @@ static void bearer_disable(struct net *net, struct tipc_bearer *b)
> >>>>>                 tipc_disc_delete(b->link_req);
> >>>>>         RCU_INIT_POINTER(tn->bearer_list[bearer_id], NULL);
> >>>>>         kfree_rcu(b, rcu);
> >>>>> +       tipc_mon_delete(net, bearer_id);
> >>>>>  }
> >>>>>
> >>>>>  int tipc_enable_l2_media(struct net *net, struct tipc_bearer *b,
> >>>>> diff --git a/net/tipc/bearer.h b/net/tipc/bearer.h
> >>>>> index f686e41..0d337c7 100644
> >>>>> --- a/net/tipc/bearer.h
> >>>>> +++ b/net/tipc/bearer.h
> >>>>> @@ -1,7 +1,7 @@
> >>>>>  /*
> >>>>>   * net/tipc/bearer.h: Include file for TIPC bearer code
> >>>>>   *
> >>>>> - * Copyright (c) 1996-2006, 2013-2014, Ericsson AB
> >>>>> + * Copyright (c) 1996-2006, 2013-2016, Ericsson AB
> >>>>>   * Copyright (c) 2005, 2010-2011, Wind River Systems
> >>>>>   * All rights reserved.
> >>>>>   *
> >>>>> diff --git a/net/tipc/core.c b/net/tipc/core.c
> >>>>> index 03a8428..8f9d8d8 100644
> >>>>> --- a/net/tipc/core.c
> >>>>> +++ b/net/tipc/core.c
> >>>>> @@ -57,6 +57,7 @@ static int __net_init tipc_init_net(struct net *net)
> >>>>>
> >>>>>         tn->net_id = 4711;
> >>>>>         tn->own_addr = 0;
> >>>>> +       tn->mon_threshold = TIPC_DEF_MON_THRESHOLD;
> >>>>>         get_random_bytes(&tn->random, sizeof(int));
> >>>>>         INIT_LIST_HEAD(&tn->node_list);
> >>>>>         spin_lock_init(&tn->node_list_lock);
> >>>>> diff --git a/net/tipc/core.h b/net/tipc/core.h
> >>>>> index 5504d63..12cade0 100644
> >>>>> --- a/net/tipc/core.h
> >>>>> +++ b/net/tipc/core.h
> >>>>> @@ -66,11 +66,13 @@ struct tipc_bc_base;
> >>>>>  struct tipc_link;
> >>>>>  struct tipc_name_table;
> >>>>>  struct tipc_server;
> >>>>> +struct tipc_monitor;
> >>>>>
> >>>>>  #define TIPC_MOD_VER "2.0.0"
> >>>>>
> >>>>> -#define NODE_HTABLE_SIZE   512
> >>>>> -#define MAX_BEARERS       3
> >>>>> +#define NODE_HTABLE_SIZE       512
> >>>>> +#define MAX_BEARERS             3
> >>>>> +#define TIPC_DEF_MON_THRESHOLD  32
> >>>>>
> >>>>>  extern int tipc_net_id __read_mostly;
> >>>>>  extern int sysctl_tipc_rmem[3] __read_mostly;
> >>>>> @@ -88,6 +90,10 @@ struct tipc_net {
> >>>>>         u32 num_nodes;
> >>>>>         u32 num_links;
> >>>>>
> >>>>> +       /* Neighbor monitoring list */
> >>>>> +       struct tipc_monitor *monitors[MAX_BEARERS];
> >>>>> +       int mon_threshold;
> >>>>> +
> >>>>>         /* Bearer list */
> >>>>>         struct tipc_bearer __rcu *bearer_list[MAX_BEARERS + 1];
> >>>>>
> >>>>> @@ -123,6 +129,11 @@ static inline struct list_head *tipc_nodes(struct net *net)
> >>>>>         return &tipc_net(net)->node_list;
> >>>>>  }
> >>>>>
> >>>>> +static inline unsigned int tipc_hashfn(u32 addr)
> >>>>> +{
> >>>>> +       return addr & (NODE_HTABLE_SIZE - 1);
> >>>>> +}
> >>>>> +
> >>>>>  static inline u16 mod(u16 x)
> >>>>>  {
> >>>>>         return x & 0xffffu;
> >>>>> diff --git a/net/tipc/link.c b/net/tipc/link.c
> >>>>> index 2e28a7d..7870e475 100644
> >>>>> --- a/net/tipc/link.c
> >>>>> +++ b/net/tipc/link.c
> >>>>> @@ -42,6 +42,7 @@
> >>>>>  #include "name_distr.h"
> >>>>>  #include "discover.h"
> >>>>>  #include "netlink.h"
> >>>>> +#include "monitor.h"
> >>>>>
> >>>>>  #include <linux/pkt_sched.h>
> >>>>>
> >>>>> @@ -96,6 +97,7 @@ struct tipc_stats {
> >>>>>   * @pmsg: convenience pointer to "proto_msg" field
> >>>>>   * @priority: current link priority
> >>>>>   * @net_plane: current link network plane ('A' through 'H')
> >>>>> + * @mon_state: cookie with information needed by link monitor
> >>>>>   * @backlog_limit: backlog queue congestion thresholds (indexed by importance)
> >>>>>   * @exp_msg_count: # of tunnelled messages expected during link changeover
> >>>>>   * @reset_rcv_checkpt: seq # of last acknowledged message at time of link reset
> >>>>> @@ -140,6 +142,7 @@ struct tipc_link {
> >>>>>         char if_name[TIPC_MAX_IF_NAME];
> >>>>>         u32 priority;
> >>>>>         char net_plane;
> >>>>> +       struct tipc_mon_state mon_state;
> >>>>>         u16 rst_cnt;
> >>>>>
> >>>>>         /* Failover/synch */
> >>>>> @@ -710,18 +713,23 @@ int tipc_link_timeout(struct tipc_link *l, struct sk_buff_head *xmitq)
> >>>>>         bool setup = false;
> >>>>>         u16 bc_snt = l->bc_sndlink->snd_nxt - 1;
> >>>>>         u16 bc_acked = l->bc_rcvlink->acked;
> >>>>> -
> >>>>> -       link_profile_stats(l);
> >>>>> +       struct tipc_mon_state *mstate = &l->mon_state;
> >>>>>
> >>>>>         switch (l->state) {
> >>>>>         case LINK_ESTABLISHED:
> >>>>>         case LINK_SYNCHING:
> >>>>> +               link_profile_stats(l);
> >>>>>                 if (l->silent_intv_cnt > l->abort_limit)
> >>>>>                         return tipc_link_fsm_evt(l, LINK_FAILURE_EVT);
> >>>>>                 mtyp = STATE_MSG;
> >>>>>                 state = bc_acked != bc_snt;
> >>>>> -               probe = l->silent_intv_cnt;
> >>>>> -               if (probe)
> >>>>> +               state |= l->rcv_unacked;
> >>>>> +               state |= skb_queue_len(&l->transmq);
> >>>>> +               state |= skb_queue_len(&l->deferdq);
> >>>>> +               probe = tipc_mon_is_probed(l->net, l->addr,
> >>>>> +                                          mstate, l->bearer_id);
> >>>>> +               probe |= l->silent_intv_cnt;
> >>>>> +               if (probe || mstate->monitored)
> >>>>>                         l->silent_intv_cnt++;
> >>>>>                 break;
> >>>>>         case LINK_RESET:
> >>>>> @@ -833,6 +841,7 @@ void tipc_link_reset(struct tipc_link *l)
> >>>>>         l->stats.recv_info = 0;
> >>>>>         l->stale_count = 0;
> >>>>>         l->bc_peer_is_up = false;
> >>>>> +       memset(&l->mon_state, 0, sizeof(l->mon_state));
> >>>>>         tipc_link_reset_stats(l);
> >>>>>  }
> >>>>>
> >>>>> @@ -1241,6 +1250,9 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe,
> >>>>>         struct tipc_msg *hdr;
> >>>>>         struct sk_buff_head *dfq = &l->deferdq;
> >>>>>         bool node_up = link_is_up(l->bc_rcvlink);
> >>>>> +       struct tipc_mon_state *mstate = &l->mon_state;
> >>>>> +       int dlen;
> >>>>> +       void *data;
> >>>>>
> >>>>>         /* Don't send protocol message during reset or link failover */
> >>>>>         if (tipc_link_is_blocked(l))
> >>>>> @@ -1253,12 +1265,13 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe,
> >>>>>                 rcvgap = buf_seqno(skb_peek(dfq)) - l->rcv_nxt;
> >>>>>
> >>>>>         skb = tipc_msg_create(LINK_PROTOCOL, mtyp, INT_H_SIZE,
> >>>>> -                             TIPC_MAX_IF_NAME, l->addr,
> >>>>> +                             tipc_max_domain_size, l->addr,
> >>>>>                               tipc_own_addr(l->net), 0, 0, 0);
> >>>>>         if (!skb)
> >>>>>                 return;
> >>>>>
> >>>>>         hdr = buf_msg(skb);
> >>>>> +       data = msg_data(hdr);
> >>>>>         msg_set_session(hdr, l->session);
> >>>>>         msg_set_bearer_id(hdr, l->bearer_id);
> >>>>>         msg_set_net_plane(hdr, l->net_plane);
> >>>>> @@ -1276,12 +1289,14 @@ static void tipc_link_build_proto_msg(struct tipc_link *l, int mtyp, bool probe,
> >>>>>                 msg_set_seq_gap(hdr, rcvgap);
> >>>>>                 msg_set_size(hdr, INT_H_SIZE);
> >>>>>                 msg_set_probe(hdr, probe);
> >>>>> +               tipc_mon_prep(l->net, data, &dlen, mstate, l->bearer_id);
> >>>>> +               msg_set_size(hdr, INT_H_SIZE + dlen);
> >>>>>                 l->stats.sent_states++;
> >>>>>                 l->rcv_unacked = 0;
> >>>>>         } else {
> >>>>>                 /* RESET_MSG or ACTIVATE_MSG */
> >>>>>                 msg_set_max_pkt(hdr, l->advertised_mtu);
> >>>>> -               strcpy(msg_data(hdr), l->if_name);
> >>>>> +               strcpy(data, l->if_name);
> >>>>>         }
> >>>>>         if (probe)
> >>>>>                 l->stats.sent_probes++;
> >>>>> @@ -1374,7 +1389,9 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb,
> >>>>>         u16 peers_tol = msg_link_tolerance(hdr);
> >>>>>         u16 peers_prio = msg_linkprio(hdr);
> >>>>>         u16 rcv_nxt = l->rcv_nxt;
> >>>>> +       u16 dlen = msg_data_sz(hdr);
> >>>>>         int mtyp = msg_type(hdr);
> >>>>> +       void *data = msg_data(hdr);
> >>>>>         char *if_name;
> >>>>>         int rc = 0;
> >>>>>
> >>>>> @@ -1403,7 +1420,7 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb,
> >>>>>                         break;
> >>>>>                 if (msg_data_sz(hdr) < TIPC_MAX_IF_NAME)
> >>>>>                         break;
> >>>>> -               strncpy(if_name, msg_data(hdr), TIPC_MAX_IF_NAME);
> >>>>> +               strncpy(if_name, data, TIPC_MAX_IF_NAME);
> >>>>>
> >>>>>                 /* Update own tolerance if peer indicates a non-zero value */
> >>>>>                 if (in_range(peers_tol, TIPC_MIN_LINK_TOL, TIPC_MAX_LINK_TOL))
> >>>>> @@ -1451,6 +1468,8 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb,
> >>>>>                                 rc = TIPC_LINK_UP_EVT;
> >>>>>                         break;
> >>>>>                 }
> >>>>> +               tipc_mon_rcv(l->net, data, dlen, l->addr,
> >>>>> +                            &l->mon_state, l->bearer_id);
> >>>>>
> >>>>>                 /* Send NACK if peer has sent pkts we haven't received yet */
> >>>>>                 if (more(peers_snd_nxt, rcv_nxt) && !tipc_link_is_synching(l))
> >>>>> diff --git a/net/tipc/monitor.c b/net/tipc/monitor.c
> >>>>> new file mode 100644
> >>>>> index 0000000..fdd6d67
> >>>>> --- /dev/null
> >>>>> +++ b/net/tipc/monitor.c
> >>>>> @@ -0,0 +1,590 @@
> >>>>> +/*
> >>>>> + * net/tipc/monitor.c
> >>>>> + *
> >>>>> + * Copyright (c) 2016, Ericsson AB
> >>>>> + * All rights reserved.
> >>>>> + *
> >>>>> + * Redistribution and use in source and binary forms, with or without
> >>>>> + * modification, are permitted provided that the following conditions are met:
> >>>>> + *
> >>>>> + * 1. Redistributions of source code must retain the above copyright
> >>>>> + *    notice, this list of conditions and the following disclaimer.
> >>>>> + * 2. Redistributions in binary form must reproduce the above copyright
> >>>>> + *    notice, this list of conditions and the following disclaimer in the
> >>>>> + *    documentation and/or other materials provided with the distribution.
> >>>>> + * 3. Neither the names of the copyright holders nor the names of its
> >>>>> + *    contributors may be used to endorse or promote products derived from
> >>>>> + *    this software without specific prior written permission.
> >>>>> + *
> >>>>> + * Alternatively, this software may be distributed under the terms of the
> >>>>> + * GNU General Public License ("GPL") version 2 as published by the Free
> >>>>> + * Software Foundation.
> >>>>> + *
> >>>>> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
> >>>>> + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> >>>>> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> >>>>> + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
> >>>>> + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> >>>>> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> >>>>> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
> >>>>> + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
> >>>>> + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> >>>>> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> >>>>> + * POSSIBILITY OF SUCH DAMAGE.
> >>>>> + */
> >>>>> +
> >>>>> +#include "core.h"
> >>>>> +#include "addr.h"
> >>>>> +#include "monitor.h"
> >>>>> +
> >>>>> +#define TIPC_MAX_MON_DOMAIN     64
> >>>>> +
> >>>>> +/* struct tipc_mon_domain: domain record to be transferred between peers
> >>>>> + * @len: actual size of domain record
> >>>>> + * @gen: current generation of sender's domain
> >>>>> + * @ack_gen: most recent generation of self's domain acked by peer
> >>>>> + * @member_cnt: number of domain member nodes described in this record
> >>>>> + * @up_map: bit map indicating which of the members the sender considers up
> >>>>> + * @members: identity of the domain members
> >>>>> + */
> >>>>> +struct tipc_mon_domain {
> >>>>> +       u16 len;
> >>>>> +       u16 gen;
> >>>>> +       u16 ack_gen;
> >>>>> +       u16 member_cnt;
> >>>>> +       u64 up_map;
> >>>>> +       u32 members[TIPC_MAX_MON_DOMAIN];
> >>>>> +};
> >>>>> +
> >>>>> +/* struct tipc_peer: state of a peer node and its domain
> >>>>> + * @addr: tipc node identity of peer
> >>>>> + * @head_map: shows which other nodes currently consider peer 'up'
> >>>>> + * @domain: most recent domain record from peer
> >>>>> + * @hash: position in hashed lookup list
> >>>>> + * @list: position in linked list, in circular ascending order by 'addr'
> >>>>> + * @monitoring: number of nodes monitored by peer, as seen from this node
> >>>>> + * @is_up: peer is up as seen from this node
> >>>>> + * @is_head: peer is assigned domain head as seen from this node
> >>>>> + * @is_local: peer is in local domain and should be continuously monitored
> >>>>> + * @confirm: - set and start probing if some other peer has lost link
> >>>>> + */
> >>>>> +struct tipc_peer {
> >>>>> +       u32 addr;
> >>>>> +       u64 head_map;
> >>>>> +       struct tipc_mon_domain *domain;
> >>>>> +       struct hlist_node hash;
> >>>>> +       struct list_head list;
> >>>>> +       unsigned int monitoring : 6;
> >>>>> +       bool is_up              : 1;
> >>>>> +       bool is_head            : 1;
> >>>>> +       bool is_local           : 1;
> >>>>> +       bool confirm            : 1;
> >>>>> +};
> >>>>> +
> >>>>> +struct tipc_monitor {
> >>>>> +       struct hlist_head peers[NODE_HTABLE_SIZE];
> >>>>> +       int peer_cnt;
> >>>>> +       struct tipc_peer *self;
> >>>>> +       rwlock_t lock;
> >>>>> +       struct tipc_mon_domain cache;
> >>>>> +       u16 list_gen;
> >>>>> +       u16 dom_gen;
> >>>>> +       bool disabled;
> >>>>> +       struct net *net;
> >>>>> +};
> >>>>> +
> >>>>> +static struct tipc_monitor *tipc_monitor(struct net *net, int bearer_id)
> >>>>> +{
> >>>>> +       return tipc_net(net)->monitors[bearer_id];
> >>>>> +}
> >>>>> +
> >>>>> +const int tipc_max_domain_size = sizeof(struct tipc_mon_domain);
> >>>>> +
> >>>>> +/* dom_rec_len(): actual size of domain record for transport
> >>>>> + */
> >>>>> +static int dom_rec_len(struct tipc_mon_domain *dom, u16 mcnt)
> >>>>> +{
> >>>>> +       return ((void *)&dom->members - (void *)dom) + (mcnt * sizeof(u32));
> >>>>> +}
> >>>>> +
> >>>>> +/* dom_size() : calculate size of own domain based on number of peers
> >>>>> + */
> >>>>> +static int dom_size(int peers)
> >>>>> +{
> >>>>> +       int i = 0;
> >>>>> +
> >>>>> +       while ((i * i) < peers)
> >>>>> +               i++;
> >>>>> +       return i < TIPC_MAX_MON_DOMAIN ? i : TIPC_MAX_MON_DOMAIN;
> >>>>> +}
> >>>>> +
> >>>>> +static void map_set(u64 *up_map, int i, unsigned int v)
> >>>>> +{
> >>>>> +       *up_map &= ~(1ULL << i);
> >>>>> +       *up_map |= ((u64)v << i);
> >>>>> +}
> >>>>> +
> >>>>> +static int map_get(u64 up_map, int i)
> >>>>> +{
> >>>>> +       return (up_map & (1ULL << i)) >> i;
> >>>>> +}
> >>>>> +
> >>>>> +static struct tipc_peer *peer_prev(struct tipc_peer *peer)
> >>>>> +{
> >>>>> +       return list_last_entry(&peer->list, struct tipc_peer, list);
> >>>>> +}
> >>>>> +
> >>>>> +static struct tipc_peer *peer_nxt(struct tipc_peer *peer)
> >>>>> +{
> >>>>> +       return list_first_entry(&peer->list, struct tipc_peer, list);
> >>>>> +}
> >>>>> +
> >>>>> +static struct tipc_peer *peer_head(struct tipc_peer *peer)
> >>>>> +{
> >>>>> +       while (!peer->is_head)
> >>>>> +               peer = peer_prev(peer);
> >>>>> +       return peer;
> >>>>> +}
> >>>>> +
> >>>>> +static struct tipc_peer *get_peer(struct tipc_monitor *mon, u32 addr)
> >>>>> +{
> >>>>> +       struct tipc_peer *peer;
> >>>>> +       unsigned int thash = tipc_hashfn(addr);
> >>>>> +
> >>>>> +       hlist_for_each_entry(peer, &mon->peers[thash], hash) {
> >>>>> +               if (peer->addr == addr)
> >>>>> +                       return peer;
> >>>>> +       }
> >>>>> +       return NULL;
> >>>>> +}
> >>>>> +
> >>>>> +static struct tipc_peer *get_self(struct net *net, int bearer_id)
> >>>>> +{
> >>>>> +       struct tipc_monitor *mon = tipc_monitor(net, bearer_id);
> >>>>> +
> >>>>> +       return mon->self;
> >>>>> +}
> >>>>> +
> >>>>> +/* mon_match_domain() : match a peer's domain record against monitor list
> >>>>> + */
> >>>>> +static void mon_match_domain(struct tipc_monitor *mon,
> >>>>> +                            struct tipc_peer *peer)
> >>>>> +{
> >>>>> +       struct tipc_mon_domain *dom = peer->domain;
> >>>>> +       struct tipc_peer *member;
> >>>>> +       u64 prev_map;
> >>>>> +       u32 addr;
> >>>>> +       int up, i;
> >>>>> +
> >>>>> +       if (!dom || !peer->is_up)
> >>>>> +               return;
> >>>>> +
> >>>>> +       /* Scan across domain members and match against monitor list */
> >>>>> +       peer->monitoring = 0;
> >>>>> +       member = peer_nxt(peer);
> >>>>> +       for (i = 0; i < dom->member_cnt; i++) {
> >>>>> +               addr = dom->members[i];
> >>>>> +               if (addr != member->addr)
> >>>>> +                       return;
> >>>>> +               if (addr == tipc_own_addr(mon->net))
> >>>>> +                       return;
> >>>>> +               peer->monitoring++;
> >>>>> +               prev_map = member->head_map;
> >>>>> +
> >>>>> +               /* Set peer's up/down status for this member in its head map */
> >>>>> +               up = map_get(dom->up_map, i);
> >>>>> +               map_set(&member->head_map, i, up);
> >>>>> +
> >>>>> +               /* Start confirmation probing if status went up -> down */
> >>>>> +               if (member->is_up && !up && (member->head_map != prev_map))
> >>>>> +                       member->confirm = true;
> >>>>> +               member = peer_nxt(member);
> >>>>> +       }
> >>>>> +}
> >>>>> +
> >>>>> +/* mon_update_local_domain() : update after peer addition/removal/up/down
> >>>>> + */
> >>>>> +static void mon_update_local_domain(struct tipc_monitor *mon)
> >>>>> +{
> >>>>> +       struct tipc_peer *self = mon->self;
> >>>>> +       struct tipc_mon_domain *cache = &mon->cache;
> >>>>> +       struct tipc_mon_domain *dom = self->domain;
> >>>>> +       struct tipc_peer *peer = self;
> >>>>> +       int member_cnt, i;
> >>>>> +
> >>>>> +       /* Update local domain size based on current size of cluster */
> >>>>> +       member_cnt = dom_size(mon->peer_cnt) - 1;
> >>>>> +       self->monitoring = member_cnt;
> >>>>> +
> >>>>> +       /* Update native and cached outgoing local domain records */
> >>>>> +       dom->len = dom_rec_len(dom, member_cnt);
> >>>>> +       dom->gen = ++mon->dom_gen;
> >>>>> +       dom->member_cnt = member_cnt;
> >>>>> +       for (i = 0; i < member_cnt; i++) {
> >>>>> +               peer = peer_nxt(peer);
> >>>>> +               dom->members[i] = peer->addr;
> >>>>> +               map_set(&dom->up_map, i, peer->is_up);
> >>>>> +               cache->members[i] = htonl(peer->addr);
> >>>>> +       }
> >>>>> +       cache->len = htons(dom->len);
> >>>>> +       cache->gen = htons(dom->gen);
> >>>>> +       cache->member_cnt = htons(member_cnt);
> >>>>> +       cache->up_map = cpu_to_be64(dom->up_map);
> >>>>> +       mon_match_domain(mon, self);
> >>>>> +}
> >>>>> +
> >>>>> +/* mon_update_neighbors() : update neighbors around an added/removed peer
> >>>>> + */
> >>>>> +static void mon_update_neighbors(struct tipc_monitor *mon,
> >>>>> +                                struct tipc_peer *peer)
> >>>>> +{
> >>>>> +       int dz, i;
> >>>>> +
> >>>>> +       dz = dom_size(mon->peer_cnt);
> >>>>> +       for (i = 0; i < dz; i++) {
> >>>>> +               peer->head_map = 0;
> >>>>> +               peer = peer_nxt(peer);
> >>>>> +       }
> >>>>> +       for (i = 0; i < (dz * 2); i++) {
> >>>>> +               mon_match_domain(mon, peer);
> >>>>> +               peer = peer_prev(peer);
> >>>>> +       }
> >>>>> +}
> >>>>> +
> >>>>> +/* mon_assign_roles() : reassign peer roles after a network change
> >>>>> + * The monitor list is consistent at this stage; i.e., each peer is monitoring
> >>>>> + * a set of domain members as matched between domain record and the monitor list
> >>>>> + */
> >>>>> +static void mon_assign_roles(struct tipc_monitor *mon, struct tipc_peer *head)
> >>>>> +{
> >>>>> +       struct tipc_peer *peer = peer_nxt(head);
> >>>>> +       struct tipc_peer *self = mon->self;
> >>>>> +       int i = 0;
> >>>>> +
> >>>>> +       for (; peer != self; peer = peer_nxt(peer)) {
> >>>>> +               peer->is_local = false;
> >>>>> +
> >>>>> +               /* Update domain member */
> >>>>> +               if (i++ < head->monitoring) {
> >>>>> +                       peer->is_head = false;
> >>>>> +                       if (head == self)
> >>>>> +                               peer->is_local = true;
> >>>>> +                       continue;
> >>>>> +               }
> >>>>> +               /* Assign next domain head */
> >>>>> +               if (!peer->is_up)
> >>>>> +                       continue;
> >>>>> +               if (peer->is_head)
> >>>>> +                       break;
> >>>>> +               head = peer;
> >>>>> +               head->is_head = true;
> >>>>> +               i = 0;
> >>>>> +       }
> >>>>> +       mon->list_gen++;
> >>>>> +}
> >>>>> +
> >>>>> +void tipc_mon_remove_peer(struct net *net, u32 addr, int bearer_id)
> >>>>> +{
> >>>>> +       struct tipc_net *tn = tipc_net(net);
> >>>>> +       struct tipc_monitor *mon = tipc_monitor(net, bearer_id);
> >>>>> +       struct tipc_peer *self = get_self(net, bearer_id);
> >>>>> +       struct tipc_peer *peer, *prev, *head;
> >>>>> +
> >>>>> +       write_lock_bh(&mon->lock);
> >>>>> +       peer = get_peer(mon, addr);
> >>>>> +       if (!peer)
> >>>>> +               goto exit;
> >>>>> +       prev = peer_prev(peer);
> >>>>> +       list_del(&peer->list);
> >>>>> +       hlist_del(&peer->hash);
> >>>>> +       kfree(peer->domain);
> >>>>> +       kfree(peer);
> >>>>> +       mon->peer_cnt--;
> >>>>> +       head = peer_head(prev);
> >>>>> +       if (head == self)
> >>>>> +               mon_update_local_domain(mon);
> >>>>> +       mon_update_neighbors(mon, prev);
> >>>>> +
> >>>>> +       /* Revert to full-mesh monitoring if we reach threshold */
> >>>>> +       if (mon->peer_cnt == tn->mon_threshold) {
> >>>>> +               list_for_each_entry(peer, &self->list, list) {
> >>>>> +                       kfree(peer->domain);
> >>>>> +                       peer->domain = NULL;
> >>>>> +                       peer->head_map = 0;
> >>>>> +                       peer->is_head = false;
> >>>>> +               }
> >>>>> +       }
> >>>>> +       mon_assign_roles(mon, head);
> >>>>> +exit:
> >>>>> +       write_unlock_bh(&mon->lock);
> >>>>> +}
> >>>>> +
> >>>>> +static bool tipc_mon_add_peer(struct tipc_monitor *mon, u32 addr,
> >>>>> +                             struct tipc_peer **peer)
> >>>>> +{
> >>>>> +       struct tipc_peer *self = mon->self;
> >>>>> +       struct tipc_peer *cur, *prev, *p;
> >>>>> +
> >>>>> +       p = kzalloc(sizeof(*p), GFP_ATOMIC);
> >>>>> +       *peer = p;
> >>>>> +       if (!p)
> >>>>> +               return false;
> >>>>> +       p->addr = addr;
> >>>>> +
> >>>>> +       /* Add new peer to lookup list */
> >>>>> +       INIT_LIST_HEAD(&p->list);
> >>>>> +       hlist_add_head(&p->hash, &mon->peers[tipc_hashfn(addr)]);
> >>>>> +
> >>>>> +       /* Sort new peer into iterator list, in ascending circular order */
> >>>>> +       prev = self;
> >>>>> +       list_for_each_entry(cur, &self->list, list) {
> >>>>> +               if ((addr > prev->addr) && (addr < cur->addr))
> >>>>> +                       break;
> >>>>> +               if (((addr < cur->addr) || (addr > prev->addr)) &&
> >>>>> +                   (prev->addr > cur->addr))
> >>>>> +                       break;
> >>>>> +               prev = cur;
> >>>>> +       }
> >>>>> +       list_add_tail(&p->list, &cur->list);
> >>>>> +       mon->peer_cnt++;
> >>>>> +       mon_update_neighbors(mon, p);
> >>>>> +       return true;
> >>>>> +}
> >>>>> +
> >>>>> +void tipc_mon_peer_up(struct net *net, u32 addr, int bearer_id)
> >>>>> +{
> >>>>> +       struct tipc_monitor *mon = tipc_monitor(net, bearer_id);
> >>>>> +       struct tipc_peer *self = get_self(net, bearer_id);
> >>>>> +       struct tipc_peer *peer, *head;
> >>>>> +
> >>>>> +       write_lock_bh(&mon->lock);
> >>>>> +       peer = get_peer(mon, addr);
> >>>>> +       if (!peer && !tipc_mon_add_peer(mon, addr, &peer))
> >>>>> +               goto exit;
> >>>>> +       peer->is_up = true;
> >>>>> +       head = peer_head(peer);
> >>>>> +       if (head == self)
> >>>>> +               mon_update_local_domain(mon);
> >>>>> +       mon_assign_roles(mon, head);
> >>>>> +exit:
> >>>>> +       write_unlock_bh(&mon->lock);
> >>>>> +}
> >>>>> +
> >>>>> +void tipc_mon_peer_down(struct net *net, u32 addr, int bearer_id)
> >>>>> +{
> >>>>> +       struct tipc_monitor *mon = tipc_monitor(net, bearer_id);
> >>>>> +       struct tipc_peer *self = get_self(net, bearer_id);
> >>>>> +       struct tipc_peer *peer, *member, *head;
> >>>>> +       int i = 0;
> >>>>> +
> >>>>> +       write_lock_bh(&mon->lock);
> >>>>> +       peer = get_peer(mon, addr);
> >>>>> +       if (!peer) {
> >>>>> +               pr_warn("Mon: unknown link %x/%u DOWN\n", addr, bearer_id);
> >>>>> +               goto exit;
> >>>>> +       }
> >>>>> +       /* Update domain members' head_map field */
> >>>>> +       if (peer->domain) {
> >>>>> +               peer->domain->up_map = 0;
> >>>>> +               mon_match_domain(mon, peer);
> >>>>> +       }
> >>>>> +       /* Suppress member probing if peer was not domain head */
> >>>>> +       member = peer_nxt(peer);
> >>>>> +       while (!peer->is_head && (i++ < peer->monitoring)) {
> >>>>> +               member->confirm = false;
> >>>>> +               member = peer_nxt(member);
> >>>>> +       }
> >>>>> +       peer->is_up = false;
> >>>>> +       peer->is_head = false;
> >>>>> +       peer->is_local = false;
> >>>>> +       peer->confirm = false;
> >>>>> +       peer->monitoring = 0;
> >>>>> +       kfree(peer->domain);
> >>>>> +       peer->domain = NULL;
> >>>>> +       head = peer_head(peer);
> >>>>> +       if (head == self)
> >>>>> +               mon_update_local_domain(mon);
> >>>>> +       mon_assign_roles(mon, head);
> >>>>> +exit:
> >>>>> +       write_unlock_bh(&mon->lock);
> >>>>> +}
> >>>>> +
> >>>>> +/* tipc_mon_rcv - process monitor domain event message
> >>>>> + */
> >>>>> +void tipc_mon_rcv(struct net *net, void *data, u16 dlen, u32 addr,
> >>>>> +                 struct tipc_mon_state *state, int bearer_id)
> >>>>> +{
> >>>>> +       struct tipc_monitor *mon = tipc_monitor(net, bearer_id);
> >>>>> +       struct tipc_mon_domain *ndom = data;
> >>>>> +       u16 nmember_cnt = ntohs(ndom->member_cnt);
> >>>>> +       int ndlen = dom_rec_len(ndom, nmember_cnt);
> >>>>> +       u16 ndgen = ntohs(ndom->gen);
> >>>>> +       struct tipc_mon_domain *dom;
> >>>>> +       struct tipc_peer *peer;
> >>>>> +       int i;
> >>>>> +
> >>>>> +       if (!dlen)
> >>>>> +               return;
> >>>>> +
> >>>>> +       if ((dlen != ntohs(ndom->len)) || (dlen != ndlen)) {
> >>>>> +               pr_warn_ratelimited("Received illegal domain record\n");
> >>>>> +               return;
> >>>>> +       }
> >>>>> +       state->ack_gen = ntohs(ndom->ack_gen);
> >>>>> +
> >>>>> +       /* Ignore if this generation already received */
> >>>>> +       if (!more(ndgen, state->peer_gen) && !state->probed)
> >>>>> +               return;
> >>>>> +       state->probed = 0;
> >>>>> +
> >>>>> +       write_lock_bh(&mon->lock);
> >>>>> +       peer = get_peer(mon, addr);
> >>>>> +       if (!peer)
> >>>>> +               goto exit;
> >>>>> +       if (!more(ndgen, state->peer_gen))
> >>>>> +               goto exit;
> >>>>> +       state->peer_gen = ndgen;
> >>>>> +       if (!peer->is_up)
> >>>>> +               goto exit;
> >>>>> +
> >>>>> +       /* Transform and store received domain record */
> >>>>> +       dom = peer->domain;
> >>>>> +       if (!dom || (dom->len < ndlen)) {
> >>>>> +               kfree(dom);
> >>>>> +               dom = kmalloc(ndlen, GFP_ATOMIC);
> >>>>> +               peer->domain = dom;
> >>>>> +               if (!dom)
> >>>>> +                       goto exit;
> >>>>> +       }
> >>>>> +       dom->len = ndlen;
> >>>>> +       dom->gen = ndgen;
> >>>>> +       dom->member_cnt = nmember_cnt;
> >>>>> +       dom->up_map = be64_to_cpu(ndom->up_map);
> >>>>> +       for (i = 0; i < nmember_cnt; i++)
> >>>>> +               dom->members[i] = ntohl(ndom->members[i]);
> >>>>> +
> >>>>> +       /* Update peers affected by this domain record */
> >>>>> +       mon_match_domain(mon, peer);
> >>>>> +       peer->confirm = 0;
> >>>>> +       mon_assign_roles(mon, peer_head(peer));
> >>>>> +exit:
> >>>>> +       write_unlock_bh(&mon->lock);
> >>>>> +}
> >>>>> +
> >>>>> +void tipc_mon_prep(struct net *net, void *data, int *dlen,
> >>>>> +                  struct tipc_mon_state *state, int bearer_id)
> >>>>> +{
> >>>>> +       struct tipc_net *tn = tipc_net(net);
> >>>>> +       struct tipc_monitor *mon = tipc_monitor(net, bearer_id);
> >>>>> +       struct tipc_mon_domain *dom = data;
> >>>>> +       u16 gen = state->gen;
> >>>>> +
> >>>>> +       if (mon->peer_cnt <= tn->mon_threshold) {
> >>>>> +               *dlen = 0;
> >>>>> +               return;
> >>>>> +       }
> >>>>> +       if (!less(state->ack_gen, gen) || mon->disabled) {
> >>>>> +               *dlen = dom_rec_len(dom, 0);
> >>>>> +               dom->len = htons(dom_rec_len(dom, 0));
> >>>>> +               dom->gen = htons(gen);
> >>>>> +               dom->ack_gen = htons(state->peer_gen);
> >>>>> +               dom->member_cnt = 0;
> >>>>> +               return;
> >>>>> +       }
> >>>>> +       read_lock_bh(&mon->lock);
> >>>>> +       *dlen = ntohs(mon->cache.len);
> >>>>> +       memcpy(data, &mon->cache, *dlen);
> >>>>> +       read_unlock_bh(&mon->lock);
> >>>>> +       dom->ack_gen = htons(state->peer_gen);
> >>>>> +}
> >>>>> +
> >>>>> +bool tipc_mon_is_probed(struct net *net, u32 addr,
> >>>>> +                       struct tipc_mon_state *state,
> >>>>> +                       int bearer_id)
> >>>>> +{
> >>>>> +       struct tipc_monitor *mon = tipc_monitor(net, bearer_id);
> >>>>> +       struct tipc_peer *peer;
> >>>>> +
> >>>>> +       if (mon->disabled)
> >>>>> +               return false;
> >>>>> +
> >>>>> +       if (!state->probed &&
> >>>>> +           !less(state->list_gen, mon->list_gen) &&
> >>>>> +           !less(state->ack_gen, state->gen))
> >>>>> +               return false;
> >>>>> +
> >>>>> +       read_lock_bh(&mon->lock);
> >>>>> +       peer = get_peer(mon, addr);
> >>>>> +       if (peer) {
> >>>>> +               state->probed = less(state->gen, mon->dom_gen);
> >>>>> +               state->probed |= less(state->ack_gen, state->gen);
> >>>>> +               state->probed |= peer->confirm;
> >>>>> +               peer->confirm = 0;
> >>>>> +               state->monitored = peer->is_local;
> >>>>> +               state->monitored |= peer->is_head;
> >>>>> +               state->monitored |= !peer->head_map;
> >>>>> +               state->list_gen = mon->list_gen;
> >>>>> +               state->gen = mon->dom_gen;
> >>>>> +       }
> >>>>> +       read_unlock_bh(&mon->lock);
> >>>>> +       return state->probed || state->monitored;
> >>>>> +}
> >>>>> +
> >>>>> +int tipc_mon_create(struct net *net, int bearer_id)
> >>>>> +{
> >>>>> +       struct tipc_net *tn = tipc_net(net);
> >>>>> +       struct tipc_monitor *mon;
> >>>>> +       struct tipc_peer *self;
> >>>>> +       struct tipc_mon_domain *dom;
> >>>>> +
> >>>>> +       if (tn->monitors[bearer_id])
> >>>>> +               return 0;
> >>>>> +
> >>>>> +       mon = kzalloc(sizeof(*mon), GFP_ATOMIC);
> >>>>> +       self = kzalloc(sizeof(*self), GFP_ATOMIC);
> >>>>> +       dom = kzalloc(sizeof(*dom), GFP_ATOMIC);
> >>>>> +       if (!mon || !self || !dom) {
> >>>>> +               kfree(mon);
> >>>>> +               kfree(self);
> >>>>> +               kfree(dom);
> >>>>> +               return -ENOMEM;
> >>>>> +       }
> >>>>> +       tn->monitors[bearer_id] = mon;
> >>>>> +       rwlock_init(&mon->lock);
> >>>>> +       mon->net = net;
> >>>>> +       mon->peer_cnt = 1;
> >>>>> +       mon->self = self;
> >>>>> +       self->domain = dom;
> >>>>> +       self->addr = tipc_own_addr(net);
> >>>>> +       self->is_up = true;
> >>>>> +       self->is_head = true;
> >>>>> +       INIT_LIST_HEAD(&self->list);
> >>>>> +       return 0;
> >>>>> +}
> >>>>> +
> >>>>> +void tipc_mon_disable(struct net *net, int bearer_id)
> >>>>> +{
> >>>>> +       tipc_monitor(net, bearer_id)->disabled = true;
> >>>>> +}
> >>>>> +
> >>>>> +void tipc_mon_delete(struct net *net, int bearer_id)
> >>>>> +{
> >>>>> +       struct tipc_net *tn = tipc_net(net);
> >>>>> +       struct tipc_monitor *mon = tipc_monitor(net, bearer_id);
> >>>>> +       struct tipc_peer *self = get_self(net, bearer_id);
> >>>>> +       struct tipc_peer *peer, *tmp;
> >>>>> +
> >>>>> +       write_lock_bh(&mon->lock);
> >>>>> +       tn->monitors[bearer_id] = NULL;
> >>>>> +       list_for_each_entry_safe(peer, tmp, &self->list, list) {
> >>>>> +               list_del(&peer->list);
> >>>>> +               hlist_del(&peer->hash);
> >>>>> +               kfree(peer->domain);
> >>>>> +               kfree(peer);
> >>>>> +       }
> >>>>> +       kfree(self->domain);
> >>>>> +       kfree(self);
> >>>>> +       write_unlock_bh(&mon->lock);
> >>>>> +       tn->monitors[bearer_id] = NULL;
> >>>>> +       kfree(mon);
> >>>>> +}
> >>>>> diff --git a/net/tipc/monitor.h b/net/tipc/monitor.h
> >>>>> new file mode 100644
> >>>>> index 0000000..7a25541
> >>>>> --- /dev/null
> >>>>> +++ b/net/tipc/monitor.h
> >>>>> @@ -0,0 +1,72 @@
> >>>>> +/*
> >>>>> + * net/tipc/monitor.h
> >>>>> + *
> >>>>> + * Copyright (c) 2015, Ericsson AB
> >>>>> + * All rights reserved.
> >>>>> + *
> >>>>> + * Redistribution and use in source and binary forms, with or without
> >>>>> + * modification, are permitted provided that the following conditions are met:
> >>>>> + *
> >>>>> + * 1. Redistributions of source code must retain the above copyright
> >>>>> + *    notice, this list of conditions and the following disclaimer.
> >>>>> + * 2. Redistributions in binary form must reproduce the above copyright
> >>>>> + *    notice, this list of conditions and the following disclaimer in the
> >>>>> + *    documentation and/or other materials provided with the distribution.
> >>>>> + * 3. Neither the names of the copyright holders nor the names of its
> >>>>> + *    contributors may be used to endorse or promote products derived from
> >>>>> + *    this software without specific prior written permission.
> >>>>> + *
> >>>>> + * Alternatively, this software may be distributed under the terms of the
> >>>>> + * GNU General Public License ("GPL") version 2 as published by the Free
> >>>>> + * Software Foundation.
> >>>>> + *
> >>>>> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
> >>>>> + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> >>>>> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> >>>>> + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
> >>>>> + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> >>>>> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> >>>>> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
> >>>>> + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
> >>>>> + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> >>>>> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> >>>>> + * POSSIBILITY OF SUCH DAMAGE.
> >>>>> + */
> >>>>> +
> >>>>> +#ifndef _TIPC_MONITOR_H
> >>>>> +#define _TIPC_MONITOR_H
> >>>>> +
> >>>>> +/* struct tipc_mon_state: link instance's cache of monitor list and domain state
> >>>>> + * @list_gen: current generation of this node's monitor list
> >>>>> + * @gen: current generation of this node's local domain
> >>>>> + * @peer_gen: most recent domain generation received from peer
> >>>>> + * @ack_gen: most recent generation of self's domain acked by peer
> >>>>> + * @monitored: peer endpoint should be continuously monitored
> >>>>> + * @probed: peer endpoint should be temporarily probed for potential loss
> >>>>> + */
> >>>>> +struct tipc_mon_state {
> >>>>> +       u16 list_gen;
> >>>>> +       u16 gen;
> >>>>> +       u16 peer_gen;
> >>>>> +       u16 ack_gen;
> >>>>> +       bool monitored;
> >>>>> +       bool probed;
> >>>>> +};
> >>>>> +
> >>>>> +int tipc_mon_create(struct net *net, int bearer_id);
> >>>>> +void tipc_mon_disable(struct net *net, int bearer_id);
> >>>>> +void tipc_mon_delete(struct net *net, int bearer_id);
> >>>>> +
> >>>>> +void tipc_mon_peer_up(struct net *net, u32 addr, int bearer_id);
> >>>>> +void tipc_mon_peer_down(struct net *net, u32 addr, int bearer_id);
> >>>>> +void tipc_mon_prep(struct net *net, void *data, int *dlen,
> >>>>> +                  struct tipc_mon_state *state, int bearer_id);
> >>>>> +void tipc_mon_rcv(struct net *net, void *data, u16 dlen, u32 addr,
> >>>>> +                 struct tipc_mon_state *state, int bearer_id);
> >>>>> +bool tipc_mon_is_probed(struct net *net, u32 addr,
> >>>>> +                       struct tipc_mon_state *state,
> >>>>> +                       int bearer_id);
> >>>>> +void tipc_mon_remove_peer(struct net *net, u32 addr, int bearer_id);
> >>>>> +
> >>>>> +extern const int tipc_max_domain_size;
> >>>>> +#endif
> >>>>> diff --git a/net/tipc/node.c b/net/tipc/node.c
> >>>>> index 68d9f7b..43f2d78 100644
> >>>>> --- a/net/tipc/node.c
> >>>>> +++ b/net/tipc/node.c
> >>>>> @@ -40,6 +40,7 @@
> >>>>>  #include "name_distr.h"
> >>>>>  #include "socket.h"
> >>>>>  #include "bcast.h"
> >>>>> +#include "monitor.h"
> >>>>>  #include "discover.h"
> >>>>>  #include "netlink.h"
> >>>>>
> >>>>> @@ -191,16 +192,6 @@ int tipc_node_get_mtu(struct net *net, u32 addr, u32 sel)
> >>>>>         tipc_node_put(n);
> >>>>>         return mtu;
> >>>>>  }
> >>>>> -/*
> >>>>> - * A trivial power-of-two bitmask technique is used for speed, since this
> >>>>> - * operation is done for every incoming TIPC packet. The number of hash table
> >>>>> - * entries has been chosen so that no hash chain exceeds 8 nodes and will
> >>>>> - * usually be much smaller (typically only a single node).
> >>>>> - */
> >>>>> -static unsigned int tipc_hashfn(u32 addr)
> >>>>> -{
> >>>>> -       return addr & (NODE_HTABLE_SIZE - 1);
> >>>>> -}
> >>>>>
> >>>>>  static void tipc_node_kref_release(struct kref *kref)
> >>>>>  {
> >>>>> @@ -265,6 +256,7 @@ static void tipc_node_write_unlock(struct tipc_node *n)
> >>>>>         u32 addr = 0;
> >>>>>         u32 flags = n->action_flags;
> >>>>>         u32 link_id = 0;
> >>>>> +       u32 bearer_id;
> >>>>>         struct list_head *publ_list;
> >>>>>
> >>>>>         if (likely(!flags)) {
> >>>>> @@ -274,6 +266,7 @@ static void tipc_node_write_unlock(struct tipc_node *n)
> >>>>>         addr = n->addr;
> >>>>>         link_id = n->link_id;
> >>>>> +       bearer_id = link_id & 0xffff;
> >>>>>         publ_list = &n->publ_list;
> >>>>>
> >>>>>         n->action_flags &= ~(TIPC_NOTIFY_NODE_DOWN | TIPC_NOTIFY_NODE_UP |
> >>>>> @@ -287,13 +280,16 @@ static void tipc_node_write_unlock(struct tipc_node *n)
> >>>>>         if (flags & TIPC_NOTIFY_NODE_UP)
> >>>>>                 tipc_named_node_up(net, addr);
> >>>>>
> >>>>> -       if (flags & TIPC_NOTIFY_LINK_UP)
> >>>>> +       if (flags & TIPC_NOTIFY_LINK_UP) {
> >>>>> +               tipc_mon_peer_up(net, addr, bearer_id);
> >>>>>                 tipc_nametbl_publish(net, TIPC_LINK_STATE, addr, addr,
> >>>>>                                      TIPC_NODE_SCOPE, link_id, addr);
> >>>>> -
> >>>>> -       if (flags & TIPC_NOTIFY_LINK_DOWN)
> >>>>> +       }
> >>>>> +       if (flags & TIPC_NOTIFY_LINK_DOWN) {
> >>>>> +               tipc_mon_peer_down(net, addr, bearer_id);
> >>>>>                 tipc_nametbl_withdraw(net, TIPC_LINK_STATE, addr,
> >>>>>                                       link_id, addr);
> >>>>> +       }
> >>>>>  }
> >>>>>
> >>>>>  struct tipc_node *tipc_node_create(struct net *net, u32 addr, u16 capabilities)
> >>>>> @@ -674,6 +670,7 @@ static void tipc_node_link_down(struct tipc_node *n, int bearer_id, bool delete)
> >>>>>         struct tipc_link *l = le->link;
> >>>>>         struct tipc_media_addr *maddr;
> >>>>>         struct sk_buff_head xmitq;
> >>>>> +       int old_bearer_id = bearer_id;
> >>>>>
> >>>>>         if (!l)
> >>>>>                 return;
> >>>>> @@ -693,6 +690,8 @@ static void tipc_node_link_down(struct tipc_node *n, int bearer_id, bool delete)
> >>>>>                 tipc_link_fsm_evt(l, LINK_RESET_EVT);
> >>>>>         }
> >>>>>         tipc_node_write_unlock(n);
> >>>>> +       if (delete)
> >>>>> +               tipc_mon_remove_peer(n->net, n->addr, old_bearer_id);
> >>>>>         tipc_bearer_xmit(n->net, bearer_id, &xmitq, maddr);
> >>>>>         tipc_sk_rcv(n->net, &le->inputq);
> >>>>>  }
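
For reference, the domain sizing rule in dom_size() quoted above can be
checked in isolation. The following is only a userspace sketch of that
rule, not part of the patch. It assumes TIPC_MAX_MON_DOMAIN is 64; the
real definition is not visible in this part of the thread, so treat the
value as hypothetical.

```c
#include <assert.h>

/* Assumed cap; the actual TIPC_MAX_MON_DOMAIN value is not shown above. */
#define TIPC_MAX_MON_DOMAIN 64

/* Userspace mirror of dom_size(): the smallest i with i * i >= peers,
 * i.e. roughly the square root of the cluster size rounded up, capped
 * at TIPC_MAX_MON_DOMAIN.
 */
int dom_size(int peers)
{
	int i = 0;

	assert(peers >= 0);
	while ((i * i) < peers)
		i++;
	return i < TIPC_MAX_MON_DOMAIN ? i : TIPC_MAX_MON_DOMAIN;
}
```

Under that assumption, dom_size(16) gives 4 and dom_size(17) gives 5,
and from 3970 peers upward the result stays at 64. Note that
mon_update_local_domain() subtracts 1 from this value, since the count
passed in includes the local node.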

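
The up_map and head_map fields are u64, so bit helpers over them need
64-bit wide shifts: a plain literal 1 has type int, and shifting an int
by 32 or more is undefined in C. The sketch below shows one way to write
such helpers in userspace. The names map_set64() and map_get64() are
hypothetical and chosen here only to avoid confusion with the functions
in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Set bit i of a 64-bit map to v (only the low bit of v is used).
 * Both the mask and the value are widened to 64 bits before shifting.
 */
void map_set64(uint64_t *up_map, int i, unsigned int v)
{
	assert(i >= 0 && i < 64);
	*up_map &= ~(1ULL << i);
	*up_map |= ((uint64_t)(v & 1) << i);
}

/* Return bit i of a 64-bit map as 0 or 1. */
int map_get64(uint64_t up_map, int i)
{
	assert(i >= 0 && i < 64);
	return (int)((up_map >> i) & 1);
}
```

Shifting the map right and masking with 1 in map_get64() avoids building
a wide mask at all, which is a design choice of this sketch rather than
something taken from the patch.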

