[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg

2019-04-30 Thread Gary Lee via Opensaf-tickets
commit 0878555710e2116e466b0d1b124eb2c21ae85d16
Author: Gary Lee 
Date:   Tue Mar 26 12:59:35 2019 +1100

mbc: prevent infinite peer_up message loop [#3021]

If the active and standby SCs are split into network partitions, it is
possible a RED_UP never arrives even though we have already
received MBC PEER_UP. The service using MBC will then get stuck
in an infinite loop and probably fail health checks.

To cater for 'normal' race conditions between MDS topology and data
messages, allow only up to 255 loops. If this is exceeded, the msg
will be discarded.
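
A minimal sketch of the bounded re-post described above. It is not the code from the commit: the type, field, and function names are hypothetical, and it assumes the retry count is carried on the event itself. Only the limit of 255 comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Limit taken from the commit message; the macro name is hypothetical. */
#define MAX_PEER_UP_REQUEUES 255

/* Hypothetical event: only the field needed for this sketch. */
struct peer_up_evt {
	uint8_t requeue_count; /* times this event has been re-posted */
};

enum evt_action {
	EVT_PROCESS, /* RED_UP has arrived, handle PEER_UP normally */
	EVT_REQUEUE, /* RED_UP still missing, post the event again */
	EVT_DISCARD  /* retry budget exhausted, drop the event */
};

/*
 * Decide what to do with a PEER_UP event. The limit is checked before
 * the counter is incremented, so the uint8_t never wraps around.
 */
enum evt_action handle_peer_up(struct peer_up_evt *evt, bool red_up_seen)
{
	assert(evt != NULL);

	if (red_up_seen)
		return EVT_PROCESS;
	if (evt->requeue_count >= MAX_PEER_UP_REQUEUES)
		return EVT_DISCARD;

	evt->requeue_count++;
	return EVT_REQUEUE;
}
```

With RED_UP never arriving, a caller that keeps re-posting gets `EVT_REQUEUE` 255 times and `EVT_DISCARD` on the next pass, so an ordinary race between topology and data messages still has room to resolve while a partition can no longer spin the service forever.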



---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** review
**Milestone:** 5.19.06
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Tue Mar 26, 2019 02:16 AM UTC
**Owner:** Gary Lee


Sometimes a process (e.g. amfd) utilising MBC will become unresponsive because 
it is stuck in an infinite loop processing a PEER_UP msg. This instance was 
noticed when SC-1 and SC-2 were deliberately split into network partitions. It 
seems SC-1 received PEER_UP from SC-2 but had not received RED_UP.

This bit of code in mbcsv_peer.c looks problematic if it is called from 
mbcsv_hdl_dispatch_all.

~~~
/* Again post the event, till RED_UP event arrives */
if (NCSCC_RC_SUCCESS !=
    m_MBCSV_SND_MSG(&mbx, evt, NCS_IPC_PRIORITY_HIGH)) {
	TRACE_LEAVE2("ipc send failed");
	m_NCS_UNLOCK(&mbcsv_cb.peer_list_lock,
		     NCS_LOCK_WRITE);
	return NCSCC_RC_FAILURE;
}
~~~
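
A toy model of why the re-post can spin. It assumes, as the description suspects, that the dispatch-all path keeps handling events until the mailbox is empty. The `toy_*` names and the `max_iters` cap are hypothetical and exist only so the sketch terminates. None of this is OpenSAF code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the IPC mailbox: just a count of queued PEER_UP events. */
struct toy_mbx {
	size_t pending;
};

/* Without RED_UP the handler re-posts the event it has just consumed. */
void toy_handle_peer_up(struct toy_mbx *mbx, bool red_up_seen)
{
	assert(mbx != NULL);

	if (!red_up_seen)
		mbx->pending++; /* "Again post the event" */
}

/*
 * Drain the mailbox until it is empty. max_iters is a safety cap for
 * the demonstration only. Returns the number of events handled.
 */
size_t toy_dispatch_all(struct toy_mbx *mbx, bool red_up_seen,
			size_t max_iters)
{
	size_t handled = 0;

	assert(mbx != NULL);

	while (mbx->pending > 0 && handled < max_iters) {
		mbx->pending--; /* receive one event */
		toy_handle_peer_up(mbx, red_up_seen);
		handled++;
	}
	return handled;
}
```

Starting with one queued event and `red_up_seen == false`, the loop always hits the cap and leaves one event pending, because every receive is matched by a re-post. With `red_up_seen == true` it handles the single event and returns.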
 
~~~
<143>1 2019-03-15T10:11:27.525305+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188866"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.52531+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188867"] 18495:mbc/mbcsv_peer.c:369 TR peer version: 9
<143>1 2019-03-15T10:11:27.525315+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188868"] 18495:mbc/mbcsv_peer.c:765 >> mbcsv_process_peer_up_info
<143>1 2019-03-15T10:11:27.525321+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188869"] 18495:mbc/mbcsv_peer.c:828 T4 Still RED_UP event not arrived of the peer
<143>1 2019-03-15T10:11:27.525327+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188870"] 18495:mbc/mbcsv_peer.c:407 T1 Peer UP msg, My role: 1, My svc_id: 10, pwe handle:65537, peer role:2, peer_anchor: 565215027999180
<143>1 2019-03-15T10:11:27.525333+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188871"] 18495:mbc/mbcsv_peer.c:414 << mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525338+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188872"] 18495:mbc/mbcsv_pr_evts.c:220 << mbcsv_process_events
<143>1 2019-03-15T10:11:27.525346+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188873"] 18495:mbc/mbcsv_pr_evts.c:66 >> mbcsv_process_events: mbcsv hdl: 4293918753
<143>1 2019-03-15T10:11:27.525351+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188874"] 18495:mbc/mbcsv_pr_evts.c:177 TR peer discovery event
<143>1 2019-03-15T10:11:27.525356+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188875"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525361+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188876"] 18495:mbc/mbcsv_peer.c:369 TR peer version: 9
<143>1 2019-03-15T10:11:27.525366+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188877"] 18495:mbc/mbcsv_peer.c:765 >> mbcsv_process_peer_up_info
<143>1 2019-03-15T10:11:27.525372+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188878"] 18495:mbc/mbcsv_peer.c:828 T4 Still RED_UP event not arrived of the peer
<143>1 2019-03-15T10:11:27.525378+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188879"] 18495:mbc/mbcsv_peer.c:407 T1 Peer UP msg, My role: 1, My svc_id: 10, pwe handle:65537, peer role:2, peer_anchor: 565215027999180
<143>1 2019-03-15T10:11:27.525384+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1710"] 18495:mbc/mbcsv_peer.c:414 << mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525389+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1711"] 18495:mbc/mbcsv_pr_evts.c:220 << mbcsv_process_events
<143>1 2019-03-15T10:11:27.525398+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1712"] 18495:mbc/mbcsv_pr_evts.c:66 >> mbcsv_process_events: mbcsv hdl: 4293918753
<143>1 2019-03-15T10:11:27.525403+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1713"] 18495:mbc/mbcsv_pr_evts.c:177 TR peer discovery event
<143>1 2019-03-15T10:11:27.525408+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1714"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2

[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg

2019-04-30 Thread Gary Lee via Opensaf-tickets
- **status**: review --> fixed



---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** fixed
**Milestone:** 5.19.06
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Wed May 01, 2019 06:14 AM UTC
**Owner:** Gary Lee



[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg

2019-03-25 Thread Gary Lee via Opensaf-tickets
- **status**: accepted --> review
- **Component**: unknown --> mbc



---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** review
**Milestone:** 5.19.06
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Tue Mar 26, 2019 01:56 AM UTC
**Owner:** Gary Lee



[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg

2019-03-24 Thread Gary Lee via Opensaf-tickets
- **status**: unassigned --> accepted
- **assigned_to**: Gary Lee



---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** accepted
**Milestone:** 5.19.03
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Tue Mar 19, 2019 08:02 AM UTC
**Owner:** Gary Lee



[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg

2019-03-19 Thread Gary Lee via Opensaf-tickets
- Description has changed:

Diff:



--- old
+++ new
@@ -1,6 +1,6 @@
 Sometimes, a process (eg amfd) utilising MBC will become unresponsive because 
it is stuck in an infinite loop processing PEER_UP msg. This instance was 
noticed when SC-1 and SC-2 were deliberately split into network partitions. It 
seems SC-1 receives PEER_UP from SC-2, and hadn't received RED_UP.
 
-This bit of code nm mbcsv_peer.c looks problematic, if it's called from 
mbcsv_hdl_dispatch_all.
+This bit of code in mbcsv_peer.c looks problematic, if it's called from 
mbcsv_hdl_dispatch_all.
 
 ~~~
/* Again post the event, till RED_UP event arrives */






---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** unassigned
**Milestone:** 5.19.03
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Tue Mar 19, 2019 08:01 AM UTC
**Owner:** nobody



[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg

2019-03-19 Thread Gary Lee via Opensaf-tickets
- Description has changed:

Diff:



--- old
+++ new
@@ -1,6 +1,6 @@
 Sometimes, a process (eg amfd) utilising MBC will become unresponsive because 
it is stuck in an infinite loop processing PEER_UP msg. This instance was 
noticed when SC-1 and SC-2 were deliberately split into network partitions. It 
seems SC-1 receives PEER_UP from SC-2, and hadn't received RED_UP.
 
-This bit of code im mbcsv_peer.c looks problematic, if it's called from 
mbcsv_hdl_dispatch_all.
+This bit of code nm mbcsv_peer.c looks problematic, if it's called from 
mbcsv_hdl_dispatch_all.
 
 ~~~
/* Again post the event, till RED_UP event arrives */






---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** unassigned
**Milestone:** 5.19.03
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Tue Mar 19, 2019 08:00 AM UTC
**Owner:** nobody


Sometimes, a process (eg amfd) utilising MBC will become unresponsive because 
it is stuck in an infinite loop processing PEER_UP msg. This instance was 
noticed when SC-1 and SC-2 were deliberately split into network partitions. It 
seems SC-1 receives PEER_UP from SC-2, and hadn't received RED_UP.

This bit of code nm mbcsv_peer.c looks problematic, if it's called from 
mbcsv_hdl_dispatch_all.

~~~
/* Again post the event, till RED_UP event arrives */
if (NCSCC_RC_SUCCESS !=
    m_MBCSV_SND_MSG(&mbx, evt, NCS_IPC_PRIORITY_HIGH)) {
	TRACE_LEAVE2("ipc send failed");
	m_NCS_UNLOCK(&mbcsv_cb.peer_list_lock,
		     NCS_LOCK_WRITE);
	return NCSCC_RC_FAILURE;
}
~~~
 
~~~
<143>1 2019-03-15T10:11:27.525305+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188866"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.52531+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188867"] 18495:mbc/mbcsv_peer.c:369 TR peer version: 9
<143>1 2019-03-15T10:11:27.525315+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188868"] 18495:mbc/mbcsv_peer.c:765 >> mbcsv_process_peer_up_info
<143>1 2019-03-15T10:11:27.525321+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188869"] 18495:mbc/mbcsv_peer.c:828 T4 Still RED_UP event not arrived of the peer
<143>1 2019-03-15T10:11:27.525327+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188870"] 18495:mbc/mbcsv_peer.c:407 T1 Peer UP msg, My role: 1, My svc_id: 10, pwe handle:65537, peer role:2, peer_anchor: 565215027999180
<143>1 2019-03-15T10:11:27.525333+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188871"] 18495:mbc/mbcsv_peer.c:414 << mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525338+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188872"] 18495:mbc/mbcsv_pr_evts.c:220 << mbcsv_process_events
<143>1 2019-03-15T10:11:27.525346+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188873"] 18495:mbc/mbcsv_pr_evts.c:66 >> mbcsv_process_events: mbcsv hdl: 4293918753
<143>1 2019-03-15T10:11:27.525351+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188874"] 18495:mbc/mbcsv_pr_evts.c:177 TR peer discovery event
<143>1 2019-03-15T10:11:27.525356+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188875"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525361+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188876"] 18495:mbc/mbcsv_peer.c:369 TR peer version: 9
<143>1 2019-03-15T10:11:27.525366+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188877"] 18495:mbc/mbcsv_peer.c:765 >> mbcsv_process_peer_up_info
<143>1 2019-03-15T10:11:27.525372+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188878"] 18495:mbc/mbcsv_peer.c:828 T4 Still RED_UP event not arrived of the peer
<143>1 2019-03-15T10:11:27.525378+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188879"] 18495:mbc/mbcsv_peer.c:407 T1 Peer UP msg, My role: 1, My svc_id: 10, pwe handle:65537, peer role:2, peer_anchor: 565215027999180
<143>1 2019-03-15T10:11:27.525384+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1710"] 18495:mbc/mbcsv_peer.c:414 << mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525389+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1711"] 18495:mbc/mbcsv_pr_evts.c:220 << mbcsv_process_events
<143>1 2019-03-15T10:11:27.525398+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1712"] 18495:mbc/mbcsv_pr_evts.c:66 >> mbcsv_process_events: mbcsv hdl: 4293918753
<143>1 2019-03-15T10:11:27.525403+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1713"] 18495:mbc/mbcsv_pr_evts.c:177 TR peer discovery event
<143>1 2019-03-15T10:11:27.525408+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1714"] 18495:mbc/

[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg

2019-03-19 Thread Gary Lee via Opensaf-tickets



---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** unassigned
**Milestone:** 5.19.03
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Tue Mar 19, 2019 08:00 AM UTC
**Owner:** nobody


Sometimes, a process (eg amfd) utilising MBC will become unresponsive because 
it is stuck in an infinite loop processing PEER_UP msg. This instance was 
noticed when SC-1 and SC-2 were deliberately split into network partitions. It 
seems SC-1 receives PEER_UP from SC-2, and hadn't received RED_UP.

This bit of code im mbcsv_peer.c looks problematic, if it's called from 
mbcsv_hdl_dispatch_all.

~~~
/* Again post the event, till RED_UP event arrives */
if (NCSCC_RC_SUCCESS !=
    m_MBCSV_SND_MSG(&mbx, evt, NCS_IPC_PRIORITY_HIGH)) {
	TRACE_LEAVE2("ipc send failed");
	m_NCS_UNLOCK(&mbcsv_cb.peer_list_lock,
		     NCS_LOCK_WRITE);
	return NCSCC_RC_FAILURE;
}
~~~
 
~~~
<143>1 2019-03-15T10:11:27.525305+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188866"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.52531+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188867"] 18495:mbc/mbcsv_peer.c:369 TR peer version: 9
<143>1 2019-03-15T10:11:27.525315+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188868"] 18495:mbc/mbcsv_peer.c:765 >> mbcsv_process_peer_up_info
<143>1 2019-03-15T10:11:27.525321+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188869"] 18495:mbc/mbcsv_peer.c:828 T4 Still RED_UP event not arrived of the peer
<143>1 2019-03-15T10:11:27.525327+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188870"] 18495:mbc/mbcsv_peer.c:407 T1 Peer UP msg, My role: 1, My svc_id: 10, pwe handle:65537, peer role:2, peer_anchor: 565215027999180
<143>1 2019-03-15T10:11:27.525333+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188871"] 18495:mbc/mbcsv_peer.c:414 << mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525338+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188872"] 18495:mbc/mbcsv_pr_evts.c:220 << mbcsv_process_events
<143>1 2019-03-15T10:11:27.525346+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188873"] 18495:mbc/mbcsv_pr_evts.c:66 >> mbcsv_process_events: mbcsv hdl: 4293918753
<143>1 2019-03-15T10:11:27.525351+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188874"] 18495:mbc/mbcsv_pr_evts.c:177 TR peer discovery event
<143>1 2019-03-15T10:11:27.525356+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188875"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525361+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188876"] 18495:mbc/mbcsv_peer.c:369 TR peer version: 9
<143>1 2019-03-15T10:11:27.525366+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188877"] 18495:mbc/mbcsv_peer.c:765 >> mbcsv_process_peer_up_info
<143>1 2019-03-15T10:11:27.525372+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188878"] 18495:mbc/mbcsv_peer.c:828 T4 Still RED_UP event not arrived of the peer
<143>1 2019-03-15T10:11:27.525378+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188879"] 18495:mbc/mbcsv_peer.c:407 T1 Peer UP msg, My role: 1, My svc_id: 10, pwe handle:65537, peer role:2, peer_anchor: 565215027999180
<143>1 2019-03-15T10:11:27.525384+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1710"] 18495:mbc/mbcsv_peer.c:414 << mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525389+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1711"] 18495:mbc/mbcsv_pr_evts.c:220 << mbcsv_process_events
<143>1 2019-03-15T10:11:27.525398+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1712"] 18495:mbc/mbcsv_pr_evts.c:66 >> mbcsv_process_events: mbcsv hdl: 4293918753
<143>1 2019-03-15T10:11:27.525403+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1713"] 18495:mbc/mbcsv_pr_evts.c:177 TR peer discovery event
<143>1 2019-03-15T10:11:27.525408+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1714"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525413+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1715"] 18495:mbc/mbcsv_peer.c:369 TR peer version: 9
<143>1 2019-03-15T10:11:27.525418+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1716"] 18495:mbc/mbcsv_peer.c:765 >> mbcsv_process_peer_up_info
<143>1 2019-03-15T10:11:27.525423+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1717"] 18495:mbc/mbcsv_peer.c:828 T4 Still RED_UP event not arrived of the peer
<143>1 2019-03-15T10:11:27.52543+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1718"] 18495:mbc/mbcsv_peer.c:407