[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg
commit 0878555710e2116e466b0d1b124eb2c21ae85d16
Author: Gary Lee
Date: Tue Mar 26 12:59:35 2019 +1100

mbc: prevent infinite peer_up message loop [#3021]

If the active and standby SCs are split into network partitions, it is possible a RED_UP never arrives even though we have already received MBC PEER_UP. The service using MBC will then get stuck in an infinite loop and probably fail health checks.

To cater for 'normal' race conditions between MDS topology and data messages, allow only up to 255 loops. If this is exceeded, the msg will be discarded.

---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** review
**Milestone:** 5.19.06
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Tue Mar 26, 2019 02:16 AM UTC
**Owner:** Gary Lee

Sometimes, a process (eg amfd) utilising MBC will become unresponsive because it is stuck in an infinite loop processing PEER_UP msg. This instance was noticed when SC-1 and SC-2 were deliberately split into network partitions. It seems SC-1 receives PEER_UP from SC-2, and hadn't received RED_UP.

This bit of code in mbcsv_peer.c looks problematic, if it's called from mbcsv_hdl_dispatch_all.
~~~
/* Again post the event, till RED_UP event arrives */
if (NCSCC_RC_SUCCESS !=
    m_MBCSV_SND_MSG(&mbx, evt, NCS_IPC_PRIORITY_HIGH)) {
  TRACE_LEAVE2("ipc send failed");
  m_NCS_UNLOCK(&mbcsv_cb.peer_list_lock, NCS_LOCK_WRITE);
  return NCSCC_RC_FAILURE;
}
~~~

~~~
<143>1 2019-03-15T10:11:27.525305+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188866"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.52531+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188867"] 18495:mbc/mbcsv_peer.c:369 TR peer version: 9
<143>1 2019-03-15T10:11:27.525315+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188868"] 18495:mbc/mbcsv_peer.c:765 >> mbcsv_process_peer_up_info
<143>1 2019-03-15T10:11:27.525321+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188869"] 18495:mbc/mbcsv_peer.c:828 T4 Still RED_UP event not arrived of the peer
<143>1 2019-03-15T10:11:27.525327+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188870"] 18495:mbc/mbcsv_peer.c:407 T1 Peer UP msg, My role: 1, My svc_id: 10, pwe handle:65537, peer role:2, peer_anchor: 565215027999180
<143>1 2019-03-15T10:11:27.525333+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188871"] 18495:mbc/mbcsv_peer.c:414 << mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525338+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188872"] 18495:mbc/mbcsv_pr_evts.c:220 << mbcsv_process_events
<143>1 2019-03-15T10:11:27.525346+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188873"] 18495:mbc/mbcsv_pr_evts.c:66 >> mbcsv_process_events: mbcsv hdl: 4293918753
<143>1 2019-03-15T10:11:27.525351+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188874"] 18495:mbc/mbcsv_pr_evts.c:177 TR peer discovery event
<143>1 2019-03-15T10:11:27.525356+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188875"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525361+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188876"] 18495:mbc/mbcsv_peer.c:369 TR peer version: 9
<143>1 2019-03-15T10:11:27.525366+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188877"] 18495:mbc/mbcsv_peer.c:765 >> mbcsv_process_peer_up_info
<143>1 2019-03-15T10:11:27.525372+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188878"] 18495:mbc/mbcsv_peer.c:828 T4 Still RED_UP event not arrived of the peer
<143>1 2019-03-15T10:11:27.525378+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188879"] 18495:mbc/mbcsv_peer.c:407 T1 Peer UP msg, My role: 1, My svc_id: 10, pwe handle:65537, peer role:2, peer_anchor: 565215027999180
<143>1 2019-03-15T10:11:27.525384+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1710"] 18495:mbc/mbcsv_peer.c:414 << mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525389+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1711"] 18495:mbc/mbcsv_pr_evts.c:220 << mbcsv_process_events
<143>1 2019-03-15T10:11:27.525398+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1712"] 18495:mbc/mbcsv_pr_evts.c:66 >> mbcsv_process_events: mbcsv hdl: 4293918753
<143>1 2019-03-15T10:11:27.525403+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1713"] 18495:mbc/mbcsv_pr_evts.c:177 TR peer discovery event
<143>1 2019-03-15T10:11:27.525408+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1714"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2
~~~
[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg
- **status**: review --> fixed

---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** fixed
**Milestone:** 5.19.06
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Wed May 01, 2019 06:14 AM UTC
**Owner:** Gary Lee
[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg
- **status**: accepted --> review
- **Component**: unknown --> mbc

---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** review
**Milestone:** 5.19.06
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Tue Mar 26, 2019 01:56 AM UTC
**Owner:** Gary Lee
[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg
- **status**: unassigned --> accepted
- **assigned_to**: Gary Lee

---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** accepted
**Milestone:** 5.19.03
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Tue Mar 19, 2019 08:02 AM UTC
**Owner:** Gary Lee
[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg
- Description has changed:

Diff:

    --- old
    +++ new
    @@ -1,6 +1,6 @@
     Sometimes, a process (eg amfd) utilising MBC will become unresponsive because it is stuck in an infinite loop processing PEER_UP msg. This instance was noticed when SC-1 and SC-2 were deliberately split into network partitions. It seems SC-1 receives PEER_UP from SC-2, and hadn't received RED_UP.
    -This bit of code nm mbcsv_peer.c looks problematic, if it's called from mbcsv_hdl_dispatch_all.
    +This bit of code in mbcsv_peer.c looks problematic, if it's called from mbcsv_hdl_dispatch_all.
     ~~~
     /* Again post the event, till RED_UP event arrives */

---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** unassigned
**Milestone:** 5.19.03
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Tue Mar 19, 2019 08:01 AM UTC
**Owner:** nobody
[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg
- Description has changed:

Diff:

    --- old
    +++ new
    @@ -1,6 +1,6 @@
     Sometimes, a process (eg amfd) utilising MBC will become unresponsive because it is stuck in an infinite loop processing PEER_UP msg. This instance was noticed when SC-1 and SC-2 were deliberately split into network partitions. It seems SC-1 receives PEER_UP from SC-2, and hadn't received RED_UP.
    -This bit of code im mbcsv_peer.c looks problematic, if it's called from mbcsv_hdl_dispatch_all.
    +This bit of code nm mbcsv_peer.c looks problematic, if it's called from mbcsv_hdl_dispatch_all.
     ~~~
     /* Again post the event, till RED_UP event arrives */

---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** unassigned
**Milestone:** 5.19.03
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Tue Mar 19, 2019 08:00 AM UTC
**Owner:** nobody
[tickets] [opensaf:tickets] #3021 mbc: infinite loop when processing peer_up msg
---

** [tickets:#3021] mbc: infinite loop when processing peer_up msg**

**Status:** unassigned
**Milestone:** 5.19.03
**Created:** Tue Mar 19, 2019 08:00 AM UTC by Gary Lee
**Last Updated:** Tue Mar 19, 2019 08:00 AM UTC
**Owner:** nobody

Sometimes, a process (eg amfd) utilising MBC will become unresponsive because it is stuck in an infinite loop processing PEER_UP msg. This instance was noticed when SC-1 and SC-2 were deliberately split into network partitions. It seems SC-1 receives PEER_UP from SC-2, and hadn't received RED_UP.

This bit of code im mbcsv_peer.c looks problematic, if it's called from mbcsv_hdl_dispatch_all.

~~~
/* Again post the event, till RED_UP event arrives */
if (NCSCC_RC_SUCCESS !=
    m_MBCSV_SND_MSG(&mbx, evt, NCS_IPC_PRIORITY_HIGH)) {
  TRACE_LEAVE2("ipc send failed");
  m_NCS_UNLOCK(&mbcsv_cb.peer_list_lock, NCS_LOCK_WRITE);
  return NCSCC_RC_FAILURE;
}
~~~

~~~
<143>1 2019-03-15T10:11:27.525305+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188866"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.52531+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188867"] 18495:mbc/mbcsv_peer.c:369 TR peer version: 9
<143>1 2019-03-15T10:11:27.525315+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188868"] 18495:mbc/mbcsv_peer.c:765 >> mbcsv_process_peer_up_info
<143>1 2019-03-15T10:11:27.525321+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188869"] 18495:mbc/mbcsv_peer.c:828 T4 Still RED_UP event not arrived of the peer
<143>1 2019-03-15T10:11:27.525327+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188870"] 18495:mbc/mbcsv_peer.c:407 T1 Peer UP msg, My role: 1, My svc_id: 10, pwe handle:65537, peer role:2, peer_anchor: 565215027999180
<143>1 2019-03-15T10:11:27.525333+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188871"] 18495:mbc/mbcsv_peer.c:414 << mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525338+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188872"] 18495:mbc/mbcsv_pr_evts.c:220 << mbcsv_process_events
<143>1 2019-03-15T10:11:27.525346+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188873"] 18495:mbc/mbcsv_pr_evts.c:66 >> mbcsv_process_events: mbcsv hdl: 4293918753
<143>1 2019-03-15T10:11:27.525351+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188874"] 18495:mbc/mbcsv_pr_evts.c:177 TR peer discovery event
<143>1 2019-03-15T10:11:27.525356+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188875"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525361+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188876"] 18495:mbc/mbcsv_peer.c:369 TR peer version: 9
<143>1 2019-03-15T10:11:27.525366+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188877"] 18495:mbc/mbcsv_peer.c:765 >> mbcsv_process_peer_up_info
<143>1 2019-03-15T10:11:27.525372+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188878"] 18495:mbc/mbcsv_peer.c:828 T4 Still RED_UP event not arrived of the peer
<143>1 2019-03-15T10:11:27.525378+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="17188879"] 18495:mbc/mbcsv_peer.c:407 T1 Peer UP msg, My role: 1, My svc_id: 10, pwe handle:65537, peer role:2, peer_anchor: 565215027999180
<143>1 2019-03-15T10:11:27.525384+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1710"] 18495:mbc/mbcsv_peer.c:414 << mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525389+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1711"] 18495:mbc/mbcsv_pr_evts.c:220 << mbcsv_process_events
<143>1 2019-03-15T10:11:27.525398+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1712"] 18495:mbc/mbcsv_pr_evts.c:66 >> mbcsv_process_events: mbcsv hdl: 4293918753
<143>1 2019-03-15T10:11:27.525403+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1713"] 18495:mbc/mbcsv_pr_evts.c:177 TR peer discovery event
<143>1 2019-03-15T10:11:27.525408+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1714"] 18495:mbc/mbcsv_peer.c:360 >> mbcsv_process_peer_discovery_message
<143>1 2019-03-15T10:11:27.525413+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1715"] 18495:mbc/mbcsv_peer.c:369 TR peer version: 9
<143>1 2019-03-15T10:11:27.525418+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1716"] 18495:mbc/mbcsv_peer.c:765 >> mbcsv_process_peer_up_info
<143>1 2019-03-15T10:11:27.525423+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1717"] 18495:mbc/mbcsv_peer.c:828 T4 Still RED_UP event not arrived of the peer
<143>1 2019-03-15T10:11:27.52543+01:00 SC-2-1 osafamfd 18495 osafamfd [meta sequenceId="1718"] 18495:mbc/mbcsv_peer.c:407
~~~