---

**[tickets:#2407] amfnd: message ID mismatches during SC absence recovery**

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Fri Mar 31, 2017 08:55 AM UTC by Gary Lee
**Last Updated:** Fri Mar 31, 2017 08:55 AM UTC
**Owner:** Gary Lee


In a test case where the active SC is repeatedly powered off abruptly, 
osafamfnd on a payload node (PL-5 here) sometimes reboots the node with a 
message ID mismatch:

2017-03-27 21:42:38 PL-5 osafamfnd[422]: Rebooting OpenSAF NodeId = 0 EE Name = 
No EE Mapped, Reason: Message ID mismatch, rec 1, expected 2, OwnNodeId = 
132367, SupervisionTime = 60
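
The "rec 1, expected 2" wording suggests a receive-side sequence check. Below is a minimal Python sketch of such a check, not the OpenSAF implementation: the names `SeqChecker` and `MessageIdMismatch`, and the rule that each message must carry the last accepted ID plus one, are assumptions made purely for illustration.

```python
class MessageIdMismatch(Exception):
    """Raised when a received message ID is not the one expected next."""


class SeqChecker:
    """Hypothetical receiver-side counter; not taken from OpenSAF sources."""

    def __init__(self, last_accepted: int = 0) -> None:
        self.last_accepted = last_accepted

    def accept(self, received: int) -> None:
        # Assumed rule: the next message must carry last_accepted + 1.
        expected = self.last_accepted + 1
        if received != expected:
            raise MessageIdMismatch(f"rec {received}, expected {expected}")
        self.last_accepted = received


# A receiver that has already accepted ID 1 ...
checker = SeqChecker(last_accepted=1)
try:
    # ... and then sees ID 1 again, e.g. from a peer whose counter restarted.
    checker.accept(1)
    mismatch_text = ""
except MessageIdMismatch as err:
    mismatch_text = str(err)
```

Under these assumptions, a peer that restarts its counter while the receiver keeps its own would be one way to produce the text seen in the reboot reason; whether that is what happens here is for the ticket to determine.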

2017-03-27 21:42:36 SC-1 osafamfd[510]: Started
2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up from 2030f: msg_id 2
2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up from 2030f: msg_id 2
2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up_msg from all nodes
2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up from 2040f: msg_id 2
2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up from 2030f: msg_id 2
2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up from 2050f: msg_id 2

```
2017-03-27 21:39:50 PL-5 osafamfnd[422]: Started
2017-03-27 21:39:50 PL-5 osafamfnd[422]: WA saClmInitialize_4 returned 31
2017-03-27 21:39:50 PL-5 osafamfnd[422]: NO Sending node up due to NCSMDS_UP
2017-03-27 21:39:51 PL-5 osafamfnd[422]: NO 
'safSu=PL-5,safSg=NoRed,safApp=OpenSAF' Presence State UNINSTANTIATED => 
INSTANTIATING
2017-03-27 21:39:51 PL-5 osafamfnd[422]: NO 
'safSu=PL-5,safSg=NoRed,safApp=OpenSAF' Presence State INSTANTIATING => 
INSTANTIATED
2017-03-27 21:39:51 PL-5 osafamfnd[422]: NO Assigning 
'safSi=NoRed2,safApp=OpenSAF' ACTIVE to 'safSu=PL-5,safSg=NoRed,safApp=OpenSAF'
2017-03-27 21:39:51 PL-5 osafamfnd[422]: NO Assigned 
'safSi=NoRed2,safApp=OpenSAF' ACTIVE to 'safSu=PL-5,safSg=NoRed,safApp=OpenSAF'
2017-03-27 21:40:00 PL-5 osafamfnd[422]: NO AVD NEW_ACTIVE, adest:1
2017-03-27 21:40:15 PL-5 osafamfnd[422]: message repeated 2 times: [ NO AVD 
NEW_ACTIVE, adest:1]
2017-03-27 21:40:19 PL-5 osafamfnd[422]: WA AMF director unexpectedly crashed
2017-03-27 21:40:19 PL-5 osafamfnd[422]: NO Checking 
'safSu=PL-5,safSg=NoRed,safApp=OpenSAF' for pending messages
2017-03-27 21:40:35 PL-5 osafamfnd[422]: NO AVD NEW_ACTIVE, adest:1
2017-03-27 21:40:35 PL-5 osafamfnd[422]: NO saClmDispatch BAD_HANDLE
2017-03-27 21:40:35 PL-5 osafamfnd[422]: NO 1 SISU states sent
2017-03-27 21:40:35 PL-5 osafamfnd[422]: NO 1 SU states sent
2017-03-27 21:40:35 PL-5 osafamfnd[422]: NO 5 CSICOMP states sent
2017-03-27 21:40:35 PL-5 osafamfnd[422]: NO 5 COMP states sent
2017-03-27 21:40:35 PL-5 osafamfnd[422]: NO Sending node up due to 
NCSMDS_NEW_ACTIVE
2017-03-27 21:40:40 PL-5 osafamfnd[422]: NO AVD NEW_ACTIVE, adest:1
2017-03-27 21:41:11 PL-5 osafamfnd[422]: message repeated 3 times: [ NO AVD 
NEW_ACTIVE, adest:1]
2017-03-27 21:41:18 PL-5 osafamfnd[422]: WA AMF director unexpectedly crashed
2017-03-27 21:41:18 PL-5 osafamfnd[422]: NO Checking 
'safSu=PL-5,safSg=NoRed,safApp=OpenSAF' for pending messages
2017-03-27 21:41:35 PL-5 osafamfnd[422]: NO AVD NEW_ACTIVE, adest:1
2017-03-27 21:41:35 PL-5 osafamfnd[422]: NO saClmDispatch BAD_HANDLE
2017-03-27 21:41:35 PL-5 osafamfnd[422]: NO 1 SISU states sent
2017-03-27 21:41:35 PL-5 osafamfnd[422]: NO 1 SU states sent
2017-03-27 21:41:35 PL-5 osafamfnd[422]: NO 5 CSICOMP states sent
2017-03-27 21:41:35 PL-5 osafamfnd[422]: NO 5 COMP states sent
2017-03-27 21:41:35 PL-5 osafamfnd[422]: NO Sending node up due to 
NCSMDS_NEW_ACTIVE
2017-03-27 21:41:43 PL-5 osafamfnd[422]: WA AMF director unexpectedly crashed
2017-03-27 21:41:43 PL-5 osafamfnd[422]: NO Checking 
'safSu=PL-5,safSg=NoRed,safApp=OpenSAF' for pending messages
2017-03-27 21:42:02 PL-5 osafamfnd[422]: NO AVD NEW_ACTIVE, adest:1
2017-03-27 21:42:02 PL-5 osafamfnd[422]: NO saClmDispatch BAD_HANDLE
2017-03-27 21:42:02 PL-5 osafamfnd[422]: NO 1 SISU states sent
2017-03-27 21:42:02 PL-5 osafamfnd[422]: NO 1 SU states sent
2017-03-27 21:42:02 PL-5 osafamfnd[422]: NO 5 CSICOMP states sent
2017-03-27 21:42:02 PL-5 osafamfnd[422]: NO 5 COMP states sent
2017-03-27 21:42:02 PL-5 osafamfnd[422]: NO Sending node up due to 
NCSMDS_NEW_ACTIVE
2017-03-27 21:42:12 PL-5 osafamfnd[422]: NO AVD NEW_ACTIVE, adest:1
2017-03-27 21:42:12 PL-5 osafamfnd[422]: NO saClmDispatch BAD_HANDLE
2017-03-27 21:42:36 PL-5 osafamfnd[422]: NO AVD NEW_ACTIVE, adest:1
2017-03-27 21:42:36 PL-5 osafamfnd[422]: NO saClmDispatch BAD_HANDLE
2017-03-27 21:42:38 PL-5 osafamfnd[422]: Rebooting OpenSAF NodeId = 0 EE Name = 
No EE Mapped, Reason: Message ID mismatch, rec 1, expected 2, OwnNodeId = 
132367, SupervisionTime = 60




2017-03-27 21:40:53 SC-2 osafamfd[477]: Started
2017-03-27 21:40:53 SC-2 osafamfnd[485]: NO Start monitoring AMFD using 
/var/lib/opensaf/osafamfd.fifo
2017-03-27 21:40:56 SC-2 osafamfd[477]: NO Cold sync complete!
2017-03-27 21:41:00 SC-2 osafamfd[477]: NO FAILOVER StandBy --> Active
2017-03-27 21:41:00 SC-2 osafamfd[477]: NO Node 'SC-1' left the cluster
2017-03-27 21:41:00 SC-2 osafamfd[477]: NO FAILOVER StandBy --> Active DONE!
2017-03-27 21:41:04 SC-2 osafamfd[477]: NO Received node_up from 2010f: msg_id 1
2017-03-27 21:41:04 SC-2 osafamfd[477]: NO Node 'SC-1' joined the cluster
2017-03-27 21:41:10 SC-2 osafamfd[477]: exiting for shutdown
2017-03-27 21:41:14 SC-2 osafamfd[476]: Started
2017-03-27 21:41:14 SC-2 osafamfnd[484]: NO Start monitoring AMFD using 
/var/lib/opensaf/osafamfd.fifo
2017-03-27 21:41:18 SC-2 osafamfd[476]: NO FAILOVER StandBy --> Active
2017-03-27 21:41:18 SC-2 osafamfd[476]: ER FAILOVER StandBy --> Active FAILED, 
Standby OUT OF SYNC
2017-03-27 21:41:18 SC-2 osafamfd[476]: Rebooting OpenSAF NodeId = 0 EE Name = 
No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 131599, SupervisionTime = 60
2017-03-27 21:41:39 SC-2 osafamfd[475]: mkfifo already exists: 
/var/lib/opensaf/osafamfd.fifo File exists
2017-03-27 21:41:39 SC-2 osafamfd[475]: Started
2017-03-27 21:41:40 SC-2 osafamfnd[484]: NO Start monitoring AMFD using 
/var/lib/opensaf/osafamfd.fifo
2017-03-27 21:41:42 SC-2 osafamfd[475]: NO FAILOVER StandBy --> Active
2017-03-27 21:41:42 SC-2 osafamfd[475]: ER FAILOVER StandBy --> Active FAILED, 
Standby OUT OF SYNC
2017-03-27 21:41:42 SC-2 osafamfd[475]: Rebooting OpenSAF NodeId = 0 EE Name = 
No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 131599, SupervisionTime = 60
2017-03-27 21:41:53 SC-2 osafamfd[478]: mkfifo already exists: 
/var/lib/opensaf/osafamfd.fifo File exists
2017-03-27 21:41:53 SC-2 osafamfd[478]: Started
2017-03-27 21:42:02 SC-2 osafamfnd[486]: NO Start monitoring AMFD using 
/var/lib/opensaf/osafamfd.fifo
2017-03-27 21:42:05 SC-2 osafamfd[478]: NO Re-initializing with IMM
2017-03-27 21:42:05 SC-2 osafamfd[478]: exiting for shutdown
2017-03-27 21:42:09 SC-2 osafamfd[477]: Started
2017-03-27 21:42:09 SC-2 osafamfnd[485]: NO Start monitoring AMFD using 
/var/lib/opensaf/osafamfd.fifo
2017-03-27 21:42:25 SC-2 osafamfd[476]: mkfifo already exists: 
/var/lib/opensaf/osafamfd.fifo File exists
2017-03-27 21:42:25 SC-2 osafamfd[476]: Started
2017-03-27 21:42:36 SC-2 osafamfnd[484]: NO Start monitoring AMFD using 
/var/lib/opensaf/osafamfd.fifo
2017-03-27 21:42:41 SC-2 osafamfd[476]: NO Cold sync complete!




2017-03-27 21:42:02 SC-1 osafamfd[486]: Started
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Receive message with event type:12, 
msg_type:31, from node:2050f, msg_id:0
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Receive message with event type:12, 
msg_type:31, from node:2040f, msg_id:0
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Receive message with event type:13, 
msg_type:32, from node:2050f, msg_id:0
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Receive message with event type:12, 
msg_type:31, from node:2030f, msg_id:0
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Receive message with event type:13, 
msg_type:32, from node:2040f, msg_id:0
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Receive message with event type:13, 
msg_type:32, from node:2030f, msg_id:0
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Received node_up_msg from all nodes
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Received node_up from 2030f: msg_id 1
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Received node_up from 2020f: msg_id 1
2017-03-27 21:42:03 SC-1 osafamfnd[500]: NO Start monitoring AMFD using 
/var/lib/opensaf/osafamfd.fifo
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Received node_up from 2010f: msg_id 1
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Enter restore headless cached RTAs 
from IMM
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Leave reading headless cached RTAs 
from IMM: SUCCESS
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Node 'SC-1' joined the cluster
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Received node_up from 2050f: msg_id 1
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Node 'PL-5' joined the cluster
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Received node_up from 2040f: msg_id 1
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Node 'PL-4' joined the cluster
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Received node_up from 2030f: msg_id 1
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Node 'PL-3' joined the cluster
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Received node_up from 2020f: msg_id 1
2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Node 'SC-2' joined the cluster
2017-03-27 21:42:04 SC-1 osafamfd[486]: NO Cluster startup is done
2017-03-27 21:42:05 SC-1 osafamfd[486]: NO Node 'SC-2' left the cluster
2017-03-27 21:42:06 SC-1 osafamfd[486]: WA mbcsv cold sync rsp term
2017-03-27 21:42:09 SC-1 osafamfd[486]: NO Received node_up from 2020f: msg_id 1
2017-03-27 21:42:09 SC-1 osafamfd[486]: NO Node 'SC-2' joined the cluster
2017-03-27 21:42:36 SC-1 osafamfd[510]: mkfifo already exists: 
/var/lib/opensaf/osafamfd.fifo File exists
2017-03-27 21:42:36 SC-1 osafamfd[510]: Started
2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up from 2030f: msg_id 2
2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up from 2030f: msg_id 2
2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up_msg from all nodes
2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up from 2040f: msg_id 2
2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up from 2030f: msg_id 2
2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up from 2050f: msg_id 2
2017-03-27 21:42:36 SC-1 osafamfnd[524]: NO Start monitoring AMFD using 
/var/lib/opensaf/osafamfd.fifo
```
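
When comparing runs like the ones above, it can help to pull out which `msg_id` each osafamfd instance saw in `node_up`. The following is a small Python sketch for that, assuming the syslog line format shown in this ticket and that `node_up` lines are not wrapped; the function name `node_up_ids` and the regular expression are illustrative helpers, not part of OpenSAF.

```python
import re
from collections import defaultdict

# Matches lines such as:
# 2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up from 2030f: msg_id 2
NODE_UP = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"osafamfd\[(?P<pid>\d+)\]: NO Received node_up from "
    r"(?P<node>[0-9a-f]+): msg_id (?P<msg_id>\d+)$"
)


def node_up_ids(lines):
    """Map (host, amfd pid) -> list of (node, msg_id) in log order."""
    seen = defaultdict(list)
    for line in lines:
        m = NODE_UP.match(line.strip())
        if m:
            key = (m["host"], int(m["pid"]))
            seen[key].append((m["node"], int(m["msg_id"])))
    return dict(seen)


sample = [
    "2017-03-27 21:42:03 SC-1 osafamfd[486]: NO Received node_up from 2050f: msg_id 1",
    "2017-03-27 21:42:36 SC-1 osafamfd[510]: Started",
    "2017-03-27 21:42:36 SC-1 osafamfd[510]: NO Received node_up from 2050f: msg_id 2",
]
result = node_up_ids(sample)
```

Grouping by PID separates successive amfd instances on the same controller, which makes it easier to see that node 2050f (PL-5) was logged with `msg_id 1` by osafamfd[486] and `msg_id 2` by osafamfd[510].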

