Hello,
I am currently working on getting OpenSaf (without PLM/openhpi) working in a VM
environment. I have so far been able to bring up one system controller and get
all my required components enabled. However, when I go to bring up the second
system controller, it runs through the startup, completes the cold sync and
then it appears nothing further happens. I have determined that opensafd
startup on the second SC is hanging on amfnd startup which I surmise is due to
the nid_notify call not being hit. It does not seem to be connectivity related
since both sides are communicating.
I have turned on trace debug on for amfd on both SCs and for amfnd on the SC I
am trying to bring up, but so far it is not pointing to anything obvious (but
then again I am still working to understand the code flow).
Looking for any pointers to help isolate the issue.
Version: OpenSAF 4.5.0
Compiled config: ./configure --disable-ais-plm --enable-tipc --enable-imm-pbe
In the mds.log I do see an ERR log but I do not know if is related:
Sep 30 20:39:01.736442 osafamfnd[5679] INFO |MDTM: User Recd msg len=16
Sep 30 20:39:01.736460 osafamfnd[5679] INFO |MDTM: Sending data to upper layer
for a single recd message
Sep 30 20:39:01.736478 osafamfnd[5679] DBG |MCM_DB : Entering : mds_svc_tbl_get
Sep 30 20:39:01.736494 osafamfnd[5679] DBG |MCM_DB : Leaving : S :
mds_svc_tbl_get
Sep 30 20:39:01.736511 osafamfnd[5679] DBG |MCM_DB : Entering :
mds_svc_tbl_get_role
Sep 30 20:39:01.736528 osafamfnd[5679] DBG |MCM_DB : Entering :
mds_svc_tbl_query
Sep 30 20:39:01.736544 osafamfnd[5679] DBG |MCM_DB : Leaving : S :
mds_svc_tbl_query
Sep 30 20:39:01.736561 osafamfnd[5679] DBG |MCM_DB : Entering :
mds_vdest_tbl_get_role
Sep 30 20:39:01.736578 osafamfnd[5679] DBG |MCM_DB : Leaving : S :
mds_vdest_tbl_get_role
Sep 30 20:39:01.736595 osafamfnd[5679] DBG |MCM_DB : Leaving : S :
mds_svc_tbl_get_role
Sep 30 20:39:01.736612 osafamfnd[5679] DBG |MDS_SND_RCV : Entering
mds_mcm_process_recv_snd_msg_common
Sep 30 20:39:01.736629 osafamfnd[5679] DBG |MDS_SND_RCV : Entering
mds_mcm_do_decode_full_or_flat
Sep 30 20:39:01.736647 osafamfnd[5679] DBG |MDS_SND_RCV : calling callback ptr
Sep 30 20:39:01.736682 osafamfnd[5679] DBG |MDS_SND_RCV : Leaving
mds_mcm_do_decode_full_or_flat
Sep 30 20:39:01.736751 osafamfnd[5679] ERR |MDS_SND_RCV: Receive callback
failed svc_id = CLMA(35)
^^^^^
Sep 30 20:39:01.736803 osafamfnd[5679] DBG |MCM_DB : Entering : mds_svc_tbl_get
Sep 30 20:39:01.736828 osafamfnd[5679] DBG |MCM_DB : Leaving : S :
mds_svc_tbl_get
Sep 30 20:39:01.736851 osafamfnd[5679] DBG |MCM_DB : Entering :
mds_subtn_tbl_get
Sep 30 20:39:01.736869 osafamfnd[5679] DBG |MCM_DB : Leaving : S :
mds_subtn_tbl_get
Sep 30 20:39:01.736887 osafamfnd[5679] INFO |MDS_SND_RCV: Successfully recd
the response also
Sep 30 20:39:01.736905 osafamfnd[5679] INFO |MDS_SND_RCV: Deleting the sync
send entry with xch_id=4
Here is the output syslog:
Sep 30 20:38:59 host-s2 opensafd: Starting OpenSAF Services
Sep 30 20:38:59 host-s2 kernel: [95954.028402] TIPC: Activated (version 2.0.0)
Sep 30 20:38:59 host-s2 kernel: [95954.029613] NET: Registered protocol family
30
Sep 30 20:38:59 host-s2 kernel: [95954.030680] TIPC: Started in single node mode
Sep 30 20:38:59 host-s2 kernel: [95954.033582] TIPC: Started in network mode
Sep 30 20:38:59 host-s2 kernel: [95954.034575] TIPC: Own node address <1.1.2>,
network identity 1234
Sep 30 20:38:59 host-s2 kernel: [95954.038442] TIPC: Enabled bearer <eth:eth4>,
discovery domain <1.1.0>, priority 10
Sep 30 20:38:59 host-s2 kernel: [95954.040652] TIPC: Established link
<1.1.2:eth4-1.1.1:eth4> on network plane A
Sep 30 20:38:59 host-s2 osafrded[5588]: Started
Sep 30 20:38:59 host-s2 osafrded[5588]: NO Peer rde@2010f has active state =>
Assigning Standby role to this node
Sep 30 20:39:00 host-s2 osaffmd[5596]: Started
Sep 30 20:39:00 host-s2 osafimmd[5605]: Started
Sep 30 20:39:00 host-s2 osafimmd[5605]: NO 2PBE configured with
IMMSV_PEER_SC_MAX_WAIT: 30 seconds
Sep 30 20:39:00 host-s2 osafimmnd[5618]: Started
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO Persistent Back-End capability
configured, Pbe file:imm.db (suffix may get added)
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO SERVER STATE: IMM_SERVER_ANONYMOUS
--> IMM_SERVER_CLUSTER_WAITING
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO 2PBE configured,
IMMSV_PBE_FILE_SUFFIX:.2020f (sync)
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO SERVER STATE:
IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO SERVER STATE:
IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO NODE STATE-> IMM_NODE_ISOLATED
Sep 30 20:39:00 host-s2 osafimmd[5605]: NO SBY: Ruling epoch noted as:155
Sep 30 20:39:00 host-s2 osafimmd[5605]: NO IMMND coord at 2010f
Sep 30 20:39:00 host-s2 osafimmd[5605]: NO SBY: SaImmRepositoryInitModeT
changed and noted as 'SA_IMM_KEEP_REPOSITORY'
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO NODE STATE-> IMM_NODE_W_AVAILABLE
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO SERVER STATE:
IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO NODE STATE->
IMM_NODE_FULLY_AVAILABLE 2462
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO RepositoryInitModeT is
SA_IMM_KEEP_REPOSITORY
Sep 30 20:39:01 host-s2 osafimmnd[5618]: WA IMM Access Control mode is DISABLED!
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO Epoch set to 155 in ImmModel
Sep 30 20:39:01 host-s2 osafimmd[5605]: NO SBY: New Epoch for IMMND process at
node 2010f old epoch: 154 new epoch:155
Sep 30 20:39:01 host-s2 osafimmd[5605]: NO IMMND coord at 2010f
Sep 30 20:39:01 host-s2 osafimmd[5605]: NO SBY: New Epoch for IMMND process at
node 2020f old epoch: 0 new epoch:155
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO SERVER STATE:
IMM_SERVER_SYNC_CLIENT --> IMM SERVER READY
Sep 30 20:39:01 host-s2 osaflogd[5627]: Started
Sep 30 20:39:01 host-s2 osaflogd[5627]: NO No or invalid log service
configuration object
Sep 30 20:39:01 host-s2 osaflogd[5627]: NO Log config system: high 0 low 0,
application: high 0 low 0
Sep 30 20:39:01 host-s2 osaflogd[5627]: NO log root directory is:
/var/log/opensaf/saflog
Sep 30 20:39:01 host-s2 osaflogd[5627]: NO LGS_MBCSV_VERSION = 3
Sep 30 20:39:01 host-s2 osafntfd[5639]: Started
Sep 30 20:38:59 host-s2 opensafd: Starting OpenSAF Services
Sep 30 20:38:59 host-s2 kernel: [95954.028402] TIPC: Activated (version 2.0.0)
Sep 30 20:38:59 host-s2 kernel: [95954.029613] NET: Registered protocol family
30
Sep 30 20:38:59 host-s2 kernel: [95954.030680] TIPC: Started in single node mode
Sep 30 20:38:59 host-s2 kernel: [95954.033582] TIPC: Started in network mode
Sep 30 20:38:59 host-s2 kernel: [95954.034575] TIPC: Own node address <1.1.2>,
network identity 1234
Sep 30 20:38:59 host-s2 kernel: [95954.038442] TIPC: Enabled bearer <eth:eth4>,
discovery domain <1.1.0>, priority 10
Sep 30 20:38:59 host-s2 kernel: [95954.040652] TIPC: Established link
<1.1.2:eth4-1.1.1:eth4> on network plane A
Sep 30 20:38:59 host-s2 osafrded[5588]: Started
Sep 30 20:38:59 host-s2 osafrded[5588]: NO Peer rde@2010f has active state =>
Assigning Standby role to this node
Sep 30 20:39:00 host-s2 osaffmd[5596]: Started
Sep 30 20:39:00 host-s2 osafimmd[5605]: Started
Sep 30 20:39:00 host-s2 osafimmd[5605]: NO 2PBE configured with
IMMSV_PEER_SC_MAX_WAIT: 30 seconds
Sep 30 20:39:00 host-s2 osafimmnd[5618]: Started
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO Persistent Back-End capability
configured, Pbe file:imm.db (suffix may get added)
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO SERVER STATE: IMM_SERVER_ANONYMOUS
--> IMM_SERVER_CLUSTER_WAITING
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO 2PBE configured,
IMMSV_PBE_FILE_SUFFIX:.2020f (sync)
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO SERVER STATE:
IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO SERVER STATE:
IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO NODE STATE-> IMM_NODE_ISOLATED
Sep 30 20:39:00 host-s2 osafimmd[5605]: NO SBY: Ruling epoch noted as:155
Sep 30 20:39:00 host-s2 osafimmd[5605]: NO IMMND coord at 2010f
Sep 30 20:39:00 host-s2 osafimmd[5605]: NO SBY: SaImmRepositoryInitModeT
changed and noted as 'SA_IMM_KEEP_REPOSITORY'
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO NODE STATE-> IMM_NODE_W_AVAILABLE
Sep 30 20:39:00 host-s2 osafimmnd[5618]: NO SERVER STATE:
IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO NODE STATE->
IMM_NODE_FULLY_AVAILABLE 2462
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO RepositoryInitModeT is
SA_IMM_KEEP_REPOSITORY
Sep 30 20:39:01 host-s2 osafimmnd[5618]: WA IMM Access Control mode is DISABLED!
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO Epoch set to 155 in ImmModel
Sep 30 20:39:01 host-s2 osafimmd[5605]: NO SBY: New Epoch for IMMND process at
node 2010f old epoch: 154 new epoch:155
Sep 30 20:39:01 host-s2 osafimmd[5605]: NO IMMND coord at 2010f
Sep 30 20:39:01 host-s2 osafimmd[5605]: NO SBY: New Epoch for IMMND process at
node 2020f old epoch: 0 new epoch:155
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO SERVER STATE:
IMM_SERVER_SYNC_CLIENT --> IMM SERVER READY
Sep 30 20:39:01 host-s2 osaflogd[5627]: Started
Sep 30 20:39:01 host-s2 osaflogd[5627]: NO No or invalid log service
configuration object
Sep 30 20:39:01 host-s2 osaflogd[5627]: NO Log config system: high 0 low 0,
application: high 0 low 0
Sep 30 20:39:01 host-s2 osaflogd[5627]: NO log root directory is:
/var/log/opensaf/saflog
Sep 30 20:39:01 host-s2 osaflogd[5627]: NO LGS_MBCSV_VERSION = 3
Sep 30 20:39:01 host-s2 osafntfd[5639]: Started
Sep 30 20:39:01 host-s2 osafntfd[5639]: Started
Sep 30 20:39:01 host-s2 osafclmd[5648]: logtrace: trace enabled to file
/var/log/opensaf/osafclmd, mask=0xffffffff
Sep 30 20:39:01 host-s2 osafclmd[5648]: Started
Sep 30 20:39:01 host-s2 osafclmna[5657]: logtrace: trace enabled to file
/var/log/opensaf/osafclmna, mask=0xffffffff
Sep 30 20:39:01 host-s2 osafclmna[5657]: Started
Sep 30 20:39:01 host-s2 osafclmna[5657]: NO
safNode=SC-2,safCluster=Q50clmCluster Joined cluster, nodeid=2020f
Sep 30 20:39:01 host-s2 osafamfd[5665]: logtrace: trace enabled to file
/var/log/opensaf/osafamfd, mask=0xffffffff
Sep 30 20:39:01 host-s2 osafamfd[5665]: Started
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO PBE-OI established on other SC.
Dumping incrementally to file imm.db
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO STARTING SLAVE PBE process.
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO
pbe-db-file-path:/etc/opensaf/imm.db.2020f VETERAN:1 B:0
Sep 30 20:39:01 host-s2 osafimmpbed: IN arg[0] ==
'/usr/lib64/opensaf/osafimmpbed'
Sep 30 20:39:01 host-s2 osafimmpbed: IN arg[1] == '--pbe2B'
Sep 30 20:39:01 host-s2 osafimmpbed: IN arg[2] == '/etc/opensaf/imm.db.2020f'
Sep 30 20:39:01 host-s2 osafimmpbed: IN Generating DB file from current IMM
state. DB file: /etc/opensaf/imm.db.2020f
Sep 30 20:39:01 host-s2 osafimmpbed: NO Successfully opened empty local sqlite
pbe file /tmp/imm.db.RIranA
Sep 30 20:39:01 host-s2 osafimmd[5605]: NO SBY: Ruling epoch noted as:156
Sep 30 20:39:01 host-s2 osafimmd[5605]: NO SBY: New Epoch for IMMND process at
node 2020f old epoch: 155 new epoch:156
Sep 30 20:39:01 host-s2 osafimmd[5605]: NO SBY: New Epoch for IMMND process at
node 2010f old epoch: 155 new epoch:156
Sep 30 20:39:01 host-s2 osafimmd[5605]: NO IMMND coord at 2010f
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO Epoch set to 156 in ImmModel
Sep 30 20:39:01 host-s2 osafamfd[5665]: NO got si
:safSi=Management-2N,safApp=ManagementApp
Sep 30 20:39:01 host-s2 osafamfd: Last message 'NO got si :safSi=Man' repeated
13 times, suppressed by syslog-ng on host-s2
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO Implementer (applier) connected: 12
(@safAmfService2020f) <8, 2020f>
Sep 30 20:39:01 host-s2 osafimmpbed: NO Successfully dumped to file
/tmp/imm.db.RIranA
Sep 30 20:39:01 host-s2 osafimmpbed: NO Moved /tmp/imm.db.RIranA to
/etc/opensaf/imm.db.2020f.tmp
Sep 30 20:39:01 host-s2 osafimmpbed: NO Moved /etc/opensaf/imm.db.2020f.tmp to
/etc/opensaf/imm.db.2020f
Sep 30 20:39:01 host-s2 osafimmpbed: NO Removed obsolete journal file:
/etc/opensaf/imm.db.2020f-journal
Sep 30 20:39:01 host-s2 osafimmpbed: NO Removed obsolete db file:
/etc/opensaf/imm.db.2020f.prev
Sep 30 20:39:01 host-s2 osafimmpbed: IN Re-attaching to the new version of
/etc/opensaf/imm.db.2020f
Sep 30 20:39:01 host-s2 osafimmpbed: NO Successfully opened pre-existing sqlite
pbe file /etc/opensaf/imm.db.2020f
Sep 30 20:39:01 host-s2 osafimmpbed: IN saImmRepositoryInit:
SA_IMM_KEEP_REPOSITORY - attaching to repository
Sep 30 20:39:01 host-s2 osafimmpbed: NO pbeDaemon starting with obj-count:808
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO Implementer (applier) connected: 13
(@OpenSafImmPBE) <403, 2020f>
Sep 30 20:39:01 host-s2 osafimmnd[5618]: NO Implementer connected: 14
(OsafImmPbeRt_B) <404, 2020f>
Sep 30 20:39:01 host-s2 osafamfnd[5679]: logtrace: trace enabled to file
/var/log/opensaf/osafamfnd, mask=0xffffffff
Sep 30 20:39:01 host-s2 osafamfnd[5679]: Started
Sep 30 20:39:01 host-s2 osafimmd[5605]: NO IMMND coord at 2010f
Sep 30 20:39:01 host-s2 osafimmpbed: NO Update epoch 156 committing with
ccbId:10000000a/4294967306
Sep 30 20:39:03 host-s2 osafimmnd[5618]: NO PBE slave established on this SC.
Dumping incrementally to file imm.db
Sep 30 20:39:04 host-s2 osafamfd[5665]: NO Cold sync complete!
Thanks,
Chad
------------------------------------------------------------------------------
_______________________________________________
Opensaf-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-users