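When the PBE hangs, amfd may process events in the following order
if a node is started, stopped, and then started again:
- clm_track_cb for the node down event
- clm_track_cb for the second node up event
- avd_mds_avnd_down_evh for the amfnd down event
Because the stale amfnd down event is handled after the second node up
event, the node cannot join the cluster. Ignore the amfnd down event in
avd_mds_avnd_down_evh when the node state is already absent.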
---
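Note (not part of the commit message): the ordering problem above can be
sketched with a small standalone model. This is only an illustration under
stated assumptions. Node, NodeState, join_pending and the three on_* functions
are hypothetical names, not amfd types or functions, and the model assumes the
node stays absent after the second CLM up event until amfnd registers again.

```cpp
#include <string>

// Hypothetical, simplified model of a node as seen by the director.
// These are NOT the real amfd data structures.
enum class NodeState { Absent, Present };

struct Node {
  std::string name;
  NodeState state = NodeState::Absent;
  bool join_pending = false;  // set when CLM reports the node as up again
};

// CLM reports the node as down: mark it absent and drop any pending join.
void on_clm_node_down(Node& n) {
  n.state = NodeState::Absent;
  n.join_pending = false;
}

// CLM reports the node as up: the node may now join. In this model it stays
// Absent until its node director registers again (an assumption).
void on_clm_node_up(Node& n) { n.join_pending = true; }

// Node director down event. Returns true if the event was processed and
// false if it was ignored as stale, mirroring the guard added by the patch.
bool on_amfnd_down(Node& n) {
  if (n.state == NodeState::Absent) {
    return false;  // node is already absent, so this down event is stale
  }
  n.state = NodeState::Absent;
  n.join_pending = false;
  return true;
}
```

Replaying the sequence from the commit message (down, up, late amfnd down) on
a node that was present, the late event is ignored and join_pending stays set.
Without the early return, the late event would clear it again.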
src/amf/amfd/ndfsm.cc | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/src/amf/amfd/ndfsm.cc b/src/amf/amfd/ndfsm.cc
index 8c8f3c5..7099196 100644
--- a/src/amf/amfd/ndfsm.cc
+++ b/src/amf/amfd/ndfsm.cc
@@ -800,6 +800,13 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
daemon_exit();
}
+ if (node->node_state == AVD_AVND_STATE_ABSENT) {
+ TRACE("Ignore '%s' amfnd down event since node state absent",
+ node->node_name.c_str());
+ TRACE_LEAVE();
+ return;
+ }
+
if (cb->failover_list.find(evt->info.node_id) != cb->failover_list.end()) {
std::shared_ptr<NodeStateMachine> failed_node =
cb->failover_list.at(evt->info.node_id);
--
2.7.4