Hello everyone,
I recently tested Host HA (KVM) in a setup with multiple management servers and agent LB. I found that in an extreme case HA fails; the details are as follows:

Test environment:
  Two management nodes, M1 (172.17.1.141) and M2 (172.17.1.142), sharing an external database cluster
  Three KVM nodes: H1, H2, H3
  An external NFS primary storage

CloudStack parameter configuration:
  indirect.agent.lb.algorithm=static
  indirect.agent.lb.check.interval=0
  host=172.17.1.141,172.17.1.142

According to agent.log, all KVM agents are connected to the preferred (first-listed) management node, M1 (172.17.1.141):
  INFO [cloud.agent.Agent] (agentRequest-Handler-1:null) (logid:b30323e4) Processed new management server list: 172.17.1.141,172.17.1.142@static

The extreme case: when a KVM host and the preferred management server fail at the same time, HA detection is not triggered for that KVM host. For example:
  M1+H1 powered off at the same time: the state of H1 remains Disconnected, and the VMs on H1 are not restarted on other KVM nodes.
  M1+H2 powered off at the same time: the state of H2 remains Disconnected, and the VMs on H2 are not restarted on other KVM nodes.
  M1+H3 powered off at the same time: the state of H3 remains Disconnected, and the VMs on H3 are not restarted on other KVM nodes.
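
For anyone who wants to repeat the check of which management server the agents prefer, here is a minimal Python sketch that extracts the server list and algorithm from the agent.log line quoted above. It assumes the line always has the form "<ip>,<ip>,...@<algorithm>" as in my log. The function name parse_ms_list is hypothetical and is not part of CloudStack.

```python
import re

# Matches the agent.log message quoted above.
# Assumed format: "<ip>,<ip>,...@<algorithm>" after the fixed message text.
MS_LIST_RE = re.compile(r"Processed new management server list:\s*([^@\s]+)@(\S+)")


def parse_ms_list(line):
    """Return (hosts, algorithm) from an agent.log line, or None if it does not match."""
    match = MS_LIST_RE.search(line)
    if match is None:
        return None
    hosts = match.group(1).split(",")
    return hosts, match.group(2)


line = (
    "INFO [cloud.agent.Agent] (agentRequest-Handler-1:null) (logid:b30323e4) "
    "Processed new management server list: 172.17.1.141,172.17.1.142@static"
)
hosts, algorithm = parse_ms_list(line)
# In my test the agents connected to the first entry of the list (M1).
print(hosts[0], algorithm)
```

Running this against the agent.log of each KVM host shows whether every agent received the same list in the same order, which is the precondition for the failure described above.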