Hi Jerry,

I need some additional information. Can you provide the value of the 'host' property in agent.properties on each KVM host? I suspect the global setting has not been propagated to the agents, because the agent keeps trying to reconnect to the same management server instead of connecting to the next one once it is down.
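In case it helps, here is a minimal Python sketch for pulling that value out on each host. It assumes the agent configuration lives at /etc/cloudstack/agent/agent.properties and that the 'host' value is a comma-separated list that may carry an '@algorithm' suffix; both are assumptions on my side, so adjust them to your installation. The function names are made up for illustration.

```python
from pathlib import Path

# Assumed default location of the KVM agent configuration; adjust if yours differs.
AGENT_PROPERTIES = Path("/etc/cloudstack/agent/agent.properties")


def parse_host_property(text):
    """Return (hosts, algorithm) parsed from the 'host' key of properties text.

    hosts is the comma-separated list split into individual entries.
    algorithm is the part after '@', or None when there is no such suffix.
    Returns ([], None) when no 'host' key is present.
    """
    hosts, algorithm = [], None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and properties-style comments.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "host":
            hosts_part, _, suffix = value.strip().partition("@")
            hosts = [h.strip() for h in hosts_part.split(",") if h.strip()]
            algorithm = suffix.strip() or None
            # Keep scanning: if the key is repeated, the last occurrence wins.
    return hosts, algorithm


def read_agent_hosts(path=AGENT_PROPERTIES):
    """Read the properties file at 'path' and parse its 'host' key."""
    return parse_host_property(Path(path).read_text())
```

Calling read_agent_hosts() on each KVM host and comparing the result against the 'host' global setting (172.17.1.142,172.17.1.141 in your case) should show whether the full list reached the agent. If only a single address comes back, the agent has no second management server to fail over to.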
Regards,
Nicolas Vazquez

________________________________
From: li jerry <div...@hotmail.com>
Sent: Monday, July 15, 2019 10:20 PM
To: us...@cloudstack.apache.org <us...@cloudstack.apache.org>; dev@cloudstack.apache.org <dev@cloudstack.apache.org>
Subject: Agent LB for CloudStack failed

Hello everyone,

My KVM Agent LB on 4.11.2/4.11.3 failed. When the preferred management node is forced to power off, the agent does not immediately connect to the second management node. After 15 minutes, the agent reports a "No route to host" error and connects to the second management node.

Management nodes:
acs-mn01, 172.17.1.141
acs-mn02, 172.17.1.142

MySQL DB node:
acs-db01

KVM agent nodes:
test-ceph-node01
test-ceph-node02
test-ceph-node03
test-ceph-node04

Global settings:
host=172.17.1.142,172.17.1.141
indirect.agent.lb.algorithm=roundrobin
indirect.agent.lb.check.interval=60

Partial agent logs:

2019-07-15 23:22:39,340 DEBUG [cloud.agent.Agent] (UgentTask-5:null) (logid:) Sending ping: Seq 1-19: { Cmd , MgmtId: -1, via: 1, Ver : v1, Flags: 11, [{"com.cloud.agent.api.PingRoutingWithNwGroupsCommand":{"newGroupStates":{},"_hostVmStateReport":{},"_gatewayAccessible":true,"_vnetAccessible":true,"hostType":"Routing","hostId":1,"wait":0}}] }
2019-07-15 23:23:09,960 DEBUG [utils.nio.NioConnection] (Agent-NioConnectionHandler-1:null) (logid:) Location 1: Socket Socket[addr=/172.17.1.142,port=8250,localport=34854] closed on read. Probably -1 returned: No route to host
2019-07-15 23:23:09,960 DEBUG [utils.nio.NioConnection] (Agent-NioConnectionHandler-1:null) (logid:) Closing socket Socket[addr=/172.17.1.142,port=8250,localport=34854]
2019-07-15 23:23:09,961 DEBUG [cloud.agent.Agent] (Agent-Handler-4:null) (logid:a4e4de49) Clearing watch list: 2
2019-07-15 23:23:09,962 INFO [cloud.agent.Agent] (Agent-Handler-4:null) (logid:a4e4de49) Lost connection to host: 172.17.1.142. Attempting reconnection while we still have 0 commands in Progress.
2019-07-15 23:23:09,963 INFO [utils.nio.NioClient] (Agent-Handler-4:null) (logid:a4e4de49) NioClient connection closed
2019-07-15 23:23:09,964 INFO [cloud.agent.Agent] (Agent-Handler-4:null) (logid:a4e4de49) Reconnecting to host:172.17.1.142
2019-07-15 23:23:09,964 INFO [utils.nio.NioClient] (Agent-Handler-4:null) (logid:a4e4de49) Connecting to 172.17.1.142:8250
2019-07-15 23:23:12,972 ERROR [utils.nio.NioConnection] (Agent-Handler-4:null) (logid:a4e4de49) Unable to initialize the threads.
java.net.NoRouteToHostException: No route to host
    at sun.nio.ch.Net.connect0(Native Method)
    at sun.nio.ch.Net.connect(Net.java:454)
    at sun.nio.ch.Net.connect(Net.java:446)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
    at com.cloud.utils.nio.NioClient.init(NioClient.java:56)
    at com.cloud.utils.nio.NioConnection.start(NioConnection.java:95)
    at com.cloud.agent.Agent.reconnect(Agent.java:517)
    at com.cloud.agent.Agent$ServerHandler.doTask(Agent.java:1091)
    at com.clo

nicolas.vazq...@shapeblue.com
www.shapeblue.com
Amadeus House, Floral Street, London WC2E 9DP UK
@shapeblue