Hi,

Since your whole rack is affected, before trying the steps below I suggest verifying that your management server is reachable from the hosts, e.g. with "telnet <mgmt-server-IP-address> 8250". This confirms that there is no connectivity issue at the IP level (firewall and such).
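
If telnet is not installed on the hosts, the same reachability check can be sketched in a few lines of Python. This is only an illustration: the function name can_reach and the 5-second timeout are my own choices, and the only thing it tests is whether a TCP connection to port 8250 can be opened, nothing about the agent protocol itself.

```python
import socket


def can_reach(host, port=8250, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


# Example (run on each affected host), using the host= value from the
# quoted agent.properties:
#   print(can_reach("10.2.1.251"))
```

A False result from every host in the rack would point at the network or a firewall rather than at the agents.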
Then, try the following:
- stop the CloudStack agent
- restart libvirt (make sure it has really restarted; I've seen cases where libvirt would not restart because it was stuck, so confirm that the process has a new PID)
- start the CloudStack agent

This will make sure your libvirt has NO pools in it (existing VMs keep running happily), so the CloudStack agents should be able to connect to the management server and, hopefully, everything will be fine.

Regards,
Andrija

On Tue, 5 Nov 2019 at 17:01, Munjo Jung <[email protected]> wrote:

> Hello, everyone,
> Please somebody help me or give me some advice to solve the problem of my
> private cloud below.
> cloudstack 4.8 on CentOS and KVM hosts are used in the cloud.
> cnode02 ~ cnode10 in Rack01 entered alert status and are not connected to
> the cloudstack-management server.
> [image: image.png]
>
> *config in /etc/cloudstack/agent/agent.properties*
> #Storage
> #Tue Nov 05 22:28:26 KST 2019
> guest.network.device=cloudbr1
> workers=5
> private.network.device=cloudbr0
> port=8250
> resource=com.cloud.hypervisor.kvm.resource.LibvirtComputingResource
> pod=1
> zone=1
> hypervisor.type=kvm
> guid=eaa8860f-1a51-3130-89ca-75550e0217d1
> public.network.device=cloudbr1
> cluster=1
> local.storage.uuid=3801e58c-84dd-4506-b3e3-b2257ed82bb4
> domr.scripts.dir=scripts/network/domr/kvm
> LibvirtComputingResource.id=0
> host=10.2.1.251
>
> *partial log messages in /var/log/cloudstack/agent/agent.log*
> 2019-11-05 22:10:49,737 WARN [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:a14d6e1b) Unable to send response: null
> 2019-11-05 22:10:55,244 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:0a288205) Unable to send response: null
> 2019-11-05 22:11:00,660 WARN [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:14b60513) Unable to send response: null
> 2019-11-05 22:11:06,068 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:95b2b83d) Unable to send response: null
> 2019-11-05 22:11:11,460 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:637f01f8) Unable to send response: null
> 2019-11-05 22:11:16,886 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:4a621d4e) Unable to send response: null
> 2019-11-05 22:11:22,284 WARN [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:ac4b5fad) Unable to send response: null
> 2019-11-05 22:11:27,707 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:76049733) Unable to send response: null
> 2019-11-05 22:11:33,113 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:db1f9207) Unable to send response: null
> 2019-11-05 22:11:38,536 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:6c2cce66) Unable to send response: null
> 2019-11-05 22:11:43,968 WARN [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:f628c8b9) Unable to send response: null
> 2019-11-05 22:11:49,378 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:6ca4376c) Unable to send response: null
> 2019-11-05 22:11:54,848 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:18f85f8e) Unable to send response: null
> 2019-11-05 22:12:00,248 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:f38a50d3) Unable to send response: null
> 2019-11-05 22:12:05,709 WARN [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:6e364389) Unable to send response: null
> 2019-11-05 22:12:11,162 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:a648b060) Unable to send response: null
> 2019-11-05 22:12:16,597 WARN [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:1f2b530c) Unable to send response: null
> 2019-11-05 22:12:22,031 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:ddba3a27) Unable to send response: null
> 2019-11-05 22:12:27,445 WARN [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:d1bb3bba) Unable to send response: null
> 2019-11-05 22:12:32,875 WARN [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:fcba9031) Unable to send response: null
> 2019-11-05 22:12:38,280 WARN [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:367f591f) Unable to send response: null
> 2019-11-05 22:12:43,713 WARN [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:975ee4bb) Unable to send response: null
> 2019-11-05 22:12:49,118 WARN [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:dcf0f6a3) Unable to send response: null
> 2019-11-05 22:12:54,559 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:7874a8bf) Unable to send response: null
> 2019-11-05 22:12:59,979 WARN [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:9422ed8b) Unable to send response: null
> 2019-11-05 22:11:16,817 WARN [c.c.a.m.AgentManagerImpl]
> (AgentManager-Handler-5:null) (logid:) Unable to send response because
> connection is closed: Seq 0-16514: { Ans: , MgmtId: 90520730730497, via:
> 10, Ver: v1, Flags: 100010,
> [{"com.cloud.agent.api.PingAnswer":{"_command":{"newGroupStates":{},"_hostVmStateReport":{"i-5-31-VM":{"state":"PowerOn","host":"cnode04-R01"},"s-4-VM":{"state":"PowerOn","host":"cnode04-R01"},"i-5-115-VM":{"state":"PowerOn","host":"cnode04-R01"},"i-5-39-VM":{"state":"PowerOn","host":"cnode04-R01"}},"_gatewayAccessible":true,"_vnetAccessible":true,"hostType":"Routing","hostId":0,"wait":0},"result":true,"wait":0}}]
> }
> 2019-11-05 22:11:21,016 WARN [c.c.s.StorageManagerImpl]
> (AgentConnectTaskPool-25425:ctx-7f61f027) (logid:aab3ae9f) Unable to setup
> the local storage pool for Host[-21-Routing]
>
> *com.cloud.utils.exception.CloudRuntimeException: Another active pool with
> the same uuid already exists*
> 2019-11-05 22:11:21,017 INFO [c.c.u.e.CSExceptionErrorCode]
> (AgentConnectTaskPool-25425:ctx-7f61f027) (logid:aab3ae9f) Could not find
> exception: com.cloud.exception.ConnectionException in error code list for
> exceptions
> 2019-11-05 22:11:21,017 WARN [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-25425:ctx-7f61f027) (logid:aab3ae9f) Monitor
> LocalStoragePoolListener says there is an error in the connect process for
> 21 due to Unable to setup the local storage pool for Host[-21-Routing]
> 2019-11-05 22:11:21,024 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-25425:ctx-7f61f027) (logid:aab3ae9f) Failed to handle
> host connection: com.cloud.exception.ConnectionException: Unable to setup
> the local storage pool for Host[-21-Routing]
> 2019-11-05 22:11:22,236 WARN [c.c.s.StorageManagerImpl]
> (AgentConnectTaskPool-25426:ctx-0c8c7593) (logid:ac4b5fad) Unable to setup
> the local storage pool for Host[-10-Routing]
> com.cloud.utils.exception.CloudRuntimeException: Another active pool with
> the same uuid already exists
> 2019-11-05 22:11:22,237 INFO [c.c.u.e.CSExceptionErrorCode]
> (AgentConnectTaskPool-25426:ctx-0c8c7593) (logid:ac4b5fad) Could not find
> exception: com.cloud.exception.ConnectionException in error code list for
> exceptions
> 2019-11-05 22:11:22,237 WARN [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-25426:ctx-0c8c7593) (logid:ac4b5fad) Monitor
> LocalStoragePoolListener says there is an error in the connect process for
> 10 due to Unable to setup the local storage pool for Host[-10-Routing]
> 2019-11-05 22:11:22,244 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-25426:ctx-0c8c7593) (logid:ac4b5fad) Failed to handle
> host connection: com.cloud.exception.ConnectionException: Unable to setup
> the local storage pool for Host[-10-Routing]
> 2019-11-05 22:11:27,653 WARN [c.c.s.StorageManagerImpl]
> (AgentConnectTaskPool-25427:ctx-aeafd009) (logid:76049733) Unable to setup
> the local storage pool for Host[-10-Routing]
> Thanks,
>
> MJ
>

--
Andrija Panić
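
P.S. With nine hosts to go through, the stop / restart / start sequence from the top of this mail could be scripted roughly as below. Treat it as a sketch under assumptions, not a tested procedure: it assumes the hosts use systemd with the unit names cloudstack-agent and libvirtd (on older CentOS releases you would use the service command instead), and the function names are made up for illustration. The PID comparison is there to catch the case where libvirt is stuck and did not really restart.

```python
import subprocess


def run_cmd(args):
    """Run a command, raise on a non-zero exit code, return its stdout."""
    return subprocess.run(args, check=True, capture_output=True, text=True).stdout


def libvirt_pid(run=run_cmd):
    # `systemctl show -p MainPID libvirtd` prints a line such as "MainPID=1234"
    # (0 means the service has no running main process).
    out = run(["systemctl", "show", "-p", "MainPID", "libvirtd"])
    return int(out.strip().split("=", 1)[1])


def bounce_agent_and_libvirt(run=run_cmd):
    """Stop the agent, restart libvirt, verify the new PID, start the agent."""
    run(["systemctl", "stop", "cloudstack-agent"])
    old_pid = libvirt_pid(run)
    run(["systemctl", "restart", "libvirtd"])
    new_pid = libvirt_pid(run)
    if new_pid == 0 or new_pid == old_pid:
        # libvirt is not running or kept its old process: do not start the agent.
        raise RuntimeError(
            "libvirtd did not restart (old PID %d, new PID %d)" % (old_pid, new_pid)
        )
    run(["systemctl", "start", "cloudstack-agent"])
    return old_pid, new_pid


# Example (as root on one affected host):
#   print(bounce_agent_and_libvirt())
```

The run parameter exists only so the sequence can be dry-run with a fake command runner before touching a real host.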
