Re: Virtual Routers not starting up after host restart
From what I can see, the ACS is unable to contact your hypervisor host server:

2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Invocation exception, caused by: com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying

What is the status of your host server? Is it shown as Up/Alert/Disconnected/Connecting?

Andrei

----- Original Message -----
From: Mohammad Rastgoo moham...@synapti.ca
To: users@cloudstack.apache.org
Sent: Monday, 2 February, 2015 7:27:30 PM
Subject: Re: Virtual Routers not starting up after host restart

Andrei,

Below is the partial MS log. I have marked a couple of parts in bold. It might be dumb, but my first thought was that iptables may be causing it, yet I have no good explanation for it.

2015-02-02 13:17:14,152 WARN [o.a.c.alerts] (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) alertType:: 9 // dataCenterId:: 1 // podId:: 1 // clusterId:: null // message:: Command: com.cloud.agent.api.check.CheckSshCommand failed while starting virtual router
2015-02-02 13:17:14,233 WARN [c.c.n.r.VirtualNetworkApplianceManagerImpl] (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Command: com.cloud.agent.api.check.CheckSshCommand failed while starting virtual router
2015-02-02 13:17:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-3041bbe4) Task (job-244) has been pending for 1134 seconds
2015-02-02 13:17:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-3041bbe4) Task (job-245) has been pending for 1133 seconds
2015-02-02 13:18:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-526a6af6) Task (job-244) has been pending for 1194 seconds
2015-02-02 13:18:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-526a6af6) Task (job-245) has been pending for 1193 seconds
2015-02-02 13:19:16,969 ERROR [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Failed to start instance VM[DomainRouter|r-29-VM] com.cloud.utils.exception.ExecutionException: Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
2015-02-02 13:19:17,518 DEBUG [c.c.c.CapacityManagerImpl] (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: null new host id: null host id before state transition: 1
2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Invocation exception, caused by: com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
2015-02-02 13:19:17,585 INFO [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Rethrow exception com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
2015-02-02 13:19:17,586 ERROR [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-24:ctx-114e980e job-244/job-245) Unable to complete AsyncJobVO {id:245, userId: 2, accountId: 2, instanceType: null, instanceId: null, cmd: com.cloud.vm.VmWorkStart, cmdInfo: 
rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAACAAIAHXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAADHcIEAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 161333667508, completeMsid: null, lastUpdated: null, lastPolled: null, created: Mon Feb 02 12:58:55 EST 2015}, job origin:244 *com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retryingCaused by: com.cloud.utils.exception.ExecutionException: Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying* 2015
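A side note on the long cmdInfo value in the AsyncJobVO entry: it is not corruption. The leading rO0AB is what the header of a serialized Java object looks like once it is Base64-encoded, and the value uses the URL-safe alphabet (it contains - and _). The sketch below is only an illustration: it decodes the first 44 characters copied from the log and lists the readable strings inside. The helper name printable_strings is made up for this example and is not part of CloudStack.

```python
import base64
import re
from typing import List

# First 44 characters of the cmdInfo value, copied from the log entry above.
CMD_INFO_PREFIX = "rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9"


def printable_strings(blob: str, min_len: int = 4) -> List[str]:
    """Decode URL-safe Base64 text and return runs of printable ASCII.

    Hypothetical helper for illustration. Binary bytes that happen to be
    printable can end up glued to the start or end of a string.
    """
    padded = blob + "=" * (-len(blob) % 4)  # restore any stripped padding
    raw = base64.urlsafe_b64decode(padded)
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, raw)]


if __name__ == "__main__":
    for text in printable_strings(CMD_INFO_PREFIX):
        print(text)
```

Run against the full value from a complete log entry, the same function should list the class and field names of the serialized job, which can help confirm which operation a stuck job belongs to.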
Re: Virtual Routers not starting up after host restart
Mohammad, are there any errors on the host side? Can you check whether the VRs are being created on the host? Also, check whether you can get the console (from the hypervisor, not from the ACS GUI). Perhaps there is a clue there as to what's happening.

By the way, are your other system VMs working okay, such as the SSVM and CPVM?

Andrei

----- Original Message -----
From: Mohammad Rastgoo moham...@synapti.ca
To: users@cloudstack.apache.org
Sent: Monday, 2 February, 2015 7:42:13 PM
Subject: Re: Virtual Routers not starting up after host restart

UP and green.

On Mon, Feb 2, 2015 at 2:34 PM, Andrei Mikhailovsky and...@arhont.com wrote:

> From what I can see, the ACS is unable to contact your hypervisor host server:
>
> 2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Invocation exception, caused by: com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
>
> What is the status of your host server? Is it shown as Up/Alert/Disconnected/Connecting?
>
> Andrei
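Since the log shows CheckSshCommand failing while the host itself is Up, one thing that can be tested directly from the KVM host is whether the router's SSH port accepts connections at all. A minimal sketch follows, with loud assumptions: port 3922 and a link-local (169.254.x.x) control address are what I would expect for system VMs on KVM, but neither appears in this thread, so check your own setup. The function name tcp_port_open is made up for the example.

```python
import socket
import sys


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only proves that something accepts connections
    on the port, not that SSH authentication would succeed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    if len(sys.argv) == 3:
        # Example (assumed values): python check_port.py <router-link-local-ip> 3922
        target_host, target_port = sys.argv[1], int(sys.argv[2])
        state = "open" if tcp_port_open(target_host, target_port) else "closed or filtered"
        print(f"{target_host}:{target_port} is {state}")
    else:
        print("usage: check_port.py HOST PORT")
```

If the port is closed while the router VM is running on the host, a firewall rule on the host (the iptables suspicion raised earlier) or a router that never finished booting would both be consistent with that result; the console check suggested above helps tell the two apart.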
Re: Virtual Routers not starting up after host restart
Mohammad, what does the management server log say when you try to start the VRs? It should have a clue as to why they are not starting.

Andrei

----- Original Message -----
From: Mohammad Rastgoo moham...@synapti.ca
To: users@cloudstack.apache.org
Sent: Monday, 2 February, 2015 6:06:41 PM
Subject: Virtual Routers not starting up after host restart

Hi,

Thanks for reading this. I have this setup:

server 1: MS + DB
server 2: secondary storage NFS
server 3: KVM - local primary
(all CentOS 6.6)

net1: isolated network 10.0.0.0/x
net2: shared network (public IP)

Here are the steps I took:

1- stopped all VMs
2- stopped system VMs (not VRs)
3- yum updated glibc + reboot on all servers

Now here is the situation: net2 has remained in the Setup state and net1 in Allocated. The system VMs are back on. The VRs go to Starting and then Stopped. So far, I have deleted the VRs and restarted the networks with clean up. No luck.

Has anyone encountered the same problem? Am I missing anything here? Any help is highly appreciated.

Tnx
--
Mohammad Rastgoo
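For anyone who needs to answer the "what does the management server log say" question on a busy system, here is a rough sketch of pulling only the WARN and ERROR entries for one job out of the log. It assumes the line layout seen in the excerpts in this thread (timestamp, level, then a context containing job-NNN). The function name job_entries and the sample lines are for illustration only; the log file location is not given in this thread, so pass in whatever path your installation uses.

```python
import re
from typing import Iterable, Iterator, Tuple

# Matches the start of an entry, e.g. "2015-02-02 13:17:14,233 WARN ..."
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\s+(\w+)")

# Three entries copied from the log excerpt earlier in this thread.
SAMPLE = [
    "2015-02-02 13:17:14,233 WARN [c.c.n.r.VirtualNetworkApplianceManagerImpl] (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Command: com.cloud.agent.api.check.CheckSshCommand failed while starting virtual router",
    "2015-02-02 13:17:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-3041bbe4) Task (job-244) has been pending for 1134 seconds",
    "2015-02-02 13:19:17,518 DEBUG [c.c.c.CapacityManagerImpl] (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) VM state transitted from :Starting to Stopped with event: OperationFailed",
]


def job_entries(
    lines: Iterable[str],
    job_id: int,
    levels: Tuple[str, ...] = ("WARN", "ERROR"),
) -> Iterator[str]:
    """Yield entries at the given levels that mention job-<job_id>."""
    job_re = re.compile(r"\bjob-%d\b" % job_id)
    for line in lines:
        match = ENTRY_START.match(line)
        if match and match.group(1) in levels and job_re.search(line):
            yield line.rstrip("\n")


if __name__ == "__main__":
    for entry in job_entries(SAMPLE, 245):
        print(entry)
```

To use it on a real log, open the file and pass the file object as lines, for example job_entries(open(path), 245). Stack-trace continuation lines do not start with a timestamp, so this sketch skips them; it is meant for spotting the first failure (here, CheckSshCommand), after which the full entry can be read in the log itself.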