From what I can see, the ACS is unable to contact your hypervisor host server:

2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Invocation exception, caused by: com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying

What is the status of your host server? Is it shown as Up/Alert/Disconnected/Connecting?

Andrei

----- Original Message -----
> From: "Mohammad Rastgoo" <moham...@synapti.ca>
> To: users@cloudstack.apache.org
> Sent: Monday, 2 February, 2015 7:27:30 PM
> Subject: Re: Virtual Routers not starting up after host restart
>
> Andrei,
>
> Below is the partial MS log. I have marked couple parts in bold. Might be
> dumb but my first thought was maybe iptables is causing it, yet I have no
> good explanations for it.
>
> 2015-02-02 13:17:14,152 WARN [o.a.c.alerts]
> (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
> alertType:: 9 // dataCenterId:: 1 // podId:: 1 // clusterId:: null //
> message:: Command: com.cloud.agent.api.check.CheckSshCommand failed while
> starting virtual router
> 2015-02-02 13:17:14,233 WARN [c.c.n.r.VirtualNetworkApplianceManagerImpl]
> (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Command:
> com.cloud.agent.api.check.CheckSshCommand failed while starting virtual
> router
> 2015-02-02 13:17:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor]
> (Timer-1:ctx-3041bbe4) Task (job-244) has been pending for 1134 seconds
> 2015-02-02 13:17:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor]
> (Timer-1:ctx-3041bbe4) Task (job-245) has been pending for 1133 seconds
> 2015-02-02 13:18:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor]
> (Timer-1:ctx-526a6af6) Task (job-244) has been pending for 1194 seconds
> 2015-02-02 13:18:49,620 WARN [o.a.c.f.j.i.AsyncJobMonitor]
> (Timer-1:ctx-526a6af6) Task (job-245) has been pending for 1193 seconds
> 2015-02-02 13:19:16,969 ERROR [c.c.v.VirtualMachineManagerImpl]
> (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Failed to
> start instance VM[DomainRouter|r-29-VM]
> com.cloud.utils.exception.ExecutionException: Unable to start
> VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
> 2015-02-02 13:19:17,518 DEBUG [c.c.c.CapacityManagerImpl]
> (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) VM state
> transitted from :Starting to Stopped with event: OperationFailedvm's
> original host id: null new host id: null host id before state transition: 1
> 2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy]
> (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Invocation
> exception, caused by: com.cloud.exception.AgentUnavailableException:
> Resource [Host:1] is unreachable: Host 1: Unable to start instance due to
> Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not
> retrying
> 2015-02-02 13:19:17,585 INFO [c.c.v.VmWorkJobHandlerProxy]
> (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Rethrow
> exception com.cloud.exception.AgentUnavailableException: Resource [Host:1]
> is unreachable: Host 1: Unable to start instance due to Unable to start
> VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
> 2015-02-02 13:19:17,586 ERROR [c.c.v.VmWorkJobDispatcher]
> (Work-Job-Executor-24:ctx-114e980e job-244/job-245) Unable to complete
> AsyncJobVO {id:245, userId: 2, accountId: 2, instanceType: null,
> instanceId: null, cmd: com.cloud.vm.VmWorkStart, cmdInfo:
> rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAHXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAAAAAADHcIAAAAEAAAAAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw,
> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> result: null, initMsid: 161333667508, completeMsid: null, lastUpdated:
> null, lastPolled: null, created: Mon Feb 02 12:58:55 EST 2015}, job
> origin:244
>
>
> *com.cloud.exception.AgentUnavailableException: Resource [Host:1] is
> unreachable: Host 1: Unable to start instance due to Unable to start
> VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retryingCaused
> by: com.cloud.utils.exception.ExecutionException: Unable to start
> VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying*
>
> 2015-02-02 13:19:17,930 WARN [o.a.c.e.o.NetworkOrchestrator]
> (API-Job-Executor-16:ctx-0dfa85ec job-244 ctx-2d8a3616) Failed to implement
> network Ntwk[f3a318a2-d6f0-4fcb-be94-4e4586cc20a3|Guest|7] elements and
> resources as a part of network restart due to
> java.lang.RuntimeException: *Job failed due to exception Resource [Host:1]
> is unreachable: Host 1: Unable to start instance due to Unable to start
> VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying*
> 2015-02-02 13:19:17,930 WARN [c.c.n.NetworkServiceImpl]
> (API-Job-Executor-16:ctx-0dfa85ec job-244 ctx-2d8a3616) Network id=207
> failed to restart.
> 2015-02-02 13:19:18,135 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (API-Job-Executor-16:ctx-0dfa85ec job-244) Complete async job-244,
> jobStatus: FAILED, resultCode: 530, result:
> org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed
> to restart network"}
> 2015-02-02 13:23:18,345 WARN [c.c.a.d.ParamGenericValidationWorker]
> (catalina-exec-19:ctx-b153f3e0 ctx-d5075366) Received unknown parameters
> for command listNetworks. Unknown parameters : details
>
> On Mon, Feb 2, 2015 at 2:19 PM, Andrei Mikhailovsky <and...@arhont.com>
> wrote:
>
> > Mohammad, what does the management server log say when you try to start
> > VRs? It should have the clue why it is not starting
> >
> > Andrei
> >
> > ----- Original Message -----
> >
> > > From: "Mohammad Rastgoo" <moham...@synapti.ca>
> > > To: users@cloudstack.apache.org
> > > Sent: Monday, 2 February, 2015 6:06:41 PM
> > > Subject: Virtual Routers not starting up after host restart
> >
> > > Hi,
> >
> > > Thanks for reading this.
> >
> > > I have this setup:
> > > server 1: MS + DB
> > > server 2: secondary storage NFS
> > > server 3: kvm - local primary
> > > (all centos 6.6)
> > > net1: isolated network 10.0.0.0/x
> > > net2: shared network (public ip)
> >
> > > Here are the steps I took:
> >
> > > 1- stopped all VMs
> > > 2- stopped system VMs (not VRs)
> > > 3- yum updated glibc + reboot on all servers
> >
> > > Now here is the situation, net2 has remained in setup state and net1
> > > on
> > > allocated.
> >
> > > sys VMs are back on. VRs are at starting and then stopped.
> >
> > > so far, I have deleted VRs and restarted networks + clean up. no
> > > luck.
> >
> > > has anyone encountered the same problem? am I missing anything here?
> >
> > > Any help is highly appreciated. Tnx
> >
> > > --
> > > Mohammad Rastgoo
> > >
> >
> --
> Mohammad Rastgoo
> Founder & CEO
>