Re: Virtual Routers not starting up after host restart

2015-02-02 Thread Andrei Mikhailovsky
From what I can see, the ACS is unable to contact your hypervisor host server:


2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy]
(Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Invocation
exception, caused by: com.cloud.exception.AgentUnavailableException:
Resource [Host:1] is unreachable: Host 1: Unable to start instance due to
Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not
retrying


What is the status of your host server? Is it shown as 
Up/Alert/Disconnected/Connecting?

Andrei

- Original Message -
 From: Mohammad Rastgoo moham...@synapti.ca
 To: users@cloudstack.apache.org
 Sent: Monday, 2 February, 2015 7:27:30 PM
 Subject: Re: Virtual Routers not starting up after host restart
 
 Andrei,
 
 Below is the partial MS log. I have marked a couple of parts in bold. It
 might be dumb, but my first thought was that iptables may be causing it, yet
 I have no good explanation for it.
 
 2015-02-02 13:17:14,152 WARN  [o.a.c.alerts]
 (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78)
 alertType:: 9 // dataCenterId:: 1 // podId:: 1 // clusterId:: null //
 message:: Command: com.cloud.agent.api.check.CheckSshCommand failed while
 starting virtual router
 2015-02-02 13:17:14,233 WARN  [c.c.n.r.VirtualNetworkApplianceManagerImpl]
 (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Command:
 com.cloud.agent.api.check.CheckSshCommand failed while starting virtual
 router
 2015-02-02 13:17:49,620 WARN  [o.a.c.f.j.i.AsyncJobMonitor]
 (Timer-1:ctx-3041bbe4) Task (job-244) has been pending for 1134 seconds
 2015-02-02 13:17:49,620 WARN  [o.a.c.f.j.i.AsyncJobMonitor]
 (Timer-1:ctx-3041bbe4) Task (job-245) has been pending for 1133 seconds
 2015-02-02 13:18:49,620 WARN  [o.a.c.f.j.i.AsyncJobMonitor]
 (Timer-1:ctx-526a6af6) Task (job-244) has been pending for 1194 seconds
 2015-02-02 13:18:49,620 WARN  [o.a.c.f.j.i.AsyncJobMonitor]
 (Timer-1:ctx-526a6af6) Task (job-245) has been pending for 1193 seconds
 2015-02-02 13:19:16,969 ERROR [c.c.v.VirtualMachineManagerImpl]
 (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Failed to
 start instance VM[DomainRouter|r-29-VM]
 com.cloud.utils.exception.ExecutionException: Unable to start
 VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
 2015-02-02 13:19:17,518 DEBUG [c.c.c.CapacityManagerImpl]
 (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) VM state
 transitted from :Starting to Stopped with event: OperationFailedvm's
 original host id: null new host id: null host id before state transition: 1
 2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy]
 (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Invocation
 exception, caused by: com.cloud.exception.AgentUnavailableException:
 Resource [Host:1] is unreachable: Host 1: Unable to start instance due to
 Unable to start VM[DomainRouter|r-29-VM] due to error in finalizeStart, not
 retrying
 2015-02-02 13:19:17,585 INFO  [c.c.v.VmWorkJobHandlerProxy]
 (Work-Job-Executor-24:ctx-114e980e job-244/job-245 ctx-405a7c78) Rethrow
 exception com.cloud.exception.AgentUnavailableException: Resource [Host:1]
 is unreachable: Host 1: Unable to start instance due to Unable to start
 VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
 2015-02-02 13:19:17,586 ERROR [c.c.v.VmWorkJobDispatcher]
 (Work-Job-Executor-24:ctx-114e980e job-244/job-245) Unable to complete
 AsyncJobVO {id:245, userId: 2, accountId: 2, instanceType: null,
 instanceId: null, cmd: com.cloud.vm.VmWorkStart, cmdInfo:
 rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAACAAIAHXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAADHcIEAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw,
 cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
 result: null, initMsid: 161333667508, completeMsid: null, lastUpdated:
 null, lastPolled: null, created: Mon Feb 02 12:58:55 EST 2015}, job
 origin:244
 
 
 *com.cloud.exception.AgentUnavailableException: Resource [Host:1] is
 unreachable: Host 1: Unable to start instance due to Unable to start
 VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying
 Caused by: com.cloud.utils.exception.ExecutionException: Unable to start
 VM[DomainRouter|r-29-VM] due to error in finalizeStart, not retrying*
 
 2015

Re: Virtual Routers not starting up after host restart

2015-02-02 Thread Andrei Mikhailovsky
Mohammad, are there any errors on the host side? Can you check whether the VRs 
are being created on the host? Also, check whether you can get the console 
(from the hypervisor, not from the ACS GUI). Perhaps there is a clue there as 
to what is happening. 
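
Since the log shows CheckSshCommand failing, one more thing that could be tried from the KVM host while the VR sits in Starting is a plain TCP connect to the router's SSH port. This is only a rough sketch: the address below is a placeholder for the router's link-local IP, and port 3922 is what I would expect system VMs to use for SSH, so please treat both as assumptions and confirm them for your setup. The helper name tcp_port_open is made up for this example.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the address and connects; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


if __name__ == "__main__":
    # Placeholder address: replace with the link-local IP of the router.
    # Port 3922 is an assumption about the system VM SSH port.
    print(tcp_port_open("169.254.0.10", 3922))
```

If this returns False while the VR is visibly running on the host, that would point towards something between the host and the router (firewall rules, bridge setup) rather than the router failing to boot.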

By the way, are your other system VMs working okay, such as the SSVM and CPVM? 

Andrei 

- Original Message -

 From: Mohammad Rastgoo moham...@synapti.ca
 To: users@cloudstack.apache.org
 Sent: Monday, 2 February, 2015 7:42:13 PM
 Subject: Re: Virtual Routers not starting up after host restart

 UP and green.


Re: Virtual Routers not starting up after host restart

2015-02-02 Thread Andrei Mikhailovsky
Mohammad, what does the management server log say when you try to start the 
VRs? It should give a clue as to why they are not starting. 
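
The log is noisy, so it may help to pull out only the WARN and ERROR entries for the job that starts the router. Below is a minimal sketch, assuming each entry begins with a timestamp, level, logger in square brackets and a context in parentheses, as management server entries usually do. The function name filter_entries and the job number are only for illustration.

```python
import re

# Start of an entry, for example:
# 2015-02-02 13:19:17,585 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-... job-244/job-245 ctx-...) text
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<ctx>[^)]*)\)\s*(?P<msg>.*)$"
)


def filter_entries(lines, job_id, levels=("WARN", "ERROR")):
    """Return (timestamp, level, logger, message) tuples for entries that
    mention job-<job_id> and have one of the given levels."""
    wanted = re.compile(r"\bjob-%d\b" % job_id)
    results = []
    keep = False
    for raw in lines:
        line = raw.rstrip("\n")
        m = ENTRY.match(line)
        if m:
            # A new entry starts here; decide whether to keep it.
            keep = m["level"] in levels and bool(wanted.search(line))
            if keep:
                results.append([m["ts"], m["level"], m["logger"], m["msg"]])
        elif keep and line.strip():
            # Continuation line (e.g. a stack trace) of a kept entry.
            results[-1][3] += "\n" + line
    return [tuple(r) for r in results]
```

It can be fed an open file, e.g. filter_entries(open(path), 245), where path is wherever your management server log lives (commonly /var/log/cloudstack/management/management-server.log on packaged installs, but check yours).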

Andrei 

- Original Message -

 From: Mohammad Rastgoo moham...@synapti.ca
 To: users@cloudstack.apache.org
 Sent: Monday, 2 February, 2015 6:06:41 PM
 Subject: Virtual Routers not starting up after host restart

 Hi,

 Thanks for reading this.

 I have this setup:
 server 1: MS + DB
 server 2: secondary storage NFS
 server 3: kvm - local primary
 (all centos 6.6)
 net1: isolated network 10.0.0.0/x
 net2: shared network (public ip)

 Here are the steps I took:

 1- stopped all VMs
 2- stopped system VMs (not VRs)
 3- yum updated glibc + reboot on all servers

 Now here is the situation: net2 has remained in the Setup state and net1 in
 the Allocated state.

 The system VMs are back on. The VRs go to Starting and then Stopped.

 So far, I have deleted the VRs and restarted the networks with cleanup. No
 luck.

 Has anyone encountered the same problem? Am I missing anything here?

 Any help is highly appreciated. Thanks.

 --
 Mohammad Rastgoo