Hi,

Something bizarre is happening to my CloudStack (CS) installation. I restarted the CS management server and now the virtual router (VR) shows as stopped under Infrastructure. When I start it, it dies. In the management server log I find the following:
2016-01-19 11:47:13,987 DEBUG [c.c.a.ApiServlet] (catalina-exec-12:ctx-ed3f1fe7) ===START=== 10.29.128.147 -- GET command=restartNetwork&id=73ee6a4f-6fa1-42a1-a6b5-b273b8fea8f6&cleanup=false&response=json&_=1453196733255
2016-01-19 11:47:14,100 INFO [o.a.c.f.j.i.AsyncJobMonitor] (API-Job-Executor-3:ctx-1ab47dd8 job-2767) Add job-2767 into job monitoring
2016-01-19 11:47:14,116 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (catalina-exec-12:ctx-ed3f1fe7 ctx-d676ceae) submit async job-2767, details: AsyncJobVO {id:2767, userId: 2, accountId: 2, instanceType: None, instanceId: null, cmd: org.apache.cloudstack.api.command.user.network.RestartNetworkCmd, cmdInfo: {"id":"73ee6a4f-6fa1-42a1-a6b5-b273b8fea8f6","response":"json","cleanup":"false","ctxDetails":"{\"com.cloud.network.Network\":\"73ee6a4f-6fa1-42a1-a6b5-b273b8fea8f6\"}","cmdEventType":"NETWORK.RESTART","ctxUserId":"2","httpmethod":"GET","_":"1453196733255","uuid":"73ee6a4f-6fa1-42a1-a6b5-b273b8fea8f6","ctxAccountId":"2","ctxStartEventId":"7033"}, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 7063056740127, completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
2016-01-19 11:47:14,118 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-3:ctx-1ab47dd8 job-2767) Executing AsyncJobVO {id:2767, userId: 2, accountId: 2, instanceType: None, instanceId: null, cmd: org.apache.cloudstack.api.command.user.network.RestartNetworkCmd, cmdInfo: {"id":"73ee6a4f-6fa1-42a1-a6b5-b273b8fea8f6","response":"json","cleanup":"false","ctxDetails":"{\"com.cloud.network.Network\":\"73ee6a4f-6fa1-42a1-a6b5-b273b8fea8f6\"}","cmdEventType":"NETWORK.RESTART","ctxUserId":"2","httpmethod":"GET","_":"1453196733255","uuid":"73ee6a4f-6fa1-42a1-a6b5-b273b8fea8f6","ctxAccountId":"2","ctxStartEventId":"7033"}, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 7063056740127, completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
2016-01-19 11:47:14,122 DEBUG [c.c.a.ApiServlet] (catalina-exec-12:ctx-ed3f1fe7 ctx-d676ceae) ===END=== 10.29.128.147 -- GET command=restartNetwork&id=73ee6a4f-6fa1-42a1-a6b5-b273b8fea8f6&cleanup=false&response=json&_=1453196733255
2016-01-19 11:47:14,238 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Restarting network 204...
2016-01-19 11:47:14,238 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Skip the shutting down of network id=204
2016-01-19 11:47:14,239 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Implementing the network Ntwk[204|Guest|6] elements and resources as a part of network restart
2016-01-19 11:47:14,262 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Asking SecurityGroupProvider to implemenet Ntwk[204|Guest|6]
2016-01-19 11:47:14,271 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Asking VirtualRouter to implemenet Ntwk[204|Guest|6]
2016-01-19 11:47:14,286 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Lock is acquired for network id 204 as a part of router startup in Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))] : Dest[Zone(1)-Pod(null)-Cluster(null)-Host(null)-Storage()]
2016-01-19 11:47:14,348 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Adding nic for Virtual Router in Guest network Ntwk[204|Guest|6]
2016-01-19 11:47:14,373 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Adding nic for Virtual Router in Control network
2016-01-19 11:47:14,393 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Found existing network configuration for offering [Network Offering [3-Control-System-Control-Network]: Ntwk[202|Control|3]
2016-01-19 11:47:14,394 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Releasing lock for Acct[e84838fc-b4fa-11e3-a020-066c7efcef1f-system]
2016-01-19 11:47:14,448 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Allocating the VR i=1615 in datacenter com.cloud.dc.DataCenterVO$$EnhancerByCGLIB$$af5ce0bd@1with the hypervisor type XenServer
2016-01-19 11:47:14,461 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) XenServer won't support system vm, skip it
2016-01-19 11:47:14,465 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Lock is released for network id 204 as a part of router startup in Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))] : Dest[Zone(1)-Pod(null)-Cluster(null)-Host(null)-Storage()]
2016-01-19 11:47:14,466 WARN [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Failed to implement network Ntwk[204|Guest|6] elements and resources as a part of network restart due to com.cloud.exception.ResourceUnavailableException: Resource [DataCenter:1] is unreachable: Can't find all necessary running routers!
        at com.cloud.network.element.VirtualRouterElement.implement(VirtualRouterElement.java:202)
        at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.implementNetworkElementsAndResources(NetworkOrchestrator.java:1103)
        at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.restartNetwork(NetworkOrchestrator.java:2546)
        at com.cloud.network.NetworkServiceImpl.restartNetwork(NetworkServiceImpl.java:1891)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
        at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:106)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
        at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:51)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at com.sun.proxy.$Proxy157.restartNetwork(Unknown Source)
        at org.apache.cloudstack.api.command.user.network.RestartNetworkCmd.execute(RestartNetworkCmd.java:95)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:141)
        at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2016-01-19 11:47:14,472 WARN [c.c.n.NetworkServiceImpl] (API-Job-Executor-3:ctx-1ab47dd8 job-2767 ctx-50a90eaf) Network id=204 failed to restart.
2016-01-19 11:47:14,495 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-3:ctx-1ab47dd8 job-2767) Complete async job-2767, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed to restart network"}

Also I get the following notifications:

Insufficient capacity to restart VM, name: r-96-VM, id: 96 which was running on host name: cloudsrv1.afdb.org(id:10), availability zone: DEVZONE, pod: DEV_POD

and

An out of band migration of router r-96-VM(1f0fe028-f20b-41ed-a125-d5d7e63c9595) was detected. No automated action was performed.
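
In case it helps anyone trying to reproduce this, here is a rough Python sketch of how the WARN/ERROR entries for one async job could be pulled out of a log like the one above. The entry layout (timestamp, level, bracketed logger name) is assumed from the pasted lines only, and the function names `split_entries` and `entries_for_job` are hypothetical helpers, not part of CloudStack.

```python
import re

# Assumed start of an entry, based on the lines pasted above, e.g.
# "2016-01-19 11:47:14,466 WARN [o.a.c.e.o.NetworkOrchestrator] (...) message"
ENTRY_START = re.compile(
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} [A-Z]+ +\["
)


def split_entries(text):
    """Split raw log text into entries; stack-trace lines stay with their entry."""
    starts = [m.start() for m in ENTRY_START.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]


def entries_for_job(text, job_id, levels=("WARN", "ERROR")):
    """Return entries at the given levels that mention job-<job_id>."""
    job_tag = re.compile(r"\bjob-%d\b" % job_id)
    hits = []
    for entry in split_entries(text):
        # Fields are: date, time, level, then the rest of the entry.
        parts = entry.split(None, 3)
        if len(parts) >= 3 and parts[2] in levels and job_tag.search(entry):
            hits.append(entry)
    return hits
```

Calling `entries_for_job(log_text, 2767)` on the text of the management server log should then leave only the two WARN entries for the failed restart, with the stack trace attached to the first one. Reading the log file into `log_text` is left out because the file location depends on the installation.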
I've tried to recreate the VR by destroying it and restarting the network with cleanup=false, but this does not resolve the issue. I've tried reinstalling the system template, but again no luck. I moved the systemvm.iso away and tried reinstalling the system template, but this does not recreate the systemvm.iso file.

I'm now stuck with the system regularly sending me notifications about insufficient capacity and out of band migration. Please help if you have seen this before or have a suggestion.

TIA
Osay