[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14989542#comment-14989542
 ] 

dsclose commented on CLOUDSTACK-9024:
-------------------------------------

This appears to have been introduced as part of CLOUDSTACK-6433.
Relevant commit: 
https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=59a9db3
Commit title: Don't return success if only one RvR builds successfully.

The point is that the network restart must still succeed when only one virtual 
router is built, as happens when a single missing router is replaced. This 
matches the CloudStack documentation on how to deal with faulty routers:

"If you are sure that a virtual router is down forever, or no longer functions 
as expected, destroy it. ... Recreate the missing router by using the 
restartNetwork API with cleanup=false parameter."
http://cloudstack-administration.readthedocs.org/en/latest/troubleshooting.html#recovering-a-lost-virtual-router
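
For anyone reproducing this through the API rather than the UI, here is a 
minimal sketch of building a signed restartNetwork request with cleanup=false. 
It assumes the usual CloudStack request signing scheme (HMAC-SHA1 over the 
sorted, lower-cased query string). The function name, endpoint URL, keys and 
network id are placeholders of mine, not values from this issue.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def build_restart_network_url(endpoint, api_key, secret_key, network_id):
    """Return a signed restartNetwork URL with cleanup=false (illustrative helper)."""
    params = {
        "command": "restartNetwork",
        "id": network_id,
        "cleanup": "false",  # keep the surviving router, only recreate the missing one
        "response": "json",
        "apiKey": api_key,
    }
    # Sort by parameter name (case-insensitively) and URL-encode each value.
    query = "&".join(
        f"{name}={quote(str(value), safe='')}"
        for name, value in sorted(params.items(), key=lambda kv: kv[0].lower())
    )
    # Sign the lower-cased query string with HMAC-SHA1, then base64-encode it.
    digest = hmac.new(secret_key.encode(), query.lower().encode(), hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode()
    # The signature itself must be URL-encoded before it is appended.
    return f"{endpoint}?{query}&signature={quote(signature, safe='')}"


if __name__ == "__main__":
    # All four arguments are hypothetical placeholders.
    print(build_restart_network_url(
        "http://localhost:8080/client/api", "my-api-key", "my-secret-key", "network-uuid"))
```

The call is asynchronous, so the response only carries a job id. The failure 
described in this issue would show up in the job result, not in the initial 
response.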

> Restart network fails if redundant router is missing
> ----------------------------------------------------
>
>                 Key: CLOUDSTACK-9024
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9024
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: API, Network Controller, Virtual Router
>    Affects Versions: 4.5.2
>         Environment: Cloudstack installed on CentOS 6.5
>            Reporter: dsclose
>
> The restart network action fails if a network is missing a redundant virtual 
> router. This occurs whether it is triggered via the UI (Networks -> Select 
> Network -> Restart -> Clean-up: False -> OK) or via the API.
> Steps to reproduce:
> ------------------------
> 1. Create a redundant router network offering.
> 2. Create a network using the redundant router network offering.
> 3. Destroy a redundant router from the network. Leave one functioning.
> 4. Initiate the restart network action or restartNetwork API call with 
> clean-up set to False.
> Expected Behaviour:
> --------------------------
> Cloudstack should boot a new redundant virtual router to replace the missing 
> router. The Network Restart action should return successfully.
> Actual Behaviour:
> -----------------------
> Cloudstack boots a replacement redundant router but the API call returns 
> unsuccessful. The Web UI reports that the router fails.
> Timeline:
> -----------
> 2015-11-03 17:12:08,256 Destroying router "r-985-VM".
> 2015-11-03 17:12:24,511 Performing network restart.
> 2015-11-03 17:14:02,851 Failed to restart network
> Management Log Sample
> ---------------------------------
> 2015-11-03 17:12:14,943 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
> (Work-Job-Executor-12:ctx-a671c200 job-163/job-164) Remove job-164 from job 
> monitoring
> 2015-11-03 17:12:15,739 INFO  [o.a.c.s.v.VolumeServiceImpl] 
> (API-Job-Executor-12:ctx-33b24483 job-163 ctx-4d95a357) Volume 985 is not 
> referred anywhere, remove it from volumes table
> 2015-11-03 17:12:15,850 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
> (API-Job-Executor-12:ctx-33b24483 job-163) Remove job-163 from job monitoring
> 2015-11-03 17:12:18,698 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
> (API-Job-Executor-13:ctx-c29ad7f0 job-165) Add job-165 into job monitoring
> 2015-11-03 17:12:18,985 INFO  [c.c.n.r.VirtualNetworkApplianceManagerImpl] 
> (API-Job-Executor-13:ctx-c29ad7f0 job-165 ctx-7945f6f9) Use same MAC as 
> previous RvR, the MAC is 06:9c:86:00:00:0e
> 2015-11-03 17:12:19,829 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
> (Work-Job-Executor-13:ctx-06672650 job-165/job-166) Add job-166 into job 
> monitoring
> 2015-11-03 17:12:20,078 INFO  [c.c.s.StorageManagerImpl] 
> (Work-Job-Executor-13:ctx-06672650 job-165/job-166 ctx-81c163bb) Storage pool 
> null (1) does not supply IOPS capacity, assuming enough capacity
> 2015-11-03 17:12:40,248 INFO  [c.c.v.VirtualMachineManagerImpl] 
> (DirectAgentCronJob-492:ctx-1fb6ecea) There is pending job or HA tasks 
> working on the VM. vm id: 992, postpone power-change report by resetting 
> power-change counters
> 2015-11-03 17:13:40,384 INFO  [c.c.v.VirtualMachineManagerImpl] 
> (DirectAgentCronJob-280:ctx-846ef4f0) There is pending job or HA tasks 
> working on the VM. vm id: 992, postpone power-change report by resetting 
> power-change counters
> 2015-11-03 17:13:49,799 INFO  [o.a.c.f.j.i.AsyncJobManagerImpl] 
> (AsyncJobMgr-Heartbeat-1:ctx-10ff7f5f) Begin cleanup expired async-jobs
> 2015-11-03 17:13:49,825 INFO  [o.a.c.f.j.i.AsyncJobManagerImpl] 
> (AsyncJobMgr-Heartbeat-1:ctx-10ff7f5f) End cleanup expired async-jobs
> 2015-11-03 17:13:55,688 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
> (Work-Job-Executor-13:ctx-06672650 job-165/job-166) Remove job-166 from job 
> monitoring
> 2015-11-03 17:13:55,730 WARN  [o.a.c.e.o.NetworkOrchestrator] 
> (API-Job-Executor-13:ctx-c29ad7f0 job-165 ctx-7945f6f9) Failed to implement 
> network Ntwk[208|Guest|15] elements and resources as a part of network 
> restart due to
> com.cloud.exception.ResourceUnavailableException: Resource [DataCenter:1] is 
> unreachable: Can't find all necessary running routers!
>         at 
> com.cloud.network.element.VirtualRouterElement.implement(VirtualRouterElement.java:202)
>         at 
> org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.implementNetworkElementsAndResources(NetworkOrchestrator.java:1103)
>         at 
> org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.restartNetwork(NetworkOrchestrator.java:2546)
>         at 
> com.cloud.network.NetworkServiceImpl.restartNetwork(NetworkServiceImpl.java:1891)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at 
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at 
> org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:106)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>         at 
> com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:51)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>         at 
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>         at 
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>         at com.sun.proxy.$Proxy157.restartNetwork(Unknown Source)
>         at 
> org.apache.cloudstack.api.command.user.network.RestartNetworkCmd.execute(RestartNetworkCmd.java:95)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:141)
>         at 
> com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108)
>         at 
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
>         at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>         at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>         at 
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-11-03 17:13:55,732 WARN  [c.c.n.NetworkServiceImpl] 
> (API-Job-Executor-13:ctx-c29ad7f0 job-165 ctx-7945f6f9) Network id=208 failed 
> to restart.
> 2015-11-03 17:13:55,806 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
> (API-Job-Executor-13:ctx-c29ad7f0 job-165) Remove job-165 from job monitoring
> 2015-11-03 17:14:00,988 INFO  [c.c.n.r.VirtualNetworkApplianceManagerImpl] 
> (RedundantRouterStatusMonitor-7:ctx-5a3246e2) Redundant virtual router (name: 
> r-992-VM, id: 992)  just switch from UNKNOWN to BACKUP



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
