[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-8949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Farrington updated CLOUDSTACK-8949:
----------------------------------------
    Description: 
when creating a "MigrateVolumeCommand" a timeout should be set, this way the 
ACS would not use the default value of parameter "wait".  That timeout value 
should be the same as the one used on CitrixResourceBase and its children to 
control the migration of volumes.

Environment:    XenServer 6.2, multiple pre-setup primary storage locations 
that are iscsi SRs in XenServer.

Summary of actions:      Attempt to move a volume that is attached to a running 
instance to another primary storage location.


What *I* assume should happen:    Cloudstack communicates with the host of the 
VM in question and issues a xe vdi-pool-migrate.   The command is synchronous 
and doesn't return control back until it has performed the action.   Cloudstack 
should know this is a synchronous call and wait the configured amount of time 
for the migration to be performed.


What *I* think is happening(with some education from Rafael Weingärtner):    
There are multiple timers being set in the process (please reference  




{noformat}
2015-10-12 16:41:20,456 DEBUG [c.c.a.t.Request] (Job-Executor-63:ctx-f7b6817d 
ctx-c6b92515) Seq 38-996939857: Sending  { Cmd , MgmtId: 42756806312036, via: 
38(xen-nc-bc2b7), Ver: v1, Flags: 100111, 
[{"com.cloud.agent.api.storage.MigrateVolumeCommand":{"volumeId":808,"volumePath":"0cd3ec8c-9fa9-4caf-8380-1a85cdfd0958","pool":{"id":246,"uuid":"VNX_PR5_LUN2003","host":"localhost","path":"/VNX_PR5_LUN2003","port":0,"type":"PreSetup"},"attachedVmName":"i-34-311-VM","wait":0}}]
 }
2015-10-12 16:41:20,456 DEBUG [c.c.a.t.Request] (Job-Executor-63:ctx-f7b6817d 
ctx-c6b92515) Seq 38-996939857: Executing:  { Cmd , MgmtId: 42756806312036, 
via: 38(xen-nc-bc2b7), Ver: v1, Flags: 100111, 
[{"com.cloud.agent.api.storage.MigrateVolumeCommand":{"volumeId":808,"volumePath":"0cd3ec8c-9fa9-4caf-8380-1a85cdfd0958","pool":{"id":246,"uuid":"VNX_PR5_LUN2003","host":"localhost","path":"/VNX_PR5_LUN2003","port":0,"type":"PreSetup"},"attachedVmName":"i-34-311-VM","wait":0}}]
 }
2015-10-12 16:41:20,457 DEBUG [c.c.a.m.DirectAgentAttache] 
(DirectAgent-494:ctx-5aac3f29) Seq 38-996939857: Executing request
2015-10-12 17:41:20,457 DEBUG [c.c.a.m.AgentAttache] 
(Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Seq 38-996939857: Waiting some more 
time because this is the current command
2015-10-12 18:41:20,457 DEBUG [c.c.a.m.AgentAttache] 
(Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Seq 38-996939857: Waiting some more 
time because this is the current command
2015-10-12 18:41:20,457 INFO  [c.c.u.e.CSExceptionErrorCode] 
(Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Could not find exception: 
com.cloud.exception.OperationTimedoutException in error code list for exceptions
2015-10-12 18:41:20,465 WARN  [c.c.a.m.AgentAttache] 
(Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Seq 38-996939857: Timed out on Seq 
38-996939857:  { Cmd , MgmtId: 42756806312036, via: 38(xen-nc-bc2b7), Ver: v1, 
Flags: 100111, 
[{"com.cloud.agent.api.storage.MigrateVolumeCommand":{"volumeId":808,"volumePath":"0cd3ec8c-9fa9-4caf-8380-1a85cdfd0958","pool":{"id":246,"uuid":"VNX_PR5_LUN2003","host":"localhost","path":"/VNX_PR5_LUN2003","port":0,"type":"PreSetup"},"attachedVmName":"i-34-311-VM","wait":0}}]
 }
2015-10-12 18:41:20,465 DEBUG [c.c.a.m.AgentAttache] 
(Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Seq 38-996939857: Cancelling.
2015-10-12 18:41:20,465 DEBUG [c.c.a.m.AgentAttache] 
(Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Seq 38-996939857: No more commands 
found
2015-10-12 18:41:20,465 DEBUG [o.a.c.s.RemoteHostEndPoint] 
(Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Failed to send command, due to 
Agent:38, com.cloud.exception.OperationTimedoutException: Commands 996939857 to 
Host 38 timed out after 7200
2015-10-12 18:41:20,471 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] 
(Job-Executor-63:ctx-f7b6817d ctx-c6b92515) copy failed
com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to 
Agent:38, com.cloud.exception.OperationTimedoutException: Commands 996939857 to 
Host 38 timed out after 7200
        at 
org.apache.cloudstack.storage.RemoteHostEndPoint.sendMessage(RemoteHostEndPoint.java:116)
        at 
org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.migrateVolumeToPool(AncientDataMotionStrategy.java:382)
        at 
org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyAsync(AncientDataMotionStrategy.java:421)
        at 
org.apache.cloudstack.storage.motion.DataMotionServiceImpl.copyAsync(DataMotionServiceImpl.java:70)
        at 
org.apache.cloudstack.storage.volume.VolumeServiceImpl.migrateVolume(VolumeServiceImpl.java:931)
        at 
com.cloud.storage.VolumeApiServiceImpl.liveMigrateVolume(VolumeApiServiceImpl.java:1680)
        at 
com.cloud.storage.VolumeApiServiceImpl.orchestrateMigrateVolume(VolumeApiServiceImpl.java:1666)
        at 
com.cloud.storage.VolumeApiServiceImpl.migrateVolume(VolumeApiServiceImpl.java:1622)
        at sun.reflect.GeneratedMethodAccessor335.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at 
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
        at 
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
        at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
        at 
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
        at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
        at 
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at com.sun.proxy.$Proxy196.migrateVolume(Unknown Source)
        at 
org.apache.cloudstack.api.command.user.volume.MigrateVolumeCmd.execute(MigrateVolumeCmd.java:103)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161)
        at 
com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109)
        at 
com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at 
com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63)
        at 
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:509)
        at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:701)
2015-10-12 18:41:20,479 WARN  [o.a.c.s.d.ObjectInDataStoreManagerImpl] 
(Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Unsupported data object (VOLUME, 
org.apache.cloudstack.storage.datastore.PrimaryDataStoreImpl@4fa7a45f), no need 
to delete from object in store ref table
2015-10-12 18:41:20,479 DEBUG [c.c.s.VolumeApiServiceImpl] 
(Job-Executor-63:ctx-f7b6817d ctx-c6b92515) migrate volume 
failed:com.cloud.utils.exception.CloudRuntimeException: Failed to send command, 
due to Agent:38, com.cloud.exception.OperationTimedoutException: Commands 
996939857 to Host 38 timed out after 7200
2015-10-12 18:41:20,480 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Job-Executor-63:ctx-f7b6817d) Complete async job-5257, jobStatus: FAILED, 
resultCode: 530, result: 
org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed
 to migrate volume"}
2015-10-12 18:41:20,486 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Job-Executor-63:ctx-f7b6817d) Done executing 
org.apache.cloudstack.api.command.user.volume.MigrateVolumeCmd for job-5257
2015-10-12 18:41:20,489 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
(Job-Executor-63:ctx-f7b6817d) Remove job-5257 from job monitoring
{noformat}


  was:
when creating a "MigrateVolumeCommand" a timeout should be set, this way the 
ACS would not use the default value of parameter "wait".  That timeout value 
should be the same as the one used on CitrixResourceBase and its children to 
control the migration of volumes.



> Job timeout caused by improper/inconsistent application of wait parameters
> --------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-8949
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-8949
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Hypervisor Controller
>    Affects Versions: 4.3.0
>         Environment: XenServer 6.2
>            Reporter: Ryan Farrington
>            Priority: Minor
>
> when creating a "MigrateVolumeCommand" a timeout should be set, this way the 
> ACS would not use the default value of parameter "wait".  That timeout value 
> should be the same as the one used on CitrixResourceBase and its children to 
> control the migration of volumes.
> Environment:    XenServer 6.2, multiple pre-setup primary storage locations 
> that are iscsi SRs in XenServer.
> Summary of actions:      Attempt to move a volume that is attached to a 
> running instance to another primary storage location.
> What *I* assume should happen:    Cloudstack communicates with the host of 
> the VM in question and issues a xe vdi-pool-migrate.   The command is 
> synchronous and doesn't return control back until it has performed the 
> action.   Cloudstack should know this is a synchronous call and wait the 
> configured amount of time for the migration to be performed.
> What *I* think is happening(with some education from Rafael Weingärtner):    
> There are multiple timers being set in the process (please reference  
> {noformat}
> 2015-10-12 16:41:20,456 DEBUG [c.c.a.t.Request] (Job-Executor-63:ctx-f7b6817d 
> ctx-c6b92515) Seq 38-996939857: Sending  { Cmd , MgmtId: 42756806312036, via: 
> 38(xen-nc-bc2b7), Ver: v1, Flags: 100111, 
> [{"com.cloud.agent.api.storage.MigrateVolumeCommand":{"volumeId":808,"volumePath":"0cd3ec8c-9fa9-4caf-8380-1a85cdfd0958","pool":{"id":246,"uuid":"VNX_PR5_LUN2003","host":"localhost","path":"/VNX_PR5_LUN2003","port":0,"type":"PreSetup"},"attachedVmName":"i-34-311-VM","wait":0}}]
>  }
> 2015-10-12 16:41:20,456 DEBUG [c.c.a.t.Request] (Job-Executor-63:ctx-f7b6817d 
> ctx-c6b92515) Seq 38-996939857: Executing:  { Cmd , MgmtId: 42756806312036, 
> via: 38(xen-nc-bc2b7), Ver: v1, Flags: 100111, 
> [{"com.cloud.agent.api.storage.MigrateVolumeCommand":{"volumeId":808,"volumePath":"0cd3ec8c-9fa9-4caf-8380-1a85cdfd0958","pool":{"id":246,"uuid":"VNX_PR5_LUN2003","host":"localhost","path":"/VNX_PR5_LUN2003","port":0,"type":"PreSetup"},"attachedVmName":"i-34-311-VM","wait":0}}]
>  }
> 2015-10-12 16:41:20,457 DEBUG [c.c.a.m.DirectAgentAttache] 
> (DirectAgent-494:ctx-5aac3f29) Seq 38-996939857: Executing request
> 2015-10-12 17:41:20,457 DEBUG [c.c.a.m.AgentAttache] 
> (Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Seq 38-996939857: Waiting some 
> more time because this is the current command
> 2015-10-12 18:41:20,457 DEBUG [c.c.a.m.AgentAttache] 
> (Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Seq 38-996939857: Waiting some 
> more time because this is the current command
> 2015-10-12 18:41:20,457 INFO  [c.c.u.e.CSExceptionErrorCode] 
> (Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Could not find exception: 
> com.cloud.exception.OperationTimedoutException in error code list for 
> exceptions
> 2015-10-12 18:41:20,465 WARN  [c.c.a.m.AgentAttache] 
> (Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Seq 38-996939857: Timed out on 
> Seq 38-996939857:  { Cmd , MgmtId: 42756806312036, via: 38(xen-nc-bc2b7), 
> Ver: v1, Flags: 100111, 
> [{"com.cloud.agent.api.storage.MigrateVolumeCommand":{"volumeId":808,"volumePath":"0cd3ec8c-9fa9-4caf-8380-1a85cdfd0958","pool":{"id":246,"uuid":"VNX_PR5_LUN2003","host":"localhost","path":"/VNX_PR5_LUN2003","port":0,"type":"PreSetup"},"attachedVmName":"i-34-311-VM","wait":0}}]
>  }
> 2015-10-12 18:41:20,465 DEBUG [c.c.a.m.AgentAttache] 
> (Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Seq 38-996939857: Cancelling.
> 2015-10-12 18:41:20,465 DEBUG [c.c.a.m.AgentAttache] 
> (Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Seq 38-996939857: No more 
> commands found
> 2015-10-12 18:41:20,465 DEBUG [o.a.c.s.RemoteHostEndPoint] 
> (Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Failed to send command, due to 
> Agent:38, com.cloud.exception.OperationTimedoutException: Commands 996939857 
> to Host 38 timed out after 7200
> 2015-10-12 18:41:20,471 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] 
> (Job-Executor-63:ctx-f7b6817d ctx-c6b92515) copy failed
> com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due 
> to Agent:38, com.cloud.exception.OperationTimedoutException: Commands 
> 996939857 to Host 38 timed out after 7200
>         at 
> org.apache.cloudstack.storage.RemoteHostEndPoint.sendMessage(RemoteHostEndPoint.java:116)
>         at 
> org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.migrateVolumeToPool(AncientDataMotionStrategy.java:382)
>         at 
> org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyAsync(AncientDataMotionStrategy.java:421)
>         at 
> org.apache.cloudstack.storage.motion.DataMotionServiceImpl.copyAsync(DataMotionServiceImpl.java:70)
>         at 
> org.apache.cloudstack.storage.volume.VolumeServiceImpl.migrateVolume(VolumeServiceImpl.java:931)
>         at 
> com.cloud.storage.VolumeApiServiceImpl.liveMigrateVolume(VolumeApiServiceImpl.java:1680)
>         at 
> com.cloud.storage.VolumeApiServiceImpl.orchestrateMigrateVolume(VolumeApiServiceImpl.java:1666)
>         at 
> com.cloud.storage.VolumeApiServiceImpl.migrateVolume(VolumeApiServiceImpl.java:1622)
>         at sun.reflect.GeneratedMethodAccessor335.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:622)
>         at 
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at 
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>         at 
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>         at com.sun.proxy.$Proxy196.migrateVolume(Unknown Source)
>         at 
> org.apache.cloudstack.api.command.user.volume.MigrateVolumeCmd.execute(MigrateVolumeCmd.java:103)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161)
>         at 
> com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109)
>         at 
> com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66)
>         at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at 
> com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63)
>         at 
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:509)
>         at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>         at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:701)
> 2015-10-12 18:41:20,479 WARN  [o.a.c.s.d.ObjectInDataStoreManagerImpl] 
> (Job-Executor-63:ctx-f7b6817d ctx-c6b92515) Unsupported data object (VOLUME, 
> org.apache.cloudstack.storage.datastore.PrimaryDataStoreImpl@4fa7a45f), no 
> need to delete from object in store ref table
> 2015-10-12 18:41:20,479 DEBUG [c.c.s.VolumeApiServiceImpl] 
> (Job-Executor-63:ctx-f7b6817d ctx-c6b92515) migrate volume 
> failed:com.cloud.utils.exception.CloudRuntimeException: Failed to send 
> command, due to Agent:38, com.cloud.exception.OperationTimedoutException: 
> Commands 996939857 to Host 38 timed out after 7200
> 2015-10-12 18:41:20,480 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
> (Job-Executor-63:ctx-f7b6817d) Complete async job-5257, jobStatus: FAILED, 
> resultCode: 530, result: 
> org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed
>  to migrate volume"}
> 2015-10-12 18:41:20,486 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
> (Job-Executor-63:ctx-f7b6817d) Done executing 
> org.apache.cloudstack.api.command.user.volume.MigrateVolumeCmd for job-5257
> 2015-10-12 18:41:20,489 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
> (Job-Executor-63:ctx-f7b6817d) Remove job-5257 from job monitoring
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to