[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajani Karuturi updated CLOUDSTACK-5504:
----------------------------------------
    Fix Version/s:     (was: 4.6.0)

> VMware - Primary store unavailable for 10 minutes - All snapshot tasks reported 
> failure because of timing out after 20 minutes. But the snapshot process 
> continues to succeed in vCenter after NFS was brought up.
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5504
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5504
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Management Server
>    Affects Versions: 4.3.0
>         Environment: Build from 4.3
>            Reporter: Sangeetha Hariharan
>         Attachments: primarydown.rar
>
>
> Setup:
> Advanced zone set up with 2 ESXi 5.1 hosts.
> 1. Deploy a few VMs on each of the hosts, so we start with 11 VMs.
> 2. Create snapshots of the ROOT volumes.
> 3. While the snapshots are still in progress, make the primary storage 
> unavailable for 10 minutes.
> 4. Bring up the primary store after more than 10 minutes.
> When the primary store was brought up, I saw that the snapshots that were in 
> progress actually continued to download to secondary storage and succeeded.
> One of the snapshots that succeeded and is fully available in the secondary store:
> [root@Rack3Host8 1c545037-1d1c-4927-918a-2f3975e1076b]# ls -ltr
> total 446808
> -rw-r--r--. 1 root root      6454 Dec 13 21:09 1c545037-1d1c-4927-918a-2f3975e1076b.ovf
> -rw-r--r--. 1 root root 457069056 Dec 13 21:09 1c545037-1d1c-4927-918a-2f3975e1076b-disk0.vmdk
> [root@Rack3Host8 1c545037-1d1c-4927-918a-2f3975e1076b]#
> However, all 11 snapshot tasks on the CloudStack side report failure after 
> about 20 minutes, and the snapshots are then left in the "CreatedOnPrimary" state.
> The next scheduled hourly snapshot is attempted and succeeds.
> |        22 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        21 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        20 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        19 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        18 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL                |
> |        17 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL                |
> |        16 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL                |
> |        14 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL                |
> |        25 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL                |
> |        24 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL                |
> |        23 | CreatedOnPrimary | 2013-12-13 21:52:18 | NULL                |
> |        22 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        21 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        20 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        19 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        18 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        17 | BackedUp         | 2013-12-13 22:42:16 | NULL                |
> |        16 | BackedUp         | 2013-12-13 22:42:16 | NULL                |
> |        14 | BackedUp         | 2013-12-13 22:42:16 | NULL                |
> |        25 | BackedUp         | 2013-12-13 22:42:17 | NULL                |
> |        24 | BackedUp         | 2013-12-13 22:42:17 | NULL                |
> |        23 | BackedUp         | 2013-12-13 22:42:17 | NULL                |
> +-----------+------------------+---------------------+---------------------+
> 86 rows in set (0.00 sec)
> 2013-12-13 16:52:17,720 DEBUG [c.c.a.t.Request] (Job-Executor-5:ctx-1d3bd6cc 
> ctx-837a59a5) Seq 5-422576170: Sending  { Cmd , MgmtId: 95307354844397, via: 
> 5(s-10-VM), Ver: v1, Flags: 100011, 
> [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"ffa0b125-d1d9-4524-bd9e-03178914845b","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed]
>  
> i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"0476f8f4-61b4-40d2-8e04-001a119fc294","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"destTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"snapshots/4/18","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed]
>  
> i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"snapshots/4/18/730b8477-d29f-45e8-b2a2-fb26e2ac220b/730b8477-d29f-45e8-b2a2-fb26e2ac220b","dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://10.223.57.194/export/home/vmware/secondary","_role":"Image"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"executeInSequence":false,"wait":21600}}]
>  }
> 2013-12-13 17:13:44,250 DEBUG [c.c.a.m.AgentAttache] 
> (AgentConnectTaskPool-3:ctx-1a3a5155) Seq 5-422576170: Sending disconnect to 
> class com.cloud.agent.manager.SynchronousListener
> 2013-12-13 17:13:44,252 DEBUG [c.c.a.m.AgentAttache] 
> (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Waiting some more 
> time because this is the current command
> 2013-12-13 17:13:44,252 DEBUG [c.c.a.m.AgentAttache] 
> (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Waiting some more 
> time because this is the current command
> 2013-12-13 17:13:44,252 INFO  [c.c.u.e.CSExceptionErrorCode] 
> (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Could not find exception: 
> com.cloud.exception.OperationTimedoutException in error code list for 
> exceptions
> 2013-12-13 17:13:44,256 WARN  [c.c.a.m.AgentAttache] 
> (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Timed out on Seq 
> 5-422576170:  { Cmd , MgmtId: 95307354844397, via: 5(s-10-VM), Ver: v1, 
> Flags: 100011, 
> [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"ffa0b125-d1d9-4524-bd9e-03178914845b","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed]
>  
> i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"0476f8f4-61b4-40d2-8e04-001a119fc294","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"destTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"snapshots/4/18","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed]
>  
> i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"snapshots/4/18/730b8477-d29f-45e8-b2a2-fb26e2ac220b/730b8477-d29f-45e8-b2a2-fb26e2ac220b","dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://10.223.57.194/export/home/vmware/secondary","_role":"Image"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"executeInSequence":false,"wait":21600}}]
>  }
> 2013-12-13 17:13:44,294 DEBUG [c.c.a.m.AgentAttache] 
> (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Cancelling.
> 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.RemoteHostEndPoint] 
> (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Failed to send command, due to 
> Agent:2, com.cloud.exception.OperationTimedoutException: Commands 422576170 
> to Host 5 timed out after 43200
> 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] 
> (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) copy snasphot failed: 
> com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due 
> to Agent:2, com.cloud.exception.OperationTimedoutException: Commands 
> 422576170 to Host 5 timed out after 43200
> 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] 
> (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) copy failed
> 2013-12-13 17:13:44,440 DEBUG [c.c.s.s.SnapshotManagerImpl] 
> (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Failed to create snapshot
> com.cloud.utils.exception.CloudRuntimeException: 
> com.cloud.utils.exception.CloudRuntimeException: 
> com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due 
> to Agent:2, com.cloud.exception.OperationTimedoutException: Commands 
> 422576170 to Host 5 timed out after 43200
>         at 
> org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:275)
>         at 
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:135)
>         at 
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:294)
>         at 
> com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:951)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at 
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at 
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>         at 
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>         at $Proxy161.takeSnapshot(Unknown Source)
>         at 
> org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1341)
>         at 
> com.cloud.storage.VolumeApiServiceImpl.takeSnapshot(VolumeApiServiceImpl.java:1486)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at 
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at 
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
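
The "about 20 minutes" figure in the description can be checked against the two log timestamps for Seq 5-422576170: the "Sending" line and the "Timed out on Seq" line. The sketch below is only an illustration; the helper name elapsed_seconds is hypothetical, and the timestamp format is assumed from the log lines quoted above. How the result relates to the "wait":21600 value in the command and the 43200 in the exception message is not something this excerpt settles.

```python
from datetime import datetime

# Timestamps copied from the quoted management-server log for Seq 5-422576170.
SENT = "2013-12-13 16:52:17,720"       # "Sending { Cmd , ..." line
TIMED_OUT = "2013-12-13 17:13:44,256"  # "Timed out on Seq 5-422576170" line

# Assumed log timestamp layout: date, time, comma, milliseconds.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps (hypothetical helper)."""
    delta = datetime.strptime(end, LOG_FORMAT) - datetime.strptime(start, LOG_FORMAT)
    return delta.total_seconds()


if __name__ == "__main__":
    seconds = elapsed_seconds(SENT, TIMED_OUT)
    print(f"elapsed: {seconds:.3f} s ({seconds / 60:.1f} min)")
```

For these two lines the gap works out to a little over 21 minutes, which is consistent with the reporter's "about 20 minutes".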



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
