[ https://issues.apache.org/jira/browse/CLOUDSTACK-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rajani Karuturi updated CLOUDSTACK-5504:
----------------------------------------
    Fix Version/s:     (was: 4.6.0)

> Vmware-Primary store unavailable for 10 mts - All snapshot tasks reported failure because of timing out after 20 minutes.But the snapshot process continues to succeed in Vmcenter after NFS was brought up.
> ------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5504
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5504
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>          Components: Management Server
>    Affects Versions: 4.3.0
>         Environment: Build from 4.3
>            Reporter: Sangeetha Hariharan
>         Attachments: primarydown.rar
>
>
> Setup:
> Advanced zone set up with two ESXi 5.1 hosts.
> 1. Deploy a few VMs on each of the hosts, so that we start with 11 VMs.
> 2. Create snapshots of the ROOT volumes.
> 3. While the snapshots are still in progress, make the primary storage unavailable for 10 minutes.
> 4. Bring the primary storage back up after more than 10 minutes.
> When the primary storage was brought back up, the snapshots that were in progress continued to download to secondary storage and succeeded.
> One of the snapshots that succeeded and is fully available in the secondary store:
> [root@Rack3Host8 1c545037-1d1c-4927-918a-2f3975e1076b]# ls -ltr
> total 446808
> -rw-r--r--. 1 root root      6454 Dec 13 21:09 1c545037-1d1c-4927-918a-2f3975e1076b.ovf
> -rw-r--r--. 1 root root 457069056 Dec 13 21:09 1c545037-1d1c-4927-918a-2f3975e1076b-disk0.vmdk
> [root@Rack3Host8 1c545037-1d1c-4927-918a-2f3975e1076b]#
> But all 11 snapshot tasks on the CloudStack side report failure after about 20 minutes, and the snapshots are then left in the "CreatedOnPrimary" state.
> The next scheduled hourly snapshot is attempted and succeeds.
> |        22 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        21 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        20 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        19 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        18 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL                |
> |        17 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL                |
> |        16 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL                |
> |        14 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL                |
> |        25 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL                |
> |        24 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL                |
> |        23 | CreatedOnPrimary | 2013-12-13 21:52:18 | NULL                |
> |        22 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        21 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        20 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        19 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        18 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        17 | BackedUp         | 2013-12-13 22:42:16 | NULL                |
> |        16 | BackedUp         | 2013-12-13 22:42:16 | NULL                |
> |        14 | BackedUp         | 2013-12-13 22:42:16 | NULL                |
> |        25 | BackedUp         | 2013-12-13 22:42:17 | NULL                |
> |        24 | BackedUp         | 2013-12-13 22:42:17 | NULL                |
> |        23 | BackedUp         | 2013-12-13 22:42:17 | NULL                |
> +-----------+------------------+---------------------+---------------------+
> 86 rows in set (0.00 sec)
> 2013-12-13 16:52:17,720 DEBUG [c.c.a.t.Request] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Sending { Cmd , MgmtId: 95307354844397, via: 5(s-10-VM), Ver: v1, Flags: 100011,
> [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"ffa0b125-d1d9-4524-bd9e-03178914845b","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed] i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"0476f8f4-61b4-40d2-8e04-001a119fc294","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"destTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"snapshots/4/18","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed] i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"snapshots/4/18/730b8477-d29f-45e8-b2a2-fb26e2ac220b/730b8477-d29f-45e8-b2a2-fb26e2ac220b","dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://10.223.57.194/export/home/vmware/secondary","_role":"Image"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"executeInSequence":false,"wait":21600}}]
> }
> 2013-12-13 17:13:44,250 DEBUG [c.c.a.m.AgentAttache] (AgentConnectTaskPool-3:ctx-1a3a5155) Seq 5-422576170: Sending disconnect to class com.cloud.agent.manager.SynchronousListener
> 2013-12-13 17:13:44,252 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Waiting some more time because this is the current command
> 2013-12-13 17:13:44,252 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Waiting some more time because this is the current command
> 2013-12-13 17:13:44,252 INFO [c.c.u.e.CSExceptionErrorCode] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Could not find exception: com.cloud.exception.OperationTimedoutException in error code list for exceptions
> 2013-12-13 17:13:44,256 WARN [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Timed out on Seq 5-422576170: { Cmd , MgmtId: 95307354844397, via: 5(s-10-VM), Ver: v1, Flags: 100011,
> [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"ffa0b125-d1d9-4524-bd9e-03178914845b","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed] i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"0476f8f4-61b4-40d2-8e04-001a119fc294","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"destTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"snapshots/4/18","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed] i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"snapshots/4/18/730b8477-d29f-45e8-b2a2-fb26e2ac220b/730b8477-d29f-45e8-b2a2-fb26e2ac220b","dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://10.223.57.194/export/home/vmware/secondary","_role":"Image"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"executeInSequence":false,"wait":21600}}]
> }
> 2013-12-13 17:13:44,294 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Cancelling.
> 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.RemoteHostEndPoint] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Failed to send command, due to Agent:2, com.cloud.exception.OperationTimedoutException: Commands 422576170 to Host 5 timed out after 43200
> 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) copy snasphot failed: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:2, com.cloud.exception.OperationTimedoutException: Commands 422576170 to Host 5 timed out after 43200
> 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) copy failed
> 2013-12-13 17:13:44,440 DEBUG [c.c.s.s.SnapshotManagerImpl] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Failed to create snapshot
> com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:2, com.cloud.exception.OperationTimedoutException: Commands 422576170 to Host 5 timed out after 43200
>     at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:275)
>     at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:135)
>     at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:294)
>     at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:951)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>     at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>     at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>     at $Proxy161.takeSnapshot(Unknown Source)
>     at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1341)
>     at com.cloud.storage.VolumeApiServiceImpl.takeSnapshot(VolumeApiServiceImpl.java:1486)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>     at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)