[ https://issues.apache.org/jira/browse/CLOUDSTACK-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769870#comment-13769870 ]
Sangeetha Hariharan commented on CLOUDSTACK-4651:
-------------------------------------------------

Tested with the latest build:

Deploy a VM.
Initiate snapshot for root volume of this VM.
As soon as the VM snapshot command is issued, kill the management server process.
Start Management server.
Stop this VM.
Start this VM.

Starting the VM fails with the error message "Unable to create deployment, no usable volumes found for the VM".

With the CLOUDSTACK-4650 fixes, the volume remains in the "Snapshotting" state for only a very short time, so getting into this state is a timing issue.

The following exception is seen in the management server logs:

2013-09-17 12:50:31,176 DEBUG [cloud.api.ApiServlet] (catalina-exec-8:null) ===START=== 10.215.3.9 -- GET command=startVirtualMachine&id=4a398643-5f9d-4c75-8174-a9bb62580538&response=json&sessionkey=TkwA%2Bm9SCHW5H3aBJVqsWm4YB68%3D&_=1379448058217
2013-09-17 12:50:31,223 DEBUG [cloud.async.AsyncJobManagerImpl] (catalina-exec-8:null) submit async job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ], details: AsyncJobVO {id:53, userId: 3, accountId: 3, sessionKey: null, instanceType: VirtualMachine, instanceId: 10, cmd: org.apache.cloudstack.api.command.user.vm.StartVMCmd, cmdOriginator: null, cmdInfo: {"id":"4a398643-5f9d-4c75-8174-a9bb62580538","response":"json","sessionkey":"TkwA+m9SCHW5H3aBJVqsWm4YB68\u003d","cmdEventType":"VM.START","ctxUserId":"3","httpmethod":"GET","_":"1379448058217","ctxAccountId":"3","ctxStartEventId":"202"}, cmdVersion: 0, callbackType: 0, callbackAddress: null, status: 0, processStatus: 0, resultCode: 0, result: null, initMsid: 161197867246747, completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
2013-09-17 12:50:31,226 DEBUG [cloud.api.ApiServlet] (catalina-exec-8:null) ===END=== 10.215.3.9 -- GET command=startVirtualMachine&id=4a398643-5f9d-4c75-8174-a9bb62580538&response=json&sessionkey=TkwA%2Bm9SCHW5H3aBJVqsWm4YB68%3D&_=1379448058217
2013-09-17 12:50:31,229 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Executing org.apache.cloudstack.api.command.user.vm.StartVMCmd for job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]
2013-09-17 12:50:31,251 DEBUG [cloud.user.AccountManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Access to VM[User|sangee5] granted to Acct[2db522b2-192b-42bf-a9fd-f4a18fcbb51b-sangee] by DomainChecker_EnhancerByCloudStack_57441200
2013-09-17 12:50:31,266 DEBUG [cloud.network.NetworkModelImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Service SecurityGroup is not supported in the network id=208
2013-09-17 12:50:31,271 DEBUG [cloud.network.NetworkModelImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Service SecurityGroup is not supported in the network id=208
2013-09-17 12:50:31,315 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Deploy avoids pods: [], clusters: [], hosts: []
2013-09-17 12:50:31,318 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) DeploymentPlanner allocation algorithm: com.cloud.deploy.FirstFitPlanner_EnhancerByCloudStack_4e6f8d51@4807e0d7
2013-09-17 12:50:31,319 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Trying to allocate a host and storage pools from dc:2, pod:2,cluster:null, requested cpu: 100, requested ram: 130023424
2013-09-17 12:50:31,319 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Is ROOT volume READY (pool already allocated)?: No
2013-09-17 12:50:31,319 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) This VM has last host_id specified, trying to choose the same host: 7
2013-09-17 12:50:31,402 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Checking if host: 7 has enough capacity for requested CPU: 100 and requested RAM: 130023424 , cpuOverprovisioningFactor: 1.0
2013-09-17 12:50:31,405 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Hosts's actual total CPU: 9044 and CPU after applying overprovisioning: 9044
2013-09-17 12:50:31,405 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) We need to allocate to the last host again, so checking if there is enough reserved capacity
2013-09-17 12:50:31,405 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Reserved CPU: 100 , Requested CPU: 100
2013-09-17 12:50:31,406 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Reserved RAM: 130023424 , Requested RAM: 130023424
2013-09-17 12:50:31,406 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Host has enough CPU and RAM available
2013-09-17 12:50:31,406 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) STATS: Can alloc CPU from host: 7, used: 2300, reserved: 100, actual total: 9044, total with overprovisioning: 9044; requested cpu:100,alloc_from_last_host?:true ,considerReservedCapacity?: true
2013-09-17 12:50:31,406 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) STATS: Can alloc MEM from host: 7, used: 2403336192, reserved: 130023424, total: 16190149632; requested mem: 130023424,alloc_from_last_host?:true ,considerReservedCapacity?: true
2013-09-17 12:50:31,406 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) The last host of this VM is UP and has enough capacity
2013-09-17 12:50:31,406 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Now checking for suitable pools under zone: 2, pod: 2, cluster: 2
2013-09-17 12:50:31,418 ERROR [cloud.async.AsyncJobManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Unexpected exception while executing org.apache.cloudstack.api.command.user.vm.StartVMCmd
com.cloud.utils.exception.CloudRuntimeException: Unable to create deployment, no usable volumes found for the VM
        at com.cloud.deploy.DeploymentPlanningManagerImpl.findSuitablePoolsForVolumes(DeploymentPlanningManagerImpl.java:1059)
        at com.cloud.deploy.DeploymentPlanningManagerImpl.planDeployment(DeploymentPlanningManagerImpl.java:358)
        at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.reserveVirtualMachine(VMEntityManagerImpl.java:187)
        at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.reserve(VirtualMachineEntityImpl.java:198)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3405)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:1948)
        at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
        at org.apache.cloudstack.api.command.user.vm.StartVMCmd.execute(StartVMCmd.java:120)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
        at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
2013-09-17 12:50:31,450 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Complete async job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ], jobStatus: 2, resultCode: 530, result: Error Code: 530 Error text: Unable to create deployment, no usable volumes found for the VM

> Restarting management server when volume Snapshot is still in progress for
> root volume of a VM, then there is no way to restart VM since the startVM
> job is stuck forever since the volume is in "Snapshotting" state.
> ---------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-4651
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4651
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>          Components: Management Server
>    Affects Versions: 4.2.1
>         Environment: Build from 4.2.-forward
>            Reporter: Sangeetha Hariharan
>            Assignee: Prachi Damle
>            Priority: Critical
>             Fix For: 4.2.1
>
>
> Restarting management server when volume Snapshot is still in progress for
> root volume of a VM, then there is no way to restart VM since the startVM
> job is stuck forever since the volume is in "Snapshotting" state.
> Steps to reproduce the problem:
> Deploy a VM.
> Initiate snapshot for root volume of this VM.
> When VM snapshot is in progress, stop the management server.
> Start Management server.
> Stop this VM.
> Start this VM.
> VM will never transition to "Starting" state and continues to be in "Stopped"
> state.
> The start VM async job never completes and hits an infinite loop in this case.
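
The failure reported above can be modeled with a small Java sketch. Everything in it is hypothetical and for illustration only: the class `VolumeStateSketch`, its `State` enum, the volume name `ROOT-10`, and the rule that only `ALLOCATED` or `READY` volumes count as usable are assumptions, not code taken from CloudStack's `DeploymentPlanningManagerImpl`. It only shows why a root volume left in a transitional state after a management server restart would make every later start attempt fail with the same error text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the reported behaviour; not CloudStack source code.
class VolumeStateSketch {

    // Simplified volume states (assumption: the real state machine has more).
    enum State { ALLOCATED, READY, SNAPSHOTTING }

    static final class Volume {
        final String name;
        State state;

        Volume(String name, State state) {
            this.name = name;
            this.state = state;
        }
    }

    // Assumption: only volumes in a settled state are considered by the planner.
    static List<Volume> usableVolumes(List<Volume> volumes) {
        List<Volume> usable = new ArrayList<>();
        for (Volume v : volumes) {
            if (v.state == State.ALLOCATED || v.state == State.READY) {
                usable.add(v);
            }
        }
        return usable;
    }

    // Fails with the error text from the log when no volume is usable.
    static List<Volume> planDeployment(List<Volume> volumes) {
        List<Volume> usable = usableVolumes(volumes);
        if (usable.isEmpty()) {
            throw new IllegalStateException(
                "Unable to create deployment, no usable volumes found for the VM");
        }
        return usable;
    }

    public static void main(String[] args) {
        List<Volume> volumes = new ArrayList<>();
        // The snapshot was interrupted by the restart, so the state never moved on.
        Volume root = new Volume("ROOT-10", State.SNAPSHOTTING);
        volumes.add(root);

        try {
            planDeployment(volumes);
        } catch (IllegalStateException e) {
            System.out.println("start failed: " + e.getMessage());
        }

        // Once the volume is back in a settled state, planning succeeds.
        root.state = State.READY;
        System.out.println("usable volumes: " + planDeployment(volumes).size());
    }
}
```

In this model nothing ever moves the volume out of `SNAPSHOTTING` after the restart, which matches the observation that the VM cannot be started again until the volume state is corrected.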