[jira] [Commented] (CLOUDSTACK-4651) Restarting management server when volume Snapshot is still in progress for root volume of a VM , then there is no way to restart VM since the startVM job is stuck forever since the volume is in "Snapshoting" state.

Sangeetha Hariharan (JIRA) Tue, 17 Sep 2013 13:17:05 -0700

    [ 
https://issues.apache.org/jira/browse/CLOUDSTACK-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769870#comment-13769870
 ]


Sangeetha Hariharan commented on CLOUDSTACK-4651:
-------------------------------------------------

Tested with the latest build:
Deploy a VM.
Initiate snapshot for root volume of this VM.

As soon as the volume snapshot command is issued, kill the management server 
process.

Start Management server.

Stop this VM.
Start this VM. 

Starting the VM fails with the error message "Unable to create deployment, no 
usable volumes found for the VM".
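
For anyone repeating these steps through the API rather than the UI, here is a 
rough sketch of the query strings involved. It assumes the standard 
createSnapshot, stopVirtualMachine and startVirtualMachine commands. The 
helper names and the volume id placeholder are made up for illustration, and 
authentication (session key or request signing) is left out on purpose.

```python
from urllib.parse import urlencode

# VM id taken from the startVirtualMachine request in the log of this report.
VM_ID = "4a398643-5f9d-4c75-8174-a9bb62580538"
# Placeholder only: the root volume id does not appear in this report.
ROOT_VOLUME_ID = "ROOT-VOLUME-UUID"


def api_query(command, **params):
    """Build an unauthenticated query string for one API command."""
    return urlencode({"command": command, **params, "response": "json"})


def repro_queries(vm_id, root_volume_id):
    """Return the API calls for the steps above, in order.

    Killing and restarting the management server happens outside the API,
    between the first call and the second one.
    """
    return [
        api_query("createSnapshot", volumeid=root_volume_id),
        api_query("stopVirtualMachine", id=vm_id),
        api_query("startVirtualMachine", id=vm_id),
    ]


if __name__ == "__main__":
    for query in repro_queries(VM_ID, ROOT_VOLUME_ID):
        print(query)
```

The last query has the same shape as the startVirtualMachine request logged 
below, minus the sessionkey and cache-busting parameters.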

With the CLOUDSTACK-4650 fixes, the volume remains in the "Snapshotting" state 
only for a very short time, so getting into this state is a timing issue.


The following exception is seen in the management server logs:

2013-09-17 12:50:31,176 DEBUG [cloud.api.ApiServlet] (catalina-exec-8:null) 
===START===  10.215.3.9 -- GET  
command=startVirtualMachine&id=4a398643-5f9d-4c75-8174-a9bb62580538&response=json&sessionkey=TkwA%2Bm9SCHW5H3aBJVqsWm4YB68%3D&_=1379448058217
2013-09-17 12:50:31,223 DEBUG [cloud.async.AsyncJobManagerImpl] 
(catalina-exec-8:null) submit async job-53 = [ 
605b349e-9745-41c1-87e2-2ae08c7072cf ], details: AsyncJobVO {id:53, userId: 3, 
accountId: 3, sessionKey: null, instanceType: VirtualMachine, instanceId: 10, 
cmd: org.apache.cloudstack.api.command.user.vm.StartVMCmd, cmdOriginator: null, 
cmdInfo: 
{"id":"4a398643-5f9d-4c75-8174-a9bb62580538","response":"json","sessionkey":"TkwA+m9SCHW5H3aBJVqsWm4YB68\u003d","cmdEventType":"VM.START","ctxUserId":"3","httpmethod":"GET","_":"1379448058217","ctxAccountId":"3","ctxStartEventId":"202"},
 cmdVersion: 0, callbackType: 0, callbackAddress: null, status: 0, 
processStatus: 0, resultCode: 0, result: null, initMsid: 161197867246747, 
completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
2013-09-17 12:50:31,226 DEBUG [cloud.api.ApiServlet] (catalina-exec-8:null) 
===END===  10.215.3.9 -- GET  
command=startVirtualMachine&id=4a398643-5f9d-4c75-8174-a9bb62580538&response=json&sessionkey=TkwA%2Bm9SCHW5H3aBJVqsWm4YB68%3D&_=1379448058217
2013-09-17 12:50:31,229 DEBUG [cloud.async.AsyncJobManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Executing 
org.apache.cloudstack.api.command.user.vm.StartVMCmd for job-53 = [ 
605b349e-9745-41c1-87e2-2ae08c7072cf ]
2013-09-17 12:50:31,251 DEBUG [cloud.user.AccountManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Access to 
VM[User|sangee5] granted to Acct[2db522b2-192b-42bf-a9fd-f4a18fcbb51b-sangee] 
by DomainChecker_EnhancerByCloudStack_57441200
2013-09-17 12:50:31,266 DEBUG [cloud.network.NetworkModelImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Service 
SecurityGroup is not supported in the network id=208
2013-09-17 12:50:31,271 DEBUG [cloud.network.NetworkModelImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Service 
SecurityGroup is not supported in the network id=208
2013-09-17 12:50:31,315 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Deploy 
avoids pods: [], clusters: [], hosts: []
2013-09-17 12:50:31,318 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) 
DeploymentPlanner allocation algorithm: 
com.cloud.deploy.FirstFitPlanner_EnhancerByCloudStack_4e6f8d51@4807e0d7
2013-09-17 12:50:31,319 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Trying to 
allocate a host and storage pools from dc:2, pod:2,cluster:null, requested cpu: 
100, requested ram: 130023424
2013-09-17 12:50:31,319 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Is ROOT 
volume READY (pool already allocated)?: No
2013-09-17 12:50:31,319 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) This VM has 
last host_id specified, trying to choose the same host: 7
2013-09-17 12:50:31,402 DEBUG [cloud.capacity.CapacityManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Checking if 
host: 7 has enough capacity for requested CPU: 100 and requested RAM: 130023424 
, cpuOverprovisioningFactor: 1.0
2013-09-17 12:50:31,405 DEBUG [cloud.capacity.CapacityManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Hosts's 
actual total CPU: 9044 and CPU after applying overprovisioning: 9044
2013-09-17 12:50:31,405 DEBUG [cloud.capacity.CapacityManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) We need to 
allocate to the last host again, so checking if there is enough reserved 
capacity
2013-09-17 12:50:31,405 DEBUG [cloud.capacity.CapacityManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Reserved 
CPU: 100 , Requested CPU: 100
2013-09-17 12:50:31,406 DEBUG [cloud.capacity.CapacityManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Reserved 
RAM: 130023424 , Requested RAM: 130023424
2013-09-17 12:50:31,406 DEBUG [cloud.capacity.CapacityManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Host has 
enough CPU and RAM available
2013-09-17 12:50:31,406 DEBUG [cloud.capacity.CapacityManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) STATS: Can 
alloc CPU from host: 7, used: 2300, reserved: 100, actual total: 9044, total 
with overprovisioning: 9044; requested cpu:100,alloc_from_last_host?:true 
,considerReservedCapacity?: true
2013-09-17 12:50:31,406 DEBUG [cloud.capacity.CapacityManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) STATS: Can 
alloc MEM from host: 7, used: 2403336192, reserved: 130023424, total: 
16190149632; requested mem: 130023424,alloc_from_last_host?:true 
,considerReservedCapacity?: true
2013-09-17 12:50:31,406 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) The last 
host of this VM is UP and has enough capacity
2013-09-17 12:50:31,406 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Now checking 
for suitable pools under zone: 2, pod: 2, cluster: 2
2013-09-17 12:50:31,418 ERROR [cloud.async.AsyncJobManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Unexpected 
exception while executing org.apache.cloudstack.api.command.user.vm.StartVMCmd
com.cloud.utils.exception.CloudRuntimeException: Unable to create deployment, no usable volumes found for the VM
        at com.cloud.deploy.DeploymentPlanningManagerImpl.findSuitablePoolsForVolumes(DeploymentPlanningManagerImpl.java:1059)
        at com.cloud.deploy.DeploymentPlanningManagerImpl.planDeployment(DeploymentPlanningManagerImpl.java:358)
        at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.reserveVirtualMachine(VMEntityManagerImpl.java:187)
        at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.reserve(VirtualMachineEntityImpl.java:198)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3405)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:1948)
        at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
        at org.apache.cloudstack.api.command.user.vm.StartVMCmd.execute(StartVMCmd.java:120)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
        at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
2013-09-17 12:50:31,450 DEBUG [cloud.async.AsyncJobManagerImpl] 
(Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Complete 
async job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ], jobStatus: 2, 
resultCode: 530, result: Error Code: 530 Error text: Unable to create 
deployment, no usable volumes found for the VM
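
To check quickly whether a start job failed in this way, the "Complete async 
job" entry can be pulled apart with a small script. This is only a sketch 
written against the log format shown above. The function name and the regular 
expression are my own and not part of CloudStack. As far as I know a jobStatus 
of 2 means the job failed.

```python
import re

# Matches the "Complete async job-N ..." entry written by AsyncJobManagerImpl,
# based only on the format of the log entry quoted above.
COMPLETE_RE = re.compile(
    r"Complete\s+async\s+job-(?P<job>\d+)\b.*?"
    r"jobStatus:\s*(?P<status>\d+),\s*"
    r"resultCode:\s*(?P<code>\d+),\s*"
    r"result:\s*(?P<result>.*)"
)


def parse_job_completion(entry):
    """Extract job id, jobStatus, resultCode and result text from one entry.

    The entry may be wrapped over several lines, as it is in this mail.
    Returns None when the entry is not a job completion message.
    """
    flat = " ".join(entry.split())  # undo the mail line wrapping
    match = COMPLETE_RE.search(flat)
    if match is None:
        return None
    return {
        "job": int(match["job"]),
        "status": int(match["status"]),
        "code": int(match["code"]),
        "result": match["result"].strip(),
    }


if __name__ == "__main__":
    sample = """2013-09-17 12:50:31,450 DEBUG [cloud.async.AsyncJobManagerImpl]
    (Job-Executor-2:job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ]) Complete
    async job-53 = [ 605b349e-9745-41c1-87e2-2ae08c7072cf ], jobStatus: 2,
    resultCode: 530, result: Error Code: 530 Error text: Unable to create
    deployment, no usable volumes found for the VM"""
    print(parse_job_completion(sample))
```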

                
> Restarting management server when volume Snapshot is still in progress for 
> root volume of a VM , then there is no way to restart VM since the startVM 
> job is stuck forever  since the volume is in "Snapshoting" state.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-4651
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4651
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Management Server
>    Affects Versions: 4.2.1
>         Environment: Build from 4.2-forward
>            Reporter: Sangeetha Hariharan
>            Assignee: Prachi Damle
>            Priority: Critical
>             Fix For: 4.2.1
>
>
> Restarting management server when volume Snapshot is still in progress for 
> root volume of a VM , then there is no way to restart VM since the startVM 
> job is stuck forever since the volume is in "Snapshoting" state.
> Steps to reproduce the problem:
> Deploy a VM.
> Initiate snapshot for root volume of this VM.
> When VM snapshot is in progress , stop the management server.
> Start Management server.
> Stop this VM.
> Start this VM.
> VM will never transition to "Starting" state and continues to be in "Stopped" 
> state.
> The start VM async job never completes and hits an infinite loop in this case.
