Prieur Leary created CLOUDSTACK-7457:
----------------------------------------

Summary: Unable to launch VM after unexpected Hypervisor Reboot (out of band)
Key: CLOUDSTACK-7457
URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7457
Project: CloudStack
Issue Type: Bug
Security Level: Public (Anyone can view this level - this is the default.)
Components: KVM, Management Server
Affects Versions: 4.4.0
Environment: CentOS
Reporter: Prieur Leary
Priority: Critical

After an unexpected hypervisor server reboot (server crash), certain VMs fail to start and return "Was unable to find lock for the key vm_instance1355" (full log below). I suspect this is related to the VM state being out of sync. As it stands, I am searching for a way to work around this issue, should anyone care to provide some insight.

------------------------------------
2014-08-31 11:14:33,381 DEBUG [c.c.c.CapacityManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Checking if host: 28 has enough capacity for requested CPU: 30 and requested RA$
2014-08-31 11:14:33,383 DEBUG [c.c.c.CapacityManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Hosts's actual total CPU: 19992 and CPU after applying overprovisioning: 119952
2014-08-31 11:14:33,383 DEBUG [c.c.c.CapacityManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) We need to allocate to the last host again, so checking if there is enough rese$
2014-08-31 11:14:33,383 DEBUG [c.c.c.CapacityManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Reserved CPU: 30 , Requested CPU: 30
2014-08-31 11:14:33,383 DEBUG [c.c.c.CapacityManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Reserved RAM: 33554432 , Requested RAM: 33554432
2014-08-31 11:14:33,383 DEBUG [c.c.c.CapacityManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Host has enough CPU and RAM available
2014-08-31 11:14:33,383 DEBUG [c.c.c.CapacityManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) STATS: Can alloc CPU from host: 28, used: 6000, reserved: 30, actual total: 199$
2014-08-31 11:14:33,383 DEBUG [c.c.c.CapacityManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) STATS: Can alloc MEM from host: 28, used: 5368709120, reserved: 33554432, total$
2014-08-31 11:14:33,384 DEBUG [c.c.c.CapacityManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Host: 28 has cpu capability (cpu:8, speed:2499) to support requested CPU: 1 and$
2014-08-31 11:14:33,384 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) The last host of this VM is UP and has enough capacity
2014-08-31 11:14:33,384 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Now checking for suitable pools under zone: 1, pod: 1, cluster: 2
2014-08-31 11:14:33,385 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Checking suitable pools for volume (Id, Type): (1549,ROOT)
2014-08-31 11:14:33,385 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Volume has pool already allocated, checking if pool can be reused, po$
2014-08-31 11:14:33,387 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Planner need not allocate a pool for this volume since its READY
2014-08-31 11:14:33,387 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Trying to find a potenial host and associated storage pools from the $
2014-08-31 11:14:33,388 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Checking if host: 28 can access any suitable storage pool for volume:$
2014-08-31 11:14:33,388 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Host: 28 can access pool: 1
2014-08-31 11:14:33,389 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Found a potential host id: 28 name: MC2HOST15.fortatrust.com and asso$
2014-08-31 11:14:33,389 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Returning Deployment Destination: Dest[Zone(Id)-Pod(Id)-Cluster(Id)-H$
2014-08-31 11:14:33,478 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Sync job-201795 execution on object VmWorkJobQueue.1355
2014-08-31 11:14:33,479 WARN [c.c.u.d.Merovingian2] (API-Job-Executor-14:ctx-17edc0e6 job-201794 ctx-25906915) Was unable to find lock for the key vm_instance1355 and thread id 1146336087

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)