[jira] [Updated] (CLOUDSTACK-2568) ACS41 regression in storage subsystem (seen with local storage and 2 or more hosts)

Ove Ewerlid (JIRA) Tue, 21 May 2013 15:04:22 -0700

     [ 
https://issues.apache.org/jira/browse/CLOUDSTACK-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Ove Ewerlid updated CLOUDSTACK-2568:
------------------------------------

    Attachment:     (was: mysqldump-cloud.sql)
    
> ACS41 regression in storage subsystem (seen with local storage and 2 or more 
> hosts)
> -----------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-2568
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-2568
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Management Server
>    Affects Versions: 4.1.0
>         Environment: RHES64 as in OEL64. Install from RPM built from latest 
> GIT on OEL64.
> 2 or more KVM hypervisors with local storage in one cluster that has one 
> primary NFS storage pool.
>            Reporter: Ove Ewerlid
>            Assignee: Prachi Damle
>            Priority: Blocker
>             Fix For: 4.1.0
>
>         Attachments: var-log-cloudstack-management.tar.gz
>
>
> ACS402 works with no issues when tested in exactly the setup where ACS41 
> fails.
> Identical configuration (the same setup program is used for testing both 
> versions).
> In ACS410 startVM fails if and only if the advanceStart: log line picks a 
> poolID that is not valid.
> E.g., the poolID reported in this logline appears random across a large 
> number of tests.
> If a poolID that can not be reached by the host selected for deployment, the 
> startVM fails.
> This is blocking upgrade from 4.0 to 4.1 since there is no  reliable way to 
> start VMs that have been deployed. If a deployed VM fails to start, giving 
> the startVM command multiple times, will eventually make the VM start.
> The more hosts there are, the less likely it is a startVM will succeed. It is 
> less likely that the poolID is correct.
> The below log portion conveys how the VM has a "correct" Deployment 
> Destination reported and the advanceStart reports a poolID that is different 
> and since the selected hypervisor can not reach the poolID the startVM fails.
> The bug never triggers if there is only one KVM with local storage since the 
> poolId can not be wrong, there is just one (and the NFS pool is always valid).
> -------------------
> 2013-05-20 06:49:29,477 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Found a potential host id: 1 name: 
> vm3-net0-s0-14.test.devops and associated storage pools for this VM
> 2013-05-20 06:49:29,478 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Returning Deployment Destination: 
> Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))]
>  : Dest[Zone(1)-Pod(1)-Cluster(1)-Host\
> (1)-Storage(Volume(10|ROOT-->Pool(200), Volume(11|DATADISK-->Pool(200))]
> 2013-05-20 06:49:29,495 DEBUG [cloud.capacity.CapacityManagerImpl] 
> (Job-Executor-34:job-34) VM state transitted from :Stopped to Starting with 
> event: StartRequestedvm's original host id: null new host id: null host id 
> before state trans\
> ition: null
> 2013-05-20 06:49:29,495 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
> (Job-Executor-34:job-34) Successfully transitioned to start state for 
> VM[User|testvm-a] reservation id = e644d55e-3627-4395-9f89-639e6fc2f261
> 2013-05-20 06:49:29,502 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
> (Job-Executor-34:job-34) Trying to deploy VM, vm has dcId: 1 and podId: null
> 2013-05-20 06:49:29,502 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
> (Job-Executor-34:job-34) advanceStart: DeploymentPlan is provided, using 
> dcId:1, podId: 1, clusterId: 1, hostId: 1, poolId: 201
> 2013-05-20 06:49:29,502 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
> (Job-Executor-34:job-34) Deploy avoids pods: null, clusters: null, hosts: null
> 2013-05-20 06:49:29,504 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) DeploymentPlanner allocation algorithm: random
> 2013-05-20 06:49:29,504 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Trying to allocate a host and storage pools from 
> dc:1, pod:1,cluster:1, requested cpu: 4000, requested ram: 2147483648
> 2013-05-20 06:49:29,504 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Is ROOT volume READY (pool already allocated)?: Yes
> 2013-05-20 06:49:29,504 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) DeploymentPlan has host_id specified, making no 
> checks on this host, looks like admin test: 1
> 2013-05-20 06:49:29,505 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Looking for suitable pools for this host under zone: 
> 1, pod: 1, cluster: 1
> 2013-05-20 06:49:29,506 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Checking suitable pools for volume (Id, Type): 
> (10,ROOT)
> 2013-05-20 06:49:29,506 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Volume has pool(201) already allocated, checking if 
> pool can be reused, poolId: null
> 2013-05-20 06:49:29,506 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) finding pool by id '201'
> 2013-05-20 06:49:29,507 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Planner need not allocate a pool for this volume 
> since its READY
> 2013-05-20 06:49:29,507 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Checking suitable pools for volume (Id, Type): 
> (11,DATADISK)
> 2013-05-20 06:49:29,507 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Volume has pool(201) already allocated, checking if 
> pool can be reused, poolId: null
> 2013-05-20 06:49:29,507 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) finding pool by id '201'
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Planner need not allocate a pool for this volume 
> since its READY
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Trying to find a potenial host and associated 
> storage pools from the suitable host/pool lists for this VM
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Checking if host: 1 can access any suitable storage 
> pool for volume: DATADISK
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Host: 1 cannot access pool: 201
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Could not find a potential host that has associated 
> storage pools from the suitable host/pool lists for this VM
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-34:job-34) Cannnot deploy to specified host, returning.
> 2013-05-20 06:49:29,524 DEBUG [cloud.capacity.CapacityManagerImpl] 
> (Job-Executor-34:job-34) VM state transitted from :Starting to Stopped with 
> event: OperationFailedvm's original host id: null new host id: null host id 
> before state tran\
> sition: null
> 2013-05-20 06:49:29,533 ERROR [cloud.async.AsyncJobManagerImpl] 
> (Job-Executor-34:job-34) Unexpected exception while executing 
> org.apache.cloudstack.api.command.user.vm.StartVMCmd
> com.cloud.exception.InsufficientServerCapacityException: Unable to create a 
> deployment for VM[User|testvm-a]Scope=interface com.cloud.dc.DataCenter; id=1
>         at 
> com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:728)
>         at 
> com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:471)
>         at 
> org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:212)
>         at 
> org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:209)
>         at 
> com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3865)
>         at 
> com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:2573)
>         at 
> com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>         at 
> org.apache.cloudstack.api.command.user.vm.StartVMCmd.execute(StartVMCmd.java:120)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:162)
>         at 
> com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:437)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CLOUDSTACK-2568) ACS41 regression in storage subsystem (seen with local storage and 2 or more hosts)

Reply via email to