[jira] [Commented] (CLOUDSTACK-9660) NPE while destroying volumes during 1000 VMs deploy and destroy tests

ASF subversion and git services (JIRA) Wed, 24 May 2017 03:13:34 -0700

    [ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022654#comment-16022654
 ]


ASF subversion and git services commented on CLOUDSTACK-9660:
-------------------------------------------------------------

Commit 0506fe60862c1d211082281de5c28414d042abf5 in cloudstack's branch 
refs/heads/master from [~mike-tutkowski]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=0506fe6 ]

Fix for CLOUDSTACK-9660

A root volume can be replaced by a different root volume without the VM it 
belongs to being expunged.

From dev@:

For example: Let’s say we have a system VM running on NFS primary storage. We 
then put this primary storage into maintenance mode, which creates the system 
VM (with the same name) on a different primary storage (we do not create a new 
row in the cloud.vm_instance table for this VM). While this VM works, the 
original root disk of the system VM remains on the original primary storage and 
is not destroyed by the code in StorageManagerImpl.cleanupStorage(boolean) in 
4.10 because 4.10 (as shown above) only asks for non-root volumes to consider 
for deletion. In the 4.9 version of the code, the original root disk is cleaned 
up in StorageManagerImpl.cleanupStorage(boolean). The problem with 4.10 relying 
on a root disk always being deleted when the VM it belongs to is deleted is 
that, in a situation like this, the system VM doesn’t get deleted at this 
point; it gets a new root disk that’s hosted by a different primary storage 
(so now its original root disk is stranded).
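
The difference between the two selection strategies described above can be
sketched in a few lines of Java. This is an illustrative model only, not the
CloudStack code: the Volume class, its fields, the pool names, and both
selection methods are hypothetical stand-ins, and the sketch assumes the
original root disk has been marked for destruction once the system VM was
given its replacement root disk.

```java
import java.util.ArrayList;
import java.util.List;

public class StrandedRootVolumeSketch {

    public enum VolumeType { ROOT, DATADISK }

    public enum VolumeState { READY, DESTROY }

    /** Hypothetical, minimal stand-in for a volume record. */
    public static final class Volume {
        public final long id;
        public final long vmId;
        public final VolumeType type;
        public final VolumeState state;
        public final String pool;

        public Volume(long id, long vmId, VolumeType type, VolumeState state, String pool) {
            this.id = id;
            this.vmId = vmId;
            this.type = type;
            this.state = state;
            this.pool = pool;
        }
    }

    /** Selection as described for 4.10: only non-root volumes are considered. */
    public static List<Volume> nonRootVolumesToBeDestroyed(List<Volume> all) {
        List<Volume> result = new ArrayList<>();
        for (Volume v : all) {
            if (v.state == VolumeState.DESTROY && v.type != VolumeType.ROOT) {
                result.add(v);
            }
        }
        return result;
    }

    /** Selection as described for 4.9: any volume marked for destruction. */
    public static List<Volume> volumesToBeDestroyed(List<Volume> all) {
        List<Volume> result = new ArrayList<>();
        for (Volume v : all) {
            if (v.state == VolumeState.DESTROY) {
                result.add(v);
            }
        }
        return result;
    }

    /** One system VM (id 42) whose root disk was replaced on another pool. */
    public static List<Volume> sampleVolumes() {
        List<Volume> volumes = new ArrayList<>();
        // Original root disk, left behind on the pool put into maintenance (assumed DESTROY).
        volumes.add(new Volume(1, 42, VolumeType.ROOT, VolumeState.DESTROY, "primary-1"));
        // Replacement root disk on the other pool; the VM row itself is unchanged.
        volumes.add(new Volume(2, 42, VolumeType.ROOT, VolumeState.READY, "primary-2"));
        // A data disk marked for destruction, for comparison.
        volumes.add(new Volume(3, 42, VolumeType.DATADISK, VolumeState.DESTROY, "primary-1"));
        return volumes;
    }

    public static void main(String[] args) {
        List<Volume> volumes = sampleVolumes();
        System.out.println("non-root selection size: " + nonRootVolumesToBeDestroyed(volumes).size());
        System.out.println("full selection size: " + volumesToBeDestroyed(volumes).size());
    }
}
```

In this model the non-root selection returns only the data disk, so the
original root disk (id 1) is never handed to cleanup, while the broader
selection also returns it. Because the VM is not expunged, nothing else in the
sketch would ever pick that volume up.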

> NPE while destroying volumes during 1000 VMs deploy and destroy tests
> ---------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9660
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9660
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Management Server
>    Affects Versions: 4.10.0.0
>            Reporter: Koushik Das
>            Assignee: Koushik Das
>             Fix For: 4.10.0.0
>
>
> Steps:
> 1. Install and configure a zone (advanced or basic).
> 2. Set config storage.cleanup.enabled = true and storage.cleanup.interval = 
> 10 seconds
> 3. Deploy 1000 VMs and then destroy over multiple iterations.
> NPE seen in MS logs while deleting volume:
> 2015-06-18 16:27:47,797 DEBUG [c.c.v.VirtualMachineManagerImpl] 
> (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Cleaning up hypervisor 
> data structures (ex. SRs in XenServer) for managed storage
> 2015-06-18 16:27:47,799 DEBUG [o.a.c.e.o.VolumeOrchestrator] 
> (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Cleaning storage for vm: 
> 2894
> 2015-06-18 16:27:47,823 INFO [o.a.c.s.v.VolumeServiceImpl] 
> (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Expunge volume with no 
> data store specified
> 2015-06-18 16:27:47,828 DEBUG [c.c.s.StorageManagerImpl] 
> (StorageManager-Scavenger-1:ctx-5e7b4eda) (logid:bb642325) Storage pool 
> garbage collector found 0 templates to clean up in storage pool: 
> XenRT-Zone-0-Pod-0-Cluster-0-Primary-Store-0
> 2015-06-18 16:27:47,828 INFO [o.a.c.s.v.VolumeServiceImpl] 
> (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Volume 2894 is not 
> referred anywhere, remove it from volumes table
> 2015-06-18 16:27:47,829 DEBUG [c.c.s.StorageManagerImpl] 
> (StorageManager-Scavenger-1:ctx-5e7b4eda) (logid:bb642325) Storage pool 
> garbage collector found 0 templates to clean up in storage pool: 
> XenRT-Zone-0-Pod-0-Cluster-1-Primary-Store-0
> 2015-06-18 16:27:47,832 DEBUG [c.c.s.StorageManagerImpl] 
> (StorageManager-Scavenger-1:ctx-5e7b4eda) (logid:bb642325) Secondary storage 
> garbage collector found 0 templates to cleanup on template_store_ref for 
> store: nfs://10.81.56.7/xenrtnfs/1092931-dycPsK
> 2015-06-18 16:27:47,833 DEBUG [c.c.s.StorageManagerImpl] 
> (StorageManager-Scavenger-1:ctx-5e7b4eda) (logid:bb642325) Secondary storage 
> garbage collector found 0 snapshots to cleanup on snapshot_store_ref for 
> store: nfs://10.81.56.7/xenrtnfs/1092931-dycPsK
> 2015-06-18 16:27:47,834 DEBUG [c.c.s.StorageManagerImpl] 
> (StorageManager-Scavenger-1:ctx-5e7b4eda) (logid:bb642325) Secondary storage 
> garbage collector found 0 volumes to cleanup on volume_store_ref for store: 
> nfs://10.81.56.7/xenrtnfs/1092931-dycPsK
> 2015-06-18 16:27:47,842 DEBUG [c.c.v.VirtualMachineManagerImpl] 
> (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Expunged 
> VM[User|i-10-2894-VM]
> 2015-06-18 16:27:47,844 WARN [c.c.s.StorageManagerImpl] 
> (StorageManager-Scavenger-1:ctx-5e7b4eda) (logid:bb642325) Unable to destroy 
> volume 0b22f54b-3242-49ef-b16d-1c7801d5c2bd
> java.lang.NullPointerException
> at 
> org.apache.cloudstack.storage.volume.VolumeServiceImpl.expungeVolumeAsync(VolumeServiceImpl.java:276)
> at 
> com.cloud.storage.StorageManagerImpl.cleanupStorage(StorageManagerImpl.java:1121)
> at 
> com.cloud.storage.StorageManagerImpl$StorageGarbageCollector.runInContext(StorageManagerImpl.java:1481)
> at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> 2015-06-18 16:27:47,850 DEBUG [c.c.u.AccountManagerImpl] 
> (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Access granted to 
> Acct[ed48b7f2-15a0-11e5-96dd-d275a7df156a-system] to Domain:1/ by 
> AffinityGroupAccessChecker
> 2015-06-18 16:27:47,871 DEBUG [c.c.v.UserVmManagerImpl] 
> (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Starting cleaning up vm 
> VM[User|i-10-2894-VM] resources...



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
