Hi Jeremy, I don’t think CloudStack migrates all the volumes on the primary storage when you put it into maintenance.
-Jithin

From: Jeremy Hansen <jer...@skidrow.la.INVALID>
Date: Saturday, 20 January 2024 at 4:21 PM
To: users@cloudstack.apache.org <users@cloudstack.apache.org>
Subject: Re: Issues migrating primary storage

I’m trying to put my NFS primary storage into maintenance mode, which I believe is supposed to migrate all of its storage, correct? The problem is I don’t know how to get a status on this job. I can’t really tell if it’s working. The management server doesn’t really have anything in the logs. I don’t see any new images or images growing on the Ceph side. So I just don’t know if it’s working or how far along the migration is.

-jeremy

On Friday, Jan 19, 2024 at 12:34 AM, Jeremy Hansen <jer...@skidrow.la> wrote:

I’m still having issues. Is it unreasonable to migrate 1TB images over a 10G network? Any other ideas of things to try would be much appreciated.

-jeremy

On Wednesday, Jan 17, 2024 at 12:49 PM, Jeremy Hansen <jer...@skidrow.la> wrote:

Extending these timeouts in the “wait” configs seems to have helped. One of my 1TB volumes is finally migrating.

What I’ve noticed is that if I allocate a new 1TB volume, I can migrate it between NFS and Ceph and it takes only about 1 minute. I assume this is because it’s “thin provisioned” and there’s no actual data on the volume. But these other volumes I’m trying to move are also “thin provisioned”, and they’re part of an LVM group. Does making a thin provisioned device part of an LVM group defeat the thin provisioning? I know these volumes weren’t full, but I thought perhaps, since each one is a PV in an LVM config, that defeats the thin provisioning and it counts as a full 1TB volume? I’m just spitballing, but I’m trying to understand how this works so we can do the right thing when provisioning additional volumes.

Also, the behavior I’m seeing is that it takes a very long time before I see the block image show up on the Ceph side. Perhaps it preallocates an image before copying the data? But it seemed strange that I wouldn’t immediately see the image appear on the Ceph side after initiating a migration. It’s hard to see what’s actually going on from the logs and the interface.

Thanks
-jeremy

On Tuesday, Jan 16, 2024 at 11:29 PM, Jeremy Hansen <jer...@skidrow.la> wrote:

I changed copy.volume.wait to 72000.

But I just noticed kvm.storage.online.migration.wait and kvm.storage.offline.migration.wait. Worth changing these?

Thanks
-jeremy

On Tuesday, Jan 16, 2024 at 11:01 PM, Jithin Raju <jithin.r...@shapeblue.com> wrote:

Hi Jeremy,

Have you checked the ‘wait’ parameter? It is used as a timeout of wait * 2.

-Jithin

From: Jeremy Hansen <jer...@skidrow.la.INVALID>
Date: Wednesday, 17 January 2024 at 12:14 PM
To: users@cloudstack.apache.org <users@cloudstack.apache.org>
Subject: Re: Issues migrating primary storage

Unfortunately the upgrade didn’t help:

Resource [StoragePool:3] is unreachable: Volume [{"name":"bigdisk","uuid":"8f24b8a6-229a-4311-9ddc-d6c6acb89aca"}] migration failed due to [com.cloud.utils.exception.CloudRuntimeException: Failed to copy /mnt/11cd19d0-f207-3d01-880f-8d01d4b15020/8f24b8a6-229a-4311-9ddc-d6c6acb89aca to 5837f4e6-9307-43a9-a50c-8c9c885f25e8.qcow2].

Anything else I can try? I’m trying to move away from NFS completely.

-jeremy

On Tuesday, Jan 16, 2024 at 7:06 AM, Suresh Kumar Anaparti <sureshkumar.anapa...@gmail.com> wrote:

Hi Jeremy,

Can you extend the 'migratewait' config and check?

Regards,
Suresh

On Tue, Jan 16, 2024 at 1:45 PM Jeremy Hansen <jer...@skidrow.la.invalid> wrote:

I have some large volumes I’m trying to migrate from NFS to Ceph/RBD. 1TB volumes. These inevitably time out. I extended these configs:

copy.volume.wait=72000
job.cancel.threshold.minutes=480
job.expire.minutes=1440

This helped with smaller volumes, but large ones still eventually fail.

2024-01-16 07:50:25,929 DEBUG [c.c.a.t.Request] (AgentManager-Handler-8:null) (logid:) Seq 1-5583619113009291196: Processing: { Ans: , MgmtId: 20558852646968, via: 1, Ver: v1, Flags: 10, [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"result":"false","details":"com.cloud.utils.exception.CloudRuntimeException: Failed to copy /mnt/11cd19d0-f207-3d01-880f-8d01d4b15020/861a6692-e746-4401-9cda-bd791b7d3b5e to b7acadc8-34a1-4d7a-8040-26368dafc21d.qcow2","wait":"0","bypassHostMaintenance":"false"}}] }
2024-01-16 07:50:26,698 DEBUG [c.c.s.VolumeApiServiceImpl] (Work-Job-Executor-41:ctx-e5baf6dc job-1175/job-1176 ctx-bc7b188b) (logid:d7d98b81) Failed to migrate volume
com.cloud.exception.StorageUnavailableException: Resource [StoragePool:3] is unreachable: Volume [{"name":"sequencingdata","uuid":"861a6692-e746-4401-9cda-bd791b7d3b5e"}] migration failed due to [com.cloud.utils.exception.CloudRuntimeException: Failed to copy /mnt/11cd19d0-f207-3d01-880f-8d01d4b15020/861a6692-e746-4401-9cda-bd791b7d3b5e to b7acadc8-34a1-4d7a-8040-26368dafc21d.qcow2].
    at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.migrateVolume(VolumeOrchestrator.java:1348)
    at jdk.internal.reflect.GeneratedMethodAccessor672.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
    at com.sun.proxy.$Proxy227.migrateVolume(Unknown Source)
    at com.cloud.storage.VolumeApiServiceImpl.orchestrateMigrateVolume(VolumeApiServiceImpl.java:3356)
    at com.cloud.storage.VolumeApiServiceImpl.orchestrateMigrateVolume(VolumeApiServiceImpl.java:4721)
    at jdk.internal.reflect.GeneratedMethodAccessor671.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
    at com.cloud.storage.VolumeApiServiceImpl.handleVmWorkJob(VolumeApiServiceImpl.java:4735)
    at jdk.internal.reflect.GeneratedMethodAccessor670.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
    at com.sun.proxy.$Proxy232.handleVmWorkJob(Unknown Source)
    at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102)
    at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:620)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
    at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:568)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2024-01-16 07:50:26,727 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-41:ctx-e5baf6dc job-1175/job-1176 ctx-bc7b188b) (logid:d7d98b81) Invocation exception, caused by:
com.cloud.utils.exception.CloudRuntimeException: Resource [StoragePool:3] is unreachable: Volume [{"name":"sequencingdata","uuid":"861a6692-e746-4401-9cda-bd791b7d3b5e"}] migration failed due to [com.cloud.utils.exception.CloudRuntimeException: Failed to copy /mnt/11cd19d0-f207-3d01-880f-8d01d4b15020/861a6692-e746-4401-9cda-bd791b7d3b5e to b7acadc8-34a1-4d7a-8040-26368dafc21d.qcow2].
com.cloud.utils.exception.CloudRuntimeException: Resource [StoragePool:3] is unreachable: Volume [{"name":"sequencingdata","uuid":"861a6692-e746-4401-9cda-bd791b7d3b5e"}] migration failed due to [com.cloud.utils.exception.CloudRuntimeException: Failed to copy /mnt/11cd19d0-f207-3d01-880f-8d01d4b15020/861a6692-e746-4401-9cda-bd791b7d3b5e to b7acadc8-34a1-4d7a-8040-26368dafc21d.qcow2].
    at com.cloud.storage.VolumeApiServiceImpl.orchestrateMigrateVolume(VolumeApiServiceImpl.java:3363)
    at com.cloud.storage.VolumeApiServiceImpl.orchestrateMigrateVolume(VolumeApiServiceImpl.java:4721)
    at jdk.internal.reflect.GeneratedMethodAccessor671.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
    at com.cloud.storage.VolumeApiServiceImpl.handleVmWorkJob(VolumeApiServiceImpl.java:4735)
    at jdk.internal.reflect.GeneratedMethodAccessor670.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
    at com.sun.proxy.$Proxy232.handleVmWorkJob(Unknown Source)
    at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102)
    at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:620)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
    at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:568)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2024-01-16 07:50:26,744 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-41:ctx-e5baf6dc job-1175/job-1176) (logid:d7d98b81) Complete async job-1176, jobStatus: FAILED

This is 4.18.0.0. Is there another timeout that might be at play here?

The error says [StoragePool:3] is unreachable. Is StoragePool:3 referring to the NFS server or RBD? How do I interpret StoragePool:3, and why does it think it’s unavailable?

Thanks
-jeremy
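
On the question of whether 1TB over 10G is unreasonable, a rough back-of-the-envelope sketch may help put the timeouts in perspective. It is only an estimate under stated assumptions: "1TB" is taken as a decimal terabyte, the link is assumed to be the only bottleneck, and the "wait * 2" rule mentioned earlier in the thread is assumed to apply to the value being checked. The function names and the 10% efficiency figure are made up for illustration and are not CloudStack settings or measurements.

```python
# Rough estimate only: ignores NFS read speed, Ceph write speed,
# qcow2 conversion cost, and protocol overhead.

def transfer_seconds(size_bytes: float, link_gbit_per_s: float,
                     efficiency: float = 1.0) -> float:
    """Seconds to move size_bytes over a link running at the given
    fraction (efficiency) of its nominal line rate."""
    bytes_per_second = link_gbit_per_s * 1e9 / 8 * efficiency
    return size_bytes / bytes_per_second


def effective_timeout_seconds(wait_setting: int) -> int:
    """Timeout in seconds, assuming the setting is applied as wait * 2
    (the rule quoted in the thread; an assumption for any given setting)."""
    return wait_setting * 2


ONE_TB = 1e12  # decimal terabyte (assumption; the thread only says "1TB")

best_case = transfer_seconds(ONE_TB, 10)           # full 10 Gbit/s line rate
pessimistic = transfer_seconds(ONE_TB, 10, 0.1)    # hypothetical 10% of line rate
timeout = effective_timeout_seconds(72000)         # value set for copy.volume.wait

print(f"best case:   {best_case / 60:.1f} min")
print(f"pessimistic: {pessimistic / 3600:.1f} h")
print(f"timeout:     {timeout / 3600:.1f} h")
```

If these assumptions hold, even a copy running at a small fraction of line rate should finish well inside the extended timeout, which would suggest looking at other timeouts or at the copy step itself rather than at raw network capacity.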