Re: Issues migrating primary storage

Jithin Raju Sun, 21 Jan 2024 20:37:03 -0800

Hi Jeremy,

I don’t think CloudStack migrates all the volumes on the primary storage when 
you put it into maintenance.

-Jithin

From: Jeremy Hansen <jer...@skidrow.la.INVALID>
Date: Saturday, 20 January 2024 at 4:21 PM
To: users@cloudstack.apache.org <users@cloudstack.apache.org>
Subject: Re: Issues migrating primary storage
I’m trying to put my NFS primary storage into maintenance mode, which I 
believe is supposed to migrate all of its storage, correct?  The problem is I 
don’t know how to get a status on this job.  I can’t really tell if it’s 
working.  The management server doesn’t really have anything in the logs, and I 
don’t see any new images or images growing on the Ceph side.  So I just don’t 
know if it’s working or how far along the migration is.

-jeremy

On Friday, Jan 19, 2024 at 12:34 AM, Jeremy Hansen 
<jer...@skidrow.la<mailto:jer...@skidrow.la>> wrote:
I’m still having issues.  Is it unreasonable to migrate 1TB images over a 10G 
network?  Any other ideas of things to try would be much appreciated.

-jeremy
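
As a rough sanity check on the 1TB-over-10G question above, the ideal wire time can be worked out directly. This is only a back-of-the-envelope sketch: the function name and the 25% utilisation figure are illustrative assumptions, and real migrations are usually bounded by storage read/write speed and format conversion rather than by the network alone.

```python
def transfer_seconds(size_bytes: int, link_bits_per_s: float, efficiency: float = 1.0) -> float:
    """Ideal time to move size_bytes over a link, scaled by a utilisation factor."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return size_bytes * 8 / (link_bits_per_s * efficiency)


TB = 10**12            # 1 TB in bytes (decimal)
TEN_GBIT = 10 * 10**9  # 10 Gbit/s link

ideal = transfer_seconds(TB, TEN_GBIT)          # full line rate
derated = transfer_seconds(TB, TEN_GBIT, 0.25)  # assumed 25% of line rate

print(f"line rate:           {ideal / 60:.1f} min")
print(f"at 25% of line rate: {derated / 60:.1f} min")
```

Even at a quarter of line rate a fully allocated 1TB volume would take under an hour on the wire, so a copy that runs for many hours points at something other than raw network bandwidth.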

On Wednesday, Jan 17, 2024 at 12:49 PM, Jeremy Hansen 
<jer...@skidrow.la<mailto:jer...@skidrow.la>> wrote:
Extending these timeouts in the “wait” configs seems to have helped.  One of my 
1TB volumes is finally migrating.

What I’ve noticed is that if I allocate a new 1TB volume, I can migrate it 
between NFS and Ceph and it takes only about 1 minute.  I assume this is 
because it’s “thin provisioned” and there’s no actual data on the volume.

But these other volumes I’m trying to move are also “thin provisioned”, and 
they’re part of an LVM group.  Does making a thin provisioned device part of an 
LVM group defeat the thin provisioning?  I know these volumes weren’t full, but 
I thought perhaps there’s a chance that since it’s a PV in an LVM config, that 
defeats the thin provisioning and it counts as a full 1TB volume?  I’m just 
spitballing, but I’m trying to understand how this works so we can do the right 
thing when provisioning additional volumes.

Also, the behavior I’m seeing is that it takes a very long time before I see 
the block image show up on the Ceph side.  Perhaps it preallocates an image 
before copying the data?  But it seemed strange that I wouldn’t immediately see 
the image appear on the Ceph side after initiating a migration.  It’s hard to 
see what’s actually going on from the logs and the interface.

Thanks
-jeremy
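
The difference between a volume's nominal size and the data actually stored in it can be illustrated with an ordinary sparse file. This is a generic sketch of thin provisioning on a POSIX filesystem, not of how CloudStack or its copy tooling handles volumes; the file name and the 1 GiB size are arbitrary, and the allocated figure depends on the filesystem.

```python
import os
import tempfile

GIB = 1 << 30

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "thin.img")  # hypothetical image name
    with open(path, "wb") as f:
        f.truncate(GIB)  # set the length without writing any data

    st = os.stat(path)
    apparent = st.st_size           # size the file claims to be
    allocated = st.st_blocks * 512  # st_blocks counts 512-byte units on Linux

    print(f"apparent size:  {apparent} bytes")
    print(f"allocated size: {allocated} bytes")
```

A freshly created thin volume resembles this file: large on paper, with almost nothing to copy. Once a guest has written to blocks, those blocks generally stay allocated in the image even if the guest later frees them, unless they are explicitly discarded, so a volume that "wasn't full" can still carry far more allocated data than its current usage suggests.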

On Tuesday, Jan 16, 2024 at 11:29 PM, Jeremy Hansen 
<jer...@skidrow.la<mailto:jer...@skidrow.la>> wrote:
I changed copy.volume.wait to 72000

But I just noticed:

kvm.storage.online.migration.wait and kvm.storage.offline.migration.wait.  
Worth changing these?

Thanks
-jeremy

On Tuesday, Jan 16, 2024 at 11:01 PM, Jithin Raju 
<jithin.r...@shapeblue.com<mailto:jithin.r...@shapeblue.com>> wrote:
Hi Jeremy,

Have you checked the ‘wait’ parameter? It is used as a timeout of wait * 2.

-Jithin
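
To make the "wait * 2" remark concrete, here is the arithmetic for the copy.volume.wait value mentioned in this thread. It is a sketch under two assumptions: that the setting is expressed in seconds, and that the doubling described in the reply applies to it. The helper name is illustrative and not part of CloudStack.

```python
def effective_timeout_seconds(wait_seconds: int, multiplier: int = 2) -> int:
    """Timeout actually applied if the configured wait is doubled (assumption)."""
    return wait_seconds * multiplier


copy_volume_wait = 72000  # value set in this thread, assumed to be seconds

timeout = effective_timeout_seconds(copy_volume_wait)
print(f"{timeout} s = {timeout / 3600:.0f} h")
```

Under those assumptions the copy command itself would be allowed to run for 40 hours, so a failure well before that would have to come from a different limit.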

From: Jeremy Hansen <jer...@skidrow.la.INVALID>
Date: Wednesday, 17 January 2024 at 12:14 PM
To: users@cloudstack.apache.org <users@cloudstack.apache.org>
Subject: Re: Issues migrating primary storage
Unfortunately the upgrade didn’t help:

Resource [StoragePool:3] is unreachable: Volume 
[{"name":"bigdisk","uuid":"8f24b8a6-229a-4311-9ddc-d6c6acb89aca"}] migration 
failed due to [com.cloud.utils.exception.CloudRuntimeException: Failed to copy 
/mnt/11cd19d0-f207-3d01-880f-8d01d4b15020/8f24b8a6-229a-4311-9ddc-d6c6acb89aca 
to 5837f4e6-9307-43a9-a50c-8c9c885f25e8.qcow2].

Anything else I can try? I’m trying to move away from NFS completely.

-jeremy

On Tuesday, Jan 16, 2024 at 7:06 AM, Suresh Kumar Anaparti 
<sureshkumar.anapa...@gmail.com<mailto:sureshkumar.anapa...@gmail.com>> wrote:
Hi Jeremy,

Can you extend the 'migratewait' config and check?

Regards,
Suresh

On Tue, Jan 16, 2024 at 1:45 PM Jeremy Hansen <jer...@skidrow.la.invalid>
wrote:

I have some large volumes I’m trying to migrate from NFS to Ceph/RBD. 1TB
volumes. These inevitably time out. I extended these configs:

copy.volume.wait=72000
job.cancel.threshold.minutes=480
job.expire.minutes=1440

This helped with smaller volumes but large ones still eventually fail.
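
The three settings above use different units, which makes them hard to compare at a glance. The sketch below converts them to hours, assuming copy.volume.wait is in seconds and the two job.* settings are in minutes, as their names suggest. It says nothing about which limit actually triggered the failure in the log that follows.

```python
# Values from the message above, normalised to seconds.
settings_seconds = {
    "copy.volume.wait": 72000,                  # assumed to be seconds
    "job.cancel.threshold.minutes": 480 * 60,   # minutes -> seconds
    "job.expire.minutes": 1440 * 60,            # minutes -> seconds
}

# Print shortest limit first.
for name, secs in sorted(settings_seconds.items(), key=lambda kv: kv[1]):
    print(f"{name:32s} {secs / 3600:5.1f} h")

shortest = min(settings_seconds, key=settings_seconds.get)
print(f"shortest limit: {shortest}")
```

With these values the job cancel threshold (8 h) is shorter than the copy wait (20 h), so if a single copy runs longer than 8 hours the job-level limit may be the one reached first.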

2024-01-16 07:50:25,929 DEBUG [c.c.a.t.Request]
(AgentManager-Handler-8:null) (logid:) Seq 1-5583619113009291196:
Processing: { Ans: , MgmtId: 20558852646968, via: 1, Ver: v1, Flags: 10,
[{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"result":"false","details":"com.cloud.utils.exception.CloudRuntimeException:
Failed to copy
/mnt/11cd19d0-f207-3d01-880f-8d01d4b15020/861a6692-e746-4401-9cda-bd791b7d3b5e
to
b7acadc8-34a1-4d7a-8040-26368dafc21d.qcow2","wait":"0","bypassHostMaintenance":"false"}}]
}

2024-01-16 07:50:26,698 DEBUG [c.c.s.VolumeApiServiceImpl]
(Work-Job-Executor-41:ctx-e5baf6dc job-1175/job-1176 ctx-bc7b188b)
(logid:d7d98b81) Failed to migrate volume
com.cloud.exception.StorageUnavailableException: Resource [StoragePool:3]
is unreachable: Volume
[{"name":"sequencingdata","uuid":"861a6692-e746-4401-9cda-bd791b7d3b5e"}]
migration failed due to [com.cloud.utils.exception.CloudRuntimeException:
Failed to copy
/mnt/11cd19d0-f207-3d01-880f-8d01d4b15020/861a6692-e746-4401-9cda-bd791b7d3b5e
to b7acadc8-34a1-4d7a-8040-26368dafc21d.qcow2].
at
org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.migrateVolume(VolumeOrchestrator.java:1348)
at jdk.internal.reflect.GeneratedMethodAccessor672.invoke(Unknown
Source)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
at com.sun.proxy.$Proxy227.migrateVolume(Unknown Source)
at
com.cloud.storage.VolumeApiServiceImpl.orchestrateMigrateVolume(VolumeApiServiceImpl.java:3356)
at
com.cloud.storage.VolumeApiServiceImpl.orchestrateMigrateVolume(VolumeApiServiceImpl.java:4721)
at jdk.internal.reflect.GeneratedMethodAccessor671.invoke(Unknown
Source)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at
com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
at
com.cloud.storage.VolumeApiServiceImpl.handleVmWorkJob(VolumeApiServiceImpl.java:4735)
at jdk.internal.reflect.GeneratedMethodAccessor670.invoke(Unknown
Source)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
at com.sun.proxy.$Proxy232.handleVmWorkJob(Unknown Source)
at
com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102)
at
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:620)
at
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
at
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
at
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
at
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
at
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
at
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:568)
at
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
2024-01-16 07:50:26,727 ERROR [c.c.v.VmWorkJobHandlerProxy]
(Work-Job-Executor-41:ctx-e5baf6dc job-1175/job-1176 ctx-bc7b188b)
(logid:d7d98b81) Invocation exception, caused by:
com.cloud.utils.exception.CloudRuntimeException: Resource [StoragePool:3]
is unreachable: Volume
[{"name":"sequencingdata","uuid":"861a6692-e746-4401-9cda-bd791b7d3b5e"}]
migration failed due to [com.cloud.utils.exception.CloudRuntimeException:
Failed to copy
/mnt/11cd19d0-f207-3d01-880f-8d01d4b15020/861a6692-e746-4401-9cda-bd791b7d3b5e
to b7acadc8-34a1-4d7a-8040-26368dafc21d.qcow2].

com.cloud.utils.exception.CloudRuntimeException: Resource [StoragePool:3]
is unreachable: Volume
[{"name":"sequencingdata","uuid":"861a6692-e746-4401-9cda-bd791b7d3b5e"}]
migration failed due to [com.cloud.utils.exception.CloudRuntimeException:
Failed to copy
/mnt/11cd19d0-f207-3d01-880f-8d01d4b15020/861a6692-e746-4401-9cda-bd791b7d3b5e
to b7acadc8-34a1-4d7a-8040-26368dafc21d.qcow2].
at
com.cloud.storage.VolumeApiServiceImpl.orchestrateMigrateVolume(VolumeApiServiceImpl.java:3363)
at
com.cloud.storage.VolumeApiServiceImpl.orchestrateMigrateVolume(VolumeApiServiceImpl.java:4721)
at jdk.internal.reflect.GeneratedMethodAccessor671.invoke(Unknown
Source)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at
com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
at
com.cloud.storage.VolumeApiServiceImpl.handleVmWorkJob(VolumeApiServiceImpl.java:4735)
at jdk.internal.reflect.GeneratedMethodAccessor670.invoke(Unknown
Source)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
at
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
at com.sun.proxy.$Proxy232.handleVmWorkJob(Unknown Source)
at
com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102)
at
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:620)
at
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
at
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
at
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
at
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
at
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
at
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:568)
at
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
2024-01-16 07:50:26,744 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(Work-Job-Executor-41:ctx-e5baf6dc job-1175/job-1176) (logid:d7d98b81)
Complete async job-1176, jobStatus: FAILED

This is 4.18.0.0.

Is there another timeout that might be at play here? [StoragePool:3] is
unreachable. Is StoragePool:3 referring to the NFS server or RBD? Or how
do I interpret StoragePool:3 and why it thinks it’s unavailable?

Thanks
-jeremy
