[BUG?] ACS 4.5.1 can't create VM from template on CEPH, bigger than 20GB

2015-06-10 Thread Andrija Panic
Hi guys,

we just experienced a very strange problem:

ACS 4.5.1 vanilla
Ubuntu 14.04.2 - latest Qemu* binaries (2.0.0+dfsg-2ubuntu1.11)
CEPH 0.94.1

We are unable to create a VM from a template that is 20GB or bigger on CEPH.
It works with 10, 14 and 18GB templates, but not with templates of 20GB or more...

The problem with 20GB or more: the mgmt server / agent tries twice to create
the CEPH/RAW volume from NFS/QCOW2, although the first execution is successful.
(Some timeout problem?)

So, the first time it detects that it needs to copy from QCOW2/NFS to RAW/CEPH,
it issues qemu-img convert and the execution succeeds.
Then it tries to do the EXACT same thing again, so of course CEPH gives an error
because the destination image already exists, and VM provisioning fails.

We don't see this behaviour with 10GB, 14GB or 18GB, only with 20GB and
bigger.


Please find logs from a host, I marked critical points with "!!!":

[{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/2/204/475f79e4-82ef-36f8-b1f2-acfcf541ca95.qcow2","origUrl":"http://xxx.yyy.180.244/userdata/d3494552-eb13-478e-84d6-da16f06a01a6.qcow2","uuid":"9a61c577-e74f-4c22-b7f8-5e642ea69cc2","id":204,"format":"QCOW2","accountId":2,"checksum":"b62eb99b8dd4ecfe40b61c91c8c037c0","hvm":true,"displayText":"andrija-debian7","imageDataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://10.23.2.1/data/tank/secondary","_role":"Image"}},"name":"204-2-f7d41a8b-de2e-3a71-a8cf-301bc76cb2ab","hypervisorType":"KVM"}},"destTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"origUrl":"http://xxx.yyy.180.244/userdata/d3494552-eb13-478e-84d6-da16f06a01a6.qcow2","uuid":"9a61c577-e74f-4c22-b7f8-5e642ea69cc2","id":204,"format":"QCOW2","accountId":2,"checksum":"b62eb99b8dd4ecfe40b61c91c8c037c0","hvm":true,"displayText":"andrija-debian7","imageDataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"8457c284-cf5d-3979-b82e-32ea5efeb97b","id":1,"poolType":"RBD","host":"mon.swiss2.local","path":"cold-storage","port":6789,"url":"RBD://mon.swiss2.local/cold-storage/?ROLE=Primary&STOREUUID=8457c284-cf5d-3979-b82e-32ea5efeb97b"}},"name":"204-2-f7d41a8b-de2e-3a71-a8cf-301bc76cb2ab","hypervisorType":"KVM"}},"executeInSequence":true,"options":{},"wait":21600}}]
2015-06-10 01:42:36,661 DEBUG [cloud.agent.Agent]
(agentRequest-Handler-4:null) Processing command:
org.apache.cloudstack.storage.command.CopyCommand
2015-06-10 01:42:36,661 INFO  [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-4:null) Attempting to create storage pool
b6d6c679-8475-31f6-b28f-a2e56746c7b9 (NetworkFilesystem) in libvirt
2015-06-10 01:42:36,669 DEBUG [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-4:null) [storage pool XML; the tags were lost in the archive, only these values survive]
b6d6c679-8475-31f6-b28f-a2e56746c7b9
b6d6c679-8475-31f6-b28f-a2e56746c7b9
/mnt/b6d6c679-8475-31f6-b28f-a2e56746c7b9
2015-06-10 01:42:37,764 DEBUG [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-4:null) Trying to fetch storage pool
b6d6c679-8475-31f6-b28f-a2e56746c7b9 from libvirt
2015-06-10 01:42:37,775 DEBUG [kvm.storage.KVMStorageProcessor]
(agentRequest-Handler-4:null) Copying template to primary storage, template
format is qcow2
2015-06-10 01:42:37,775 DEBUG [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-4:null) Trying to fetch storage pool
8457c284-cf5d-3979-b82e-32ea5efeb97b from libvirt
2015-06-10 01:42:37,781 DEBUG [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-4:null) copyPhysicalDisk: disk size:1203378688,
virtualsize:25769803776 format:qcow2
2015-06-10 01:42:37,781 DEBUG [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-4:null) The source image is not RBD, but the
destination is. We will convert into RBD format 2

2015-06-10 01:42:37,782 DEBUG [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-4:null) Starting copy from source image
/mnt/b6d6c679-8475-31f6-b28f-a2e56746c7b9/475f79e4-82ef-36f8-b1f2-acfcf541ca95.qcow2
to RBD image cold-storage/9a61c577-e74f-4c22-b7f8-5e642ea69cc2

(!!)
2015-06-10 01:42:37,782 DEBUG [utils.script.Script]
(agentRequest-Handler-4:null) Executing:
qemu-img convert -O raw
/mnt/b6d6c679-8475-31f6-b28f-a2e56746c7b9/475f79e4-82ef-36f8-b1f2-acfcf541ca95.qcow2
rbd:cold-storage/9a61c577-e74f-4c22-b7f8-5e642ea69cc2:mon_host=mon.swiss2.local:auth_supported=cephx:id=admin:key=AQA9E2pVzqotGRAALYrGr1zI+eO0i5F5ghU5/g==:rbd_default_format=2:client_mount_timeout=30

2015-06-10 01:47:12,430 DEBUG [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-4:null) Succesfully converted source image
/mnt/b6d6c679-8475-31f6-b28f-a2e56746c7b9/475f79e4-82ef-36f8-b1f2-acfcf541ca95.qcow2
to RBD image cold-storage/9a61c577-e74f-4c22-b7f8-5e642ea69cc2
2015-06-10 01:47:12,549 DEBUG [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-4:null) Succesfully connected to Ceph cluster at
mon.swiss2.local:6789
2015-06-10 01:47:12,576 INFO  [kvm.storage.LibvirtStorageAdaptor]
(agentRequest-Handler-4:null) Attempting to remove storage pool
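
To put a number on the timeout theory, the two timestamps around the qemu-img convert call in the log above can be subtracted, and the logged virtualsize converted to GiB. A small Python sketch; the helper name elapsed_seconds is made up for illustration, and only the values copied from the log are real:

```python
from datetime import datetime

# Timestamp layout used by the agent log lines quoted above.
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two agent-log timestamps (hypothetical helper)."""
    t0 = datetime.strptime(start, LOG_TS_FORMAT)
    t1 = datetime.strptime(end, LOG_TS_FORMAT)
    return (t1 - t0).total_seconds()


# Timestamps copied from the log: the "Executing: qemu-img convert" line
# and the "Succesfully converted source image" line.
convert_seconds = elapsed_seconds("2015-06-10 01:42:37,782",
                                  "2015-06-10 01:47:12,430")

# virtualsize reported by copyPhysicalDisk, in bytes.
virtual_size_gib = 25769803776 / 2**30

print(f"convert took {convert_seconds:.3f} s "
      f"for a {virtual_size_gib:.0f} GiB image")
```

For this template that works out to about 274.6 seconds (roughly four and a half minutes) for a 24 GiB virtual size. Whether that duration crosses any timeout is not something this log excerpt shows.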

Re: [BUG?] ACS 4.5.1 can't create VM from template on CEPH, bigger than 20GB

2015-06-10 Thread Andrija Panic
Actually,

my colleague made a patch for
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/storage/LibvirtStorageAdaptor.java,
where we first check whether the image is already present on CEPH, and only
if it is not does the agent try to copy it to CEPH.

But this is not a permanent solution, since ACS should not try to copy the
image twice in the first place, only once...
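
The check-before-copy idea described above can be sketched roughly as follows. This is not the actual patch (which lives in the Java agent and is not shown in this thread); it is a Python illustration with hypothetical names (run_quiet, copy_template_if_missing), and it assumes that "qemu-img info" exits with 0 only when the target image can be opened:

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the process exit code.
Runner = Callable[[Sequence[str]], int]


def run_quiet(argv: Sequence[str]) -> int:
    """Run a command, discard its output, return its exit code."""
    return subprocess.call(list(argv),
                           stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL)


def copy_template_if_missing(src_path: str, rbd_target: str,
                             runner: Runner = run_quiet) -> bool:
    """Convert src_path to rbd_target unless the target already exists.

    Returns True if a copy was made, False if it was skipped.
    """
    # Assumption: "qemu-img info" returns 0 only if the image can be opened.
    if runner(["qemu-img", "info", rbd_target]) == 0:
        return False
    # Same conversion the agent log shows: qemu-img convert -O raw <src> <dst>
    rc = runner(["qemu-img", "convert", "-O", "raw", src_path, rbd_target])
    if rc != 0:
        raise RuntimeError(f"qemu-img convert failed with exit code {rc}")
    return True
```

Note that an image which already exists is not necessarily a complete one, so a check like this only hides the second copy attempt; it does not explain why the copy is requested twice.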

Is anybody able to test creating a VM from a bigger template on CEPH?

Thanks,


