I am testing the migration from CentOS7/oVirt 4.3 to CentOS8/oVirt 4.4.

Exporting all VMs to OVAs and re-importing them on a new cluster built from 
scratch seems the safest and best method, because in the step-by-step 
migration there are simply far too many things that can go wrong and no easy 
way to fall back after each step.

But of course it requires that one of the most essential operations in a 
hypervisor actually works.

For me a hypervisor turns a machine into a file and a file into a machine: 
That's the most appreciated and fundamental quality of it.

oVirt fails right there, repeatedly, for different reasons, and without even 
reporting an error.

So I have manually put the single-line fix in, which settles udev to ensure 
that disks are not exported as zeros. That's the bug which renders the final 
release oVirt 4.3 forever unfit, 4 years before the end of maintenance of 
CentOS7, because it won't be fixed there.

Exporting VMs and re-importing them on another farm generally seemed to work 
after that.

But as soon as I exported not one of the trivial machines that I have been 
using for testing, but one of the bigger ones that actually contains a 
significant amount of data, I found myself hitting this timeout bug.

The disks for both the trivial and the less trivial machine are defined at 
500GB, thinly allocated. The trivial one is the naked OS with something like 
7GB actually allocated; the 'real' one has 113GB allocated. In both cases the 
OVA export file, written to a local SSD xfs partition, is 500GB in size, with 
lots of zeros and sparse allocation in the case of the first one.

The second came to 72GB of 500GB actually allocated, which didn't seem like a 
good sign already, but perhaps there was some compression involved?
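
For anyone wanting to check the same thing on their own exports, here is a 
minimal sketch that compares a file's apparent size with what is actually 
allocated on disk. It assumes a Linux host, where st_blocks is counted in 
512-byte units; the helper name sparse_report is my own and not part of oVirt. 
Allocation can legitimately differ between formats, so a smaller number is a 
hint to look closer, not proof of truncation.

```python
import os
import sys


def sparse_report(path):
    """Return (apparent_bytes, allocated_bytes) for a possibly sparse file."""
    st = os.stat(path)
    # On Linux st_blocks is counted in 512-byte units,
    # independent of the filesystem's own block size.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    # Usage: python sparse_report.py <file> [<file> ...]
    for p in sys.argv[1:]:
        apparent, allocated = sparse_report(p)
        print(f"{p}: apparent {apparent / 2**30:.1f} GiB, "
              f"allocated {allocated / 2**30:.1f} GiB")
```

This reports the same two numbers as comparing "ls -l" against "du" for the 
file.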

Still, the export finished without error or incident, and the import on the 
other side went just as well. The machine even boots and runs; it was only 
once I started using it that I suddenly had all types of file system errors... 
It turns out that roughly 40GB (the 113GB allocated minus the 72GB that 
arrived) were really cut off and missing from the OVA export, and there is 
nobody and nothing checking for that.
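
The kind of check I am missing could be as simple as the following sketch. It 
assumes that both the source disk and the restored disk can be read as plain 
raw image files of the same virtual size, which is an assumption on my part; 
for qcow2 images the files would differ byte for byte even when the guest data 
is identical, and "qemu-img compare" is the tool meant for that case. The 
function names are hypothetical.

```python
import hashlib


def image_digest(path, chunk_size=4 * 1024 * 1024):
    """SHA-256 of a file, read in chunks so large images need not fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def images_match(source, restored):
    """True only if both files have byte-for-byte identical content."""
    return image_digest(source) == image_digest(restored)
```

A truncated copy produces a different digest, so a mismatch here would have 
flagged the problem before the VM was ever booted.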

I now know that qemu-img is used in the process, which actually runs in a pipe. 
There is no checksumming or any other logic involved to ensure that the format 
conversion of the disk image has retained the integrity of the image. There is 
no obvious simple solution that I can think of, but the combination of 
processing the image through a pipe and an impatient ansible timeout results in 
a hypervisor which fails on the most important elementary task: Turn a machine 
into a file and back into a machine.
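
To illustrate what I would expect from a timeout, here is a sketch in which 
hitting the limit is treated as a hard failure rather than as the end of the 
job. It is not how oVirt's ansible code is structured, which I have not read 
in detail; the function name run_or_fail and the stand-in command are purely 
hypothetical.

```python
import subprocess
import sys


def run_or_fail(cmd, timeout_s):
    """Run cmd; raise if it times out or exits non-zero, never continue silently."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        subprocess.run(cmd, check=True, timeout=timeout_s)
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(
            f"step exceeded {timeout_s}s; its output must be treated as incomplete"
        ) from exc
    except subprocess.CalledProcessError as exc:
        raise RuntimeError(f"step failed with exit code {exc.returncode}") from exc


if __name__ == "__main__":
    # Stand-in for a long-running conversion: sleeps longer than the timeout.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    try:
        run_or_fail(slow, timeout_s=1)
    except RuntimeError as err:
        print("export aborted:", err)
```

The point is only that a timed-out step ends in an error the caller cannot 
ignore, so a partial image never gets reported as a successful export.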

IMHO it makes oVirt a toy, not a tool. And worst of all, I am pretty sure that 
RHV has the same quality, even if the sticker price is probably quite different.

I have the export domain backup running right now, but I'm not sure it isn't 
using the same mechanism under the covers, with potentially similar results.

Yes, I know there is a Python script that will do the backup, and probably with 
full fidelity. And perhaps with this new infrastructure as code approach, that 
is how it should be done anyway.

But if you have a GUI, that should just work: No excuses.

P.S. The allocation size of the big VM in the export domain is again 72GB, with 
the file size at 500GB. I'll run the import on the other end, but by now I am 
pretty sure the result will be no different.

Unless you resort to Python or some external tool, there is no reliable way to 
back up and restore a VM that needs more than 80 seconds worth of data 
transfer, and no warning when corruption occurs.

I am not sure you can compete with Nutanix and VMware at this level of 
reliability.

P.P.S. So just where (and on which machine) do I need to change the timeout?