That very much describes my own situation two years ago, with just a slight time and geographic offset, as my home is near Frankfurt and my work is in Lyon. I had been doing 70:1 consolidation via virtualization based on OpenVZ (containers, but with an IaaS abstraction) since 2006, because it required zero hardware and software budget, whereas VMware would have required both VT-x capable hardware and pricey licenses.
OpenVZ turned out to be extremely easy to use and super reliable, and the sysadmin only had to learn two or three new commands, not a new abstraction layer. We were running payment front office systems, where five minutes of downtime means a very angry boss, one hour of downtime costs a yearly bonus, and beyond that you have to gather your things and go. That also meant planned 100% availability of eventually consistent data, so nothing you could do on a file level: at the time that meant Oracle with Streams replication and programmed healing or nothing, because you can't beat the CAP theorem. (No, this isn't a recommendation to go with OpenVZ: that project is about as lively as Docker these days.)
For a new context with far lighter availability demands but the need to support GPUs for machine learning (CUDA breaks OpenVZ containers), I thought that oVirt would add managed heterogeneous VMs and hyperconverged storage to the mix, again at zero entrance fee and with a supported option if that became necessary.
Two years later, I'd say that "I married the wrong woman". It works, but the claim "oVirt is an open-source distributed virtualization solution, designed to manage your entire enterprise infrastructure" is terribly misleading. I had planned to get it reasonably stable and ready within weeks of my typical stress and failure testing on a 1/3 time budget, but it took almost 2 years and more like a 50% time allocation to learn all the very many things that can go wrong, how to diagnose them and how to fix them.
Just yesterday one cluster (my first functional test setup, a 3-node HCI running on Atoms that mostly just sits idle and gets updates, which require reboots) had evidently decided to lose one Gluster network connection and accumulated 5000+ entries in the heal queue during a week of vacation.
It took four hours of careful digging, hundreds of restarted daemons, various reboots and a transient situation where the three storage nodes could not see or access the storage they were providing, while the management engine ran on a compute node and continued to write to a disk that evidently did not exist... As a newbie I would have either jumped off a cliff (active users) or tossed the project. However, the magnificent basic design of oVirt let me recover everything without a loss... except hair, nerves, general health etc.
And I had to learn the hard way that exporting a VM as an OVA and expecting it to be importable on any other platform advertising OVA support, or even back into oVirt, is not functionality ever included in any regular QA testing... which might explain why it doesn't work. Or perhaps it just no longer works after an update.
In short: do not expect anything to work that you have not fully tested, several times, after every minor update. Everything is extremely raw and ready to break at any moment, and I can't remember the last time I did a plain vanilla install on freshly scrubbed hardware where I didn't have to help it along manually and by digging through dozens of big log files.
What really troubles me: since the basic ingredients for the commercial product are the same, I don't see how that might save your bacon. Perhaps 100% validated hardware might do it (for a while), but while oVirt is *designed* for a maximum of flexibility, it won't reward your taking advantage of that.
I am sticking with it until CentOS7 is end of life, too (just like CentOS8 is already), because otherwise I'd have nothing to show for two years of work. But if you want to join in, you need to have serious resources to commit. It's most likely still a smaller commitment than OpenStack, though. And if you have NetApp filers or SAN, you should not risk HCI.
HCI is super elegant as a concept, just like Gluster is a beautiful concept, but very soon you'll realize that they were never designed for each other and remain full of contradictions. oVirt may be designed to fit that enterprise role, but in the HCI variant it still has nowhere near the cohesion and maturity you'd need for it. CentOS, LVM, VDO, KVM, the management engine, Gluster and Ansible are all distinct products from what used to be different companies. And it shows.
Of course, that's just my personal experience and opinion.