oVirt may have started as a vSphere 'look-alike', but it has since graduated to a Nutanix 'clone', at least in terms of marketing.
IMHO that means the default 3-node hyperconverged oVirt setup (2 replicas and 1 arbiter) deserves special love when it comes to documenting failure scenarios.

3-node HCI is supposed to defend you against the long-term effects of any single point of failure. There is no protection against the loss of dynamic state/session data, but stateless services should recover or resume: that's what it's all about.

Sadly, what I find missing in the oVirt and Gluster documentation is an SOP (standard operating procedure) to follow on a late-night/early-morning on-call wakeup when one of those three HCI nodes has failed, either dramatically or via a 'brown-out', e.g. where only the storage part was actually lost.

My impression is that the oVirt and Gluster teams are barely talking, but in HCI that's fatal.

And I sure can't find those recovery procedures, not even in the commercial RH documents. So please, either add them or show me where I missed them.