Thanks for the replies, all!

Yep, Chris is right. TrueNAS HA is active/passive and there isn't a way around 
that when failing between heads.

Sven: In my experience with iX support, they have directed me to reboot the 
active node to initiate failover. There are "hactl takeover" and "hactl 
giveback" commands, but reboot seems to be their preferred method.

VMs going into a paused state and resuming when storage is back online sounds 
great. As long as oVirt's pause/resume isn't significantly slower than the 
30-or-so seconds the TrueNAS takes to complete its failover, that's a pretty 
tolerable interruption for my needs. So my next questions are:

1) Assuming the SAN failover DOES work correctly, can anyone comment on their 
experience with oVirt pausing/thawing VMs in an NFS-based active/passive SAN 
failover scenario? Does it work reliably without intervention? Is it reasonably 
fast?

2) Is there anything else in the oVirt stack that might cause it to "freak out" 
rather than gracefully pause/unpause VMs?

2a) Particularly: I'm running hosted engine on the same TrueNAS storage. Does 
that change anything WRT timeouts, oVirt's HA, fencing, sanlock, and such?

2b) Is there a limit to how long oVirt will wait for storage before doing 
something more drastic than just pausing VMs?
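
A rough way to put a number on the interruption asked about in question 1 is 
to time small synced writes to a file on the NFS-backed storage while a 
failover is triggered. The sketch below is only an illustration, not an oVirt 
or TrueNAS tool: the function names, the probe file path, and the sample 
counts are all hypothetical, and it assumes a hard NFS mount, where a write 
blocks until storage returns, so the slowest write approximates the stall.

```python
import os
import time


def longest_stall(probe, samples, interval=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Call probe() `samples` times and return the slowest call in seconds."""
    worst = 0.0
    for _ in range(samples):
        started = clock()
        probe()  # blocks while storage is unavailable (hard-mount assumption)
        worst = max(worst, clock() - started)
        sleep(interval)
    return worst


def synced_write_probe(path):
    """Build a probe that writes one byte to `path` and fsyncs it."""
    def probe():
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        try:
            os.write(fd, b"x")
            os.fsync(fd)  # force the write out to the storage server
        finally:
            os.close(fd)
    return probe


if __name__ == "__main__":
    # Hypothetical path: point this at a file on the storage being failed over.
    probe = synced_write_probe("/mnt/nfs-test/failover-probe")
    print("longest stall: %.1f s" % longest_stall(probe, samples=120))
```

Run inside a guest, this measures what the VM's own disk sees, which includes 
any pause/resume time on top of the SAN failover itself; run on a host 
against the NFS mount, it measures the storage stall alone.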

--
Matthew Trent
Network Engineer
Lewis County IT Services
360.740.1247 - Helpdesk
360.740.3343 - Direct line

________________________________________
From: users-boun...@ovirt.org <users-boun...@ovirt.org> on behalf of Chris 
Adams <c...@cmadams.net>
Sent: Tuesday, June 6, 2017 7:21 AM
To: users@ovirt.org
Subject: Re: [ovirt-users] Seamless SAN HA failovers with oVirt?

Once upon a time, Juan Pablo <pablo.localh...@gmail.com> said:
> Chris, if you have active-active with multipath: you upgrade one system,
> reboot it, check it came active again, then upgrade the other.

Yes, but that's still not how a TrueNAS (and most other low- to
mid-range SANs) works, so is not relevant.  The TrueNAS only has a
single active node talking to the hard drives at a time, because having
two nodes talking to the same storage at the same time is a hard problem
to solve (typically requires custom hardware with active cache coherency
and such).

You can (and should) use multipath between servers and a TrueNAS, and
that protects against NIC, cable, and switch failures, but does not help
with a controller failure/reboot/upgrade.  Multipath is also used to
provide better bandwidth sharing between links than ethernet LAGs.

--
Chris Adams <c...@cmadams.net>
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users