On 03.03.2014 12:24, Andrei Mikhailovsky wrote:
I am using HA for about 30% of the guest vms, but my testing showed
that HA is not working reliably with KVM. It works pretty well if you
initiate a vm shutdown inside a guest without using the ACS GUI.
However, when the host goes down for whatever reason (power failure,
init 6/0, network failure, etc.) the HA fails to kick in and restart
the vms.

This should be submitted as a bug. Which version are you on?



Regarding the nfs storage, I did not put the nfs server in the
maintenance mode. Would this solve the problem with reboots? I will
try it next time when I am doing maintenance on the nfs, but I do
recall that I've restarted the nfs server in the past and
I've not seen the hosts rebooting themselves. Is there a timeout which
causes the hosts to reboot?

I'm not sure what the timeout is; I'd be interested in finding out as well.

To the best of my knowledge, when you put primary storage into maintenance mode, ACS will shut down the VMs on it. Otherwise ACS uses the shared storage to maintain HA (so your HA is only as good as your shared storage ...). If the link to the shared storage goes down, the host assumes something is wrong and shuts down (fences itself); this is the correct and expected behaviour. Maybe your network got segmented, etc.
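
To illustrate the idea only, here is a rough Python sketch of storage-heartbeat fencing. This is not the actual ACS agent code: the function names, the heartbeat file naming and the three-failure threshold are all my own assumptions.

```python
import os
import time


def write_heartbeat(mount_point, host_id):
    """Try to write a timestamp file on the shared storage.

    Returns True on success, False if the write fails.
    The "hb-<host_id>" file name is hypothetical.
    """
    path = os.path.join(mount_point, "hb-" + host_id)
    try:
        with open(path, "w") as fh:
            fh.write(str(int(time.time())))
            fh.flush()
            os.fsync(fh.fileno())  # force the write out to the storage
        return True
    except OSError:
        return False


def should_fence(results, max_failures=3):
    """Return True once the last max_failures heartbeat attempts all failed.

    results is a list of booleans, oldest first; the threshold of 3
    is an assumed value, not a documented ACS setting.
    """
    if len(results) < max_failures:
        return False
    return not any(results[-max_failures:])
```

A host running something like this would call write_heartbeat() periodically, keep the results, and fence itself once should_fence() returns True. Note that on a hung NFS mount a write usually blocks rather than fails, so a real implementation would also need a timeout around the write.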




In any case, I think it is not safe to do an automated host server
reboot and if it was up to me I would disable this feature from the
agent. IMHO this should be down to the system administrator, and the acs agent
should send an alert email if something goes wrong instead of
rebooting the host servers.

I'm not sure what to tell you; HA is a sensitive and complex subject. For now I'm OK with this behaviour, and I see it implemented similarly in XenServer, too.



I am using ceph for my primary storage for guest vms data and root
disks. The NFS is used as a backup disk offering for the guest.


Andrei



--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro
