Hi Simon, same here: 4.8, heavily patched. We were on NFS and/or Ceph back in the days of these issues (KVM also).

@Sean, this is a really interesting finding, at least for avoiding 2 VMs running on top of the same image, but otherwise it doesn't solve the HA mechanism (4.11 is supposed...)

Thx for the info guys

On 15 February 2018 at 23:39, Sean Lair <sl...@ippathways.com> wrote:

> Thanks for the replies everyone.
>
> After further investigating, I am seeing how broken VM HA is right now (at
> least in 4.9.3).
>
> We've started patching the code so it works again, but once we fixed it -
> we hit the dreaded VMs running on 2 different hosts... not good!
>
> We are KVM w/ NFS. It looks like the standard CloudStack documentation
> doesn't specify to use the built-in locking mechanism in libvirtd. Looks
> like an easy solution, as if we are locking the VM's disk files, it
> shouldn't be able to come up on another host...
>
> I've seen some of the talk about IPMI being used for Host HA in 4.11...
> but we don't have IPMI setup yet. The locking mechanisms in libvirtd seem
> like the best idea to us so far - but we are just starting to look into it
> and implement it.
>
> https://libvirt.org/locking-lockd.html
>
> It reminds us of how VMware vSphere does locking, which works great.
>
>
>
> -----Original Message-----
> From: Andrija Panic [mailto:andrija.pa...@gmail.com]
> Sent: Wednesday, February 14, 2018 3:22 AM
> To: dev <dev@cloudstack.apache.org>
> Subject: Re: System VMs not migrating when host down
>
> Humble opinion (until HOST HA is ready in 4.11 if not mistaken?), avoid
> using HA option for VMs - avoid setting the "Offer HA" option on any
> compute/service offerings, since we did end up (was it ACS 4.5 or 4.8,
> can't remember now) having 2 copies of SAME VM running on 2 different
> hosts...imagine storage/volume corruption...this happened a few times for
> us.
>
> HOST HA looks like really a nice thing, I have not tested that yet...but
> should completely solve the problem.
>
> On 14 February 2018 at 10:14, Paul Angus <paul.an...@shapeblue.com> wrote:
>
> > Hi Sean,
> >
> > The 'problem' with VM HA in KVM is that it relies on the parent host
> > agent to be connected to report that the VM is down. We cannot assume
> > that just because a host agent is disconnected, that the VMs on that
> > host are not running.
> >
> > This is where HOST HA comes in, this feature detects loss of
> > connection to the agent and then tries to determine if the VMs on that
> > host are active and then attempts some corrective action.
> >
> >
> > Kind regards,
> >
> > Paul Angus
> >
> > paul.an...@shapeblue.com
> > www.shapeblue.com
> > 53 Chandos Place, Covent Garden, London WC2N 4HS UK
> > @shapeblue
> >
> >
> >
> >
> > -----Original Message-----
> > From: Sean Lair [mailto:sl...@ippathways.com]
> > Sent: 13 February 2018 23:06
> > To: dev@cloudstack.apache.org
> > Subject: System VMs not migrating when host down
> >
> > Hi all,
> >
> > We are testing VM HA and are having a problem with our system VMs
> > (secondary storage and console) not being started up on another host
> > when a host fails.
> >
> > Shouldn't the system VMs be VM HA-enabled? Currently they are just in an
> > "Alert" agent state, but never migrate. We are currently running 4.9.3.
> >
> >
> > Thanks
> > Sean
> >
> >
>
> --
>
> Andrija Panić
>

--

Andrija Panić
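
For anyone following the disk-locking idea from Sean's message, here is a toy sketch of the principle only: whoever holds an exclusive lock on the disk image is the only one allowed to start the VM. It is not how libvirt's lock daemon is implemented or configured. The function names, the temporary ".qcow2" file, and the use of flock() on a local file as a stand-in for a lock on shared storage are all assumptions made for illustration.

```python
import fcntl
import os
import tempfile


def try_lock_disk(path):
    """Try to take an exclusive, non-blocking lock on a disk image file.

    Returns the open file descriptor on success (keep it open to hold the
    lock), or None if some other holder already has the lock.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def demo():
    # Hypothetical stand-in for a VM disk image; a real one would live on
    # shared storage that every host can see.
    with tempfile.NamedTemporaryFile(suffix=".qcow2") as image:
        host_a = try_lock_disk(image.name)  # first "host" starts the VM
        host_b = try_lock_disk(image.name)  # second "host" must be refused
        first_won = host_a is not None
        second_refused = host_b is None

        os.close(host_a)                    # VM stops, lock is released
        host_c = try_lock_disk(image.name)  # now another start may proceed
        restart_allowed = host_c is not None
        os.close(host_c)

        return first_won, second_refused, restart_allowed


if __name__ == "__main__":
    print(demo())
```

In libvirt itself this is configuration rather than code: the page Sean linked describes enabling the lock daemon with lock_manager = "lockd" in /etc/libvirt/qemu.conf. As noted at the top of this mail, locking of this kind only prevents the same image from being started twice; it does not decide when a VM should be restarted elsewhere.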