So today, we aren't using a watchdog timer on KVM, but Wido has submitted a PR to include one. What that means is that today the VM container has to fail before an HA action will be taken. If the VM just kernel panics and doesn't have its own kernel watchdog timer, the Mgmt Server will still believe the VM is alive and well.
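To make the point above concrete, here is a rough Python sketch of the behaviour as I understand it from this thread. It is only a toy model, not ACS code: the class and function names are made up for illustration, and the HA-enabled offering check reflects what others report elsewhere in the thread.

```python
from dataclasses import dataclass


@dataclass
class VmState:
    """Hypothetical snapshot of one guest VM (not an ACS data structure)."""
    ha_enabled_offering: bool    # service offering has the HA flag on
    vm_container_running: bool   # the VM container still exists on the host
    guest_kernel_panicked: bool  # guest OS has panicked inside the container


def mgmt_server_sees_failure(vm: VmState) -> bool:
    # Assumption from the thread: with no watchdog timer, only the loss of
    # the VM container is visible; a guest kernel panic is not.
    return not vm.vm_container_running


def ha_restart_expected(vm: VmState) -> bool:
    # HA acts only for HA-enabled offerings, and only on a visible failure.
    return vm.ha_enabled_offering and mgmt_server_sees_failure(vm)


# A panicked guest whose container is still running is not restarted.
panicked = VmState(ha_enabled_offering=True,
                   vm_container_running=True,
                   guest_kernel_panicked=True)
print(ha_restart_expected(panicked))
```

A watchdog timer would, in effect, turn the panicked-guest case into a visible failure, which is the gap the PR is meant to close.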
There is also a lot of work underway to greatly improve KVM HA overall. A good chunk of this has already been merged around IPMI management and control of the underlying host servers. There is a new proposal currently in development that will better handle host failure and force a fencing action to make sure we have no chance of data corruption. Today, if a host fails and the agent goes down hard with it, VMs that were on the host will not be HA'd to another host until an operator takes action to remove that host from ACS.

- Si

________________________________
From: Audrey Roberto B Baldin <audrey.bal...@unitelco.com.br>
Sent: Wednesday, July 12, 2017 8:28 AM
To: users
Subject: Re: KVM hypervm and HA

Hello Victor,

As far as I have tested HA on KVM, it only kicks in if:

i. The compute offering you are using to create the guest VM has the Offer HA flag on;
ii. The KVM crashes unexpectedly.

To force a crash you can simulate a kernel panic using the command: echo c > /proc/sysrq-trigger.

I had some issues where one KVM agent, for some unknown reason, was disconnecting from the orchestrator, although it was working and the guest VMs were running. In this situation I couldn't do anything with the guest VMs (move, reload, shutdown) and HA didn't work.

Hope you find this information useful.

Regards,
Audrey

----- Original message -----
From: "Ivan Kudryavtsev" <kudryavtsev...@bw-sw.com>
To: "victor" <vic...@ihnetworks.com>
Cc: "users" <users@cloudstack.apache.org>
Sent: Wednesday, July 12, 2017 6:55:00
Subject: Re: KVM hypervm and HA

You should mark the VM service offering as HA-enabled and ACS will start the VMs automatically. VR, SSVM, and CP VMs are also started automatically as needed.

2017-07-12 16:53 GMT+07:00 victor <vic...@ihnetworks.com>:

> Hello,
>
> Thanks for the update. When a single VM in "hypervm1" is stopped or down,
> it gets started automatically on the second host. But when the entire
> hypervm1 server is down completely, the VMs inside it are not getting
> started on the secondary "hypervm2". Is this feature version specific?
>
> Regards
> Victor
>
>
> On 07/12/2017 08:26 AM, Ivan Kudryavtsev wrote:
>
>> Hi, Victor. They both will be moved automatically to a new host by ACS.
>>
>> On Jul 12, 2017 at 5:23, "victor" <vic...@ihnetworks.com> wrote:
>>
>> Hello,
>>>
>>> I have a cloudstack system with one management server, one nfs server
>>> and two kvm hypervm host servers. Initially I configured cloudstack
>>> with hypervm1, so the system vm and console proxy got created on it.
>>> Later I added hypervm2 and created a few instances on it. So my
>>> question is the following.
>>>
>>> --------
>>>
>>> 1. If hypervm1, which contains the system vm, console proxy and
>>> router vm, is down, what will happen?
>>>
>>> 2. If hypervm1 is down completely and we can't bring it back up, will
>>> the system vm, console proxy and router vm be switched to hypervm2
>>> automatically? If not, is there any option for that?
>>>
>>> ====
>>>
>>> Regards
>>>
>>> Victor
>>>
>>>
>>>
>

--
With best regards, Ivan Kudryavtsev
Bitworks Software, Ltd.
Cell: +7-923-414-1515
WWW: http://bitworks.software/ <http://bw-sw.com/>