[GitHub] cloudstack issue #1640: CLOUDSTACK-9458: Fix HA bug when VMs are stopped on ...

koushik-das Sun, 18 Sep 2016 22:57:53 -0700

Github user koushik-das commented on the issue:

    https://github.com/apache/cloudstack/pull/1640
  
    @abhinandanprateek In latest master, the sequence of events described above 
only happens once the host has been determined to be 'Down'; see the code 
below. So the bug described won't happen. Earlier, the same sequence was 
triggered even when the host state was 'Alert', which possibly killed healthy 
VMs.
    
    > if (host != null && host.getStatus() == Status.Down) {
    >     _haMgr.scheduleRestartForVmsOnHost(host, true);
    > }
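
    To make the difference explicit, here is a minimal, self-contained sketch of the gating described above. The names `HostStatus`, `RestartGate`, `shouldScheduleRestart` and `shouldScheduleRestartLegacy` are hypothetical stand-ins for illustration, not CloudStack classes, and the "legacy" variant is only an assumption based on the description in this comment that 'Alert' used to trigger the same sequence.

```java
// Hypothetical, simplified stand-ins; these are not the actual CloudStack types.
enum HostStatus { Up, Alert, Down }

final class RestartGate {
    private RestartGate() {}

    // Behaviour described for latest master: only a host determined
    // to be Down qualifies for the restart/fencing sequence.
    static boolean shouldScheduleRestart(HostStatus status) {
        return status == HostStatus.Down;
    }

    // Assumed earlier behaviour, per the comment above: a host in
    // Alert state also triggered the same sequence.
    static boolean shouldScheduleRestartLegacy(HostStatus status) {
        return status == HostStatus.Alert || status == HostStatus.Down;
    }

    public static void main(String[] args) {
        // Print both decisions for every status to show where they differ.
        for (HostStatus s : HostStatus.values()) {
            System.out.println(s + ": current=" + shouldScheduleRestart(s)
                    + ", legacy=" + shouldScheduleRestartLegacy(s));
        }
    }
}
```

    The two checks differ only for 'Alert', which is the state in which healthy VMs could previously have been affected.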
    
    If there is still a possibility of healthy VMs getting killed, then the 
scenario needs to be clearly identified. If anything needs fixing, the first 
step should be to improve the VM investigators rather than change the 
existing fencing logic.
    
    If we go ahead with the above fix, I can think of the following 
scenario that breaks: in a genuine host-down scenario, non-HA VMs 
remain in the 'Running' state and no operations can be performed on them. 
Currently, non-HA VMs are marked as 'Stopped' after fencing succeeds, so 
they can be manually started on another host.
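
    The current behaviour for a genuinely down host can be sketched roughly as follows. This is an illustrative model only: `Vm`, `VmState`, `PostFenceSketch` and `handleFencedHost` are hypothetical names, and marking every fenced VM as Stopped before queueing the HA-enabled ones is a simplifying assumption, not a copy of the CloudStack HA manager.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model; these are not CloudStack's VM or HA classes.
enum VmState { Running, Stopped }

final class Vm {
    final String name;
    final boolean haEnabled;
    VmState state = VmState.Running;

    Vm(String name, boolean haEnabled) {
        this.name = name;
        this.haEnabled = haEnabled;
    }
}

final class PostFenceSketch {
    private PostFenceSketch() {}

    // Called only after fencing of the down host has succeeded.
    // Assumption: every VM on the host is marked Stopped, so that
    // non-HA VMs can later be started manually on another host.
    // Returns the HA-enabled VMs, which would be queued for restart.
    static List<Vm> handleFencedHost(List<Vm> vmsOnHost) {
        List<Vm> toRestart = new ArrayList<>();
        for (Vm vm : vmsOnHost) {
            vm.state = VmState.Stopped;
            if (vm.haEnabled) {
                toRestart.add(vm);
            }
        }
        return toRestart;
    }
}
```

    If the Stopped transition were skipped for non-HA VMs, they would stay in 'Running' on a host that no longer exists, which is the broken scenario described above.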



