[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15685711#comment-15685711
 ] 

ASF GitHub Bot commented on CLOUDSTACK-9458:
--------------------------------------------

Github user abhinandanprateek commented on the issue:

    https://github.com/apache/cloudstack/pull/1640
  
    @marcaurele for a host that is found to be down, we go ahead and schedule a 
restart for HA-enabled VMs; this is good.
    
    VMs that are not HA enabled will continue to show as running. This works 
in the scenario where the host eventually comes back. But if the host is gone 
for a long time or forever, the VMs will continue to show as running, and the 
user will have to guess that they need to stop and then start the VM. Can you 
check whether the VMs will eventually be marked down by VM sync? If so, I 
think this fix should be good.
    
    Another suggestion: for the specific case where a host drops and then comes 
back within a certain interval, can we make the timeout after which a host is 
marked down configurable? In your case you could increase it to several hours; 
HA would not start during that time and the host could still reconnect.
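
    The configurable timeout suggested above could look roughly like the 
following. This is only a minimal sketch: the class `HostDownTimeoutSketch`, 
the method `shouldMarkDown`, and the idea of passing the timeout in as a 
`Duration` are all hypothetical and are not taken from the CloudStack code base.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only; not an actual CloudStack class or setting.
final class HostDownTimeoutSketch {

    /**
     * Returns true when the host has been silent for longer than the
     * (configurable) timeout, i.e. when it would be marked down.
     */
    static boolean shouldMarkDown(Instant lastContact, Instant now, Duration timeout) {
        Duration silence = Duration.between(lastContact, now);
        return silence.compareTo(timeout) > 0;
    }

    public static void main(String[] args) {
        Instant lastContact = Instant.parse("2016-11-22T00:00:00Z");
        Instant now = lastContact.plus(Duration.ofMinutes(10));

        // With a short timeout the host is considered down after 10 minutes of silence.
        System.out.println(shouldMarkDown(lastContact, now, Duration.ofMinutes(5)));
        // With a timeout of several hours the host still has time to reconnect.
        System.out.println(shouldMarkDown(lastContact, now, Duration.ofHours(3)));
    }
}
```

    With a timeout of several hours, a host that is silent for ten minutes is 
not yet treated as down, so no HA work would be scheduled for it.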
    



> Some VMs are being stopped when agent is reconnecting
> -----------------------------------------------------
>
>                 Key: CLOUDSTACK-9458
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9458
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>            Reporter: Marc-Aurèle Brothier
>            Assignee: Marc-Aurèle Brothier
>
> If you lose the communication between the management server and one of the 
> agents for a few minutes, even though HA mode is not active the 
> HighAvailabilityManager kicks in and starts to schedule VM restarts. Those 
> tasks are inserted as async jobs in the DB, and if the agent comes back 
> online while the jobs are still in the async table, they are pushed 
> to the agent and shut down the VMs. Then, since HA is not active, the VMs are 
> not restarted.
> The expected behavior, in my opinion, is that a VM should not be restarted at 
> all if HA mode is not active on it, and that the agent should be left to 
> update the VM state with the power report.
> The bug lies in 
> {{HighAvailabilityManagerImpl.scheduleRestartForVmsOnHost(final HostVO host, 
> boolean investigate)}}, PR will follow.
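
The expected behavior described in the issue can be sketched as a simple 
filter: only HA-enabled VMs get a restart scheduled when their host is found 
to be down. This is an illustration under stated assumptions, not the actual 
fix from the PR; `HaRestartSketch`, `Vm`, `haEnabled`, and `vmsToRestart` are 
hypothetical names and do not correspond to CloudStack's real types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; not the actual CloudStack classes.
final class HaRestartSketch {

    /** Minimal stand-in for a VM record: a name plus its HA flag. */
    static final class Vm {
        final String name;
        final boolean haEnabled;

        Vm(String name, boolean haEnabled) {
            this.name = name;
            this.haEnabled = haEnabled;
        }
    }

    /**
     * Returns the names of the VMs for which a restart would be scheduled
     * when their host is found to be down.
     */
    static List<String> vmsToRestart(List<Vm> vmsOnHost) {
        List<String> result = new ArrayList<>();
        for (Vm vm : vmsOnHost) {
            if (vm.haEnabled) {
                result.add(vm.name);
            }
            // VMs without HA are skipped: no stop or restart job is queued,
            // and their state is left to the agent's power report.
        }
        return result;
    }

    public static void main(String[] args) {
        List<Vm> vms = Arrays.asList(new Vm("web-1", true), new Vm("batch-1", false));
        System.out.println(vmsToRestart(vms)); // prints [web-1]
    }
}
```

Because no job is queued for the non-HA VM, nothing is left in the async table 
to be pushed to the agent when it reconnects.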



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
