[ https://issues.apache.org/jira/browse/CLOUDSTACK-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15448760#comment-15448760 ]
ASF GitHub Bot commented on CLOUDSTACK-9458:
--------------------------------------------

GitHub user abhinandanprateek commented on the issue:

    https://github.com/apache/cloudstack/pull/1640

    @marcaurele @koushik-das When the management server (MS) thinks that a VM is down, it issues a stop command. This is done to release the resources tied up for that VM in the management server DB. It has been seen several times that this actually kills a healthy VM. I have seen this issue in an MS cluster with agent.lb turned on.
    The issue is that we do need a state cleanup when a running VM is found to be stopped on the host. But this probably should not induce a shutdown on the host. Then again, this is a tricky boundary condition.

> Some VMs are being stopped when agent is reconnecting
> -----------------------------------------------------
>
>                 Key: CLOUDSTACK-9458
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9458
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>            Reporter: Marc-Aurèle Brothier
>            Assignee: Marc-Aurèle Brothier
>
> If you lose communication between the management server and one of the
> agents for a few minutes, the HighAvailibilityManager kicks in and starts
> scheduling VM restarts even though HA mode is not active. Those tasks are
> inserted as async jobs in the DB, and if the agent comes back online while
> the jobs are still in the async table, they are pushed to the agent and
> shut down the VMs. Then, since HA is not active, the VMs are not restarted.
> The expected behavior, in my opinion, is that VMs should not be restarted
> at all if HA mode is not active on them, and that the agent should update
> the VM state with the power report.
> The bug lies in
> {{HighAvailibilityManagerImpl.scheduleRestartForVmsOnHost(final HostVO host,
> boolean investigate)}}, PR will follow.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
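
For illustration only, the expected behavior described in the issue (skip restart scheduling for VMs that do not have HA enabled, and leave their state to the agent's power report) could be sketched roughly as below. This is not the code from the PR and not CloudStack's real API: the `Vm` class, the `haEnabled` flag, and the `vmsToScheduleForRestart` method are hypothetical stand-ins invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these types are CloudStack classes. */
final class HaRestartSketch {

    /** Simplified stand-in for a VM record, holding only what the sketch needs. */
    static final class Vm {
        final String name;
        final boolean haEnabled;

        Vm(String name, boolean haEnabled) {
            this.name = name;
            this.haEnabled = haEnabled;
        }
    }

    /**
     * Returns the names of the VMs for which a restart job would be queued
     * when the host's agent disconnects. VMs without HA are skipped, so no
     * queued job for them can reach the agent once it reconnects.
     */
    static List<String> vmsToScheduleForRestart(List<Vm> vmsOnHost) {
        List<String> scheduled = new ArrayList<>();
        for (Vm vm : vmsOnHost) {
            if (!vm.haEnabled) {
                // Assumption: state is reconciled later from the agent's power report.
                continue;
            }
            scheduled.add(vm.name);
        }
        return scheduled;
    }

    public static void main(String[] args) {
        List<Vm> vms = Arrays.asList(new Vm("web-1", true), new Vm("db-1", false));
        // Only the HA-enabled VM is selected for a restart job.
        System.out.println(vmsToScheduleForRestart(vms));
    }
}
```

Filtering before any job is queued, rather than when the job is executed, matches the issue's point that the harmful jobs are the ones already sitting in the async table when the agent reconnects.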