[ https://issues.apache.org/jira/browse/CLOUDSTACK-8713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Remi Bergsma updated CLOUDSTACK-8713:
-------------------------------------
    Description:
While testing a PR, I found that HA on KVM does not work properly.

Steps to reproduce:
- Spin up some VMs on KVM using an HA offering
- Go to the KVM hypervisor and kill one of them to simulate a crash:
  virsh destroy 6 (change number)
- Look at how CloudStack handles this missing VM

Result:
- VM stays down and is not started

Expected result:
- VM should be started somewhere

Cause:
CloudStack doesn't parse the power report property it gets from the hypervisor, so it never marks the VM Stopped. HA will not start, VM will stay down.
Database reports PowerReportMissing. Starting manually works fine.

select name,power_state,instance_name,state from vm_instance where name='test003';
+---------+--------------------+---------------+---------+
| name    | power_state        | instance_name | state   |
+---------+--------------------+---------------+---------+
| test003 | PowerReportMissing | i-2-6-VM      | Running |
+---------+--------------------+---------------+---------+
1 row in set (0.00 sec)

I also tried to crash a KVM hypervisor and then the same thing happens.

Haven't tested it on other hypervisors. Could anyone verify this?

Logs:

2015-08-06 15:40:46,809 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] (AgentManager-Handler-16:null) VM state report is updated. host: 1, vm id: 6, power state: PowerReportMissing
2015-08-06 15:40:46,815 INFO [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-16:null) VM i-2-6-VM is at Running and we received a power-off report while there is no pending jobs on it
2015-08-06 15:40:46,815 INFO [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-16:null) Detected out-of-band stop of a HA enabled VM i-2-6-VM, will schedule restart
2015-08-06 15:40:46,824 INFO [c.c.h.HighAvailabilityManagerImpl] (AgentManager-Handler-16:null) Schedule vm for HA: VM[User|i-2-6-VM]
2015-08-06 15:40:46,824 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] (AgentManager-Handler-16:null) Done with process of VM state report. host: 1
2015-08-06 15:40:46,851 INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-4e073b92 work-37) Processing HAWork[37-HA-6-Running-Investigating]
2015-08-06 15:40:46,871 INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-4e073b92 work-37) HA on VM[User|i-2-6-VM]
2015-08-06 15:40:46,880 DEBUG [c.c.a.t.Request] (HA-Worker-3:ctx-4e073b92 work-37) Seq 1-6463228415230083145: Sending { Cmd , MgmtId: 3232241215, via: 1(kvm2), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckVirtualMachineCommand":{"vmName":"i-2-6-VM","wait":20}}] }
2015-08-06 15:40:46,908 ERROR [c.c.a.t.Request] (AgentManager-Handler-17:null) Unable to convert to json: [{"com.cloud.agent.api.CheckVirtualMachineAnswer":{"state":"Stopped","result":true,"contextMap":{},"wait":0}}]
2015-08-06 15:40:46,909 WARN [c.c.u.n.Task] (AgentManager-Handler-17:null) Caught the following exception but pushing on
com.google.gson.JsonParseException: The JsonDeserializer EnumTypeAdapter failed to deserialize json object "Stopped" given the type class com.cloud.vm.VirtualMachine$PowerState
	at com.google.gson.JsonDeserializerExceptionWrapper.deserialize(JsonDeserializerExceptionWrapper.java:64)
	at com.google.gson.JsonDeserializationVisitor.invokeCustomDeserializer(JsonDeserializationVisitor.java:92)
	at com.google.gson.JsonObjectDeserializationVisitor.visitFieldUsingCustomHandler(JsonObjectDeserializationVisitor.java:117)
	at com.google.gson.ReflectingFieldNavigator.visitFieldsReflectively(ReflectingFieldNavigator.java:63)
	at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:120)
	at com.google.gson.JsonDeserializationContextDefault.fromJsonObject(JsonDeserializationContextDefault.java:76)
	at com.google.gson.JsonDeserializationContextDefault.deserialize(JsonDeserializationContextDefault.java:54)
	at com.google.gson.Gson.fromJson(Gson.java:551)
	at com.google.gson.Gson.fromJson(Gson.java:521)
	at com.cloud.agent.transport.ArrayTypeAdaptor.deserialize(ArrayTypeAdaptor.java:80)
	at com.cloud.agent.transport.ArrayTypeAdaptor.deserialize(ArrayTypeAdaptor.java:40)
	at com.google.gson.JsonDeserializerExceptionWrapper.deserialize(JsonDeserializerExceptionWrapper.java:51)
	at com.google.gson.JsonDeserializationVisitor.invokeCustomDeserializer(JsonDeserializationVisitor.java:92)
	at com.google.gson.JsonDeserializationVisitor.visitUsingCustomHandler(JsonDeserializationVisitor.java:80)
	at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:101)
	at com.google.gson.JsonDeserializationContextDefault.fromJsonArray(JsonDeserializationContextDefault.java:67)
	at com.google.gson.JsonDeserializationContextDefault.deserialize(JsonDeserializationContextDefault.java:52)
	at com.google.gson.Gson.fromJson(Gson.java:551)
	at com.google.gson.Gson.fromJson(Gson.java:498)
	at com.google.gson.Gson.fromJson(Gson.java:467)
	at com.google.gson.Gson.fromJson(Gson.java:417)
	at com.google.gson.Gson.fromJson(Gson.java:389)
	at com.cloud.agent.transport.Request.log(Request.java:399)
	at com.cloud.agent.transport.Request.logD(Request.java:368)
	at com.cloud.agent.manager.AgentAttache.processAnswers(AgentAttache.java:271)
	at com.cloud.agent.manager.ClusteredAgentManagerImpl$ClusteredAgentHandler.doTask(ClusteredAgentManagerImpl.java:709)
	at com.cloud.utils.nio.Task.run(Task.java:84)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: No enum constant com.cloud.vm.VirtualMachine.PowerState.Stopped
	at java.lang.Enum.valueOf(Enum.java:236)
	at com.google.gson.DefaultTypeAdapters$EnumTypeAdapter.deserialize(DefaultTypeAdapters.java:524)
	at com.google.gson.DefaultTypeAdapters$EnumTypeAdapter.deserialize(DefaultTypeAdapters.java:514)
	at com.google.gson.JsonDeserializerExceptionWrapper.deserialize(JsonDeserializerExceptionWrapper.java:51)
	...
29 more

> KVM Power state report not properly parsed (Exception) resulting in HA is not working on CentOS 7
> -------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-8713
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-8713
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>    Affects Versions: 4.6.0
>         Environment: KVM on CentOS 7, management server running latest master aka 4.6.0
>            Reporter: Remi Bergsma
>            Priority: Critical

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
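
The "Caused by" line in the trace for this issue points at java.lang.Enum.valueOf being handed the name "Stopped" for a PowerState enum that has no such constant. The sketch below reproduces only that mechanism in isolation; it is not CloudStack code. The PowerStateDemo class, its nested PowerState enum, and both parse helpers are hypothetical, and apart from PowerReportMissing (which appears in the report) the enum constants are assumptions.

```java
import java.util.Optional;

public class PowerStateDemo {
    // Hypothetical stand-in for com.cloud.vm.VirtualMachine.PowerState.
    // Only PowerReportMissing is taken from the report; the rest are assumed.
    enum PowerState { PowerUnknown, PowerOn, PowerOff, PowerReportMissing }

    // Strict lookup: same call that sits at the bottom of the stack trace.
    // Throws IllegalArgumentException when the name is not a constant.
    static PowerState strictParse(String name) {
        return Enum.valueOf(PowerState.class, name);
    }

    // Lenient lookup: turns the exception into an empty Optional so the
    // caller can decide how to treat an unknown state name.
    static Optional<PowerState> lenientParse(String name) {
        try {
            return Optional.of(strictParse(name));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A name that exists in the enum resolves normally.
        System.out.println(lenientParse("PowerReportMissing"));
        // "Stopped" is the value sent in CheckVirtualMachineAnswer; it is
        // not a PowerState constant here, so the lenient lookup is empty.
        System.out.println(lenientParse("Stopped"));
        // The strict lookup fails the same way the log shows.
        try {
            strictParse("Stopped");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether the actual fix belongs on the sending side (the answer carrying a name from a different enum) or on the receiving side is not something this sketch can settle; it only shows why the strict lookup throws.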