Hi Swen,

The KVMHAMonitor is initialised here <https://github.com/apache/cloudstack/blob/8f6721ed4c4e1b31081a951c62ffbe5331cf16d4/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java#L1202> even if you haven't enabled HA. I would advise setting the property `reboot.host.and.alert.management.on.heartbeat.timeout` to `false` in the `agent.properties` file. CloudStack will still execute the heartbeat check, but it won't reboot the host on failure.
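
As a minimal sketch, the relevant line would look like the one below. The file location in the comment is the usual package default and is an assumption here, so please check where `agent.properties` lives on your hosts.

```properties
# Assumed location: /etc/cloudstack/agent/agent.properties
# Keep the storage heartbeat check, but do not reboot the host
# when writing the heartbeat times out.
reboot.host.and.alert.management.on.heartbeat.timeout=false
```

The agent most likely has to be restarted to pick up the change (for example by restarting the `cloudstack-agent` service), and the setting needs to be applied on every KVM host.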
Best regards,
Slavka

On Tue, Feb 20, 2024 at 12:41 AM Swen <m...@swen.io> wrote:

> Hi all,
>
> we encountered a strange issue today in our lab installation. We are
> running CS 4.19.0, upgraded from CS 4.18.1, with linstor as primary
> storage. We are using a linstor jar file provided by Linbit, which is not
> the default one shipped with 4.19.0!
>
> This updated plugin already includes a new feature to use linstor storage
> for host HA. I mention this for information only.
>
> The issue we encountered was that, even though we do not have HA enabled
> at cluster and host level, the cloudstack agent on our KVM hosts triggered
> HA actions and rebooted our hosts. We found this in our agent.log:
>
> Feb 19 11:53:05 pc-kvm-2 java[6617]: WARN [kvm.resource.KVMHAMonitor] (Thread-1:) (logid:) Write heartbeat for pool [71c272d3-b180-4b18-a0fc-cfc1dc5b86c9] failed: Down; try: 2 of 5.
>
> Feb 19 11:58:58 pc-kvm-2 java[9465]: WARN [kvm.resource.KVMHAMonitor] (Thread-1:) (logid:) Write heartbeat for pool [71c272d3-b180-4b18-a0fc-cfc1dc5b86c9] failed: Down; try: 3 of 5.
>
> Feb 19 12:00:08 pc-kvm-2 java[9465]: WARN [kvm.resource.KVMHAMonitor] (Thread-1:) (logid:) Write heartbeat for pool [71c272d3-b180-4b18-a0fc-cfc1dc5b86c9] failed: Down; try: 4 of 5.
>
> Feb 19 12:01:08 pc-kvm-2 java[9465]: WARN [kvm.resource.KVMHAMonitor] (Thread-1:) (logid:) Write heartbeat for pool [71c272d3-b180-4b18-a0fc-cfc1dc5b86c9] failed: Down; try: 5 of 5.
>
> Feb 19 12:01:08 pc-kvm-2 java[9465]: WARN [kvm.resource.KVMHAMonitor] (Thread-1:) (logid:) Write heartbeat for pool [71c272d3-b180-4b18-a0fc-cfc1dc5b86c9] failed: Down; stopping cloudstack-agent.
>
> Feb 19 12:02:08 pc-kvm-2 heartbeat: kvmspheartbeat.sh will reboot system because it was unable to write the heartbeat to the storage.
>
> We and Linbit did some debugging. We tried to understand why the
> cloudstack agent is running those checks in the first place. We were
> unable to find any code that checks whether host HA is enabled and skips
> the HA tasks if it is disabled. Can somebody please double-check this?
>
> Thank you very much!
>
> Regards,
> Swen