GitHub user akoskuczi-bw created a discussion: KVM cluster with NFS primary
storage – VM HA not working when host is powered down
### Problem
In a KVM cluster with NFS primary storage, VM HA does not work when a host is
powered down.
- The host status transitions to Down, HA state shows Fenced.
- VMs from the powered-down host are not restarted on other available hosts in
the cluster.
- Both Host HA and VM HA are enabled.
- OOB driver: IPMI.
### Expected behavior
VMs from the failed host should be restarted on other available hosts in the
cluster.
### Actual behavior
- Host goes to `Down` and HA state `Fenced`.
- VMs are not started elsewhere.
- Management server logs show a `NoTransitionException`.
### Relevant log snippet
WARN [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-4:[ctx-c2bf501d]) (logid:96e12771) Unable to find next HA state for current HA state=[Fenced] for event=[Ineligible] for host Host {"id":4,"name":"csh-1-2.clab.run","type":"Routing","uuid":"f8f86177-f0e3-4994-8609-dd55e0e35a3e"} with id 4.
com.cloud.utils.fsm.NoTransitionException: Unable to transition to a new state from Fenced via Ineligible
    at com.cloud.utils.fsm.StateMachine2.getTransition(StateMachine2.java:108)
    at com.cloud.utils.fsm.StateMachine2.getNextState(StateMachine2.java:94)
    at org.apache.cloudstack.ha.HAManagerImpl.transitionHAState(HAManagerImpl.java:153)
    at org.apache.cloudstack.ha.HAManagerImpl.validateAndFindHAProvider(HAManagerImpl.java:233)
    at org.apache.cloudstack.ha.HAManagerImpl$HAManagerBgPollTask.runInContext(HAManagerImpl.java:665)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
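
For readers unfamiliar with this kind of error, here is a minimal sketch of how a table-driven state machine ends up throwing when no transition is registered for a (state, event) pair. It is not CloudStack's `StateMachine2` code: the class `HaStateSketch`, the reduced state and event sets, and the registered transitions are all hypothetical and chosen only to reproduce the shape of the message in the log. It says nothing about whether this exception is the reason the VMs are not restarted.

```java
import java.util.EnumMap;
import java.util.Map;

public class HaStateSketch {
    // Hypothetical, reduced state and event sets; the real HA state machine has more of both.
    enum State { AVAILABLE, FENCED, INELIGIBLE }
    enum Event { FENCED, INELIGIBLE, ELIGIBLE }

    // Stand-in for com.cloud.utils.fsm.NoTransitionException (assumed shape only).
    static class NoTransitionException extends Exception {
        NoTransitionException(String msg) { super(msg); }
    }

    // Transition table: current state -> (event -> next state).
    private static final Map<State, Map<Event, State>> TRANSITIONS = new EnumMap<>(State.class);

    static void addTransition(State from, Event via, State to) {
        TRANSITIONS.computeIfAbsent(from, k -> new EnumMap<>(Event.class)).put(via, to);
    }

    static {
        // Illustrative entries only, not the transitions CloudStack actually registers.
        addTransition(State.AVAILABLE, Event.FENCED, State.FENCED);
        addTransition(State.AVAILABLE, Event.INELIGIBLE, State.INELIGIBLE);
        addTransition(State.INELIGIBLE, Event.ELIGIBLE, State.AVAILABLE);
        // Deliberately no entry for (FENCED, INELIGIBLE), mirroring the pair in the log.
    }

    // Looks up the next state and throws if the table has no entry for the pair.
    static State getNextState(State current, Event event) throws NoTransitionException {
        Map<Event, State> row = TRANSITIONS.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new NoTransitionException(
                "Unable to transition to a new state from " + current + " via " + event);
        }
        return next;
    }

    public static void main(String[] args) {
        try {
            getNextState(State.FENCED, Event.INELIGIBLE);
        } catch (NoTransitionException e) {
            // The poll task in the log catches the equivalent exception and logs it at WARN.
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the exception only means that the event was raised while the host was in a state with no matching table entry; the host's state itself is left unchanged.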
### Environment
- CloudStack version: 4.20.1.0
- Hypervisor: KVM
- Primary storage: NFS
- HA settings: Host HA enabled, VM HA enabled, OOB driver = IPMI
### The steps to reproduce the bug
1. Enable Host HA and VM HA in a KVM cluster (NFS primary storage).
2. Power off a host that runs VMs.
3. Observe host and VM states in the management server.
### What to do about it?
_No response_
GitHub link: https://github.com/apache/cloudstack/discussions/11674