On 12/3/20 6:20 PM, Alejandro Bonilla via pve-user wrote:
Hello - I’ve just set up an HA Group and added my VMs as resources to be managed across my 3-node group. After initially struggling with ha-manager (disabling/enabling resources and unlocking VMs after stuck migrations), I feel I can clear the usual issues when VMs get stuck. My question comes from the fact that I use Proxmox for my lab, so I script a few things and start my servers in the morning, but the HA VMs always come up in an error state, likely because Ceph or the cluster is not fully ready. I have configured a startup delay of 60 seconds, which used to be enough. Is this delay also respected when the resources/VMs are managed by HA?
Do you mean the startup delay you can set for guests that are configured to start when the host boots? I haven't tested it explicitly, but I would not expect HA to take that delay into account.
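If HA indeed ignores that delay, a scripted lab start could wait for the cluster to be ready rather than for a fixed number of seconds. The following is only a rough sketch under assumptions: it treats "ready" as `pvecm status` reporting `Quorate: Yes` and `ceph health` reporting `HEALTH_OK`, and the helper names, timeout and polling interval are made up for illustration.

```python
import subprocess
import time


def is_quorate(pvecm_status_output: str) -> bool:
    """Return True if the text contains a 'Quorate: Yes' line."""
    for line in pvecm_status_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "quorate":
            return value.strip().lower().startswith("yes")
    return False


def ceph_is_healthy(ceph_health_output: str) -> bool:
    """Assumption: only HEALTH_OK counts as ready; a lab may accept HEALTH_WARN."""
    return ceph_health_output.strip().startswith("HEALTH_OK")


def run(cmd) -> str:
    """Run a command and return its stdout, or an empty string on failure."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError):
        return ""
    return result.stdout


def cluster_ready() -> bool:
    """Check corosync quorum and Ceph health on the local node."""
    return (is_quorate(run(["pvecm", "status"]))
            and ceph_is_healthy(run(["ceph", "health"])))


def wait_until(check, timeout=600.0, interval=5.0,
               sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll check() until it returns True or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A morning script could call `wait_until(cluster_ready)` and only then request the started state for the HA resources, for example with `ha-manager set vm:100 --state started` (the ID `vm:100` is a placeholder). Whether that fits depends on how the resources were left at shutdown, so treat it as a starting point.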
Which log can I see to identify why these VMs never started and errored?
The task log should show a start job for each VM; if a VM failed to start, that is the first place to look. Otherwise, check the syslog.
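For the syslog side, here is a small sketch of narrowing the journal down to the HA services and one guest. It assumes the HA daemons log under the systemd units `pve-ha-lrm` and `pve-ha-crm` and that log lines mention the HA service ID (such as `vm:100`, a placeholder here); the function names are illustrative only.

```python
import re
import subprocess

# Assumption: the HA local resource manager and cluster resource manager
# log under these systemd unit names.
HA_UNITS = ("pve-ha-lrm", "pve-ha-crm")


def journal_command(since="today", units=HA_UNITS):
    """Build a journalctl command restricted to the given units."""
    cmd = ["journalctl", "--no-pager", "--since", since]
    for unit in units:
        cmd += ["-u", unit]
    return cmd


def lines_for_service(lines, sid):
    """Keep lines that mention the HA service ID, e.g. 'vm:100'.

    The negative lookahead stops 'vm:100' from matching 'vm:1000'.
    """
    pattern = re.compile(re.escape(sid) + r"(?!\d)")
    return [line for line in lines if pattern.search(line)]


def read_ha_journal(since="today"):
    """Run journalctl on the local node and return its output lines."""
    result = subprocess.run(journal_command(since),
                            capture_output=True, text=True)
    return result.stdout.splitlines()
```

On a node, `lines_for_service(read_ha_journal(), "vm:100")` would then list what the HA stack logged about that guest today.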
A separate question - is there an easier way to test/simulate a dead/node failure besides actually killing my hosts?
Take down or disconnect the interface that corosync communicates over. The isolated node will fence itself once it loses its connection to the quorate part of the cluster.
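If you want to script that test, something like the following could work. It is a minimal sketch that assumes the corosync link is a dedicated interface (the name `eno2` below is a placeholder) and that the script runs as root from a console that does not depend on that link. It defaults to a dry run so nothing is taken down by accident.

```python
import subprocess


def link_command(interface: str, state: str):
    """Build the iproute2 command that sets an interface up or down."""
    if state not in ("up", "down"):
        raise ValueError("state must be 'up' or 'down'")
    return ["ip", "link", "set", "dev", interface, state]


def set_link(interface: str, state: str, dry_run: bool = True):
    """Return the command; execute it only when dry_run is False."""
    cmd = link_command(interface, state)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # 'eno2' is a hypothetical name for the corosync interface.
    print(" ".join(set_link("eno2", "down")))
```

Calling `set_link("eno2", "down", dry_run=False)` on the node to be isolated would then cut the corosync link for real.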
Thanks
_______________________________________________
pve-user mailing list
[email protected]
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
