On 12/3/20 6:20 PM, Alejandro Bonilla via pve-user wrote:

Hello -

I’ve just set up an HA group and added my VMs as resources to be managed
across my 3-node group. After initially struggling with ha-manager to
disable/enable resources and to unlock VMs after stuck migrations, I feel I can
now clear the usual issues when VMs get stuck.

My question comes from the fact that I use Proxmox for my lab, so I script a
few things and start my servers in the morning, but the HA VMs always come up
in an error state, likely because Ceph or the cluster is not fully ready yet.
I have configured a start delay of 60 seconds, which used to be enough.
Is this delay also respected when the VMs are managed as HA resources?


You mean the startup delay configured for guests that boot when the host
starts? I haven't tested it explicitly, but I would not expect HA to take that
delay into account.
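
If HA does ignore the boot delay, one possible workaround is to request the HA
start only once the cluster and storage report ready. The following is a
minimal sketch, not a tested recipe. It assumes that `pvecm status` prints a
`Quorate: Yes` line, that `ceph health` prints a status starting with
`HEALTH_OK`, and that the resources were left in the `stopped` request state
before shutdown. The service IDs, timeout, and function names are hypothetical.

```python
import subprocess
import time


def is_quorate(pvecm_status_output):
    """Return True if the text contains a 'Quorate: Yes' line."""
    for line in pvecm_status_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Quorate":
            return value.strip().lower() == "yes"
    return False


def ceph_is_healthy(ceph_health_output):
    """Return True if the health summary starts with HEALTH_OK."""
    return ceph_health_output.strip().startswith("HEALTH_OK")


def run(cmd):
    """Run a command and return its stdout as text."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


def wait_until_ready(timeout_s=600, poll_s=10, run_cmd=run, sleep=time.sleep):
    """Poll until the cluster is quorate and Ceph is healthy, or time out."""
    waited = 0
    while waited <= timeout_s:
        quorate = is_quorate(run_cmd(["pvecm", "status"]))
        healthy = ceph_is_healthy(run_cmd(["ceph", "health"]))
        if quorate and healthy:
            return True
        sleep(poll_s)
        waited += poll_s
    return False


def start_ha_resources(sids, run_cmd=run):
    """Ask the HA manager to start each service ID, e.g. 'vm:100'."""
    for sid in sids:
        run_cmd(["ha-manager", "set", sid, "--state", "started"])


# Example use on a cluster node (service IDs are placeholders):
#     if wait_until_ready():
#         start_ha_resources(["vm:100", "vm:101"])
```

Requiring `HEALTH_OK` is deliberately strict; a lab that normally runs in
`HEALTH_WARN` would need a looser check.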


Which log can I check to find out why these VMs failed to start and errored?

The task log should show the start job for each VM; if a VM has a problem
starting, that is the first place to look. Otherwise, check the syslog.
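
For the syslog side, a small sketch of how the HA manager's journal entries
could be narrowed down to one guest. `pve-ha-crm` and `pve-ha-lrm` are the HA
manager's systemd units; the assumption here is that the relevant messages
mention the service ID (such as `vm:100`), and the helper names are
hypothetical.

```python
import re
import subprocess

HA_UNITS = ("pve-ha-crm", "pve-ha-lrm")


def journal_command(units=HA_UNITS, current_boot_only=True):
    """Build a journalctl command limited to the given systemd units."""
    cmd = ["journalctl", "--no-pager"]
    if current_boot_only:
        cmd.append("-b")  # only messages since the last boot
    for unit in units:
        cmd += ["-u", unit]
    return cmd


def lines_for_service(journal_text, sid):
    """Keep lines mentioning the service ID, so 'vm:10' does not match 'vm:100'."""
    pattern = re.compile(re.escape(sid) + r"(?!\d)")
    return [line for line in journal_text.splitlines() if pattern.search(line)]


def read_ha_journal():
    """Run journalctl on the node and return its output as text."""
    result = subprocess.run(
        journal_command(), capture_output=True, text=True, check=False
    )
    return result.stdout


# Example use on a cluster node (the service ID is a placeholder):
#     for line in lines_for_service(read_ha_journal(), "vm:100"):
#         print(line)
```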


A separate question - is there an easier way to test/simulate a dead node or
node failure besides actually killing my hosts?


Take down or disconnect the interface that corosync communicates over. The
isolated node will fence itself after it loses its connection to the quorate
part of the cluster.
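
As a rough illustration of taking the link down from a script, the sketch below
wraps `ip link set dev <iface> down`. The interface name `eno2` and the helper
names are hypothetical, and the wrapper only prints the command unless
`dry_run=False` is passed. If the corosync link also carries your SSH session,
run this from the console instead.

```python
import subprocess


def link_command(iface, state):
    """Build the iproute2 command that sets an interface up or down."""
    if state not in ("up", "down"):
        raise ValueError("state must be 'up' or 'down'")
    return ["ip", "link", "set", "dev", iface, state]


def set_link(iface, state, dry_run=True):
    """Print the command by default; execute it only when dry_run is False."""
    cmd = link_command(iface, state)
    if dry_run:
        print(" ".join(cmd))
        return cmd
    subprocess.run(cmd, check=True)  # needs root on the node
    return cmd


# Example use on the node to isolate (interface name is a placeholder):
#     set_link("eno2", "down", dry_run=False)
```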


Thanks
_______________________________________________
pve-user mailing list
[email protected]
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user



