--- Begin Message ---
On Wed, 14 Apr 2021 11:04:10 +0200
Eneko Lacunza via pve-user <[email protected]> wrote:
> Hi all,
>
> Yesterday we had a strange fence happen in a PVE 6.2 cluster.
>
> Cluster has 3 nodes (proxmox1, proxmox2, proxmox3) and has been
> operating normally for a year. Last update was on January 21st 2021.
> Storage is Ceph and nodes are connected to the same network switch
> with active-passive bonds.
>
> proxmox1 was fenced and automatically rebooted, then everything
> recovered. HA restarted VMs in other nodes too.
>
> proxmox1 syslog: (no network link issues reported at device level)
I have seen this occasionally, and every time the cause was high network
load or congestion that led to a token timeout. The default corosync token
timeout of 1000 ms is, IMHO, very optimistic, so I raised it to 5000 ms.
Since that change I have not seen any fencing caused by network load or
congestion. You could try this and see if it helps.
PS. My cluster communication runs on a dedicated, bonded gigabit VLAN.
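
For reference, a minimal sketch of what the token change could look like.
It assumes the usual Proxmox VE layout, where the cluster-wide file
/etc/pve/corosync.conf is edited and config_version in the totem section
is increased so the change is propagated. Only the token line is the
actual change; the placeholder comment stands for whatever your totem
section already contains.

```
totem {
  # ... keep your existing settings (cluster_name, interface, ...) ...
  # Remember to increase the existing config_version value by one.

  # Token timeout in milliseconds, raised from the 1000 ms default
  # mentioned above. With 3 or more nodes in the nodelist, corosync
  # adds a per-node amount (token_coefficient) on top of this value.
  token: 5000
}
```

After the new configuration is active, something like
"corosync-cmapctl | grep totem.token" should show the value in use.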
--
Hilsen/Regards
Michael Rasmussen
Get my public GnuPG keys:
michael <at> rasmussen <dot> cc
https://pgp.key-server.io/pks/lookup?search=0xD3C9A00E
mir <at> datanom <dot> net
https://pgp.key-server.io/pks/lookup?search=0xE501F51C
mir <at> miras <dot> org
https://pgp.key-server.io/pks/lookup?search=0xE3E80917
--- End Message ---