[ClusterLabs] Linux 8.2 - high totem token requires manual setting of ping_interval and ping_timeout

Hayden,Robert Thu, 25 Jun 2020 17:27:55 -0700

All,
Hello.  Hope all is well.   I have been researching Oracle Linux 8.2 and ran 
across a situation that is not well documented.   I decided to provide some 
details to the community in case I am missing something.


Basically, if you increase the totem token above approximately 33000 ms with 
the knet transport, a two-node cluster will not form properly.   The exact 
threshold fluctuates slightly, depending on hardware type and debugging, but 
cluster formation consistently fails above 40000 ms.

The failure to form a cluster occurred both when running the "pcs cluster 
start --all" command and when I started one node, let it stabilize, then 
started the second.  When the cluster fails to form, each side reports itself 
as ONLINE, but the other side as UNCLEAN(offline) (cluster state: partition 
WITHOUT quorum).   Even if I define proper stonith resources, the nodes will 
not fence, since the cluster never reaches an initial quorum state.  So, the 
cluster stays in this split state indefinitely.

After changing the transport back to udpu or udp, the higher totem token 
values worked as expected.

From the debug logging, I suspect that the Election Trigger (20 seconds) fires 
before all nodes are properly identified by the knet transport.  I noticed 
that with a totem token passing 32 seconds, the knet_ping* defaults were 
pushing up against that 20 second mark.  The output of "corosync-cfgtool -s" 
will show each node's link as enabled, but each side will state the other 
side's link is not connected.   Since each side thinks the other node is not 
active, they fail to properly send a join message to the other node during the 
election.   They will essentially form a singleton cluster(??).  It is more 
puzzling when you start one node at a time, waiting for the node to stabilize 
before starting the other.   It is like the first node will never see the 
remote knet interfaces become active, regardless of how long you wait.

The solution is to manually set the knet ping_timeout and ping_interval to 
lower values than the default values derived from the totem token.  This seems 
to allow for the knet transport to determine link status of all nodes before 
the election timer pops.

I tested this on both physical hardware and with VMs.  Both react similarly.

Bare bones test case to reproduce:
yum install pcs pacemaker fence-agents-all
firewall-cmd --permanent --add-service=high-availability
firewall-cmd --add-service=high-availability
systemctl start pcsd.service
systemctl enable pcsd.service
systemctl disable corosync
systemctl disable pacemaker
passwd hacluster
pcs host auth node1 node2
pcs cluster setup rhcs_test node1 node2 totem token=41000
pcs cluster start --all

Example command to create cluster that will properly form and get quorum:
pcs cluster setup rhcs_test node1 node2 totem token=61000 transport knet link \
    ping_interval=1250 ping_timeout=2500

Hope this helps someone in the future.

Thanks
Robert


Robert Hayden | Lead Technology Architect | Cerner Corporation


