Eric,

-----Original Message-----
From: Users <users-boun...@clusterlabs.org> On Behalf Of Jan Friesse
Sent: Monday, March 1, 2021 3:27 AM
To: Cluster Labs - All topics related to open-source clustering welcomed

...

ha1 lost connection to qnetd so it gives up all hope immediately. ha2
retains connection to qnetd so it waits for final decision before
continuing.


Thanks for digging into the logs. I believe Eric is hitting
https://github.com/corosync/corosync-qdevice/issues/10 (already fixed, but
it may take some time to get into distributions). The issue also contains
a workaround.

Honza


Reading through that linked thread, it seems that quorum timeouts are tricky to get right. I made some changes over the weekend and increased my token timeout to 5000.

Token timeout really doesn't help because it doesn't affect quorum timeout.

Please follow the changes described in this comment:
https://github.com/ClusterLabs/sbd/pull/76#issuecomment-486952369


Are there other timeouts I should adjust to make sure I don't run into a complicated race condition that causes weird/random failures due to mismatched or misaligned timeouts?

Nope, not really.

Honza
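
To illustrate the point that the token timeout and the quorum timeouts are separate settings, here is a minimal Python sketch that reads a corosync.conf-style fragment and lists them side by side. It assumes the option names totem.token, quorum.device.timeout and quorum.device.sync_timeout as documented for corosync and corosync-qdevice. Only the token value of 5000 comes from this thread; the two qdevice values are illustrative assumptions, and the fragment is not the workaround from the linked issue or comment.

```python
# Hypothetical corosync.conf fragment. Only token: 5000 is taken from the
# thread; the qdevice values are assumed for illustration.
SAMPLE_CONF = """
totem {
    token: 5000
}
quorum {
    provider: corosync_votequorum
    device {
        model: net
        timeout: 10000
        sync_timeout: 30000
    }
}
"""


def parse_corosync_conf(text):
    """Parse nested `name { key: value }` sections into nested dicts."""
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.endswith("{"):
            # Open a new section nested under the current one.
            child = {}
            stack[-1][line[:-1].strip()] = child
            stack.append(child)
        elif line == "}":
            stack.pop()
        else:
            key, _, value = line.partition(":")
            stack[-1][key.strip()] = value.strip()
    return root


conf = parse_corosync_conf(SAMPLE_CONF)
timeouts = {
    "totem.token": int(conf["totem"]["token"]),
    "quorum.device.timeout": int(conf["quorum"]["device"]["timeout"]),
    "quorum.device.sync_timeout": int(conf["quorum"]["device"]["sync_timeout"]),
}

for name, value_ms in timeouts.items():
    print(f"{name} = {value_ms} ms")
```

Raising totem.token in such a file changes only the first entry; the qdevice timeouts live under quorum.device and have to be adjusted there.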



In your case apparently one node was completely disconnected for 15
seconds, then connectivity resumed. The second node was still waiting
for qdevice/qnetd decision. So it appears to work as expected.

Note that fencing would not have been initiated before the timeout either.
Fencing /may/ have been initiated after the nodes re-established their
connection and saw that one resource had failed to stop. This would
automatically resolve your issue. I need to think about how to reproduce
the stop failure.

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

