On Wed, 2018-09-05 at 17:21 +0200, Cesar Hernandez wrote:
> > P.S. If the issue is just a matter of timing when you're starting both
> > nodes, you can start corosync on both nodes first, then start pacemaker
> > on both nodes. That way pacemaker on each node will immediately see the
> > other node's presence.
> > --
>
> Well, rebooting a server takes approximately 2 minutes.
> I think I'm going to keep the same workaround I have on other servers:
>
> - set crm stonith-timeout=300s
> - have a "sleep 180" in the fencing script, so the fencing will always
>   last 3 minutes
>
> So when crm fences a node on startup, the fencing script will return
> after 3 minutes. By that time, the other node should be up, and the
> fencing won't be retried.
>
> What do you think about this workaround?
>
> The other solution would be updating pacemaker, but I have tested this
> 1.1.14 on many servers, and I don't want to take the risk of updating to
> 1.1.15 and (maybe) having some other new issues...
>
> Thanks a lot!
> Cesar

If you build from source, you can apply the patch that fixes the issue
to the 1.1.14 code base:

https://github.com/ClusterLabs/pacemaker/commit/98457d1635db1222f93599b6021e662e766ce62d
-- 
Ken Gaillot <kgail...@redhat.com>
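
For anyone weighing the sleep-based workaround quoted above, its timing can be sanity-checked with simple arithmetic. The sketch below is only an illustration, not part of pacemaker: the function name and the agent_overhead_s parameter are hypothetical, and it assumes the fenced node begins rebooting at the moment the fencing script starts. The three numbers come from the thread (stonith-timeout=300s, "sleep 180", and a reboot of roughly 2 minutes).

```python
# Values quoted in the thread.
STONITH_TIMEOUT_S = 300  # crm stonith-timeout=300s
FENCE_SLEEP_S = 180      # "sleep 180" in the fencing script
REBOOT_TIME_S = 120      # a reboot takes approximately 2 minutes


def workaround_margins(stonith_timeout_s, fence_sleep_s, reboot_time_s,
                       agent_overhead_s=0):
    """Return (peer_margin, timeout_margin) in seconds.

    peer_margin:    how long the peer has been back up when the fencing
                    script returns (negative means it is still rebooting).
    timeout_margin: time left before stonith-timeout expires when the
                    fencing script returns (negative means it timed out).

    agent_overhead_s is a hypothetical allowance for whatever the fencing
    script does besides sleeping. Assumes the reboot starts when the
    fencing script starts.
    """
    fence_duration = fence_sleep_s + agent_overhead_s
    peer_margin = fence_duration - reboot_time_s
    timeout_margin = stonith_timeout_s - fence_duration
    return peer_margin, timeout_margin


if __name__ == "__main__":
    peer, timeout = workaround_margins(
        STONITH_TIMEOUT_S, FENCE_SLEEP_S, REBOOT_TIME_S)
    print(f"peer up {peer}s before fencing returns; "
          f"{timeout}s left before stonith-timeout")
```

With the figures from the thread, the script returns about 60 seconds after the peer is expected to be up and 120 seconds before the stonith timeout. Both margins shrink if a reboot takes longer than the estimated 2 minutes or the fencing script does real work in addition to the sleep.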