On 11/03/2016 05:28 PM, Adam Spiers wrote:
> Ken Gaillot <kgail...@redhat.com> wrote:
>> ClusterLabs is happy to announce the first release candidate for
>> Pacemaker version 1.1.16. Source code is available at:
>>
>> https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.16-rc1
>>
>> The most significant enhancements in this release are:
> [snipped]
>
>> * Watchdog-based fencing using sbd now works on remote nodes.
> What were the problems with this before, exactly? Thanks!

If you enabled only the cluster-watcher on remote nodes, it did not
observe much.
But if you also enabled the pacemaker-watcher, then when the
remote-node resource moved from one cluster node to another, the
client receiving the CIB inside the pacemaker-watcher did not notice
the move. It kept waiting for data over the old connection, so the
node was reset via the watchdog.

Introducing a TCP timeout derived from the sbd watchdog timeout makes
the old connection time out, and the client then switches to the new
control node. So a remote node is now watchdog-fenced only if the
remote-node resource doesn't reconnect in time, regardless of which
node it is running on now.

Actually, that commit in pacemaker should be beneficial for tooling
run on remote nodes (via proxy) in general.

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
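
P.S. As a rough illustration of the TCP timeout idea described in this
message, here is a minimal Python sketch. It is not Pacemaker's actual
code. The function names tcp_timeout_ms and apply_timeout and the
fraction of one half are assumptions made up for this example. The
Linux TCP_USER_TIMEOUT socket option is one way to make a stalled
connection fail within a bounded time; this message does not say
whether the commit uses exactly that option.

```python
import socket


def tcp_timeout_ms(watchdog_timeout_s: float, fraction: float = 0.5) -> int:
    """Derive a TCP timeout in milliseconds from a watchdog timeout.

    The fraction (an assumption for this sketch) keeps the TCP timeout
    shorter than the watchdog timeout, so a dead connection is dropped
    and can be re-established before the watchdog would fire.
    """
    if watchdog_timeout_s <= 0:
        raise ValueError("watchdog timeout must be positive")
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return max(1, int(watchdog_timeout_s * 1000 * fraction))


def apply_timeout(sock: socket.socket, watchdog_timeout_s: float) -> bool:
    """Set TCP_USER_TIMEOUT on sock if the platform provides it.

    Returns True if the option was set, False if it is unavailable
    (the option is Linux-specific).
    """
    opt = getattr(socket, "TCP_USER_TIMEOUT", None)
    if opt is None:
        return False
    sock.setsockopt(socket.IPPROTO_TCP, opt, tcp_timeout_ms(watchdog_timeout_s))
    return True
```

With a watchdog timeout of 5 seconds, tcp_timeout_ms(5) gives 2500 ms.
A client whose connection stalls for longer than that gets an error on
the socket and can reconnect to whichever node now hosts the
connection, instead of waiting on the old one until the watchdog
resets the machine.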