On Thu, 2018-06-28 at 19:58 +0300, Andrei Borzenkov wrote:
> 28.06.2018 18:35, Dileep V Nair wrote:
> >
> > Hi,
> >
> > I have a cluster with DB2 running in HADR mode. I have used the db2
> > resource agent. My problem is that whenever DB2 fails on the primary,
> > it migrates to the secondary node. Ideally it should restart thrice
> > (Migration Threshold set to 3), but that is not happening. This is
> > causing extra downtime for the customer. Are there any other settings /
> > parameters which need to be set? Did anyone face a similar issue? I am
> > on pacemaker version 1.1.15-21.1.
> >
>
> It is impossible to answer without good knowledge of the application and
> the resource agent. From a quick look at the resource agent, it removes
> the master score from the current node if a database failure is detected,
> which means the current node will not be eligible for fail-over.
>
> Note that pacemaker does not really have a concept of "restarting a
> resource on the same node". Every time, it performs full node selection
> using the current scores. It usually happens to be the "same node" simply
> due to non-zero resource stickiness by default. You could attempt to
> adjust stickiness so that the final score will be larger than the master
> score on the standby. But that also needs agent cooperation - are you
> sure the agent will even attempt to restart a failed master locally?

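To make the scoring point above concrete, here is a rough toy model in Python. It is only a sketch and not Pacemaker's real placement algorithm, which also weighs constraints, fail counts and more. The function name pick_node, the node names and all score values are hypothetical, chosen for illustration.

```python
# Toy model only: NOT Pacemaker's actual scoring code.
# All names and numbers below are hypothetical.

def pick_node(master_scores, current_node, stickiness):
    """Return the eligible node with the highest total score, or None.

    master_scores maps node name -> master score, with None meaning the
    agent has removed the score, so the node is not eligible at all.
    """
    totals = {}
    for node, score in master_scores.items():
        if score is None:
            # No master score: the node is skipped, so stickiness cannot help.
            continue
        bonus = stickiness if node == current_node else 0
        totals[node] = score + bonus
    if not totals:
        return None
    return max(totals, key=totals.get)


# Healthy cluster: the current master wins thanks to its score and stickiness.
healthy = pick_node({"node1": 10000, "node2": 8000}, "node1", 100)

# Agent removed the master score on failure: the standby wins, however
# large the stickiness is.
removed = pick_node({"node1": None, "node2": 8000}, "node1", 1000000)

# If the agent instead kept a low score, a large stickiness could keep
# the resource on the same node.
kept = pick_node({"node1": 100, "node2": 8000}, "node1", 10000)
```

In this model, raising stickiness only helps in the last case, which is why the agent's behavior matters as much as the cluster settings.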
Also, some types of errors cannot be recovered by a restart on the same
node.

For example, by default, start failures will not be retried on the same
node (see the cluster property start-failure-is-fatal), to avoid a
repeatedly failing start preventing the cluster from doing anything
else.

Certain OCF resource agent exit codes are considered "hard" errors that
prevent retrying on the same node: missing dependencies, file permission
errors, etc.
-- 
Ken Gaillot <kgail...@redhat.com>