Hi,

On Thu, Oct 29, 2015 at 10:40:18AM +0530, Pritam Kharat wrote:
> Thank you very much Ken for reply. I will try your suggested steps.

If you cannot figure out from the logs why the stop operation times out,
you can also try to trace the resource agent:

# crm resource help trace
# crm resource trace vip stop

Then take a look at the trace or post it somewhere.

Thanks,

Dejan

>
> On Wed, Oct 28, 2015 at 11:23 PM, Ken Gaillot <kgail...@redhat.com> wrote:
>
> > On 10/28/2015 03:51 AM, Pritam Kharat wrote:
> > > Hi All,
> > >
> > > I am facing one issue in my two node HA. When I stop pacemaker on ACTIVE
> > > node, it takes more time to stop and by this time VIP migration with other
> > > resources migration fails to STANDBY node. (I have seen same issue in
> > > ACTIVE node reboot case also)
> >
> > I assume STANDBY in this case is just a description of the node's
> > purpose, and does not mean that you placed the node in pacemaker's
> > standby mode. If the node really is in standby mode, it can't run any
> > resources.
> >
> > > Last change: Wed Oct 28 02:52:57 2015 via cibadmin on node-1
> > > Stack: corosync
> > > Current DC: node-1 (1) - partition with quorum
> > > Version: 1.1.10-42f2063
> > > 2 Nodes configured
> > > 2 Resources configured
> > >
> > >
> > > Online: [ node-1 node-2 ]
> > >
> > > Full list of resources:
> > >
> > > resource (upstart:resource): Stopped
> > > vip (ocf::heartbeat:IPaddr2): Started node-2 (unmanaged) FAILED
> > >
> > > Migration summary:
> > > * Node node-1:
> > > * Node node-2:
> > >
> > > Failed actions:
> > > vip_stop_0 (node=node-2, call=-1, rc=1, status=Timed Out,
> > > last-rc-change=Wed Oct 28 03:05:24 2015
> > > , queued=0ms, exec=0ms
> > > ): unknown error
> > >
> > > VIP monitor is failing over here with error Timed Out. What is the general
> > > reason for TimeOut. ? I have kept default-action-timeout=180secs which
> > > should be enough for monitoring
> >
> > 180s should be far more than enough, so something must be going wrong.
> > Notice that it is the stop operation on the active node that is failing.
> > Normally in such a case, pacemaker would fence that node to be sure that
> > it is safe to bring it up elsewhere, but you have disabled stonith.
> >
> > Fencing is important in failure recovery such as this, so it would be a
> > good idea to try to get it implemented.
> >
> > > I have added order property -> when vip is started then only start other
> > > resources.
> > > Any clue to solve this problem ? Most of the time this VIP monitoring is
> > > failing with Timed Out error.
> >
> > The "stop" in "vip_stop_0" means that the stop operation is what failed.
> > Have you seen timeouts on any other operations?
> >
> > Look through the logs around the time of the failure, and try to see if
> > there are any indications as to why the stop failed.
> >
> > If you can set aside some time for testing or have a test cluster that
> > exhibits the same issue, you can try unmanaging the resource in
> > pacemaker, then:
> >
> > 1. Try adding/removing the IP via normal system commands, and make sure
> > that works.
> >
> > 2. Try running the resource agent manually (with any verbose option) to
> > start/stop/monitor the IP to see if you can reproduce the problem and
> > get more messages.
> >
> > _______________________________________________
> > Users mailing list: Users@clusterlabs.org
> > http://clusterlabs.org/mailman/listinfo/users
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> >
>
>
> --
> Thanks and Regards,
> Pritam Kharat.
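
For reference, the two manual checks Ken lists above (add/remove the IP by hand, then run the resource agent by hand) could be scripted roughly as below. This is only a sketch under assumptions: the address 192.0.2.10/24, the interface eth0 and the agent path /usr/lib/ocf/resource.d/heartbeat/IPaddr2 are placeholders that do not come from this thread; only the resource name vip does. It also assumes the agent is a shell script, so that "sh -x" can serve as the verbose option. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Placeholder values (assumptions, not from the thread): adjust before use.
VIP="${VIP:-192.0.2.10}"
PREFIX="${PREFIX:-24}"
NIC="${NIC:-eth0}"
RES="${RES:-vip}"   # resource name as shown in the status output
AGENT="${AGENT:-/usr/lib/ocf/resource.d/heartbeat/IPaddr2}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Call the resource agent directly with the parameters pacemaker would pass.
# "sh -x" traces every command the agent executes.
agent() {
    run env OCF_ROOT=/usr/lib/ocf \
        OCF_RESKEY_ip="$VIP" \
        OCF_RESKEY_cidr_netmask="$PREFIX" \
        OCF_RESKEY_nic="$NIC" \
        sh -x "$AGENT" "$1"
}

check_vip() {
    # Take the resource out of pacemaker's control first.
    run crm resource unmanage "$RES"

    # Step 1: add and remove the address with plain system commands.
    run ip addr add "$VIP/$PREFIX" dev "$NIC"
    run ip addr show dev "$NIC"
    run ip addr del "$VIP/$PREFIX" dev "$NIC"

    # Step 2: run the agent by hand for each operation.
    for action in start monitor stop; do
        agent "$action"
    done

    # Hand the resource back to pacemaker.
    run crm resource manage "$RES"
}

check_vip
```

To actually execute the commands, run it as root on a test node with DRY_RUN=0. If the agent's stop action hangs here as well, the "sh -x" output should show which command it is waiting on.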