Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
After setting crmd-transition-delay to 2 * my ping monitor interval, the issues I was seeing before in testing have not recurred. Thanks again for the help.
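[For reference, a sketch of that change in crm shell syntax; the 10s value is an assumption based on the 5s p_ping monitor interval in the configuration quoted later in this thread:]

  # crmd-transition-delay = 2 * the p_ping monitor interval (2 * 5s = 10s)
  crm configure property crmd-transition-delay="10s"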
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
On 24/05/2013, at 2:43 AM, Andrew Widdersheim awiddersh...@hotmail.com wrote:

> After setting crmd-transition-delay to 2 * my ping monitor interval, the issues I was seeing before in testing have not recurred.

Even a couple of seconds should be plenty. The dampen value gets the updates almost arriving at the same time; crmd-transition-delay is just for the last bit.
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
Have I just run into a shortcoming with pacemaker? Should I file a bug or RFE somewhere? It seems like there should be another parameter when setting up a pingd resource that tells the DC/policy engine to wait x seconds, so that all nodes have shared their connection state before it makes a decision about moving resources.
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
On 21/05/2013, at 1:39 AM, Andrew Widdersheim awiddersh...@hotmail.com wrote:

> Have I just run into a shortcoming with pacemaker?

Short answer: yes, but there is a work-around.

Basically, attrd should be, but is not, truly atomic. Despite its best efforts, updates can still arrive at sufficiently different times to produce the behavior you saw.

> Should I file a bug or RFE somewhere? It seems like there should be another parameter when setting up a pingd resource that tells the DC/policy engine to wait x seconds, so that all nodes have shared their connection state before it makes a decision about moving resources.

That would be:

    crmd-transition-delay = time [0s]
        *** Advanced Use Only *** Enabling this option will slow down cluster recovery under all conditions.
        Delay cluster recovery for the configured interval to allow for additional/related events to occur.
        Useful if your configuration is sensitive to the order in which ping updates arrive.

from "man crmd" :)

For some reason it's not in Pacemaker Explained; I'll fix that now.
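[Set cluster-wide, the option lands in the same cib-bootstrap-options block used in the configurations quoted in this thread; a rough sketch, with the 10s value purely illustrative (roughly 2 * the ping monitor interval):]

  property $id=cib-bootstrap-options \
          ... \
          crmd-transition-delay=10s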
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
Hello,

How do you configure your cluster network? Are you using a private network for the cluster and a public one for the services?

2013/5/15 Andrew Widdersheim awiddersh...@hotmail.com:

> Sorry to bring up old issues, but I am having the exact same problem as the original poster. A simultaneous disconnect on my two-node cluster causes the resources to start to transition to the other node, but mid-flight the transition is aborted and resources are started again on the original node when the cluster realizes connectivity is the same between the two nodes.
>
> I have tried various dampen settings without any luck. It seems the nodes report the outages at slightly different times, which results in a partial transition of resources instead of waiting until the connectivity of all of the nodes in the cluster is known before taking action, which is what I would have thought dampen would help solve. Ideally the cluster wouldn't start the transition if another cluster node is having a connectivity issue as well, since connectivity status is shared between all cluster nodes.
>
> Find my configuration below. Let me know if there is something I can change to fix this, or if this behavior is expected.
>
> primitive p_drbd ocf:linbit:drbd \
>         params drbd_resource=r1 \
>         op monitor interval=30s role=Slave \
>         op monitor interval=10s role=Master
> primitive p_fs ocf:heartbeat:Filesystem \
>         params device=/dev/drbd/by-res/r1 directory=/drbd/r1 fstype=ext4 options=noatime \
>         op start interval=0 timeout=60s \
>         op stop interval=0 timeout=180s \
>         op monitor interval=30s timeout=40s
> primitive p_mysql ocf:heartbeat:mysql \
>         params binary=/usr/libexec/mysqld config=/drbd/r1/mysql/my.cnf datadir=/drbd/r1/mysql \
>         op start interval=0 timeout=120s \
>         op stop interval=0 timeout=120s \
>         op monitor interval=30s \
>         meta target-role=Started
> primitive p_ping ocf:pacemaker:ping \
>         params host_list=192.168.5.1 dampen=30s multiplier=1000 debug=true \
>         op start interval=0 timeout=60s \
>         op stop interval=0 timeout=60s \
>         op monitor interval=5s timeout=10s
> group g_mysql_group p_fs p_mysql \
>         meta target-role=Started
> ms ms_drbd p_drbd \
>         meta notify=true master-max=1 clone-max=2 target-role=Started
> clone cl_ping p_ping
> location l_connected g_mysql \
>         rule $id=l_connected-rule pingd: defined pingd
> colocation c_mysql_on_drbd inf: g_mysql ms_drbd:Master
> order o_drbd_before_mysql inf: ms_drbd:promote g_mysql:start
> property $id=cib-bootstrap-options \
>         dc-version=1.1.6-1.el6-8b6c6b9b6dc2627713f870850d20163fad4cc2a2 \
>         cluster-infrastructure=Heartbeat \
>         no-quorum-policy=ignore \
>         stonith-enabled=false \
>         cluster-recheck-interval=5m \
>         last-lrm-refresh=1368632470
> rsc_defaults $id=rsc-options \
>         migration-threshold=5 \
>         resource-stickiness=200
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
The cluster has three connections total. The first is the outside interface where services communicate, which is also used for cluster communication via mcast. The second is a cross-over solely for cluster communication. The third is another cross-over solely for DRBD replication. This issue happens when the first connection, used for both the services and cluster communication, is pulled on both nodes at the same time.
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
Andrew,

I'd recommend adding more than one host to your p_ping resource and seeing if that improves the situation. When I had this problem, I observed better behavior after adding more than one IP to the list of hosts and changing the p_ping location constraint to the following:

location loc_run_on_most_connected g_mygroup \
        rule $id=loc_run_on_most_connected-rule -inf: not_defined p_ping or p_ping lte 0

More information: http://www.gossamer-threads.com/lists/linuxha/pacemaker/81502#81502

Hope this helps,

Andrew

----- Original Message -----
From: Andrew Widdersheim awiddersh...@hotmail.com
To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
Sent: Thursday, May 16, 2013 9:35:56 AM
Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

> The cluster has three connections total. The first is the outside interface where services communicate, which is also used for cluster communication via mcast. The second is a cross-over solely for cluster communication. The third is another cross-over solely for DRBD replication. This issue happens when the first connection, used for both the services and cluster communication, is pulled on both nodes at the same time.
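[Combining both suggestions, the result might look roughly like this sketch: the second IP is a placeholder, name=p_ping is assumed so the attribute matches the constraint above, and the group and timings come from the configuration quoted earlier in this thread:]

  primitive p_ping ocf:pacemaker:ping \
          params name=p_ping host_list="192.168.5.1 192.168.5.2" dampen=30s multiplier=1000 \
          op monitor interval=5s timeout=10s
  clone cl_ping p_ping
  location loc_run_on_most_connected g_mysql_group \
          rule $id=loc_run_on_most_connected-rule -inf: not_defined p_ping or p_ping lte 0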
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
Thanks for the help. Adding another node to the ping host_list may help in some situations, but the root issue doesn't really get solved. Also, the location constraint you posted is very different from mine: your constraint requires connectivity, whereas the one I am trying to use looks for the best connectivity. I have used the location constraint you posted with success in the past, but I don't want my resources to be shut off in the event of a network outage that hits all nodes at the same time. Don't get me wrong: in some cluster configurations I do use the setup you posted, but this one is not among them, for specific reasons.
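[For clarity, the two constraint styles being contrasted here, side by side in crm shell; attribute name pingd and group g_mysql_group as in the configuration quoted earlier, the second rule adapted from the suggestion above:]

  # "Best connectivity": prefer the node with the highest ping score,
  # but keep resources running even if every node loses connectivity.
  location l_connected g_mysql_group \
          rule $id=l_connected-rule pingd: defined pingd

  # "Required connectivity": ban any node with no connectivity; resources
  # stop entirely if all nodes lose it at once.
  location l_connected_req g_mysql_group \
          rule $id=l_connected_req-rule -inf: not_defined pingd or pingd lte 0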
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
Sorry to bring up old issues, but I am having the exact same problem as the original poster. A simultaneous disconnect on my two-node cluster causes the resources to start to transition to the other node, but mid-flight the transition is aborted and resources are started again on the original node when the cluster realizes connectivity is the same between the two nodes.

I have tried various dampen settings without any luck. It seems the nodes report the outages at slightly different times, which results in a partial transition of resources instead of waiting until the connectivity of all of the nodes in the cluster is known before taking action, which is what I would have thought dampen would help solve. Ideally the cluster wouldn't start the transition if another cluster node is having a connectivity issue as well, since connectivity status is shared between all cluster nodes.

Find my configuration below. Let me know if there is something I can change to fix this, or if this behavior is expected.

primitive p_drbd ocf:linbit:drbd \
        params drbd_resource=r1 \
        op monitor interval=30s role=Slave \
        op monitor interval=10s role=Master
primitive p_fs ocf:heartbeat:Filesystem \
        params device=/dev/drbd/by-res/r1 directory=/drbd/r1 fstype=ext4 options=noatime \
        op start interval=0 timeout=60s \
        op stop interval=0 timeout=180s \
        op monitor interval=30s timeout=40s
primitive p_mysql ocf:heartbeat:mysql \
        params binary=/usr/libexec/mysqld config=/drbd/r1/mysql/my.cnf datadir=/drbd/r1/mysql \
        op start interval=0 timeout=120s \
        op stop interval=0 timeout=120s \
        op monitor interval=30s \
        meta target-role=Started
primitive p_ping ocf:pacemaker:ping \
        params host_list=192.168.5.1 dampen=30s multiplier=1000 debug=true \
        op start interval=0 timeout=60s \
        op stop interval=0 timeout=60s \
        op monitor interval=5s timeout=10s
group g_mysql_group p_fs p_mysql \
        meta target-role=Started
ms ms_drbd p_drbd \
        meta notify=true master-max=1 clone-max=2 target-role=Started
clone cl_ping p_ping
location l_connected g_mysql \
        rule $id=l_connected-rule pingd: defined pingd
colocation c_mysql_on_drbd inf: g_mysql ms_drbd:Master
order o_drbd_before_mysql inf: ms_drbd:promote g_mysql:start
property $id=cib-bootstrap-options \
        dc-version=1.1.6-1.el6-8b6c6b9b6dc2627713f870850d20163fad4cc2a2 \
        cluster-infrastructure=Heartbeat \
        no-quorum-policy=ignore \
        stonith-enabled=false \
        cluster-recheck-interval=5m \
        last-lrm-refresh=1368632470
rsc_defaults $id=rsc-options \
        migration-threshold=5 \
        resource-stickiness=200
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
On 2013-05-15 20:44, Andrew Widdersheim wrote:

> Sorry to bring up old issues, but I am having the exact same problem as the original poster. A simultaneous disconnect on my two-node cluster causes the resources to start to transition to the other node, but mid-flight the transition is aborted and resources are started again on the original node when the cluster realizes connectivity is the same between the two nodes.
>
> I have tried various dampen settings without any luck. It seems the nodes report the outages at slightly different times, which results in a partial transition of resources instead of waiting until the connectivity of all of the nodes in the cluster is known before taking action, which is what I would have thought dampen would help solve.

You have some logs for us?

> Ideally the cluster wouldn't start the transition if another cluster node is having a connectivity issue as well, since connectivity status is shared between all cluster nodes.
>
> Find my configuration below. Let me know if there is something I can change to fix this, or if this behavior is expected.
>
> primitive p_drbd ocf:linbit:drbd \
>         params drbd_resource=r1 \
>         op monitor interval=30s role=Slave \
>         op monitor interval=10s role=Master
> primitive p_fs ocf:heartbeat:Filesystem \
>         params device=/dev/drbd/by-res/r1 directory=/drbd/r1 fstype=ext4 options=noatime \
>         op start interval=0 timeout=60s \
>         op stop interval=0 timeout=180s \
>         op monitor interval=30s timeout=40s
> primitive p_mysql ocf:heartbeat:mysql \
>         params binary=/usr/libexec/mysqld config=/drbd/r1/mysql/my.cnf datadir=/drbd/r1/mysql \
>         op start interval=0 timeout=120s \
>         op stop interval=0 timeout=120s \
>         op monitor interval=30s \
>         meta target-role=Started
> primitive p_ping ocf:pacemaker:ping \
>         params host_list=192.168.5.1 dampen=30s multiplier=1000 debug=true \
>         op start interval=0 timeout=60s \
>         op stop interval=0 timeout=60s \
>         op monitor interval=5s timeout=10s
> group g_mysql_group p_fs p_mysql \
>         meta target-role=Started
> ms ms_drbd p_drbd \
>         meta notify=true master-max=1 clone-max=2 target-role=Started
> clone cl_ping p_ping
> location l_connected g_mysql \
>         rule $id=l_connected-rule pingd: defined pingd
> colocation c_mysql_on_drbd inf: g_mysql ms_drbd:Master
> order o_drbd_before_mysql inf: ms_drbd:promote g_mysql:start
> property $id=cib-bootstrap-options \
>         dc-version=1.1.6-1.el6-8b6c6b9b6dc2627713f870850d20163fad4cc2a2 \
>         cluster-infrastructure=Heartbeat \

Hmm ... you compiled your own Pacemaker version that supports Heartbeat on RHEL6?

Best regards,
Andreas

--
Need help with Pacemaker?
http://www.hastexo.com/now

>         no-quorum-policy=ignore \
>         stonith-enabled=false \
>         cluster-recheck-interval=5m \
>         last-lrm-refresh=1368632470
> rsc_defaults $id=rsc-options \
>         migration-threshold=5 \
>         resource-stickiness=200
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
I attached logs from both nodes. Yes, we compiled 1.1.6 with heartbeat support for RHEL 6.4. I tried 1.1.10 but had issues; I have another thread open on the mailing list for that issue as well. I'm not opposed to moving to CMAN or corosync if those fix the problem.

We have been using this setup, or one very similar, for about 2-3 years. Florian Haas actually came to our company to do a training for us when he was still at Linbit, and this is how we set it up then and have continued to do so since. We never had an issue up until this point, because all of our clusters in the past were set up so that connectivity was required, and it was expected that resources would shut down during an event like this.

May 15 13:42:00 node1 lrmd: [27346]: info: RA output: (p_ping:1:monitor:stderr) logd is not running
May 15 13:42:00 node1 lrmd: [27346]: info: RA output: (p_ping:1:monitor:stderr) 2013/05/15_13:42:00 WARNING: 192.168.5.1 is inactive
May 15 13:42:09 node1 lrmd: [27346]: info: RA output: (p_ping:1:monitor:stderr) logd is not running
May 15 13:42:09 node1 lrmd: [27346]: info: RA output: (p_ping:1:monitor:stderr) 2013/05/15_13:42:09 WARNING: 192.168.5.1 is inactive
May 15 13:42:18 node1 lrmd: [27346]: info: RA output: (p_ping:1:monitor:stderr) logd is not running
May 15 13:42:18 node1 lrmd: [27346]: info: RA output: (p_ping:1:monitor:stderr) 2013/05/15_13:42:18 WARNING: 192.168.5.1 is inactive
May 15 13:42:27 node1 lrmd: [27346]: info: RA output: (p_ping:1:monitor:stderr) logd is not running
May 15 13:42:27 node1 lrmd: [27346]: info: RA output: (p_ping:1:monitor:stderr) 2013/05/15_13:42:27 WARNING: 192.168.5.1 is inactive
May 15 13:42:30 node1 attrd: [27348]: notice: attrd_trigger_update: Sending flush op to all hosts for: pingd (0)
May 15 13:42:30 node1 attrd: [27348]: notice: attrd_perform_update: Sent update 238: pingd=0
May 15 13:42:30 node1 crmd: [27349]: info: abort_transition_graph: te_update_diff:164 - Triggered transition abort (complete=1, tag=nvpair, id=status-f5a576b5-003b-447d-8029-19202823bbfa-pingd, name=pingd, value=0, magic=NA, cib=0.75.78) : Transient attribute: update
May 15 13:42:30 node1 crmd: [27349]: info: do_state_transition: State transition S_IDLE - S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
May 15 13:42:30 node1 crmd: [27349]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
May 15 13:42:30 node1 crmd: [27349]: info: do_pe_invoke: Query 362: Requesting the current CIB: S_POLICY_ENGINE
May 15 13:42:30 node1 crmd: [27349]: info: do_pe_invoke_callback: Invoking the PE: query=362, ref=pe_calc-dc-1368639750-564, seq=8, quorate=1
May 15 13:42:30 node1 pengine: [6643]: notice: unpack_config: On loss of CCM Quorum: Ignore
May 15 13:42:30 node1 pengine: [6643]: notice: unpack_rsc_op: Operation p_drbd:1_last_failure_0 found resource p_drbd:1 active on node2
May 15 13:42:30 node1 pengine: [6643]: notice: RecurringOp: Start recurring monitor (30s) for p_fs on node2
May 15 13:42:30 node1 pengine: [6643]: notice: RecurringOp: Start recurring monitor (30s) for p_mysql on node2
May 15 13:42:30 node1 pengine: [6643]: notice: RecurringOp: Start recurring monitor (30s) for p_drbd:0 on node1
May 15 13:42:30 node1 pengine: [6643]: notice: RecurringOp: Start recurring monitor (10s) for p_drbd:1 on node2
May 15 13:42:30 node1 pengine: [6643]: notice: RecurringOp: Start recurring monitor (30s) for p_drbd:0 on node1
May 15 13:42:30 node1 pengine: [6643]: notice: RecurringOp: Start recurring monitor (10s) for p_drbd:1 on node2
May 15 13:42:30 node1 pengine: [6643]: notice: LogActions: Move p_fs#011(Started node1 - node2)
May 15 13:42:30 node1 pengine: [6643]: notice: LogActions: Move p_mysql#011(Started node1 - node2)
May 15 13:42:30 node1 pengine: [6643]: notice: LogActions: Demote p_drbd:0#011(Master - Slave node1)
May 15 13:42:30 node1 pengine: [6643]: notice: LogActions: Promote p_drbd:1#011(Slave - Master node2)
May 15 13:42:30 node1 pengine: [6643]: notice: LogActions: Leave p_ping:0#011(Started node2)
May 15 13:42:30 node1 pengine: [6643]: notice: LogActions: Leave p_ping:1#011(Started node1)
May 15 13:42:30 node1 crmd: [27349]: info: do_state_transition: State transition S_POLICY_ENGINE - S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
May 15 13:42:30 node1 crmd: [27349]: info: unpack_graph: Unpacked transition 56: 40 actions in 40 synapses
May 15 13:42:30 node1 crmd: [27349]: info: do_te_invoke: Processing graph 56 (ref=pe_calc-dc-1368639750-564) derived from /var/lib/pengine/pe-input-64.bz2
May 15 13:42:30 node1 crmd: [27349]: info: te_pseudo_action: Pseudo action 23 fired and confirmed
May 15 13:42:30 node1 crmd: [27349]: info: te_rsc_command: Initiating action 7: cancel p_drbd:0_monitor_1 on node1 (local)
May 15 13:42:30 node1 lrmd: [18185]: WARN: For LSB init script, no additional
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
On Tue, Aug 28, 2012 at 3:01 AM, Andrew Martin amar...@xes-inc.com wrote:

> Jake,
>
> Attached is the log from the same period for node2. If I am reading this correctly, it looks like there was a 7 second difference between when node1 set its score to 1000 and when node2 set its score to 1000?

Assuming the time is in sync on both nodes, yes. This is somewhat expected, since your monitor interval is 10s. This is why we recommend dampen = 2 * monitor. From the next log, it looks like you're using 5s (the -d option) instead of 20s.

Aug 22 10:40:38 node1 attrd_updater: [1860]: info: Invoked: attrd_updater -n p_ping -v 1000 -d 5s
Aug 22 10:40:43 node1 attrd: [4402]: notice: attrd_trigger_update: Sending flush op to all hosts for: p_ping (1000)
Aug 22 10:40:44 node1 attrd: [4402]: notice: attrd_perform_update: Sent update 265: p_ping=1000
Aug 22 10:40:45 node2 attrd_updater: [27245]: info: Invoked: attrd_updater -n p_ping -v 1000 -d 5s
Aug 22 10:40:50 node2 attrd: [4069]: notice: attrd_trigger_update: Sending flush op to all hosts for: p_ping (1000)
Aug 22 10:40:50 node2 attrd: [4069]: notice: attrd_perform_update: Sent update 122: p_ping=1000

> I had changed the attempts value to 8 (from the default 2) to address this same issue - to avoid resource migration based on brief connectivity problems with these IPs - however, if we can get dampen configured correctly I'll set it back to the default.
>
> Thanks,
>
> Andrew

----- Original Message -----
From: Jake Smith jsm...@argotec.com
To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
Sent: Monday, August 27, 2012 9:39:30 AM
Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

----- Original Message -----
From: Andrew Martin amar...@xes-inc.com
To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
Sent: Thursday, August 23, 2012 7:36:26 PM
Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

> Hi Florian,
>
> Thanks for the suggestion. I gave it a try, but even with a dampen value greater than 2 * the monitoring interval the same behavior occurred (pacemaker restarted the resources on the same node). Here are my current ocf:pacemaker:ping settings:
>
> primitive p_ping ocf:pacemaker:ping \
>         params name=p_ping host_list="192.168.0.128 192.168.0.129" dampen=25s multiplier=1000 attempts=8 debug=true \
>         op start interval=0 timeout=60 \
>         op monitor interval=10s timeout=60
>
> Any other ideas on what is causing this behavior? My understanding is that the above config tells the cluster to attempt 8 pings to each of the IPs, and to assume an IP is down if none of the 8 come back. Thus, an IP would have to be down for more than 8 seconds to be considered down. The dampen parameter tells the cluster to wait before making any decision, so that if the IP comes back online within the dampen period then no action is taken. Is this correct?

I'm no expert on this either, but I believe the dampen isn't long enough. I think what you say above is correct, except that not only does the IP need to come back online - the cluster must also attempt to ping it successfully. I would suggest trying dampen with greater than 3 * the monitor value.

I don't think it's a problem, but why change attempts from the default 2 to 8?

> Thanks,
>
> Andrew
>
> ----- Original Message -----
> From: Florian Crouzat gen...@floriancrouzat.net
> To: pacemaker@oss.clusterlabs.org
> Sent: Thursday, August 23, 2012 3:57:02 AM
> Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
>
> On 22/08/2012 18:23, Andrew Martin wrote:
>
>> Hello,
>>
>> I have a 3-node Pacemaker + Heartbeat cluster (two real nodes and one quorum node that cannot run resources) running on Ubuntu 12.04 Server amd64. This cluster has a DRBD resource that it mounts and then runs a KVM virtual machine from. I have configured the cluster to use ocf:pacemaker:ping with two other devices on the network (192.168.0.128, 192.168.0.129), and set constraints to move the resources to the most well-connected node (whichever node can see more of these two devices):
>>
>> primitive p_ping ocf:pacemaker:ping \
>>         params name=p_ping host_list="192.168.0.128 192.168.0.129" multiplier=1000 attempts=8 debug=true \
>>         op start interval=0 timeout=60 \
>>         op monitor interval=10s timeout=60
>> ...
>> clone cl_ping p_ping \
>>         meta interleave=true
>> ...
>> location loc_run_on_most_connected g_vm \
>>         rule $id=loc_run_on_most_connected-rule p_ping: defined p_ping
>>
>> Today, 192.168.0.128's network cable was unplugged for a few seconds and then plugged back in. During this time, pacemaker recognized that it could not ping 192.168.0.128 and restarted all of the resources, but left them on the same node. My understanding was that since neither node could ping 192.168.0.128 during this period, pacemaker would do nothing with the resources (leave them running
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
On Tue, Aug 28, 2012 at 6:44 AM, Andrew Martin amar...@xes-inc.com wrote:

> Hi Jake,
>
> Thank you for the detailed analysis of this problem. The original reason I was utilizing ocf:pacemaker:ping was to ensure that the node with the best network connectivity (judged by the ability to communicate with 192.168.0.128 and 192.168.0.129) would be the one running the resources. However, it is possible that either of these IPs could be down for maintenance or a hardware failure, and the cluster should not be affected by this. It seems that a synchronous ping check from all of the nodes would ensure this behavior without this unfortunate side-effect. Is there another way to achieve the same network connectivity check instead of using ocf:pacemaker:ping? I know the other *ping* resource agents are deprecated.

With the correct value of dampen, things should behave as expected regardless of which ping variant is used.

> Thanks,
>
> Andrew

From: Jake Smith jsm...@argotec.com
To: Andrew Martin amar...@xes-inc.com
Cc: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
Sent: Monday, August 27, 2012 1:47:25 PM
Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

----- Original Message -----
From: Andrew Martin amar...@xes-inc.com
To: Jake Smith jsm...@argotec.com, The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
Sent: Monday, August 27, 2012 1:01:54 PM
Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

> Jake,
>
> Attached is the log from the same period for node2. If I am reading this correctly, it looks like there was a 7 second difference between when node1 set its score to 1000 and when node2 set its score to 1000?

I agree, and (I think) more importantly this is what caused the issue - to the best of my knowledge, not necessarily fact ;-)

At 10:40:43 node1 updates its pingd to 1000, causing the policy engine to recalculate node preference.
At 10:40:44 transition 760 is initiated to move everything to the more preferred node2, because its pingd value is 2000.
At 10:40:50 node2's pingd value drops to 1000. The policy engine doesn't stop or change the in-process transition - node1 and node2 are equal now, but the transition is in process and node1 isn't more preferred, so it continues.
At 10:41:02 ping is back on node1, ready to update pingd to 2000.
At 10:41:07, after dampen, node1 updates pingd to 2000, which is greater than node2's value.
At 10:41:08 the cluster recognizes a change in pingd value that requires a recalculation of node preference and aborts the in-process transition (760).

I believe the cluster then waits for all in-process actions to complete, so the cluster is in a known state to recalculate.

At 10:42:10, I'm guessing, the shutdown timeout is reached without completing, so VirtualDomain is forcibly shut down.

Once all of that is done, transition 760 finishes stopping/aborting, with some actions completed and some not:

Aug 22 10:42:13 node1 crmd: [4403]: notice: run_graph: Transition 760 (Complete=20, Pending=0, Fired=0, Skipped=39, Incomplete=30, Source=/var/lib/pengine/pe-input-2952.bz2): Stopped

Then the cluster recalculates node preference and restarts those services that are stopped on node1, because the pingd scores between node1 and node2 are equal, so there is a preference to stay on node1, where some services are still active (drbd or such, I'm guessing, are still running on node1).

Aug 22 10:40:38 node1 attrd_updater: [1860]: info: Invoked: attrd_updater -n p_ping -v 1000 -d 5s

Before this is the ping fail:

Aug 22 10:40:31 node1 ping[1668]: [1823]: WARNING: 192.168.0.128 is inactive: PING 192.168.0.128 (192.168.0.128) 56(84) bytes of data.#012#012--- 192.168.0.128 ping statistics ---#0128 packets transmitted, 0 received, 100% packet loss, time 7055ms

Then you get the 7 second delay to do the 8 attempts, I believe, and then the 5 second dampen (-d 5s) brings us to:

Aug 22 10:40:43 node1 attrd: [4402]: notice: attrd_trigger_update: Sending flush op to all hosts for: p_ping (1000)
Aug 22 10:40:44 node1 attrd: [4402]: notice: attrd_perform_update: Sent update 265: p_ping=1000

Same thing on node2 - it fails at 10:40:38 and then 7 seconds later:

Aug 22 10:40:45 node2 attrd_updater: [27245]: info: Invoked: attrd_updater -n p_ping -v 1000 -d 5s

5s dampen:

Aug 22 10:40:50 node2 attrd: [4069]: notice: attrd_trigger_update: Sending flush op to all hosts for: p_ping (1000)
Aug 22 10:40:50 node2 attrd: [4069]: notice: attrd_perform_update: Sent update 122: p_ping=1000

> I had changed the attempts value to 8 (from the default 2) to address this same issue - to avoid resource migration based on brief connectivity problems with these IPs - however, if we can get dampen configured correctly I'll set it back to the default.

Well after looking through both more
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
----- Original Message -----
From: Andrew Martin amar...@xes-inc.com
To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
Sent: Thursday, August 23, 2012 7:36:26 PM
Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

> Hi Florian,
>
> Thanks for the suggestion. I gave it a try, but even with a dampen value greater than 2 * the monitoring interval the same behavior occurred (pacemaker restarted the resources on the same node). Here are my current ocf:pacemaker:ping settings:
>
> primitive p_ping ocf:pacemaker:ping \
>         params name=p_ping host_list="192.168.0.128 192.168.0.129" dampen=25s multiplier=1000 attempts=8 debug=true \
>         op start interval=0 timeout=60 \
>         op monitor interval=10s timeout=60
>
> Any other ideas on what is causing this behavior? My understanding is that the above config tells the cluster to attempt 8 pings to each of the IPs, and to assume an IP is down if none of the 8 come back. Thus, an IP would have to be down for more than 8 seconds to be considered down. The dampen parameter tells the cluster to wait before making any decision, so that if the IP comes back online within the dampen period then no action is taken. Is this correct?

I'm no expert on this either, but I believe the dampen isn't long enough. I think what you say above is correct, except that not only does the IP need to come back online - the cluster must also attempt to ping it successfully. I would suggest trying dampen with greater than 3 * the monitor value.

I don't think it's a problem, but why change attempts from the default 2 to 8?

> Thanks,
>
> Andrew
>
> ----- Original Message -----
> From: Florian Crouzat gen...@floriancrouzat.net
> To: pacemaker@oss.clusterlabs.org
> Sent: Thursday, August 23, 2012 3:57:02 AM
> Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
>
> On 22/08/2012 18:23, Andrew Martin wrote:
>
>> Hello,
>>
>> I have a 3-node Pacemaker + Heartbeat cluster (two real nodes and one quorum node that cannot run resources) running on Ubuntu 12.04 Server amd64. This cluster has a DRBD resource that it mounts and then runs a KVM virtual machine from. I have configured the cluster to use ocf:pacemaker:ping with two other devices on the network (192.168.0.128, 192.168.0.129), and set constraints to move the resources to the most well-connected node (whichever node can see more of these two devices):
>>
>> primitive p_ping ocf:pacemaker:ping \
>>         params name=p_ping host_list="192.168.0.128 192.168.0.129" multiplier=1000 attempts=8 debug=true \
>>         op start interval=0 timeout=60 \
>>         op monitor interval=10s timeout=60
>> ...
>> clone cl_ping p_ping \
>>         meta interleave=true
>> ...
>> location loc_run_on_most_connected g_vm \
>>         rule $id=loc_run_on_most_connected-rule p_ping: defined p_ping
>>
>> Today, 192.168.0.128's network cable was unplugged for a few seconds and then plugged back in. During this time, pacemaker recognized that it could not ping 192.168.0.128 and restarted all of the resources, but left them on the same node. My understanding was that since neither node could ping 192.168.0.128 during this period, pacemaker would do nothing with the resources (leave them running). It would only migrate or restart the resources if, for example, node2 could ping 192.168.0.128 but node1 could not (move the resources to where things are better connected). Is this understanding incorrect? If so, is there a way I can change my configuration so that it will only restart/migrate resources if one node is found to be better connected? Can you tell me why these resources were restarted? I have attached the syslog as well as my full CIB configuration.

As was said already, the log shows node1 changed its value for pingd to 1000, waited the 5 seconds of dampening, and then started actions to move the resources. In the midst of stopping everything, ping ran again successfully and the value increased back to 2000. This caused the policy engine to recalculate scores for all resources (before they had the chance to start on node2).

I'm no scoring expert, but I know there is additional value given to keeping resources collocated with their partners that are already running, plus resource stickiness to not move. So in this situation, once pingd was back at 2000, the score to stay/run on node1 was greater than the score to move, so things that were stopped or stopping were restarted on node1.

So increasing the dampen value should help/fix this. Unfortunately, you didn't include the log from node2, so we can't correlate node2's pingd values with node1's at the same times. I believe if you look at the pingd values and the times that movement starts between the nodes, you will be able to make a better guess at how high a dampen value would ensure the nodes had the same pingd value *before* the dampen time ran out, and that should prevent movement.

HTH
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
----- Original Message -----
From: Andrew Martin amar...@xes-inc.com
To: Jake Smith jsm...@argotec.com, The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
Sent: Monday, August 27, 2012 1:01:54 PM
Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

> Jake,
>
> Attached is the log from the same period for node2. If I am reading this correctly, it looks like there was a 7 second difference between when node1 set its score to 1000 and when node2 set its score to 1000?

I agree, and (I think) more importantly this is what caused the issue - to the best of my knowledge, not necessarily fact ;-)

At 10:40:43 node1 updates its pingd to 1000, causing the policy engine to recalculate node preference.
At 10:40:44 transition 760 is initiated to move everything to the more preferred node2, because its pingd value is 2000.
At 10:40:50 node2's pingd value drops to 1000. The policy engine doesn't stop or change the in-process transition - node1 and node2 are equal now, but the transition is in process and node1 isn't more preferred, so it continues.
At 10:41:02 ping is back on node1, ready to update pingd to 2000.
At 10:41:07, after dampen, node1 updates pingd to 2000, which is greater than node2's value.
At 10:41:08 the cluster recognizes a change in pingd value that requires a recalculation of node preference and aborts the in-process transition (760).

I believe the cluster then waits for all in-process actions to complete, so the cluster is in a known state to recalculate.

At 10:42:10, I'm guessing, the shutdown timeout is reached without completing, so VirtualDomain is forcibly shut down.

Once all of that is done, transition 760 finishes stopping/aborting, with some actions completed and some not:

Aug 22 10:42:13 node1 crmd: [4403]: notice: run_graph: Transition 760 (Complete=20, Pending=0, Fired=0, Skipped=39, Incomplete=30, Source=/var/lib/pengine/pe-input-2952.bz2): Stopped

Then the cluster recalculates node preference and restarts those services that are stopped on node1, because the pingd scores between node1 and node2 are equal, so there is a preference to stay on node1, where some services are still active (drbd or such, I'm guessing, are still running on node1).

Aug 22 10:40:38 node1 attrd_updater: [1860]: info: Invoked: attrd_updater -n p_ping -v 1000 -d 5s

Before this is the ping fail:

Aug 22 10:40:31 node1 ping[1668]: [1823]: WARNING: 192.168.0.128 is inactive: PING 192.168.0.128 (192.168.0.128) 56(84) bytes of data.#012#012--- 192.168.0.128 ping statistics ---#0128 packets transmitted, 0 received, 100% packet loss, time 7055ms

Then you get the 7 second delay to do the 8 attempts, I believe, and then the 5 second dampen (-d 5s) brings us to:

Aug 22 10:40:43 node1 attrd: [4402]: notice: attrd_trigger_update: Sending flush op to all hosts for: p_ping (1000)
Aug 22 10:40:44 node1 attrd: [4402]: notice: attrd_perform_update: Sent update 265: p_ping=1000

Same thing on node2 - it fails at 10:40:38 and then 7 seconds later:

Aug 22 10:40:45 node2 attrd_updater: [27245]: info: Invoked: attrd_updater -n p_ping -v 1000 -d 5s

5s dampen:

Aug 22 10:40:50 node2 attrd: [4069]: notice: attrd_trigger_update: Sending flush op to all hosts for: p_ping (1000)
Aug 22 10:40:50 node2 attrd: [4069]: notice: attrd_perform_update: Sent update 122: p_ping=1000

> I had changed the attempts value to 8 (from the default 2) to address this same issue - to avoid resource migration based on brief connectivity problems with these IPs - however, if we can get dampen configured correctly I'll set it back to the default.

Well after looking through both more closely, I'm not sure dampen is what you'll need to fix the deeper problem. The time between fail and return was 10:40:31 to 10:41:02, or 32 seconds (31 on node2). I believe if you had a dampen value greater than the monitor value plus the time failed, then nothing would have happened (dampen 10 + 32). However, I'm not sure I would call 32 seconds a blip in connection - that's up to you. And since dampen applies to all of the ping clones equally, given a ping failure longer than your dampen value you would still have the same problem. For example, assuming a dampen of 45 seconds: node1 fails at 1:01, node2 fails at 1:08. Node1 will still update its pingd value at 1:52 - 7 seconds before node2 will - and the transition will still happen, even though both nodes have the same connectivity in reality.

I guess what I'm saying in the end is that dampen is there to prevent movement for a momentary outage/blip in the pings, the idea being that the pings will return before the dampen expires. It isn't going to wait out the dampen on the other node(s) before making a decision. You would need to be able to add something like a sleep 10s in there AFTER the pingd value is updated BEFORE evaluating the node preference scoring!

So in the end I don't have a fix for you except maybe
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
Hi Jake,

Thank you for the detailed analysis of this problem. The original reason I was utilizing ocf:pacemaker:ping was to ensure that the node with the best network connectivity (judged by the ability to communicate with 192.168.0.128 and 192.168.0.129) would be the one running the resources. However, it is possible that either of these IPs could be down for maintenance or a hardware failure, and the cluster should not be affected by this. It seems that a synchronous ping check from all of the nodes would ensure this behavior without this unfortunate side-effect. Is there another way to achieve the same network connectivity check instead of using ocf:pacemaker:ping? I know the other *ping* resource agents are deprecated.

Thanks,

Andrew

----- Original Message -----
From: Jake Smith jsm...@argotec.com
To: Andrew Martin amar...@xes-inc.com
Cc: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
Sent: Monday, August 27, 2012 1:47:25 PM
Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

----- Original Message -----
From: Andrew Martin amar...@xes-inc.com
To: Jake Smith jsm...@argotec.com, The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
Sent: Monday, August 27, 2012 1:01:54 PM
Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

> Jake,
>
> Attached is the log from the same period for node2. If I am reading this correctly, it looks like there was a 7 second difference between when node1 set its score to 1000 and when node2 set its score to 1000?

I agree, and (I think) more importantly this is what caused the issue - to the best of my knowledge, not necessarily fact ;-)

At 10:40:43 node1 updates its pingd to 1000, causing the policy engine to recalculate node preference.
At 10:40:44 transition 760 is initiated to move everything to the more preferred node2, because its pingd value is 2000.
At 10:40:50 node2's pingd value drops to 1000. The policy engine doesn't stop or change the in-process transition - node1 and node2 are equal now, but the transition is in process and node1 isn't more preferred, so it continues.
At 10:41:02 ping is back on node1, ready to update pingd to 2000.
At 10:41:07, after dampen, node1 updates pingd to 2000, which is greater than node2's value.
At 10:41:08 the cluster recognizes a change in pingd value that requires a recalculation of node preference and aborts the in-process transition (760).

I believe the cluster then waits for all in-process actions to complete, so the cluster is in a known state to recalculate.

At 10:42:10, I'm guessing, the shutdown timeout is reached without completing, so VirtualDomain is forcibly shut down.

Once all of that is done, transition 760 finishes stopping/aborting, with some actions completed and some not:

Aug 22 10:42:13 node1 crmd: [4403]: notice: run_graph: Transition 760 (Complete=20, Pending=0, Fired=0, Skipped=39, Incomplete=30, Source=/var/lib/pengine/pe-input-2952.bz2): Stopped

Then the cluster recalculates node preference and restarts those services that are stopped on node1, because the pingd scores between node1 and node2 are equal, so there is a preference to stay on node1, where some services are still active (drbd or such, I'm guessing, are still running on node1).

Aug 22 10:40:38 node1 attrd_updater: [1860]: info: Invoked: attrd_updater -n p_ping -v 1000 -d 5s

Before this is the ping fail:

Aug 22 10:40:31 node1 ping[1668]: [1823]: WARNING: 192.168.0.128 is inactive: PING 192.168.0.128 (192.168.0.128) 56(84) bytes of data.#012#012--- 192.168.0.128 ping statistics ---#0128 packets transmitted, 0 received, 100% packet loss, time 7055ms

Then you get the 7 second delay to do the 8 attempts, I believe, and then the 5 second dampen (-d 5s) brings us to:

Aug 22 10:40:43 node1 attrd: [4402]: notice: attrd_trigger_update: Sending flush op to all hosts for: p_ping (1000)
Aug 22 10:40:44 node1 attrd: [4402]: notice: attrd_perform_update: Sent update 265: p_ping=1000

Same thing on node2 - it fails at 10:40:38 and then 7 seconds later:

Aug 22 10:40:45 node2 attrd_updater: [27245]: info: Invoked: attrd_updater -n p_ping -v 1000 -d 5s

5s dampen:

Aug 22 10:40:50 node2 attrd: [4069]: notice: attrd_trigger_update: Sending flush op to all hosts for: p_ping (1000)
Aug 22 10:40:50 node2 attrd: [4069]: notice: attrd_perform_update: Sent update 122: p_ping=1000

> I had changed the attempts value to 8 (from the default 2) to address this same issue - to avoid resource migration based on brief connectivity problems with these IPs - however, if we can get dampen configured correctly I'll set it back to the default.

Well after looking through both more closely, I'm not sure dampen is what you'll need to fix the deeper problem. The time between fail and return was 10:40:31 to 10:41:02, or 32 seconds (31 on node2). I believe if you had a dampen value that was greater than
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
On 24/08/2012 01:36, Andrew Martin wrote:

> The dampen parameter tells the cluster to wait before making any decision, so that if the IP comes back online within the dampen period then no action is taken. Is this correct?

This is also my understanding of this parameter.

--
Cheers,
Florian Crouzat
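[As the log lines elsewhere in this thread show, the ping resource agent applies dampen by passing it through as attrd_updater's -d option; a sketch of the equivalent manual update, with the attribute name and value taken from those logs and the 20s delay illustrative:]

  # hold the p_ping attribute change for 20s before flushing it cluster-wide
  attrd_updater -n p_ping -v 1000 -d 20s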
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
On 22/08/2012 18:23, Andrew Martin wrote:

> Hello,
>
> I have a 3-node Pacemaker + Heartbeat cluster (two real nodes and one quorum node that cannot run resources) running on Ubuntu 12.04 Server amd64. This cluster has a DRBD resource that it mounts and then runs a KVM virtual machine from. I have configured the cluster to use ocf:pacemaker:ping with two other devices on the network (192.168.0.128, 192.168.0.129), and set constraints to move the resources to the most well-connected node (whichever node can see more of these two devices):
>
> primitive p_ping ocf:pacemaker:ping \
>         params name=p_ping host_list="192.168.0.128 192.168.0.129" multiplier=1000 attempts=8 debug=true \
>         op start interval=0 timeout=60 \
>         op monitor interval=10s timeout=60
> ...
> clone cl_ping p_ping \
>         meta interleave=true
> ...
> location loc_run_on_most_connected g_vm \
>         rule $id=loc_run_on_most_connected-rule p_ping: defined p_ping
>
> Today, 192.168.0.128's network cable was unplugged for a few seconds and then plugged back in. During this time, pacemaker recognized that it could not ping 192.168.0.128 and restarted all of the resources, but left them on the same node. My understanding was that since neither node could ping 192.168.0.128 during this period, pacemaker would do nothing with the resources (leave them running). It would only migrate or restart the resources if, for example, node2 could ping 192.168.0.128 but node1 could not (move the resources to where things are better connected). Is this understanding incorrect? If so, is there a way I can change my configuration so that it will only restart/migrate resources if one node is found to be better connected? Can you tell me why these resources were restarted? I have attached the syslog as well as my full CIB configuration.
>
> Thanks,
>
> Andrew Martin

This is an interesting question and I'm also interested in answers. I have made the same observations, and there is also the case where the monitor() calls aren't synced across all nodes: node1 issues a monitor() on the ping resource and finds the ping node dead; node2 hasn't pinged yet, so node1 moves things to node2; but node2 now issues a monitor() and also finds the ping node dead.

The only solution I found was to adjust the dampen parameter to at least 2 * the monitor() interval, so that I can be *sure* that all nodes have issued a monitor() and have all decreased their scores, so that when a decision occurs, nothing moves.

It's been a long time since I tested this; my cluster is very, very stable. I guess I should retest to validate that it's still a working trick.

dampen (integer, [5s]): Dampening interval
    The time to wait (dampening) further changes occur

Eg:

primitive ping-nq-sw-swsec ocf:pacemaker:ping \
        params host_list="192.168.10.1 192.168.2.11 192.168.2.12" dampen=35s attempts=2 timeout=2 multiplier=100 \
        op monitor interval=15s

--
Cheers,
Florian Crouzat
Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
Hi Florian,

Thanks for the suggestion. I gave it a try, but even with a dampen value greater than 2 * the monitoring interval the same behavior occurred (pacemaker restarted the resources on the same node). Here are my current ocf:pacemaker:ping settings:

primitive p_ping ocf:pacemaker:ping \
        params name=p_ping host_list="192.168.0.128 192.168.0.129" dampen=25s multiplier=1000 attempts=8 debug=true \
        op start interval=0 timeout=60 \
        op monitor interval=10s timeout=60

Any other ideas on what is causing this behavior? My understanding is that the above config tells the cluster to attempt 8 pings to each of the IPs, and to assume an IP is down if none of the 8 come back. Thus, an IP would have to be down for more than 8 seconds to be considered down. The dampen parameter tells the cluster to wait before making any decision, so that if the IP comes back online within the dampen period then no action is taken. Is this correct?

Thanks,

Andrew

----- Original Message -----
From: Florian Crouzat gen...@floriancrouzat.net
To: pacemaker@oss.clusterlabs.org
Sent: Thursday, August 23, 2012 3:57:02 AM
Subject: Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

On 22/08/2012 18:23, Andrew Martin wrote:

> Hello,
>
> I have a 3-node Pacemaker + Heartbeat cluster (two real nodes and one quorum node that cannot run resources) running on Ubuntu 12.04 Server amd64. This cluster has a DRBD resource that it mounts and then runs a KVM virtual machine from. I have configured the cluster to use ocf:pacemaker:ping with two other devices on the network (192.168.0.128, 192.168.0.129), and set constraints to move the resources to the most well-connected node (whichever node can see more of these two devices):
>
> primitive p_ping ocf:pacemaker:ping \
>         params name=p_ping host_list="192.168.0.128 192.168.0.129" multiplier=1000 attempts=8 debug=true \
>         op start interval=0 timeout=60 \
>         op monitor interval=10s timeout=60
> ...
> clone cl_ping p_ping \
>         meta interleave=true
> ...
> location loc_run_on_most_connected g_vm \
>         rule $id=loc_run_on_most_connected-rule p_ping: defined p_ping
>
> Today, 192.168.0.128's network cable was unplugged for a few seconds and then plugged back in. During this time, pacemaker recognized that it could not ping 192.168.0.128 and restarted all of the resources, but left them on the same node. My understanding was that since neither node could ping 192.168.0.128 during this period, pacemaker would do nothing with the resources (leave them running). It would only migrate or restart the resources if, for example, node2 could ping 192.168.0.128 but node1 could not (move the resources to where things are better connected). Is this understanding incorrect? If so, is there a way I can change my configuration so that it will only restart/migrate resources if one node is found to be better connected? Can you tell me why these resources were restarted?
>
> I have attached the syslog as well as my full CIB configuration.

This is an interesting question and I'm also interested in answers. I have made the same observations, and there is also the case where the monitor() calls aren't synced across all nodes: node1 issues a monitor() on the ping resource and finds the ping node dead; node2 hasn't pinged yet, so node1 moves things to node2; but node2 now issues a monitor() and also finds the ping node dead.

The only solution I found was to adjust the dampen parameter to at least 2 * the monitor() interval, so that I can be *sure* that all nodes have issued a monitor() and have all decreased their scores, so that when a decision occurs, nothing moves.

It's been a long time since I tested this; my cluster is very, very stable. I guess I should retest to validate that it's still a working trick.

dampen (integer, [5s]): Dampening interval
    The time to wait (dampening) further changes occur

Eg:

primitive ping-nq-sw-swsec ocf:pacemaker:ping \
        params host_list="192.168.10.1 192.168.2.11 192.168.2.12" dampen=35s attempts=2 timeout=2 multiplier=100 \
        op monitor interval=15s

--
Cheers,
Florian Crouzat
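[Applying the rule of thumb from this thread - dampen of at least 2 * the monitor interval, as also recommended earlier - to the settings above would look roughly like this sketch; it assumes the 10s monitor interval stays as-is and that attempts returns to its default, as discussed above:]

  primitive p_ping ocf:pacemaker:ping \
          params name=p_ping host_list="192.168.0.128 192.168.0.129" dampen=20s multiplier=1000 debug=true \
          op start interval=0 timeout=60 \
          op monitor interval=10s timeout=60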
[Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?
Hello,

I have a 3-node Pacemaker + Heartbeat cluster (two real nodes and one quorum node that cannot run resources) running on Ubuntu 12.04 Server amd64. This cluster has a DRBD resource that it mounts and then runs a KVM virtual machine from. I have configured the cluster to use ocf:pacemaker:ping with two other devices on the network (192.168.0.128, 192.168.0.129), and set constraints to move the resources to the most well-connected node (whichever node can see more of these two devices):

primitive p_ping ocf:pacemaker:ping \
        params name=p_ping host_list="192.168.0.128 192.168.0.129" multiplier=1000 attempts=8 debug=true \
        op start interval=0 timeout=60 \
        op monitor interval=10s timeout=60
...
clone cl_ping p_ping \
        meta interleave=true
...
location loc_run_on_most_connected g_vm \
        rule $id=loc_run_on_most_connected-rule p_ping: defined p_ping

Today, 192.168.0.128's network cable was unplugged for a few seconds and then plugged back in. During this time, pacemaker recognized that it could not ping 192.168.0.128 and restarted all of the resources, but left them on the same node. My understanding was that since neither node could ping 192.168.0.128 during this period, pacemaker would do nothing with the resources (leave them running). It would only migrate or restart the resources if, for example, node2 could ping 192.168.0.128 but node1 could not (move the resources to where things are better connected). Is this understanding incorrect? If so, is there a way I can change my configuration so that it will only restart/migrate resources if one node is found to be better connected? Can you tell me why these resources were restarted? I have attached the syslog as well as my full CIB configuration.

Thanks,

Andrew Martin

Aug 22 10:40:31 node1 ping[1668]: [1823]: WARNING: 192.168.0.128 is inactive: PING 192.168.0.128 (192.168.0.128) 56(84) bytes of data.#012#012--- 192.168.0.128 ping statistics ---#0128 packets transmitted, 0 received, 100% packet loss, time 7055ms
Aug 22 10:40:38 node1 attrd_updater: [1860]: info: Invoked: attrd_updater -n p_ping -v 1000 -d 5s
Aug 22 10:40:43 node1 attrd: [4402]: notice: attrd_trigger_update: Sending flush op to all hosts for: p_ping (1000)
Aug 22 10:40:44 node1 attrd: [4402]: notice: attrd_perform_update: Sent update 265: p_ping=1000
Aug 22 10:40:44 node1 crmd: [4403]: info: abort_transition_graph: te_update_diff:164 - Triggered transition abort (complete=1, tag=nvpair, id=status-1ab0690c-5aa0-4d9c-ae4e-b662e0ca54e5-p_ping, name=p_ping, value=1000, magic=NA, cib=0.121.49) : Transient attribute: update
Aug 22 10:40:44 node1 crmd: [4403]: info: do_state_transition: State transition S_IDLE - S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Aug 22 10:40:44 node1 crmd: [4403]: info: do_state_transition: All 3 cluster nodes are eligible to run resources.
Aug 22 10:40:44 node1 crmd: [4403]: info: do_pe_invoke: Query 1023: Requesting the current CIB: S_POLICY_ENGINE
Aug 22 10:40:44 node1 crmd: [4403]: info: do_pe_invoke_callback: Invoking the PE: query=1023, ref=pe_calc-dc-1345650044-1095, seq=130, quorate=1
Aug 22 10:40:44 node1 pengine: [13079]: notice: unpack_rsc_op: Hard error - p_drbd_mount1:0_last_failure_0 failed with rc=5: Preventing ms_drbd_tools from re-starting on quorum
Aug 22 10:40:44 node1 pengine: [13079]: notice: unpack_rsc_op: Hard error - p_drbd_vmstore:0_last_failure_0 failed with rc=5: Preventing ms_drbd_vmstore from re-starting on quorum
Aug 22 10:40:44 node1 pengine: [13079]: notice: unpack_rsc_op: Hard error - p_vm_myvm_last_failure_0 failed with rc=5: Preventing p_vm_myvm from re-starting on quorum
Aug 22 10:40:44 node1 pengine: [13079]: notice: unpack_rsc_op: Hard error - p_drbd_mount2:0_last_failure_0 failed with rc=5: Preventing ms_drbd_crm from re-starting on quorum
Aug 22 10:40:44 node1 pengine: [13079]: notice: unpack_rsc_op: Operation p_drbd_vmstore:0_last_failure_0 found resource p_drbd_vmstore:0 active on node1
Aug 22 10:40:44 node1 pengine: [13079]: notice: unpack_rsc_op: Operation p_drbd_mount2:0_last_failure_0 found resource p_drbd_mount2:0 active on node1
Aug 22 10:40:44 node1 pengine: [13079]: notice: unpack_rsc_op: Operation p_drbd_mount1:0_last_failure_0 found resource p_drbd_mount1:0 active on node1
Aug 22 10:40:44 node1 pengine: [13079]: notice: RecurringOp: Start recurring monitor (20s) for p_drbd_mount2:0 on node1
Aug 22 10:40:44 node1 pengine: [13079]: notice: RecurringOp: Start recurring monitor (10s) for p_drbd_mount2:1 on node2
Aug 22 10:40:44 node1 pengine: [13079]: notice: RecurringOp: Start recurring monitor (20s) for p_drbd_mount2:0 on node1
Aug 22 10:40:44 node1 pengine: [13079]: notice: RecurringOp: Start recurring monitor (10s) for p_drbd_mount2:1 on node2
Aug 22 10:40:44 node1 pengine: [13079]: notice: RecurringOp: Start recurring monitor (20s) for p_drbd_mount1:0 on node1
Aug 22 10:40:44 node1 pengine: [13079]: notice: RecurringOp: Start recurring