Thanks for your hints. I had the same issue, and your tips nearly resolved it for me, but I have a question. I set the default timeout, and afterwards the pingd resource started to work as expected. I had an iptables rule dropping ICMP on one node and received:

============
Last updated: Thu May 20 14:30:20 2010
Stack: openais
Current DC: test01-node1 - partition with quorum
Version: 1.0.8-2c98138c2f070fcb6ddeab1084154cffbf44ba75
2 Nodes configured, 3 expected votes
3 Resources configured.
============

Online: [ test01-node2 test01-node1 ]

Full list of resources:

 Master/Slave Set: ms_drbd_mysql0
     Masters: [ test01-node1 ]
     Slaves: [ test01-node2 ]
 Resource Group: grp_MySQL
     res_Filesystem (ocf::heartbeat:Filesystem): Started test01-node1
     res_ClusterIP (ocf::heartbeat:IPaddr2): Started test01-node1
     res_MySQL (lsb:mysql): Started test01-node1
     res_Apache (lsb:apache2): Started test01-node1
 Clone Set: cl-pinggw
     Started: [ test01-node2 test01-node1 ]

Migration summary:
* Node test01-node1: pingd=100
* Node test01-node2: pingd=0

When I remove the dropping iptables rule with iptables -F, the ping works again, BUT the Migration summary doesn't change. I understood it like this: if pingd fails, the node gets a value equal to the multiplier (100 in my case). I thought: okay, the pingd resource works on both nodes again, so both should have a value of 100. Am I right, or did I misunderstand it?

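For reference, here is a minimal sketch of how the pingd value and the location rule in the config that follows are usually understood to interact. It assumes the documented behaviour that the pingd attribute is the number of reachable hosts from host_list multiplied by multiplier, so 0 would mean "no ping target reachable" and 100 would mean "the one target is reachable". The function names are hypothetical and are not part of Pacemaker; this only models the arithmetic, not the daemon or its update timing.

```python
# Illustrative model only. pingd_score and location_score are hypothetical
# helper names, not Pacemaker APIs.
NEG_INF = float("-inf")


def pingd_score(reachable_hosts, multiplier):
    """Assumed pingd attribute value: reachable hosts times multiplier."""
    return reachable_hosts * multiplier


def location_score(pingd):
    """Model of the rule '-inf: not_defined pingd or pingd lte 0'.

    pingd is None when the attribute is not defined on the node.
    Returns -inf when the rule matches, otherwise 0 (the rule adds nothing).
    """
    if pingd is None or pingd <= 0:
        return NEG_INF
    return 0


# One entry in host_list ("10.1.1.162") and multiplier="100", as in the config.
node1 = pingd_score(reachable_hosts=1, multiplier=100)  # gateway reachable
node2 = pingd_score(reachable_hosts=0, multiplier=100)  # ICMP dropped

print("test01-node1:", node1, location_score(node1))
print("test01-node2:", node2, location_score(node2))
```

Under that assumption, pingd=0 on test01-node2 marks the node whose ping is failing, and the value would be expected to return to 100 once the gateway answers again and the attribute has been updated.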
Here is my working config:

node test01-node1 \
        attributes standby="off"
node test01-node2 \
        attributes standby="off"
primitive drbd_test0 ocf:linbit:drbd \
        params drbd_resource="test0" \
        operations $id="drbd_test0-operations"
primitive pinggw ocf:pacemaker:pingd \
        params host_list="10.1.1.162" multiplier="100" \
        op monitor interval="10"
primitive res_Apache lsb:apache2 \
        operations $id="res_Apache-operations" \
        op stop interval="0" timeout="15" \
        op monitor interval="15" timeout="15" start-delay="15" \
        op status interval="0" timeout="15" \
        op start interval="0" timeout="15" \
        meta target-role="Started" is-managed="true"
primitive res_ClusterIP ocf:heartbeat:IPaddr2 \
        params iflabel="ClusterIP" ip="10.1.1.175" \
        operations $id="res_ClusterIP_1-operations" \
        op stop interval="0" timeout="100" \
        op monitor interval="10" timeout="20" start-delay="0" \
        op status interval="10" timeout="20" \
        op start interval="0" timeout="90"
primitive res_Filesystem ocf:heartbeat:Filesystem \
        params fstype="xfs" directory="/mnt/cluster" device="/dev/drbd0" \
        operations $id="res_Filesystem-operations" \
        op stop interval="0" timeout="60" \
        op monitor interval="20" timeout="40" start-delay="0" \
        op start interval="0" timeout="60"
primitive res_MySQL lsb:mysql \
        operations $id="res_MySQL-operations" \
        op stop interval="0" timeout="15" \
        op monitor interval="15" timeout="15" start-delay="15" \
        op status interval="0" timeout="15" \
        op start interval="0" timeout="15"
group grp_MySQL res_Filesystem res_ClusterIP res_MySQL res_Apache \
        meta target-role="Started"
ms ms_drbd_mysql0 drbd_test0 \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone cl-pinggw pinggw \
        meta globally-unique="false"
location cli-prefer-grp_MySQL grp_MySQL \
        rule $id="cli-prefer-rule-grp_MySQL" inf: #uname eq test01-node2
location grp_MySQL-with-pinggw grp_MySQL \
        rule $id="grp_MySQL-with-pinggw-rule-1" -inf: not_defined pingd or pingd lte 0
colocation col_drbd_on_mysql inf: grp_MySQL ms_drbd_mysql0:Master
order mysql_after_drbd inf: ms_drbd_mysql0:promote grp_MySQL:start
property $id="cib-bootstrap-options" \
        expected-quorum-votes="3" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        dc-version="1.0.8-2c98138c2f070fcb6ddeab1084154cffbf44ba75" \
        cluster-infrastructure="openais" \
        last-lrm-refresh="1274352045" \
        symmetric-cluster="true" \
        default-action-timeout="120s"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

Thanks in advance.

Regards
Sebastian

From: Vadym Chepkov [mailto:vchep...@gmail.com]
Sent: Tuesday, 11 May 2010 19:56
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] clone ip definition and location stops myresources...

There is no "default" unless it's set; that's why crm complains.

On Tue, May 11, 2010 at 12:41 PM, Gianluca Cecchi <gianluca.cec...@gmail.com> wrote:

On Tue, May 11, 2010 at 5:47 PM, Vadym Chepkov <vchep...@gmail.com> wrote:

pingd is a daemon which is running all the time and does its job. You still need to define a monitor operation, though: what if the daemon dies? op monitor just has a different meaning for ping and pingd:
with pingd - monitor daemon
with ping - monitor connectivity

As for warnings:
crm configure property default-action-timeout="120s"

Thanks again! Now it is clearer.
Only doubt: why doesn't pacemaker set 120s directly as the default timeout? Any drawbacks in setting it to 120?
Also, with crm configure show I can see

property $id="cib-bootstrap-options" \
        dc-version="1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1273484758"
rsc_defaults $id="rsc-options" \
        resource-stickiness="1000"

Is there any way to see the default value of the "default-action-timeout" parameter that I'm going to change (I presume it is 20s from the warnings I received), and of other parameters that are not shown by the show command?

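As a rough illustration of what default-action-timeout changes, the sketch below assumes that an operation with an explicit timeout keeps that value and an operation without one falls back to the cluster-wide default-action-timeout. The 20s fallback is the value presumed in the quoted mail, not something read from a running cluster, and the helper names are hypothetical. The parser handles plain seconds only ("120s" or "15"), not the other time units Pacemaker accepts.

```python
# Illustrative model only. parse_seconds and effective_timeout are
# hypothetical helpers, not part of Pacemaker or the crm shell.

def parse_seconds(value):
    """Parse a plain-seconds value such as '120s' or '15' into an int."""
    return int(value.rstrip("s"))


def effective_timeout(op, default_action_timeout="20s"):
    """Return the operation's own timeout, else the cluster-wide default."""
    return parse_seconds(op.get("timeout", default_action_timeout))


# Two operations taken from the config in this mail.
pinggw_monitor = {"name": "monitor", "interval": "10"}               # no timeout set
apache_stop = {"name": "stop", "interval": "0", "timeout": "15"}     # explicit timeout

# With default-action-timeout="120s" set as a cluster property:
print(effective_timeout(pinggw_monitor, "120s"))
print(effective_timeout(apache_stop, "120s"))

# With the property unset, assuming the presumed 20s applies:
print(effective_timeout(pinggw_monitor))
```

In this model only the pinggw monitor is affected by the property, because every other operation in the config carries its own timeout.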
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf