Re: [Pacemaker] Resource Agent ethmonitor
I resolved the problem. I found that this is a bug in the ethmonitor agent. In ethmonitor:

    # get the link status on $NIC
    # asks ip about running (up) interfaces, returns the number of matching interface names that are up
    get_link_status () {
        $IP2UTIL -o link show up dev $NIC | grep -c $NIC
    }

The command "ip -o link show up dev eth0" only detects that the interface has been brought down; it cannot detect that the link is down. So I guess the developer only tested with "ifdown eth0" (or bond0) and did not consider the case where the cable is unplugged.

In the end I decided to add the check to IPaddr2 and no longer use the ethmonitor agent. I changed the monitor function of the ocf:heartbeat:IPaddr2 agent:

    ip_monitor() {
        # TODO: Implement more elaborate monitoring like checking for
        # interface health maybe via a daemon like FailSafe etc...

        t=$(ip link show $NIC | grep -c "state UP")
        test $t -ne 1 && return $OCF_ERR_PERM
        [...]

So if the NIC link goes down or the interface goes down, the resource is switched to the other node. But you need to add the failure-timeout meta attribute to the ocf:heartbeat:IPaddr2 primitive, something like this:

    node sles11264-node1
    node sles11264-node2
    primitive p_apache lsb:apache2 \
        op monitor interval=15 timeout=30
    primitive p_vip ocf:heartbeat:IPaddr2 \
        params ip=192.168.203.250 nic=eth0 iflabel=0 \
        op monitor interval=10 timeout=20 \
        meta failure-timeout=5
    group g_apache p_vip p_apache \
        meta target-role=Started
    property $id=cib-bootstrap-options \
        dc-version=1.1.6-b988976485d15cb702c9307df55512d323831a5e \
        cluster-infrastructure=openais \
        expected-quorum-votes=2 \
        stonith-enabled=no \
        no-quorum-policy=ignore \
        last-lrm-refresh=1340872994

Be careful with the value of meta failure-timeout=5. If you set it too small, the other node will not have enough time to take over. So work out the time your cluster needs and set it larger.

My English is not good; I hope you can understand. If you read Chinese, you can see my blog.
http://linux.52zhe.info/read.php/275.htm

On Fri, Jun 29, 2012 at 1:01 PM, kook kook...@gmail.com wrote:
> For test. I don't know how to reply to this subject.
>
> On Mon, Jun 25, 2012 at 4:00 PM, kook kook...@gmail.com wrote:
>> [...]

-- 
I have a dream. Hehe

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
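For anyone comparing the two checks discussed in this message, here is a minimal sketch of the difference. It is not taken from either agent: link_state_up is a hypothetical helper name, and the two sample lines are abbreviated, illustrative lines in the usual iproute2 layout rather than captured output. The point is that an unplugged interface still carries the administrative UP flag and still matches its own name, while its reported state is no longer "state UP".

```shell
#!/bin/sh
# Hypothetical helper (not part of IPaddr2 or ethmonitor): succeed only if
# the given `ip link show` output reports the operational state as UP.
link_state_up() {
    # $1: text as printed by `ip link show <nic>`
    t=$(printf '%s\n' "$1" | grep -c "state UP")
    test "$t" -eq 1
}

# Abbreviated sample lines (illustrative only, not captured from a real host).
plugged='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000'
unplugged='2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 state DOWN qlen 1000'

# Counting lines that contain the interface name cannot tell the two apart,
# because both lines contain "eth0".
printf '%s\n' "$plugged"   | grep -c eth0
printf '%s\n' "$unplugged" | grep -c eth0

# Matching on the reported state does tell them apart.
if link_state_up "$plugged"; then echo "plugged: link up"; fi
if ! link_state_up "$unplugged"; then echo "unplugged: link down"; fi
```

One caveat with matching on "state UP": interfaces that do not report an operational state (the loopback device, for example) show "state UNKNOWN", so a check like this would treat them as down.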
[Pacemaker] Resource Agent ethmonitor
Dear Fiorenza,

I have the same problem as you. I checked the newest ethmonitor RA (ClusterLabs-resource-agents-v3.9.2-0-ge261943.tar); it is the same as the one on my SLES 11 SP2.

Failed actions:
    p_ethmonitor:1_monitor_15000 (node=sles11264-node1, call=1591, rc=-2, status=Timed Out): unknown exec error

So, can you tell me how you solved this problem?

Thanks.
liujia
Re: [Pacemaker] Resource Agent ethmonitor
On 21/03/2012 09:06, Florian Haas wrote:
> [...]

Thank you, I solved the problem.

Regards
-- 
Fiorenza Meini
Spazio Web S.r.l.
V. Dante Alighieri, 10 - 13900 Biella
Tel.: 015.2431982 - 015.9526066
Fax: 015.2522600
Reg. Imprese, CF e P.I.: 02414430021
Iscr. REA: BI - 188936
Iscr. CCIAA: Biella - 188936
Cap. Soc.: 30.000,00 Euro i.v.
Re: [Pacemaker] Resource Agent ethmonitor
On Tue, Mar 20, 2012 at 4:18 PM, Fiorenza Meini fme...@esseweb.eu wrote:
> Hi there,
> has anybody configured successfully the RA specified in the object of the
> message? I got this error:
>
> if_eth0_monitor_0 (node=fw1, call=2297, rc=-2, status=Timed Out): unknown exec error

Your ethmonitor RA missed its 50-second timeout on the probe (that is, the initial monitor operation). You should be seeing "Monitoring of if_eth0 failed, X retries left" warnings in your logs. Grepping your syslog for "ethmonitor" will probably turn up some useful results.

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now
[Pacemaker] Resource Agent ethmonitor
Hi there,
has anybody successfully configured the RA named in the subject of this message? I got this error:

    if_eth0_monitor_0 (node=fw1, call=2297, rc=-2, status=Timed Out): unknown exec error

The RA definition in the CIB is:

    primitive if_eth0 ocf:heartbeat:ethmonitor \
        params interface=eth0 name=wan_eth0 \
        op monitor interval=20s timeout=50s \
        op start interval=0 timeout=60 \
        op stop interval=0 timeout=20

Thanks and regards
-- 
Fiorenza Meini