On Tue, 15 Dec 2009 07:13:29 +0000, Chris Picton wrote:

>>> > The monitor op shouldn't make any changes.  If the rule has gone
>>> > away, the monitor op should return failure to indicate the resource
>>> > is broken, which will result in Pacemaker telling the failed
>>> > resource to stop, and start again.  Actually, from the logs it looks
>>> > like a restart was attempted, and the stop op reported success, but
>>> > the subsequent start failed for some reason.
>>> >
>>> > Regards,
>>> >
>>> > Tim
>>> 
>>> Exactly. So the RA seems to have a problem handling this error
>>> scenario correctly.
>> 
>> OK. Does anybody know how it should work and where the problem is? It
>> seems like it can't find some proc file.
> 
> I will have a go at fixing the RA today to do the following:
> 1. Detect the error in monitor and return the correct value
> 2. Stop the resource cleanly
> 3. Start it up again.
> 
> 
> Will let you know how it goes.

The patch below seems to detect this specific failure (the address is
still served but the CIP iptables rule is gone) and stop the resource
cleanly.

The start operation is then able to start it up again without errors.
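
For anyone who wants to try the detection logic outside the cluster, here
is a minimal standalone sketch of the idea, not the RA code itself. The
helper names cip_status and monitor_rc are hypothetical; the numeric
return codes are the standard OCF values, and the generic-error fallback
for unknown states is an assumption about what the RA does in its
catch-all branch.

```shell
#!/bin/sh
# Standard OCF return codes.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical helper, not part of IPaddr2: classify the CLUSTERIP state
# from the proc file path ($1) and this node's number ($2).
cip_status() {
    cip_file=$1
    node_no=$2
    if [ ! -e "$cip_file" ]; then
        # The iptables rule is gone, so the proc file no longer exists.
        echo "partial2"
    elif grep -Eq "(^|,)${node_no}(,|\$)" "$cip_file"; then
        # This node's number is listed in the file.
        echo "ok"
    else
        # Rule present, but this node's number is not listed.
        echo "partial"
    fi
}

# Hypothetical helper: map a status string to the exit code that the
# monitor op would report.
monitor_rc() {
    case $1 in
        ok)                  echo "$OCF_SUCCESS" ;;
        partial|no|partial2) echo "$OCF_NOT_RUNNING" ;;
        *)                   echo "$OCF_ERR_GENERIC" ;;  # assumed fallback
    esac
}
```

With this, monitor_rc "$(cip_status /path/to/proc/file 2)" reports "not
running" as soon as the proc file disappears, which is what lets
Pacemaker schedule the stop and start.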

Chris

-------------------
--- IPaddr2.orig        2009-12-15 10:07:58.000000000 +0200
+++ IPaddr2.new 2009-12-15 10:22:03.000000000 +0200
@@ -548,6 +548,7 @@
 # returns:
 # ok = served (for CIP: + hash bucket)
 # partial = served and no hash bucket (CIP only)
+# partial2 = served and no CIP iptables rule
 # no = nothing
 #
 ip_served() {
@@ -577,6 +578,10 @@
        fi
 
        # Special handling for the CIP:
+       if [ ! -e $IP_CIP_FILE ]; then
+               echo "partial2"
+               return 0
+       fi
        if egrep -q "(^|,)${IP_INC_NO}(,|$)" $IP_CIP_FILE ; then
                echo "ok"
                return 0
@@ -620,7 +625,7 @@
                exit $OCF_SUCCESS
        fi
        
-       if [ -n "$IP_CIP" ] && [ $ip_status = "no" ]; then
+       if [ -n "$IP_CIP" ] && [ $ip_status = "no" ] || [ $ip_status = "partial2" ]; then
                $MODPROBE ip_conntrack
                $IPTABLES -I INPUT -d $BASEIP -i $NIC -j CLUSTERIP \
                                --new \
@@ -691,13 +696,14 @@
                fi
        fi
        local ip_status=`ip_served`
+       ocf_log info "IP status = $ip_status, IP_CIP=$IP_CIP"
 
        if [ $ip_status = "no" ]; then
                : Requested interface not in use
                exit $OCF_SUCCESS
        fi
 
-       if [ -n "$IP_CIP" ]; then
+       if [ -n "$IP_CIP" ] && [ $ip_status != "partial2" ]; then
                if [ $ip_status = "partial" ]; then
                        exit $OCF_SUCCESS
                fi
@@ -743,7 +749,7 @@
        ok)
                return $OCF_SUCCESS
                ;;
-       partial|no)
+       partial|no|partial2)
                exit $OCF_NOT_RUNNING
                ;;
        *)
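
A note on the start condition in the hunk at line 620: in sh, && and ||
have equal precedence and associate left to right, so the test reads as
(A && B) || C. The sketch below compares that form with an explicitly
grouped one. The function names as_patched and grouped are hypothetical
and exist only for this comparison. Assuming ip_served only ever reports
"partial2" when IP_CIP is set (the new check sits under "Special handling
for the CIP", which suggests so), the two forms behave the same in
practice; the grouped form just makes the intent explicit.

```shell
#!/bin/sh
# $1 = value of IP_CIP, $2 = value of ip_status

# The condition as written in the patch: parsed as (A && B) || C.
as_patched() {
    if [ -n "$1" ] && [ "$2" = "no" ] || [ "$2" = "partial2" ]; then
        echo "setup"
    else
        echo "skip"
    fi
}

# Explicit grouping: A && (B || C).
grouped() {
    if [ -n "$1" ] && { [ "$2" = "no" ] || [ "$2" = "partial2" ]; }; then
        echo "setup"
    else
        echo "skip"
    fi
}
```

The only input where the two differ is an empty IP_CIP together with a
"partial2" status: as_patched would go on to set up the CLUSTERIP rule,
while grouped would skip it.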





