Hi
Roland G. McIntosh wrote:
No matter how many times I kill IPaddr2 I can't seem to cause a failover
in my simple 2 node cluster.
OT, but why do people keep calling 2-node clusters "simple" clusters?
Clusters are not simple. Maybe it's rather a "small" cluster.
I'm trying to get it working for the 3 services in my group (HB 2.1.3 on
RHEL4 using CentOS packages). I don't understand why showscores.sh
shows "INFINITY" for my OCF resources, but an integer value for the
IPaddr2 resource.
This is expected. In a colocated group, only the first resource receives
the configured integer stickiness value (times the number of resources
in that group). Read below.
Here is the output of my showscores.sh:
[EMAIL PROTECTED] rss]$ ./showscores.sh
Resource       Score      Node        Stickiness  #Fail  Fail-Stickiness
slink_db -INFINITY slinkfail 100 0 -30
slink_db INFINITY slinkmaster 100 0 -30
slink_ipaddr2 0 slinkfail 100 0 -30
slink_ipaddr2 400 slinkmaster 100 0 -30
As you can see here, you have a node preference of 100 plus 3 * 100
group stickiness, giving the 400.
slink_jboss -INFINITY slinkfail 100 0 -30
slink_jboss INFINITY slinkmaster 100 0 -30
The INFINITY is implicitly given by the colocated group - that way your
resources run on the same node. -INFINITY makes sure they don't run
on any node other than the one the first resource was started on.
With a failure stickiness of -30, you allow your group's resources to
fail 13 times (400/30 = 13.3); the 14th failure pushes the score below
zero and triggers the failover. Is that what you want?
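To make that arithmetic concrete, here is a minimal shell sketch. The
starting score of 400 and the -30 fail-stickiness are taken from the
scores above; the linear model (score = start + N * fail-stickiness) is
the one I'll describe on the ScoreCalculation page:

```shell
#!/bin/sh
# Sketch: score on the current node after N failures.
# 400 = node preference (100) + group stickiness (3 resources * 100).
start=400
fail_stickiness=-30
for n in `seq 1 15`; do
    score=$((start + n * fail_stickiness))
    echo "after $n failures: score = $score"
done
# The score first drops below zero at failure 14 (400 - 14*30 = -20),
# so the resource can fail 13 times and still stay put.
```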
I'm using the Mar 2008 version of showscores.sh (thanks Dominik!), so
perhaps this is related to the known issue of meta attributes on the
group instead of on the primitive.
From your config - no, it's not about that, as you don't have a
stickiness meta attribute for the group, just default values.
I've been trying to force a failover like this:
export OCF_RESKEY_ip=192.168.1.222
for nn in `seq 1 15`; do
    /usr/lib/ocf/resource.d/heartbeat/IPaddr2 stop
    sleep 1m
done
After one stop, the score becomes "200".
Then it seems to jump back up to 300 and stays there. It never proceeds
down below zero as I expect. I have a colocation constraint, as you can
see in my cib.xml.
You don't have any monitor operations for the ipaddr and jboss
resources. Failures on them are not detected. Configure monitor
operations and try again.
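For example, a monitor operation on the IPaddr2 primitive would look
something like this in 2.1.x CIB syntax (the ids, interval and timeout
here are illustrative; adjust them to your setup):

```xml
<primitive id="slink_ipaddr2" class="ocf" provider="heartbeat"
    type="IPaddr2">
  <operations>
    <op id="slink_ipaddr2_mon" name="monitor" interval="10s"
        timeout="20s"/>
  </operations>
  <instance_attributes id="slink_ipaddr2_ia">
    <attributes>
      <nvpair id="slink_ipaddr2_ip" name="ip" value="192.168.1.222"/>
    </attributes>
  </instance_attributes>
</primitive>
```

Without such an op, the CRM only knows the state it last set, so killing
the address by hand goes unnoticed.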
Also make sure you use a recent version; otherwise you may also hit the
bug in 2.1.3's crm where the failcount is not increased. This is fixed
in Pacemaker (0.6.x).
This line from your config:
<rsc_colocation id="colocation_MyGroup" from="MyGroup"
to="MyGroup" score="INFINITY"/>
is not needed. I don't even know what you want to express with this.
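For reference, a colocation constraint only makes sense between two
*different* resources. If, say, slink_ipaddr2 were a standalone
primitive that you wanted to keep on the same node as the group, it
would look something like this (the id is illustrative):

```xml
<rsc_colocation id="colocation_ip_with_group" from="slink_ipaddr2"
    to="MyGroup" score="INFINITY"/>
```

For members of a group this is redundant anyway - the group already
colocates and orders its resources.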
Regards
Dominik
ps. I'll add the group score things to
http://www.linux-ha.org/ScoreCalculation soon.
pps. where did you get the jboss RA? I'd be interested in it.
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems