Hi guys,

I've just had something odd happen to me and I'm not totally sure why.

This is the output of showscores.sh after cleanly bringing up both
nodes. drbd0 and drbd1 are master/slave resources, pingd is a clone
resource, and the rest are split into two groups, with Filesystem_2
and Filesystem_12 being the first primitive in each.
drbd0 and drbd1 have resource stickiness set to 1000, as does pingd.
Basically, I wanted to allow the servers to be failed over many, many
times before needing to reset scores.

Resource             Score      Node               Stickiness  Failcount  Failure-Stickiness
drbd0:0              0          kodiak.domain.org  100         0          -10
drbd0:0              1075       polar.domain.org   100         0          -10
drbd0:1              1075       kodiak.domain.org  100         0          -10
drbd0:1              -INFINITY  polar.domain.org   100         0          -10
drbd1:0              0          kodiak.domain.org  100         0          -10
drbd1:0              1075       polar.domain.org   100         0          -10
drbd1:1              1075       kodiak.domain.org  100         0          -10
drbd1:1              -INFINITY  polar.domain.org   100         0          -10
Filesystem_12        1475       polar.domain.org   100         0          -10
Filesystem_2         1475       kodiak.domain.org  100         0          -10
IPaddr_10_0_7_183    INFINITY   kodiak.domain.org  100         0          -10
IPaddr_10_0_7_183    -INFINITY  polar.domain.org   100         0          -10
IPaddr_10_0_7_184    -INFINITY  kodiak.domain.org  100         0          -10
IPaddr_10_0_7_184    INFINITY   polar.domain.org   100         0          -10
nfs-common_14        -INFINITY  kodiak.domain.org  100         0          -10
nfs-common_14        INFINITY   polar.domain.org   100         0          -10
nfs-common_4         INFINITY   kodiak.domain.org  100         0          -10
nfs-common_4         -INFINITY  polar.domain.org   100         0          -10
nfs-kernel-server_15 -INFINITY  kodiak.domain.org  100         0          -10
nfs-kernel-server_15 INFINITY   polar.domain.org   100         0          -10
nfs-kernel-server_5  INFINITY   kodiak.domain.org  100         0          -10
nfs-kernel-server_5  -INFINITY  polar.domain.org   100         0          -10
pingd-child:0        0          kodiak.domain.org  100         0          -10
pingd-child:0        1000       polar.domain.org   100         0          -10
pingd-child:1        1000       kodiak.domain.org  100         0          -10
pingd-child:1        -INFINITY  polar.domain.org   100         0          -10

First, I was wondering why the two Filesystem resources don't show
scores for both nodes. I also had a problem when I rebooted one of the
servers. These servers boot up pretty fast, so I'm wondering whether a
long gap between monitor operations combined with a fast reboot can
lead to problems.
The drbd resource on the rebooted node failed over as expected, as did
the Filesystem, but NOT the IP. The IP that previously ran on the
rebooted server was set to -INFINITY for both nodes and I'm not sure
why.
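
To make the two symptoms easier to spot in a long score listing, here is a
minimal sketch of a checker. It assumes the showscores.sh output has been
reflowed to one whitespace-separated row per line, with resource, score and
node as the first three columns. The function names (parse_scores,
missing_nodes, banned_everywhere) are my own and are not part of
showscores.sh or Heartbeat.

```python
from collections import defaultdict

NODES = ["kodiak.domain.org", "polar.domain.org"]

# A few rows copied from the score table, one row per line.
SAMPLE = """\
Resource          Score     Node
drbd0:0           0         kodiak.domain.org
drbd0:0           1075      polar.domain.org
Filesystem_2      1475      kodiak.domain.org
Filesystem_12     1475      polar.domain.org
"""


def parse_scores(text):
    """Return {resource: {node: score}} from whitespace-separated rows."""
    scores = defaultdict(dict)
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row (assumed to start with "Resource").
        if len(fields) < 3 or fields[0] == "Resource":
            continue
        resource, score, node = fields[:3]
        scores[resource][node] = score
    return dict(scores)


def missing_nodes(scores, nodes):
    """Resources that have no score row at all for at least one node."""
    result = {}
    for resource, per_node in scores.items():
        absent = sorted(set(nodes) - set(per_node))
        if absent:
            result[resource] = absent
    return result


def banned_everywhere(scores, nodes):
    """Resources whose score is -INFINITY on every node, so they cannot run."""
    return sorted(
        resource
        for resource, per_node in scores.items()
        if all(per_node.get(node) == "-INFINITY" for node in nodes)
    )


if __name__ == "__main__":
    scores = parse_scores(SAMPLE)
    print(missing_nodes(scores, NODES))
    print(banned_everywhere(scores, NODES))
```

Run against the full listing taken after the reboot, the second check should
name the IP resource that ended up with -INFINITY on both nodes.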

Below are my constraints as they are during a fresh start of the
cluster. After they're up I delete the first two location constraints
as I don't want drbd resources automatically failing back once the
cluster is up.
     <constraints>
       <rsc_location id="loc:r0_likes_kodiak" rsc="ms-drbd0">
         <rule id="rule:r0_likes_kodiak" role="master" score="1">
           <expression attribute="#uname" operation="eq"
                       value="kodiak.domain.org"
                       id="3ac0cc2b-e503-4b36-a960-a07ed8447b71"/>
         </rule>
       </rsc_location>
       <rsc_location id="loc:r1_likes_polar" rsc="ms-drbd1">
         <rule id="rule:r1_likes_polar" role="master" score="1">
           <expression attribute="#uname" operation="eq"
                       value="polar.domain.org"
                       id="5857d79c-89df-4e8b-836c-cd43c0e7ec6d"/>
         </rule>
       </rsc_location>
       <rsc_order id="r0_before_group_1" from="group_1" action="start"
                  to="ms-drbd0" to_action="promote"/>
       <rsc_colocation id="group_1_on_r0" to="ms-drbd0"
                       to_role="master" from="group_1" score="infinity"/>
       <rsc_order id="r1_before_group_11" from="group_11" action="start"
                  to="ms-drbd1" to_action="promote"/>
       <rsc_colocation id="group_11_on_r1" to="ms-drbd1"
                       to_role="master" from="group_11" score="infinity"/>
       <rsc_location id="ms-drbd0:connected" rsc="ms-drbd0">
         <rule role="master" id="ms-drbd0:connected:rule"
               score="-INFINITY" boolean_op="or">
           <expression id="ms-drbd0:connected:expr:undefined"
                       attribute="pingd" operation="not_defined"/>
           <expression id="ms-drbd0:connected:expr:zero"
                       attribute="pingd" operation="lte" value="0"/>
         </rule>
       </rsc_location>
       <rsc_location id="ms-drbd1:connected" rsc="ms-drbd1">
         <rule role="master" id="ms-drbd1:connected:rule"
               score="-INFINITY" boolean_op="or">
           <expression id="ms-drbd1:connected:expr:undefined"
                       attribute="pingd" operation="not_defined"/>
           <expression id="ms-drbd1:connected:expr:zero"
                       attribute="pingd" operation="lte" value="0"/>
         </rule>
       </rsc_location>
     </constraints>
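
In case it helps anyone compare the rules against the scores, this is a rough
sketch that lists each location rule with its role, score and expressions. It
only uses Python's standard xml.etree.ElementTree; the function name
summarize_location_rules is my own, and the sample is cut down to two of the
constraints shown in this message.

```python
import xml.etree.ElementTree as ET

# Two of the location constraints from this message, trimmed for brevity.
SAMPLE_XML = """
<constraints>
  <rsc_location id="loc:r0_likes_kodiak" rsc="ms-drbd0">
    <rule id="rule:r0_likes_kodiak" role="master" score="1">
      <expression attribute="#uname" operation="eq"
                  value="kodiak.domain.org"
                  id="3ac0cc2b-e503-4b36-a960-a07ed8447b71"/>
    </rule>
  </rsc_location>
  <rsc_location id="ms-drbd0:connected" rsc="ms-drbd0">
    <rule role="master" id="ms-drbd0:connected:rule"
          score="-INFINITY" boolean_op="or">
      <expression id="ms-drbd0:connected:expr:undefined"
                  attribute="pingd" operation="not_defined"/>
      <expression id="ms-drbd0:connected:expr:zero"
                  attribute="pingd" operation="lte" value="0"/>
    </rule>
  </rsc_location>
</constraints>
"""


def summarize_location_rules(constraints_xml):
    """Return (rsc, role, score, expressions) for every rsc_location rule."""
    root = ET.fromstring(constraints_xml)
    rows = []
    for location in root.iter("rsc_location"):
        for rule in location.iter("rule"):
            # Each expression becomes (attribute, operation, value);
            # value is None for operations such as not_defined.
            expressions = [
                (e.get("attribute"), e.get("operation"), e.get("value"))
                for e in rule.iter("expression")
            ]
            rows.append(
                (location.get("rsc"), rule.get("role"), rule.get("score"), expressions)
            )
    return rows


if __name__ == "__main__":
    for rsc, role, score, expressions in summarize_location_rules(SAMPLE_XML):
        print(rsc, role, score, expressions)
```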

Any ideas on where I can start looking would be appreciated.

Thanks
Guy



-- 
Don't just do something...sit there!