On Monday, 6 August 2007 16:36, Andrew Beekhof wrote:
...
> > critical, we tried to add another monitor operation with role="Slave" but
> > then none of the nodes was promoted as master initially...
>
> with the same interval?
> try adding:
>  <op id="drbd0_mon_11" name="monitor" interval="11s" timeout="5s"/>
>

tried that and it works! Thank you!
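
The question above about using "the same interval" can be checked mechanically. Below is a minimal sketch, not part of any heartbeat tooling: the function name duplicate_intervals is made up for illustration, and the embedded operations block is copied from the attached cib.xml. It assumes that monitor operations of one resource need distinct intervals, and it compares interval strings literally, so "10s" and "10000" would not be recognised as equal.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Operations block copied from the attached cib.xml.
OPERATIONS = """
<operations>
  <op id="drbd0_mon_0" name="monitor" interval="12s" timeout="5s"/>
  <op id="drbd0_mon_1" name="monitor" interval="10s" timeout="5s" role="Master"/>
</operations>
"""


def duplicate_intervals(operations_xml):
    """Return the (name, interval) pairs used by more than one <op>.

    Hypothetical helper: intervals are compared as plain strings,
    so differently written but equal durations are not detected.
    """
    root = ET.fromstring(operations_xml)
    counts = Counter(
        (op.get("name", ""), op.get("interval", "")) for op in root.iter("op")
    )
    return sorted(key for key, seen in counts.items() if seen > 1)


if __name__ == "__main__":
    clashes = duplicate_intervals(OPERATIONS)
    if clashes:
        for name, interval in clashes:
            print("duplicate:", name, interval)
    else:
        print("no duplicate intervals")
```

An extra monitor with role="Slave" would pass this check only if its interval differs from both 10s and 12s, which is what the suggested 11s achieves.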

>
> no.  stickiness only controls where resources are run, not what state
> they're in. the drbd agent should be setting the correct master preference
> using crm_master...
>

Is there any documentation on crm_master and how it is used and
configured? We feel we do not quite understand how it works,
and we still do not know how to achieve our goal:

- drbd on both nodes, one (initially preferably odin) in Master state
- in case of failures on the Master node, promote the other node to become 
  master
- in case of failures on Slave node, let heartbeat know that something is 
  wrong (that works with the two monitors now), but do nothing else
- if a failed node comes back (and the other node is running ok and has state
  Master), the returning node should not become Master

So, we want the location preference to be applied only at heartbeat startup.

We did some tests:
- If we drop the rule for the Master preference on odin, the non-autofailback
  behaviour works fine. This preference isn't that important, but we want to
  add more resources and dependencies later and feel that if we can't do this
  relatively simple thing, we'll have many more problems later.

- We then tried small values for the master preference rule (50, then 10) and
  had auto-failback again.

We monitored the score values of the resources using 
/usr/local/sbin/ptest -L -VVVVVV 2>&1 | grep assign_node

and made these observations:
- the values which occur here (76, 11, 6, ...) do not seem to come from our
  cib.xml!
- the default-resource-stickiness value of INFINITY that we have in our
  cib.xml never makes it into the score values for the drbd resources!

The latter strengthens this conjecture (?):
> > The observed behaviour suggests that either the
> > default_resource_stickiness does not apply to a multistate resource or
> > that it does only distinguish between Stopped and Started, not between
> > Master/Slave.
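
The attached cib.xml does carry per-node master preferences, though in the status section (as transient attributes named master-<instance>) rather than in the configuration section. The sketch below lists them. The function name master_preferences is made up for illustration, the embedded XML is abridged from the attachment with shortened ids, and the script makes no claim about how ptest combines these values. That they were written by the drbd agent through crm_master, and that they relate to the 11 and 6 seen in the ptest output, are unconfirmed assumptions.

```python
import xml.etree.ElementTree as ET

# Abridged from the status section of the attached cib.xml.
# The ids are shortened here for readability (an editorial simplification).
STATUS = """
<status>
  <node_state id="a7d30f67" uname="odin">
    <transient_attributes id="a7d30f67">
      <instance_attributes id="master-a7d30f67">
        <attributes>
          <nvpair id="odin-m1" name="master-drbd0:1" value="10"/>
          <nvpair id="odin-m0" name="master-drbd0:0" value="10"/>
        </attributes>
      </instance_attributes>
    </transient_attributes>
  </node_state>
  <node_state id="356346cd" uname="frigg">
    <transient_attributes id="356346cd">
      <instance_attributes id="master-356346cd">
        <attributes>
          <nvpair id="frigg-m1" name="master-drbd0:1" value="5"/>
        </attributes>
      </instance_attributes>
    </transient_attributes>
  </node_state>
</status>
"""

PREFIX = "master-"


def master_preferences(status_xml):
    """Map (node uname, resource instance) to the stored preference value.

    Values are kept as strings because a score may also be a
    non-numeric literal such as INFINITY.
    """
    root = ET.fromstring(status_xml)
    prefs = {}
    for node in root.iter("node_state"):
        for nvpair in node.iter("nvpair"):
            name = nvpair.get("name", "")
            if name.startswith(PREFIX):
                instance = name[len(PREFIX):]
                prefs[(node.get("uname"), instance)] = nvpair.get("value")
    return prefs


if __name__ == "__main__":
    for (uname, instance), value in sorted(master_preferences(STATUS).items()):
        print(uname, instance, value)
```

Running the same function over the complete status section of a live CIB would show whether these values change when a node fails or rejoins.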

Thanks for your help!
Klemens

-- 
Klemens Kittan
Systemadministrator

Uni-Potsdam, Inst. f. Informatik
August-Bebel-Str. 89
14482 Potsdam

Tel.    :   +49-331-977/3125
Fax.    :   +49-331-977/3122
eMail   : [EMAIL PROTECTED]

gpg --recv-keys --keyserver wwwkeys.de.pgp.net 6EA09333
 <cib admin_epoch="0" epoch="4" num_updates="72" generated="true"
      have_quorum="true" ignore_dtd="false" num_peers="2"
      cib-last-written="Wed Aug  8 12:03:43 2007" cib_feature_revision="1.3"
      ccm_transition="4" dc_uuid="a7d30f67-3efd-4d20-8774-fd2b82566b2e">
   <configuration>
     <crm_config>
       <cluster_property_set id="cib-bootstrap-options">
         <attributes>
           <nvpair id="cib-bootstrap-options-symmetric-cluster" 
name="symmetric-cluster" value="true"/>
           <nvpair id="cib-bootstrap-options-no_quorum-policy" 
name="no_quorum-policy" value="stop"/>
           <nvpair id="cib-bootstrap-options-default-resource-stickiness" 
name="default-resource-stickiness" value="INFINITY"/>
           <nvpair 
id="cib-bootstrap-options-default-resource-failure-stickiness" 
name="default-resource-failure-stickiness" value="-400000"/>
           <nvpair id="cib-bootstrap-options-stonith-enabled" 
name="stonith-enabled" value="false"/>
           <nvpair id="cib-bootstrap-options-stonith-action" 
name="stonith-action" value="reboot"/>
           <nvpair id="cib-bootstrap-options-stop-orphan-resources" 
name="stop-orphan-resources" value="true"/>
           <nvpair id="cib-bootstrap-options-stop-orphan-actions" 
name="stop-orphan-actions" value="true"/>
           <nvpair id="cib-bootstrap-options-remove-after-stop" 
name="remove-after-stop" value="false"/>
           <nvpair id="cib-bootstrap-options-short-resource-names" 
name="short-resource-names" value="true"/>
           <nvpair id="cib-bootstrap-options-transition-idle-timeout" 
name="transition-idle-timeout" value="5min"/>
           <nvpair id="cib-bootstrap-options-default-action-timeout" 
name="default-action-timeout" value="5s"/>
           <nvpair id="cib-bootstrap-options-is-managed-default" 
name="is-managed-default" value="true"/>
           <nvpair name="last-lrm-refresh" 
id="cib-bootstrap-options-last-lrm-refresh" value="1186567901"/>
         </attributes>
       </cluster_property_set>
     </crm_config>
     <nodes>
       <node id="356346cd-d590-454c-bf29-4b8427fb0a2c" uname="frigg" 
type="normal"/>
       <node id="a7d30f67-3efd-4d20-8774-fd2b82566b2e" uname="odin" 
type="normal"/>
     </nodes>
     <resources>
       <master_slave id="ms_drbd">
         <meta_attributes id="ma_ms_drbd">
           <attributes>
             <nvpair id="ma_ms_drbd_0" name="clone_max" value="2"/>
             <nvpair id="ma_ms_drbd_1" name="clone_node_max" value="1"/>
             <nvpair id="ma_ms_drbd_2" name="master_max" value="1"/>
             <nvpair id="ma_ms_drbd_3" name="master_node_max" value="1"/>
             <nvpair id="ma_ms_drbd_4" name="notify" value="yes"/>
             <nvpair id="ma_ms_drbd_5" name="globally_unique" value="false"/>
             <nvpair id="ma_ms_drbd_6" name="target_role" value="Started"/>
           </attributes>
         </meta_attributes>
         <primitive id="drbd0" class="ocf" provider="heartbeat" type="drbd">
           <instance_attributes id="drbd0_ia">
             <attributes>
               <nvpair id="drbd0_ia_0" name="drbd_resource" value="r0"/>
             </attributes>
           </instance_attributes>
           <operations>
             <op id="drbd0_mon_0" name="monitor" interval="12s" timeout="5s"/>
             <op id="drbd0_mon_1" name="monitor" interval="10s" timeout="5s" 
role="Master"/>
           </operations>
         </primitive>
       </master_slave>
     </resources>
     <constraints>
       <rsc_location id="location" rsc="ms_drbd">
         <rule id="location_rule" score="-900000">
           <expression id="location_rule_0" attribute="#uname" operation="ne" 
value="odin"/>
           <expression id="location_rule_1" attribute="#uname" operation="ne" 
value="frigg"/>
         </rule>
       </rsc_location>
       <rsc_location id="connected" rsc="ms_drbd">
         <rule id="connected_rule" score="-INFINITY" boolean_op="or">
           <expression id="connected_rule_undefined" attribute="pingd" 
operation="not_defined"/>
           <expression id="connected_rule_zero" attribute="pingd" 
operation="lte" value="0"/>
         </rule>
       </rsc_location>
     </constraints>
   </configuration>
   <status>
     <node_state id="a7d30f67-3efd-4d20-8774-fd2b82566b2e" uname="odin" 
crmd="online" crm-debug-origin="do_update_resource" shutdown="0" in_ccm="true" 
ha="active" join="member" expected="member">
       <transient_attributes id="a7d30f67-3efd-4d20-8774-fd2b82566b2e">
         <instance_attributes id="status-a7d30f67-3efd-4d20-8774-fd2b82566b2e">
           <attributes>
             <nvpair id="status-a7d30f67-3efd-4d20-8774-fd2b82566b2e-pingd" 
name="pingd" value="400"/>
             <nvpair 
id="status-a7d30f67-3efd-4d20-8774-fd2b82566b2e-probe_complete" 
name="probe_complete" value="true"/>
             <nvpair 
id="status-a7d30f67-3efd-4d20-8774-fd2b82566b2e-fail-count-drbd0:1" 
name="fail-count-drbd0:1" value="1"/>
           </attributes>
         </instance_attributes>
         <instance_attributes id="master-a7d30f67-3efd-4d20-8774-fd2b82566b2e">
           <attributes>
             <nvpair 
id="status-master-drbd0:1-a7d30f67-3efd-4d20-8774-fd2b82566b2e" 
name="master-drbd0:1" value="10"/>
             <nvpair 
id="status-master-drbd0:0-a7d30f67-3efd-4d20-8774-fd2b82566b2e" 
name="master-drbd0:0" value="10"/>
           </attributes>
         </instance_attributes>
       </transient_attributes>
       <lrm id="a7d30f67-3efd-4d20-8774-fd2b82566b2e">
         <lrm_resources>
           <lrm_resource id="drbd0:0" type="drbd" class="ocf" 
provider="heartbeat">
             <lrm_rsc_op id="drbd0:0_monitor_0" operation="monitor" 
crm-debug-origin="build_active_RAs" 
transition_key="5:0:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="4:7;5:0:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="2" 
crm_feature_set="1.0.9" rc_code="7" op_status="4" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:0_start_0" operation="start" 
crm-debug-origin="do_update_resource" 
transition_key="4:25:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;4:25:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="42" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:0_post_notify_start_0" operation="notify" 
crm-debug-origin="do_update_resource" 
transition_key="40:27:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;40:27:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="49" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:0_pre_notify_promote_0" operation="notify" 
crm-debug-origin="do_update_resource" 
transition_key="42:26:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;42:26:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="44" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:0_promote_0" operation="promote" 
crm-debug-origin="do_update_resource" 
transition_key="7:26:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;7:26:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="45" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:0_post_notify_promote_0" operation="notify" 
crm-debug-origin="do_update_resource" 
transition_key="43:26:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;43:26:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="46" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:0_monitor_10000" operation="monitor" 
crm-debug-origin="do_update_resource" 
transition_key="8:26:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:8;8:26:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="47" 
crm_feature_set="1.0.9" rc_code="8" op_status="0" interval="10000" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:0_pre_notify_start_0" operation="notify" 
crm-debug-origin="do_update_resource" 
transition_key="39:27:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;39:27:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="48" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
           </lrm_resource>
           <lrm_resource id="drbd0:1" type="drbd" class="ocf" 
provider="heartbeat">
             <lrm_rsc_op id="drbd0:1_monitor_0" operation="monitor" 
crm-debug-origin="build_active_RAs" 
transition_key="5:1:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="4:7;5:1:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="3" 
crm_feature_set="1.0.9" rc_code="7" op_status="4" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:1_demote_0" operation="demote" 
crm-debug-origin="do_update_resource" 
transition_key="6:23:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;6:23:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="37" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:1_stop_0" operation="stop" 
crm-debug-origin="do_update_resource" 
transition_key="2:24:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;2:24:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="41" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:1_start_0" operation="start" 
crm-debug-origin="build_active_RAs" 
transition_key="9:17:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;9:17:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="25" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:1_promote_0" operation="promote" 
crm-debug-origin="build_active_RAs" 
transition_key="8:19:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;8:19:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="31" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:1_post_notify_stop_0" operation="notify" 
crm-debug-origin="build_active_RAs" 
transition_key="42:20:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;42:20:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="34" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:1_pre_notify_demote_0" operation="notify" 
crm-debug-origin="do_update_resource" 
transition_key="40:23:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;40:23:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="36" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:1_post_notify_demote_0" operation="notify" 
crm-debug-origin="do_update_resource" 
transition_key="41:23:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;41:23:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="38" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:1_pre_notify_stop_0" operation="notify" 
crm-debug-origin="do_update_resource" 
transition_key="39:24:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;39:24:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="40" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
           </lrm_resource>
         </lrm_resources>
       </lrm>
     </node_state>
     <node_state id="356346cd-d590-454c-bf29-4b8427fb0a2c" uname="frigg" 
crmd="online" crm-debug-origin="do_update_resource" shutdown="0" in_ccm="true" 
ha="active" join="member" expected="member">
       <lrm id="356346cd-d590-454c-bf29-4b8427fb0a2c">
         <lrm_resources>
           <lrm_resource id="drbd0:0" type="drbd" class="ocf" 
provider="heartbeat">
             <lrm_rsc_op id="drbd0:0_monitor_0" operation="monitor" 
crm-debug-origin="do_update_resource" 
transition_key="4:27:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:7;4:27:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="5" 
crm_feature_set="1.0.9" rc_code="7" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
           </lrm_resource>
           <lrm_resource id="drbd0:1" type="drbd" class="ocf" 
provider="heartbeat">
             <lrm_rsc_op id="drbd0:1_monitor_0" operation="monitor" 
crm-debug-origin="do_update_resource" 
transition_key="5:24:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:7;5:24:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="3" 
crm_feature_set="1.0.9" rc_code="7" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:1_start_0" operation="start" 
crm-debug-origin="do_update_resource" 
transition_key="10:27:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;10:27:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="6" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:1_post_notify_start_0" operation="notify" 
crm-debug-origin="do_update_resource" 
transition_key="47:27:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;47:27:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="7" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="0" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
             <lrm_rsc_op id="drbd0:1_monitor_12000" operation="monitor" 
crm-debug-origin="do_update_resource" 
transition_key="11:28:d027a60e-1ab1-4561-b3be-a183054a8fa2" 
transition_magic="0:0;11:28:d027a60e-1ab1-4561-b3be-a183054a8fa2" call_id="8" 
crm_feature_set="1.0.9" rc_code="0" op_status="0" interval="12000" 
op_digest="c0e018b73fdf522b6cdd355e125af15e"/>
           </lrm_resource>
         </lrm_resources>
       </lrm>
       <transient_attributes id="356346cd-d590-454c-bf29-4b8427fb0a2c">
         <instance_attributes id="status-356346cd-d590-454c-bf29-4b8427fb0a2c">
           <attributes>
             <nvpair 
id="status-356346cd-d590-454c-bf29-4b8427fb0a2c-probe_complete" 
name="probe_complete" value="true"/>
             <nvpair id="status-356346cd-d590-454c-bf29-4b8427fb0a2c-pingd" 
name="pingd" value="400"/>
           </attributes>
         </instance_attributes>
         <instance_attributes id="master-356346cd-d590-454c-bf29-4b8427fb0a2c">
           <attributes>
             <nvpair 
id="status-master-drbd0:1-356346cd-d590-454c-bf29-4b8427fb0a2c" 
name="master-drbd0:1" value="5"/>
           </attributes>
         </instance_attributes>
       </transient_attributes>
     </node_state>
   </status>
 </cib>

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
