Dominik, as usual, you are right on the money. I should have caught that myself; thank you for catching it. What happened was that I compiled DRBD on a different server and had assumed that Nomen and Rubric (my test nodes) were running the same kernel.
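For the record, here is the quick sanity check I now run on each node before starting Heartbeat, to confirm the installed drbd module matches the running kernel (a rough sketch only; whether `modinfo` finds the module depends on how drbd-km was installed):

```shell
# Compare the running kernel with the kernel the drbd module was built for.
running=$(uname -r)
# vermagic of the installed module; empty if modinfo or the module is missing
built_for=$(modinfo -F vermagic drbd 2>/dev/null | awk '{print $1}')
echo "running kernel:   $running"
echo "module built for: ${built_for:-<module not found>}"
if [ -n "$built_for" ] && [ "$running" != "$built_for" ]; then
  echo "MISMATCH: rebuild drbd-km against $running"
fi
```

If the two values differ (or the module is not found at all), modprobe will fail exactly the way it did for me.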
I have also combined Neil's suggestion with yours, as he mentioned that pacemaker-1.0.1 and drbd-8.2 work together. My current issues are as follows:

1) I cannot migrate the resource fs0 from Nomen to Rubric. Running the command "crm resource migrate fs0" simply stops fs0; it never starts on Rubric. This sounds like a configuration problem. NOTE: I am planning to add fs0 to a group that can migrate between the two nodes (Nomen and Rubric). Please provide the crm(live) syntax, as I have tried the lines below and crm complains that the syntax is wrong:

order ms-drbd0-before-fs0 mandatory: ms-drbd0:promote fs0:start
colocation fs0-on-ms-drbd0 inf: fs0 ms-drbd0:Master

2) Is there documentation for which resources, constraints, and the like I can add to the cib.xml via crm(live), and for the syntax to add them?

Thank you in advance. FYI, below are my current configuration and the logs from the migration test.

#######################
#Current Configuration#
#######################

Installed Applications:
=======================
drbd-8.2.7-3
drbd-km-2.6.18_128.1.1.el5-8.2.7-3
heartbeat-2.99.2-6.1
pacemaker-1.0.1-3.1
kernel-2.6.18-128.1.1.el5

drbd.conf:
==========
global {
  usage-count no;
}

resource r0 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
    pri-lost "echo pri-lost. Have a look at the log files. | mail -s 'DRBD Alert' root";
    out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
  }
  startup {
    wfc-timeout 0;
  }
  disk {
    on-io-error pass_on;
  }
  net {
    max-buffers 2048;
    cram-hmac-alg "sha1";
    shared-secret "FooFunFactory";
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    rate 100M;
    al-extents 257;
  }
  on nomen.esri.com {
    device /dev/drbd0;
    disk /dev/sda5;
    address 192.168.0.1:7789;
    meta-disk internal;
  }
  on rubric.esri.com {
    device /dev/drbd0;
    disk /dev/sda5;
    address 192.168.0.2:7789;
    meta-disk internal;
  }
}

ha.cf:
======
# Logging
debug 3
use_logd false
logfacility daemon
# Misc Options
traditional_compression off
compression bz2
coredumps true
# Communications
udpport 691
bcast eth1
autojoin any
# Thresholds (in seconds)
keepalive 1
warntime 6
deadtime 10
initdead 15
ping 10.50.254.254
crm respawn
apiauth mgmtd uid=root
respawn root /usr/lib/heartbeat/mgmtd -v

cib.xml:
========
<cib admin_epoch="0" validate-with="pacemaker-1.0" crm_feature_set="3.0" have-quorum="1" epoch="153" num_updates="0" cib-last-written="Fri Mar 6 12:52:27 2009" dc-uuid="3a8b681c-a14b-4037-a8e6-2d4af2eff88e">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.1-node: 6fc5ce8302abf145a02891ec41e5a492efbe8efe"/>
        <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1236213117"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="3a8b681c-a14b-4037-a8e6-2d4af2eff88e" uname="nomen.esri.com" type="normal"/>
      <node id="a5e95310-f27d-418e-9cb9-42e50310f702" uname="rubric.esri.com" type="normal"/>
    </nodes>
    <resources>
      <master id="ms-drbd0">
        <meta_attributes id="ms-drbd0-meta_attributes">
          <nvpair id="ms-drbd0-meta_attributes-clone-max" name="clone-max" value="2"/>
          <nvpair id="ms-drbd0-meta_attributes-notify" name="notify" value="true"/>
          <nvpair id="ms-drbd0-meta_attributes-globally-unique" 
name="globally-unique" value="false"/>
          <nvpair id="ms-drbd0-meta_attributes-target-role" name="target-role" value="Started"/>
        </meta_attributes>
        <primitive class="ocf" id="drbd0" provider="heartbeat" type="drbd">
          <instance_attributes id="drbd0-instance_attributes">
            <nvpair id="drbd0-instance_attributes-drbd_resource" name="drbd_resource" value="r0"/>
          </instance_attributes>
          <operations id="drbd0-ops">
            <op id="drbd0-monitor-59s" interval="59s" name="monitor" role="Master" timeout="30s"/>
            <op id="drbd0-monitor-60s" interval="60s" name="monitor" role="Slave" timeout="30s"/>
          </operations>
        </primitive>
      </master>
      <primitive class="ocf" id="VIP" provider="heartbeat" type="IPaddr">
        <instance_attributes id="VIP-instance_attributes">
          <nvpair id="VIP-instance_attributes-ip" name="ip" value="10.50.26.250"/>
        </instance_attributes>
        <operations id="VIP-ops">
          <op id="VIP-monitor-5s" interval="5s" name="monitor" timeout="5s"/>
        </operations>
      </primitive>
      <primitive class="ocf" id="fs0" provider="heartbeat" type="Filesystem">
        <instance_attributes id="fs0-instance_attributes">
          <nvpair id="fs0-instance_attributes-fstype" name="fstype" value="ext3"/>
          <nvpair id="fs0-instance_attributes-directory" name="directory" value="/data"/>
          <nvpair id="fs0-instance_attributes-device" name="device" value="/dev/drbd0"/>
        </instance_attributes>
      </primitive>
    </resources>
    <constraints/>
  </configuration>
</cib>

messages:
==================
Mar 6 12:56:07 nomen lrmd: [14509]: info: Resource Agent output: [] Mar 6 12:56:08 nomen crm_shadow: [1551]: info: Invoked: crm_shadow Mar 6 12:56:08 nomen crm_shadow: [1565]: info: Invoked: crm_shadow Mar 6 12:56:08 nomen crm_resource: [1566]: info: Invoked: crm_resource -M -r fs0 Mar 6 12:56:09 nomen cib: [14508]: info: cib_process_request: Operation complete: op cib_delete for section constraints (origin=local/crm_resource/3): ok (rc=0) Mar 6 12:56:09 nomen haclient: on_event:evt:cib_changed Mar 6 12:56:09 nomen crmd: [14603]: info: abort_transition_graph: 
need_abort:60 - Triggered transition abort (complete=1) : Non-status change Mar 6 12:56:09 nomen crmd: [14603]: info: need_abort: Aborting on change to epoch Mar 6 12:56:09 nomen crmd: [14603]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ] Mar 6 12:56:09 nomen crmd: [14603]: info: do_state_transition: All 2 cluster nodes are eligible to run resources. Mar 6 12:56:09 nomen crmd: [14603]: info: do_pe_invoke: Query 112: Requesting the current CIB: S_POLICY_ENGINE Mar 6 12:56:09 nomen cib: [14508]: info: log_data_element: cib:diff: - <cib epoch="153" num_updates="2" /> Mar 6 12:56:09 nomen cib: [14508]: info: log_data_element: cib:diff: + <cib epoch="154" num_updates="1" > Mar 6 12:56:09 nomen cib: [14508]: info: log_data_element: cib:diff: + <configuration > Mar 6 12:56:09 nomen cib: [14508]: info: log_data_element: cib:diff: + <constraints > Mar 6 12:56:09 nomen cib: [14508]: info: log_data_element: cib:diff: + <rsc_location id="cli-standby-fs0" rsc="fs0" __crm_diff_marker__="added:top" > Mar 6 12:56:09 nomen cib: [14508]: info: log_data_element: cib:diff: + <rule id="cli-standby-rule-fs0" score="-INFINITY" boolean-op="and" > Mar 6 12:56:09 nomen cib: [14508]: info: log_data_element: cib:diff: + <expression id="cli-standby-expr-fs0" attribute="#uname" operation="eq" value="nomen.esri.com" type="string" /> Mar 6 12:56:09 nomen cib: [14508]: info: log_data_element: cib:diff: + </rule> Mar 6 12:56:09 nomen cib: [14508]: info: log_data_element: cib:diff: + </rsc_location> Mar 6 12:56:09 nomen cib: [14508]: info: log_data_element: cib:diff: + </constraints> Mar 6 12:56:09 nomen cib: [14508]: info: log_data_element: cib:diff: + </configuration> Mar 6 12:56:09 nomen cib: [14508]: info: log_data_element: cib:diff: + </cib> Mar 6 12:56:09 nomen cib: [14508]: info: cib_process_request: Operation complete: op cib_modify for section constraints (origin=local/crm_resource/4): ok (rc=0) 
Mar 6 12:56:09 nomen crmd: [14603]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1236372969-107, seq=2, quorate=1 Mar 6 12:56:09 nomen pengine: [14645]: WARN: unpack_resources: No STONITH resources have been defined Mar 6 12:56:09 nomen pengine: [14645]: info: determine_online_status: Node rubric.esri.com is online Mar 6 12:56:09 nomen pengine: [14645]: info: unpack_rsc_op: fs0_start_0 on rubric.esri.com returned 1 (unknown error) instead of the expected value: 0 (ok) Mar 6 12:56:09 nomen pengine: [14645]: WARN: unpack_rsc_op: Processing failed op fs0_start_0 on rubric.esri.com: Error Mar 6 12:56:09 nomen pengine: [14645]: WARN: unpack_rsc_op: Compatibility handling for failed op fs0_start_0 on rubric.esri.com Mar 6 12:56:09 nomen pengine: [14645]: info: determine_online_status: Node nomen.esri.com is online Mar 6 12:56:09 nomen pengine: [14645]: notice: clone_print: Master/Slave Set: ms-drbd0 Mar 6 12:56:09 nomen pengine: [14645]: notice: native_print: drbd0:0 (ocf::heartbeat:drbd): Master nomen.esri.com Mar 6 12:56:09 nomen pengine: [14645]: notice: native_print: drbd0:1 (ocf::heartbeat:drbd): Started rubric.esri.com Mar 6 12:56:09 nomen pengine: [14645]: notice: native_print: VIP (ocf::heartbeat:IPaddr): Started nomen.esri.com Mar 6 12:56:09 nomen pengine: [14645]: notice: native_print: fs0 (ocf::heartbeat:Filesystem):Started nomen.esri.com Mar 6 12:56:09 nomen pengine: [14645]: info: get_failcount: fs0 has failed 1000000 times on rubric.esri.com Mar 6 12:56:09 nomen pengine: [14645]: info: master_color: Promoting drbd0:0 (Master nomen.esri.com) Mar 6 12:56:09 nomen pengine: [14645]: info: master_color: ms-drbd0: Promoted 1 instances of a possible 1 to master Mar 6 12:56:09 nomen pengine: [14645]: WARN: native_color: Resource fs0 cannot run anywhere Mar 6 12:56:09 nomen pengine: [14645]: notice: NoRoleChange: Leave resource drbd0:0 (Master nomen.esri.com) Mar 6 12:56:09 nomen pengine: [14645]: notice: NoRoleChange: Leave resource drbd0:1 (Slave 
rubric.esri.com) Mar 6 12:56:09 nomen pengine: [14645]: notice: NoRoleChange: Leave resource drbd0:0 (Master nomen.esri.com) Mar 6 12:56:09 nomen pengine: [14645]: notice: NoRoleChange: Leave resource drbd0:1 (Slave rubric.esri.com) Mar 6 12:56:09 nomen pengine: [14645]: notice: NoRoleChange: Leave resource VIP (Started nomen.esri.com) Mar 6 12:56:09 nomen pengine: [14645]: notice: NoRoleChange: Stop resource fs0 (Started nomen.esri.com) Mar 6 12:56:09 nomen pengine: [14645]: notice: StopRsc: nomen.esri.com Stop fs0 Mar 6 12:56:09 nomen mgmtd: [14526]: info: CIB query: cib Mar 6 12:56:09 nomen crmd: [14603]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ] Mar 6 12:56:09 nomen pengine: [14645]: WARN: process_pe_message: Transition 34: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/heartbeat/pengine/pe-warn-37.bz2 Mar 6 12:56:09 nomen pengine: [14645]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues. Mar 6 12:56:09 nomen crmd: [14603]: info: unpack_graph: Unpacked transition 34: 2 actions in 2 synapses Mar 6 12:56:09 nomen crmd: [14603]: info: do_te_invoke: Processing graph 34 (ref=pe_calc-dc-1236372969-107) derived from /var/lib/heartbeat/pengine/pe-warn-37.bz2 Mar 6 12:56:09 nomen crmd: [14603]: info: send_rsc_command: Initiating action 41: stop fs0_stop_0 on nomen.esri.com Mar 6 12:56:09 nomen cib: [1567]: info: write_cib_contents: Wrote version 0.154.0 of the CIB to disk (digest: bbea2bdada182cedaa9f52f91c178cdb) Mar 6 12:56:09 nomen cib: [1567]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig) Mar 6 12:56:09 nomen cib: [14508]: info: Managed write_cib_contents process 1567 exited with return code 0. 
Mar 6 12:56:09 nomen crmd: [14603]: info: do_lrm_rsc_op: Performing key=41:34:0:44aada21-7997-4a4f-ba9a-4ae8a2629a58 op=fs0_stop_0 ) Mar 6 12:56:09 nomen lrmd: [14509]: info: rsc:fs0: stop Mar 6 12:56:09 nomen Filesystem[1569]: INFO: Running stop for /dev/drbd0 on /data Mar 6 12:56:09 nomen Filesystem[1569]: INFO: Trying to unmount /data Mar 6 12:56:09 nomen Filesystem[1569]: INFO: unmounted /data successfully Mar 6 12:56:09 nomen lrmd: [14509]: info: Managed fs0:stop process 1569 exited with return code 0. Mar 6 12:56:09 nomen lrmd: [14509]: info: Resource Agent output: [] Mar 6 12:56:09 nomen crmd: [14603]: info: process_lrm_event: LRM operation fs0_stop_0 (call=46, rc=0, cib-update=113, confirmed=true) complete ok Mar 6 12:56:09 nomen haclient: on_event:evt:cib_changed Mar 6 12:56:09 nomen crmd: [14603]: info: match_graph_event: Action fs0_stop_0 (41) confirmed on nomen.esri.com (rc=0) Mar 6 12:56:09 nomen crmd: [14603]: info: te_pseudo_action: Pseudo action 4 fired and confirmed Mar 6 12:56:09 nomen crmd: [14603]: info: run_graph: ==================================================== Mar 6 12:56:09 nomen crmd: [14603]: notice: run_graph: Transition 34 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/heartbeat/pengine/pe-warn-37.bz2): Complete Mar 6 12:56:09 nomen crmd: [14603]: info: te_graph_trigger: Transition 34 is now complete Mar 6 12:56:09 nomen cib: [14508]: info: cib_process_request: Operation complete: op cib_modify for section 'all' (origin=local/crmd/113): ok (rc=0) Mar 6 12:56:09 nomen crmd: [14603]: info: notify_crmd: Transition 34 status: done - <null> Mar 6 12:56:09 nomen crmd: [14603]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ] Mar 6 12:56:10 nomen mgmtd: [14526]: info: CIB query: cib Mar 6 12:56:12 nomen lrmd: [14509]: info: Resource Agent output: [] Mar 6 12:56:14 nomen crm_shadow: [1662]: info: Invoked: crm_shadow Mar 6 
12:56:14 nomen crm_shadow: [1676]: info: Invoked: crm_shadow Mar 6 12:56:14 nomen crm_resource: [1677]: info: Invoked: crm_resource -U -r fs0 Mar 6 12:56:14 nomen cib: [14508]: info: log_data_element: cib:diff: - <cib epoch="154" num_updates="2" > Mar 6 12:56:14 nomen cib: [14508]: info: log_data_element: cib:diff: - <configuration > Mar 6 12:56:14 nomen cib: [14508]: info: log_data_element: cib:diff: - <constraints > Mar 6 12:56:14 nomen cib: [14508]: info: log_data_element: cib:diff: - <rsc_location id="cli-standby-fs0" rsc="fs0" __crm_diff_marker__="removed:top" > Mar 6 12:56:14 nomen cib: [14508]: info: log_data_element: cib:diff: - <rule id="cli-standby-rule-fs0" score="-INFINITY" boolean-op="and" > Mar 6 12:56:14 nomen haclient: on_event:evt:cib_changed Mar 6 12:56:14 nomen cib: [14508]: info: log_data_element: cib:diff: - <expression id="cli-standby-expr-fs0" attribute="#uname" operation="eq" value="nomen.esri.com" type="string" /> Mar 6 12:56:14 nomen cib: [14508]: info: log_data_element: cib:diff: - </rule> Mar 6 12:56:14 nomen cib: [14508]: info: log_data_element: cib:diff: - </rsc_location> Mar 6 12:56:14 nomen cib: [14508]: info: log_data_element: cib:diff: - </constraints> Mar 6 12:56:14 nomen cib: [14508]: info: log_data_element: cib:diff: - </configuration> Mar 6 12:56:14 nomen cib: [14508]: info: log_data_element: cib:diff: - </cib> Mar 6 12:56:14 nomen cib: [14508]: info: log_data_element: cib:diff: + <cib epoch="155" num_updates="1" /> Mar 6 12:56:14 nomen cib: [14508]: info: cib_process_request: Operation complete: op cib_delete for section constraints (origin=local/crm_resource/3): ok (rc=0) Mar 6 12:56:14 nomen crmd: [14603]: info: abort_transition_graph: need_abort:60 - Triggered transition abort (complete=1) : Non-status change Mar 6 12:56:14 nomen crmd: [14603]: info: need_abort: Aborting on change to epoch Mar 6 12:56:14 nomen crmd: [14603]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC 
cause=C_FSA_INTERNAL origin=abort_transition_graph ] Mar 6 12:56:14 nomen crmd: [14603]: info: do_state_transition: All 2 cluster nodes are eligible to run resources. Mar 6 12:56:14 nomen crmd: [14603]: info: do_pe_invoke: Query 114: Requesting the current CIB: S_POLICY_ENGINE Mar 6 12:56:14 nomen cib: [14508]: info: cib_process_request: Operation complete: op cib_delete for section constraints (origin=local/crm_resource/4): ok (rc=0) Mar 6 12:56:14 nomen crmd: [14603]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1236372974-109, seq=2, quorate=1 Mar 6 12:56:14 nomen pengine: [14645]: WARN: unpack_resources: No STONITH resources have been defined Mar 6 12:56:14 nomen pengine: [14645]: info: determine_online_status: Node rubric.esri.com is online Mar 6 12:56:14 nomen pengine: [14645]: info: unpack_rsc_op: fs0_start_0 on rubric.esri.com returned 1 (unknown error) instead of the expected value: 0 (ok) Mar 6 12:56:14 nomen pengine: [14645]: WARN: unpack_rsc_op: Processing failed op fs0_start_0 on rubric.esri.com: Error Mar 6 12:56:14 nomen pengine: [14645]: WARN: unpack_rsc_op: Compatibility handling for failed op fs0_start_0 on rubric.esri.com Mar 6 12:56:14 nomen pengine: [14645]: info: determine_online_status: Node nomen.esri.com is online Mar 6 12:56:14 nomen pengine: [14645]: notice: clone_print: Master/Slave Set: ms-drbd0 Mar 6 12:56:14 nomen pengine: [14645]: notice: native_print: drbd0:0 (ocf::heartbeat:drbd): Master nomen.esri.com Mar 6 12:56:14 nomen pengine: [14645]: notice: native_print: drbd0:1 (ocf::heartbeat:drbd): Started rubric.esri.com Mar 6 12:56:14 nomen pengine: [14645]: notice: native_print: VIP (ocf::heartbeat:IPaddr): Started nomen.esri.com Mar 6 12:56:14 nomen pengine: [14645]: notice: native_print: fs0 (ocf::heartbeat:Filesystem):Stopped Mar 6 12:56:14 nomen pengine: [14645]: info: get_failcount: fs0 has failed 1000000 times on rubric.esri.com Mar 6 12:56:14 nomen pengine: [14645]: info: master_color: Promoting drbd0:0 (Master 
nomen.esri.com) Mar 6 12:56:14 nomen pengine: [14645]: info: master_color: ms-drbd0: Promoted 1 instances of a possible 1 to master Mar 6 12:56:14 nomen pengine: [14645]: notice: NoRoleChange: Leave resource drbd0:0 (Master nomen.esri.com) Mar 6 12:56:14 nomen pengine: [14645]: notice: NoRoleChange: Leave resource drbd0:1 (Slave rubric.esri.com) Mar 6 12:56:14 nomen pengine: [14645]: notice: NoRoleChange: Leave resource drbd0:0 (Master nomen.esri.com) Mar 6 12:56:14 nomen pengine: [14645]: notice: NoRoleChange: Leave resource drbd0:1 (Slave rubric.esri.com) Mar 6 12:56:14 nomen pengine: [14645]: notice: NoRoleChange: Leave resource VIP (Started nomen.esri.com) Mar 6 12:56:14 nomen pengine: [14645]: notice: StartRsc: nomen.esri.com Start fs0 Mar 6 12:56:14 nomen mgmtd: [14526]: info: CIB query: cib Mar 6 12:56:14 nomen crmd: [14603]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ] Mar 6 12:56:14 nomen pengine: [14645]: info: process_pe_message: Transition 35: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-75.bz2 Mar 6 12:56:14 nomen pengine: [14645]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues. 
Mar 6 12:56:14 nomen crmd: [14603]: info: unpack_graph: Unpacked transition 35: 1 actions in 1 synapses Mar 6 12:56:14 nomen crmd: [14603]: info: do_te_invoke: Processing graph 35 (ref=pe_calc-dc-1236372974-109) derived from /var/lib/heartbeat/pengine/pe-input-75.bz2 Mar 6 12:56:14 nomen crmd: [14603]: info: send_rsc_command: Initiating action 41: start fs0_start_0 on nomen.esri.com Mar 6 12:56:14 nomen crmd: [14603]: info: do_lrm_rsc_op: Performing key=41:35:0:44aada21-7997-4a4f-ba9a-4ae8a2629a58 op=fs0_start_0 ) Mar 6 12:56:14 nomen lrmd: [14509]: info: rsc:fs0: start Mar 6 12:56:14 nomen Filesystem[1681]: INFO: Running start for /dev/drbd0 on /data Mar 6 12:56:14 nomen cib: [1678]: info: write_cib_contents: Wrote version 0.155.0 of the CIB to disk (digest: 0fd876c0a5f2db21a9aa66b3f997194f) Mar 6 12:56:14 nomen cib: [1678]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig) Mar 6 12:56:14 nomen cib: [14508]: info: Managed write_cib_contents process 1678 exited with return code 0. Mar 6 12:56:14 nomen kernel: kjournald starting. Commit interval 5 seconds Mar 6 12:56:14 nomen kernel: EXT3 FS on drbd0, internal journal Mar 6 12:56:14 nomen kernel: EXT3-fs: mounted filesystem with ordered data mode. Mar 6 12:56:14 nomen lrmd: [14509]: info: Managed fs0:start process 1681 exited with return code 0. 
Mar 6 12:56:14 nomen lrmd: [14509]: info: Resource Agent output: [] Mar 6 12:56:14 nomen crmd: [14603]: info: process_lrm_event: LRM operation fs0_start_0 (call=47, rc=0, cib-update=115, confirmed=true) complete ok Mar 6 12:56:15 nomen cib: [14508]: info: cib_process_request: Operation complete: op cib_modify for section 'all' (origin=local/crmd/115): ok (rc=0) Mar 6 12:56:15 nomen crmd: [14603]: info: match_graph_event: Action fs0_start_0 (41) confirmed on nomen.esri.com (rc=0) Mar 6 12:56:15 nomen crmd: [14603]: info: run_graph: ==================================================== Mar 6 12:56:15 nomen crmd: [14603]: notice: run_graph: Transition 35 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/heartbeat/pengine/pe-input-75.bz2): Complete Mar 6 12:56:15 nomen crmd: [14603]: info: te_graph_trigger: Transition 35 is now complete Mar 6 12:56:15 nomen crmd: [14603]: info: notify_crmd: Transition 35 status: done - <null> Mar 6 12:56:15 nomen crmd: [14603]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ] Mar 6 12:56:15 nomen haclient: on_event: from message queue: evt:cib_changed Mar 6 12:56:15 nomen mgmtd: [14526]: info: CIB query: cib Mar 6 12:56:15 nomen heartbeat: [14466]: WARN: G_CH_dispatch_int: Dispatch function for read child took too long to execute: 70 ms (> 50 ms) (GSource: 0x94add68) Mar 6 12:56:17 nomen lrmd: [14509]: info: Resource Agent output: []

Regards,
jerome

-----Original Message-----
From: linux-ha-boun...@lists.linux-ha.org [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Dominik Klein
Sent: Wednesday, March 04, 2009 10:54 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Having issues with getting DRBD to work with Pacemaker

Hi

Jerome Yanga wrote:
> Hi! I am having issues with getting DRBD to work with Pacemaker. I can get
> Pacemaker and DRBD run individually but not DRBD managed by Pacemaker.
I > tried following the instruction in the site below but the resources will not > go online. > > http://clusterlabs.org/wiki/DRBD_HowTo_1.0 > > Below is my configuration. > > Installed applications: > ======================= > kernel-2.6.18-128.el5 copy that > drbd-8.3.0-3 > heartbeat-2.99.2-6.1 > pacemaker-1.0.1-3.1 > > > > drbd.conf: > ========== > global { > usage-count no; > } > > resource r0 { > protocol C; > handlers { > pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f"; > pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f"; > local-io-error "echo o > /proc/sysrq-trigger ; halt -f"; > outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5"; > pri-lost "echo pri-lost. Have a look at the log files. | mail -s 'DRBD > Alert' root"; > out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root"; > } > startup { > wfc-timeout 0; > } > > disk { > on-io-error pass_on; > } > net { > max-buffers 2048; > after-sb-0pri disconnect; > after-sb-1pri disconnect; > after-sb-2pri disconnect; > rr-conflict disconnect; > } > syncer { > rate 100M; > al-extents 257; > } > on nomen.esri.com { > device /dev/drbd0; > disk /dev/sda5; > address 192.168.0.1:7789; > meta-disk internal; > } > on rubric.esri.com { > device /dev/drbd0; > disk /dev/sda5; > address 192.168.0.2:7789; > meta-disk internal; > } > } > > > > Cib.xml: > ======== > <cib admin_epoch="0" validate-with="pacemaker-1.0" crm_feature_set="3.0" > have-quorum="1" dc-uuid="a5 > e95310-f27d-418e-9cb9-42e50310f702" epoch="56" num_updates="0" > cib-last-written="Wed Mar 4 14:27:59 > 2009"> > <configuration> > <crm_config> > <cluster_property_set id="cib-bootstrap-options"> > <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" > value="1.0.1-node: 6fc5ce830 > 2abf145a02891ec41e5a492efbe8efe"/> > </cluster_property_set> > </crm_config> > <nodes> > <node id="3a8b681c-a14b-4037-a8e6-2d4af2eff88e" uname="nomen.esri.com" > type="normal"/> > <node id="a5e95310-f27d-418e-9cb9-42e50310f702" 
uname="rubric.esri.com" > type="normal"/> > </nodes> > <resources> > <master id="ms-drbd0"> > <meta_attributes id="ms-drbd0-meta_attributes"> > <nvpair id="ms-drbd0-meta_attributes-clone-max" name="clone-max" > value="2"/> > <nvpair id="ms-drbd0-meta_attributes-notify" name="notify" > value="true"/> > <nvpair id="ms-drbd0-meta_attributes-globally-unique" > name="globally-unique" value="false" > /> > <nvpair name="target-role" > id="ms-drbd0-meta_attributes-target-role" value="Started"/> > </meta_attributes> > <primitive class="ocf" id="drbd0" provider="heartbeat" type="drbd"> > <instance_attributes id="drbd0-instance_attributes"> > <nvpair id="drbd0-instance_attributes-drbd_resource" > name="drbd_resource" value="r0"/> > </instance_attributes> > <operations id="drbd0-ops"> > <op id="drbd0-monitor-59s" interval="59s" name="monitor" > role="Master" timeout="30s"/> > <op id="drbd0-monitor-60s" interval="60s" name="monitor" > role="Slave" timeout="30s"/> > </operations> > </primitive> > </master> > </resources> > <constraints/> > </configuration> > </cib> > > > /var/log/messages: > ================== > Mar 4 14:27:58 nomen crm_resource: [30167]: info: Invoked: crm_resource > --meta -r ms-drbd0 -p target-role -v Started > Mar 4 14:27:58 nomen cib: [29899]: info: cib_process_xpath: Processing > cib_query op for > //cib/configuration/resources//*...@id="ms-drbd0"]//meta_attributes//nvpa...@name="target-role"] > (/cib/configuration/resources/master/meta_attributes/nvpair[4]) > Mar 4 14:27:59 nomen crmd: [29903]: info: do_lrm_rsc_op: Performing > key=5:5:0:d4b86e31-ca4a-4033-8437-6486622eb19f op=drbd0:0_start_0 ) > Mar 4 14:27:59 nomen haclient: on_event:evt:cib_changed > Mar 4 14:27:59 nomen lrmd: [29900]: info: rsc:drbd0:0: start > Mar 4 14:27:59 nomen cib: [30168]: info: write_cib_contents: Wrote version > 0.56.0 of the CIB to disk (digest: 2365d9802f1b9c55e0ed87b8ebda5db3) > Mar 4 14:27:59 nomen cib: [30168]: info: retrieveCib: Reading cluster > configuration from: 
/var/lib/heartbeat/crm/cib.xml (digest: > /var/lib/heartbeat/crm/cib.xml.sig) > Mar 4 14:27:59 nomen cib: [29899]: info: Managed write_cib_contents process > 30168 exited with return code 0. > Mar 4 14:27:59 nomen modprobe: FATAL: Module drbd not found. > Mar 4 14:27:59 nomen lrmd: [29900]: info: RA output: (drbd0:0:start:stdout) > Mar 4 14:27:59 nomen mgmtd: [29904]: info: CIB query: cib > Mar 4 14:27:59 nomen lrmd: [29900]: info: RA output: (drbd0:0:start:stdout) > Could not stat("/proc/drbd"): No such file or directory do you need to load > the module? try: modprobe drbd Command 'drbdsetup /dev/drbd0 disk /dev/sda5 > /dev/sda5 internal --set-defaults --create-device --on-io-error=pass_on' > terminated with exit code 20 drbdadm attach r0: exited with code 20 > Mar 4 14:27:59 nomen drbd[30169]: ERROR: r0 start: not in Secondary mode > after start. > Mar 4 14:27:59 nomen lrmd: [29900]: WARN: Managed drbd0:0:start process > 30169 exited with return code 1. > Mar 4 14:27:59 nomen crmd: [29903]: info: process_lrm_event: LRM operation > drbd0:0_start_0 (call=3, rc=1, cib-update=13, confirmed=true) complete > unknown error > Mar 4 14:27:59 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:27:59 nomen mgmtd: [29904]: info: CIB query: cib > Mar 4 14:28:00 nomen crmd: [29903]: info: do_lrm_rsc_op: Performing > key=41:6:0:d4b86e31-ca4a-4033-8437-6486622eb19f op=drbd0:0_notify_0 ) > Mar 4 14:28:00 nomen lrmd: [29900]: info: rsc:drbd0:0: notify > Mar 4 14:28:00 nomen lrmd: [29900]: info: Managed drbd0:0:notify process > 30310 exited with return code 0. 
> Mar 4 14:28:00 nomen crmd: [29903]: info: process_lrm_event: LRM operation > drbd0:0_notify_0 (call=4, rc=0, cib-update=14, confirmed=true) complete ok > Mar 4 14:28:00 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:00 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:00 nomen mgmtd: [29904]: info: CIB query: cib > Mar 4 14:28:01 nomen crmd: [29903]: info: do_lrm_rsc_op: Performing > key=2:6:0:d4b86e31-ca4a-4033-8437-6486622eb19f op=drbd0:0_stop_0 ) > Mar 4 14:28:01 nomen lrmd: [29900]: info: rsc:drbd0:0: stop > Mar 4 14:28:01 nomen lrmd: [29900]: info: Managed drbd0:0:stop process 30324 > exited with return code 0. > Mar 4 14:28:01 nomen crmd: [29903]: info: process_lrm_event: LRM operation > drbd0:0_stop_0 (call=5, rc=0, cib-update=15, confirmed=true) complete ok > Mar 4 14:28:01 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:01 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:01 nomen mgmtd: [29904]: info: CIB query: cib > Mar 4 14:28:02 nomen crmd: [29903]: info: do_lrm_rsc_op: Performing > key=10:6:0:d4b86e31-ca4a-4033-8437-6486622eb19f op=drbd0:1_start_0 ) > Mar 4 14:28:02 nomen lrmd: [29900]: info: rsc:drbd0:1: start > Mar 4 14:28:02 nomen modprobe: FATAL: Module drbd not found. > Mar 4 14:28:02 nomen lrmd: [29900]: info: RA output: (drbd0:1:start:stdout) > Mar 4 14:28:02 nomen lrmd: [29900]: info: RA output: (drbd0:1:start:stdout) > Could not stat("/proc/drbd"): No such file or directory do you need to load > the module? try: modprobe drbd Command 'drbdsetup /dev/drbd0 disk /dev/sda5 > /dev/sda5 internal --set-defaults --create-device --on-io-error=pass_on' > terminated with exit code 20 drbdadm attach r0: exited with code 20 > Mar 4 14:28:02 nomen drbd[30338]: ERROR: r0 start: not in Secondary mode > after start. > Mar 4 14:28:02 nomen lrmd: [29900]: WARN: Managed drbd0:1:start process > 30338 exited with return code 1. 
> Mar 4 14:28:02 nomen crmd: [29903]: info: process_lrm_event: LRM operation > drbd0:1_start_0 (call=6, rc=1, cib-update=16, confirmed=true) complete > unknown error > Mar 4 14:28:02 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:02 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:02 nomen mgmtd: [29904]: info: CIB query: cib > Mar 4 14:28:03 nomen crmd: [29903]: info: do_lrm_rsc_op: Performing > key=44:7:0:d4b86e31-ca4a-4033-8437-6486622eb19f op=drbd0:1_notify_0 ) > Mar 4 14:28:03 nomen lrmd: [29900]: info: rsc:drbd0:1: notify > Mar 4 14:28:03 nomen lrmd: [29900]: info: Managed drbd0:1:notify process > 30472 exited with return code 0. > Mar 4 14:28:03 nomen crmd: [29903]: info: process_lrm_event: LRM operation > drbd0:1_notify_0 (call=7, rc=0, cib-update=17, confirmed=true) complete ok > Mar 4 14:28:03 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:03 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:03 nomen mgmtd: [29904]: info: CIB query: cib > Mar 4 14:28:04 nomen crmd: [29903]: info: do_lrm_rsc_op: Performing > key=2:7:0:d4b86e31-ca4a-4033-8437-6486622eb19f op=drbd0:1_stop_0 ) > Mar 4 14:28:04 nomen lrmd: [29900]: info: rsc:drbd0:1: stop > Mar 4 14:28:04 nomen lrmd: [29900]: info: Managed drbd0:1:stop process 30486 > exited with return code 0. 
> Mar 4 14:28:04 nomen crmd: [29903]: info: process_lrm_event: LRM operation > drbd0:1_stop_0 (call=8, rc=0, cib-update=18, confirmed=true) complete ok > Mar 4 14:28:04 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:04 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:04 nomen mgmtd: [29904]: info: CIB query: cib > Mar 4 14:28:05 nomen crmd: [29903]: info: do_lrm_rsc_op: Performing > key=7:7:0:d4b86e31-ca4a-4033-8437-6486622eb19f op=drbd0:0_start_0 ) > Mar 4 14:28:05 nomen lrmd: [29900]: info: rsc:drbd0:0: start > Mar 4 14:28:05 nomen modprobe: FATAL: Module drbd not found. > Mar 4 14:28:05 nomen lrmd: [29900]: info: RA output: (drbd0:0:start:stdout) > Mar 4 14:28:05 nomen lrmd: [29900]: info: RA output: (drbd0:0:start:stdout) > Could not stat("/proc/drbd"): No such file or directory do you need to load > the module? try: modprobe drbd Command 'drbdsetup /dev/drbd0 disk /dev/sda5 > /dev/sda5 internal --set-defaults --create-device --on-io-error=pass_on' > terminated with exit code 20 drbdadm attach r0: exited with code 20 > Mar 4 14:28:05 nomen drbd[30500]: ERROR: r0 start: not in Secondary mode > after start. > Mar 4 14:28:05 nomen lrmd: [29900]: WARN: Managed drbd0:0:start process > 30500 exited with return code 1. > Mar 4 14:28:05 nomen crmd: [29903]: info: process_lrm_event: LRM operation > drbd0:0_start_0 (call=9, rc=1, cib-update=19, confirmed=true) complete > unknown error > Mar 4 14:28:05 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:05 nomen mgmtd: [29904]: info: CIB query: cib > Mar 4 14:28:06 nomen crmd: [29903]: info: do_lrm_rsc_op: Performing > key=38:8:0:d4b86e31-ca4a-4033-8437-6486622eb19f op=drbd0:0_notify_0 ) > Mar 4 14:28:06 nomen lrmd: [29900]: info: rsc:drbd0:0: notify > Mar 4 14:28:06 nomen lrmd: [29900]: info: Managed drbd0:0:notify process > 30634 exited with return code 0. 
> Mar 4 14:28:06 nomen crmd: [29903]: info: process_lrm_event: LRM operation > drbd0:0_notify_0 (call=10, rc=0, cib-update=20, confirmed=true) complete ok > Mar 4 14:28:06 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:06 nomen mgmtd: [29904]: info: CIB query: cib > Mar 4 14:28:07 nomen crmd: [29903]: info: do_lrm_rsc_op: Performing > key=1:8:0:d4b86e31-ca4a-4033-8437-6486622eb19f op=drbd0:0_stop_0 ) > Mar 4 14:28:07 nomen lrmd: [29900]: info: rsc:drbd0:0: stop > Mar 4 14:28:07 nomen lrmd: [29900]: info: Managed drbd0:0:stop process 30648 > exited with return code 0. > Mar 4 14:28:07 nomen crmd: [29903]: info: process_lrm_event: LRM operation > drbd0:0_stop_0 (call=11, rc=0, cib-update=21, confirmed=true) complete ok > Mar 4 14:28:07 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:07 nomen mgmtd: [29904]: info: CIB query: cib > Mar 4 14:28:08 nomen haclient: on_event: from message queue: evt:cib_changed > Mar 4 14:28:08 nomen mgmtd: [29904]: info: CIB query: cib
>
> FYI, I had to add the following line to /etc/init.d/drbd to get it working.
>
> insmod /lib/modules/2.6.18-92.1.22.el5/kernel/drivers/block/drbd.ko

copied from the start of your email:

kernel-2.6.18-128.el5

So your kernel module does not match your running kernel, and therefore the modprobe command cannot find the module. Recompile drbd for your running kernel.

Regards
Dominik

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
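P.S. After rereading the DRBD HowTo, this is the crm(live)configure syntax I am planning to try for question 1. It is untested on my cluster, and the group and constraint ids are my own invention; if I understand the 1.0 crm shell correctly, a score such as inf: goes where I had written mandatory:.

```
# Untested sketch, entered at the crm(live)configure prompt.
# Put the filesystem and the VIP in one group so they migrate together:
group grp-fs0 fs0 VIP
# Run the group only on the node where drbd is Master:
colocation grp-fs0-on-ms-drbd0 inf: grp-fs0 ms-drbd0:Master
# Start the group only after the drbd master has been promoted:
order ms-drbd0-before-grp-fs0 inf: ms-drbd0:promote grp-fs0:start
commit
```

For question 2, I believe typing "help" inside crm(live) and crm(live)configure lists the supported objects and their syntax, but pointers to fuller documentation would still be appreciated.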