So, I'm setting up a two-node cluster that will (hopefully) eventually serve as an active/passive HA iSCSI target on RHEL 6. I'm using the [incredibly poorly written] guide I found on Linbit's website ("Highly available iSCSI storage with DRBD and Pacemaker"). I have somehow gotten pretty far through it, but I've hit a couple of snags.
Here's my drbd.conf (/etc/drbd.d/global_common.conf):

global {
    usage-count yes;
    # minor-count dialog-refresh disable-ip-verification
}

common {
    protocol C;

    handlers {
        # These are EXAMPLE handlers only.
        # They may have severe implications,
        # like hard resetting the node under certain circumstances.
        # Be careful when chosing your poison.

        # pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        # pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        # local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
        # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
        # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
    }

    startup {
        # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
    }

    disk {
        # on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
        # no-disk-drain no-md-flushes max-bio-bvecs
    }

    net {
        # sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
        # max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
        # after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
    }

    syncer {
        # rate after al-extents use-rle cpu-mask verify-alg csums-alg
    }
}

And here's my targetfs.res config (/etc/drbd.d/targetfs.res):

resource targetfs {
    protocol C;
    meta-disk internal;
    device /dev/drbd1;
    disk /dev/xvdf;

    syncer {
        verify-alg sha1;
        c-plan-ahead 0;
        rate 32M;
    }

    net {
        allow-two-primaries;
    }

    on node1 {
        address 10.130.96.120:7789;
    }

    on node2 {
        address 10.130.97.165:7789;
    }
}

These, of course, live on both nodes.
Once I create the DRBD metadata and sync the nodes:

(node1)# drbdadm create-md targetfs
(node2)# drbdadm create-md targetfs
(node1)# drbdadm up targetfs
(node2)# drbdadm up targetfs
(node2)# drbdadm invalidate targetfs
(node1)# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:134213632 nr:0 dw:36 dr:134215040 al:1 bm:8192 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

I run a pvcreate and a vgcreate on node1:

(node1)# pvcreate /dev/drbd/by-res/targetfs
(node1)# vgcreate targetfs /dev/drbd/by-res/targetfs
(node1)# pvs && vgs
  PV         VG       Fmt  Attr PSize   PFree
  /dev/drbd1 targetfs lvm2 a--u 127.99g 127.99g
  VG       #PV #LV #SN Attr   VSize   VFree
  targetfs   1   0   0 wz--n- 127.99g 127.99g

pcs cluster configuration goes well enough for a bit:

# pcs cluster setup --name gctvanas node1 node2 --transport udpu
# pcs cluster start --all
# pcs property set stonith-enabled=false
# pcs property set no-quorum-policy=ignore
# pcs property set default-resource-stickiness="200"
# pcs resource create gctvanas-vip ocf:heartbeat:IPaddr2 ip=10.30.96.100 cidr_netmask=32 nic=eth0 op monitor interval=30s
# pcs cluster cib drbd_cfg
# pcs -f drbd_cfg resource create gctvanas-fs1o ocf:linbit:drbd drbd_resource=targetfs op monitor interval=10s
# pcs -f drbd_cfg resource master gctvanas-fs2o gctvanas-fs1o master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
# pcs cluster cib-push drbd_cfg
# pcs status
Cluster name: gctvanas
Stack: cman
Current DC: node1 (version 1.1.15-1.9a34920.git.el6-9a34920) - partition with quorum
Last updated: Fri Aug 26 11:29:11 2016
Last change: Fri Aug 26 11:29:07 2016 by root via cibadmin on node1

2 nodes configured
3 resources configured

Online: [ node1 node2 ]

Full list of resources:

 Master/Slave Set: gctvanas-fs2o [gctvanas-fs1o]
     Masters: [ node1 ]
     Slaves: [ node2 ]
 gctvanas-vip   (ocf::heartbeat:IPaddr2):       Started node1

PCSD Status:
  node1: Online
  node2: Online

And then I do this:

# pcs resource create gctvanas-lvm ocf:heartbeat:LVM params volgrpname=targetfs op monitor interval="30s"

...and the wheels come off the cart :-|

# pcs status
Cluster name: gctvanas
Stack: cman
Current DC: node1 (version 1.1.15-1.9a34920.git.el6-9a34920) - partition with quorum
Last updated: Fri Aug 26 11:27:29 2016
Last change: Fri Aug 26 10:57:21 2016 by root via cibadmin on node1

2 nodes configured
4 resources configured

Online: [ node1 node2 ]

Full list of resources:

 Master/Slave Set: gctvanas-fs2o [gctvanas-fs1o]
     Masters: [ node1 ]
     Slaves: [ node2 ]
 gctvanas-vip   (ocf::heartbeat:IPaddr2):       Started node1
 gctvanas-lvm   (ocf::heartbeat:LVM):   Stopped

Failed Actions:
* gctvanas-lvm_start_0 on node1 'not running' (7): call=42, status=complete,
    exitreason='LVM: targetfs did not activate correctly',
    last-rc-change='Fri Aug 26 10:57:22 2016', queued=0ms, exec=577ms
* gctvanas-lvm_start_0 on node2 'unknown error' (1): call=34, status=complete,
    exitreason='Volume group [targetfs] does not exist or contains error! Volume group "targetfs" not found',
    last-rc-change='Fri Aug 26 10:57:21 2016', queued=0ms, exec=322ms

PCSD Status:
  node1: Online
  node2: Online

I'm not seeing anything obvious on node1 that indicates what the issue is there, but the error on node2 makes a little more sense, since node2 doesn't actually know about the PV/VG created on the DRBD disk (so I feel like I'm missing a step that somehow lets node2 know what's going on there). Any ideas?
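One thing that jumps out at me as I write this up: the <constraints/> section of my CIB (dumped below) is still empty, so as far as I can tell nothing tells Pacemaker that gctvanas-lvm may only run where DRBD has been promoted. My guess (unverified, and the guide hadn't gotten there yet) is that I need something along these lines before the LVM resource can behave:

# pcs constraint colocation add gctvanas-lvm with master gctvanas-fs2o INFINITY
# pcs constraint order promote gctvanas-fs2o then start gctvanas-lvm

That would at least explain node2 (it can't see the VG while its /dev/drbd1 is still Secondary), though not the node1 failure. For node1 I'm wondering whether /etc/lvm/lvm.conf needs a filter so LVM only ever sees the PV through the DRBD device rather than the backing /dev/xvdf, something like:

    filter = [ "a|/dev/drbd.*|", "r|.*|" ]
    write_cache_state = 0

but I haven't tried either yet.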
Here's my current cluster.conf:

<cluster config_version="9" name="gctvanas">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="node1" nodeid="1">
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2">
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman broadcast="no" expected_votes="1" transport="udpu" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

Here's my current CIB dump:

<cib crm_feature_set="3.0.11" validate-with="pacemaker-2.6" epoch="56" num_updates="10" admin_epoch="0"
     cib-last-written="Fri Aug 26 10:57:21 2016" update-origin="node1" update-client="cibadmin"
     update-user="root" have-quorum="1" dc-uuid="node1">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-have-watchdog" name="have-watchdog" value="false"/>
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.15-1.9a34920.git.el6-9a34920"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="cman"/>
        <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
        <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
        <nvpair id="cib-bootstrap-options-default-resource-stickiness" name="default-resource-stickiness" value="200"/>
        <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1472203780"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="node2" uname="node2"/>
      <node id="node1" uname="node1"/>
    </nodes>
    <resources>
      <master id="gctvanas-fs2o">
        <primitive class="ocf" id="gctvanas-fs1o" provider="linbit" type="drbd">
          <instance_attributes id="gctvanas-fs1o-instance_attributes">
            <nvpair id="gctvanas-fs1o-instance_attributes-drbd_resource" name="drbd_resource" value="targetfs"/>
          </instance_attributes>
          <operations>
            <op id="gctvanas-fs1o-start-interval-0s" interval="0s" name="start" timeout="240"/>
            <op id="gctvanas-fs1o-promote-interval-0s" interval="0s" name="promote" timeout="90"/>
            <op id="gctvanas-fs1o-demote-interval-0s" interval="0s" name="demote" timeout="90"/>
            <op id="gctvanas-fs1o-stop-interval-0s" interval="0s" name="stop" timeout="100"/>
            <op id="gctvanas-fs1o-monitor-interval-10s" interval="10s" name="monitor"/>
          </operations>
        </primitive>
        <meta_attributes id="gctvanas-fs2o-meta_attributes">
          <nvpair id="gctvanas-fs2o-meta_attributes-master-max" name="master-max" value="1"/>
          <nvpair id="gctvanas-fs2o-meta_attributes-master-node-max" name="master-node-max" value="1"/>
          <nvpair id="gctvanas-fs2o-meta_attributes-clone-max" name="clone-max" value="2"/>
          <nvpair id="gctvanas-fs2o-meta_attributes-clone-node-max" name="clone-node-max" value="1"/>
          <nvpair id="gctvanas-fs2o-meta_attributes-notify" name="notify" value="true"/>
        </meta_attributes>
      </master>
      <primitive class="ocf" id="gctvanas-vip" provider="heartbeat" type="IPaddr2">
        <instance_attributes id="gctvanas-vip-instance_attributes">
          <nvpair id="gctvanas-vip-instance_attributes-ip" name="ip" value="10.30.96.100"/>
          <nvpair id="gctvanas-vip-instance_attributes-cidr_netmask" name="cidr_netmask" value="32"/>
          <nvpair id="gctvanas-vip-instance_attributes-nic" name="nic" value="eth0"/>
        </instance_attributes>
        <operations>
          <op id="gctvanas-vip-start-interval-0s" interval="0s" name="start" timeout="20s"/>
          <op id="gctvanas-vip-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>
          <op id="gctvanas-vip-monitor-interval-30s" interval="30s" name="monitor"/>
        </operations>
      </primitive>
      <primitive class="ocf" id="gctvanas-lvm" provider="heartbeat" type="LVM">
        <instance_attributes id="gctvanas-lvm-instance_attributes">
          <nvpair id="gctvanas-lvm-instance_attributes-volgrpname" name="volgrpname" value="targetfs"/>
        </instance_attributes>
        <operations>
          <op id="gctvanas-lvm-start-interval-0s" interval="0s" name="start" timeout="30"/>
          <op id="gctvanas-lvm-stop-interval-0s" interval="0s" name="stop" timeout="30"/>
          <op id="gctvanas-lvm-monitor-interval-30s" interval="30s" name="monitor"/>
        </operations>
      </primitive>
    </resources>
    <constraints/>
    <rsc_defaults>
      <meta_attributes id="rsc_defaults-options">
        <nvpair id="rsc_defaults-options-resource-stickiness" name="resource-stickiness" value="100"/>
      </meta_attributes>
    </rsc_defaults>
  </configuration>
  <status>
    <node_state id="node1" uname="node1" in_ccm="true" crmd="online" crm-debug-origin="do_update_resource"
                join="member" expected="member">
      <lrm id="node1">
        <lrm_resources>
          <lrm_resource id="gctvanas-vip" type="IPaddr2" class="ocf" provider="heartbeat">
            <lrm_rsc_op id="gctvanas-vip_last_0" operation_key="gctvanas-vip_start_0" operation="start"
                        crm-debug-origin="do_update_resource" crm_feature_set="3.0.11"
                        transition-key="32:1:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        transition-magic="0:0;32:1:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        on_node="node1" call-id="11" rc-code="0" op-status="0" interval="0"
                        last-run="1472203004" last-rc-change="1472203004" exec-time="74" queue-time="0"
                        op-digest="15b4ba230497d33ad5d77f05e4b9a83e"/>
            <lrm_rsc_op id="gctvanas-vip_monitor_30000" operation_key="gctvanas-vip_monitor_30000" operation="monitor"
                        crm-debug-origin="do_update_resource" crm_feature_set="3.0.11"
                        transition-key="33:1:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        transition-magic="0:0;33:1:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        on_node="node1" call-id="14" rc-code="0" op-status="0" interval="30000"
                        last-rc-change="1472203004" exec-time="72" queue-time="0"
                        op-digest="205179ac48c643694ee24512cc3b1429"/>
          </lrm_resource>
          <lrm_resource id="gctvanas-fs1o" type="drbd" class="ocf" provider="linbit">
            <lrm_rsc_op id="gctvanas-fs1o_last_failure_0" operation_key="gctvanas-fs1o_monitor_0" operation="monitor"
                        crm-debug-origin="do_update_resource" crm_feature_set="3.0.11"
                        transition-key="3:9:7:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        transition-magic="0:8;3:9:7:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        on_node="node1" call-id="37" rc-code="8" op-status="0" interval="0"
                        last-run="1472203781" last-rc-change="1472203781" exec-time="37" queue-time="1"
                        op-digest="fb1e24e691d75f64117224686c0f806b"/>
            <lrm_rsc_op id="gctvanas-fs1o_last_0" operation_key="gctvanas-fs1o_monitor_0" operation="monitor"
                        crm-debug-origin="do_update_resource" crm_feature_set="3.0.11"
                        transition-key="3:9:7:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        transition-magic="0:8;3:9:7:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        on_node="node1" call-id="37" rc-code="8" op-status="0" interval="0"
                        last-run="1472203781" last-rc-change="1472203781" exec-time="37" queue-time="1"
                        op-digest="fb1e24e691d75f64117224686c0f806b"/>
          </lrm_resource>
          <lrm_resource id="gctvanas-lvm" type="LVM" class="ocf" provider="heartbeat">
            <lrm_rsc_op id="gctvanas-lvm_last_0" operation_key="gctvanas-lvm_stop_0" operation="stop"
                        crm-debug-origin="do_update_resource" crm_feature_set="3.0.11"
                        transition-key="2:36:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        transition-magic="0:0;2:36:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        on_node="node1" call-id="43" rc-code="0" op-status="0" interval="0"
                        last-run="1472223443" last-rc-change="1472223443" exec-time="369" queue-time="0"
                        op-digest="df48f11b20123e34ddf99999ce9f3f1c"
                        exit-reason="LVM: targetfs did not activate correctly"/>
            <lrm_rsc_op id="gctvanas-lvm_last_failure_0" operation_key="gctvanas-lvm_start_0" operation="start"
                        crm-debug-origin="do_update_resource" crm_feature_set="3.0.11"
                        transition-key="37:35:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        transition-magic="0:7;37:35:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        exit-reason="LVM: targetfs did not activate correctly"
                        on_node="node1" call-id="42" rc-code="7" op-status="0" interval="0"
                        last-run="1472223442" last-rc-change="1472223442" exec-time="577" queue-time="0"
                        op-digest="df48f11b20123e34ddf99999ce9f3f1c"/>
          </lrm_resource>
        </lrm_resources>
      </lrm>
      <transient_attributes id="node1">
        <instance_attributes id="status-node1">
          <nvpair id="status-node1-shutdown" name="shutdown" value="0"/>
          <nvpair id="status-node1-last-failure-gctvanas-fs1o" name="last-failure-gctvanas-fs1o" value="1472203145"/>
          <nvpair id="status-node1-master-gctvanas-fs1o" name="master-gctvanas-fs1o" value="10000"/>
          <nvpair id="status-node1-fail-count-gctvanas-lvm" name="fail-count-gctvanas-lvm" value="INFINITY"/>
          <nvpair id="status-node1-last-failure-gctvanas-lvm" name="last-failure-gctvanas-lvm" value="1472223442"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
    <node_state id="node2" uname="node2" in_ccm="true" crmd="online" crm-debug-origin="do_update_resource"
                join="member" expected="member">
      <lrm id="node2">
        <lrm_resources>
          <lrm_resource id="gctvanas-vip" type="IPaddr2" class="ocf" provider="heartbeat">
            <lrm_rsc_op id="gctvanas-vip_last_0" operation_key="gctvanas-vip_monitor_0" operation="monitor"
                        crm-debug-origin="do_update_resource" crm_feature_set="3.0.11"
                        transition-key="5:0:7:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        transition-magic="0:7;5:0:7:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        on_node="node2" call-id="10" rc-code="7" op-status="0" interval="0"
                        last-run="1472203004" last-rc-change="1472203004" exec-time="63" queue-time="0"
                        op-digest="15b4ba230497d33ad5d77f05e4b9a83e"/>
          </lrm_resource>
          <lrm_resource id="gctvanas-fs1o" type="drbd" class="ocf" provider="linbit">
            <lrm_rsc_op id="gctvanas-fs1o_last_failure_0" operation_key="gctvanas-fs1o_monitor_0" operation="monitor"
                        crm-debug-origin="do_update_resource" crm_feature_set="3.0.11"
                        transition-key="4:9:7:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        transition-magic="0:0;4:9:7:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        on_node="node2" call-id="28" rc-code="0" op-status="0" interval="0"
                        last-run="1472203781" last-rc-change="1472203781" exec-time="39" queue-time="0"
                        op-digest="fb1e24e691d75f64117224686c0f806b"/>
            <lrm_rsc_op id="gctvanas-fs1o_last_0" operation_key="gctvanas-fs1o_monitor_0" operation="monitor"
                        crm-debug-origin="do_update_resource" crm_feature_set="3.0.11"
                        transition-key="4:9:7:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        transition-magic="0:0;4:9:7:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        on_node="node2" call-id="28" rc-code="0" op-status="0" interval="0"
                        last-run="1472203781" last-rc-change="1472203781" exec-time="39" queue-time="0"
                        op-digest="fb1e24e691d75f64117224686c0f806b"/>
            <lrm_rsc_op id="gctvanas-fs1o_monitor_10000" operation_key="gctvanas-fs1o_monitor_10000" operation="monitor"
                        crm-debug-origin="do_update_resource" crm_feature_set="3.0.11"
                        transition-key="9:10:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        transition-magic="0:0;9:10:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        on_node="node2" call-id="29" rc-code="0" op-status="0" interval="10000"
                        last-rc-change="1472203781" exec-time="45" queue-time="0"
                        op-digest="1a2c711d2933e74557b4c1a13ec62162"/>
          </lrm_resource>
          <lrm_resource id="gctvanas-lvm" type="LVM" class="ocf" provider="heartbeat">
            <lrm_rsc_op id="gctvanas-lvm_last_0" operation_key="gctvanas-lvm_stop_0" operation="stop"
                        crm-debug-origin="do_update_resource" crm_feature_set="3.0.11"
                        transition-key="3:35:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        transition-magic="0:0;3:35:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        on_node="node2" call-id="35" rc-code="0" op-status="0" interval="0"
                        last-run="1472223442" last-rc-change="1472223442" exec-time="151" queue-time="0"
                        op-digest="df48f11b20123e34ddf99999ce9f3f1c"
                        exit-reason="Volume group [targetfs] does not exist or contains error! Volume group &quot;targetfs&quot; not found"/>
            <lrm_rsc_op id="gctvanas-lvm_last_failure_0" operation_key="gctvanas-lvm_start_0" operation="start"
                        crm-debug-origin="do_update_resource" crm_feature_set="3.0.11"
                        transition-key="38:33:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        transition-magic="0:1;38:33:0:681b3ca7-f83d-4396-a249-d6d80e0efe16"
                        exit-reason="Volume group [targetfs] does not exist or contains error! Volume group &quot;targetfs&quot; not found"
                        on_node="node2" call-id="34" rc-code="1" op-status="0" interval="0"
                        last-run="1472223441" last-rc-change="1472223441" exec-time="322" queue-time="0"
                        op-digest="df48f11b20123e34ddf99999ce9f3f1c"/>
          </lrm_resource>
        </lrm_resources>
      </lrm>
      <transient_attributes id="node2">
        <instance_attributes id="status-node2">
          <nvpair id="status-node2-shutdown" name="shutdown" value="0"/>
          <nvpair id="status-node2-last-failure-gctvanas-fs1o" name="last-failure-gctvanas-fs1o" value="1472203144"/>
          <nvpair id="status-node2-master-gctvanas-fs1o" name="master-gctvanas-fs1o" value="10000"/>
          <nvpair id="status-node2-fail-count-gctvanas-lvm" name="fail-count-gctvanas-lvm" value="INFINITY"/>
          <nvpair id="status-node2-last-failure-gctvanas-lvm" name="last-failure-gctvanas-lvm" value="1472223441"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
  </status>
</cib>

--
[ jR ]
@: ja...@eramsey.org

there is no path to greatness; greatness is the path
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org