Hi all,
This is something of a repeat of my questions on the Pacemaker
mailing list, but I think I am dealing with a DRBD issue, so allow me to
ask again here.
I have drbd configured thus:
====
# /etc/drbd.conf
common {
    protocol C;
    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    disk {
        fencing resource-and-stonith;
    }
    syncer {
        rate 40M;
    }
    handlers {
        fence-peer /usr/lib/drbd/crm-fence-peer.sh;
    }
}

# resource r0 on an-a04n01.alteeve.ca: not ignored, not stacked
resource r0 {
    on an-a04n01.alteeve.ca {
        device    /dev/drbd0 minor 0;
        disk      /dev/sda5;
        address   ipv4 10.10.40.1:7788;
        meta-disk internal;
    }
    on an-a04n02.alteeve.ca {
        device    /dev/drbd0 minor 0;
        disk      /dev/sda5;
        address   ipv4 10.10.40.2:7788;
        meta-disk internal;
    }
}

# resource r1 on an-a04n01.alteeve.ca: not ignored, not stacked
resource r1 {
    on an-a04n01.alteeve.ca {
        device    /dev/drbd1 minor 1;
        disk      /dev/sda6;
        address   ipv4 10.10.40.1:7789;
        meta-disk internal;
    }
    on an-a04n02.alteeve.ca {
        device    /dev/drbd1 minor 1;
        disk      /dev/sda6;
        address   ipv4 10.10.40.2:7789;
        meta-disk internal;
    }
}
====
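For reference, the connection and disk state can be sanity-checked by hand with the stock 8.3 tooling; the following is just a sketch, not output from my session:
====
# Sanity-check sketch (stock DRBD 8.3 commands):
drbdadm dump r0      # confirm the parsed configuration matches the above
cat /proc/drbd       # connection state, roles and disk states
drbdadm cstate r0    # e.g. StandAlone / WFConnection / Connected
drbdadm dstate r0    # e.g. UpToDate/UpToDate
drbdadm role r0      # e.g. Secondary/Secondary before promotion
====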
I've set up Pacemaker thus:
====
Cluster Name: an-anvil-04
Corosync Nodes:
Pacemaker Nodes:
 an-a04n01.alteeve.ca an-a04n02.alteeve.ca
Resources:
 Master: drbd_r0_Clone
  Meta Attrs: master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
  Resource: drbd_r0 (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=r0
   Operations: monitor interval=30s (drbd_r0-monitor-interval-30s)
 Master: lvm_n01_vg0_Clone
  Meta Attrs: master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
  Resource: lvm_n01_vg0 (class=ocf provider=heartbeat type=LVM)
   Attributes: volgrpname=an-a04n01_vg0
   Operations: monitor interval=30s (lvm_n01_vg0-monitor-interval-30s)
Stonith Devices:
 Resource: fence_n01_ipmi (class=stonith type=fence_ipmilan)
  Attributes: pcmk_host_list=an-a04n01.alteeve.ca ipaddr=an-a04n01.ipmi action=reboot login=admin passwd=Initial1 delay=15
  Operations: monitor interval=60s (fence_n01_ipmi-monitor-interval-60s)
 Resource: fence_n02_ipmi (class=stonith type=fence_ipmilan)
  Attributes: pcmk_host_list=an-a04n02.alteeve.ca ipaddr=an-a04n02.ipmi action=reboot login=admin passwd=Initial1
  Operations: monitor interval=60s (fence_n02_ipmi-monitor-interval-60s)
Fencing Levels:
Location Constraints:
 Resource: drbd_r0_Clone
  Constraint: drbd-fence-by-handler-r0-drbd_r0_Clone
   Rule: score=-INFINITY role=Master (id:drbd-fence-by-handler-r0-rule-drbd_r0_Clone)
    Expression: #uname ne an-a04n01.alteeve.ca (id:drbd-fence-by-handler-r0-expr-drbd_r0_Clone)
Ordering Constraints:
 promote drbd_r0_Clone then start lvm_n01_vg0_Clone (Mandatory) (id:order-drbd_r0_Clone-lvm_n01_vg0_Clone-mandatory)
Colocation Constraints:
Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.10-14.el6_5.3-368c726
 last-lrm-refresh: 1403147476
 no-quorum-policy: ignore
 stonith-enabled: true
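For reference, roughly how the same information can be inspected on the live cluster; exact pcs option spellings may differ between versions, so treat this as a sketch:
====
# Inspection sketch (pcs 0.9.x / Pacemaker 1.1.10 era):
pcs config                   # full cluster configuration dump (as above)
pcs status                   # live resource, clone and fencing state
crm_mon -1                   # one-shot status straight from Pacemaker
cibadmin -Q -o constraints   # raw XML of the constraints section
====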
Note the -INFINITY rule; I didn't add that, crm-fence-peer.sh did on
start. This brings me to my question: when Pacemaker starts the DRBD
resource, the peer is immediately resource-fenced. Note that only r0 is
configured in Pacemaker at this time, to simplify debugging this issue.
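As I understand it, crm-fence-peer.sh is what adds that constraint and crm-unfence-peer.sh is what removes it again. The sketch below is roughly the manual equivalent; DRBD_RESOURCE is normally set by drbdadm when it invokes the handlers, so calling the scripts this way by hand is an assumption on my part:
====
# Roughly the manual equivalent of the fence/unfence handlers (sketch only):
DRBD_RESOURCE=r0 /usr/lib/drbd/crm-fence-peer.sh     # adds the -INFINITY rule
DRBD_RESOURCE=r0 /usr/lib/drbd/crm-unfence-peer.sh   # removes it again
# Or drop the constraint directly by its id:
cibadmin -D -X '<rsc_location id="drbd-fence-by-handler-r0-drbd_r0_Clone"/>'
====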
Here are the DRBD-related /var/log/messages entries on node 1 (an-a04n01)
from when I start Pacemaker:
====
Jun 19 00:14:22 an-a04n01 crmd[16895]: notice: do_state_transition:
State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC
cause=C_FSA_INTERNAL origin=do_election_check ]
Jun 19 00:14:22 an-a04n01 attrd[16893]: notice: attrd_local_callback:
Sending full refresh (origin=crmd)
Jun 19 00:14:22 an-a04n01 pengine[16894]: notice: unpack_config: On
loss of CCM Quorum: Ignore
Jun 19 00:14:22 an-a04n01 pengine[16894]: notice: LogActions: Start
fence_n01_ipmi#011(an-a04n01.alteeve.ca)
Jun 19 00:14:22 an-a04n01 pengine[16894]: notice: LogActions: Start
fence_n02_ipmi#011(an-a04n02.alteeve.ca)
Jun 19 00:14:22 an-a04n01 pengine[16894]: notice: LogActions: Start
drbd_r0:0#011(an-a04n01.alteeve.ca)
Jun 19 00:14:22 an-a04n01 pengine[16894]: notice: LogActions: Start
drbd_r0:1#011(an-a04n02.alteeve.ca)
Jun 19 00:14:22 an-a04n01 pengine[16894]: notice: process_pe_message:
Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-230.bz2
Jun 19 00:14:22 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 8: monitor fence_n01_ipmi_monitor_0 on
an-a04n02.alteeve.ca
Jun 19 00:14:22 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 4: monitor fence_n01_ipmi_monitor_0 on
an-a04n01.alteeve.ca (local)
Jun 19 00:14:22 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 9: monitor fence_n02_ipmi_monitor_0 on
an-a04n02.alteeve.ca
Jun 19 00:14:22 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 5: monitor fence_n02_ipmi_monitor_0 on
an-a04n01.alteeve.ca (local)
Jun 19 00:14:22 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 6: monitor drbd_r0:0_monitor_0 on an-a04n01.alteeve.ca
(local)
Jun 19 00:14:22 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 10: monitor drbd_r0:1_monitor_0 on an-a04n02.alteeve.ca
Jun 19 00:14:23 an-a04n01 crmd[16895]: notice: process_lrm_event: LRM
operation drbd_r0_monitor_0 (call=14, rc=7, cib-update=28,
confirmed=true) not running
Jun 19 00:14:23 an-a04n01 crmd[16895]: notice: process_lrm_event:
an-a04n01.alteeve.ca-drbd_r0_monitor_0:14 [ \n ]
Jun 19 00:14:23 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 3: probe_complete probe_complete on
an-a04n01.alteeve.ca (local) - no waiting
Jun 19 00:14:23 an-a04n01 attrd[16893]: notice: attrd_trigger_update:
Sending flush op to all hosts for: probe_complete (true)
Jun 19 00:14:23 an-a04n01 attrd[16893]: notice: attrd_perform_update:
Sent update 4: probe_complete=true
Jun 19 00:14:23 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 7: probe_complete probe_complete on
an-a04n02.alteeve.ca - no waiting
Jun 19 00:14:23 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 11: start fence_n01_ipmi_start_0 on
an-a04n01.alteeve.ca (local)
Jun 19 00:14:23 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 13: start fence_n02_ipmi_start_0 on an-a04n02.alteeve.ca
Jun 19 00:14:23 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 15: start drbd_r0:0_start_0 on an-a04n01.alteeve.ca
(local)
Jun 19 00:14:24 an-a04n01 stonith-ng[16891]: notice:
stonith_device_register: Device 'fence_n01_ipmi' already existed in
device list (2 active devices)
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 17: start drbd_r0:1_start_0 on an-a04n02.alteeve.ca
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: process_lrm_event: LRM
operation fence_n01_ipmi_start_0 (call=19, rc=0, cib-update=29,
confirmed=true) ok
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 12: monitor fence_n01_ipmi_monitor_60000 on
an-a04n01.alteeve.ca (local)
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 14: monitor fence_n02_ipmi_monitor_60000 on
an-a04n02.alteeve.ca
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: process_lrm_event: LRM
operation fence_n01_ipmi_monitor_60000 (call=24, rc=0, cib-update=30,
confirmed=false) ok
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: Starting worker thread
(from cqueue [3265])
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: disk( Diskless ->
Attaching )
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: Found 4 transactions (126
active extents) in activity log.
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: Method to ensure write
ordering: flush
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: drbd_bm_resize called
with capacity == 909525832
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: resync bitmap:
bits=113690729 words=1776418 pages=3470
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: size = 434 GB (454762916 KB)
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: bitmap READ of 3470 pages
took 8 jiffies
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: recounting of set bits
took additional 16 jiffies
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: 0 KB (0 bits) marked
out-of-sync by on disk bit-map.
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: disk( Attaching ->
Consistent )
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: attached to UUIDs
561F3328043888C0:0000000000000000:052A1A6B59936EC5:05291A6B59936EC5
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: conn( StandAlone ->
Unconnected )
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: Starting receiver thread
(from drbd0_worker [17045])
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: receiver (re)started
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: conn( Unconnected ->
WFConnection )
Jun 19 00:14:24 an-a04n01 attrd[16893]: notice: attrd_trigger_update:
Sending flush op to all hosts for: master-drbd_r0 (5)
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: process_lrm_event: LRM
operation drbd_r0_start_0 (call=21, rc=0, cib-update=31, confirmed=true) ok
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 48: notify drbd_r0:0_post_notify_start_0 on
an-a04n01.alteeve.ca (local)
Jun 19 00:14:24 an-a04n01 attrd[16893]: notice: attrd_perform_update:
Sent update 9: master-drbd_r0=5
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 49: notify drbd_r0:1_post_notify_start_0 on
an-a04n02.alteeve.ca
Jun 19 00:14:24 an-a04n01 attrd[16893]: notice: attrd_perform_update:
Sent update 11: master-drbd_r0=5
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: process_lrm_event: LRM
operation drbd_r0_notify_0 (call=28, rc=0, cib-update=0, confirmed=true) ok
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: run_graph: Transition 0
(Complete=23, Pending=0, Fired=0, Skipped=2, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-input-230.bz2): Stopped
Jun 19 00:14:24 an-a04n01 pengine[16894]: notice: unpack_config: On
loss of CCM Quorum: Ignore
Jun 19 00:14:24 an-a04n01 pengine[16894]: notice: LogActions: Promote
drbd_r0:0#011(Slave -> Master an-a04n01.alteeve.ca)
Jun 19 00:14:24 an-a04n01 pengine[16894]: notice: LogActions: Promote
drbd_r0:1#011(Slave -> Master an-a04n02.alteeve.ca)
Jun 19 00:14:24 an-a04n01 pengine[16894]: notice: process_pe_message:
Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-231.bz2
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 52: notify drbd_r0_pre_notify_promote_0 on
an-a04n01.alteeve.ca (local)
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 54: notify drbd_r0_pre_notify_promote_0 on
an-a04n02.alteeve.ca
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: process_lrm_event: LRM
operation drbd_r0_notify_0 (call=31, rc=0, cib-update=0, confirmed=true) ok
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 13: promote drbd_r0_promote_0 on an-a04n01.alteeve.ca
(local)
Jun 19 00:14:24 an-a04n01 crmd[16895]: notice: te_rsc_command:
Initiating action 16: promote drbd_r0_promote_0 on an-a04n02.alteeve.ca
Jun 19 00:14:24 an-a04n01 kernel: block drbd0: helper command:
/sbin/drbdadm fence-peer minor-0
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: Handshake successful:
Agreed network protocol version 97
Jun 19 00:14:25 an-a04n01 crm-fence-peer.sh[17156]: invoked for r0
Jun 19 00:14:25 an-a04n01 cibadmin[17188]: notice: crm_log_args:
Invoked: cibadmin -C -o constraints -X <rsc_location rsc="drbd_r0_Clone"
id="drbd-fence-by-handler-r0-drbd_r0_Clone">#012 <rule role="Master"
score="-INFINITY" id="drbd-fence-by-handler-r0-rule-drbd_r0_Clone">#012
<expression attribute="#uname" operation="ne"
value="an-a04n01.alteeve.ca"
id="drbd-fence-by-handler-r0-expr-drbd_r0_Clone"/>#012
</rule>#012</rsc_location>
Jun 19 00:14:25 an-a04n01 crmd[16895]: notice: handle_request: Current
ping state: S_TRANSITION_ENGINE
Jun 19 00:14:25 an-a04n01 cib[16890]: notice: cib:diff: Diff: --- 0.94.19
Jun 19 00:14:25 an-a04n01 cib[16890]: notice: cib:diff: Diff: +++
0.95.1 4f095b8add6dcbb173de1254bf02fcf6
Jun 19 00:14:25 an-a04n01 cib[16890]: notice: cib:diff: -- <cib
admin_epoch="0" epoch="94" num_updates="19"/>
Jun 19 00:14:25 an-a04n01 cib[16890]: notice: cib:diff: ++
<rsc_location rsc="drbd_r0_Clone"
id="drbd-fence-by-handler-r0-drbd_r0_Clone">
Jun 19 00:14:25 an-a04n01 cib[16890]: notice: cib:diff: ++
<rule role="Master" score="-INFINITY"
id="drbd-fence-by-handler-r0-rule-drbd_r0_Clone">
Jun 19 00:14:25 an-a04n01 cib[16890]: notice: cib:diff: ++
<expression attribute="#uname" operation="ne"
value="an-a04n01.alteeve.ca"
id="drbd-fence-by-handler-r0-expr-drbd_r0_Clone"/>
Jun 19 00:14:25 an-a04n01 cib[16890]: notice: cib:diff: ++ </rule>
Jun 19 00:14:25 an-a04n01 cib[16890]: notice: cib:diff: ++
</rsc_location>
Jun 19 00:14:25 an-a04n01 stonith-ng[16891]: notice: unpack_config: On
loss of CCM Quorum: Ignore
Jun 19 00:14:25 an-a04n01 crm-fence-peer.sh[17156]: INFO peer is
reachable, my disk is Consistent: placed constraint
'drbd-fence-by-handler-r0-drbd_r0_Clone'
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: helper command:
/sbin/drbdadm fence-peer minor-0 exit code 4 (0x400)
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: fence-peer helper
returned 4 (peer was fenced)
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: role( Secondary ->
Primary ) disk( Consistent -> UpToDate ) pdsk( DUnknown -> Outdated )
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: new current UUID
25DF173CF8D89023:561F3328043888C0:052A1A6B59936EC5:05291A6B59936EC5
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: conn( WFConnection ->
WFReportParams )
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: Starting asender thread
(from drbd0_receiver [17062])
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: data-integrity-alg:
<not-used>
Jun 19 00:14:25 an-a04n01 stonith-ng[16891]: notice:
stonith_device_register: Device 'fence_n01_ipmi' already existed in
device list (2 active devices)
Jun 19 00:14:25 an-a04n01 cib[16890]: warning: update_results: Action
cib_create failed: Name not unique on network (cde=-76)
Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB
Update failures <failed>
Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB
Update failures <failed_update
id="drbd-fence-by-handler-r0-drbd_r0_Clone" object_type="rsc_location"
operation="cib_create" reason="Name not unique on network">
Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB
Update failures <rsc_location rsc="drbd_r0_Clone"
id="drbd-fence-by-handler-r0-drbd_r0_Clone">
Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB
Update failures <rule role="Master" score="-INFINITY"
id="drbd-fence-by-handler-r0-rule-drbd_r0_Clone">
Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB
Update failures <expression attribute="#uname" operation="ne"
value="an-a04n02.alteeve.ca"
id="drbd-fence-by-handler-r0-expr-drbd_r0_Clone"/>
Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB
Update failures </rule>
Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB
Update failures </rsc_location>
Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB
Update failures </failed_update>
Jun 19 00:14:25 an-a04n01 cib[16890]: error: cib_process_create: CIB
Update failures </failed>
Jun 19 00:14:25 an-a04n01 cib[16890]: warning: cib_process_request:
Completed cib_create operation for section constraints: Name not unique
on network (rc=-76, origin=an-a04n02.alteeve.ca/cibadmin/2, version=0.95.1)
Jun 19 00:14:25 an-a04n01 stonith-ng[16891]: notice:
stonith_device_register: Added 'fence_n02_ipmi' to the device list (2
active devices)
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: drbd_sync_handshake:
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: self
25DF173CF8D89023:561F3328043888C0:052A1A6B59936EC5:05291A6B59936EC5
bits:0 flags:0
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: peer
561F3328043888C0:0000000000000000:052A1A6B59936EC4:05291A6B59936EC5
bits:0 flags:0
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: uuid_compare()=1 by rule 70
Jun 19 00:14:25 an-a04n01 kernel: block drbd0: peer( Unknown ->
Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( Outdated ->
Consistent )
Jun 19 00:14:25 an-a04n01 crmd[16895]: notice: process_lrm_event: LRM
operation drbd_r0_promote_0 (call=34, rc=0, cib-update=33,
confirmed=true) ok
Jun 19 00:14:26 an-a04n01 kernel: block drbd0: helper command:
/sbin/drbdadm before-resync-source minor-0
Jun 19 00:14:26 an-a04n01 kernel: block drbd0: helper command:
/sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
Jun 19 00:14:26 an-a04n01 kernel: block drbd0: conn( WFBitMapS ->
SyncSource ) pdsk( Consistent -> Inconsistent )
Jun 19 00:14:26 an-a04n01 kernel: block drbd0: Began resync as
SyncSource (will sync 0 KB [0 bits set]).
Jun 19 00:14:26 an-a04n01 kernel: block drbd0: updated sync UUID
25DF173CF8D89023:56203328043888C0:561F3328043888C0:052A1A6B59936EC5
Jun 19 00:14:26 an-a04n01 cib[16890]: notice: cib:diff: Diff: --- 0.95.2
Jun 19 00:14:26 an-a04n01 cib[16890]: notice: cib:diff: Diff: +++
0.96.1 86f147e11a7e9934f7b2a686715dcca6
Jun 19 00:14:26 an-a04n01 cib[16890]: notice: cib:diff: --
<rsc_location rsc="drbd_r0_Clone"
id="drbd-fence-by-handler-r0-drbd_r0_Clone">
Jun 19 00:14:26 an-a04n01 cib[16890]: notice: cib:diff: --
<rule role="Master" score="-INFINITY"
id="drbd-fence-by-handler-r0-rule-drbd_r0_Clone">
Jun 19 00:14:26 an-a04n01 cib[16890]: notice: cib:diff: --
<expression attribute="#uname" operation="ne"
value="an-a04n01.alteeve.ca"
id="drbd-fence-by-handler-r0-expr-drbd_r0_Clone"/>
Jun 19 00:14:26 an-a04n01 cib[16890]: notice: cib:diff: -- </rule>
Jun 19 00:14:26 an-a04n01 cib[16890]: notice: cib:diff: --
</rsc_location>
Jun 19 00:14:26 an-a04n01 cib[16890]: notice: cib:diff: ++ <cib
admin_epoch="0" cib-last-written="Thu Jun 19 00:14:26 2014"
crm_feature_set="3.0.7" epoch="96" have-quorum="1" num_updates="1"
update-client="cibadmin" update-origin="an-a04n02.alteeve.ca"
validate-with="pacemaker-1.2" dc-uuid="an-a04n01.alteeve.ca"/>
Jun 19 00:14:26 an-a04n01 stonith-ng[16891]: notice: unpack_config: On
loss of CCM Quorum: Ignore
Jun 19 00:14:26 an-a04n01 stonith-ng[16891]: notice:
stonith_device_register: Device 'fence_n01_ipmi' already existed in
device list (2 active devices)
Jun 19 00:14:26 an-a04n01 stonith-ng[16891]: notice:
stonith_device_register: Added 'fence_n02_ipmi' to the device list (2
active devices)
Jun 19 00:14:26 an-a04n01 kernel: block drbd0: Resync done (total 1 sec;
paused 0 sec; 0 K/sec)
Jun 19 00:14:26 an-a04n01 kernel: block drbd0: updated UUIDs
25DF173CF8D89023:0000000000000000:56203328043888C0:561F3328043888C0
Jun 19 00:14:26 an-a04n01 kernel: block drbd0: conn( SyncSource ->
Connected ) pdsk( Inconsistent -> UpToDate )
Jun 19 00:14:26 an-a04n01 kernel: block drbd0: bitmap WRITE of 3470
pages took 9 jiffies
Jun 19 00:14:26 an-a04n01 kernel: block drbd0: 0 KB (0 bits) marked
out-of-sync by on disk bit-map.
====
This causes Pacemaker to declare the resource failed. Eventually
crm-unfence-peer.sh is called, clearing the -INFINITY constraint, and
node 2 (an-a04n02) finally promotes to Primary. However, the drbd_r0
resource remains listed as failed.
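For completeness, clearing the failed entry by hand should just be a standard resource cleanup, sketched below, but that obviously doesn't address why the peer gets fenced on start in the first place:
====
# Cleanup sketch (stock Pacemaker 1.1 tooling):
pcs resource cleanup drbd_r0_Clone   # clear the failed-op history
crm_mon -1 -f                        # one-shot status including fail counts
====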
I have no idea why this is happening, and I am really stumped. Any help
would be much appreciated.
digimer
PS - RHEL 6.5 fully updated, DRBD 8.3.15, Pacemaker 1.1.10
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?