Hello,
I have configured a drbd0 resource (nfsdata) in pacemaker as active/passive, using the
linbit resource agent in a master/slave configuration. It has worked fine in the
various operations I have tried through pacemaker.

Now I'm going to test ocfs2 on both nodes, using another drbd resource, drbd1
(ocfs2data), that I have created for this purpose.
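
For reference, the drbd side of ocfs2data is set up for dual-primary; in /etc/drbd.conf
it looks roughly like this (the disk and address values below are just placeholders, not
my real ones; the allow-two-primaries part is the relevant bit):

resource ocfs2data {
  protocol C;                        # dual-primary requires protocol C
  net {
    allow-two-primaries;             # both nodes may be Primary at the same time
  }
  startup {
    become-primary-on both;          # optional; only used while testing by hand
  }
  on ha1 {
    device    /dev/drbd1;
    disk      /dev/sdb1;             # placeholder backing device
    address   192.168.101.51:7790;   # placeholder address/port
    meta-disk internal;
  }
  on ha2 {
    device    /dev/drbd1;
    disk      /dev/sdb1;             # placeholder backing device
    address   192.168.101.52:7790;   # placeholder address/port
    meta-disk internal;
  }
}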

The drbd environment seems ok on both nodes, with the ocfs2 filesystem mounted on both,
but the drbd0 pacemaker resource has failed:

[r...@ha1 ~]# cat /proc/drbd
version: 8.3.6 (api:88/proto:86-91)
GIT-hash: f3606c47cc6fcf6b3f086e425cb34af8b7a81bbf build by r...@ha1, 2010-04-28 09:01:04
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:3128688 nr:8 dw:1027576 dr:2586722 al:524 bm:130 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r----
    ns:1280472 nr:3146699 dw:4427171 dr:28918 al:343 bm:129 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0

[r...@ha1 ~]# crm_mon -1
============
Last updated: Fri Apr 30 09:46:28 2010
Stack: openais
Current DC: ha1 - partition with quorum
Version: 1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7
2 Nodes configured, 2 expected votes
3 Resources configured.
============

Online: [ ha1 ha2 ]

 Master/Slave Set: NfsData
     nfsdrbd:0 (ocf::linbit:drbd): Slave ha1 (unmanaged) FAILED
     nfsdrbd:1 (ocf::linbit:drbd): Slave ha2 (unmanaged) FAILED

Failed actions:
    nfsdrbd:0_demote_0 (node=ha1, call=590, rc=5, status=complete): not installed
    nfsdrbd:0_stop_0 (node=ha1, call=593, rc=5, status=complete): not installed
    nfsdrbd:1_monitor_60000 (node=ha2, call=33, rc=5, status=complete): not installed
    nfsdrbd:1_stop_0 (node=ha2, call=38, rc=5, status=complete): not installed
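
Since all the failed operations return rc=5 ("not installed"), I guess a next step is to
run the linbit RA's monitor action by hand on ha1 and see what it complains about;
something along these lines (paths assuming the stock resource-agents layout, and the
master/slave meta attributes set by hand because the RA probably expects them):

export OCF_ROOT=/usr/lib/ocf
export OCF_RESKEY_drbd_resource=nfsdata
# probably needed: the RA expects to be run as a master/slave clone
export OCF_RESKEY_CRM_meta_clone_max=2
export OCF_RESKEY_CRM_meta_master_max=1
/usr/lib/ocf/resource.d/linbit/drbd monitor ; echo $?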


drbd1 is not mentioned (yet) in my pacemaker config, but I noticed these errors in
/var/log/messages, which seem to say that pacemaker is trying to take care of drbd1
too...

Apr 29 17:58:25 ha1 pengine: [1616]: notice: unpack_rsc_op: Hard error - nfsdrbd:1_monitor_60000 failed with rc=5: Preventing NfsData from re-starting on ha2
Apr 29 17:58:25 ha1 pengine: [1616]: WARN: unpack_rsc_op: Processing failed op nfsdrbd:1_monitor_60000 on ha2: not installed (5)
Apr 29 17:58:25 ha1 pengine: [1616]: notice: native_print: SitoWeb (ocf::heartbeat:apache):        Started ha1
Apr 29 17:58:25 ha1 pengine: [1616]: notice: clone_print:  Master/Slave Set: NfsData
Apr 29 17:58:25 ha1 pengine: [1616]: notice: native_print:      nfsdrbd:1 (ocf::linbit:drbd):     Slave ha2 FAILED
Apr 29 17:58:25 ha1 pengine: [1616]: notice: short_print:      Masters: [ ha1 ]
Apr 29 17:58:25 ha1 pengine: [1616]: notice: group_print:  Resource Group: nfs-group
Apr 29 17:58:25 ha1 pengine: [1616]: notice: native_print:      ClusterIP (ocf::heartbeat:IPaddr2):       Started ha1
Apr 29 17:58:25 ha1 pengine: [1616]: notice: native_print:      lv_drbd0 (ocf::heartbeat:LVM):   Started ha1
Apr 29 17:58:25 ha1 pengine: [1616]: notice: native_print:      NfsFS (ocf::heartbeat:Filesystem):    Started ha1
Apr 29 17:58:25 ha1 pengine: [1616]: info: get_failcount: NfsData has failed 1 times on ha2

As I didn't see anything on the linbit site about caveats with having some drbd
resources managed by pacemaker and others not, I presumed this is possible.
Is that correct?
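
In the meantime, I suppose that once the cause is fixed I can clear the failed actions
and let pacemaker retry with something like:

crm resource cleanup NfsData

but first I'd like to be sure the mixed setup itself is supposed to work.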

My current config is below (btw: the resource names contain "nfs", but there is no nfs
server inside at the moment...):


[r...@ha1 ~]# crm configure show
node ha1 \
        attributes standby="off"
node ha2 \
        attributes standby="off"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.101.53" cidr_netmask="32" \
        op monitor interval="30s"
primitive NfsFS ocf:heartbeat:Filesystem \
        params device="/dev/vg_drbd0/lv_drbd0" directory="/nfsdata" fstype="ext3" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="60"
primitive SitoWeb ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min" \
        op start interval="0" timeout="40" \
        op stop interval="0" timeout="60"
primitive lv_drbd0 ocf:heartbeat:LVM \
        params volgrpname="vg_drbd0" exclusive="yes" \
        op monitor interval="10" timeout="30" depth="0" \
        op start interval="0" timeout="30" \
        op stop interval="0" timeout="30"
primitive nfsdrbd ocf:linbit:drbd \
        params drbd_resource="nfsdata" \
        op monitor interval="60s" \
        op start interval="0" timeout="240" \
        op stop interval="0" timeout="100"
group nfs-group ClusterIP lv_drbd0 NfsFS
ms NfsData nfsdrbd \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
location prefer-ha1 SitoWeb 50: ha1
colocation nfs_on_drbd0 inf: nfs-group NfsData:Master
colocation website-with-ip inf: SitoWeb nfs-group
order NfsFS-after-NfsData inf: NfsData:promote nfs-group:start
order apache-after-ip inf: nfs-group SitoWeb
property $id="cib-bootstrap-options" \
        dc-version="1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
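
In case it is useful, I suppose a quick sanity check that /etc/drbd.conf is still ok
after adding ocfs2data is to ask drbdadm directly on both nodes, e.g.:

drbdadm dump nfsdata     # prints the parsed resource, or complains if the config is broken
drbdadm role nfsdata     # expected: Primary/Secondary on ha1, Secondary/Primary on ha2
drbdadm role ocfs2data   # expected: Primary/Primary on both nodes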

Thanks,
Gianluca