Hi,
I've got a problem I don't understand; maybe someone can give me a hint.

My 2-node cluster (nodes ali and baba) is configured to run mysql, an IP for mysql, and the filesystem resource (on the DRBD master) together as a GROUP. After some crash tests I ended up with the filesystem and mysql running happily on one host (ali) and the related IP on the other (baba) ... although the IP isn't really up and running there, crm_mon just SHOWS it as started. In fact it is up nowhere, neither on ali nor on baba.
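(For the record, this is roughly how I checked that the address is really gone, assuming a plain "ip addr" check is conclusive enough:

    # (roughly) what I ran on both ali and baba; the grep came back empty on each
    ip -o addr show dev eth0 | grep "XXX.XXX.XXX.224"
)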

crm_mon shows that Pacemaker tried to start it on baba, but gave up once the fail-count reached 1000000 (the migration-threshold).

Q1: Why doesn't Pacemaker put the IP on ali, where the rest of its group lives?
Q2: Why doesn't Pacemaker try to start the IP on ali after the maximum fail-count has been reached on baba?
Q3: Why does crm_mon show the IP as "started" when it is down after 1000000 tries?
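(I assume I can clear the failure state with something like the commands below, so please correct me if that isn't the right way on 1.0.9; mainly I'd like to understand the "why":

    # my guess at the usual cleanup procedure, not run yet
    crm resource failcount res_hamysql_ip show baba
    crm resource cleanup res_hamysql_ip
)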

Thanks :)


config (some parts removed):
-------------------------------
node ali
node baba

primitive res_drbd ocf:linbit:drbd \
        params drbd_resource="r0" \
        op stop interval="0" timeout="100" \
        op start interval="0" timeout="240" \
        op promote interval="0" timeout="90" \
        op demote interval="0" timeout="90" \
        op notify interval="0" timeout="90" \
        op monitor interval="40" role="Slave" timeout="20" \
        op monitor interval="20" role="Master" timeout="20"
primitive res_fs ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/drbd_mnt" fstype="ext4" \
        op monitor interval="30s"
primitive res_hamysql_ip ocf:heartbeat:IPaddr2 \
        params ip="XXX.XXX.XXX.224" nic="eth0" cidr_netmask="23" \
        op monitor interval="10s" timeout="20s" depth="0"
primitive res_mysql lsb:mysql \
        op start interval="0" timeout="15" \
        op stop interval="0" timeout="15" \
        op monitor start-delay="30" interval="15" timeout="15"

group gr_mysqlgroup res_fs res_mysql res_hamysql_ip \
        meta target-role="Started"
ms ms_drbd res_drbd \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"

colocation col_fs_on_drbd_master inf: res_fs:Started ms_drbd:Master

order ord_drbd_master_then_fs inf: ms_drbd:promote res_fs:start

property $id="cib-bootstrap-options" \
        dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
        cluster-infrastructure="openais" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        expected-quorum-votes="2" \
        last-lrm-refresh="1438857246"
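
For what it's worth, my understanding (from the docs, so this may be exactly where I'm wrong) is that the group is just shorthand for constraints roughly like the ones below, which is why I expected the IP to end up on ali together with res_fs and res_mysql:

    # how I assume gr_mysqlgroup expands internally -- my reading, not taken from the CIB
    colocation col_mysql_with_fs inf: res_mysql res_fs
    colocation col_ip_with_mysql inf: res_hamysql_ip res_mysql
    order ord_fs_then_mysql inf: res_fs res_mysql
    order ord_mysql_then_ip inf: res_mysql res_hamysql_ip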


crm_mon -rnf (some parts removed):
---------------------------------
Node ali: online
        res_fs  (ocf::heartbeat:Filesystem) Started
        res_mysql       (lsb:mysql) Started
        res_drbd:0      (ocf::linbit:drbd) Master
Node baba: online
        res_hamysql_ip  (ocf::heartbeat:IPaddr2) Started
        res_drbd:1      (ocf::linbit:drbd) Slave

Inactive resources:

Migration summary:

* Node baba:
   res_hamysql_ip: migration-threshold=1000000 fail-count=1000000

Failed actions:
res_hamysql_ip_stop_0 (node=a891vl107s, call=35, rc=1, status=complete): unknown error

corosync.log:
--------------
pengine: [1223]: WARN: should_dump_input: Ignoring requirement that res_hamysql_ip_stop_0 comeplete before gr_mysqlgroup_stopped_0: unmanaged failed resources cannot prevent shutdown

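If more information would help, I can post the allocation scores and the planned transition from the live CIB; I was going to pull them with something like this (assuming ptest is still the right tool on 1.0.9):

    # hypothetical next step, not run yet: show scores plus any config complaints
    ptest -L -s -VV
    crm_verify -L -V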

Software:
----------
corosync 1.2.1-4
pacemaker 1.0.9.1+hg15626-1
drbd8-utils 2:8.3.7-2.1
(unfortunately, upgrading is not possible at the moment)


