[Pacemaker] Failing back a multi-state resource eg. DRBD

2011-03-02 Thread Dominic Malolepszy

Hi,

I'm trying to simulate various failure scenarios and work out how to 
correct each one. I have a DRBD cluster as defined below; if the primary 
fails (i.e. drbd01.test is power cycled), the secondary (drbd02.test) takes 
over successfully, so DRBD:Master now runs on drbd02.test. When node 
drbd01.test comes back up, DRBD:Master remains on drbd02.test (due to 
resource stickiness) and drbd01.test simply becomes DRBD:Slave; this is 
what I want.


Now, what command(s) would I need to run to move the master back to 
drbd01.test and make drbd02.test the new slave? The name of the 
multi-state resource is ms-drbd0; the config I am currently running is 
below.



node drbd01.test \
attributes standby="off"
node drbd02.test \
attributes standby="off"
primitive drbd0 ocf:linbit:drbd \
params drbd_resource="drbd0" \
op monitor interval="60s" \
op start interval="0" timeout="240s" \
op promote interval="0" timeout="90s" start-delay="3s" \
op demote interval="0" timeout="90s" start-delay="3s" \
op notify interval="0" timeout="90s" \
op stop interval="0" timeout="100s" \
op monitor interval="10s" role="Master" timeout="20s" 
start-delay="5s" \

op monitor interval="20s" role="Slave" timeout="20s" start-delay="5s"
primitive fs0 ocf:heartbeat:Filesystem \
params directory="/var/lib/pgsql/9.0/data" device="/dev/drbd0" fstype="ext3" \
op start interval="0" timeout="60s" start-delay="1s" \
op stop interval="0"
primitive ip ocf:heartbeat:IPaddr \
params ip="192.168.1.50" cidr_netmask="24" \
op monitor interval="10s"
primitive pgsql0 ocf:heartbeat:pgsql \
params pgctl="/usr/pgsql-9.0/bin/pg_ctl" \
params psql="/usr/pgsql-9.0/bin/psql" \
params pgdata="/var/lib/pgsql/9.0/data" \
op monitor interval="30s" timeout="30s" \
op start interval="0" timeout="120s" start_delay="1s" \
op stop interval="0" timeout="120s"
primitive ping_gateway ocf:pacemaker:ping \
params host_list="192.168.1.1" multiplier="1000" \
op monitor interval="10s" timeout="60s" \
op start interval="0" timeout="60s" \
op stop interval="0" timeout="20s"
ms ms-drbd0 drbd0 \
meta master-max="1" master-node-max="1" notify="true" clone-node-max="1" clone-max="2"
clone connectivity_check ping_gateway \
meta globally-unique="false"
location master-connected-node ms-drbd0 \
rule $id="master-connected-node-rule" $role="master" -inf: not_defined pingd or pingd lte 0
location primary_location ip 50: drbd01.test
colocation fs0-with-drbd0 inf: fs0 ms-drbd0:Master
colocation ip-with-pgsql0 inf: ip pgsql0
colocation pgsql0-with-fs0 inf: pgsql0 fs0
order fs0-after-drbd0 inf: ms-drbd0:promote fs0:start
order ip-after-pgsql0 inf: pgsql0 ip
order pgsql0-after-fs0 inf: fs0:start pgsql0
property $id="cib-bootstrap-options" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore" \
dc-version="1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"


Cheers,
Dominic.
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] Failing back a multi-state resource eg. DRBD

2011-03-04 Thread David McCurley
Are you wanting to move all the resources back or just that one resource?

I'm still learning, but one simple way I move all resources back from nodeb to 
nodea is like this:

# on nodeb
sudo crm node standby
# now services migrate to nodea
# still on nodeb
sudo crm node online

This may be a naive way to do it but it works for now :)
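If you don't want to log in to nodeb first, crm can (if I remember right)
also take the node name as an argument, so the same fail-back can be driven
from any node. A sketch, using the node names from the original post:

# from any cluster node; drbd02.test currently holds the DRBD master
sudo crm node standby drbd02.test
# resources (including the master role) move back to drbd01.test
sudo crm node online drbd02.test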

There is also a "crm resource migrate" command to migrate individual 
resources. For that, see here:

http://www.clusterlabs.org/doc/crm_cli.html
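
For a single primitive that would look roughly like this (a sketch, using
the resource names from the config above; as I understand it, migrate just
inserts a cli- location constraint and unmigrate removes it again):

# prefer drbd01.test for the ip resource
sudo crm resource migrate ip drbd01.test
# later, drop the preference again
sudo crm resource unmigrate ip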


- Original Message -
> From: "Dominic Malolepszy" 
> To: pacemaker@oss.clusterlabs.org
> Sent: Thursday, March 3, 2011 12:18:51 AM
> Subject: [Pacemaker] Failing back a multi-state resource eg. DRBD
> 
> [...]
> 
> Now, what command(s) would I need to run to move the master back to
> drbd01.test and make drbd02.test the new slave? The name of the
> multi-state resource is ms-drbd0; the config I am currently running is
> below.
> 
> [config snipped]


Re: [Pacemaker] Failing back a multi-state resource eg. DRBD

2011-03-07 Thread Dejan Muhamedagic
Hi,

On Fri, Mar 04, 2011 at 09:12:46AM -0500, David McCurley wrote:
> Are you wanting to move all the resources back or just that one resource?
> 
> I'm still learning, but one simple way I move all resources back from nodeb 
> to nodea is like this:
> 
> # on nodeb
> sudo crm node standby
> # now services migrate to nodea
> # still on nodeb
> sudo crm node online
> 
> This may be a naive way to do it but it works for now :)

Yes, that would work, though it would also move all other
resources off the standby node.

> There is also a "crm resource migrate" command to migrate individual
> resources. For that, see here:

"resource migrate" has no option to move ms resources, i.e. to make
another node the master.

What would work right now is to create a temporary location
constraint:

location tmp1 ms-drbd0 \
rule $id="tmp1-rule" $role="Master" inf: #uname eq nodea

Then, once DRBD has been promoted on nodea, just remove the
constraint:

crm configure delete tmp1
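
Put together for the config in this thread, the whole fail-back would
look something like this (a sketch; drbd01.test stands in for nodea, and
crm_mon is just one way to watch the promotion):

# add the temporary constraint (inside "crm configure", to avoid the
# shell eating the #uname expression)
crm configure
  location tmp1 ms-drbd0 \
      rule $id="tmp1-rule" $role="Master" inf: #uname eq drbd01.test
  commit
  exit
# wait until drbd01.test shows up as Master, e.g. with
crm_mon -1
# then remove the constraint so normal stickiness applies again
crm configure delete tmp1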

Obviously, we'd need to make some improvements here. "resource
migrate" uses crm_resource to insert the location constraint;
perhaps we should update it to also accept a role parameter.

Can you please make an enhancement bugzilla report so that this
doesn't get lost.

Thanks,

Dejan


Re: [Pacemaker] Failing back a multi-state resource eg. DRBD

2011-03-11 Thread Holger Teutsch
On Mon, 2011-03-07 at 14:21 +0100, Dejan Muhamedagic wrote:
> [...]
> 
> Obviously, we'd need to make some improvements here. "resource
> migrate" uses crm_resource to insert the location constraint,
> perhaps we should update it to also accept the role parameter.
> 
> Can you please make an enhancement bugzilla report so that this
> doesn't get lost.
> 
> Thanks,
> 
> Dejan

Hi Dejan,
it seems that the original author did not file the bug.
I entered it as

http://developerbugs.linux-foundation.org/show_bug.cgi?id=2567

Regards
Holger





Re: [Pacemaker] Failing back a multi-state resource eg. DRBD

2011-03-11 Thread Dejan Muhamedagic
Hi Holger,

On Fri, Mar 11, 2011 at 02:45:07PM +0100, Holger Teutsch wrote:
> On Mon, 2011-03-07 at 14:21 +0100, Dejan Muhamedagic wrote:
> > [...]
> > 
> > Can you please make an enhancement bugzilla report so that this
> > doesn't get lost.
> 
> Hi Dejan,
> it seems that the original author did not file the bug.
> I entered it as
> 
> http://developerbugs.linux-foundation.org/show_bug.cgi?id=2567

Thanks for taking care of that.

Dejan

