Thanks Marian and Dejan! I did these steps for fail back:
# crm resource meta SS0 delete target-role
# crm resource demote ms-SS0
# crm resource promote ms-SS0

I noticed that if I type too fast between the "demote" and the
"promote", then ms-SS0 ends up going MASTER->SLAVE->MASTER on the same
node. I do not know whether this is expected. Am I using the correct
command sequence?

Also, if I remove the "target-role" from the primitive permanently, it
appears that the master resources are not balanced across the cluster.
Is this expected?

Thanks very much!

Bob

----- Original Message ----
From: Dejan Muhamedagic <[email protected]>
To: General Linux-HA mailing list <[email protected]>
Sent: Thu, February 18, 2010 7:28:29 AM
Subject: Re: [Linux-HA] Command line option to fail back a master/slave resource

Hi Marian,

On Thu, Feb 18, 2010 at 03:18:59PM +0100, Dejan Muhamedagic wrote:
> Hi,
>
> On Thu, Feb 18, 2010 at 03:50:28PM +0200, Marian Marinov wrote:
> > On Thursday 18 February 2010 15:23:13 Dejan Muhamedagic wrote:
> > > Hi,
> > >
> > > On Thu, Feb 18, 2010 at 03:14:05PM +0200, Marian Marinov wrote:
> > > > I had an almost identical problem.
> > > >
> > > > I'm currently working on a solution for this problem. I hope to
> > > > have that part finished next week, and I'll then file the
> > > > enhancement bugzilla for the feature.
> > >
> > > You mean you're preparing a patch?
> >
> > Yup. I already asked you what the right way of handling the
> > situation is, so I decided to fix it.
>
> Great! This will be more or less the first patch for crm from
> the community.

There is already an enhancement bugzilla for this:

http://developerbugs.linux-foundation.org/show_bug.cgi?id=2315

Probably the most straightforward way is to use xpath (I think that's
what it's called). Note that the shell shouldn't create target-roles
indiscriminately; it is probably better to remove all of those beneath
the top-level resource (clone/group).
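[Editor's note] To make the xpath suggestion concrete, here is a minimal,
hypothetical Python sketch (the crm shell is itself written in Python) of
removing target-role meta attributes from resources nested beneath a
top-level clone/master, while leaving the top-level element's own
target-role alone. The CIB fragment and the function name are invented
for illustration, with ids modelled on Bob's configuration; this is not
the actual patch.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical CIB fragment: a master with its own target-role
# plus a duplicate on the cloned primitive (the ambiguity crm complains about).
CIB = """\
<resources>
  <master id="ms-SS0">
    <meta_attributes id="ms-SS0-meta_attributes">
      <nvpair id="ms-SS0-meta_attributes-target-role"
              name="target-role" value="Started"/>
    </meta_attributes>
    <primitive id="SS0" class="ocf" provider="omneon" type="ss">
      <meta_attributes id="SS0-meta_attributes">
        <nvpair id="SS0-meta_attributes-target-role"
                name="target-role" value="Started"/>
      </meta_attributes>
    </primitive>
  </master>
</resources>
"""

def strip_nested_target_roles(resources):
    """Delete target-role nvpairs from resources *beneath* each
    top-level clone/master/group, keeping the top-level one."""
    for top in resources:
        for child in list(top):
            if child.tag not in ("primitive", "group"):
                continue  # skip the top-level element's own meta_attributes
            for meta in child.iter("meta_attributes"):
                for nv in meta.findall("nvpair[@name='target-role']"):
                    meta.remove(nv)

root = ET.fromstring(CIB)
strip_nested_target_roles(root)
remaining = [nv.get("id")
             for nv in root.findall(".//nvpair[@name='target-role']")]
print(remaining)  # only the ms-level attribute survives
```

ElementTree's limited XPath subset is enough for this shape of query;
the real shell would of course operate on the live CIB rather than a
string.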
Thanks,

Dejan

> Cheers,
>
> Dejan
>
> > But currently I'm still testing the replication awareness of the
> > mysql RA. After I finish with it I'll continue with the crm.
> >
> > Regards,
> > Marian
> >
> > > > For now, what you can do is remove the target-role meta
> > > > attribute from the resources you want to promote, using:
> > > >
> > > > crm> resource meta RESOURCE_PRIMITIVE delete target-role
> > >
> > > Oh, completely forgot that it can be done this way too. Thanks
> > > for mentioning it.
> > >
> > > Dejan
> > >
> > > > You have to remove the target-role only from the local
> > > > primitive, not from the clone.
> > > >
> > > > This is how we deal with the problem.
> > > >
> > > > Regards,
> > > > Marian
> > > >
> > > > On Thursday 18 February 2010 14:56:03 Dejan Muhamedagic wrote:
> > > > > Hi,
> > > > >
> > > > > On Tue, Feb 16, 2010 at 08:01:10PM -0800, Bob Schatz wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I have configured 8 master/slave resources on two virtual
> > > > > > machines named fc12-64-1 and fc12-64-2.
> > > > > >
> > > > > > They are running Fedora 12 (64-bit) with the version of
> > > > > > pacemaker from a "yum install pacemaker", i.e.:
> > > > > >
> > > > > > Name    : heartbeat
> > > > > > Version : 3.0.0
> > > > > >
> > > > > > Name    : pacemaker
> > > > > > Version : 1.0.5
> > > > > >
> > > > > > Name    : pacemaker-libs
> > > > > > Version : 1.0.5
> > > > > >
> > > > > > Name    : cluster-glue
> > > > > > Version : 1.0
> > > > > >
> > > > > > Name    : cluster-glue-libs
> > > > > > Version : 1.0
> > > > > >
> > > > > > My requirements are:
> > > > > >
> > > > > > 1. Four resources start as master on each node, and the
> > > > > >    slave for each resource starts on the other node, i.e.
> > > > > >    SS0 (master) on fc12-64-1
> > > > > >    SS0 (slave) on fc12-64-2
> > > > > >
> > > > > > 2. After a failover (power off one node) I have all 8
> > > > > >    resources running as master on one node.
> > > > > > 3. I do not want automatic failback when a node comes
> > > > > >    back. I only want it to occur under operator control.
> > > > > >
> > > > > > 4. If the process associated with each resource dies, it
> > > > > >    will be restarted as a slave and the other node will
> > > > > >    convert its process to master.
> > > > > >
> > > > > > My problem:
> > > > > >
> > > > > > I am able to start both nodes and have four masters on each
> > > > > > node with a slave on the second node. Also, failover works
> > > > > > as expected - all 8 resources are master on the remaining
> > > > > > node if one node dies, and if I have a process death only
> > > > > > that one resource fails over.
> > > > > >
> > > > > > However, I am not sure how I can cause a fail back to occur
> > > > > > from the command line (requirement #3).
> > > > > >
> > > > > > I start with this to figure out how to do a failback:
> > > > > >
> > > > > > # crm_mon -n
> > > > > >
> > > > > > ============
> > > > > > Last updated: Tue Feb 16 19:44:12 2010
> > > > > > Stack: Heartbeat
> > > > > > Current DC: fc12-64-1 (d7b30d08-d835-4014-b9c6-ebf53099cbe3) - partition with quorum
> > > > > > Version: 1.0.5-ee19d8e83c2a5d45988f1cee36d334a631d84fc7
> > > > > > 2 Nodes configured, unknown expected votes
> > > > > > 8 Resources configured.
> > > > > > ============
> > > > > >
> > > > > > Node fc12-64-1 (d7b30d08-d835-4014-b9c6-ebf53099cbe3): online
> > > > > >     SS6:0 (ocf::omneon:ss) Master
> > > > > >     SS3:0 (ocf::omneon:ss) Master
> > > > > >     SS7:0 (ocf::omneon:ss) Master
> > > > > >     SS0:0 (ocf::omneon:ss) Master
> > > > > >     SS4:0 (ocf::omneon:ss) Master
> > > > > >     SS1:0 (ocf::omneon:ss) Master
> > > > > >     SS5:0 (ocf::omneon:ss) Master
> > > > > >     SS2:0 (ocf::omneon:ss) Master
> > > > > > Node fc12-64-2 (b69df3a6-a630-4edb-adf4-28727f8c1222): online
> > > > > >     SS0:1 (ocf::omneon:ss) Slave
> > > > > >     SS2:1 (ocf::omneon:ss) Slave
> > > > > >     SS1:1 (ocf::omneon:ss) Slave
> > > > > >     SS3:1 (ocf::omneon:ss) Slave
> > > > > >     SS5:1 (ocf::omneon:ss) Slave
> > > > > >     SS4:1 (ocf::omneon:ss) Slave
> > > > > >     SS7:1 (ocf::omneon:ss) Slave
> > > > > >     SS6:1 (ocf::omneon:ss) Slave
> > > > > >
> > > > > > And I tried these steps to do a failback:
> > > > > >
> > > > > > # crm resource migrate ms-SS0 fc12-64-2
> > > > > > Error performing operation: ms-SS0 is already active on fc12-64-2
> > > > > >
> > > > > > ====> which makes sense since it is a "slave" on fc12-64-2
> > > > > >
> > > > > > # crm resource
> > > > > > crm(live)resource# promote ms-SS0:1
> > > > > > ERROR: ms-SS0:1 is not a master-slave resource
> > > > > > crm(live)resource# promote SS0:1
> > > > > > ERROR: SS0:1 is not a master-slave resource
> > > > > > crm(live)resource# promote SS0
> > > > > > ERROR: SS0 is not a master-slave resource
> > > > > > crm(live)resource# promote ms-SS0
> > > > > > Multiple attributes match name=target-role
> > > > > > Value: Started (id=ms-SS0-meta_attributes-target-role)
> > > > > > Value: Started (id=SS0-meta_attributes-target-role)
> > > > > > Error performing operation: Required data for this CIB API call not found
> > > > > > crm(live)resource# demote ms-SS0
> > > > > > Multiple attributes match name=target-role
> > > > > > Value: Started (id=ms-SS0-meta_attributes-target-role)
> > > > > > Value: Started (id=SS0-meta_attributes-target-role)
> > > > > > Error performing operation: Required data for this CIB API call not found
> > > > > > crm(live)resource#
> > > > > >
> > > > > > Since I have a location constraint for ms-SS0 to fc12-64-1,
> > > > > > I tried the same operation using ms-SS1, which has a
> > > > > > location constraint for fc12-64-2.
> > > > > >
> > > > > > This gave me the same messages, i.e.:
> > > > > >
> > > > > > crm(live)resource# promote ms-SS1
> > > > > > Multiple attributes match name=target-role
> > > > > > Value: Started (id=ms-SS1-meta_attributes-target-role)
> > > > > > Value: Started (id=SS1-meta_attributes-target-role)
> > > > > > Error performing operation: Required data for this CIB API call not found
> > > > > > crm(live)resource# demote ms-SS1
> > > > > > Multiple attributes match name=target-role
> > > > > > Value: Started (id=ms-SS1-meta_attributes-target-role)
> > > > > > Value: Started (id=SS1-meta_attributes-target-role)
> > > > > > Error performing operation: Required data for this CIB API call not found
> > > > > >
> > > > > > Could you tell me what I am doing wrong?
> > > > >
> > > > > There are multiple target-roles, i.e. one for the clone and
> > > > > one for the resource which is cloned. The tools can't decide
> > > > > which one to change. The crm shell will try to deal with that
> > > > > in the next Pacemaker release (I think there's already an
> > > > > enhancement bugzilla open). In the meantime, you can edit
> > > > > your configuration and remove the meta attributes from the
> > > > > primitives.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Dejan
> > > > >
> > > > > > My configuration file is attached below.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Bob
> > > > > >
> > > > > > --------------------------- my configuration ---------------------------
> > > > > > node $id="b69df3a6-a630-4edb-adf4-28727f8c1222" fc12-64-2
> > > > > > node $id="d7b30d08-d835-4014-b9c6-ebf53099cbe3" fc12-64-1
> > > > > > primitive SS0 ocf:omneon:ss \
> > > > > >     params ss_resource="SS0" \
> > > > > >     params ssconf="/tmp/config.0" \
> > > > > >     op monitor interval="59s" role="Master" timeout="30s" \
> > > > > >     op monitor interval="60s" role="Slave" timeout="28" \
> > > > > >     meta target-role="Started"
> > > > > > primitive SS1 ocf:omneon:ss \
> > > > > >     params ss_resource="SS1" \
> > > > > >     params ssconf="/tmp/config.1" \
> > > > > >     op monitor interval="59s" role="Master" timeout="30s" \
> > > > > >     op monitor interval="60s" role="Slave" timeout="28" \
> > > > > >     meta target-role="Started"
> > > > > > primitive SS2 ocf:omneon:ss \
> > > > > >     params ss_resource="SS2" \
> > > > > >     params ssconf="/tmp/config.2" \
> > > > > >     op monitor interval="59s" role="Master" timeout="30s" \
> > > > > >     op monitor interval="60s" role="Slave" timeout="28" \
> > > > > >     meta target-role="Started"
> > > > > > primitive SS3 ocf:omneon:ss \
> > > > > >     params ss_resource="SS3" \
> > > > > >     params ssconf="/tmp/config.3" \
> > > > > >     op monitor interval="59s" role="Master" timeout="30s" \
> > > > > >     op monitor interval="60s" role="Slave" timeout="28" \
> > > > > >     meta target-role="Started"
> > > > > > primitive SS4 ocf:omneon:ss \
> > > > > >     params ss_resource="SS4" \
> > > > > >     params ssconf="/tmp/config.4" \
> > > > > >     op monitor interval="59s" role="Master" timeout="30s" \
> > > > > >     op monitor interval="60s" role="Slave" timeout="28" \
> > > > > >     meta target-role="Started"
> > > > > > primitive SS5 ocf:omneon:ss \
> > > > > >     params ss_resource="SS5" \
> > > > > >     params ssconf="/tmp/config.5" \
> > > > > >     op monitor interval="59s" role="Master" timeout="30s" \
> > > > > >     op monitor interval="60s" role="Slave" timeout="28" \
> > > > > >     meta target-role="Started"
> > > > > > primitive SS6 ocf:omneon:ss \
> > > > > >     params ss_resource="SS6" \
> > > > > >     params ssconf="/tmp/config.6" \
> > > > > >     op monitor interval="59s" role="Master" timeout="30s" \
> > > > > >     op monitor interval="60s" role="Slave" timeout="28" \
> > > > > >     meta target-role="Started"
> > > > > > primitive SS7 ocf:omneon:ss \
> > > > > >     params ss_resource="SS7" \
> > > > > >     params ssconf="/tmp/config.7" \
> > > > > >     op monitor interval="59s" role="Master" timeout="30s" \
> > > > > >     op monitor interval="60s" role="Slave" timeout="28" \
> > > > > >     meta target-role="Started"
> > > > > > ms ms-SS0 SS0 \
> > > > > >     meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> > > > > > ms ms-SS1 SS1 \
> > > > > >     meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> > > > > > ms ms-SS2 SS2 \
> > > > > >     meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> > > > > > ms ms-SS3 SS3 \
> > > > > >     meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> > > > > > ms ms-SS4 SS4 \
> > > > > >     meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> > > > > > ms ms-SS5 SS5 \
> > > > > >     meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> > > > > > ms ms-SS6 SS6 \
> > > > > >     meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> > > > > > ms ms-SS7 SS7 \
> > > > > >     meta clone-max="2" notify="true" globally-unique="false" target-role="Started"
> > > > > > location ms-SS0-master-w1 ms-SS0 \
> > > > > >     rule $id="ms-SS0-master-w1-rule" $role="master" 100: #uname eq fc12-64-1
> > > > > > location ms-SS1-master-w1 ms-SS1 \
> > > > > >     rule $id="ms-SS1-master-w1-rule" $role="master" 100: #uname eq fc12-64-2
> > > > > > location ms-SS2-master-w1 ms-SS2 \
> > > > > >     rule $id="ms-SS2-master-w1-rule" $role="master" 100: #uname eq fc12-64-1
> > > > > > location ms-SS3-master-w1 ms-SS3 \
> > > > > >     rule $id="ms-SS3-master-w1-rule" $role="master" 100: #uname eq fc12-64-2
> > > > > > location ms-SS4-master-w1 ms-SS4 \
> > > > > >     rule $id="ms-SS4-master-w1-rule" $role="master" 100: #uname eq fc12-64-1
> > > > > > location ms-SS5-master-w1 ms-SS5 \
> > > > > >     rule $id="ms-SS5-master-w1-rule" $role="master" 100: #uname eq fc12-64-2
> > > > > > location ms-SS6-master-w1 ms-SS6 \
> > > > > >     rule $id="ms-SS6-master-w1-rule" $role="master" 100: #uname eq fc12-64-1
> > > > > > location ms-SS7-master-w1 ms-SS7 \
> > > > > >     rule $id="ms-SS7-master-w1-rule" $role="master" 100: #uname eq fc12-64-2
> > > > > > property $id="cib-bootstrap-options" \
> > > > > >     dc-version="1.0.5-ee19d8e83c2a5d45988f1cee36d334a631d84fc7" \
> > > > > >     cluster-infrastructure="Heartbeat" \
> > > > > >     stonith-enabled="false" \
> > > > > >     symmetric-cluster="true"
> > > >
> > > > _______________________________________________
> > > > Linux-HA mailing list
> > > > [email protected]
> > > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > > > See also: http://linux-ha.org/ReportingProblems
> >
> > --
> > Best regards,
> > Marian Marinov
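[Editor's note] Putting the thread's advice together, the operator-driven
fail-back for one resource against the configuration above might look like
this. This is a sketch, not a tested procedure: it assumes a live
Pacemaker 1.0 cluster, and the `sleep` is a guess at how long to let the
demote transition settle to avoid the MASTER->SLAVE->MASTER bounce Bob
observed when typing too fast.

```shell
# Sketch: manual fail-back of ms-SS0 after fc12-64-1 returns.
crm resource meta SS0 delete target-role   # remove the primitive-level duplicate first
crm resource demote ms-SS0                 # step the current master down
sleep 5                                    # assumed settle time between transitions
crm resource promote ms-SS0                # the location rule (score 100) favours fc12-64-1
```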
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
