The shared-ip resource agents are our own custom RA - they do what we want, and the failover mechanism for the single resource works fine as far as I can tell. Using conntrackd or some other non-IP master/slave resource would hit the same sort of failure we're running into, because the constraints aren't set up correctly.
--
Sam Gardner
Trustwave | SMART SECURITY ON DEMAND
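One way to express "everything follows the DRBD master" - sketched here in pcs syntax against the resource names used below, untested, and using pairwise constraints rather than the resource sets in the actual CIB - is to make drbd.master the anchor that everything else is colocated with:

    # Tie the filesystem and both IP masters to wherever DRBD is promoted.
    # With mandatory (INFINITY) colocation, a ban on any dependent also
    # weighs on where drbd.master itself can be promoted, so a link failure
    # on either interface should pull the whole stack to the peer node.
    pcs constraint colocation add drbdfs with master drbd.master INFINITY
    pcs constraint colocation add master inside-interface-sameip.master with master drbd.master INFINITY
    pcs constraint colocation add master outside-interface-sameip.master with master drbd.master INFINITY

    # Start the filesystem only after the local DRBD instance is promoted.
    pcs constraint order promote drbd.master then start drbdfs

The set-based colocation and "Serialize" ordering discussed below would need to be removed first, or they will fight these.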
On 3/23/18, 12:42 PM, "Users on behalf of Sam Gardner" <users-boun...@clusterlabs.org on behalf of sgard...@trustwave.com> wrote:

>Thanks, Ken.
>
>I just want all master-mode resources to be running wherever DRBDFS is
>running (essentially). If the cluster detects that any of the master-mode
>resources can't run on the current node (but can run on the other, per
>ethmon), all of the other master-mode resources as well as DRBDFS should
>move over to the other node.
>
>The current set of constraints I have will let DRBDFS move to the standby
>node and "take" the Master-mode resources with it, but a Master-mode
>resource failing over to the other node won't take the other Master
>resources or DRBDFS with it.
>
>As a side note, there are other resources in play (some active/passive
>like DRBDFS, some Master/Slave like the shared-ip resources) that are
>related but not shown here - I'm just having a hard time reasoning about
>the generalized form my constraints should take to make this sort of
>thing work.
>--
>Sam Gardner
>Trustwave | SMART SECURITY ON DEMAND
>
>On 3/23/18, 12:34 PM, "Users on behalf of Ken Gaillot"
><users-boun...@clusterlabs.org on behalf of kgail...@redhat.com> wrote:
>
>>On Tue, 2018-03-20 at 16:34 +0000, Sam Gardner wrote:
>>> Hi All -
>>>
>>> I've implemented a simple two-node cluster with DRBD and a couple of
>>> network-based Master/Slave resources.
>>>
>>> Using the ethmonitor RA, I set up failover whenever the Master/Primary
>>> node loses link on the specified ethernet device, by constraining the
>>> Master role to nodes where the ethmon attribute is "1".
>>>
>>> Something is going wrong with my colocation constraints, however. If
>>> I set up the DRBDFS resource to monitor link on eth1, unplugging eth1
>>> on the Primary node causes a failover as expected - all Master
>>> resources are demoted to "slave" and promoted on the opposite node,
>>> and the "normal" DRBDFS moves to the other node as well.
>>>
>>> However, if I put the same ethmonitor constraint on a network-based
>>> Master/Slave resource instead, only that specific resource fails over -
>>> DRBDFS stays in the same location (though it stops), as do the other
>>> Master/Slave resources.
>>>
>>> This *smells* like a constraints issue to me - does anyone know what
>>> I might be doing wrong?
>>>
>>> PCS before:
>>> Cluster name: node1.hostname.com_node2.hostname.com
>>> Stack: corosync
>>> Current DC: node2.hostname.com_0 (version 1.1.16-12.el7_4.4-94ff4df)
>>> - partition with quorum
>>> Last updated: Tue Mar 20 16:25:47 2018
>>> Last change: Tue Mar 20 16:00:33 2018 by hacluster via crmd on
>>> node2.hostname.com_0
>>>
>>> 2 nodes configured
>>> 11 resources configured
>>>
>>> Online: [ node1.hostname.com_0 node2.hostname.com_0 ]
>>>
>>> Full list of resources:
>>>
>>> Master/Slave Set: drbd.master [drbd.slave]
>>>     Masters: [ node1.hostname.com_0 ]
>>>     Slaves: [ node2.hostname.com_0 ]
>>> drbdfs (ocf::heartbeat:Filesystem): Started node1.hostname.com_0
>>> Master/Slave Set: inside-interface-sameip.master [inside-interface-sameip.slave]
>>>     Masters: [ node1.hostname.com_0 ]
>>>     Slaves: [ node2.hostname.com_0 ]
>>> Master/Slave Set: outside-interface-sameip.master [outside-interface-sameip.slave]
>>>     Masters: [ node1.hostname.com_0 ]
>>>     Slaves: [ node2.hostname.com_0 ]
>>> Clone Set: monitor-eth1-clone [monitor-eth1]
>>>     Started: [ node1.hostname.com_0 node2.hostname.com_0 ]
>>> Clone Set: monitor-eth2-clone [monitor-eth2]
>>>     Started: [ node1.hostname.com_0 node2.hostname.com_0 ]
>>
>>What agent are the two IP resources using? I'm not familiar with any IP
>>resource agents that are master/slave clones.
>>
>>> Daemon Status:
>>>   corosync: active/enabled
>>>   pacemaker: active/enabled
>>>   pcsd: inactive/disabled
>>>
>>> PCS after:
>>> Cluster name: node1.hostname.com_node2.hostname.com
>>> Stack: corosync
>>> Current DC: node2.hostname.com_0 (version 1.1.16-12.el7_4.4-94ff4df)
>>> - partition with quorum
>>> Last updated: Tue Mar 20 16:29:40 2018
>>> Last change: Tue Mar 20 16:00:33 2018 by hacluster via crmd on
>>> node2.hostname.com_0
>>>
>>> 2 nodes configured
>>> 11 resources configured
>>>
>>> Online: [ node1.hostname.com_0 node2.hostname.com_0 ]
>>>
>>> Full list of resources:
>>>
>>> Master/Slave Set: drbd.master [drbd.slave]
>>>     Masters: [ node1.hostname.com_0 ]
>>>     Slaves: [ node2.hostname.com_0 ]
>>> drbdfs (ocf::heartbeat:Filesystem): Stopped
>>> Master/Slave Set: inside-interface-sameip.master [inside-interface-sameip.slave]
>>>     Masters: [ node2.hostname.com_0 ]
>>>     Stopped: [ node1.hostname.com_0 ]
>>> Master/Slave Set: outside-interface-sameip.master [outside-interface-sameip.slave]
>>>     Masters: [ node1.hostname.com_0 ]
>>>     Slaves: [ node2.hostname.com_0 ]
>>> Clone Set: monitor-eth1-clone [monitor-eth1]
>>>     Started: [ node1.hostname.com_0 node2.hostname.com_0 ]
>>> Clone Set: monitor-eth2-clone [monitor-eth2]
>>>     Started: [ node1.hostname.com_0 node2.hostname.com_0 ]
>>>
>>> Daemon Status:
>>>   corosync: active/enabled
>>>   pacemaker: active/enabled
>>>   pcsd: inactive/disabled
>>>
>>> This is the "constraints" section of my CIB (full CIB is attached):
>>> <rsc_colocation
>>>     id="pcs_rsc_colocation_set_drbdfs_set_drbd.master_inside-interface-sameip.master_outside-interface-sameip.master"
>>>     score="INFINITY">
>>>   <resource_set id="pcs_rsc_set_drbdfs" sequential="false">
>>>     <resource_ref id="drbdfs"/>
>>>   </resource_set>
>>>   <resource_set
>>>       id="pcs_rsc_set_drbd.master_inside-interface-sameip.master_outside-interface-sameip.master"
>>>       role="Master" sequential="false">
>>>     <resource_ref id="drbd.master"/>
<resource_ref id="inside-interface-sameip.master"/> >>> <resource_ref id="outside-interface-sameip.master"/> >>> </resource_set> >>> </rsc_colocation> >> >>Resource sets can be confusing in the best of cases. >> >>The above constraint says: Place drbdfs only on a node where the master >>instances of drbd.master and the two IPs are running (without any >>dependencies between those resources). >> >>This explains why the master instances can run on different nodes, and >>why drbdfs was stopped when they did. >> >>> <rsc_order id="pcs_rsc_order_set_drbd.master_inside-interface- >>> sameip.master_outside-interface-sameip.master_set_drbdfs" >>> kind="Serialize" symmetrical="false"> >>> <resource_set action="promote" >>> id="pcs_rsc_set_drbd.master_inside-interface-sameip.master_outside- >>> interface-sameip.master-1" role="Master"> >>> <resource_ref id="drbd.master"/> >>> <resource_ref id="inside-interface-sameip.master"/> >>> <resource_ref id="outside-interface-sameip.master"/> >>> </resource_set> >>> <resource_set id="pcs_rsc_set_drbdfs-1"> >>> <resource_ref id="drbdfs"/> >>> </resource_set> >>> </rsc_order> >> >>The above constraint says: if promoting any of drbd.master and the two >>interfaces and/or starting drbdfs, do each action one at a time (in any >>order). Other actions (including demoting and stopping) can happen in >>any order. >> >>> <rsc_location id="location-inside-interface-sameip.master" >>> rsc="inside-interface-sameip.master"> >>> <rule id="location-inside-interface-sameip.master-rule" >>> score="-INFINITY"> >>> <expression attribute="ethmon_result-eth1" id="location- >>> inside-interface-sameip.master-rule-expr" operation="ne" value="1"/> >>> </rule> >>> </rsc_location> >>> <rsc_location id="location-outside-interface-sameip.master" >>> rsc="outside-interface-sameip.master"> >>> <rule id="location-outside-interface-sameip.master-rule" >>> score="-INFINITY"> >>> <expression attribute="ethmon_result-eth2" id="location- >>> outside-interface-sameip.master-rule-expr" operation="ne" value="1"/> >>> </rule> >>> </rsc_location> >> >>The above constraints keep inside-interface on a node where eth1 is >>good, and outside-interface on a node where eth2 is good. >> >>I'm guessing you want to keep these two constraints, and start over >>from scratch on the others. What are your intended relationships >>between the various resources? 
>>
>>> </constraints>
>>> --
>>> Sam Gardner
>>> Trustwave | SMART SECURITY ON DEMAND
>>--
>>Ken Gaillot <kgail...@redhat.com>

_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
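However the constraints end up being rebuilt, it is worth checking how the scores actually combine before pulling a cable. A possible sanity check with standard Pacemaker/pcs tooling, run on either node:

    # Dump the full constraint configuration as the cluster sees it
    pcs constraint show --full

    # Show the allocation scores computed from the live CIB; a -INFINITY
    # ban on one master that fails to propagate to drbd.master will be
    # visible here before any real failover is attempted
    crm_simulate --live-check --show-scores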