The shared-ip resource agents are our own custom RA - they do what we want, and the failover mechanism for the single resource works fine as far as I can tell. Using conntrackd or some other non-IP master/slave resource would hit the same sort of failure we're running into, because the constraints aren't set up correctly.
--
Sam Gardner
Trustwave | SMART SECURITY ON DEMAND
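One way to express "everything follows the DRBD master" - sketched here in pcs syntax against the resource names used below, untested, and using pairwise constraints rather than the resource sets in the actual CIB - is to make drbd.master the anchor that everything else is colocated with:

    # Tie the filesystem and both IP masters to wherever DRBD is promoted.
    # With mandatory (INFINITY) colocation, a ban on any dependent also
    # weighs on where drbd.master itself can be promoted, so a link failure
    # on either interface should pull the whole stack to the peer node.
    pcs constraint colocation add drbdfs with master drbd.master INFINITY
    pcs constraint colocation add master inside-interface-sameip.master with master drbd.master INFINITY
    pcs constraint colocation add master outside-interface-sameip.master with master drbd.master INFINITY

    # Start the filesystem only after the local DRBD instance is promoted.
    pcs constraint order promote drbd.master then start drbdfs

The set-based colocation and "Serialize" ordering discussed below would need to be removed first, or they will fight these.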
On 3/23/18, 12:42 PM, "Users on behalf of Sam Gardner" <users-boun...@clusterlabs.org on behalf of sgard...@trustwave.com> wrote:

>Thanks, Ken.
>
>I just want all master-mode resources to be running wherever DRBDFS is
>running (essentially). If the cluster detects that any of the master-mode
>resources can't run on the current node (but can run on the other, per
>ethmon), all of the other master-mode resources as well as DRBDFS should
>move over to the other node.
>
>The current set of constraints I have will let DRBDFS move to the standby
>node and "take" the Master-mode resources with it, but a Master-mode
>resource failing over to the other node won't take the other Master
>resources or DRBDFS with it.
>
>As a side note, there are other resources in play (some active/passive
>like DRBDFS, some Master/Slave like the shared-ip resources) that are
>related but not shown here - I'm just having a hard time reasoning about
>the generalized form my constraints should take to make this sort of
>thing work.
>--
>Sam Gardner
>Trustwave | SMART SECURITY ON DEMAND
>
>On 3/23/18, 12:34 PM, "Users on behalf of Ken Gaillot"
><users-boun...@clusterlabs.org on behalf of kgail...@redhat.com> wrote:
>
>>On Tue, 2018-03-20 at 16:34 +0000, Sam Gardner wrote:
>>> Hi All -
>>>
>>> I've implemented a simple two-node cluster with DRBD and a couple of
>>> network-based Master/Slave resources.
>>>
>>> Using the ethmonitor RA, I set up failover whenever the Master/Primary
>>> node loses link on the specified ethernet device, by constraining the
>>> Master role to nodes where the ethmon attribute is "1".
>>>
>>> Something is going wrong with my colocation constraints, however. If
>>> I set up the DRBDFS resource to monitor link on eth1, unplugging eth1
>>> on the Primary node causes a failover as expected - all Master
>>> resources are demoted to "slave" and promoted on the opposite node,
>>> and the "normal" DRBDFS moves to the other node as well.
>>>
>>> However, if I put the same ethmonitor constraint on a network-based
>>> Master/Slave resource instead, only that specific resource fails over -
>>> DRBDFS stays in the same location (though it stops), as do the other
>>> Master/Slave resources.
>>>
>>> This *smells* like a constraints issue to me - does anyone know what
>>> I might be doing wrong?
>>>
>>> PCS before:
>>> Cluster name: node1.hostname.com_node2.hostname.com
>>> Stack: corosync
>>> Current DC: node2.hostname.com_0 (version 1.1.16-12.el7_4.4-94ff4df)
>>> - partition with quorum
>>> Last updated: Tue Mar 20 16:25:47 2018
>>> Last change: Tue Mar 20 16:00:33 2018 by hacluster via crmd on
>>> node2.hostname.com_0
>>>
>>> 2 nodes configured
>>> 11 resources configured
>>>
>>> Online: [ node1.hostname.com_0 node2.hostname.com_0 ]
>>>
>>> Full list of resources:
>>>
>>> Master/Slave Set: drbd.master [drbd.slave]
>>>     Masters: [ node1.hostname.com_0 ]
>>>     Slaves: [ node2.hostname.com_0 ]
>>> drbdfs (ocf::heartbeat:Filesystem): Started node1.hostname.com_0
>>> Master/Slave Set: inside-interface-sameip.master [inside-interface-sameip.slave]
>>>     Masters: [ node1.hostname.com_0 ]
>>>     Slaves: [ node2.hostname.com_0 ]
>>> Master/Slave Set: outside-interface-sameip.master [outside-interface-sameip.slave]
>>>     Masters: [ node1.hostname.com_0 ]
>>>     Slaves: [ node2.hostname.com_0 ]
>>> Clone Set: monitor-eth1-clone [monitor-eth1]
>>>     Started: [ node1.hostname.com_0 node2.hostname.com_0 ]
>>> Clone Set: monitor-eth2-clone [monitor-eth2]
>>>     Started: [ node1.hostname.com_0 node2.hostname.com_0 ]
>>
>>What agent are the two IP resources using? I'm not familiar with any IP
>>resource agents that are master/slave clones.
>>
>>> Daemon Status:
>>>   corosync: active/enabled
>>>   pacemaker: active/enabled
>>>   pcsd: inactive/disabled
>>>
>>> PCS after:
>>> Cluster name: node1.hostname.com_node2.hostname.com
>>> Stack: corosync
>>> Current DC: node2.hostname.com_0 (version 1.1.16-12.el7_4.4-94ff4df)
>>> - partition with quorum
>>> Last updated: Tue Mar 20 16:29:40 2018
>>> Last change: Tue Mar 20 16:00:33 2018 by hacluster via crmd on
>>> node2.hostname.com_0
>>>
>>> 2 nodes configured
>>> 11 resources configured
>>>
>>> Online: [ node1.hostname.com_0 node2.hostname.com_0 ]
>>>
>>> Full list of resources:
>>>
>>> Master/Slave Set: drbd.master [drbd.slave]
>>>     Masters: [ node1.hostname.com_0 ]
>>>     Slaves: [ node2.hostname.com_0 ]
>>> drbdfs (ocf::heartbeat:Filesystem): Stopped
>>> Master/Slave Set: inside-interface-sameip.master [inside-interface-sameip.slave]
>>>     Masters: [ node2.hostname.com_0 ]
>>>     Stopped: [ node1.hostname.com_0 ]
>>> Master/Slave Set: outside-interface-sameip.master [outside-interface-sameip.slave]
>>>     Masters: [ node1.hostname.com_0 ]
>>>     Slaves: [ node2.hostname.com_0 ]
>>> Clone Set: monitor-eth1-clone [monitor-eth1]
>>>     Started: [ node1.hostname.com_0 node2.hostname.com_0 ]
>>> Clone Set: monitor-eth2-clone [monitor-eth2]
>>>     Started: [ node1.hostname.com_0 node2.hostname.com_0 ]
>>>
>>> Daemon Status:
>>>   corosync: active/enabled
>>>   pacemaker: active/enabled
>>>   pcsd: inactive/disabled
>>>
>>> This is the "constraints" section of my CIB (full CIB is attached):
>>> <rsc_colocation
>>>     id="pcs_rsc_colocation_set_drbdfs_set_drbd.master_inside-interface-sameip.master_outside-interface-sameip.master"
>>>     score="INFINITY">
>>>   <resource_set id="pcs_rsc_set_drbdfs" sequential="false">
>>>     <resource_ref id="drbdfs"/>
>>>   </resource_set>
>>>   <resource_set
>>>       id="pcs_rsc_set_drbd.master_inside-interface-sameip.master_outside-interface-sameip.master"
>>>       role="Master" sequential="false">
>>>     <resource_ref id="drbd.master"/>
<resource_ref id="inside-interface-sameip.master"/> >>> <resource_ref id="outside-interface-sameip.master"/> >>> </resource_set> >>> </rsc_colocation> >> >>Resource sets can be confusing in the best of cases. >> >>The above constraint says: Place drbdfs only on a node where the master >>instances of drbd.master and the two IPs are running (without any >>dependencies between those resources). >> >>This explains why the master instances can run on different nodes, and >>why drbdfs was stopped when they did. >> >>> <rsc_order id="pcs_rsc_order_set_drbd.master_inside-interface- >>> sameip.master_outside-interface-sameip.master_set_drbdfs" >>> kind="Serialize" symmetrical="false"> >>> <resource_set action="promote" >>> id="pcs_rsc_set_drbd.master_inside-interface-sameip.master_outside- >>> interface-sameip.master-1" role="Master"> >>> <resource_ref id="drbd.master"/> >>> <resource_ref id="inside-interface-sameip.master"/> >>> <resource_ref id="outside-interface-sameip.master"/> >>> </resource_set> >>> <resource_set id="pcs_rsc_set_drbdfs-1"> >>> <resource_ref id="drbdfs"/> >>> </resource_set> >>> </rsc_order> >> >>The above constraint says: if promoting any of drbd.master and the two >>interfaces and/or starting drbdfs, do each action one at a time (in any >>order). Other actions (including demoting and stopping) can happen in >>any order. >> >>> <rsc_location id="location-inside-interface-sameip.master" >>> rsc="inside-interface-sameip.master"> >>> <rule id="location-inside-interface-sameip.master-rule" >>> score="-INFINITY"> >>> <expression attribute="ethmon_result-eth1" id="location- >>> inside-interface-sameip.master-rule-expr" operation="ne" value="1"/> >>> </rule> >>> </rsc_location> >>> <rsc_location id="location-outside-interface-sameip.master" >>> rsc="outside-interface-sameip.master"> >>> <rule id="location-outside-interface-sameip.master-rule" >>> score="-INFINITY"> >>> <expression attribute="ethmon_result-eth2" id="location- >>> outside-interface-sameip.master-rule-expr" operation="ne" value="1"/> >>> </rule> >>> </rsc_location> >> >>The above constraints keep inside-interface on a node where eth1 is >>good, and outside-interface on a node where eth2 is good. >> >>I'm guessing you want to keep these two constraints, and start over >>from scratch on the others. What are your intended relationships >>between the various resources? 
>>
>>> </constraints>
>>> --
>>> Sam Gardner
>>> Trustwave | SMART SECURITY ON DEMAND
>>--
>>Ken Gaillot <kgail...@redhat.com>

_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
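However the constraints end up being rebuilt, it is worth checking how the scores actually combine before pulling a cable. A possible sanity check with standard Pacemaker/pcs tooling, run on either node:

    # Dump the full constraint configuration as the cluster sees it
    pcs constraint show --full

    # Show the allocation scores computed from the live CIB; a -INFINITY
    # ban on one master that fails to propagate to drbd.master will be
    # visible here before any real failover is attempted
    crm_simulate --live-check --show-scores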