I see these prints.

pengine: info: rsc_merge_weights: cu_4: Rolling back scores from cu_3
pengine: debug: native_assign_node: Assigning Redun_CU4_Wb30 to cu_4
pengine: info: rsc_merge_weights: cu_3: Rolling back scores from cu_2
pengine: debug: native_assign_node: Assigning Redund_CU5_WB30 to cu_3
It looks like rolling back the scores is causing the new placement decision to
relocate the resources. Am I using the scores incorrectly?

[root@Redund_CU5_WB30 root]# pcs constraint
Location Constraints:
  Resource: cu_2
    Enabled on: Redun_CU4_Wb30 (score:0)
    Enabled on: Redund_CU5_WB30 (score:0)
    Enabled on: Redund_CU3_WB30 (score:0)
    Enabled on: Redund_CU1_WB30 (score:0)
  Resource: cu_3
    Enabled on: Redun_CU4_Wb30 (score:0)
    Enabled on: Redund_CU5_WB30 (score:0)
    Enabled on: Redund_CU3_WB30 (score:0)
    Enabled on: Redund_CU1_WB30 (score:0)
  Resource: cu_4
    Enabled on: Redun_CU4_Wb30 (score:0)
    Enabled on: Redund_CU5_WB30 (score:0)
    Enabled on: Redund_CU3_WB30 (score:0)
    Enabled on: Redund_CU1_WB30 (score:0)
Ordering Constraints:
Colocation Constraints:
  cu_2 with cu_4 (score:-INFINITY)
  cu_3 with cu_4 (score:-INFINITY)
  cu_2 with cu_3 (score:-INFINITY)

On Mon, Oct 17, 2016 at 8:16 PM, Nikhil Utane <nikhil.subscri...@gmail.com> wrote:

> This is driving me insane.
>
> This is how the resources were started. Redund_CU1_WB30 was the DC, which
> I rebooted.
>  cu_4 (ocf::redundancy:RedundancyRA): Started Redund_CU1_WB30
>  cu_2 (ocf::redundancy:RedundancyRA): Started Redund_CU5_WB30
>  cu_3 (ocf::redundancy:RedundancyRA): Started Redun_CU4_Wb30
>
> Since the standby node was not up, I was expecting resource cu_4 to be
> left waiting to be scheduled.
> But then it re-arranged everything as below.
>  cu_4 (ocf::redundancy:RedundancyRA): Started Redun_CU4_Wb30
>  cu_2 (ocf::redundancy:RedundancyRA): Stopped
>  cu_3 (ocf::redundancy:RedundancyRA): Started Redund_CU5_WB30
>
> There is not much information available in the logs on the new DC. They
> just show what it has decided to do, but nothing to suggest why it decided
> it that way.
>
> notice: Start cu_4 (Redun_CU4_Wb30)
> notice: Stop  cu_2 (Redund_CU5_WB30)
> notice: Move  cu_3 (Started Redun_CU4_Wb30 -> Redund_CU5_WB30)
>
> I have default stickiness set to 100, which is higher than any score that
> I have configured.
> I have migration_threshold set to 1. Should I bump that up instead?
>
> -Thanks
> Nikhil
>
> On Sat, Oct 15, 2016 at 12:36 AM, Ken Gaillot <kgail...@redhat.com> wrote:
>
>> On 10/14/2016 06:56 AM, Nikhil Utane wrote:
>> > Hi,
>> >
>> > Thank you for the responses so far.
>> > I added the reverse colocation as well. However, I am now seeing some
>> > other issue in resource movement that I am analyzing.
>> >
>> > Thinking further on this, why doesn't "a not with b" imply "b not with a"?
>> > Wouldn't putting "b with a" violate "a not with b"?
>> >
>> > Can someone confirm that colocation is required to be configured both ways?
>>
>> The anti-colocation should only be defined one-way. Otherwise, you get a
>> dependency loop (as seen in the logs you showed elsewhere).
>>
>> The one-way constraint is enough to keep the resources apart. However,
>> the question is whether the cluster might move resources around
>> unnecessarily.
>>
>> For example, "A not with B" means that the cluster will place B first,
>> then place A somewhere else. So, if B's node fails, can the cluster
>> decide that A's node is now the best place for B, and move A to a free
>> node, rather than simply start B on the free node?
>>
>> The cluster does take dependencies into account when placing a resource,
>> so I would hope that wouldn't happen. But I'm not sure. Having some
>> stickiness might help, so that A has some preference against moving.
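For illustration, the one-way anti-colocation plus stickiness that Ken describes
would look roughly like this in the CIB and with pcs. This is only a sketch: the
constraint ids and the stickiness value 100 are made-up examples, and the exact
pcs syntax can differ between versions:

  <rsc_colocation id="anti-cu_2-cu_4" rsc="cu_2" with-rsc="cu_4" score="-INFINITY"/>
  <rsc_colocation id="anti-cu_3-cu_4" rsc="cu_3" with-rsc="cu_4" score="-INFINITY"/>
  <rsc_colocation id="anti-cu_2-cu_3" rsc="cu_2" with-rsc="cu_3" score="-INFINITY"/>

  # pcs resource defaults resource-stickiness=100

Each pair is defined once; per Ken's explanation above, the reverse constraint
(e.g. "cu_4 with cu_2") is not needed and would only create a dependency loop.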
>>
>> > -Thanks
>> > Nikhil
>> >
>> > On Fri, Oct 14, 2016 at 1:09 PM, Vladislav Bogdanov
>> > <bub...@hoster-ok.com> wrote:
>> >
>> > On October 14, 2016 10:13:17 AM GMT+03:00, Ulrich Windl
>> > <ulrich.wi...@rz.uni-regensburg.de> wrote:
>> > >>>> Nikhil Utane <nikhil.subscri...@gmail.com> wrote on 13.10.2016 at
>> > >16:43 in message
>> > ><CAGNWmJUbPucnBGXroHkHSbQ0LXovwsLFPkUPg1R8gJqRFqM9Dg@mail.gmail.com>:
>> > >> Ulrich,
>> > >>
>> > >> I have 4 resources only (not 5; the nodes are 5). So then I only
>> > >> need 6 constraints, right?
>> > >>
>> > >>      [,1] [,2] [,3] [,4] [,5] [,6]
>> > >> [1,] "A"  "A"  "A"  "B"  "B"  "C"
>> > >> [2,] "B"  "C"  "D"  "C"  "D"  "D"
>> > >
>> > >Sorry for my confusion. As Andrei Borzenkov said in
>> > ><CAA91j0W+epAHFLg9u6VX_X8LgFkf9Rp55g3nocY4oZNA9BbZ+g@mail.gmail.com>
>> > >you probably have to add (A, B) _and_ (B, A)! Thinking about it, I
>> > >wonder whether an easier solution would be using "utilization": if
>> > >every node has one token to give, and every resource needs one token,
>> > >no two resources will run on one node. Sounds like an easier solution
>> > >to me.
>> > >
>> > >Regards,
>> > >Ulrich
>> > >
>> > >
>> > >>
>> > >> I understand that if I configure a constraint of R1 with R2 with
>> > >> score -infinity, then the same applies for R2 with R1 with score
>> > >> -infinity (I don't have to configure it explicitly).
>> > >> I am not having a problem of multiple resources getting scheduled
>> > >> on the same node. Rather, one working resource is unnecessarily
>> > >> getting relocated.
>> > >>
>> > >> -Thanks
>> > >> Nikhil
>> > >>
>> > >>
>> > >> On Thu, Oct 13, 2016 at 7:45 PM, Ulrich Windl
>> > >> <ulrich.wi...@rz.uni-regensburg.de> wrote:
>> > >>
>> > >>> Hi!
>> > >>>
>> > >>> Don't you need 10 constraints, excluding every possible pair of
>> > >>> your 5 resources (named A-E here), like in this table (produced
>> > >>> with R)?
>> > >>>
>> > >>>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>> > >>> [1,] "A"  "A"  "A"  "A"  "B"  "B"  "B"  "C"  "C"  "D"
>> > >>> [2,] "B"  "C"  "D"  "E"  "C"  "D"  "E"  "D"  "E"  "E"
>> > >>>
>> > >>> Ulrich
>> > >>>
>> > >>> >>> Nikhil Utane <nikhil.subscri...@gmail.com> wrote on 13.10.2016
>> > >>> at 15:59 in message
>> > >>> <CAGNWmJW0CWMr3bvR3L9xZCAcJUzyczQbZEzUzpaJxi+Pn7Oj_A@mail.gmail.com>:
>> > >>> > Hi,
>> > >>> >
>> > >>> > I have 5 nodes and 4 resources configured.
>> > >>> > I have configured constraints such that no two resources can be
>> > >>> > co-located.
>> > >>> > I brought down a node (which happened to be the DC). I was
>> > >>> > expecting the resource on the failed node to be migrated to the
>> > >>> > 5th waiting node (the one not running any resource).
>> > >>> > However, what happened was that the failed node's resource was
>> > >>> > started on another active node (after stopping its existing
>> > >>> > resource) and that node's resource was moved to the waiting node.
>> > >>> >
>> > >>> > What could I be doing wrong?
>> > >>> >
>> > >>> > <nvpair id="cib-bootstrap-options-have-watchdog" value="true"
>> > >>> > name="have-watchdog"/>
>> > >>> > <nvpair id="cib-bootstrap-options-dc-version" value="1.1.14-5a6cdd1"
>> > >>> > name="dc-version"/>
>> > >>> > <nvpair id="cib-bootstrap-options-cluster-infrastructure" value="corosync"
>> > >>> > name="cluster-infrastructure"/>
>> > >>> > <nvpair id="cib-bootstrap-options-stonith-enabled" value="false"
>> > >>> > name="stonith-enabled"/>
>> > >>> > <nvpair id="cib-bootstrap-options-no-quorum-policy" value="ignore"
>> > >>> > name="no-quorum-policy"/>
>> > >>> > <nvpair id="cib-bootstrap-options-default-action-timeout" value="240"
>> > >>> > name="default-action-timeout"/>
>> > >>> > <nvpair id="cib-bootstrap-options-symmetric-cluster" value="false"
>> > >>> > name="symmetric-cluster"/>
>> > >>> >
>> > >>> > # pcs constraint
>> > >>> > Location Constraints:
>> > >>> >   Resource: cu_2
>> > >>> >     Enabled on: Redun_CU4_Wb30 (score:0)
>> > >>> >     Enabled on: Redund_CU2_WB30 (score:0)
>> > >>> >     Enabled on: Redund_CU3_WB30 (score:0)
>> > >>> >     Enabled on: Redund_CU5_WB30 (score:0)
>> > >>> >     Enabled on: Redund_CU1_WB30 (score:0)
>> > >>> >   Resource: cu_3
>> > >>> >     Enabled on: Redun_CU4_Wb30 (score:0)
>> > >>> >     Enabled on: Redund_CU2_WB30 (score:0)
>> > >>> >     Enabled on: Redund_CU3_WB30 (score:0)
>> > >>> >     Enabled on: Redund_CU5_WB30 (score:0)
>> > >>> >     Enabled on: Redund_CU1_WB30 (score:0)
>> > >>> >   Resource: cu_4
>> > >>> >     Enabled on: Redun_CU4_Wb30 (score:0)
>> > >>> >     Enabled on: Redund_CU2_WB30 (score:0)
>> > >>> >     Enabled on: Redund_CU3_WB30 (score:0)
>> > >>> >     Enabled on: Redund_CU5_WB30 (score:0)
>> > >>> >     Enabled on: Redund_CU1_WB30 (score:0)
>> > >>> >   Resource: cu_5
>> > >>> >     Enabled on: Redun_CU4_Wb30 (score:0)
>> > >>> >     Enabled on: Redund_CU2_WB30 (score:0)
>> > >>> >     Enabled on: Redund_CU3_WB30 (score:0)
>> > >>> >     Enabled on: Redund_CU5_WB30 (score:0)
>> > >>> >     Enabled on: Redund_CU1_WB30 (score:0)
>> > >>> > Ordering Constraints:
>> > >>> > Colocation Constraints:
>> > >>> >   cu_3 with cu_2 (score:-INFINITY)
>> > >>> >   cu_4 with cu_2 (score:-INFINITY)
>> > >>> >   cu_4 with cu_3 (score:-INFINITY)
>> > >>> >   cu_5 with cu_2 (score:-INFINITY)
>> > >>> >   cu_5 with cu_3 (score:-INFINITY)
>> > >>> >   cu_5 with cu_4 (score:-INFINITY)
>> > >>> >
>> > >>> > -Thanks
>> > >>> > Nikhil
>> >
>> > Hi,
>> >
>> > use of utilization (balanced strategy) has one caveat: resources are
>> > not moved just because one node's utilization is lower, when the
>> > nodes have the same allocation score for the resource.
>> > So, after the simultaneous outage of two nodes in a 5-node cluster,
>> > it may appear that one node runs two resources and two recovered
>> > nodes run nothing.
>> >
>> > The original 'utilization' strategy only limits resource placement; it
>> > is not considered when choosing a node for a resource.
>> >
>> > Vladislav
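For completeness, the token-per-node idea Ulrich suggested (keeping Vladislav's
caveat in mind) would look roughly like this with a reasonably recent pcs. The
attribute name "token" is just an example, and the node and resource names are
the ones from this thread:

  # pcs property set placement-strategy=utilization
  # pcs node utilization Redund_CU1_WB30 token=1
  # pcs node utilization Redund_CU2_WB30 token=1
  (and the same for the other three nodes)
  # pcs resource utilization cu_2 token=1
  (and the same for cu_3, cu_4 and cu_5)

With placement-strategy=utilization the capacity only caps placement (a node
already holding one cu_* resource cannot take a second), while
placement-strategy=balanced also weighs free capacity when choosing among
nodes; as Vladislav notes, resources are not moved merely to even out
utilization when the allocation scores are equal.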
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org