On 5 Jun 2014, at 10:38 am, Patrick Hemmer <pacema...@feystorm.net> wrote:
> From: Andrew Beekhof <and...@beekhof.net>
> Sent: 2014-06-04 20:15:22 EDT
> To: The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
> Subject: Re: [Pacemaker] resources not rebalancing
>
>> On 5 Jun 2014, at 12:57 am, Patrick Hemmer <pacema...@feystorm.net> wrote:
>>
>>> From: Andrew Beekhof <and...@beekhof.net>
>>> Sent: 2014-06-04 04:15:48 EDT
>>> To: The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
>>> Subject: Re: [Pacemaker] resources not rebalancing
>>>
>>>> On 4 Jun 2014, at 4:22 pm, Patrick Hemmer <pacema...@feystorm.net> wrote:
>>>>
>>>>> Testing some different scenarios, and after bringing a node back online,
>>>>> none of the resources move to it unless they are restarted. However,
>>>>> default-resource-stickiness is set to 0, so they should be able to move
>>>>> around freely.
>>>>>
>>>>> # pcs status
>>>>> Cluster name: docker
>>>>> Last updated: Wed Jun 4 06:09:26 2014
>>>>> Last change: Wed Jun 4 06:08:40 2014 via cibadmin on i-093f1f55
>>>>> Stack: corosync
>>>>> Current DC: i-083f1f54 (3) - partition with quorum
>>>>> Version: 1.1.11-1.fc20-9d39a6b
>>>>> 3 Nodes configured
>>>>> 8 Resources configured
>>>>>
>>>>> Online: [ i-053f1f59 i-083f1f54 i-093f1f55 ]
>>>>>
>>>>> Full list of resources:
>>>>>
>>>>> dummy2 (ocf::pacemaker:Dummy): Started i-083f1f54
>>>>> Clone Set: dummy1-clone [dummy1] (unique)
>>>>>     dummy1:0 (ocf::pacemaker:Dummy): Started i-083f1f54
>>>>>     dummy1:1 (ocf::pacemaker:Dummy): Started i-093f1f55
>>>>>     dummy1:2 (ocf::pacemaker:Dummy): Started i-093f1f55
>>>>>     dummy1:3 (ocf::pacemaker:Dummy): Started i-083f1f54
>>>>>     dummy1:4 (ocf::pacemaker:Dummy): Started i-093f1f55
>>>>>
>>>>> # pcs resource show --all
>>>>> Resource: dummy2 (class=ocf provider=pacemaker type=Dummy)
>>>>> Clone: dummy1-clone
>>>>>  Meta Attrs: clone-max=5 clone-node-max=5 globally-unique=true
>>>>>  Resource: dummy1 (class=ocf provider=pacemaker type=Dummy)
>>>>>
>>>>> # pcs property show --all | grep default-resource-stickiness
>>>>> default-resource-stickiness: 0
>>>>>
>>>>> Notice how i-053f1f59 isn't running anything. I feel like I'm missing
>>>>> something obvious, but it escapes me.
>>>>
>>>> clones are ever so slightly sticky by default, try setting
>>>> resource-stickiness=0 for the clone resource
>>>> (and unset it once everything has moved back)
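With pcs, that override would be something like this (a sketch only; double-check the 'pcs resource meta' syntax against your pcs version):

# pcs resource meta dummy1-clone resource-stickiness=0
  ... wait for the instances to rebalance ...
# pcs resource meta dummy1-clone resource-stickiness=

Setting a meta attribute to an empty value should remove it again, putting the clone back on its default stickiness.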
>>> Thanks, that did indeed fix it. But how come dummy2 didn't move? It's not
>>> a clone, but it didn't move either?
>>
>> Do you have a location constraint that says it should prefer i-053f1f59?
>
> No location constraint.
>
>>> And now a separate follow-up question: the resources didn't balance as
>>> they should. I've got several utilization attributes set, and the
>>> resources aren't balanced according to the placement-strategy.
>>>
>>> # pcs property show placement-strategy
>>> Cluster Properties:
>>>  placement-strategy: balanced
>>>
>>> # crm_simulate -URL
>>>
>>> Current cluster status:
>>> Online: [ i-053f1f59 i-083f1f54 i-093f1f55 ]
>>>
>>> dummy2 (ocf::pacemaker:Dummy): Started i-053f1f59
>>> Clone Set: dummy1-clone [dummy1] (unique)
>>>     dummy1:0 (ocf::pacemaker:Dummy): Started i-053f1f59
>>>     dummy1:1 (ocf::pacemaker:Dummy): Started i-093f1f55
>>>     dummy1:2 (ocf::pacemaker:Dummy): Started i-083f1f54
>>>     dummy1:3 (ocf::pacemaker:Dummy): Started i-083f1f54
>>>     dummy1:4 (ocf::pacemaker:Dummy): Started i-093f1f55
>>>
>>> Utilization information:
>>> Original: i-053f1f59 capacity: cpu=5000000 mem=3840332000
>>> Original: i-083f1f54 capacity: cpu=5000000 mem=3840332000
>>> Original: i-093f1f55 capacity: cpu=5000000 mem=3840332000
>>> calculate_utilization: dummy2 utilization on i-053f1f59: cpu=10000
>>> calculate_utilization: dummy1:2 utilization on i-083f1f54: cpu=1000
>>> calculate_utilization: dummy1:1 utilization on i-093f1f55: cpu=1000
>>> calculate_utilization: dummy1:0 utilization on i-053f1f59: cpu=1000
>>> calculate_utilization: dummy1:3 utilization on i-083f1f54: cpu=1000
>>> calculate_utilization: dummy1:4 utilization on i-093f1f55: cpu=1000
>>> Remaining: i-053f1f59 capacity: cpu=4989000 mem=3840332000
>>> Remaining: i-083f1f54 capacity: cpu=4998000 mem=3840332000
>>> Remaining: i-093f1f55 capacity: cpu=4998000 mem=3840332000
>>>
>>> The "balanced" strategy is defined as: "the node that has more free
>>> capacity gets consumed first".
>>> Notice that dummy2 consumes cpu=10000, while dummy1 is only 1000 (10x
>>> less). After dummy2 was placed on i-053f1f59, that should have consumed
>>> enough "cpu" resource to keep dummy1 off it and on the other 2 nodes, but
>>> dummy1:0 got placed on the node.
>>
>> But i-053f1f59 still has orders of magnitude more cpu capacity left to run
>> things.
>
> I don't follow. They're all equal in terms of total "cpu" capacity.

Right. But each node still has 4998000+ units with which to accommodate
something that only requires 10000. That's about 0.2% of the remaining
capacity (10000 / 4998000 ≈ 0.002), so wherever it starts, it's hardly making
a dent.

> And at the bottom of the simulate output, the "Remaining" even shows
> i-053f1f59 has less remaining than the other nodes.
>
> However, after playing with it some more, this appears to be an issue with
> clones. When I created 5 separate resources instead, this does work as
> expected: the dummy2 resource gets put on a node by itself, and the other
> resources get distributed among the remaining nodes (at least until the
> "cpu" used balances out).
>
> Since this smells like a bug, I can enter it on the bug tracker you mention
> below.

It's probably a result of clone stickiness (they have a default of 1) and the
hoops we have to jump through to avoid them needlessly shuffling around.
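If you want to see that, ask crm_simulate for the allocation scores as well
(read-only against the live CIB):

# crm_simulate -L -s -U

where -s adds the allocation scores and -U the utilization numbers. Each
running clone instance should show a small positive score for the node it is
currently on, and a node with a higher score wins outright; free capacity is
only compared between nodes whose scores are tied.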
>>> Also how difficult is it to add a strategy?
>>
>> It might be challenging, the policy engine is deep voodoo :)
>> Can you create an entry at bugs.clusterlabs.org and include the result of
>> 'cibadmin -Q' when the cluster is in the state you describe above?
>>
>> It won't make it into 1.1.12 but we can look at it for .13
>
> Will ponder possible scenarios and then enter it. Another thought occurred
> to me: you might also want to balance based on percentage of capacity used.
> So now you've got balancing on amount of capacity used, balancing on amount
> of capacity free, and balancing on percent of capacity. All 3 of them are
> probably similar enough in logic that the same algorithm could take care of
> them; it would just need a way to tune that algorithm (this would be my
> guess anyway, no clue what the code looks like).
>
>>> I'd be interested in having a strategy which places a resource on the
>>> node with the least amount of capacity used, kind of the inverse of
>>> "balanced". The docs say balanced looks at how much capacity is free. The
>>> 2 strategies would be equivalent if all nodes have the same capacity, but
>>> if one node has 10x the capacity of the other nodes, I want the resources
>>> to be distributed evenly (based on the capacity each uses), and not
>>> over-utilize that one node.
>>>
>>> Thanks
>>>
>>> -Patrick
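For what it's worth, a made-up example (hypothetical numbers) of where
"balanced" and the strategies proposed above would pick different nodes:

node A: capacity cpu=10000, allocated 5000 (5000 free, 50% used)
node B: capacity cpu=1000, allocated 400 (600 free, 40% used)

"balanced" (most free capacity first) puts the next resource on node A, while
both a least-capacity-used strategy and a percent-of-capacity strategy would
pick node B, which is proportionally less loaded. That is exactly the
10x-capacity situation you describe.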
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org