On 11/08/2016 03:02 AM, Ulrich Windl wrote:
>>>> Ken Gaillot <kgail...@redhat.com> wrote on 07.11.2016 at 16:15:
>>>>>> * Pacemaker's existing "node health" feature allows resources to move
>>>>>> off nodes that become unhealthy. Now, when using
>>>>>> node-health-strategy=progressive, a new cluster property
>>>>>> node-health-base will be used as the initial health score of newly
>>>>>> joined nodes (defaulting to 0, which is the previous behavior). This
>>>>>> allows cloned and multistate resource instances to start on a node even
>>>>>> if it has some "yellow" health attributes.
>>>>>
>>>>> So the node health is more or less a "node score"? I don't understand the
>>>>> last sentence. Maybe give an example?
>>>>
>>>> Yes, node health is a score that's added when deciding where to place a
>>>> resource. It does get complicated ...
>>>>
>>>> Node health monitoring is optional, and off by default.
>>>>
>>>> Node health attributes are set to red, yellow or green (outside
>>>> pacemaker itself -- either by a resource agent, or some external
>>>> process). As an example, let's say we have three node health attributes
>>>> for CPU usage, CPU temperature, and SMART error count.
>>>>
>>>> With a progressive strategy, red and yellow are assigned some negative
>>>> score, and green is 0. In our example, let's say yellow gets a -10 score.
>>>>
>>>> If any of our attributes are yellow, resources will avoid the node
>>>> (unless they have higher positive scores from something like stickiness
>>>> or a location constraint).
>>>>
>>>
>>> I understood so far.
>>>
>>>> Normally, this is what you want, but if your resources are cloned on all
>>>> nodes, maybe you don't care if some attributes are yellow. In that case,
>>>> you can set node-health-base=20, so even if two attributes are yellow,
>>>> it won't prevent resources from running (20 + -10 + -10 = 0).
>>>
>>> I don't understand that: "node-health-base" is a global setting, but what
>>> you want is an exception for some specific (clone) resource.
>>> To me the more obvious solution would be to provide an exception rule for
>>> the resource, not a global setting for the node.
>>
>> The main advantage of node-health-base over other approaches -- such as
>> defining a constant #health-base attribute for all nodes, or defining
>> positive location constraints for each resource on each node -- is that
>> node-health-base applies to all resources and nodes, present and future.
>> If someone adds a node to the cluster, it will automatically get
>> node-health-base when it joins, whereas any other approach requires
>> additional configuration changes (which leaves a window where the value
>> is not applied).
>
> So the node-health-base is a default value for the node until it will be
> explicitly set? Do you try to handle the problem "all nodes are to be assumed
> bad until proven to be good"? Are we maybe fighting a completely different
> problem (with some RAs)?

node-health-base is a sort of default health value, but node-health is
never explicitly set -- it's the sum of node-health-base and the
adjustments for each health attribute.

node-health-base could be used for the "assumed bad" approach: you could
set node-health-base to a negative value, and set green to a positive
value (rather than 0, which is its default). Then, each green attribute
would eat away at the deficit.

>>
>> It also simplifies the configuration the more nodes/resources you have,
>> and is less prone to accidental configuration mistakes.
>>
>> The idea is straightforward: instead of each node starting with a health
>> score of 0 (which means any negative health attribute will push all
>> resources away), start each node with a positive health score, so that
>> health has to drop below a certain point before affecting resources.
>
> I don't see the difference between "starting at 0, subtracting a small
> score" and "starting at some positive, subtracting a large score": You are
> saying that any negative score will move all resources away? I thought it
> only happens on -INFINITY.

Pacemaker always combines scores from all sources and uses the final
value to decide resource placement. So, if a node's health is -50 but a
resource has a location preference of +100 for that node, then the
resource could still be placed there.

You are right, only -INFINITY is mandatory, but if there are no positive
scores from other sources, any negative score will have the same effect
of keeping resources off the node. So, starting from a positive number
is a big difference in effect.

The user is responsible for choosing meaningful values. For example, if
node-health-base is +10 but yellow is -15, then any yellow attribute
will still push resources away. Of course, that could still be
meaningful when combined with other scores -- someone might do that if
they want a location preference of +5 to counteract a single yellow
attribute.
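
To make the arithmetic concrete, here is a minimal sketch of how the scores in these examples add up. It is only an illustration, not Pacemaker code: the function names (node_health, placement_score) and the attribute names are hypothetical, the color scores are the example values used in this thread, and INFINITY handling is ignored.

```python
# Hypothetical model of the score arithmetic discussed in this thread.
# It is not Pacemaker code and ignores INFINITY handling.


def node_health(base, attributes, color_scores):
    """Return node-health-base plus the score of every health attribute."""
    return base + sum(color_scores[color] for color in attributes.values())


def placement_score(health, *other_scores):
    """Add scores from other sources (location, stickiness) to node health."""
    return health + sum(other_scores)


# Example values from the thread: yellow = -10, green = 0.
color_scores = {"green": 0, "yellow": -10}
attributes = {
    "cpu_usage": "yellow",
    "cpu_temp": "yellow",
    "smart_errors": "green",
}

health_default_base = node_health(0, attributes, color_scores)   # 0 - 10 - 10
health_raised_base = node_health(20, attributes, color_scores)   # 20 - 10 - 10

# Base +10 with yellow at -15: a single yellow attribute still goes
# negative, unless a +5 location preference offsets it.
one_yellow = node_health(10, {"cpu_usage": "yellow"},
                         {"green": 0, "yellow": -15})
with_location = placement_score(one_yellow, 5)

print(health_default_base, health_raised_base, one_yellow, with_location)
# prints: -20 0 -5 0
```

With the default base of 0, two yellow attributes leave the node at -20; with a base of 20, the same node sits at 0 and no longer pushes resources away.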
Or maybe instead of node-health-base, someone sets a positive
stickiness, so existing resources can stay on a yellow node, but new
resources won't be placed there.

It can be as simple or complicated as you want to get :)