On 11/08/2016 03:02 AM, Ulrich Windl wrote:
>>>> Ken Gaillot <kgail...@redhat.com> wrote on 07.11.2016 at 16:15 in message
>>>>>> * Pacemaker's existing "node health" feature allows resources to move
>>>>>> off nodes that become unhealthy. Now, when using
>>>>>> node-health-strategy=progressive, a new cluster property
>>>>>> node-health-base will be used as the initial health score of newly
>>>>>> joined nodes (defaulting to 0, which is the previous behavior). This
>>>>>> allows cloned and multistate resource instances to start on a node even
>>>>>> if it has some "yellow" health attributes.
>>>>>
>>>>> So the node health is more or less a "node score"? I don't understand the
>>>>> last sentence. Maybe give an example?
>>>>
>>>> Yes, node health is a score that's added when deciding where to place a
>>>> resource. It does get complicated ...
>>>>
>>>> Node health monitoring is optional, and off by default.
>>>>
>>>> Node health attributes are set to red, yellow or green (outside
>>>> pacemaker itself -- either by a resource agent, or some external
>>>> process). As an example, let's say we have three node health attributes
>>>> for CPU usage, CPU temperature, and SMART error count.
>>>>
>>>> With a progressive strategy, red and yellow are assigned some negative
>>>> score, and green is 0. In our example, let's say yellow gets a -10 score.
>>>>
>>>> If any of our attributes are yellow, resources will avoid the node
>>>> (unless they have higher positive scores from something like stickiness
>>>> or a location constraint).
>>>>
>>>
>>> I understood so far.
>>>
>>>> Normally, this is what you want, but if your resources are cloned on all
>>>> nodes, maybe you don't care if some attributes are yellow. In that case,
>>>> you can set node-health-base=20, so even if two attributes are yellow,
>>>> it won't prevent resources from running (20 + -10 + -10 = 0).
>>>
>>> I don't understand that: "node-health-base" is a global setting, but what
>>> you want is an exception for some specific (clone) resource.
>>> To me the more obvious solution would be to provide an exception rule for
>>> the resource, not a global setting for the node.
>>
>> The main advantage of node-health-base over other approaches -- such as
>> defining a constant #health-base attribute for all nodes, or defining
>> positive location constraints for each resource on each node -- is that
>> node-health-base applies to all resources and nodes, present and future.
>> If someone adds a node to the cluster, it will automatically get
>> node-health-base when it joins, whereas any other approach requires
>> additional configuration changes (which leaves a window where the value
>> is not applied).
> 
> So node-health-base is a default value for the node until it is
> explicitly set? Are you trying to handle the problem "all nodes are to be
> assumed bad until proven to be good"? Are we maybe fighting a completely
> different problem (with some RAs)?

node-health-base is a sort of default health value, but node-health is
never explicitly set -- it's the sum of node-health-base and the
adjustments for each health attribute.
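
As a rough sketch of that sum (not Pacemaker's actual code), here is the
arithmetic in Python. The attribute names, the node_health() helper and the
red/yellow values are assumptions chosen to match the CPU usage, CPU
temperature and SMART example from earlier in the thread:

```python
# Illustrative per-attribute adjustments (assumed values, not Pacemaker defaults).
ADJUSTMENT = {"green": 0, "yellow": -10, "red": -100}

def node_health(base, attributes):
    """Return node-health-base plus the adjustment for each health attribute."""
    return base + sum(ADJUSTMENT[color] for color in attributes.values())

# Hypothetical attribute names for the three-attribute example.
attrs = {"cpu-usage": "yellow", "cpu-temp": "yellow", "smart-errors": "green"}

print(node_health(0, attrs))   # -20: with the old default base, resources avoid the node
print(node_health(20, attrs))  # 0: with node-health-base=20, two yellows are tolerated
```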

node-health-base could be used for the "assumed bad" approach: you could
set node-health-base to a negative value, and set green to a positive
value (rather than 0, which is its default). Then, each green attribute
would eat away at the deficit.
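
To make the "assumed bad" idea concrete, here is a small sketch with made-up
numbers: a node-health-base of -30 and a green score of +10 (both are
assumptions for illustration only). The node only reaches 0 once three
attributes have reported green:

```python
BASE = -30   # assumed negative node-health-base
GREEN = 10   # assumed positive score for each green attribute

def health_after(green_count, base=BASE, green=GREEN):
    """Node health once `green_count` attributes have reported green."""
    return base + green_count * green

for n in range(4):
    print(n, health_after(n))
# 0 -30
# 1 -20
# 2 -10
# 3 0
```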

>>
>> It also simplifies the configuration the more nodes/resources you have,
>> and is less prone to accidental configuration mistakes.
>>
>> The idea is straightforward: instead of each node starting with a health
>> score of 0 (which means any negative health attribute will push all
>> resources away), start each node with a positive health score, so that
>> health has to drop below a certain point before affecting resources.
> 
> I don't see the difference between "starting at 0, subtracting a small
> score" and "starting at some positive value, subtracting a large score": You
> are saying that any negative score will move all resources away? I thought it
> only happens on -INFINITY.

Pacemaker always combines scores from all sources and uses the final
value to decide resource placement.

So, if a node's health is -50 but a resource has a location preference
of +100 for that node, then the resource could still be placed there.

You are right, only -INFINITY is mandatory, but if there are no positive
scores from other sources, any negative score will have the same effect
of keeping resources off the node. So, starting from a positive number
is a big difference in effect.
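
A simplified model of that combination, for illustration only: the
add_scores() and can_run() helpers are hypothetical, and the INFINITY
handling follows the documented score rules as I understand them (INFINITY is
1,000,000, and -INFINITY wins over any other value). The real placement logic
considers much more than this.

```python
INFINITY = 1_000_000  # Pacemaker documents INFINITY as 1,000,000

def add_scores(a, b):
    """Add two scores; -INFINITY beats everything, then +INFINITY."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    # Clamp ordinary sums into the valid score range.
    return max(-INFINITY, min(INFINITY, a + b))

def can_run(node_health, other_scores=()):
    """A resource is kept off a node whose combined score is negative."""
    total = node_health
    for score in other_scores:
        total = add_scores(total, score)
    return total >= 0

print(can_run(-50, [100]))        # True: the location preference outweighs poor health
print(can_run(-10))               # False: nothing positive to offset the health score
print(can_run(-INFINITY, [100]))  # False: -INFINITY is mandatory
```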

The user is responsible for choosing meaningful values. For example, if
node-health-base is +10 but yellow is -15, then any yellow attribute
will still push resources away. Of course, that could still be
meaningful when combined with other scores -- someone might do that if
they want a location preference of +5 to counteract a single yellow
attribute. Or maybe instead of node-health-base, someone sets a positive
stickiness, so existing resources can stay on a yellow node, but new
resources won't be placed there. It can be as simple or complicated as
you want to get :)
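
The first example above, worked through as a sketch (the total() helper is
just for illustration, and it ignores stickiness and every other score
source):

```python
BASE = 10      # node-health-base from the example
YELLOW = -15   # yellow score from the example

def total(yellow_count, location_preference=0):
    """Combined score for a resource on a node with some yellow attributes."""
    return BASE + yellow_count * YELLOW + location_preference

print(total(1))      # -5: one yellow attribute still pushes resources away
print(total(1, 5))   # 0: a +5 location preference counteracts a single yellow
print(total(2, 5))   # -15: but not two yellows
```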
