Hello Andrew,
Thank you for your prompt response.
I tried your patch and it works fine!
Please backport this patch to the latest Pacemaker
and to Heartbeat 2.1.4.
Best Regards,
NAKAHIRA Kazutomo
Andrew Beekhof wrote:
2008/4/24 NAKAHIRA Kazutomo <[EMAIL PROTECTED]>:
Hello all,
I tried the same test pattern reported by Hideo Yamauchi,
and automatic fail-back still occurs in the latest Pacemaker.
(Pacemaker changeset: bf619298929c, Heartbeat changeset: 54723736ab18)
oh :-(
sorry, i just assumed it was the same problem
Here is the log output from the PE when executing
"crm_resource -C -r group1-dummy2 -H dl380g5e".
(snip ha-log)
pengine[13894]: 2008/04/24_19:02:57 info: common_apply_stickiness: Setting failure stickiness for group1-dummy2 on dl380g5e: 727379968
(snip ha-log)
It seems that if the fail-count becomes INFINITY for any reason and
default-resource-failure-stickiness is defined as "-INFINITY",
then common_apply_stickiness() calculates an invalid value.
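To illustrate the arithmetic behind the logged value, here is a minimal sketch. It assumes Pacemaker's INFINITY is 1000000 and that the score is held in a 32-bit int; SCORE_INFINITY and low32_of_product() are hypothetical names used only for this example, not Pacemaker code. The product is computed in 64 bits so the example itself stays well defined, and only the low 32 bits are kept to show what a 32-bit score would end up holding.

```c
#include <stdint.h>

/* Assumption: Pacemaker's score limit INFINITY is 1000000. */
#define SCORE_INFINITY 1000000

/*
 * Hypothetical helper, for illustration only.
 * Computes fail_count * fail_stickiness exactly in 64 bits, then returns
 * only the low 32 bits of the result, i.e. what is left when the exact
 * product does not fit into a 32-bit score.
 */
uint32_t low32_of_product(int32_t fail_count, int32_t fail_stickiness)
{
    int64_t exact = (int64_t)fail_count * (int64_t)fail_stickiness;

    /* Unsigned conversions are defined as reduction modulo 2^N. */
    return (uint32_t)(uint64_t)exact;
}
```

Under these assumptions, low32_of_product(SCORE_INFINITY, -SCORE_INFINITY) yields 727379968, the same positive number that appears in the log line above, although the exact product is -1000000000000.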
Can you try the following patch?
diff -r 5229c9b520f3 lib/crm/pengine/complex.c
--- a/lib/crm/pengine/complex.c Thu Apr 24 13:20:48 2008 +0200
+++ b/lib/crm/pengine/complex.c Thu Apr 24 15:47:40 2008 +0200
@@ -372,14 +372,19 @@ common_apply_stickiness(resource_t *rsc,
     if(fail_count > 0 && rsc->fail_stickiness != 0) {
         resource_t *failed = rsc;
+        int score = fail_count * rsc->fail_stickiness;
         if(is_not_set(rsc->flags, pe_rsc_unique)) {
             failed = uber_parent(rsc);
         }
-        resource_location(failed, node, fail_count * rsc->fail_stickiness,
-                          "fail_stickiness", data_set);
+
+        /* detect and prevent score underflows */
+        if(rsc->fail_stickiness < 0 && (score > 0 || score < -INFINITY)) {
+            score = -INFINITY;
+        }
+
+        resource_location(failed, node, score, "fail_stickiness", data_set);
         crm_info("Setting failure stickiness for %s on %s: %d",
-                 failed->id, node->details->uname,
-                 fail_count * rsc->fail_stickiness);
+                 failed->id, node->details->uname, score);
     }
     g_hash_table_destroy(meta_hash);
 }
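The patch above checks the product after the multiplication has already been done in int. As a possible alternative, not the method used in the patch, the product could be computed in a wider type and clamped before it is narrowed. The sketch below assumes INFINITY is 1000000; SCORE_INFINITY and clamp_fail_score() are hypothetical names for illustration only. Unlike the patch, it clamps on both the negative and the positive side.

```c
#include <stdint.h>

/* Assumption: Pacemaker's score limit INFINITY is 1000000. */
#define SCORE_INFINITY 1000000

/*
 * Hypothetical helper, for illustration only.
 * Multiplies in 64 bits so the intermediate result cannot overflow,
 * then clamps the result to [-SCORE_INFINITY, SCORE_INFINITY].
 */
int clamp_fail_score(int fail_count, int fail_stickiness)
{
    int64_t exact = (int64_t)fail_count * (int64_t)fail_stickiness;

    if (exact > SCORE_INFINITY) {
        return SCORE_INFINITY;
    }
    if (exact < -SCORE_INFINITY) {
        return -SCORE_INFINITY;
    }
    /* Safe to narrow: the value is now within the score range. */
    return (int)exact;
}
```

With a fail-count of INFINITY and a failure stickiness of -INFINITY, this returns -INFINITY, while small products such as 3 * -100 pass through unchanged.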
Best regards,
NAKAHIRA Kazutomo
HIDEO YAMAUCHI wrote:
> Hi,
>
>> 2008/4/17 HIDEO YAMAUCHI <[EMAIL PROTECTED]>:
>>> Hi,
>>>
>>> I used Heartbeat-STABLE-2-1-932f11969945.
>>> I checked the behavior of a simple group resource.
>>>
>>> 1) The start operation of one resource fails on the Active node.
>>>
>>>
>>> 2)All resources move to a Standby node.
>>>
>>> 3) I clean up the resource on the Active node with the crm_resource command.
>>> crm_resource -C -r group1-dummy1 -H rh51-pm
>>>
>>> 4) All the resources move back to the Active node. (Automatic failback occurs.)
>>>
>>> Node: rh51-pm (fe4ff160-196b-4b5f-b341-5b1ccf666bf1): online
>>> Node: rh51-pm2 (19ca6bf8-a6a0-4207-ad1f-bd4ed22ebcd4): online
>>>
>>> Resource Group: resource_group1
>>> group1-dummy1 (ocf::heartbeat:Dummy): Started rh51-pm
>>> group1-dummy2 (ocf::heartbeat:Dummy2): Started rh51-pm
>>>
>>>
>>> I think that this failback did not occur in Ver2.1.3 (at step 4).
>>>
>>> Is this a new specification in Ver2.1.4?
>> No it was a bug that I fixed a few days back - I guess the fix hasn't
>> been backported yet
>
> OK.
>
> I will wait for the bug fix to be reflected.
>
> Thanks,
>
> Hideo Yamauchi.
>
>>> Also, is there a setting that prevents failback, in the same way as Ver2.1.3?
>> _______________________________________________________
>> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>> Home Page: http://linux-ha.org/
>>
>
--
----------------------------------------
NAKAHIRA Kazutomo
NTT DATA INTELLILINK CORPORATION