Re: [Linux-HA] Failover not working as I expected

Andrew Beekhof Tue, 20 Jan 2009 13:33:38 -0800

On Tue, Jan 20, 2009 at 21:48, Jerome Yanga <jya...@esri.com> wrote:
> Dominik,
>
> Per your request, attached is my current configuration.
>
> To reiterate, the following are still concerns:
>
> 01)  Resources get bounced when Nomen rejoins the cluster.
> 02)  Group failover does not work as hoped.
>
> As for resource monitoring, I believe that the customized init scripts are 
> working properly; however, being a noob, I may be wrong about this.  I have 
> tested the init scripts and confirmed that when a resource fails, the 
> service is restarted.  After seeing that the init scripts work, I set the 
> "On Fail" value to "stop" instead of "restart".
>
> Moreover, I have tried varying the group scores by changing the 
> resource_stickiness and the resource_failure_stickiness values.


I would highly encourage you to upgrade to the latest stable series of
Pacemaker.
The whole failure stickiness nonsense has been completely dropped in
favor of something that's actually usable.

http://clusterlabs.org/wiki/Install
http://clusterlabs.org/wiki/Documentation <-- look for the 1.0 version
of configuration explained

> However, I have not been able to consistently fail over the group by 
> stopping one of the resources.  During the testing, I have tried using the 
> equation below from the site you provided in your previous email.
>
> node = (constraint-score) + (num_group_resources * resource_stickiness) + 
> (failcount * (resource_failure_stickiness) )
>
> Unfortunately, the scores do not seem to follow this equation when I 
> verify them using showscores.sh.  The following values were assigned to the 
> Directory_Server group during this testing.
>
> resource_stickiness=100
> resource_failure_stickiness=-500
>
> I have also attempted to use the crm_failcount command to make sure that the 
> scores get reset before I fail any resource, but showscores.sh seems to 
> show that the command is not working.
>
> I have also tried to change the cib.xml file manually to assign the values 
> above to default-resource-stickiness and default-resource-failure-stickiness 
> respectively, but after doing so, all the resources seem to disappear.  
> (Good thing I had created a copy of the cib.xml file.)
>
> By the way, I have changed the values back to the following:
>
> resource_stickiness=100
> resource_failure_stickiness=-100
>
> Help.
>
> Regards,
> Jerome
>
>
>
>
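
To make the arithmetic concrete, here is a small Python sketch that evaluates
the equation quoted above for both sets of values mentioned. It only
illustrates the formula as written on the ScoreCalculation page, not what the
cluster actually computes. The function name group_score is made up for this
example, the constraint score is assumed to be 0 (the constraint was reportedly
removed), and the group is assumed to hold the three resources VIP, ECAS and
FDS_Admin.

```python
def group_score(constraint_score, num_group_resources, resource_stickiness,
                failcount, resource_failure_stickiness):
    """Evaluate the quoted formula for one node (hypothetical helper).

    node = constraint-score
           + num_group_resources * resource_stickiness
           + failcount * resource_failure_stickiness
    """
    return (constraint_score
            + num_group_resources * resource_stickiness
            + failcount * resource_failure_stickiness)


if __name__ == "__main__":
    # Assumptions: no location constraint (score 0), three resources in the group.
    for failure_stickiness in (-500, -100):
        for failcount in (0, 1, 3):
            score = group_score(0, 3, 100, failcount, failure_stickiness)
            print(f"failure_stickiness={failure_stickiness} "
                  f"failcount={failcount} score={score}")
```

Under these assumptions, a single failure with resource_failure_stickiness=-500
takes the score from 300 to -200, while with -100 it only drops to 200, and it
takes three failures to reach 0. Whether that is enough to move the group
depends on the score of the other node, which the formula alone does not show.
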
> -----Original Message-----
> From: linux-ha-boun...@lists.linux-ha.org 
> [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Dominik Klein
> Sent: Monday, January 19, 2009 11:31 PM
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] Failover not working as I expected
>
> Jerome Yanga wrote:
>> Dominik,
>>
>> Thank you very much.   Adding "resource-stickiness" and getting rid of the 
>> constraint helped a lot.  The resources do not go back to Nomen anymore 
>> when its heartbeat is started again  (resources stay with Rubric).  
>> However, the resources still get bounced once Nomen joins the cluster.  Is 
>> there any way to keep the resources from bouncing when Nomen rejoins the 
>> cluster?
>
> Please share your current configuration.
>
>> I have also observed another issue.  As you have seen in my cib.xml, I have 
>> created a group called Directory_Server.  In this group, there are three 
>> resources, namely:  VIP, ECAS and FDS_Admin.  If I manually turn off any of 
>> these resources, I would like the group resource, Directory_Server, to 
>> fail over to the other node.  Is there a configuration that will do this?  
>> Currently, if one of the three resources goes down, it stays down and the 
>> rest continue running.  All three resources need to be up and running for 
>> our applications to work properly.
>
> Sounds like you're not doing any resource monitoring. Read up on that
> and configure it. The ScoreCalculation page might be handy to understand
> how things work: http://www.linux-ha.org/ScoreCalculation
>
> Regards
> Dominik
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems