Re: [Linux-HA] migration/fence after fail-count > X

Sebastian Reitenbach Tue, 13 Nov 2007 06:38:30 -0800

Hi, 

Andrew Beekhof <[EMAIL PROTECTED]> wrote: 
> 
> On Nov 13, 2007, at 1:02 PM, Sebastian Reitenbach wrote:
> 
> > Hi,
> >
> > I read in the v2 FAQ the following:
> >
> > What happens when monitor detects the resource down?
> > The node will try to restart the resource, but if this fails, it will
> > fail over to another node.
> > A feature that allows failover after N failures in a given period of
> > time is planned.
> >
> > Is that feature still planned?
> 
> that's how it works already - sort of.
> there is a layer of indirection with resource-failcount-stickiness, but
> basically once failcount hits a threshold - the resource moves.
> 
> knowing what to set resource-failcount-stickiness to can be tricky.
> one of the easiest, "i can turn my brain off" ways is:
> 1) start the cluster and make sure everything is running
> 2) figure out the current score (see conversations regarding the
> getscores.sh script that has been posted here)
Ah, I need to look for that.


> 3) divide said score by X (the fail-count threshold) and add 1
> 
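For what it's worth, the arithmetic in step 3 can be sketched as below. This is only a rough illustration in Python, not anything shipped with heartbeat: the function names are made up, the score is assumed to be the non-negative integer reported for the node currently running the resource, and X is assumed to be the number of failures after which the resource should move. How the result has to be entered in the CIB (sign, exact attribute name) is not covered here and should be checked against the documentation.

```python
def per_failure_penalty(score: int, max_failures: int) -> int:
    """Step 3 from the thread: divide the current score by X and add 1.

    Hypothetical helper. Integer division is an assumption; the thread
    does not say how to round.
    """
    if score < 0 or max_failures < 1:
        raise ValueError("need score >= 0 and max_failures >= 1")
    return score // max_failures + 1


def failures_to_exceed(score: int, penalty: int) -> int:
    """Smallest number of failures whose summed penalty exceeds the score."""
    if score < 0 or penalty < 1:
        raise ValueError("need score >= 0 and penalty >= 1")
    return score // penalty + 1


if __name__ == "__main__":
    score = 100  # made-up example value, not taken from a real cluster
    x = 3        # tolerated fail-count, the X from the subject line
    penalty = per_failure_penalty(score, x)
    print(penalty, failures_to_exceed(score, penalty))
```

With a score of 100 and X = 3 this gives a penalty of 34 per failure, and the third failure is the first one whose accumulated penalty (102) exceeds the score. The "add 1" is what guarantees that X failures always add up to more than the score.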
> > Could it also, instead of failing over, fence the node when
> > failcount > X?
> 
> no, at least not yet anyway
> 
> interesting idea though
I think that would be a viable option for resources that could get damaged
or cause confusion when started multiple times in a cluster, e.g. Xen
domUs, non-cluster-aware filesystems, IP addresses...

> 
> > Or is that working already, and the FAQ is not updated?
> > At least when I see this:
> > http://www.linux-ha.org/v2/faq/forced_failover
> > It seems to work already, but only in combination with moving a
> > resource to another location, not for fencing a node after a critical
> > fail-count is reached.
> > I've seen the fail_count utility, and tried to find examples on the
> > webpage, but that search was not too exhaustive.
> >
> > Also, can the fail-counts of different resources be summed up to make a
> > decision in combination with fencing? E.g. with resources A, B, C...:
> > if the failcount of A=3 plus B=4 gives SUM=7 > 6, then fence the node
> > where that limit is reached.
> 
> as above. not at the moment
> 
Thanks for the input. I'll open enhancement requests in Bugzilla later
today for the two things that are not possible yet.

kind regards
Sebastian

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
