Re: [Linux-HA] restart x times before a failover

Max Hofer Thu, 20 Sep 2007 05:14:49 -0700

On Thursday 20 September 2007, Andrew Beekhof wrote:
> On 9/18/07, Max Hofer <[EMAIL PROTECTED]> wrote:
> > On Tuesday 18 September 2007, Spindler Michael wrote:
> > > Hi *,
> > >
> > > I've got a (hopefully) simple question:
> > >
> > > I have a 5-node cluster running 20 resources (single processes). I would
> > > like to have the following behavior: if a resource fails, the cluster
> > > should try to restart it on the same node. But this should be done at most
> > > 2 times; then the resource should fail over to another node. The resource
> > > should not automatically fail back after a failed host is up again.
> > >
> > > I have tried the following:
> > > - default_resource_failure_stickiness set to -1
> > > - resource_stickiness set to 3 (on each resource)
> > > - no places or other constraints configured.
> > >
> > > According to http://linux-ha.org/v2/faq/forced_failover we should get:
> > >
> > > (stickiness) / abs(failure stickiness) = maximum number of times a
> > > resource can fail before it is moved to another node.
> > >
> > > So in my case: 3 / abs(-1) = 3
> > >
> > > But my resources fail over to other nodes immediately after the
> > > first failure.
> > >
> > >
> > > Anyone here who is able to help me with this failover-scenario?
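The rule of thumb quoted from the FAQ above can be written out as a small calculation. This is only a sketch of the quoted formula, not of what the policy engine actually does (the original poster reports that the cluster did not behave this way); the function name and its parameters are made up for illustration.

```python
def max_local_failures(stickiness, failure_stickiness):
    """Quotient from the FAQ rule: stickiness / abs(failure stickiness).

    Hypothetical helper: it only restates the quoted formula and does not
    query or model a real cluster.
    """
    if failure_stickiness == 0:
        # With no failure penalty the quotient is undefined.
        raise ValueError("failure_stickiness must be non-zero")
    return stickiness / abs(failure_stickiness)


# Values from the post: resource_stickiness 3,
# default_resource_failure_stickiness -1.
print(max_local_failures(3, -1))
```

With the values from the post this gives 3, matching the arithmetic in the quoted message.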
> >
> > First of all, always provide the file created by the pengine which led to
> > the failover - so we can give you an answer ;-)  (see below for
> > explanations).
> >
> > The best way to tackle this kind of error is the following method:
> >
> > * trigger a resource failure
> >
> > * check the ha-log and see which CIB-status file was written on the
> > failover (grep "PEngine Input stored" /var/log/halog) ---> they are
> > usually stored in /var/lib/heartbeat/pengine
>
> or just run:
>     cibadmin -Ql > tmp.cib.xml
> and use that.  much easier than hunting around in the logs :-)
I do not know what the -l option does. Can you explain it please?


The manual is quite obscure:
-l command takes effect locally (rarely used, advanced option)

It seems that for this option it is important on which node the command is 
run. Is that true?

My problem is that when something triggers a resource movement, I have to
track down the CIB state in which the trigger occurred. I used those pengine
logs to do that, because the CIB after the change is not of much use if
something changed in the cluster state afterwards.
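
For what it is worth, the log search mentioned earlier in the thread (grep "PEngine Input stored" /var/log/halog) can be scripted. The following is a minimal sketch, assuming the stored file path appears on the same log line and lives under /var/lib/heartbeat/pengine; the exact log line layout is an assumption, and the function name is made up.

```python
import re

MARKER = "PEngine Input stored"
# Assumption: the path of the stored input is printed on the marker line
# and sits under the usual pengine directory.
PATH_RE = re.compile(r"(/var/lib/heartbeat/pengine/\S+)")


def pengine_inputs(log_lines):
    """Return (line, path) pairs for each line mentioning a stored input.

    path is None when the marker is present but no path could be found.
    """
    hits = []
    for line in log_lines:
        if MARKER not in line:
            continue
        match = PATH_RE.search(line)
        hits.append((line.rstrip("\n"), match.group(1) if match else None))
    return hits
```

Feeding it the lines of /var/log/halog and taking the last hit before the resource moved should point at the CIB state that triggered the transition.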

kind regards Max
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
