On Thursday 20 September 2007, Andrew Beekhof wrote:
> On 9/18/07, Max Hofer <[EMAIL PROTECTED]> wrote:
> > On Tuesday 18 September 2007, Spindler Michael wrote:
> > > Hi *,
> > >
> > > I've got a (hopefully) simple question:
> > >
> > > I have a 5 node cluster, running 20 resources (single processes). I would
> > > like to have the following behavior: If a resource fails, it should try
> > > to restart it on the same node. But this should be done max 2 times,
> > > then the resource should failover to another node. The resource should
> > > not do an auto failback after a failed host is up again.
> > >
> > > I have tried the following:
> > > - default_resource_failure_stickiness set to -1
> > > - resource_stickiness set to 3 (on each resource)
> > > - no places or other constraints configured.
> > >
> > > According to http://linux-ha.org/v2/faq/forced_failover we should get:
> > >
> > > (stickiness) / abs(failure stickiness) = maximum times a resource can
> > > fail before it is moved to another node.
> > >
> > > So in my case: 3 / abs(-1) = 3
> > >
> > > But my resources do a failover to other nodes immediately after the
> > > first failure.
> > >
> > >
> > > Anyone here who is able to help me with this failover-scenario?
> >
> > First of all always provide the file created by the pengine which led to
> > the failover - so we can give you an answer ;-) (see below for
> > explanations).
> >
> > The best way to tackle such kind of errors is the following method:
> >
> > * trigger a resource failure
> >
> > * check the ha-log and see which CIB-status file was written on the
> >   failover (grep "PEngine Input stored" /var/log/halog) ---> they are
> >   usually stored in /var/lib/heartbeat/pengine
>
> or just run:
>    cibadmin -Ql > tmp.cib.xml
> and use that. much easier than hunting around in the logs :-)

I do not know what the -l option does. Can you explain it please?
The manual is quite obscure:

   -l   command takes effect locally (rarely used, advanced option)

It seems that for this option it matters on which node the command is
run. Is that true?

My problem is that when something triggers a resource movement, I have to
track down the CIB state in which the trigger occurred. I used those
pengine logs to do that, because the CIB after the change is not of much
use if something changed in the cluster state afterwards.

kind regards
Max
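
For reference, the arithmetic behind the FAQ formula Michael quotes (stickiness / abs(failure stickiness)) can be written out as a small sketch. This only reproduces the formula as quoted in this thread; it is not what the policy engine actually computes, and Michael's report shows the observed behavior can differ. The function name and the use of integer (floor) division are my own assumptions for illustration.

```python
def max_local_failures(resource_stickiness: int, failure_stickiness: int) -> int:
    """Failures tolerated on one node according to the FAQ formula quoted above.

    Hypothetical helper: it mirrors the quoted arithmetic only and does not
    model how the cluster really scores nodes.
    """
    if failure_stickiness == 0:
        # With a failure stickiness of 0 the formula is undefined
        # (failures would never reduce the score).
        raise ValueError("failure_stickiness must be non-zero")
    # Assumption: round down, since a resource cannot fail a fractional number of times.
    return resource_stickiness // abs(failure_stickiness)


if __name__ == "__main__":
    # Michael's settings: resource_stickiness=3, default_resource_failure_stickiness=-1
    print(max_local_failures(3, -1))
```

With Michael's values the formula gives 3, which is what he expected, so the immediate failover he sees has to come from something the formula does not capture. That is why the pengine input file (or a CIB snapshot taken at the time of the failure) is needed to diagnose it.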
