Re: [Linux-HA] Unrelated resource getting restarted when other group resource status changes

Doug Knight Mon, 18 Jun 2007 08:01:13 -0700

Thanks Andrew.

I took a look at the links you provided, and had some questions. The
diff looks to have come from a different version of the lrm.c file than
the one in my 2.0.8 source tree. Also, the packages you pointed me to
are for 2.1.0. What would be the best and easiest way to incorporate the
fix into my existing 2.0.8 source tree?


Doug

On Mon, 2007-06-18 at 16:40 +0200, Andrew Beekhof wrote:

> On 6/18/07, Doug Knight <[EMAIL PROTECTED]> wrote:
> > All,
> > I have an HA cluster consisting of two nodes running HA 2.0.8. I have
> > configured two groups and a single individual resource, as follows:
> >
> > grp_pgsql_mirror - drbd, file system, postgresql, alias IP address
> > skybase_ingestor_HA - a background process that feeds our database with
> > raw data
> > grp_decoders - background processes that are triggered by storage of raw
> > data through postgresql triggers
> >
> > Both groups have colocated and ordered set to true. I control where each
> > runs by using location constraints. The problem I'm having is that when
> > I move the decoder group from one node to another, the ingestor resource
> > gets a restart when it should not be touched. I've attached my cibadmin
> > -Q output, a portion of the log where I've executed the switch showing
> > the restart on the ingestor, and the piece of xml I use to update the
> > location constraint. The command I use to apply the revised location
> > constraint xml is:
> >
> > cibadmin -o constraints -R -x rule_locate_decoder_dk.xml
> >
> > I simply do not see why making changes to the decoders would have any
> > impact on the ingestor from a heartbeat standpoint. Any ideas? Am I
> > missing something in the XML?
> 
> It got restarted because of these lines:
> pengine[14991]: 2007/06/18_08:51:00 WARN: check_action_definition:
> Parameters to skybase_ingestor_HA_start_0 on arc-dknightlx changed:
> recorded f7d867defb23b1498919d3b8aa223431 vs. calculated
> 05cc0923186775b674d1d4876ac94e56
> 
> So the restart was "caused" by the update you made only in that it
> triggered a re-run of the PE.
> 
> The real mystery is why we think the parameters changed.
> 
> oh.... I bet you're suffering from this problem:
>    http://hg.beekhof.net/lha/crm-dev/rev/f7775a4af780
> 
> that would cause false positives
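For readers following the thread, the check behind the "recorded ... vs. calculated ..." warning can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual pengine code and not the content of the linked patch: the parameter names, the canonical string format, and the "internal_meta" key are all hypothetical. It shows how a digest stored when the resource was started is compared with a freshly computed one, and how an attribute that reaches only one side of the comparison would make unchanged parameters look changed.

```python
import hashlib


def params_digest(params):
    """Return an MD5 hex digest over a canonical (sorted) rendering of params."""
    canonical = "".join(f"{key}={value};" for key, value in sorted(params.items()))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def needs_restart(recorded_digest, current_params):
    """True when the digest recorded at start differs from a fresh calculation."""
    return recorded_digest != params_digest(current_params)


# Hypothetical parameters for the ingestor resource (names are made up).
start_params = {"binfile": "/usr/local/bin/ingestor", "user": "skybase"}
recorded = params_digest(start_params)

# Same parameters on the next policy engine run: digests match, no restart.
unchanged = needs_restart(recorded, dict(start_params))

# Assumed failure mode: a bookkeeping key is included on only one side,
# so the digests differ although the real parameters are identical.
leaky = dict(start_params, internal_meta="1")
false_positive = needs_restart(recorded, leaky)

print("restart with unchanged params:", unchanged)
print("restart with leaked attribute:", false_positive)
```

Any event that makes the policy engine run again, such as replacing a location constraint, repeats this comparison for every running resource, which would explain why an update to the decoders could surface a mismatch on the ingestor.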
> 
> can i suggest the packages at:
>    http://software.opensuse.org/download/server:/ha-clustering
> they will have the indicated patch included.
> 
> >
> > Doug
> > p.s. I don't think the size of this email will exceed the list's limits,
> > but if it does I respectfully ask that it be passed along.
> 
> compressing the logs usually helps, but it looks like it was small enough
> anyway
> 
> > I have a
> > milestone to meet this week and would like to get this last issue
> > resolved as soon as I can. Thanks again.
> >
> > _______________________________________________
> > Linux-HA mailing list
> > Linux-HA@lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
> >
> >
