----- Original Message -----
> From: "James Guthrie" <j...@open.ch>
> To: "The Pacemaker cluster resource manager" <pacemaker@oss.clusterlabs.org>
> Sent: Wednesday, February 6, 2013 6:52:07 AM
> Subject: Re: [Pacemaker] Pacemaker resource migration behaviour
>
> A quick addendum to this message:
>
> The log files I provided actually continue until the resources do get
> started on the host. The trigger for that is the 6-minute
> failure-timeout timer that pops. As can be seen in pe-input-50, the
> resources conntrackd, condition, sub-ospfd and sub-ripd are in slave
> on both hosts and sub-squid is not started on either. This shows
> that the desired end-state of the transitions produced with
> pe-input-49 is never reached.
>

Yep, this looks like a bug in attrd. I see the command going out to delete the fail-count for squid, but it fails. Since the fail-count isn't properly expired, that sub-squid device can't start.

Can you open a bugs.clusterlabs.org issue for this, please? Include the logs.

Thanks,
-- Vossel

> James
>
> On Feb 6, 2013, at 1:41 PM, James Guthrie <j...@open.ch> wrote:
>
> > Hi David,
> >
> > Unfortunately crm_report doesn't work correctly on my hosts as we
> > have compiled from source with custom paths and apparently the
> > crm_report and associated tools are not built to use the paths
> > that can be customised with autoconf.
> >
> > Despite that, I have done some investigation and think I may have
> > found an inconsistency. I have attached the pacemaker-relevant
> > syslog, including the pe-input files. The logfile starts where
> > pacemaker detects that sub-squid is not running on mu. It then
> > fails over to nu, where two further failures take place. In order
> > to recover from these failures, the pengine produces transitions
> > 106, 107, 108 and 109, with the corresponding pe-input files 46,
> > 47, 48 and 49.
> >
> > The way I understand it, pacemaker works through the transitions
> > until something happens from outside, at which point the
> > transitions are recalculated and pacemaker continues on.
> >
> > Using crm_simulate to observe the transitions that should happen
> > tells me that the transitions that were calculated from
> > pe-input-49 ought to have resulted in the resources conntrackd,
> > condition, sub-ospfd, sub-ripd and sub-squid being promoted to
> > master. In fact, this never happens, but the crmd reports the
> > transition as being complete. It appears as though nowhere is it
> > acknowledged that the current state is not the desired outcome as
> > calculated by the pengine. Is it possible that this is a bug?
> >
> > Regards,
> > James
> >
> > <pacemaker-not-starting-resources.tar.gz>
> > On Feb 5, 2013, at 7:41 PM, David Vossel <dvos...@redhat.com> wrote:
> >
> >> ----- Original Message -----
> >>> From: "James Guthrie" <j...@open.ch>
> >>> To: "The Pacemaker cluster resource manager" <pacemaker@oss.clusterlabs.org>
> >>> Sent: Tuesday, February 5, 2013 8:12:57 AM
> >>> Subject: Re: [Pacemaker] Pacemaker resource migration behaviour
> >>>
> >>> Hi all,
> >>>
> >>> as a follow-up to this, I realised that I needed to slightly change
> >>> the way the resource constraints are put together, but I'm still
> >>> seeing the same behaviour.
> >>>
> >>> Below is an excerpt from the logs on the host and the revised xml
> >>> configuration. In this case, I caused two failures on the host mu,
> >>> which forced the resources onto nu, then I forced two failures on nu.
> >>> What can be seen in the logs are the two detected failures on nu
> >>> (the "warning: update_failcount:" lines). After the two failures on
> >>> nu, the VIP is migrated back to mu, but none of the "support"
> >>> resources are promoted with it.
> >>
> >> I can't tell much from this output.
> >>
> >> Run the steps you use to reproduce this and create a crm_report of
> >> the issue so we can see both the logs and pengine transition
> >> files that precede this.
> >>
> >> -- Vossel
> >>
> >>> Regards,
> >>> James
> >>>
> >>
> >> _______________________________________________
> >> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>
> >> Project Home: http://www.clusterlabs.org
> >> Getting started:
> >> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> Bugs: http://bugs.clusterlabs.org
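
As a rough illustration of why sub-squid stayed stopped until the 6-minute failure-timeout popped, here is a minimal Python sketch of the fail-count logic discussed in this thread. It is a simplified model and not Pacemaker's implementation: the FailureRecord class, the may_run helper and the migration threshold of 2 are assumptions made up for the example. Only the 360-second timeout is taken from the thread.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed values for illustration; only the 6-minute timeout is from the thread.
MIGRATION_THRESHOLD = 2      # hypothetical: failures allowed before a node is avoided
FAILURE_TIMEOUT = 360.0      # seconds (the 6-minute failure-timeout)


@dataclass
class FailureRecord:
    """Hypothetical per-node, per-resource failure bookkeeping."""
    fail_count: int = 0
    last_failure: Optional[float] = None  # timestamp of the latest failure

    def record_failure(self, now: float) -> None:
        """Count one more failure and remember when it happened."""
        self.fail_count += 1
        self.last_failure = now

    def expire_if_due(self, now: float, failure_timeout: float) -> bool:
        """Reset the count once failure_timeout has passed since the last failure."""
        if self.last_failure is not None and now - self.last_failure >= failure_timeout:
            self.fail_count = 0
            self.last_failure = None
            return True
        return False


def may_run(record: FailureRecord, migration_threshold: int) -> bool:
    """A resource may run on the node only while its fail-count is below the threshold."""
    return record.fail_count < migration_threshold


rec = FailureRecord()
rec.record_failure(now=0.0)
rec.record_failure(now=10.0)
print(may_run(rec, MIGRATION_THRESHOLD))            # False: threshold reached

# If an explicit fail-count delete does not take effect, nothing changes
# until the timeout elapses.
rec.expire_if_due(now=100.0, failure_timeout=FAILURE_TIMEOUT)
print(may_run(rec, MIGRATION_THRESHOLD))            # False: still blocked

rec.expire_if_due(now=370.0, failure_timeout=FAILURE_TIMEOUT)
print(may_run(rec, MIGRATION_THRESHOLD))            # True: fail-count expired
```

In this model, a failed attempt to clear the fail-count leaves the resource blocked on the node, and it only becomes eligible again once the timeout expires, which matches the behaviour reported above.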