On Fri, 2021-01-15 at 11:40 +0100, Ulrich Windl wrote:
> Hi!
>
> With a cluster recheck interval, I see periodic log messages like
> this:
> Jan 15 11:05:50 h19 pacemaker-controld[4804]: notice: State
> transition S_TRANSITION_ENGINE -> S_IDLE
> Jan 15 11:15:50 h19 pacemaker-controld[4804]: notice: State
> transition S_IDLE -> S_POLICY_ENGINE

The "transition" terminology is a little confusing. Note that the above
uses of it are just in the normal sense, i.e. the controller state
changed. The controller uses a finite state machine to keep track of
what it's doing now and next.

Going from "transition engine" to "idle" means it finished whatever
needed to be done in that transition (in the more technical Pacemaker
sense). Going from "idle" to "policy engine" means it is ready to
re-invoke the scheduler to re-check whether anything needs to be done.

> Jan 15 11:15:50 h19 pacemaker-schedulerd[4803]: notice: Watchdog
> will be used via SBD if fencing is required and stonith-watchdog-
> timeout is nonzero
> Jan 15 11:15:50 h19 pacemaker-schedulerd[4803]: notice: Calculated
> transition 596, saving inputs in /var/lib/pacemaker/pengine/pe-input-
> 41.bz2
> Jan 15 11:15:50 h19 pacemaker-controld[4804]: notice: Processing
> graph 596 (ref=pe_calc-dc-1610705750-978) derived from
> /var/lib/pacemaker/pengine/pe-input-41.bz2
> Jan 15 11:15:50 h19 pacemaker-controld[4804]: notice: Transition 596
> (Complete=3, Pending=0, Fired=0, Skipped=0, Incomplete=0,
> Source=/var/lib/pacemaker/pengine/pe-input-41.bz2): Complete
>
> The "transition" number increases each time, while there is visible
> no action to be performed. So what's in such a "transition"? Couldn't
> the cluster skip those lines if there's nothing to do?
>
> Regards,
> Ulrich

"Transition" as Pacemaker uses it in a technical sense is what you
called in a different post an "action plan". A transition is the set of
all actions needed to bring the cluster to the desired state (as defined
by the configuration), given everything known about the cluster at the
moment (represented by the complete CIB, including configuration and
status).

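To make the state cycle described above a bit more concrete, here is a
toy Python sketch. It is not Pacemaker code: only the S_* state names
come from the logs above. The class, the method names, the "schedule"
callback, and the exact log wording are all made up for illustration.

```python
from enum import Enum


class State(Enum):
    # State names as they appear in the controller log messages.
    S_IDLE = "S_IDLE"
    S_POLICY_ENGINE = "S_POLICY_ENGINE"
    S_TRANSITION_ENGINE = "S_TRANSITION_ENGINE"


class ToyController:
    """Hypothetical toy model of the controller's state cycle."""

    def __init__(self, first_transition=0):
        self.state = State.S_IDLE
        self.transition_id = first_transition
        self.log = []

    def _set_state(self, new_state):
        # "Transition" in the everyday sense: the controller state changed.
        self.log.append(
            f"State transition {self.state.value} -> {new_state.value}"
        )
        self.state = new_state

    def recheck(self, schedule):
        """Run one cycle; schedule() returns a list of callables to run."""
        self._set_state(State.S_POLICY_ENGINE)
        # "Transition" in the technical sense: the computed action plan.
        actions = schedule()
        self.log.append(
            f"Calculated transition {self.transition_id} "
            f"({len(actions)} actions)"
        )
        self._set_state(State.S_TRANSITION_ENGINE)
        for action in actions:
            action()
        self._set_state(State.S_IDLE)
        # The number goes up even when the plan was empty.
        self.transition_id += 1
        return len(actions)


if __name__ == "__main__":
    controller = ToyController(first_transition=596)
    controller.recheck(lambda: [])  # a recheck that finds nothing to do
    print("\n".join(controller.log))
```

Note that in this sketch a recheck with an empty plan still walks through
all three states and still uses up a transition number, which is the
pattern seen in the logs.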
The controller starts a new transition whenever something interesting
happens (like a resource monitor failure), when a transition action
returns an unexpected result (like a start failing instead of
succeeding), and periodically (according to cluster-recheck-interval).

In any case, it's possible there's nothing to do, so the transition has
no actions. It's still a record that the cluster checked whether
anything needed to be done, and decided no.

I have considered lowering the log message to info level in that case,
though -- that probably makes sense.
-- 
Ken Gaillot <kgail...@redhat.com>