On Mon, 2019-08-12 at 17:46 +0200, Ulrich Windl wrote:
> Hi!
>
> I just noticed that a "crm resource cleanup <rsc>" caused some
> unexpected behavior and the syslog message:
> crmd[7281]: warning: new_event_notification (7281-97955-15): Broken
> pipe (32)
>
> It's SLES14 SP4, last updated Sept. 2018 (up since then,
> pacemaker-1.1.19+20180928.0d2680780-1.8.x86_64).
>
> The cleanup was due to a failed monitor. As an unexpected consequence
> of this cleanup, CRM seemed to restart the complete resource (and
> dependencies), even though it was running.
I assume the monitor failure was old, and recovery had already
completed? If not, recovery might have been initiated before the
clean-up was recorded.

> I noticed that a manual "crm_resource -C -r <rsc> -N <node>" command
> has the same effect (multiple resources are "Cleaned up", and
> resources are restarted seemingly before the "probe" is done).

Can you verify whether the probes were done? The DC should log a
message when each <rsc>_monitor_0 result comes in.

> Actually the manual says when cleaning up a single primitive, the
> whole group is cleaned up, unless using --force. Well, I don't like
> this default, as I expect any status change from a probe would
> propagate to the group anyway...

In 1.1, clean-up always wipes the history of the affected resources,
regardless of whether the history is for success or failure. That
means all the cleaned resources will be reprobed.

In 2.0, clean-up by default wipes the history only if there's a failed
action (--refresh/-R is required to get the 1.1 behavior). That
lessens the impact of the "default to whole group" behavior.

I think the original idea was that a group indicates that the
resources are closely related, so changing the status of one member
might affect what status the others report.

> Regards,
> Ulrich

--
Ken Gaillot <kgail...@redhat.com>
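
One way to check whether the probes actually ran is to grep the DC's
syslog for the probe operation keys mentioned above. A minimal sketch,
assuming a resource named rsc1 and syslog written to /var/log/messages
(both are placeholders, not details from this thread):

    # on the DC, look for the probe (<rsc>_monitor_0) results
    grep "rsc1_monitor_0" /var/log/messages

    # one-shot view of the current resource status
    crm_mon -1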
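
And a rough sketch of the clean-up variants discussed above, with rsc1
and node1 as placeholder names; the behavioral split is between 1.1
and 2.0 as described:

    # 1.1: wipes ALL history for the resource, so it gets reprobed
    # 2.0: wipes history only where an action failed
    crm_resource --cleanup -r rsc1 -N node1

    # 2.0: wipe all history regardless of success or failure
    # (i.e. the 1.1 behavior)
    crm_resource --refresh -r rsc1 -N node1

    # clean up just the one primitive instead of its whole group
    crm_resource --cleanup -r rsc1 --force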