Hi,

On Tue, Dec 14, 2010 at 12:16:22PM +0200, Chris Picton wrote:
> Hi
>
> I have noticed this happening a few times on various of my clusters.
> The monitor operation for some resources stops running, and thus
> resource failures are not detected.  If I edit the cib, and change
> something regarding the resource (generally I change the monitor
> interval), the resource starts monitoring again, detects the failure and
> restarts correctly
>
> I am using pacemaker 1.0.9 live, and 1.0.10 in test.
>
> This has happened with both clone and non-clone resources.
>
> I have attached a log which shows the behaviour.  I have a resource
> (megaswitch) running cloned over 6 nodes.
>
> Until 06:48:22, the monitor is running correctly (the app logs the
> "Deleting context for MONTEST-" line when the monitor is run)
> After that, the monitor is not run again on this node
>
> I have the logs for the other nodes, if they are needed to try and debug
> this.

Nov 28 06:48:26 sbc-tpna2-01 crmd: [4863]: info: do_lrm_invoke: Removing resource megaswitch:3 from the LRM
Nov 28 06:48:26 sbc-tpna2-01 crmd: [4863]: info: do_lrm_invoke: Resource 'megaswitch:3' deleted for 19511_crm_resource on sbc-tpna2-06.ecntelecoms.za.net
Nov 28 06:48:26 sbc-tpna2-01 crmd: [4863]: info: notify_deleted: Notifying 19511_crm_resource on sbc-tpna2-06.ecntelecoms.za.net that megaswitch:3 was deleted

Somebody/something on sbc-tpna2-06.ecntelecoms.za.net ran
crm_resource (or perhaps the crm shell) and removed megaswitch from
the LRM. Any suspicious cron jobs over there?

Thanks,

Dejan

> --
> Chris Picton
>
> Executive Manager - Systems
> ECN Telecommunications (Pty) Ltd
> t: 010 590 0031 m: 079 721 8521
> f: 087 941 0813
> e: ch...@ecntelecoms.com
>
> "Lowering the cost of doing business"
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
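
P.S. To check the logs from the other nodes for the same pattern, something
like the following might help. It is only a rough sketch: the regular
expression is modelled on the single "Resource '...' deleted for ... on ..."
line quoted above, and the function name, field names and sample text are
made up for illustration. Other Pacemaker versions may word the message
differently.

```python
import re

# Pattern modelled on the one crmd line quoted in this thread; treat the
# exact wording as an assumption for any other Pacemaker version.
DELETED_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+crmd:.*"
    r"do_lrm_invoke: Resource '(?P<resource>[^']+)' deleted for "
    r"(?P<client>\S+) on (?P<origin>\S+)"
)


def find_lrm_deletions(lines):
    """Return one dict per log line that reports a resource deleted from the LRM.

    Each dict holds the timestamp, the logging host, the resource name,
    the requesting client string, and the node the request came from.
    """
    hits = []
    for line in lines:
        match = DELETED_RE.match(line)
        if match:
            hits.append(match.groupdict())
    return hits


# Sample input copied from the log excerpt in this thread.
SAMPLE = """\
Nov 28 06:48:26 sbc-tpna2-01 crmd: [4863]: info: do_lrm_invoke: Removing resource megaswitch:3 from the LRM
Nov 28 06:48:26 sbc-tpna2-01 crmd: [4863]: info: do_lrm_invoke: Resource 'megaswitch:3' deleted for 19511_crm_resource on sbc-tpna2-06.ecntelecoms.za.net
Nov 28 06:48:26 sbc-tpna2-01 crmd: [4863]: info: notify_deleted: Notifying 19511_crm_resource on sbc-tpna2-06.ecntelecoms.za.net that megaswitch:3 was deleted
"""

for hit in find_lrm_deletions(SAMPLE.splitlines()):
    print(hit["stamp"], hit["resource"], "deleted on request from", hit["origin"])
```

Running the same function over each node's log (for example over
open(path) for a path of your choosing) should show which node issued each
deletion, which can then be compared against the cron schedules on that node.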