On Fri, Jul 6, 2012 at 8:25 AM, Errol Neal <en...@businessgrade.com> wrote:
> Hi again. I was hoping to get some insight into why two nodes get rebooted in
> my cluster when I halt one of them.
>
> I'm running corosync 1.1.4 and pacemaker-1.1.6 on CentOS 6.2. I've put my
> configuration up on pastebin if anyone would like to take a look
>
> http://pastebin.com/raw.php?i=6cAkJ3Qk

Not really enough, I'm afraid. We'd need a crm_report archive, which has the
logs and other data necessary to debug an issue of this kind.

>
> Could this be related?

No.

>
> ERROR: native_create_actions: Resource st-xenapi-nas1-dev3-fence
> (stonith::fence_xenapi) is active on 2 nodes attempting recovery
>
> I noticed that during such times, multiple nodes are running the same
> resource. Incidentally, even if this isn't the cause, is there a way to
> prevent this?

Not really, although I have been thinking about how to mask it in the PE.
Basically, if there is a fencing device active on nodeX that is about to be
fenced, under some conditions we start it on nodeY before stopping it on nodeX.
This is cheating a little, but it is the only way to make progress if nodeY
needs it to fence nodeX or another node that failed at the same time.

> Thanks in advance..
>
> -Errol
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
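
To illustrate the start-before-stop ordering described in the reply above, here is a toy Python sketch. It is not Pacemaker's policy engine code and makes no claim about how the PE is implemented: the function names plan_device_move and max_active_nodes are hypothetical, and the conditions under which the real PE reorders actions are reduced here to a single boolean flag. It only shows why the device is briefly counted as active on two nodes when the start is scheduled first.

```python
def plan_device_move(device, current_node, new_node, current_node_to_be_fenced):
    """Toy ordering of start/stop actions for a fencing device.

    Hypothetical helper for illustration only, not Pacemaker code.
    """
    stop = ("stop", device, current_node)
    start = ("start", device, new_node)
    if current_node_to_be_fenced:
        # Start first, so new_node can use the device to fence current_node.
        return [start, stop]
    # Ordinary move: stop on the old node before starting on the new one.
    return [stop, start]


def max_active_nodes(plan, initially_active_on):
    """Replay a plan and return the peak number of nodes running the device."""
    active = set(initially_active_on)
    peak = len(active)
    for action, _device, node in plan:
        if action == "start":
            active.add(node)
        else:
            active.discard(node)
        peak = max(peak, len(active))
    return peak


device = "st-xenapi-nas1-dev3-fence"
fencing_plan = plan_device_move(device, "nodeX", "nodeY", current_node_to_be_fenced=True)
normal_plan = plan_device_move(device, "nodeX", "nodeY", current_node_to_be_fenced=False)

print(fencing_plan)
print(max_active_nodes(fencing_plan, {"nodeX"}))
print(max_active_nodes(normal_plan, {"nodeX"}))
```

In this toy model the fencing case peaks at two active nodes between the start and the stop, while the ordinary move never exceeds one.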