On Fri, Jul 6, 2012 at 8:25 AM, Errol Neal <en...@businessgrade.com> wrote:
> Hi again. I was hoping to get some insight into why two nodes get rebooted in 
> my cluster when I halt one of them.
>
> I'm running corosync 1.1.4 and pacemaker-1.1.6 on CentOS 6.2. I've put my 
> configuration up on pastebin if anyone would like to take a look
>
> http://pastebin.com/raw.php?i=6cAkJ3Qk

That's not really enough, I'm afraid. We'd need a crm_report archive, which
has the logs and other data necessary to debug an issue of this kind.
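
For anyone unsure how to produce such an archive, here is a minimal sketch
that builds a crm_report command line covering a window around the incident.
The incident time, the 30-minute window and the destination path are
placeholders chosen for illustration, not values from this thread; check
"crm_report --help" on your version before relying on the flags.

```python
import shlex
from datetime import datetime, timedelta


def crm_report_command(incident, window_minutes=30, dest="/tmp/fencing-report"):
    """Build an argv list for crm_report covering a window around `incident`.

    -f / -t are the "from" and "to" times; `dest` is a hypothetical output
    location picked for this example.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    start = incident - timedelta(minutes=window_minutes)
    end = incident + timedelta(minutes=window_minutes)
    return ["crm_report", "-f", start.strftime(fmt), "-t", end.strftime(fmt), dest]


# Placeholder incident time: replace with when the nodes were actually rebooted.
cmd = crm_report_command(datetime(2012, 7, 6, 8, 0, 0))

# Print a shell-safe version that can be pasted on a cluster node.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Run the printed command on one of the cluster nodes and attach the resulting
archive to your reply.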

>
> Could this be related?

No.

>
> ERROR: native_create_actions: Resource st-xenapi-nas1-dev3-fence 
> (stonith::fence_xenapi) is active on 2 nodes attempting recovery
>
> I noticed that during such times, multiple nodes are running the same 
> resource. Incidentally, even if this isn't the cause, is there a way to 
> prevent this?

Not really, although I have been thinking about how to mask it in the PE.

Basically, if a fencing device is active on nodeX and nodeX is about to be
fenced, under some conditions we start the device on nodeY before stopping
it on nodeX.
This is cheating a little, but is the only way to make progress if
nodeY needs it to fence nodeX or another node that failed at the same
time.
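
To make the ordering concrete, here is a toy model of why the device briefly
shows up as active on two nodes. It is only an illustration, not the policy
engine's actual code: the function names are made up, and the "some
conditions" are collapsed into a single boolean.

```python
def recovery_actions(device, old_node, new_node, old_node_is_fence_target):
    """Return the ordered (op, device, node) actions for moving a fencing device.

    Hypothetical simplification: normally the device is stopped before it is
    started elsewhere; if its current host is about to be fenced, the start
    is allowed to go first so the new host can carry out the fencing.
    """
    stop = ("stop", device, old_node)
    start = ("start", device, new_node)
    if old_node_is_fence_target:
        return [start, stop]
    return [stop, start]


def max_active_nodes(actions, initially_active):
    """Replay the actions and report the peak number of nodes running the device."""
    active = set(initially_active)
    peak = len(active)
    for op, _device, node in actions:
        if op == "start":
            active.add(node)
        elif op == "stop":
            active.discard(node)
        peak = max(peak, len(active))
    return peak


device = "st-xenapi-nas1-dev3-fence"
relaxed = recovery_actions(device, "nodeX", "nodeY", old_node_is_fence_target=True)
strict = recovery_actions(device, "nodeX", "nodeY", old_node_is_fence_target=False)

print(max_active_nodes(relaxed, {"nodeX"}))
print(max_active_nodes(strict, {"nodeX"}))
```

In the relaxed ordering the device is momentarily counted on both nodeX and
nodeY, which is the window in which the "active on 2 nodes" error is logged.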

> Thanks in advance..
>
> -Errol
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

