[ClusterLabs] Problem with stonith and starting services

Cesar Hernandez Mon, 03 Jul 2017 00:39:07 -0700

Hi

I have installed a pacemaker cluster with two nodes. The same type of 
installation has been done many times before, and the following error never 
appeared. The situation is the following:


both nodes running cluster services
stop pacemaker&corosync on node 1
stop pacemaker&corosync on node 2
start corosync&pacemaker on node 1

Node 1 then starts, sees node 2 down, and fences it, as expected. 
The problem comes when node 2 reboots and starts cluster services: 
sometimes the corosync service starts, but the pacemaker service starts and 
then stops. In these cases the syslog shows the following error:

Jul  3 09:07:04 node2 pacemakerd[597]:  warning: The crmd process (608) can no 
longer be respawned, shutting the cluster down.
Jul  3 09:07:04 node2 pacemakerd[597]:   notice: Shutting down Pacemaker

The preceding messages include some warnings, and I'm not sure whether they 
are related to the shutdown:


Jul  3 09:07:04 node2 stonith-ng[604]:   notice: Operation reboot of node2 by 
node1 for crmd.2413@node1.608d8118: OK
Jul  3 09:07:04 node2 crmd[608]:     crit: We were allegedly just fenced by 
node1 for node1!
Jul  3 09:07:04 node2 corosync[585]:   [pcmk  ] info: pcmk_ipc_exit: Client 
crmd (conn=0x1471800, async-conn=0x1471800) left
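
For anyone who wants to check their own logs for the same sequence, here is a minimal Python sketch that scans syslog lines for the two messages quoted above. The names SYSLOG_RE, PATTERNS and scan are hypothetical helpers made up for this illustration, not part of pacemaker or corosync, and the sample lines are the ones from this mail, rejoined where the mail client wrapped them.

```python
import re

# Classic syslog layout: "Mon DD HH:MM:SS host daemon[pid]: message".
SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<daemon>[\w-]+)\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)

# Substrings taken from the log excerpts in this mail; the tags are
# illustrative labels chosen for this sketch.
PATTERNS = {
    "self_fence_notice": "We were allegedly just fenced",
    "respawn_giveup": "can no longer be respawned",
}

def scan(lines):
    """Return (timestamp, host, daemon, tag) for each matching line."""
    hits = []
    for line in lines:
        m = SYSLOG_RE.match(line)
        if not m:
            continue  # not a syslog-formatted line
        for tag, needle in PATTERNS.items():
            if needle in m["msg"]:
                hits.append((m["ts"], m["host"], m["daemon"], tag))
    return hits

# Log lines from this mail, unwrapped to one line each.
SAMPLE = [
    "Jul  3 09:07:04 node2 stonith-ng[604]:   notice: Operation reboot of "
    "node2 by node1 for crmd.2413@node1.608d8118: OK",
    "Jul  3 09:07:04 node2 crmd[608]:     crit: We were allegedly just "
    "fenced by node1 for node1!",
    "Jul  3 09:07:04 node2 pacemakerd[597]:  warning: The crmd process (608) "
    "can no longer be respawned, shutting the cluster down.",
]

if __name__ == "__main__":
    for hit in scan(SAMPLE):
        print(hit)
```

To run it against a real log, pass an open file instead of SAMPLE, for example scan(open("/var/log/syslog")); the path is an assumption and depends on the distribution.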


On node1, all resources become unrunnable, and it stays that way until I 
manually start the pacemaker service on node2. 
As I said, the same type of installation has been done before on other servers 
and this never happened. The only difference is that in previous installations 
I configured corosync with multicast, and now I have configured it with 
unicast (my current network environment doesn't allow multicast), but I don't 
think that is related to this behaviour.

Cluster software versions:
corosync-1.4.8
crmsh-2.1.5
libqb-0.17.2
Pacemaker-1.1.14
resource-agents-3.9.6



Can you help me?

Thanks

Cesar



