Alexander Markov <prof...@tic-tac.ru> writes:

> Hello guys,
>
> it looks like I miss something obvious, but I just don't get what has
> happened.
>
> I've got a number of stonith-enabled clusters within my big POWER boxes.
> My stonith devices are two HMC (hardware management consoles) - separate
> servers from IBM that can reboot separate LPARs (logical partitions)
> within POWER boxes - one per every datacenter.
>
> So my definition for stonith devices was pretty straightforward:
>
> primitive st_dc2_hmc stonith:ibmhmc \
>         params ipaddr=10.1.2.9
> primitive st_dc1_hmc stonith:ibmhmc \
>         params ipaddr=10.1.2.8
> clone cl_st_dc2_hmc st_dc2_hmc
> clone cl_st_dc1_hmc st_dc1_hmc
>
> Everything was ok when we tested failover. But today upon power outage

Did you test failover through Pacemaker itself? Otherwise, the logs for
the attempted stonith should reveal more about how Pacemaker tried to
call the stonith device, and what went wrong.

However: Am I understanding it correctly that you have one node in each
data center, and a stonith device in each data center? That doesn't
sound like a setup that can recover from data center failure: If the
data center is lost, the stonith device for the node in that data center
would also be lost and thus not able to fence.

In such a hardware configuration, only a poison pill solution like SBD
could work, I think.

Cheers,
Kristoffer

> we lost one DC completely. Shortly after that cluster just literally
> hanged itself upon trying to reboot nonexistent node. No failover
> occurred. Nonexistent node was marked OFFLINE UNCLEAN and resources were
> marked "Started UNCLEAN" on nonexistent node.
>
> UNCLEAN seems to flag a problem with stonith configuration. So my
> question is: how to avoid such behaviour?
>
> Thank you!
>
> --
> Regards,
> Alexander
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>

-- 
// Kristoffer Grönlund
// kgronl...@suse.com