Hello, Dejan,
The first thing I'd try is making sure you can fence each node from the
command line by manually running the fence agent. I'm not sure how to
do that for the "stonith:" type agents.
There's a program stonith(8). It's easy to replicate the
configuration on the command line.
Unfortunately, it is not.
The setup I'm referring to is similar to VMware. We use the cluster for
virtual machines (LPARs), and everything works fine, but the real pain
starts when a whole host system goes down. Since the cluster is in
production use right now, I can't afford to power a host off just for
testing.
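For reference, the read-only part of such a manual check might look
like the sketch below. It only builds the command lines; it does not
run them. The flags (-t, -l, -S, -T) are how I recall them from
stonith(8), and "ipaddr" is how I recall the ibmhmc parameter name, so
please verify both with "stonith -t ibmhmc -n" and the man page. The
HMC address is a made-up placeholder. Only the status and host-list
queries are printed, since they should not power anything off.

```python
import shlex


def stonith_cmd(device_type, params, action, node=None):
    """Build an argv list for the stonith(8) command-line tool.

    action: "list"   -> -l (hosts the device can manage)
            "status" -> -S (device status)
            "reset", "on", "off" -> -T <action> <node>  (destructive!)
    """
    argv = ["stonith", "-t", device_type]
    argv += [f"{key}={value}" for key, value in params.items()]
    if action == "list":
        argv.append("-l")
    elif action == "status":
        argv.append("-S")
    elif action in ("reset", "on", "off"):
        if node is None:
            raise ValueError("a node name is required for " + action)
        argv += ["-T", action, node]
    else:
        raise ValueError("unknown action: " + action)
    return argv


# Hypothetical HMC address; substitute the values from the CIB.
hmc = {"ipaddr": "192.0.2.10"}

# Print only the non-destructive queries.
for action in ("status", "list"):
    print(shlex.join(stonith_cmd("ibmhmc", hmc, action)))
```

The "reset", "on" and "off" actions are accepted by the helper but
deliberately not printed here, because running them fences a real node.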
Stonith agents are to be queried for the list of nodes they can
manage. It's part of the interface. Some agents can figure that
out by themselves and some need a parameter defining the node list.
And this is exactly where I'm stuck. I have two stonith devices
(ibmhmc) for redundancy. Both of them are capable of managing every
node. The problem starts when
1) one stonith device is completely lost and inaccessible (due to a
power outage in the datacenter)
2) the surviving stonith device can reach neither the cluster node nor
the hosting system (in VMware terms) for that cluster node, because
both of them are also lost in the power outage.
What is the correct solution for this situation?
Well, this used to be a standard way to configure one kind of
stonith resources, one common representative being ipmi, and
served exactly the purpose of restricting the stonith resource
from being enabled ("running") on a node which this resource
manages.
Unfortunately, there's no such thing as ipmi on IBM Power boxes. But it
raises an interesting question for me: if both a node and its
corresponding ipmi device are lost (due to a power outage), what
happens to the cluster? The surviving node, running the stonith
resource for the dead node, tries to contact the ipmi device (which is
also dead). How does the cluster know that the lost node is really dead
and that it's not just a network issue?
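To make the scenario concrete, here is a toy model of it. It assumes
the usual fencing rule as I understand it: a node counts as safely down
only once some device confirms the power-off, and an unreachable device
confirms nothing. This is not Pacemaker's actual code, and every name
in it (fence, node_state, lpar1 and so on) is hypothetical.

```python
def fence(node, devices):
    """Try each fencing device in order; return True only on confirmation.

    `devices` is a list of callables that take a node name and return
    True if the device confirmed the node was powered off or reset.
    An unreachable device raises ConnectionError.
    """
    for device in devices:
        try:
            if device(node):
                return True
        except ConnectionError:
            # Device unreachable: this tells us nothing about the node.
            continue
    return False


def node_state(node, devices):
    # Without a confirmed fence the node is never assumed to be dead.
    return "fenced" if fence(node, devices) else "unclean"


def dead_device(node):
    # Models a device lost in the same power outage as the node.
    raise ConnectionError("device lost in the power outage")


def working_device(node):
    # Models a device that reached the host and confirmed the power-off.
    return True


print(node_state("lpar1", [dead_device, dead_device]))     # unclean
print(node_state("lpar1", [dead_device, working_device]))  # fenced
```

Under that assumption, losing a node together with every device that
could fence it leaves the node in the unconfirmed state, which is the
case I am asking about.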
Thank you.
--
Regards,
Alexander Markov
+79104531955
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org