You are definitely right: everything should be redundant, and it is. We are using a UPS, but I have seen UPSes that tested fine every month and still failed during a real power outage.
Is it possible to define a timeout? If "nodeB" cannot STONITH "nodeA" for, say, 5 minutes, the risk of a split brain would be calculable.

Thanks,
Christian

2008/8/28 Dejan Muhamedagic <[EMAIL PROTECTED]>

> On Wed, Aug 27, 2008 at 04:24:48PM +0200, Christian W?rns wrote:
> > Please also have a look at the message "*Don't retry to fence the other
> > node*" from Alexander Hoffman. He has the same problem, I think.
> >
> > http://www.gossamer-threads.com/lists/linuxha/users/50496
> >
> > Dejan answered that the iLO has to return a success message even when
> > the server is off. So, in our case the iLO can't answer because the
> > whole server is without power.
>
> That's a serious issue with all lights-out devices: they depend
> on the same power source as the node. Somehow, in that post you
> referenced, I missed that the power was completely out. Hence, if
> you use stonith devices of this kind you better make sure that
> there's always power: use redundant power supplies and a UPS.
>
> Thanks,
>
> Dejan
>
> > Thanks,
> > Christian
> >
> > 2008/8/27 Christian W?rns <[EMAIL PROTECTED]>
> >
> > > Hello!
> > >
> > > Today we are testing disaster scenarios with our heartbeat / drbd
> > > cluster. Our first test case was to cut power to node A completely,
> > > without a shutdown. Heartbeat sees the resource offline and tries to
> > > do a stonith to make sure it's really dead. Because the iLO
> > > connection is also without power, the stonith failed. Then heartbeat
> > > hangs and no services get started on the remaining node. Is there
> > > any solution for this behaviour?
> > >
> > > Packages: (SLES10 SP2, standard packages)
> > >
> > > heartbeat-ldirectord-2.1.3-0.9
> > > heartbeat-2.1.3-0.9
> > > heartbeat-cmpi-2.1.3-0.9
> > > heartbeat-stonith-2.1.3-0.9
> > > heartbeat-pils-2.1.3-0.9
> > > yast2-heartbeat-2.13.13-0.3
> > > drbd-kmp-bigsmp-0.7.22_2.6.16.60_0.21-42.16
> > > drbd-0.7.22-42.16
> > >
> > > I will attach the current CIB to this message.
> > >
> > > Any help much appreciated,
> > > Christian

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
