You are definitely right: everything should be redundant, and it is. We are
using a UPS, but I have seen UPSes that passed their monthly tests and still
failed during a real power outage.

Is it possible to define a timeout? If "nodeB" cannot STONITH "nodeA" within,
say, 5 minutes and then gives up, the risk of a split brain would at least be
calculable.
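
To make the idea concrete, here is a rough sketch of the policy I mean,
written in Python purely for illustration. As far as I know this is not an
existing heartbeat option; `fence_or_give_up`, `try_fence` and the 300-second
default are hypothetical names and values.

```python
import time
from typing import Callable


def fence_or_give_up(
    try_fence: Callable[[], bool],          # hypothetical: True if the peer is confirmed off
    timeout_s: float = 300.0,               # assumed limit, e.g. 5 minutes
    retry_interval_s: float = 10.0,         # assumed pause between attempts
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry fencing until it succeeds or the deadline passes.

    Returns "fenced" if the peer was confirmed dead, or "timeout" if the
    deadline passed without any confirmation.
    """
    deadline = clock() + timeout_s
    while True:
        if try_fence():
            return "fenced"
        if clock() >= deadline:
            return "timeout"
        sleep(retry_interval_s)
```

If this returns "timeout", nodeB would start the resources anyway, and that is
exactly the point where the split-brain risk has to be accepted.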

Thanks,
   Christian

2008/8/28 Dejan Muhamedagic <[EMAIL PROTECTED]>

> On Wed, Aug 27, 2008 at 04:24:48PM +0200, Christian W?rns wrote:
> > Please also have a look at the message "*Don't retry to fence the other
> > node*" from Alexander Hoffman. I think he has the same problem.
> >
> > http://www.gossamer-threads.com/lists/linuxha/users/50496
> >
> > Dejan answered that the iLO has to return a success message even when the
> > server is off. In our case the iLO cannot answer at all because the whole
> > server is without power.
>
> That's a serious issue with all lights-out devices: they depend
> on the same power source as the node. Somehow, in that post you
> referenced, I missed that the power was completely out. Hence, if
> you use stonith devices of this kind, you had better make sure that
> there's always power: use redundant power supplies and a UPS.
>
> Thanks,
>
> Dejan
>
> > Thanks,
> >    Christian
> >
> > 2008/8/27 Christian W?rns <[EMAIL PROTECTED]>
> >
> > > Hello!
> > >
> > > Today we are testing disaster scenarios with our heartbeat / drbd
> > > cluster. Our first test case was to cut all power to node A without a
> > > shutdown. Heartbeat sees the resource go offline and tries a stonith to
> > > make sure the node is really dead. Because the iLO is also without
> > > power, the stonith fails. Heartbeat then hangs and no services are
> > > started on the remaining node. Is there any solution for this behaviour?
> > >
> > > Packages: (SLES10 SP2, standard packages)
> > >
> > > heartbeat-ldirectord-2.1.3-0.9
> > > heartbeat-2.1.3-0.9
> > > heartbeat-cmpi-2.1.3-0.9
> > > heartbeat-stonith-2.1.3-0.9
> > > heartbeat-pils-2.1.3-0.9
> > > yast2-heartbeat-2.13.13-0.3
> > > drbd-kmp-bigsmp-0.7.22_2.6.16.60_0.21-42.16
> > > drbd-0.7.22-42.16
> > >
> > > I am attaching the current CIB to this message.
> > >
> > > Any help much appreciated,
> > >    Christian
> > >
> > >
> > _______________________________________________
> > Linux-HA mailing list
> > [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems