It depends on the HA package you are using. Heartbeat comes with a script that supports IPMI.
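For illustration only, here is a rough sketch of what an IPMI fencing helper has to do beyond sending "power off": it reports success only once the BMC itself says the chassis is off. This is not the script that ships with Heartbeat. It assumes `ipmitool` is installed, that the BMC accepts the `lanplus` interface, and that the password is supplied through the `IPMI_PASSWORD` environment variable (the `-E` option). The names `run_ipmitool` and `fence_node`, and the retry count and delay, are hypothetical.

```python
import subprocess
import time


def run_ipmitool(host, user, *args):
    """Run one ipmitool command against the node's BMC and return its stdout."""
    # -E makes ipmitool read the password from the IPMI_PASSWORD env variable.
    cmd = ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *args]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip() or "ipmitool failed")
    return result.stdout


def fence_node(host, user, run=run_ipmitool, retries=5, delay=2.0):
    """Return True only if the BMC confirms that the node is powered off."""
    try:
        run(host, user, "chassis", "power", "off")
    except Exception:
        # The request may have failed or timed out; do not assume anything.
        # The status check below is what decides the outcome.
        pass

    for _ in range(retries):
        try:
            status = run(host, user, "chassis", "power", "status")
        except Exception:
            status = ""  # unreachable BMC: we do not know, so not a success
        if "is off" in status.lower():
            return True
        time.sleep(delay)

    # Power state is still "on" or unknown: fencing must be reported as failed.
    return False
```

A wrapper would exit non-zero whenever `fence_node(...)` returns False, so the HA package does not take over the resource while the state of the other node is unknown.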
The important thing is that STONITH NOT succeed if you don't _know_ that the node is off. So it is absolutely not a 1-line script.

Kevin

David Noriega wrote:
> I think I'll go the ipmi route. So reading on STONITH, its just a
> script, so all I would need is a script to run ipmi that tells the
> server to power off, right?
>
> Also while reading through the lustre manual, seems some things are
> being deleted from the wiki,
> http://wiki.lustre.org/index.php?title=Clu_Manager no longer exists,
> and noticed this too when I found the lustre quick guide is no longer
> available.
>
> Thanks
> David
>
> On Tue, Aug 10, 2010 at 10:57 AM, Kevin Van Maren
> <[email protected]> wrote:
>
>> David Noriega wrote:
>>
>>> Could you describe this resource fencing in more detail? As for
>>> regards to STONITH, the pdu already has the grubby hands of IT plugged
>>> into it and doubt they would be happy if I unplugged them. What about
>>> the network management port or ILOM?
>>>
>> Resource fencing is needed to ensure that a node does not take over a
>> resource (ie, OST) while the other node is still accessing it (as could
>> happen if the node only partly crashes, where it is not responding to
>> the HA package but still writing to the disk).
>>
>> STONITH is a pretty common way to ensure the other node is dead and can
>> no longer access the resource. If you can't use your switched PDU, then
>> using the ILOM for IPMI-based power control works. The other common way
>> to do resource fencing is to use scsi reserve commands (if supported by
>> the hardware and the HA package) to ensure exclusive access.
>>
>> Kevin
>>
>>> On Mon, Aug 9, 2010 at 1:08 PM, Kevin Van Maren
>>> <[email protected]> wrote:
>>>
>>>> On Aug 9, 2010, at 11:45 AM, David Noriega <[email protected]> wrote:
>>>>
>>>>> My understanding of setting up fail-over is you need some control over
>>>>> the power so with a script it can turn off a machine by cutting its
>>>>> power? Is this correct?
>>>>>
>>>> It is the recommended configuration because it is simple to understand
>>>> and implement.
>>>>
>>>> But the only _hard_ requirement is that both nodes can access the
>>>> storage.
>>>>
>>>>> Is there a way to do fail-over without having
>>>>> access to the pdu(power strips)?
>>>>>
>>>> If you have IPMI support, that can be used for power control, instead
>>>> of a switched PDU. Depending on the storage, you may be able to do
>>>> resource fencing of the disks instead of STONITH. Or you can run
>>>> fast-and-loose, without any way to ensure the dead node is really
>>>> "dead" and not accessing storage (at your risk). While Lustre has MMP,
>>>> it is really more to protect against a mount typo than to guarantee
>>>> resource fencing.
>>>>
>>>>> Thanks
>>>>> David
>>>>>
>>>>> --
>>>>> Personally, I liked the university. They gave us money and facilities,
>>>>> we didn't have to produce anything! You've never been out of college!
>>>>> You don't know what it's like out there! I've worked in the private
>>>>> sector. They expect results. -Ray Ghostbusters
>>>>> _______________________________________________
>>>>> Lustre-discuss mailing list
>>>>> [email protected]
>>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
