> > > No Linux project I know of can yet provide generic failsafe clustering.
Have a look at http://www.eddieware.org/. It might be what you are looking for,
although it is over the top for the original request.
> And what about the system BIOS looking for the master boot record (ie,
> LILO) on the first disk? If that disk is completely dead, fine, it'll
> try the next one, but if it's just returning corrupt data? (Never mind
> IDE hardware that would even handle the completely-dead case).
It's not bullet-proof, but couldn't you get a reasonable degree of fault-tolerance
with the following RAID-1 setup:
Install LILO on the MBR of both disks. In the second disk's lilo.conf, use the
bios=0x80 trick so that you can boot from it if the first disk fails completely.

In the first disk's lilo.conf, make the default entry load the kernel etc. from the
second disk. Add another entry (let's give it the label "normal") that boots from the
first disk, with an append line which includes "panic=30" (or however long you want
to give it). Add a "lilo -R normal" line to the shutdown script, so that after a clean
shutdown the next boot uses "normal".

Now, if disk one fails completely, the system should boot from the second disk. If
disk one develops an error and the kernel panics on bootup, the system should reboot
after the panic timeout and this time load from disk two, because the command line
set by "lilo -R" is used for one boot only. (I haven't tried this, but I guess you
could get the same effect using LILO's "fallback" option.)

The event which this setup cannot handle is corruption of the MBR on disk one, but I
don't think you will get perfection without going to hardware RAID or
server clustering. However, I would have thought this is secure enough for the
majority of applications.
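
As a rough, untested sketch of the two configuration files described above: the
device names (/dev/hda, /dev/hdb, /dev/md0), the /boot and /boot2 paths and the
"disk2" and "linux" labels are only assumptions for illustration, so adjust them to
your own layout. Only the "normal" label, "panic=30", bios=0x80 and "lilo -R normal"
come from the description.

```
# lilo.conf used to install LILO on disk one (assumed to be /dev/hda)
boot=/dev/hda
default=disk2              # default entry: kernel copy held on disk two

image=/boot2/vmlinuz       # assumed path to the kernel copy on disk two
    label=disk2
    root=/dev/md0          # assumed RAID-1 root device
    read-only

image=/boot/vmlinuz        # kernel on disk one
    label=normal
    root=/dev/md0
    read-only
    append="panic=30"      # reboot 30 seconds after a kernel panic

# lilo.conf used to install LILO on disk two (assumed to be /dev/hdb)
boot=/dev/hdb
disk=/dev/hdb
    bios=0x80              # treat this disk as the first BIOS disk at boot time

image=/boot2/vmlinuz
    label=linux
    root=/dev/md0
    read-only

# line added to the shutdown script
lilo -R normal
```

The two lilo.conf sections would live in separate files, each installed with its own
run of lilo. Because the command line stored by "lilo -R" is erased once it has been
used, a reboot caused by the panic timeout falls back to the default "disk2" entry.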
Cheers,
Bruno Prior [EMAIL PROTECTED]