Hello,

I have just joined this list, but have read through as many of the archives
as I could about booting from a raid1 /boot partition. From what I have read
someone has done it using "grub" (I don't know it), but not via lilo. If what
I discuss below has already been done/discussed or is just plain wrong (and
I got lucky), please accept my apologies.

I wanted to set up a pair of mirrored disks on my machine, and to be able to
boot from either of them in the event one failed. I have read the software
RAID mini-howto, and took note of the advice about needing a non-raid /boot
partition on the bootable disk. It seemed if I wanted to boot from either
disk I would need to ensure that both disks had identical /boot partitions,
and that lilo was run twice, once writing to the bootable disk and once
writing to the non-bootable disk.

This is fine, but I *know* that somewhere along the line I would forget to
update the /boot on the second disk when I built a new kernel, did
maintenance, etc. I really wanted to have /boot as a raid1 partition. I did
find one solution, but it involves hacking lilo.

I set up a raid1 disk pair on my machine, and set all partitions on it
(except swap), including /boot to be mirrored. I copied an entire linux
system onto the disk pairs, so in the end the mount table looks something
like this:

/dev/md3 on / type ext2 (rw)
/dev/md2 on /tmp type ext2 (rw)
/dev/md0 on /boot type ext2 (rw)
/dev/md5 on /home type ext2 (rw)
/dev/md4 on /usr type ext2 (rw)
/dev/md6 on /var type ext2 (rw)

/proc/mdstat looks like:

md0 : active raid1 hdf1[1] hde1[0] 64128 blocks [2/2] [UU]
md1 : active raid1 hdf2[1] hde2[0] 264960 blocks [2/2] [UU]
md2 : active raid1 hdf6[1] hde6[0] 264960 blocks [2/2] [UU]
md3 : active raid1 hdf7[1] hde7[0] 521984 blocks [2/2] [UU]
md4 : active raid1 hdf8[1] hde8[0] 2096384 blocks [2/2] [UU]
md5 : active raid1 hdf9[1] hde9[0] 2096384 blocks [2/2] [UU]
md6 : active raid1 hdf10[1] hde10[0] 2096384 blocks [2/2] [UU]

Note that /boot is on /dev/md0 = hde1+hdf1. If I run the stock-standard lilo,
it comes back with the (understandable) error message:

Sorry, don't know how to handle device 0x0900

So far, so bad. lilo thinks that the boot map is on /dev/md0, and doesn't
know how to handle it. As far as I can tell though, if lilo used /dev/hde1
or /dev/hdf1 for the boot map, it would understand the disk geometry and be
able to put the location of the boot map into the MBR.
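
For anyone decoding these numbers: the hex value lilo prints is just the
major/minor device number pair packed together, so you can check what it is
complaining about straight from /dev (a quick sketch; the device names are
from my box):

# 0x0900 = major 9, minor 0  -> /dev/md0  (the raid1 device lilo rejects)
# 0x2101 = major 33, minor 1 -> /dev/hde1 (a plain partition it understands)
ls -l /dev/md0 /dev/hde1     # major/minor shows in the "9, 0" / "33, 1" columns
printf '%02x%02x\n' 9 0      # -> 0900
printf '%02x%02x\n' 33 1     # -> 2101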

I then hacked the lilo source that comes with RH6.0 (version 21) and added a
"mapdevice" option. This is meant to tell lilo "ignore whatever device the
boot map is currently on, and use device xxyy". I then made two lilo.confs
(call them lilo.conf.hde and lilo.conf.hdf). They look like this (some stuff
in the append is specific to my system; see the note after the configs for
where the 2101 value comes from):

[lilo.conf.hde]
boot=/dev/hde
map=/boot/map
mapdevice=2101
install=/boot/boot.b
prompt
default=linux
image=/boot/vmlinuz-2.2.5-22
        label=linux
        root=/dev/md3
        read-only
        append="ide2=0xd800,0xdc02,11 hde=1123,255,63 hdf=1123,255,63"

[lilo.conf.hdf]
boot=/dev/hdf
map=/boot/map
mapdevice=2101  # <- /dev/hde1 major/minor device number. Don't use /dev/hdf1 here!
install=/boot/boot.b
prompt
default=linux
image=/boot/vmlinuz-2.2.5-22
        label=linux
        root=/dev/md3
        read-only
        append="ide2=0xd800,0xdc02,11 hde=1123,255,63 hdf=1123,255,63"

If I now run "lilo -C /etc/lilo.conf.hde" and "lilo -C /etc/lilo.conf.hdf",
lilo seems to be happy, and I can boot off either disk (ie I can unplug
either one of the two drives and still boot). In my case my slave drive - a
Quantum Fireball KA - seems "smart" enough to know the master is gone, but
is still bootable. I do not think this is the case for all IDE drives
though; I imagine in the worst case you would have to change the
master/slave jumper if you lost the master. I now only need to run lilo
twice (which can be done automatically via an alias or shell
script/function - see the sketch below), and do not need to worry about
keeping two /boot partitions in sync.
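
Something along these lines would do for the wrapper (a minimal sketch; it
assumes the two config files above live in /etc):

#!/bin/sh
# update the boot sector on both mirror members so either disk stays bootable
for conf in /etc/lilo.conf.hde /etc/lilo.conf.hdf; do
        /sbin/lilo -C $conf || exit 1
done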

To me, this seems to have accomplished my objectives, and I cannot see any
drawbacks on the surface, although I am sure there must be some somewhere
(it seemed too easy). Obviously it will only apply to raid1 disk pairs, and
using it on a raid5 set will probably give less than satisfactory results
(ie an unbootable system, I imagine !).

What do others think ? The lilo patches are not meant to be permanent (they
really are a hack, but are very small). If this technique seems to be
warranted I can clean them up and submit them to the lilo maintainer for
consideration.

Let me know what you think. The usual disclaimer about using at your own
risk, and the possibility of your disk drives spontaneously combusting etc.
etc. apply. Do not use on any "real" systems !

Andrew Speer
[EMAIL PROTECTED]

PS: attentive people may notice I have my mirrored disks on an external
controller, but as a master/slave pair on the same IDE channel. This goes
against the RAID performance recommendations; however, whenever I set it up
as two masters on the primary and secondary IDE channels I got *worse*
performance than in this configuration. The current config takes 3.10 for a
kernel compile; the "recommended" config takes 5.30.

I have no real idea why, but would guess it is because I am using an
outboard controller (an HPT-366, no native linux driver yet), and if I have
to use two channels I am still sharing only one interrupt. That is my guess
anyway, but I am not too concerned, as the performance is still much better
than on my old box.

lilo.raid1.diff
