Re: linux software RAID with hot-swap hardware
Russell Coker wrote:

> > raid-extra-boot=/dev/sda,/dev/sdb
> >
> > According to the documentation of lilo, this shouldn't be necessary, but
> > apparently either the functionality or the docs are buggy. Without that
> > line I couldn't boot at all from the second disk, the way I've tested it
> > also works with install-mbr /dev/md1
>
> Why would you want to use install-mbr on a RAID device? I use install-mbr
> for the MBR on the hard drive (/dev/sda and /dev/sdb in this case) and then
> have it load the real boot block from /dev/md1 which was created by LILO.

In my experience a lilo.conf such as this works:

    map=/boot/map
    install=/boot/boot.b
    boot=/dev/md1
    timeout=50
    prompt
    default=Linux
    image=/boot/vmlinuz-2.4.18-19.7.x
            label=Linux
            root=/dev/md4
            initrd=/boot/initrd-2.4.18-19.7.x.img
            read-only

Simply run lilo thereafter; you will see that it automatically installs boot sectors in both drives, like this:

    boot = /dev/hda, map = /boot/map.0301
    Added Linux *
    boot = /dev/hdc, map = /boot/map.1601
    Added Linux *

The above is from a Redhat 7.3 system but this worked back in Redhat 6.2, and I have also done this with a sid system within the past month. In the above case /dev/md1 is a 24MB partition mounted as /boot, consisting of /dev/hda1 and /dev/hdc1. Since lilo is supposed to get past the 1024 cylinder limit these days that is probably not relevant.

In testing I was able to boot with hda only (that one's obvious), and I was able to boot with hdc only (both directly using the BIOS and using lilo installed on a floppy). I'm 90% sure that I physically moved hdc so that it became hda and was able to boot perfectly, but it was getting late and I lost track of what I was testing.

All partition types are set to RAID Autodetect (FD); I believe that may be important.

Fraser

-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]
Re: linux software RAID with hot-swap hardware
On Sat, 25 Jan 2003 14:15, Fraser Campbell wrote:

> > > the way I've tested it also works with install-mbr /dev/md1
> >
> > Why would you want to use install-mbr on a RAID device? I use install-mbr
> > for the MBR on the hard drive (/dev/sda and /dev/sdb in this case) and
> > then have it load the real boot block from /dev/md1 which was created by
> > LILO.
>
> In my experience a lilo.conf such as this works:
>
>     boot=/dev/md1
>
> Simply run lilo thereafter, you will see that it automatically installs
> boot sectors in both drives. Like this:
>
>     boot = /dev/hda, map = /boot/map.0301
>     Added Linux *
>     boot = /dev/hdc, map = /boot/map.1601
>     Added Linux *

That looks like an old version of LILO. The latest LILO in Debian is 22.3.3 and doesn't work like that.

> The above is from a Redhat 7.3 system but this worked back in Redhat 6.2, I
> have also done this with a sid system within the past month.

Strange, it doesn't work like that for me on my sid systems, and I haven't made a new release of LILO since September last year.

> In the above case /dev/md1 is a 24MB partition mounted as /boot, consisting
> of /dev/hda1 and /dev/hdc1. Since lilo is supposed to get past the 1024
> cylinder limit these days that is probably not relevant.

Correct, and on modern LBA drives 1024 cylinders should be large enough for the entire root file system anyway.

> In testing I was able to boot with hda only (that one's obvious), I was
> able to boot with hdc only (both directly using the BIOS and using lilo
> installed on a floppy).

You've got a good BIOS then. The only time I've had that work for me was when using SCSI on a machine with hot-swap drives (it cost enough that you expect such things to work). For desktop machines I buy mostly what's cheapest.

> I'm 90% sure that I physically moved hdc so that it became hda and was able
> to boot perfectly but it was getting late and I lost track of what I was
> testing.

That used to tend not to work on the older versions of LILO with a default setup.
With the re-write of the boot code the default seems to work well for that (but may have problems with the drive as /dev/hdc).

> All partition types are set to RAID Autodetect (FD), I believe that may be
> important.

Of course. The root file system on RAID won't work unless you have the partitions labeled as 0xfd or have an initrd that's programmed for it.

This is an interesting discussion, but you didn't answer my question of why anyone would want to run install-mbr on a RAID device.

-- 
http://www.coker.com.au/selinux/ My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/ Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/ My home page
linux software RAID with hot-swap hardware
I've written a document on using Linux software RAID with hot-swap SCSI hardware. It's slightly specific to the hardware I use (I wrote it for internal use) but can easily be adapted to be more generic. If someone wants to add it to a HOWTO or something then be my guest; please give me appropriate credit.

-- 
http://www.coker.com.au/selinux/ My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/ Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/ My home page

Title: Linux Software RAID and hot-swap SCSI

Basics of Linux Software RAID

The status of a running software RAID in Linux can be obtained from /proc/mdstat; here's a sample:

    md1 : active raid1 sdb1[1] sda1[0]
          136 blocks [2/2] [UU]

This is for a software RAID (Meta-Disk) named /dev/md1 which is comprised of the /dev/sda1 and /dev/sdb1 devices in a RAID-1 (mirroring) setup. The Dell 1U server machines will have all their disks in software RAID-1 arrays. When you have a disk installed and recognised by Linux you can add partitions to degraded RAID arrays at any time with the raidhotadd command. Here is the data you see in /proc/mdstat for a degraded RAID array:

    md1 : active raid1 sdb1[1](F) sda1[0]
          136 blocks [2/1] [U_]

Device /dev/sdb1 has failed (to generate this error I unplugged the disk /dev/sdb). Now I have just swapped the hard drive (see below) and want to put the new drive back in the array. Firstly I must remove the record for the disk in the failed (F) state with the command raidhotremove /dev/md1 /dev/sdb1, which gives the following state in /proc/mdstat:

    md1 : active raid1 sda1[0]
          136 blocks [2/1] [U_]

Once this has been done for every partition that was in a RAID array the drive will be regarded as unused by Linux, which allows it to be repartitioned or unregistered (see below for details of hardware recognition).
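The degraded state above can also be spotted mechanically. The sketch below (not part of the original document) scans a /proc/mdstat snapshot for arrays whose [total/active] counters show a missing disk; a captured sample is hard-coded for illustration, where on a live system you would read /proc/mdstat itself:

```shell
#!/bin/sh
# Sketch: find md arrays whose [total/active] counters show a missing disk.
# The mdstat text is a hard-coded sample; use cat /proc/mdstat on a real box.
mdstat='md1 : active raid1 sdb1[1] sda1[0]
      136 blocks [2/2] [UU]
md2 : active raid1 sda2[0]
      136 blocks [2/1] [U_]'

degraded=$(echo "$mdstat" | awk '
/^md/ { dev = $1 }                    # remember the current array name
/\[[0-9]+\/[0-9]+\]/ {                # line carrying the [total/active] pair
    match($0, /\[[0-9]+\/[0-9]+\]/)
    split(substr($0, RSTART + 1, RLENGTH - 2), c, "/")
    if (c[2] + 0 < c[1] + 0)
        print dev " is degraded (" c[2] " of " c[1] " disks)"
}')
echo "$degraded"
```

Run as a cron job, a non-empty result would be the cue to start the raidhotremove/raidhotadd procedure described here.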
If you want to instruct the software RAID driver to stop using a partition on a functional disk then you use the raidsetfaulty command, e.g. raidsetfaulty /dev/md1 /dev/sdb1, to set the partition to the failed state so that you can then use raidhotremove to remove it.

When you have a new partition you want to add to a RAID set you can use the command raidhotadd ARRAY DEVICE, e.g. raidhotadd /dev/md1 /dev/sdb1, which results in the following data in /proc/mdstat:

    md1 : active raid1 sdb1[2] sda1[0]
          136 blocks [2/1] [U_]
          [===>.] recovery = 37.8% (755976/136) finish=0.4min speed=48732K/sec

Note that when a device name is followed by [2] it's in a reconstruction state. When running raidhotadd commands there is no need to wait for one command to finish before running the next; the kernel maintains a queue of devices to reconstruct. You can schedule several RAID partitions to reconstruct and then go for a coffee break (or a lunch break, depending on the speed of the drives).

Hardware recognition

The drives are hot-swap, so you can unplug one disk at any time. However you must inform Linux that you have removed a disk before you can have the new disk recognised. Before you can do this you have to make sure that the disk is no longer recognised as "in-use" by Linux; see the above section for information on the raidsetfaulty and raidhotremove commands. Once the disk is no longer in use and it is unplugged (the two operations of making it unused and unplugging it can proceed in either order) you can inform the SCSI driver of the removal with the command scsiadd -r ID. ID is the identity of the disk you want to remove, which is determined by which bay the drive is in. The bays are numbered 0, 1, and 2 from left to right (and the numbers are printed on top of the case - where you can't see them when it's mounted).
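Putting the steps above in order, the whole swap for one failing disk can be scripted. This is only a sketch of the sequence the document describes: RUN=echo makes it a dry run that prints each command rather than executing it, and the array, partition, and bay values are example names, not anything mandated by the hardware.

```shell
#!/bin/sh
# Dry-run sketch of the hot-swap sequence for /dev/sdb in SCSI bay 1.
# RUN=echo prints each command; set RUN= (empty) to run the real
# raidtools/scsiadd commands.  Device and bay numbers are examples.
RUN=echo
ARRAY=/dev/md1
PART=/dev/sdb1
BAY=1

$RUN raidsetfaulty $ARRAY $PART   # mark the partition as failed (F)
$RUN raidhotremove $ARRAY $PART   # remove the failed record from the array
$RUN scsiadd -r $BAY              # tell the SCSI layer the disk is gone
# ...physically swap the drive here, then give it time to settle...
$RUN scsiadd -a $BAY              # register the replacement disk
$RUN raidhotadd $ARRAY $PART      # start reconstruction onto the new disk
```

Repeat the raidsetfaulty/raidhotremove and raidhotadd steps for every partition on the disk that belongs to an array; as noted above, the raidhotadd calls can be queued back to back.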
So if you want to swap the second disk, which Linux usually (but not always) knows as sdb, then you use the command scsiadd -r 1 to inform the Linux SCSI driver that the disk is removed. Then you can insert the new drive (or re-insert the drive you just removed) and use the command scsiadd -a ID (in this example, scsiadd -a 1) to make Linux recognise the new disk. After that you are free to partition the disk and add it to software RAID ready for use.

If you see the error message "parity error detected" then after waiting for a minute or two (and seeing many other error messages) the error should be corrected and the drive should be recognised. It is best to leave the drive physically in place for some time before running the scsiadd -a command to reduce the risk of this error. Sometimes this error can only be solved with a hardware reset...

Booting

To make a RAID-1 device bootable you first have to use fdisk to set the bootable flag on both the partitions for the root file system (if one disk is removed you want the other disk to be bootable). Then you have to configure LILO with the root=/dev/md1 and boot=/dev/md1 lines to configure the root file system as the boot device (NB if you use a RAID device other than /dev/md1 for the root file system then adjust the LILO configuration accordingly). The LILO configuration is in /etc/lilo.conf; to apply the changes run the lilo command with no parameters. Finally you have to use the install-mbr command to set up a boot block that the BIOS can run to load the LILO block, using the following commands:

    install-mbr /dev/sda
    install-mbr /dev/sdb

This installs the Debian MBR on both hard drives so that whichever drive is removed the other has a boot loader that can then load LILO to boot Linux.
Re: linux software RAID with hot-swap hardware
Russell Coker [EMAIL PROTECTED] writes:

Hi,

> I've written a document on using Linux software RAID with hot-swap SCSI
> hardware.

Nice doc, just a little comment about booting:

> Booting
>
> To make a RAID-1 device bootable you first have to use fdisk to set the
> bootable flag on both the partitions for the root file system (if one disk
> is removed you want the other disk to be bootable).

...make this in bold, I was bitten by it :) Also, and this is probably specific to my hardware (a Compaq ML530, which has 2 SCSI hosts), to boot from the right-side disks you have to tell the BIOS to use the second SCSI host. Quite annoying.

> Then you have to configure LILO with the root=/dev/md1 and boot=/dev/md1
> lines to configure the root file system as the boot device (NB if you use a
> RAID device other than /dev/md1 for the root file system then adjust the
> LILO configuration accordingly). The LILO configuration is in
> /etc/lilo.conf; to apply the changes run the lilo command with no
> parameters. Finally you have to use the install-mbr command to set up a
> boot block that the BIOS can run to load the LILO block, using the
> following commands:
>
>     install-mbr /dev/sda
>     install-mbr /dev/sdb
>
> This installs the Debian MBR on both hard drives so that whichever drive is
> removed the other has a boot loader that can then load LILO to boot Linux.

Instead of install-mbr, I used the following line in lilo.conf:

    raid-extra-boot=/dev/sda,/dev/sdb

According to the documentation of lilo, this shouldn't be necessary, but apparently either the functionality or the docs are buggy.
Without that line I couldn't boot at all from the second disk, or from any disk that wasn't formatted during the initial installation process (note that after installing the first machine, I just replicated the others by swapping the disks around and having RAID regenerate the arrays).

Hope this is useful for you

Pf

-- 
Pierfrancesco Caci | ik5pvx | mailto:[EMAIL PROTECTED] - http://gusp.dyndns.org
Firenze - Italia | Office for the Complication of Otherwise Simple Affairs
Linux penny 2.4.20-ac2 #1 Fri Jan 17 18:10:25 CET 2003 i686 GNU/Linux
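Pulling the thread's pieces together, a lilo.conf along these lines combines the boot=/dev/md1 approach with the raid-extra-boot workaround. This is only a sketch: the kernel image, initrd, and /dev/md4 root device are taken from the examples in this thread and will differ on your system.

```
boot=/dev/md1
raid-extra-boot=/dev/sda,/dev/sdb
map=/boot/map
prompt
timeout=50
default=Linux
image=/boot/vmlinuz-2.4.18-19.7.x
        label=Linux
        root=/dev/md4
        initrd=/boot/initrd-2.4.18-19.7.x.img
        read-only
```

After editing, run lilo with no parameters to write the boot sectors; with raid-extra-boot set it should write to both underlying drives.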
Re: linux software RAID with hot-swap hardware
On Thu, 2003-01-23 at 14:42, Pierfrancesco Caci wrote:

> Instead of install-mbr, I used the following line in lilo.conf:
>
>     raid-extra-boot=/dev/sda,/dev/sdb
>
> According to the documentation of lilo, this shouldn't be necessary, but
> apparently either the functionality or the docs are buggy. Without that
> line I couldn't boot at all from the second disk,

The way I've tested it, it also works with install-mbr /dev/md1 and the 'raid-extra-boot=auto' option in /etc/lilo.conf. If you run lilo with the verbose option you can see that it actually writes to both disks in a RAID-1 array.

regards,

Andraz

-- 
Hobbes : Well, you still have afternoons and weekends
Calvin : That's when I watch TV.
Re: linux software RAID with hot-swap hardware
On Fri, 24 Jan 2003 01:25, Andraz Sraka wrote:

> On Thu, 2003-01-23 at 14:42, Pierfrancesco Caci wrote:
> > Instead of install-mbr, I used the following line in lilo.conf:
> >
> >     raid-extra-boot=/dev/sda,/dev/sdb
> >
> > According to the documentation of lilo, this shouldn't be necessary, but
> > apparently either the functionality or the docs are buggy. Without that
> > line I couldn't boot at all from the second disk,
>
> the way I've tested it also works with install-mbr /dev/md1

Why would you want to use install-mbr on a RAID device? I use install-mbr for the MBR on the hard drive (/dev/sda and /dev/sdb in this case) and then have it load the real boot block from /dev/md1 which was created by LILO.

Someone suggested that there's a lilo option to remove the need for this. Last time I did a lot of experimenting with LILO and RAID-1 (a while ago) install-mbr seemed to be the only thing that worked.

-- 
http://www.coker.com.au/selinux/ My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/ Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/ My home page
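The install-mbr arrangement described in this reply amounts to three commands. The sketch below is a dry run (RUN=echo only prints the commands; clear it to execute them), with the thread's /dev/sda and /dev/sdb assumed as the mirrored drives:

```shell
#!/bin/sh
# Dry-run sketch of the install-mbr + LILO-on-/dev/md1 approach from this
# thread.  RUN=echo prints the commands instead of running them.
RUN=echo

# Debian MBR on each physical drive, so whichever drive survives can boot
$RUN install-mbr /dev/sda
$RUN install-mbr /dev/sdb

# LILO writes the real boot block to the RAID-1 device (boot=/dev/md1 in
# /etc/lilo.conf); the MBR then chain-loads it from the active partition
$RUN lilo
```

The point of the split is redundancy: each drive carries its own MBR, while the LILO boot block lives on the mirrored /dev/md1 and is therefore present on both drives automatically.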