Re: linux software RAID with hot-swap hardware

2003-01-25 Thread Fraser Campbell
Russell Coker wrote:

   raid-extra-boot=/dev/sda,/dev/sdb
  
   According to the documentation of lilo, this shouldn't be necessary,
   but apparently either the functionality or the docs are buggy. Without
   that line I couldn't boot at all from the second disk,
 
  the way I've tested it also works with
 
  install-mbr /dev/md1

 Why would you want to use install-mbr on a RAID device?

 I use install-mbr for the MBR on the hard drive (/dev/sda and /dev/sdb in
 this case) and then have it load the real boot block from /dev/md1 which
 was created by LILO.

In my experience a lilo.conf such as this works:

map=/boot/map
install=/boot/boot.b
boot=/dev/md1
timeout=50
prompt
default=Linux

image=/boot/vmlinuz-2.4.18-19.7.x
label=Linux
root=/dev/md4
initrd=/boot/initrd-2.4.18-19.7.x.img
read-only

Simply run lilo thereafter; you will see that it automatically installs boot 
sectors on both drives, like this:

boot = /dev/hda, map = /boot/map.0301
Added Linux *
boot = /dev/hdc, map = /boot/map.1601
Added Linux *

The above is from a Red Hat 7.3 system, but this also worked back in Red Hat 
6.2, and I have done this with a sid system within the past month.

In the above case /dev/md1 is a 24MB partition mounted as /boot, consisting of 
/dev/hda1 and /dev/hdb1.  Since lilo is supposed to get past the 1024 
cylinder limit these days that is probably not relevant.

In testing I was able to boot with hda only (that one's obvious), I was able 
to boot with hdc only (both directly using BIOS and using lilo installed on 
floppy).

I'm 90% sure that I physically moved hdc so that it became hda and was able to 
boot perfectly but it was getting late and I lost track of what I was 
testing.

All partition types are set to RAID Autodetect (FD), I believe that may be 
important.

Fraser


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]




Re: linux software RAID with hot-swap hardware

2003-01-25 Thread Russell Coker
On Sat, 25 Jan 2003 14:15, Fraser Campbell wrote:
   the way I've tested it also works with
  
   install-mbr /dev/md1
 
  Why would you want to use install-mbr on a RAID device?
 
  I use install-mbr for the MBR on the hard drive (/dev/sda and /dev/sdb in
  this case) and then have it load the real boot block from /dev/md1 which
  was created by LILO.

 In my experience a lilo.conf such as this works:

 boot=/dev/md1

 Simply run lilo thereafter, you will see that it automatically installs
 boot sectors in both drives.  Like this:

 boot = /dev/hda, map = /boot/map.0301
 Added Linux *
 boot = /dev/hdc, map = /boot/map.1601
 Added Linux *

That looks like an old version of LILO.  The latest LILO in Debian is 22.3.3 
and doesn't work like that.

 The above is from a Redhat 7.3 system but this worked back in Redhat 6.2, I
 have also done this with a sid system within the past month.

Strange, it doesn't work like that for me on my SID systems, and I haven't 
made a new release of LILO since September last year.

 In the above case /dev/md1 is a 24MB partition mounted as /boot, consisting
 of /dev/hda1 and /dev/hdb1.  Since lilo is supposed to get past the 1024
 cylinder limit these days that is probably not relevant.

Correct, and on modern LBA drives 1024 cylinders should be large enough for 
the entire root file system anyway.

 In testing I was able to boot with hda only (that one's obvious), I was
 able to boot with hdc only (both directly using BIOS and using lilo
 installed on floppy).

You've got a good BIOS then.  The only time I've had that work for me was when 
using SCSI on a machine with hot-swap drives (it cost enough so you expect 
such things to work).  For desktop machines I buy mostly what's cheapest.

 I'm 90% sure that I physically moved hdc so that it became hda and was able
 to boot perfectly but it was getting late and I lost track of what I was
 testing.

That tended not to work on the older versions of LILO with a default 
setup.  With the re-write of the boot code the default seems to work well for 
that (but may have problems with the drive as /dev/hdc).

 All partition types are set to RAID Autodetect (FD), I believe that may be
 important.

Of course.  The root file system on RAID won't work unless you have the 
partitions labeled as 0xfd or have an initrd that's programmed for it.


This is an interesting discussion, but you didn't answer my question of why 
anyone would want to run install-mbr on a RAID device.

-- 
http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/  My home page






linux software RAID with hot-swap hardware

2003-01-23 Thread Russell Coker
I've written a document on using Linux software RAID with hot-swap SCSI 
hardware.

It's slightly specific to the hardware I use (I wrote it for internal use) but 
can easily be adapted to be more generic.

If someone wants to add it to a HOWTO or something then be my guest; please 
give me appropriate credit.


Title: Linux Software RAID and hot-swap SCSI


Basics of Linux Software RAID
The status of a running software RAID in Linux can be obtained from
/proc/mdstat; here's a sample:

md1 : active raid1 sdb1[1] sda1[0]
  136 blocks [2/2] [UU]

This is for a software RAID (Meta-Disk) device named /dev/md1, which is made
up of the /dev/sda1 and /dev/sdb1 devices in a RAID-1 (mirroring) setup.  The Dell 1U
server machines will have all their disks in software RAID-1 arrays.
When you have a disk installed and recognised by Linux you can then add
partitions to degraded RAID arrays at any time with the raidhotadd
command.  Here is the data you see in /proc/mdstat for a degraded RAID
array:

md1 : active raid1 sdb1[1](F) sda1[0]
  136 blocks [2/1] [U_]

Device /dev/sdb1 has failed (to generate this error I unplugged the disk
/dev/sdb).  Now I have just swapped the hard drive (see below) and want to put
the new drive back in the array.  Firstly I must remove the record for the disk
in the failed (F) state with the command
raidhotremove /dev/md1 /dev/sdb1, which gives the following state in
/proc/mdstat:

md1 : active raid1 sda1[0]
  136 blocks [2/1] [U_]

Once this has been done for every partition that was in a RAID array, the drive
will be regarded as unused by Linux, which allows it to be
repartitioned or unregistered (see below for details of hardware recognition).
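
Checking for degraded arrays can be scripted. The following is a minimal sketch, assuming the /proc/mdstat layout shown in the samples above (an array line starting with the md name, followed by a line ending in a field such as [UU] or [U_]). The function name degraded_arrays is made up for illustration and is not part of raidtools.

```shell
# Hypothetical helper (not part of raidtools): print the name of every md
# array whose status field, such as [U_], shows a missing member.
# Reads /proc/mdstat-formatted text on standard input.
degraded_arrays() {
    awk '
        # An array header line, e.g. "md1 : active raid1 sdb1[1](F) sda1[0]"
        /^md[0-9]+ :/ { name = $1; next }
        # The status line that follows ends in a field like [UU] or [U_]
        name != "" && $NF ~ /^\[[U_]+\]$/ {
            if ($NF ~ /_/) print name
            name = ""
        }
    '
}

# Typical use on a live system:
# degraded_arrays < /proc/mdstat
```

An array is listed only when its status field contains an underscore, so a healthy [UU] array produces no output.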

If you want to instruct the software RAID driver to stop using a partition on
a functional disk then you would use the raidsetfaulty command, for example
raidsetfaulty /dev/md1 /dev/sdb1, to set the partition in the failed state
so that you can then use raidhotremove to remove it.

When you have a new partition you want to add to a RAID set you can
use the command raidhotadd ARRAY DEVICE to add it, for example
raidhotadd /dev/md1 /dev/sdb1, which results in the following data in
/proc/mdstat:

md1 : active raid1 sdb1[2] sda1[0]
  136 blocks [2/1] [U_]
  [===>.]  recovery = 37.8% (755976/136) finish=0.4min speed=48732K/sec

Note that when a device name is followed by [2] then it's in a
reconstruction state.
When running raidhotadd commands there is no need to wait for one command to
finish before running the next; the kernel maintains a queue of devices to
reconstruct.  You can schedule several RAID partitions to reconstruct and then
go for a coffee break (or a lunch break depending on the speed of the drives).
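
Queueing several reconstructions at once might look like the sketch below. The helper name queue_rebuild, the DRY_RUN variable, and the /dev/md2 and /dev/sdb2 names are assumptions for illustration; only raidhotadd ARRAY DEVICE comes from the text above.

```shell
# Hypothetical helper: issue one raidhotadd per ARRAY:PARTITION pair and
# let the kernel queue the reconstructions.
# With DRY_RUN=1 the commands are only printed, so the list can be checked first.
queue_rebuild() {
    for pair in "$@"; do
        array=${pair%%:*}    # text before the first colon, e.g. /dev/md1
        part=${pair#*:}      # text after the first colon, e.g. /dev/sdb1
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "raidhotadd $array $part"
        else
            raidhotadd "$array" "$part"
        fi
    done
}

# Print the commands without touching any array
# (the md2/sdb2 pairing is an assumed partition layout):
DRY_RUN=1 queue_rebuild /dev/md1:/dev/sdb1 /dev/md2:/dev/sdb2
```

Progress of the queued reconstructions can then be followed in /proc/mdstat.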

Hardware recognition
The drives are hot-swap, so you can unplug one disk at any time.  However, you
must inform Linux that you have removed a disk before you can have the new
disk recognised.  Before you can do this you have to make sure that the disk
is no longer recognised as "in-use" by Linux; see the above section for
information on the raidsetfaulty and raidhotremove commands.

Once the disk is no longer in use and it is unplugged (the two operations of
making it unused and unplugging it can proceed in any order) then you can
inform the SCSI driver of the removal with the command scsiadd -r ID.
ID is the identity of the disk you want to remove, which is determined
by which bay the drive is in.  The bays are numbered 0, 1, and
2 from left to right (and the numbers are printed on top of the case -
where you can't see them when it's mounted).
So if you want to swap the second disk which Linux usually (but not always)
knows as sdb then you use the command scsiadd -r 1 to inform the
Linux SCSI driver that the disk is removed, then you can insert the new drive
(or re-insert the drive you just removed) and use the command
scsiadd -a ID (which in this example is scsiadd -a 1) to make
Linux recognise the new disk.  After that you are free to partition the
disk and add it to software RAID ready for use.
If you see the error message parity error detected then after waiting
for a minute or two (and seeing many other error messages) the error should
be corrected and the drive should be recognised.  It is best to leave the drive
physically in place for some time before running the scsiadd -a command
to reduce the risk of this error.  Sometimes this error can only be solved with
a hardware reset...
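
The whole swap can be written down as a checklist before touching the hardware. The sketch below only prints the commands described in this document in the order they would be run; it executes nothing. The helper name swap_plan and its ARRAY:PARTITION argument format are invented for illustration.

```shell
# Hypothetical helper: print (not run) the commands for swapping the disk
# with SCSI ID $1. Remaining arguments are ARRAY:PARTITION pairs, one per
# RAID member that lives on that disk.
swap_plan() {
    id=$1
    shift
    for pair in "$@"; do
        # raidsetfaulty is only needed while the member is still working;
        # a member already marked (F) can go straight to raidhotremove.
        echo "raidsetfaulty ${pair%%:*} ${pair#*:}"
        echo "raidhotremove ${pair%%:*} ${pair#*:}"
    done
    echo "scsiadd -r $id"
    echo "# swap the drive, leave it in place for a while, then:"
    echo "scsiadd -a $id"
    echo "# partition the new disk, then:"
    for pair in "$@"; do
        echo "raidhotadd ${pair%%:*} ${pair#*:}"
    done
}

# Plan for the second bay, assuming it holds the sdb1 half of /dev/md1:
swap_plan 1 /dev/md1:/dev/sdb1
```

Printing rather than executing keeps the sketch safe to try, and the output can be reviewed against which bay really holds which device, since the second disk is usually but not always sdb.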

Booting
To make a RAID-1 device bootable you first have to use fdisk to set the
bootable flag on both the partitions for the root file system (if one
disk is removed you want the other disk to be bootable).
Then you have to configure LILO with the 

Re: linux software RAID with hot-swap hardware

2003-01-23 Thread Pierfrancesco Caci
:- Russell == Russell Coker [EMAIL PROTECTED] writes:

Hi

 I've written a document on using Linux software RAID with hot-swap SCSI 
 hardware.

nice doc, just a little comment about booting:


 *Booting*

 To make a RAID-1 device bootable you first have to use fdisk to
 set the bootable flag on both the partitions for the root file
 system (if one disk is removed you want 
 the other disk to be bootable). 

...make this in bold, I was bitten by it :)
Also, and this is probably specific to my hardware (compaq ml530,
which has 2 scsi hosts), to boot from the right-side disks you have to
tell the bios to use the second scsi host. Quite annoying.

 Then you have to configure LILO with the root=/dev/md1 and
 boot=/dev/md1 lines to configure the root file system as the
 boot device (NB if you use a RAID device other than /dev/md1
 for the root file system then adjust the LILO configuration
 accordingly). The LILO configuration is in /etc/lilo.conf; to
 apply the changes run the lilo command with no parameters.

 Finally you have to use the install-mbr command to set up a boot
 block that the BIOS can run to load the LILO block, use the
 following commands: 

 install-mbr /dev/sda
 install-mbr /dev/sdb

 This installs the Debian MBR on both hard drives so that
 whichever drive is removed the other has a boot loader that can
 then load LILO to boot Linux. 


Instead of install-mbr, I used the following line in lilo.conf:

raid-extra-boot=/dev/sda,/dev/sdb

According to the documentation of lilo, this shouldn't be necessary,
but apparently either the functionality or the docs are buggy. Without
that line I couldn't boot at all from the second disk, or from any
disk that wasn't formatted during the initial installation process
(note that after installing the first machine, I just replicated the
others by swapping the disks around and having raid regenerate the
arrays)
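
A quick sanity check of such a configuration before running lilo could look like the sketch below. It assumes the setup described above (boot= pointing at an md device plus a raid-extra-boot line). The function name lilo_raid_check is hypothetical, and it only inspects the file; it does not prove that the machine will boot from either disk.

```shell
# Hypothetical check: succeed only if the given lilo.conf boots from an md
# device and also carries a raid-extra-boot line.
lilo_raid_check() {
    conf=$1
    grep -Eq '^[[:space:]]*boot[[:space:]]*=[[:space:]]*/dev/md[0-9]+' "$conf" || {
        echo "no boot=/dev/mdN line in $conf" >&2
        return 1
    }
    grep -Eq '^[[:space:]]*raid-extra-boot[[:space:]]*=' "$conf" || {
        echo "no raid-extra-boot line in $conf" >&2
        return 1
    }
}

# Typical use:
# lilo_raid_check /etc/lilo.conf && lilo
```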

Hope this is useful for you

Pf


-- 

---
 Pierfrancesco Caci | ik5pvx | mailto:[EMAIL PROTECTED]  -  http://gusp.dyndns.org
  Firenze - Italia  | Office for the Complication of Otherwise Simple Affairs 
 Linux penny 2.4.20-ac2 #1 Fri Jan 17 18:10:25 CET 2003 i686 GNU/Linux






Re: linux software RAID with hot-swap hardware

2003-01-23 Thread Andraz Sraka
re

On Thu, 2003-01-23 at 14:42, Pierfrancesco Caci wrote:

 Instead of install-mbr, I used the following line in lilo.conf:
 
 raid-extra-boot=/dev/sda,/dev/sdb
 
 According to the documentation of lilo, this shouldn't be necessary,
 but apparently either the functionality or the docs are buggy. Without
 that line I couldn't boot at all from the second disk,

the way I've tested it also works with 

install-mbr /dev/md1

and the 'raid-extra-boot=auto' option in /etc/lilo.conf. If you run lilo
with the verbose option you can see that it actually writes to both disks
in the RAID-1 array.

regards,
 Andraz

-- 
   Hobbes : Well, you still have afternoons and weekends 
   Calvin : That's when I watch TV.





Re: linux software RAID with hot-swap hardware

2003-01-23 Thread Russell Coker
On Fri, 24 Jan 2003 01:25, Andraz Sraka wrote:
 On Thu, 2003-01-23 at 14:42, Pierfrancesco Caci wrote:
  Instead of install-mbr, I used the following line in lilo.conf:
 
  raid-extra-boot=/dev/sda,/dev/sdb
 
  According to the documentation of lilo, this shouldn't be necessary,
  but apparently either the functionality or the docs are buggy. Without
  that line I couldn't boot at all from the second disk,

 the way I've tested it also works with

 install-mbr /dev/md1

Why would you want to use install-mbr on a RAID device?

I use install-mbr for the MBR on the hard drive (/dev/sda and /dev/sdb in this 
case) and then have it load the real boot block from /dev/md1 which was 
created by LILO.

Someone suggested that there's a lilo option to remove the need for this.  
Last time I did a lot of experimenting with LILO and RAID-1 (a while ago) 
install-mbr seemed to be the only thing that worked.
