Hey folks, I'm a software-RAID newbie, so go easy. :-)

I'm testing a very minimal 2-disk RAID-1 to get the hang of the failure
recovery mechanisms, and I've run across a problem.

/etc/raidtab
------------
raiddev /dev/md0
raid-level              1
persistent-superblock   1
chunk-size              16

nr-raid-disks           2
nr-spare-disks          0

device                  /dev/sda1
raid-disk               0
device                  /dev/sdb1
raid-disk               1
------------

The system copes okay if I physically remove one of the drives, or use the
raidsetfaulty tool to simulate a failure.

I can raidhotremove the (simulated-)faulty disk, and then physically remove
it. Next, I put the disk back in physically. I then want to run raidhotadd to
add the disk back into the array and begin reconstruction.
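
As a rough sketch, the sequence above could be scripted as follows. It
assumes /dev/sdb1 is the member being swapped (chosen only for
illustration) and /dev/md0 is the array from the raidtab. The helper names
run and replace_disk and the DRY_RUN switch are hypothetical; by default
the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the fail / remove / re-add sequence with raidtools 0.90.
# Assumptions: /dev/md0 is the array, /dev/sdb1 is the disk being replaced.
MD=${MD:-/dev/md0}
DISK=${DISK:-/dev/sdb1}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, anything else = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

replace_disk() {
    run raidsetfaulty "$MD" "$DISK"    # simulate the failure
    run raidhotremove "$MD" "$DISK"    # drop the faulty member from the array
    if [ "$DRY_RUN" != "1" ]; then
        # Pause for the physical swap before re-adding the disk.
        printf 'Swap the drive, then press Enter: '
        read _answer
    fi
    run raidhotadd "$MD" "$DISK"       # re-add the disk; reconstruction should start
    run cat /proc/mdstat               # check array state and resync progress
}
```

Calling replace_disk with the defaults just lists the four commands in
order. Setting DRY_RUN=0 would run them for real (as root), which is the
step that triggers the lockup described below, so that is only sensible on
a test box.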

Problem is, when I run raidhotadd, the system locks up solid. I've tried
giving it time to come back to life, but nothing happens even after
several minutes, and the system is so dead that the software watchdog is
also toast.

kernel: 2.2.16pre2 SMP
raid:   mingo's raid-2.2.15-A0
tools:  raidtools-19990824-0.90

Is this a known problem? Am I using the right procedure to replace a faulty
disk? Would a raidstop/raidstart work? Isn't there a way to replace a drive
without taking the array down? The HOWTO is not very detailed in this area
of reconstruction. It makes it sound like this should all be a no-brainer.

Regards,
Ian Morgan

-------------------------------------------------------------------
 Ian E. Morgan                                  [EMAIL PROTECTED]
 Vice President & C.O.O.                            (613) 276-6206
 Webcon, Inc.                                http://www.webcon.net
-------------------------------------------------------------------
