Ingo:
I tested your newest raid-patch & tools.
I happily found the commands and config files mostly revamped, it's much
more straightforward now compared to the old patch & tools (0.41). I'm
also very pleased with the background raid5 construction.
Enough good news, now the bad:
Compiling the tools (after compiling & booting the kernel in
/usr/src/linux) gave a minor problem:
mkpv.c: BLKGETSIZE undefined.
After adding
#include <linux/fs.h>
all went well.
================
minor inconvenience with raidstop:
raidstart -a
works, but
raidstop -a
gives "nothing to do",
raidstop /dev/md0
works
================
no (automatic?) reconstruction of degraded array:
building the raid, mke2fs & mount went well.
I waited until the raid5-construct was complete, then did a little test:
umount
raidstop
badblocks -w /dev/hde1 (scrambling first raid5 disk with 0xaaaaaaaa)
raidstart /dev/md0 worked (after reordering /etc/raidtab), but the
kernel didn't want to reconstruct /dev/hde1 (/var/log/messages below).
The disk is marked as down in /proc/mdstat:
_UUU
then I did
mke2fs /dev/hde1
and raidstart just to see what happens. As expected, it didn't rebuild
either (/var/log/messages below).
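For anyone who wants to repeat the test, the sequence above can be sketched as a small script. This is only a rough outline: the mount point /mnt/raid and the DRY_RUN switch are assumptions added for illustration, and by default the script only prints the commands, because badblocks -w destroys the contents of the partition.

```shell
#!/bin/sh
# Dry-run sketch of the degraded-array test described above.
# MOUNTPOINT and DRY_RUN are assumptions for illustration; the commands
# themselves are the ones used in the test.

MD=/dev/md0
DISK=/dev/hde1          # first raid5 disk (raid-disk 0 in /etc/raidtab)
MOUNTPOINT=/mnt/raid    # hypothetical mount point
DRY_RUN=${DRY_RUN:-1}   # default: only print, badblocks -w is destructive

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop at the first failing step (each step is chained with &&).
degraded_test() {
    run umount "$MOUNTPOINT" &&
    run raidstop "$MD" &&
    run badblocks -w "$DISK" &&
    run raidstart "$MD" &&
    run cat /proc/mdstat
}

degraded_test
```

Running it with DRY_RUN=0 would execute the commands for real, so that should only be done on an array whose data may be lost.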
My question is: How can I convince the kernel to rebuild /dev/hde1?
Did I miss an additional command that accomplishes this (a new mdop)?
Should I reboot the system?
The problem is that I want to be absolutely sure that I can restore an
array to a non-degraded state.
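To check whether an array is back in a non-degraded state, the status string from /proc/mdstat (such as the _UUU above) could be decoded with something like the following. It is a minimal sketch: down_slots is a hypothetical helper name, and it assumes that the position in the string corresponds to the raid-disk number, which matches what I see here (raid-disk 0 is the one marked down).

```shell
#!/bin/sh
# Sketch: list the slots marked down ("_") in a /proc/mdstat status
# string such as "_UUU". down_slots is a hypothetical helper name.
# Assumption: position in the string == raid-disk number.

down_slots() {
    status=$1
    i=0
    out=""
    while [ -n "$status" ]; do
        rest=${status#?}           # everything after the first character
        first=${status%"$rest"}    # the first character itself
        if [ "$first" = "_" ]; then
            out="$out${out:+ }$i"  # append slot number, space-separated
        fi
        status=$rest
        i=$((i + 1))
    done
    echo "$out"
}

down_slots "_UUU"    # prints: 0
```

An empty result would mean that no slot is marked down, i.e. the array is not degraded.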
some system information & the logs:
================
system: suse 5.2
compiler: gcc 2.7.2.1
kernel: 2.0.36 (stock)
kernel-patch: raid0145-19981215-2_0_36
raidtools: 0.90 alpha 19981214
hardware: P133, promise ultra33, 4 ibm16.8gb drives
================
/etc/raidtab
raiddev /dev/md0
raid-level 5
nr-raid-disks 4
nr-spare-disks 0
parity-algorithm left-symmetric
chunk-size 32
device /dev/hdf1
raid-disk 1
device /dev/hde1
raid-disk 0
device /dev/hdg1
raid-disk 2
device /dev/hdh1
raid-disk 3
================
/var/log/messages after badblocks -w /dev/hde1 & raidstart /dev/md0
Dec 16 19:20:26 fileserver kernel: (read) hde1's sb offset: 16510912
[events: aaaaaaaa]
Dec 16 19:20:26 fileserver kernel: md: invalid raid superblock magic on
hde1
Dec 16 19:20:26 fileserver kernel: md: hde1 has invalid sb, not
importing!
Dec 16 19:20:26 fileserver kernel: could not import hde1!
Dec 16 19:20:26 fileserver kernel: autostart hde1 failed!
Dec 16 19:20:26 fileserver kernel: huh12?
Dec 16 19:21:43 fileserver kernel: md: can not import hde1, has active
inodes!
Dec 16 19:21:43 fileserver kernel: could not import hde1!
Dec 16 19:21:43 fileserver kernel: autostart hde1 failed!
Dec 16 19:21:43 fileserver kernel: huh12?
Dec 16 19:21:45 fileserver kernel: md: can not import hde1, has active
inodes!
Dec 16 19:21:45 fileserver kernel: could not import hde1!
Dec 16 19:21:45 fileserver kernel: autostart hde1 failed!
Dec 16 19:21:45 fileserver kernel: huh12?
Dec 16 19:21:46 fileserver kernel: md: can not import hde1, has active
inodes!
Dec 16 19:21:46 fileserver kernel: could not import hde1!
Dec 16 19:21:46 fileserver kernel: autostart hde1 failed!
Dec 16 19:21:46 fileserver kernel: huh12?
Dec 16 19:21:46 fileserver kernel: md: can not import hde1, has active
inodes!
Dec 16 19:21:46 fileserver kernel: could not import hde1!
Dec 16 19:21:46 fileserver kernel: autostart hde1 failed!
Dec 16 19:21:46 fileserver kernel: huh12?
Dec 16 19:24:00 fileserver kernel: (read) hdf1's sb offset: 16510912
[events: 00000004]
Dec 16 19:24:00 fileserver kernel: md: can not import hde1, has active
inodes!
Dec 16 19:24:00 fileserver kernel: could not import hde1, trying to run
array nevertheless.
Dec 16 19:24:00 fileserver kernel: (read) hdg1's sb offset: 16510912
[events: 00000004]
Dec 16 19:24:00 fileserver kernel: (read) hdh1's sb offset: 16510912
[events: 00000004]
Dec 16 19:24:00 fileserver kernel: autorun ...
Dec 16 19:24:00 fileserver kernel: considering hdh1 ...
Dec 16 19:24:00 fileserver kernel: adding hdh1 ...
Dec 16 19:24:00 fileserver kernel: adding hdg1 ...
Dec 16 19:24:00 fileserver kernel: adding hdf1 ...
Dec 16 19:24:00 fileserver kernel: created md0
Dec 16 19:24:00 fileserver kernel: bind<hdf1,1>
Dec 16 19:24:00 fileserver kernel: bind<hdg1,2>
Dec 16 19:24:00 fileserver kernel: bind<hdh1,3>
Dec 16 19:24:00 fileserver kernel: running: <hdh1><hdg1><hdf1>
Dec 16 19:24:00 fileserver kernel: now!
Dec 16 19:24:00 fileserver kernel: hdh1's event counter: 00000004
Dec 16 19:24:00 fileserver kernel: hdg1's event counter: 00000004
Dec 16 19:24:00 fileserver kernel: hdf1's event counter: 00000004
Dec 16 19:24:00 fileserver kernel: md0: former device hde1 is
unavailable, removing from array!
Dec 16 19:24:00 fileserver kernel: md0: max total readahead window set
to 384k
Dec 16 19:24:00 fileserver kernel: md0: 3 data-disks, max readahead per
data-disk: 128k
Dec 16 19:24:00 fileserver kernel: raid5: device hdh1 operational as
raid disk 3
Dec 16 19:24:00 fileserver kernel: raid5: device hdg1 operational as
raid disk 2
Dec 16 19:24:00 fileserver kernel: raid5: device hdf1 operational as
raid disk 1
Dec 16 19:24:00 fileserver kernel: raid5: md0, not all disks are
operational -- trying to recover array
Dec 16 19:24:00 fileserver kernel: raid5: allocated 4254kB for md0
Dec 16 19:24:00 fileserver kernel: raid5: raid level 5 set md0 active
with 3 out of 4 devices, algorithm 2
Dec 16 19:24:00 fileserver kernel: RAID5 conf printout:
Dec 16 19:24:00 fileserver kernel: --- rd:4 wd:3 fd:1
Dec 16 19:24:00 fileserver kernel: disk 0, s:0, o:0, n:0 rd:0 us:1
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 1, s:0, o:1, n:1 rd:1 us:1
dev:hdf1
Dec 16 19:24:00 fileserver kernel: disk 2, s:0, o:1, n:2 rd:2 us:1
dev:hdg1
Dec 16 19:24:00 fileserver kernel: disk 3, s:0, o:1, n:3 rd:3 us:1
dev:hdh1
Dec 16 19:24:00 fileserver kernel: disk 4, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 5, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 6, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 7, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 8, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 9, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 10, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 11, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: RAID5 conf printout:
Dec 16 19:24:00 fileserver kernel: --- rd:4 wd:3 fd:1
Dec 16 19:24:00 fileserver kernel: disk 0, s:0, o:0, n:0 rd:0 us:1
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 1, s:0, o:1, n:1 rd:1 us:1
dev:hdf1
Dec 16 19:24:00 fileserver kernel: disk 2, s:0, o:1, n:2 rd:2 us:1
dev:hdg1
Dec 16 19:24:00 fileserver kernel: disk 3, s:0, o:1, n:3 rd:3 us:1
dev:hdh1
Dec 16 19:24:00 fileserver kernel: disk 4, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 5, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 6, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 7, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 8, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 9, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 10, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: disk 11, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 19:24:00 fileserver kernel: md: updating md0 RAID superblock on
device
Dec 16 19:24:00 fileserver kernel: hdh1 [events: 00000005](write) hdh1's
sb offset: 16510912
Dec 16 19:24:00 fileserver kernel: md: recovery thread got woken up ...
Dec 16 19:24:00 fileserver kernel: md0: no spare disk to reconstruct
array! -- continuing in degraded mode
Dec 16 19:24:00 fileserver kernel: md: recovery thread finished ...
Dec 16 19:24:00 fileserver kernel: hdg1 [events: 00000005](write) hdg1's
sb offset: 16510912
Dec 16 19:24:00 fileserver kernel: hdf1 [events: 00000005](write) hdf1's
sb offset: 16510912
Dec 16 19:24:00 fileserver kernel: .
Dec 16 19:24:00 fileserver kernel: ... autorun DONE.
Dec 16 20:14:57 fileserver kernel: interrupting MD-thread pid 695
Dec 16 20:14:57 fileserver kernel: raid5d(695) flushing signals.
Dec 16 20:14:57 fileserver kernel: marking sb clean...
Dec 16 20:14:57 fileserver kernel: md: updating md0 RAID superblock on
device
Dec 16 20:14:57 fileserver kernel: hdh1 [events: 00000006](write) hdh1's
sb offset: 16510912
Dec 16 20:14:57 fileserver kernel: hdg1 [events: 00000006](write) hdg1's
sb offset: 16510912
Dec 16 20:14:57 fileserver kernel: hdf1 [events: 00000006](write) hdf1's
sb offset: 16510912
Dec 16 20:14:57 fileserver kernel: .
Dec 16 20:14:57 fileserver kernel: unbind<hdh1,2>
Dec 16 20:14:57 fileserver kernel: export_rdev(hdh1)
Dec 16 20:14:57 fileserver kernel: unbind<hdg1,1>
Dec 16 20:14:57 fileserver kernel: export_rdev(hdg1)
Dec 16 20:14:57 fileserver kernel: unbind<hdf1,0>
Dec 16 20:14:57 fileserver kernel: export_rdev(hdf1)
Dec 16 20:14:57 fileserver kernel: md0 stopped.
================
/var/log/messages after mke2fs /dev/hde1 & raidstart /dev/md0
Dec 16 20:20:12 fileserver kernel: (read) hdf1's sb offset: 16510912
[events: 00000006]
Dec 16 20:20:12 fileserver kernel: (read) hdg1's sb offset: 16510912
[events: 00000006]
Dec 16 20:20:12 fileserver kernel: (read) hdh1's sb offset: 16510912
[events: 00000006]
Dec 16 20:20:12 fileserver kernel: autorun ...
Dec 16 20:20:12 fileserver kernel: considering hdh1 ...
Dec 16 20:20:12 fileserver kernel: adding hdh1 ...
Dec 16 20:20:12 fileserver kernel: adding hdg1 ...
Dec 16 20:20:12 fileserver kernel: adding hdf1 ...
Dec 16 20:20:12 fileserver kernel: created md0
Dec 16 20:20:12 fileserver kernel: bind<hdf1,1>
Dec 16 20:20:12 fileserver kernel: bind<hdg1,2>
Dec 16 20:20:12 fileserver kernel: bind<hdh1,3>
Dec 16 20:20:12 fileserver kernel: running: <hdh1><hdg1><hdf1>
Dec 16 20:20:12 fileserver kernel: now!
Dec 16 20:20:12 fileserver kernel: hdh1's event counter: 00000006
Dec 16 20:20:12 fileserver kernel: hdg1's event counter: 00000006
Dec 16 20:20:12 fileserver kernel: hdf1's event counter: 00000006
Dec 16 20:20:12 fileserver kernel: md0: max total readahead window set
to 384k
Dec 16 20:20:12 fileserver kernel: md0: 3 data-disks, max readahead per
data-disk: 128k
Dec 16 20:20:12 fileserver kernel: raid5: device hdh1 operational as
raid disk 3
Dec 16 20:20:12 fileserver kernel: raid5: device hdg1 operational as
raid disk 2
Dec 16 20:20:12 fileserver kernel: raid5: device hdf1 operational as
raid disk 1
Dec 16 20:20:12 fileserver kernel: raid5: md0, not all disks are
operational -- trying to recover array
Dec 16 20:20:12 fileserver kernel: raid5: allocated 4254kB for md0
Dec 16 20:20:12 fileserver kernel: raid5: raid level 5 set md0 active
with 3 out of 4 devices, algorithm 2
Dec 16 20:20:12 fileserver kernel: RAID5 conf printout:
Dec 16 20:20:12 fileserver kernel: --- rd:4 wd:3 fd:1
Dec 16 20:20:12 fileserver kernel: disk 0, s:0, o:0, n:0 rd:0 us:1
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 1, s:0, o:1, n:1 rd:1 us:1
dev:hdf1
Dec 16 20:20:12 fileserver kernel: disk 2, s:0, o:1, n:2 rd:2 us:1
dev:hdg1
Dec 16 20:20:12 fileserver kernel: disk 3, s:0, o:1, n:3 rd:3 us:1
dev:hdh1
Dec 16 20:20:12 fileserver kernel: disk 4, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 5, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 6, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 7, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 8, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 9, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 10, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 11, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: RAID5 conf printout:
Dec 16 20:20:12 fileserver kernel: --- rd:4 wd:3 fd:1
Dec 16 20:20:12 fileserver kernel: disk 0, s:0, o:0, n:0 rd:0 us:1
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 1, s:0, o:1, n:1 rd:1 us:1
dev:hdf1
Dec 16 20:20:12 fileserver kernel: disk 2, s:0, o:1, n:2 rd:2 us:1
dev:hdg1
Dec 16 20:20:12 fileserver kernel: disk 3, s:0, o:1, n:3 rd:3 us:1
dev:hdh1
Dec 16 20:20:12 fileserver kernel: disk 4, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 5, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 6, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 7, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 8, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 9, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 10, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: disk 11, s:0, o:0, n:0 rd:0 us:0
dev:[dev 00:00]
Dec 16 20:20:12 fileserver kernel: md: updating md0 RAID superblock on
device
Dec 16 20:20:12 fileserver kernel: hdh1 [events: 00000007](write) hdh1's
sb offset: 16510912
Dec 16 20:20:12 fileserver kernel: md: recovery thread got woken up ...
Dec 16 20:20:12 fileserver kernel: md0: no spare disk to reconstruct
array! -- continuing in degraded mode
Dec 16 20:20:12 fileserver kernel: md: recovery thread finished ...
Dec 16 20:20:12 fileserver kernel: hdg1 [events: 00000007](write) hdg1's
sb offset: 16510912
Dec 16 20:20:12 fileserver kernel: hdf1 [events: 00000007](write) hdf1's
sb offset: 16510912
Dec 16 20:20:12 fileserver kernel: .
Dec 16 20:20:12 fileserver kernel: ... autorun DONE.
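To compare the superblock event counters without reading the whole log, the "(read)" lines can be filtered with sed. This is only a sketch: read_events is a hypothetical helper name, and it assumes each kernel message sits on one line, as it does in /var/log/messages itself (the lines above are wrapped by the mailer).

```shell
#!/bin/sh
# Sketch: print "<device> <events>" for every "(read) ... sb offset"
# kernel message. read_events is a hypothetical helper name.
# Assumption: messages are unwrapped, one per line.

read_events() {
    sed -n "s/.*(read) \([a-z0-9]*\)'s sb offset: [0-9]* \[events: \([0-9a-f]*\)].*/\1 \2/p" "$@"
}

# Two lines taken from the first log above, joined back onto one line each.
read_events <<'EOF'
Dec 16 19:20:26 fileserver kernel: (read) hde1's sb offset: 16510912 [events: aaaaaaaa]
Dec 16 19:24:00 fileserver kernel: (read) hdf1's sb offset: 16510912 [events: 00000004]
EOF
# prints:
# hde1 aaaaaaaa
# hdf1 00000004
```

The aaaaaaaa value on hde1 is just the badblocks test pattern, which is why the kernel rejects that superblock.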
---
Thilo Herrmann, Rival Network GmbH
Phone: +49 89 82087-0, email: [EMAIL PROTECTED]
---