Dear RAID users,
I have been using a RAID5 system for nearly six months without problems,
but recently the machine halted during a reboot (the message displayed
is attached below). I tried older, known-good kernels and some of them
booted, but the md device (/dev/md0) was still not visible. According to
the dmesg output (attached at the end of this mail), the kernel does
appear to try to recognize the md device.
I tried several older kernels, all of which had worked fine before, but
the situation did not change.
There is no /proc/mdstat, and "mkraid --upgrade /dev/md0" failed with
the following output:
> handling MD device /dev/md0
> analyzing super-block
> disk 0: /dev/sda1, 48846847kB, raid superblock at 48846720kB
> array needs no upgrade
> mkraid: aborted, see the syslog and /proc/mdstat for potential clues.
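The superblock position reported by mkraid can be cross-checked by hand. The sketch below assumes the 0.90 persistent-superblock layout, in which the superblock sits in the last 64 kB-aligned 64 kB block of the device. The name md_sb_offset_kb is made up for this illustration and is not part of raidtools.

```c
#include <stdint.h>

/* Size of the area reserved for the persistent superblock, in kB
 * (assumption: 0.90 superblock format, 64 kB reserved at the end). */
#define MD_RESERVED_KB 64u

/* Hypothetical helper: round the device size down to a 64 kB boundary,
 * then step back one reserved block to find the superblock offset.
 * Example call: md_sb_offset_kb(48846847) for /dev/sda1 above. */
static uint64_t md_sb_offset_kb(uint64_t dev_size_kb)
{
    return (dev_size_kb & ~(uint64_t)(MD_RESERVED_KB - 1)) - MD_RESERVED_KB;
}
```

For the 48846847 kB partition above this gives 48846720 kB, which matches both the mkraid output and the "sb offset" lines in the dmesg output, so the superblocks appear to be where the kernel expects them.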
Is it possible to recover the contents of the RAID? (There was about
200G of data on it.) Or do I have to "really force" mkraid?
Any suggestions or information would be welcome.
Other information (system, raidtab, etc.) is also attached below.
Best regards,
Takamura
Seishi Takamura, Dr.Eng.
NTT Cyber Space Laboratories
Y517A 1-1 Hikarino-Oka, Yokosuka, Kanagawa, 239-0847 Japan
Tel: +81 468 59 2371, Fax: +81 468 59 2829
Message (copied by hand from the display, so it may contain typing errors):
attempt to access beyond end of device
03:07: rw=0, want=2, limit=0
dev 03:07 blksize=1024 blocknr=1 sector=2 size=1024 count=1
EXT2-fs: unable to read superblock
attempt to access beyond end of device
03:07: rw=0, want=1, limit=0
dev 03:07 blksize=1024 blocknr=0 sector=0 size=1024 count=1
FAT bread failed
attempt to access beyond end of device
03:07: rw=0, want=33, limit=0
dev 03:07 blksize=1024 blocknr=32 sector=64 size=1024 count=1
isofs_read_super: bread failed, dev=03:07, iso_blknum=16, block=32
Kernel panic: VFS: unable to mount root fs on 03:07
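The device "03:07" in the panic message is a major:minor pair. A small sketch of how it can be decoded follows, assuming the classic Linux numbering in which major 3 is the first IDE channel and each disk owns 64 minors. The helper name ide0_dev_name is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: turn a major 3 (first IDE channel) device number
 * into its conventional name, e.g. minor 7 -> "hda7".
 * Assumption: 64 minors per disk, minor 0 of each range is the whole disk.
 * Returns 0 on success, -1 if the major is not 3 or the minor is out of range. */
static int ide0_dev_name(unsigned major, unsigned minor, char *buf, size_t len)
{
    if (major != 3 || minor > 127)
        return -1;
    char disk = (char)('a' + minor / 64);   /* 0..63 -> hda, 64..127 -> hdb */
    unsigned part = minor % 64;             /* 0 means the whole disk */
    if (part == 0)
        snprintf(buf, len, "hd%c", disk);
    else
        snprintf(buf, len, "hd%c%u", disk, part);
    return 0;
}
```

Decoded this way, 03:07 is /dev/hda7, while the partition check in the dmesg output further down lists only hda1, hda2, hda5 and hda6. That may be why the kernel reports limit=0 for the root device, although this is only a guess from the logs.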
System:
RedHat 6.1
kernel-2.2.14 + raid-2.2.14-B1 + mypatch1(attached below)
raidtools 19990824-0.90 + mypatch2(attached below)
CPU: Pentium III 600MHz (single)
3 SCSI cards (Adaptec AHA2940U2W)
24 SCSI HDDs (Seagate ST150176LW Barracuda 50.1GB)
Eight HDDs are connected to each SCSI card
The RAID is mounted on /raid (not /)
/etc/raidtab:
raiddev /dev/md0
    raid-level              5
    nr-raid-disks           24
    nr-spare-disks          0
    chunk-size              32
    persistent-superblock   1
    parity-algorithm        left-symmetric
    device                  /dev/sda1
    raid-disk               0
    ...
    device                  /dev/sdx1
    raid-disk               23
mypatch1: (raise the disk count limit from 12 to 24, fix an integer overflow)
--- linux/include/linux/raid/md_p.h~ Thu Mar 23 11:23:03 2000
+++ linux/include/linux/raid/md_p.h Thu Mar 23 13:47:20 2000
@@ -65,7 +65,7 @@
 #define MD_SB_GENERIC_STATE_WORDS 32
 #define MD_SB_GENERIC_WORDS (MD_SB_GENERIC_CONSTANT_WORDS + MD_SB_GENERIC_STATE_WORDS)
 #define MD_SB_PERSONALITY_WORDS 64
-#define MD_SB_DISKS_WORDS 384
+#define MD_SB_DISKS_WORDS 800
 #define MD_SB_DESCRIPTOR_WORDS 32
 #define MD_SB_RESERVED_WORDS (1024 - MD_SB_GENERIC_WORDS - MD_SB_PERSONALITY_WORDS - MD_SB_DISKS_WORDS - MD_SB_DESCRIPTOR_WORDS)
 #define MD_SB_EQUAL_WORDS (MD_SB_GENERIC_WORDS + MD_SB_PERSONALITY_WORDS + MD_SB_DISKS_WORDS)
--- linux/drivers/block/raid5.c~ Thu Mar 23 11:23:03 2000
+++ linux/drivers/block/raid5.c Thu Mar 23 13:43:54 2000
@@ -665,7 +665,7 @@
  * Output: index of the data and parity disk, and the sector # in them.
  */
 static inline unsigned long
-raid5_compute_sector (int r_sector, unsigned int raid_disks, unsigned int data_disks,
+raid5_compute_sector (unsigned long r_sector, unsigned int raid_disks, unsigned int data_disks,
 unsigned int * dd_idx, unsigned int * pd_idx,
 raid5_conf_t *conf)
 {
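To illustrate why the r_sector parameter was widened in mypatch1, here is a minimal sketch of the arithmetic. It assumes 512-byte sectors and uses the per-disk size of 48846720 kB taken from the superblock offset in the logs. The function names are invented for this example and do not appear in raid5.c.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: total number of 512-byte sectors addressable on a
 * RAID5 array, i.e. (disks - 1) data disks times the per-disk size. */
static uint64_t raid5_total_sectors(unsigned raid_disks, uint64_t disk_size_kb)
{
    uint64_t data_disks = raid_disks - 1;       /* one disk's worth is parity */
    return data_disks * disk_size_kb * 2;       /* 1 kB = 2 sectors */
}

/* Returns 1 if some sector numbers of the array do not fit in a signed int. */
static int sector_overflows_int(unsigned raid_disks, uint64_t disk_size_kb)
{
    return raid5_total_sectors(raid_disks, disk_size_kb) > (uint64_t)INT_MAX;
}
```

With 24 disks the array spans 2246949120 sectors, which is above the largest value of a 32-bit signed int (2147483647) but still fits in a 32-bit unsigned long. With 12 disks of the same size the sector count stays below that limit.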
mypatch2: (raise the disk count limit from 12 to 24)
--- md-int.h~ Fri Jan 14 15:19:22 2000
+++ md-int.h Mon Jan 17 12:29:34 2000
@@ -137,7 +137,7 @@
 #define MD_SB_GENERIC_STATE_WORDS 32
 #define MD_SB_GENERIC_WORDS (MD_SB_GENERIC_CONSTANT_WORDS + MD_SB_GENERIC_STATE_WORDS)
 #define MD_SB_PERSONALITY_WORDS 64
-#define MD_SB_DISKS_WORDS 800 /* taka, was 384*/
+#define MD_SB_DISKS_WORDS 800 /* taka, was 384 (see /usr/src/linux/include/linux/raid/md_p.h) */
 #define MD_SB_DESCRIPTOR_WORDS 32
 #define MD_SB_RESERVED_WORDS (1024 - MD_SB_GENERIC_WORDS - MD_SB_PERSONALITY_WORDS - MD_SB_DISKS_WORDS - MD_SB_DESCRIPTOR_WORDS)
 #define MD_SB_EQUAL_WORDS (MD_SB_GENERIC_WORDS + MD_SB_PERSONALITY_WORDS + MD_SB_DISKS_WORDS)
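The relation between MD_SB_DISKS_WORDS and the number of disks can be checked with a few lines of C. This sketch uses the constants visible in the patch context plus two assumptions that the diffs do not show: that MD_SB_GENERIC_CONSTANT_WORDS is 32, and that the disk limit is the disks area divided by the descriptor size. The SB_* names and both functions are made up for the example.

```c
/* Constants as they appear in the patch context. */
#define SB_TOTAL_WORDS        1024
#define SB_GENERIC_WORDS      (32 + 32)  /* assumption: CONSTANT_WORDS is 32 */
#define SB_PERSONALITY_WORDS  64
#define SB_DESCRIPTOR_WORDS   32

/* Number of per-disk descriptors that fit in the disks area
 * (assumption: limit = MD_SB_DISKS_WORDS / MD_SB_DESCRIPTOR_WORDS). */
static int sb_disk_slots(int disks_words)
{
    return disks_words / SB_DESCRIPTOR_WORDS;
}

/* Words left over in the superblock (mirrors MD_SB_RESERVED_WORDS). */
static int sb_reserved_words(int disks_words)
{
    return SB_TOTAL_WORDS - SB_GENERIC_WORDS - SB_PERSONALITY_WORDS
           - disks_words - SB_DESCRIPTOR_WORDS;
}
```

Under these assumptions 384 words give 12 slots and 800 words give 25, while the reserved area shrinks from 480 to 64 words. Since the kernel and raidtools each carry their own copy of this constant, both presumably have to be built with the same value to agree on the superblock layout.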
part of dmesg output:
Partition check:
sda: sda1
sdb: sdb1
sdc: sdc1
sdd: sdd1
sde: sde1
sdf: sdf1
sdg: sdg1
sdh: sdh1
sdi: sdi1
sdj: sdj1
sdk: sdk1
sdl: sdl1
sdm: sdm1
sdn: sdn1
sdo: sdo1
sdp: sdp1
sdq: sdq1
sdr: sdr1
sds: sds1
sdt: sdt1
sdu: sdu1
sdv: sdv1
sdw: sdw1
sdx: sdx1
md.c: sizeof(mdp_super_t) = 4096
hda: hda1 hda2 < hda5 hda6 >
autodetecting RAID arrays
(read) sda1's sb offset: 48846720 [events: 0000007c]
(read) sdb1's sb offset: 48846720 [events: 0000007c]
(read) sdc1's sb offset: 48846720 [events: 0000007c]
(read) sdd1's sb offset: 48846720 [events: 0000007c]
(read) sde1's sb offset: 48846720 [events: 0000007c]
(read) sdf1's sb offset: 48846720 [events: 0000007c]
(read) sdg1's sb offset: 48846720 [events: 0000007c]
(read) sdh1's sb offset: 48846720 [events: 0000007c]
(read) sdi1's sb offset: 48846720 [events: 0000007c]
(read) sdj1's sb offset: 48846720 [events: 0000007c]
(read) sdk1's sb offset: 48846720 [events: 0000007c]
(read) sdl1's sb offset: 48846720 [events: 0000007c]
(read) sdm1's sb offset: 48846720 [events: 0000007c]
(read) sdn1's sb offset: 48846720 [events: 0000007c]
(read) sdo1's sb offset: 48846720 [events: 0000007c]
(read) sdp1's sb offset: 48846720 [events: 0000007c]
(read) sdq1's sb offset: 48846720 [events: 0000007c]
(read) sdr1's sb offset: 48846720 [events: 0000007c]
(read) sds1's sb offset: 48846720 [events: 0000007c]
(read) sdt1's sb offset: 48846720 [events: 0000007c]
(read) sdu1's sb offset: 48846720 [events: 0000007c]
(read) sdv1's sb offset: 48846720 [events: 0000007c]
(read) sdw1's sb offset: 48846720 [events: 0000007c]
(read) sdx1's sb offset: 48846720 [events: 0000007c]
autorun ...
considering sdx1 ...
adding sdx1 ...
adding sdw1 ...
adding sdv1 ...
adding sdu1 ...
adding sdt1 ...
adding sds1 ...
adding sdr1 ...
adding sdq1 ...
adding sdp1 ...
adding sdo1 ...
adding sdn1 ...
adding sdm1 ...
adding sdl1 ...
adding sdk1 ...
adding sdj1 ...
adding sdi1 ...
adding sdh1 ...
adding sdg1 ...
adding sdf1 ...
adding sde1 ...
adding sdd1 ...
adding sdc1 ...
adding sdb1 ...
adding sda1 ...
created md0
bind<sda1,1>
bind<sdb1,2>
bind<sdc1,3>
bind<sdd1,4>
bind<sde1,5>
bind<sdf1,6>
bind<sdg1,7>
bind<sdh1,8>
bind<sdi1,9>
bind<sdj1,10>
bind<sdk1,11>
bind<sdl1,12>
bind<sdm1,13>
bind<sdn1,14>
bind<sdo1,15>
bind<sdp1,16>
bind<sdq1,17>
bind<sdr1,18>
bind<sds1,19>
bind<sdt1,20>
bind<sdu1,21>
bind<sdv1,22>
bind<sdw1,23>
bind<sdx1,24>
running: <sdx1><sdw1><sdv1><sdu1><sdt1><sds1><sdr1><sdq1><sdp1><sdo1><sdn1><sdm1><sdl1><sdk1><sdj1><sdi1><sdh1><sdg1><sdf1><sde1><sdd1><sdc1><sdb1><sda1>
now!
sdx1's event counter: 0000007c
sdw1's event counter: 0000007c
sdv1's event counter: 0000007c
sdu1's event counter: 0000007c
sdt1's event counter: 0000007c
sds1's event counter: 0000007c
sdr1's event counter: 0000007c
sdq1's event counter: 0000007c
sdp1's event counter: 0000007c
sdo1's event counter: 0000007c
sdn1's event counter: 0000007c
sdm1's event counter: 0000007c
sdl1's event counter: 0000007c
sdk1's event counter: 0000007c
sdj1's event counter: 0000007c
sdi1's event counter: 0000007c
sdh1's event counter: 0000007c
sdg1's event counter: 0000007c
sdf1's event counter: 0000007c
sde1's event counter: 0000007c
sdd1's event counter: 0000007c
sdc1's event counter: 0000007c
sdb1's event counter: 0000007c
sda1's event counter: 0000007c
md: md0: raid array is not clean -- starting background reconstruction
md0: max total readahead window set to 2944k
md0: 23 data-disks, max readahead per data-disk: 128k
raid5: disabled device sdx1 (device 23 already operational)
raid5: disabled device sdw1 (device 22 already operational)
raid5: disabled device sdv1 (device 21 already operational)
raid5: disabled device sdu1 (device 20 already operational)
raid5: disabled device sdt1 (device 19 already operational)
raid5: disabled device sds1 (device 18 already operational)
raid5: device sdr1 operational as raid disk 17
raid5: device sdq1 operational as raid disk 16
raid5: disabled device sdp1 (device 15 already operational)
raid5: device sdo1 operational as raid disk 14
raid5: device sdn1 operational as raid disk 13
raid5: device sdm1 operational as raid disk 12
raid5: device sdl1 operational as raid disk 11
raid5: device sdk1 operational as raid disk 10
raid5: device sdj1 operational as raid disk 9
raid5: device sdi1 operational as raid disk 8
raid5: device sdh1 operational as raid disk 7
raid5: device sdg1 operational as raid disk 6
raid5: device sdf1 operational as raid disk 5
raid5: device sde1 operational as raid disk 4
raid5: device sdd1 operational as raid disk 3
raid5: device sdc1 operational as raid disk 2
raid5: device sdb1 operational as raid disk 1
raid5: device sda1 operational as raid disk 0
raid5: not enough operational devices for md0 (10/24 failed)
RAID5 conf printout:
--- rd:24 wd:14 fd:10
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda1
disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdb1
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdc1
disk 3, s:0, o:1, n:3 rd:3 us:1 dev:sdd1
disk 4, s:0, o:1, n:4 rd:4 us:1 dev:sde1
disk 5, s:0, o:1, n:5 rd:5 us:1 dev:sdf1
disk 6, s:0, o:1, n:6 rd:6 us:1 dev:sdg1
disk 7, s:0, o:1, n:7 rd:7 us:1 dev:sdh1
disk 8, s:0, o:1, n:8 rd:8 us:1 dev:sdi1
disk 9, s:0, o:1, n:9 rd:9 us:1 dev:sdj1
disk 10, s:0, o:1, n:10 rd:10 us:1 dev:sdk1
disk 11, s:0, o:1, n:11 rd:11 us:1 dev:sdl1
raid5: failed to run raid set md0
pers->run() failed ...
do_md_run() returned -22
unbind<sdx1,23>
export_rdev(sdx1)
unbind<sdw1,22>
export_rdev(sdw1)
unbind<sdv1,21>
export_rdev(sdv1)
unbind<sdu1,20>
export_rdev(sdu1)
unbind<sdt1,19>
export_rdev(sdt1)
unbind<sds1,18>
export_rdev(sds1)
unbind<sdr1,17>
export_rdev(sdr1)
unbind<sdq1,16>
export_rdev(sdq1)
unbind<sdp1,15>
export_rdev(sdp1)
unbind<sdo1,14>
export_rdev(sdo1)
unbind<sdn1,13>
export_rdev(sdn1)
unbind<sdm1,12>
export_rdev(sdm1)
unbind<sdl1,11>
export_rdev(sdl1)
unbind<sdk1,10>
export_rdev(sdk1)
unbind<sdj1,9>
export_rdev(sdj1)
unbind<sdi1,8>
export_rdev(sdi1)
unbind<sdh1,7>
export_rdev(sdh1)
unbind<sdg1,6>
export_rdev(sdg1)
unbind<sdf1,5>
export_rdev(sdf1)
unbind<sde1,4>
export_rdev(sde1)
unbind<sdd1,3>
export_rdev(sdd1)
unbind<sdc1,2>
export_rdev(sdc1)
unbind<sdb1,1>
export_rdev(sdb1)
unbind<sda1,0>
export_rdev(sda1)
md0 stopped.
... autorun DONE.
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 40k freed
Adding Swap: 208804k swap-space (priority -1)
(read) sda1's sb offset: 48846720 [events: 0000007c]
(read) sdb1's sb offset: 48846720 [events: 0000007c]
(read) sdc1's sb offset: 48846720 [events: 0000007c]
(read) sdd1's sb offset: 48846720 [events: 0000007c]
(read) sde1's sb offset: 48846720 [events: 0000007c]
(read) sdf1's sb offset: 48846720 [events: 0000007c]
(read) sdg1's sb offset: 48846720 [events: 0000007c]
(read) sdh1's sb offset: 48846720 [events: 0000007c]
(read) sdi1's sb offset: 48846720 [events: 0000007c]
(read) sdj1's sb offset: 48846720 [events: 0000007c]
(read) sdk1's sb offset: 48846720 [events: 0000007c]
(read) sdl1's sb offset: 48846720 [events: 0000007c]
autorun ...
considering sdl1 ...
adding sdl1 ...
adding sdk1 ...
adding sdj1 ...
adding sdi1 ...
adding sdh1 ...
adding sdg1 ...
adding sdf1 ...
adding sde1 ...
adding sdd1 ...
adding sdc1 ...
adding sdb1 ...
adding sda1 ...
created md0
bind<sda1,1>
bind<sdb1,2>
bind<sdc1,3>
bind<sdd1,4>
bind<sde1,5>
bind<sdf1,6>
bind<sdg1,7>
bind<sdh1,8>
bind<sdi1,9>
bind<sdj1,10>
bind<sdk1,11>
bind<sdl1,12>
running: <sdl1><sdk1><sdj1><sdi1><sdh1><sdg1><sdf1><sde1><sdd1><sdc1><sdb1><sda1>
now!
sdl1's event counter: 0000007c
sdk1's event counter: 0000007c
sdj1's event counter: 0000007c
sdi1's event counter: 0000007c
sdh1's event counter: 0000007c
sdg1's event counter: 0000007c
sdf1's event counter: 0000007c
sde1's event counter: 0000007c
sdd1's event counter: 0000007c
sdc1's event counter: 0000007c
sdb1's event counter: 0000007c
sda1's event counter: 0000007c
md: md0: raid array is not clean -- starting background reconstruction
md0: max total readahead window set to 2944k
md0: 23 data-disks, max readahead per data-disk: 128k
raid5: device sdl1 operational as raid disk 11
raid5: device sdk1 operational as raid disk 10
raid5: device sdj1 operational as raid disk 9
raid5: device sdi1 operational as raid disk 8
raid5: device sdh1 operational as raid disk 7
raid5: device sdg1 operational as raid disk 6
raid5: device sdf1 operational as raid disk 5
raid5: device sde1 operational as raid disk 4
raid5: device sdd1 operational as raid disk 3
raid5: device sdc1 operational as raid disk 2
raid5: device sdb1 operational as raid disk 1
raid5: device sda1 operational as raid disk 0
raid5: not enough operational devices for md0 (12/24 failed)
RAID5 conf printout:
--- rd:24 wd:12 fd:12
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda1
disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdb1
disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdc1
disk 3, s:0, o:1, n:3 rd:3 us:1 dev:sdd1
disk 4, s:0, o:1, n:4 rd:4 us:1 dev:sde1
disk 5, s:0, o:1, n:5 rd:5 us:1 dev:sdf1
disk 6, s:0, o:1, n:6 rd:6 us:1 dev:sdg1
disk 7, s:0, o:1, n:7 rd:7 us:1 dev:sdh1
disk 8, s:0, o:1, n:8 rd:8 us:1 dev:sdi1
disk 9, s:0, o:1, n:9 rd:9 us:1 dev:sdj1
disk 10, s:0, o:1, n:10 rd:10 us:1 dev:sdk1
disk 11, s:0, o:1, n:11 rd:11 us:1 dev:sdl1
raid5: failed to run raid set md0
pers->run() failed ...
do_md_run() returned -22
unbind<sdl1,11>
export_rdev(sdl1)
unbind<sdk1,10>
export_rdev(sdk1)
unbind<sdj1,9>
export_rdev(sdj1)
unbind<sdi1,8>
export_rdev(sdi1)
unbind<sdh1,7>
export_rdev(sdh1)
unbind<sdg1,6>
export_rdev(sdg1)
unbind<sdf1,5>
export_rdev(sdf1)
unbind<sde1,4>
export_rdev(sde1)
unbind<sdd1,3>
export_rdev(sdd1)
unbind<sdc1,2>
export_rdev(sdc1)
unbind<sdb1,1>
export_rdev(sdb1)
unbind<sda1,0>
export_rdev(sda1)
md0 stopped.
... autorun DONE.
Bad md_map in ll_rw_block
EXT2-fs: unable to read superblock
Bad md_map in ll_rw_block
EXT2-fs: unable to read superblock
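The summary line in the "RAID5 conf printout" can be read mechanically. The following sketch parses a line of the form "--- rd:24 wd:14 fd:10" and applies the basic RAID5 rule that at most one member may be missing. The struct and function names are made up for this note, and the field meanings (raid disks, working disks, failed disks) are inferred from the log rather than taken from the driver source.

```c
#include <stdio.h>

struct raid5_summary {
    int raid_disks;     /* rd: disks the array is configured for */
    int working_disks;  /* wd: disks the driver accepted */
    int failed_disks;   /* fd: disks counted as failed or missing */
};

/* Parse a "--- rd:N wd:N fd:N" line. Returns 0 on success, -1 otherwise. */
static int parse_raid5_summary(const char *line, struct raid5_summary *out)
{
    int n = sscanf(line, " --- rd:%d wd:%d fd:%d",
                   &out->raid_disks, &out->working_disks, &out->failed_disks);
    return n == 3 ? 0 : -1;
}

/* RAID5 carries one disk's worth of parity, so it can run with at most
 * one member missing. Returns 1 if the array could start, 0 if not. */
static int raid5_can_start(const struct raid5_summary *s)
{
    return s->failed_disks <= 1 &&
           s->working_disks >= s->raid_disks - 1;
}
```

Both summaries in the log (wd:14 fd:10 and wd:12 fd:12) fail this check, which is consistent with the "not enough operational devices" messages. All 24 superblocks carried the same event counter (0000007c) in the first pass, so the rejection of these disks does not by itself show that they are damaged.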