Hi,

I have a few questions about an array running on a RedHat 6.0 distribution with a 2.2.5-22 kernel and raidtools 0.9. The array has 4 SCSI disks, one of which has failed:

# cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sda1[0](F) sdd1[3] sdc1[2] sde1[1] 26627328 blocks level 5, 128k chunk, algorithm 2 [4/3] [_UUU]
unused devices: <none>

I tried to replace sda1 (the drive with SCSI ID 0) with another physical drive I had outside the machine (which was once part of this RAID array). The replacement drive had ID 0 too. I played around trying to get the array working, but for some reason it would not work (sorry, I do not have any output from that attempt). I suspect that the old information on the replacement drive may have caused the problem, since putting the faulty drive back in, despite its time inconsistencies, allowed the array to be started up again.

I have a spare disk in this machine which has been added to the array with raidhotadd. I wanted this spare disk to be added automatically as a hot spare, but I have been unable to get this working (the relevant lines are now commented out in the /etc/raidtab file below).

Can anyone give me some insight into what is going on here?
Should I format the partition on the drive that did not work, so that the superblock and related information is no longer present?
Should I seriously consider compiling a 2.2.14 kernel with the latest raidtools patch?
Should I pull my last hair out of my head?

Kind regards,
Stuart.
# raidstart --version
raidstart v0.3d compiled for md raidtools-0.90

# cat /etc/raidtab
raiddev /dev/md0
    raid-level 5
    nr-raid-disks 4
    chunk-size 128
    persistent-superblock 1
    parity-algorithm left-symmetric
    # Spare disks for hot reconstruction
    #nr-spare-disks 1
    device /dev/sda1
    raid-disk 0
    device /dev/sde1
    raid-disk 1
    device /dev/sdc1
    raid-disk 2
    device /dev/sdd1
    raid-disk 3
    #device /dev/sdb1
    #spare-disk 0

# cat /var/log/dmesg
wansea University Computer Society NET3.039
NET4: Unix domain sockets 1.0 for Linux NET4.0.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
Initializing RT netlink socket
Starting kswapd v 1.5
Detected PS/2 Mouse Port.
Serial driver version 4.27 with MANY_PORTS MULTIPORT SHARE_IRQ enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
pty: 256 Unix98 ptys configured
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.9)
Real Time Clock Driver v1.09
RAM disk driver initialized: 16 RAM disks of 4096K size
PIIX: IDE controller on PCI bus 00 dev 38
PIIX: not 100% native mode: will probe irqs later
PIIX: neither IDE port enabled (BIOS)
PIIX: IDE controller on PCI bus 00 dev 39
PIIX: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xe800-0xe807, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0xe808-0xe80f, BIOS settings: hdc:pio, hdd:pio
hda: ST31720A, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ST31720A, 1626MB w/0kB Cache, CHS=826/64/63
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
md driver 0.90.0 MAX_MD_DEVS=256, MAX_REAL=12
raid5: measuring checksumming speed
8regs : 169.545 MB/sec
32regs : 149.352 MB/sec
using fastest function: 8regs (169.545 MB/sec)
scsi : 0 hosts.
scsi : detected total.
md.c: sizeof(mdp_super_t) = 4096
Partition check:
hda: hda1 hda2 hda3
RAMDISK: Compressed image found at block 0
autodetecting RAID arrays
autorun ...
... autorun DONE.
VFS: Mounted root (ext2 filesystem).
(scsi0) <Adaptec AHA-294X SCSI host adapter> found at PCI 10/0
(scsi0) Narrow Channel, SCSI ID=7, 16/255 SCBs
(scsi0) Downloading sequencer code... 406 instructions downloaded
scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.16/3.2.4
<Adaptec AHA-294X SCSI host adapter>
scsi : 1 host.
(scsi0:0:0:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Vendor: SEAGATE Model: ST19171N Rev: 0023
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
(scsi0:0:1:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Vendor: SEAGATE Model: ST19171N Rev: 0024
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdb at scsi0, channel 0, id 1, lun 0
(scsi0:0:2:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Vendor: SEAGATE Model: ST19171N Rev: 0023
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdc at scsi0, channel 0, id 2, lun 0
(scsi0:0:3:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Vendor: SEAGATE Model: ST39140N Rev: 1498
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdd at scsi0, channel 0, id 3, lun 0
(scsi0:0:4:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Vendor: SEAGATE Model: ST19171N Rev: 0024
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sde at scsi0, channel 0, id 4, lun 0
SCSI device sda: hdwr sector= 512 bytes. Sectors= 17783112 [8683 MB] [8.7 GB]
sda: sda1
SCSI device sdb: hdwr sector= 512 bytes. Sectors= 17783112 [8683 MB] [8.7 GB]
sdb: sdb1
SCSI device sdc: hdwr sector= 512 bytes. Sectors= 17783112 [8683 MB] [8.7 GB]
sdc: sdc1
SCSI device sdd: hdwr sector= 512 bytes. Sectors= 17783240 [8683 MB] [8.7 GB]
sdd: sdd1
SCSI device sde: hdwr sector= 512 bytes. Sectors= 17783112 [8683 MB] [8.7 GB]
sde: sde1
autodetecting RAID arrays
(read) sda1's sb offset: 8883840 [events: 0000003c]
(read) sdb1's sb offset: 8883840 [events: 00000021]
(read) sdc1's sb offset: 8875776 [events: 0000003e]
(read) sdd1's sb offset: 8883840 [events: 0000003e]
(read) sde1's sb offset: 8883840 [events: 0000003e]
autorun ...
considering sde1 ...
adding sde1 ...
adding sdd1 ...
adding sdc1 ...
adding sdb1 ...
adding sda1 ...
created md0
bind<sda1,1>
bind<sdb1,2>
bind<sdc1,3>
bind<sdd1,4>
bind<sde1,5>
running: <sde1><sdd1><sdc1><sdb1><sda1> now!
sde1's event counter: 0000003e
sdd1's event counter: 0000003e
sdc1's event counter: 0000003e
sdb1's event counter: 00000021
sda1's event counter: 0000003c
md: superblock update time inconsistency -- using the most recent one
freshest: sde1
md: kicking non-fresh sdb1 from array!
unbind<sdb1,4>
export_rdev(sdb1)
md: kicking non-fresh sda1 from array!
unbind<sda1,3>
export_rdev(sda1)
md0: removing former faulty sda1!
kmod: failed to exec /sbin/modprobe -s -k md-personality-4, errno = 2
do_md_run() returned -22
unbind<sde1,2>
export_rdev(sde1)
unbind<sdd1,1>
export_rdev(sdd1)
unbind<sdc1,0>
export_rdev(sdc1)
md0 stopped.
... autorun DONE.
VFS: Mounted root (ext2 filesystem) readonly.
change_root: old root has d_count=1
Trying to unmount old root ... okay
Freeing unused kernel memory: 60k freed
Adding Swap: 128988k swap-space (priority -1)
(read) sda1's sb offset: 8883840 [events: 0000003c]
(read) sde1's sb offset: 8883840 [events: 0000003e]
(read) sdc1's sb offset: 8875776 [events: 0000003e]
(read) sdd1's sb offset: 8883840 [events: 0000003e]
autorun ...
considering sdd1 ...
adding sdd1 ...
adding sdc1 ...
adding sde1 ...
adding sda1 ...
created md0
bind<sda1,1>
bind<sde1,2>
bind<sdc1,3>
bind<sdd1,4>
running: <sdd1><sdc1><sde1><sda1> now!
sdd1's event counter: 0000003e
sdc1's event counter: 0000003e
sde1's event counter: 0000003e
sda1's event counter: 0000003c
md: superblock update time inconsistency -- using the most recent one
freshest: sdd1
md: kicking non-fresh sda1 from array!
unbind<sda1,3>
export_rdev(sda1)
md0: removing former faulty sda1!
raid5 personality registered
md0: max total readahead window set to 1536k
md0: 3 data-disks, max readahead per data-disk: 512k
raid5: device sdd1 operational as raid disk 3
raid5: device sdc1 operational as raid disk 2
raid5: device sde1 operational as raid disk 1
raid5: md0, not all disks are operational -- trying to recover array
raid5: allocated 4248kB for md0
raid5: raid level 5 set md0 active with 3 out of 4 devices, algorithm 2
RAID5 conf printout:
 --- rd:4 wd:3 fd:1
 disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
 disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sde1
 disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdc1
 disk 3, s:0, o:1, n:3 rd:3 us:1 dev:sdd1
 disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
RAID5 conf printout:
 --- rd:4 wd:3 fd:1
 disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
 disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sde1
 disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdc1
 disk 3, s:0, o:1, n:3 rd:3 us:1 dev:sdd1
 disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
md: updating md0 RAID superblock on device
sdd1 [events: 0000003f](write) sdd1's sb offset: 8883840
md: recovery thread got woken up ...
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
sdc1 [events: 0000003f](write) sdc1's sb offset: 8875776
sde1 [events: 0000003f](write) sde1's sb offset: 8883840
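
A note for anyone reading the log: the "kicking non-fresh" messages line up with the per-device event counters printed just before them. The Python sketch below is only an illustration, not the kernel's code. It parses the counter lines from the second autorun pass and lists the devices whose counter is behind the highest one. The helper names (parse_event_counters, split_fresh_stale) are made up for this example, and the rule "anything below the highest counter is stale" is an assumption; the md driver's exact comparison may differ.

```python
import re

# Event-counter lines copied from the second autorun pass in the log.
LOG = """\
sdd1's event counter: 0000003e
sdc1's event counter: 0000003e
sde1's event counter: 0000003e
sda1's event counter: 0000003c
"""

# Matches lines such as "sda1's event counter: 0000003c" (value is hexadecimal).
COUNTER_RE = re.compile(r"(\w+)'s event counter: ([0-9a-fA-F]+)")


def parse_event_counters(text):
    """Return {device: counter} parsed from md 'event counter' log lines."""
    return {dev: int(value, 16) for dev, value in COUNTER_RE.findall(text)}


def split_fresh_stale(counters):
    """Split devices into those at the highest counter and those behind it.

    Assumption: a device counts as stale whenever its counter is lower than
    the highest one seen; the real md driver's rule may be more lenient.
    """
    freshest = max(counters.values())
    fresh = sorted(dev for dev, c in counters.items() if c == freshest)
    stale = sorted(dev for dev, c in counters.items() if c < freshest)
    return fresh, stale


if __name__ == "__main__":
    counters = parse_event_counters(LOG)
    fresh, stale = split_fresh_stale(counters)
    print("fresh:", fresh)
    print("stale:", stale)
```

Run against the lines above, this flags sda1 (0x3c against 0x3e on the other three), which matches the "kicking non-fresh sda1" message. In the first autorun pass sdb1 (0x21) is even further behind, so a disk added with raidhotadd before that boot would be dropped the same way.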