Hello Jakob,

I recently had a two-disk failure, and found your HOWTO as well as a message from Martin Bene very helpful in resolving this. Below you find a modified version of chapter 6.1 of your FAQ, where I merged your and Martin's versions in order to make things somewhat more detailed and explicit. The new version may make the recovery procedure clearer for people experiencing the problem - and in that situation, they are of course happy about all the help they can get...

The two-disks-failed situation seems to happen relatively often (due to controller/hardware failures or hiccups), and this is where RAID is effectively more dangerous than non-RAID (one large disk). A more automated and fool-proof tool for resolving this might be the ideal solution (but more than I can deliver currently). If someone on the mailing list finds a mistake (am I really right about the spare-disk?) or has an improved version, please post!

========= my proposed howto version:

6.1 Recovery from a multiple disk failure

The scenario is:

  * A controller dies and takes two disks offline at the same time,
  * all disks on one SCSI bus can no longer be reached if a disk dies,
  * a cable comes loose...

In short: quite often you get a temporary failure of several disks at once; afterwards the RAID superblocks are out of sync and you can no longer init your RAID array.

One thing is left: rewrite the RAID superblocks by running mkraid --force.

To get this to work, you'll need to have an up-to-date /etc/raidtab - if it doesn't EXACTLY match the devices and ordering of the original disks, this won't work.

Look at the syslog produced by trying to start the array; you'll see the event count for each superblock. Usually it's best to leave out the disk with the lowest event count, i.e. the one that failed first, by using "failed-disk". It's important that you replace "raid-disk" with "failed-disk" for that drive in your raidtab.
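As a sketch, the relevant part of /etc/raidtab for a three-disk RAID-5 array might then look like this (the device names /dev/sdb1, /dev/sdc1, /dev/sdd1 are just example assumptions - use your own; here /dev/sdc1 is assumed to be the drive with the lowest event count):

    raiddev /dev/md0
        raid-level              5
        nr-raid-disks           3
        persistent-superblock   1
        chunk-size              32
        device                  /dev/sdb1
        raid-disk               0
        device                  /dev/sdc1
        failed-disk             1       # was: raid-disk 1
        device                  /dev/sdd1
        raid-disk               2

Note that only the "raid-disk" keyword for the left-out drive changes to "failed-disk"; the device line and the disk index stay exactly as before.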
If you mkraid without that "failed-disk" change, the recovery thread will kick in immediately and start rebuilding the parity blocks. If you got something wrong, this will definitely kill your data. So, you mark one disk as failed and create the array in degraded mode (the kernel won't try to recover/resync the array then). With "failed-disk" you can specify exactly which disks you want to be active, and you can try different combinations for best results. BTW, only mount the filesystem read-only while trying this out... If you have a spare-disk, you should mark that as "failed-disk", too.

  * Check your raidtab against the info you get in the logs from the failed startup (correct sequence of partitions).
  * Mark one of the disks with the lowest event count as a "failed-disk" instead of "raid-disk" in /etc/raidtab.
  * Recreate the RAID superblocks using mkraid.
  * Try to mount read-only, and check if all is OK.
  * If it doesn't work, recheck your raidtab, perhaps mark a different drive as failed, and go back to the mkraid step.
  * Unmount, so you can fsck your RAID device (which you probably want to do).
  * Add the last disk using raidhotadd.
  * Mount normally.
  * Remove the failed-disk changes from your raidtab.

========= your original version at http://www.ostenfeld.dk/~jakob/Software-RAID.HOWTO/

6.1 Recovery from a multiple disk failure

The scenario is:

  * A controller dies and takes two disks offline at the same time,
  * all disks on one SCSI bus can no longer be reached if a disk dies,
  * a cable comes loose...

In short: quite often you get a temporary failure of several disks at once; afterwards the RAID superblocks are out of sync and you can no longer init your RAID array.

One thing is left: rewrite the RAID superblocks by running mkraid --force.

To get this to work, you'll need to have an up-to-date /etc/raidtab - if it doesn't EXACTLY match the devices and ordering of the original disks, this won't work.
Look at the syslog produced by trying to start the array; you'll see the event count for each superblock. Usually it's best to leave out the disk with the lowest event count, i.e. the oldest one.

If you mkraid without failed-disk, the recovery thread will kick in immediately and start rebuilding the parity blocks - not necessarily what you want at that moment. With failed-disk you can specify exactly which disks you want to be active and perhaps try different combinations for best results. BTW, only mount the filesystem read-only while trying this out...

This has been successfully used by at least two guys I've been in contact with.
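For reference, the whole sequence from my proposed version might look like this as a command transcript (again, /dev/md0, /dev/sdc1 and the mount point /mnt are just example assumptions, and the filesystem is assumed to be ext2 - adjust everything to your own setup):

    # with one drive marked failed-disk in /etc/raidtab:
    mkraid --force /dev/md0

    # check the data without risking any writes
    mount -o ro /dev/md0 /mnt
    ls /mnt                     # does it look like your data?
    umount /mnt

    # check the filesystem before using it for real
    e2fsck -f /dev/md0

    # bring the left-out drive back in; the kernel resyncs it
    raidhotadd /dev/md0 /dev/sdc1

    mount /dev/md0 /mnt

If the mounted data looks wrong, go back, mark a different drive as failed-disk in the raidtab, and repeat the mkraid step - but never let a resync run until you are sure the active set is right.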