List,

        I thought I would pass this along should anyone else experience a loss 
of all partitions on a drive or array. May help somebody out someday:

dmraid Partition Loss with dmraid-1.0.0rc15

        Testing dmraid-1.0.0rc15 on a box with two separate dmraid arrays, I 
experienced the total loss of all partitions on the second dmraid array. The 
first array held an openSuSE install running dmraid-1.0.0rc14, while the second 
held Archlinux with dmraid-1.0.0rc15, where the testing was being done. All 
testing of dmraid-1.0.0rc15 on Archlinux went fine; the problem occurred when 
the machine was booted back into openSuSE. Regardless of the situation, whether 
you use a raid setup or not, partition loss is serious business.

dmraid Partition Recovery

        Recovery of dmraid partitions proceeds in the same manner as recovering 
partitions from a single drive. If you haven't destroyed the information on the 
array, you should be able to put the pieces of the puzzle back together again. 
The basic outline of the process is to locate and restore the partitions on 
the array and then reinstall the boot loader so your box is functional again. 
(Note: if you were smart enough to save the "fdisk -l" information for your 
drives, you can simply fdisk your array and be done.)
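
        As an aside, sfdisk can dump and restore a partition table in one step, 
which beats retyping everything into fdisk by hand. A minimal sketch, assuming 
your array appears as /dev/mapper/your_array (a placeholder, substitute your 
actual device):

        # while the array is healthy, dump the partition table to a file
        sfdisk -d /dev/mapper/your_array > array-partitions.dump
        # after a partition table loss, write it back
        sfdisk /dev/mapper/your_array < array-partitions.dump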

Tools Required

        Partition location and recovery software (I used testdisk)
                http://www.cgsecurity.org/
                http://www.cgsecurity.org/wiki/TestDisk_Download
                http://www.cgsecurity.org/testdisk-6.11.linux26.tar.bz2
                
        Rescue CD for your OS (generally your install CD/DVD, or knoppix, etc.)

Using testdisk

        testdisk is a great piece of GPL code written by Christophe Grenier. 
testdisk can be used with most operating systems; it will scan your disk or 
array, locate the partition boundaries, and give you the opportunity to recover 
them. I had 4 partitions dedicated to my Archlinux install totaling roughly 70G 
on a 750G raid array. To start testdisk on Linux26, untar the bzip2 archive and 
then cd into the linux subdirectory. The prebuilt binary is:
        
        ./testdisk_static
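
        For reference, the whole startup sequence looks something like this (a 
sketch; the tarball should unpack into a testdisk-6.11 directory, and passing 
the array device on the command line is optional, you can run testdisk_static 
with no argument and pick the device from a menu instead):

        tar xjf testdisk-6.11.linux26.tar.bz2
        cd testdisk-6.11/linux
        ./testdisk_static /dev/mapper/your_array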
        
        The first thing you will need to do is set the correct disk geometry. 
In my case the disk reported 254 heads and needed to be changed to 255 heads to 
work properly. (Checking the geometry is also recommended if the first Quick 
Scan doesn't find your partitions.)
        
        After setting the geometry, just choose "Analyze" and "Quick Scan" and 
go get a coffee or something. In my case, since the 70G I was using was at the 
front of the 750G array, testdisk found my partitions within 5 minutes or so. 
Once all of your partitions are found, you can "Stop" the scan by hitting the 
return key.
        
        You are then presented with the list of found partitions. They will 
initially be labeled "D" for deleted; simply toggle each partition you need to 
recover to the proper type ("P" Primary, "*" Primary Boot, "L" Logical, or 
leave it as "D" for Deleted). testdisk will check your selections for partition 
overlap and confirm in green if your partition layout is OK. Just hit return to 
continue. Don't worry about the extended partition boundary; it will be 
provided. Review the partitions to be recovered, choose "Write", and you are 
done. (A reboot is required to activate the partitions.)
        
        If no partitions were found during the "Quick Scan", then (1) check 
your drive geometry setting, and (2) take the option to run an "In Depth Scan" 
(go get 4 cups of coffee, walk the dog, etc...)

Have Your Rescue CD Handy

        Once the partition information has been changed, there is a near 100% 
chance your boot loader configuration will be messed up. Don't worry, 
everything is still there; you just have to reinstall grub or lilo into the 
boot record to recover from the situation.

Reinstalling Grub

        Here you will boot from your CD or DVD into rescue mode, use dmraid to 
activate the arrays, and then use the information about the dm nodes in 
/dev/mapper and the partition information from "cat /proc/partitions" to 
create a chroot of your install and repair the boot loader:

(1) boot from the install DVD

(2) choose "Rescue System", login as "root" (no password needed)

(3) activate the dmraid arrays with "dmraid -ay"

(4) check which device nodes to use to create the chroot with "ls -al /dev/dm*" 
or "ls -al /dev/mapper". I was dealing with 2 separate arrays and 9 partitions 
(duplicated by having both dmraid-1.0.0rc14 and dmraid-1.0.0rc15 metadata), 
which left me with dm-0 through dm-20 to deal with. Compare the sizes shown for 
dm-X and /dev/mapper/raiddevice_name against the sizes shown by "cat 
/proc/partitions" to determine which devices hold "/", "/home", "/boot" and any 
other partitions you need to set up in your chroot.
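
        To make the size comparison easy, list both and match the "#blocks" 
column from /proc/partitions (1K blocks) against the partition sizes you 
expect. A quick sketch (the 15G figure is just an example):

        ls -al /dev/mapper
        cat /proc/partitions
        # a 15G root partition shows up as roughly 15728640 blocks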

(5) mount the dm-X devices or /dev/mapper devices under /mnt to re-create your 
actual filesystem layout, and then bind dev/, proc/ and sys/ to their 
respective mount points under /mnt and chroot.

    **Note, you need to mount the device containing the / (root) filesystem 
first, before mounting /boot and /home; otherwise, the /boot and /home mount 
points will not exist:

        Example:

        mount /dev/dm-5 /mnt           # the / (root) filesystem first
        mount /dev/dm-7 /mnt/boot
        mount /dev/dm-6 /mnt/home
        mount -o bind /dev /mnt/dev    # bind the host /dev, /proc and /sys
        mount -o bind /proc /mnt/proc
        mount -o bind /sys /mnt/sys
        cd /mnt
        chroot /mnt                    # switch into the installed system

(6) Reinstall grub to fix the MBR on your raid disks (mine were hd0 and hd1). 
See http://wiki.archlinux.org/index.php/Installing_with_Fake-RAID#Install_GRUB 
for my notes on getting the (hdX,Y) numbers right. When you start grub, you get 
a small ">" prompt; just use the following as a guide. If you only have a 
single array, you will only need to worry about setting up hd0:

        grub
        >root (hd0,4)
        >setup (hd0)
        >*** few lines of grub output ***
        >root (hd1,5)
        >setup (hd1)
        >*** more lines of grub output ***
        >quit
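
        If you aren't sure which (hdX,Y) holds your boot files, grub's "find" 
command will report every partition containing the named file. A quick sketch 
(use "find /grub/stage1" instead if /boot is a separate partition):

        grub
        >find /boot/grub/stage1
        >*** grub lists the matching (hdX,Y) entries ***
        >quit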

(7) check your /etc/grub.conf to make sure it agrees with the way you have just 
configured grub. For the example above, it should look like this for hd0 (I 
boot to hd0 and then chainload to get to hd1 and the second array):

        setup --stage2=/boot/grub/stage2 (hd0) (hd0,4)
        quit

(8) exit (to leave the chroot) and reboot. If you were successful (or just damn 
lucky), your system will be 100% again. Now immediately do "fdisk -l" on each 
of your arrays and drives and save that information remotely, so if this ever 
happens again, you have a shortcut ;-)
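
        Saving the output somewhere off the box can be as simple as this (the 
device names and user@otherbox are placeholders, substitute your own):

        fdisk -l /dev/sda /dev/sdb /dev/mapper/your_array > fdisk-l.txt
        scp fdisk-l.txt user@otherbox:~/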


-- 
David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339
www.rankinlawfirm.com
