On Thu, 2009-10-22 at 16:00 +1300, Roger Searle wrote:
> Hi, I noticed by chance that I have a failed drive in a raid1 array on a
> file server that I need to replace, and I'm seeking some guidance or
> confirmation that I'm on the right track to resolve this. Since more
> than one partition has failed, it seems I will need to buy a new disk
> rather than there being any repair option; I may as well get a new pair,
> but replace the failed disk first, then once that's resolved replace the
> other. Yes, I have backups of the valuable data on other drives, both in
> the same machine (not in this array) and elsewhere. And I then need to
> set up better monitoring, because the failure began a few weeks ago. But
> for now...

There are so many levels of electronics that you go through to get to
the platter these days that if you see even a single hard error, now's a
good time to use the disk only for skeet shooting...

> The failed disk is 320GB, and contains (mirrored) /, home, and swap.
> Presumably I could buy much larger disks, and need to repartition prior
> to adding it back into the array?

Best to use the same make/model of disk if possible. Speed differences
between the two can make it unreliable (that's an exaggeration, but you
know what I mean).

> The partitions should be at least the same size but could be much larger
> without any problem?

If you want to add more space, then I'd buy a new pair of bigger disks,
create a new set, and copy everything across. The reason I'm saying that
is that your two existing disks are probably exactly the same make/model,
with similar serial numbers... guess what's going to fail next (:
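If you do stick with replacing just the failed disk, the usual sequence
is roughly the following. This is a sketch only: the device names (sda =
surviving disk, sdb = failed disk, and the partition-to-array mapping)
are inferred from the /proc/mdstat output quoted below, and `sfdisk -d`
assumes the classic MBR partition table a 2009-era 320GB disk will be
using -- double-check everything against your own machine first.

```shell
# md0 and md3 have already dropped their sdb halves; md1 holds sdb2
# as failed, and md2 still thinks sdb3 is good, so clear those out:
mdadm /dev/md1 --remove /dev/sdb2
mdadm /dev/md2 --fail /dev/sdb3 --remove /dev/sdb3

# Power down and swap the disk, then copy the partition table
# across from the surviving disk (MBR layouts only):
sfdisk -d /dev/sda | sfdisk /dev/sdb

# Re-add each new partition to its mirror and let it resync:
mdadm --add /dev/md0 /dev/sdb1
mdadm --add /dev/md1 /dev/sdb2
mdadm --add /dev/md2 /dev/sdb3
mdadm --add /dev/md3 /dev/sdb4

# If grub is the bootloader, put a bootstrap on the new disk too,
# so the box can still boot if sda is the next to die:
grub-install /dev/sdb

# Then keep an eye on the rebuild:
watch cat /proc/mdstat
```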
Last time I looked, 1TB was around the $125 mark.

> There is some configuration data in mdadm.conf including UUIDs of the
> arrays, and this doesn't match the UUIDs in fstab. Do I need to be
> concerned about this sort of thing, or can I just use mdadm or other
> tools to rebuild the arrays, and that will update any relevant config
> files?

mdadm.conf is pretty much redundant, I think -- arrays tend to be
automagically assembled at boot time these days. Building a new raid
array *should* add the correct data to it, although I had a grand old
time with a Hardy server of mine in this respect.

> Is there anything else I should be looking out for or preparing?

Don't forget to add a bootstrap to each new disk if it is going to
contain the boot partition as well.

> Thanks for any pointers anyone may care to share.

You could try

  mdadm --add /dev/md3 /dev/sdb4

and see whether it resilvers. dmesg is the best place to look for hard
errors.

hth,

Steve

> A couple of examples of DegradedArray and Fail Event emails to root
> recently follow:
>
> To: [email protected]
> Subject: Fail event on /dev/md1:jupiter
> Date: Wed, 21 Oct 2009 17:42:49 +1300
>
> This is an automatically generated mail message from mdadm
> running on jupiter
>
> A Fail event had been detected on md device /dev/md1.
>
> It could be related to component device /dev/sdb2.
>
> Faithfully yours, etc.
>
> P.S.
> The /proc/mdstat file currently contains the following:
>
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> [raid4] [raid10]
> md3 : active raid1 sda4[0]
>       290977216 blocks [2/1] [U_]
>
> md2 : active raid1 sda3[0] sdb3[1]
>       104320 blocks [2/2] [UU]
>
> md1 : active raid1 sda2[0] sdb2[2](F)
>       1951808 blocks [2/1] [U_]
>
> md0 : active raid1 sda1[0]
>       19534912 blocks [2/1] [U_]
>
> unused devices: <none>
>
>
> Subject: DegradedArray event on /dev/md3:jupiter
> Date: Wed, 07 Oct 2009 08:26:49 +1300
>
> This is an automatically generated mail message from mdadm
> running on jupiter
>
> A DegradedArray event had been detected on md device /dev/md3.
>
> Faithfully yours, etc.
>
> P.S. The /proc/mdstat file currently contains the following:
>
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> [raid4] [raid10]
> md3 : active raid1 sda4[0]
>       290977216 blocks [2/1] [U_]
>
> md2 : active raid1 sda3[0] sdb3[1]
>       104320 blocks [2/2] [UU]
>
> md1 : active raid1 sda2[0] sdb2[1]
>       1951808 blocks [2/2] [UU]
>
> md0 : active raid1 sda1[0]
>       19534912 blocks [2/1] [U_]
>
> unused devices: <none>
>
> Cheers,
> Roger

-- 
Steve Holdoway <[email protected]>
http://www.greengecko.co.nz
MSN: [email protected]
GPG Fingerprint = B337 828D 03E1 4F11 CB90 853C C8AB AF04 EF68 52E0
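P.S. On the better monitoring you mentioned wanting: the emails above
come from `mdadm --monitor`, which is already doing its job. If you want
an extra belt-and-braces check from cron, something like the following
would do -- a sketch only, which assumes the `[U_]`/`[UU]` status format
shown in the /proc/mdstat output above (the MDSTAT variable is just
there so the script can be tried out against a sample file):

```shell
#!/bin/sh
# Warn if any md array is running degraded.
# Defaults to the real /proc/mdstat; override MDSTAT for testing.
MDSTAT="${MDSTAT:-/proc/mdstat}"

# A degraded raid1 shows up as e.g. [U_] or [_U]; healthy is [UU].
if grep -q '\[U*_U*\]' "$MDSTAT"; then
    echo "WARNING: degraded md array(s) on $(hostname):"
    grep '\[U*_U*\]' "$MDSTAT"
    exit 1
fi
echo "all md arrays healthy"
```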
