Re: [GENERAL] OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

2009-10-21 Thread Greg Smith

On Tue, 20 Oct 2009, Ow Mun Heng wrote:

Raid10 is supposed to be able to withstand up to 2 drive failures if the 
failures are from different sides of the mirror.  Right now, I'm not 
sure which drive belongs to which. How do I determine that? Does it 
depend on the output of /proc/mdstat, and in that order?


You build a 4-disk RAID10 array on Linux by first building two RAID1 
pairs, then striping both of the resulting /dev/mdX devices together via 
RAID0.  You'll actually have 3 /dev/mdX devices around as a result.  I 
suspect you're trying to execute mdadm operations on the outer RAID0, when 
what you actually should be doing is fixing the bottom-level RAID1 
volumes.  Unfortunately I'm not too optimistic about your case though, 
because if you had a repairable situation you technically shouldn't have 
lost the array in the first place--it should still be running, just in 
degraded mode on both underlying RAID1 halves.
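
For reference, that kind of nested layout is usually built in two steps,
roughly like this (the device names here are just placeholders for
illustration, not necessarily what your system uses):

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/md0 /dev/md1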


There's a good example of how to set one of these up at 
http://www.sanitarium.net/golug/Linux_Software_RAID.html ; note how the 
RAID10 involves /dev/md{0,1,2,3} for the 6-disk volume.


Here's what will probably show you the parts you're trying to figure out:

mdadm --detail /dev/md0
mdadm --detail /dev/md1
mdadm --detail /dev/md2

That should give you an idea of what md devices are hanging around and what's 
inside of them.


One thing you don't see there is what devices were originally around if 
they've already failed.  I highly recommend saving a copy of the mdadm 
detail (and smartctl -i for each underlying drive) on any production 
server, to make it easier to answer questions like "what's the serial 
number of the drive that failed in /dev/md0?"
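
Something as simple as this, saved off somewhere safe, covers it (the
output file and device names below are just an example, adjust for your
layout):

mdadm --detail /dev/md0 /dev/md1 /dev/md2 > /root/raid-layout.txt
for d in /dev/sd[abcd] ; do smartctl -i $d ; done >> /root/raid-layout.txt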


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [GENERAL] OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

2009-10-21 Thread Scott Marlowe
On Wed, Oct 21, 2009 at 12:10 AM, Greg Smith gsm...@gregsmith.com wrote:
 On Tue, 20 Oct 2009, Ow Mun Heng wrote:

 Raid10 is supposed to be able to withstand up to 2 drive failures if the
 failures are from different sides of the mirror.  Right now, I'm not sure
 which drive belongs to which. How do I determine that? Does it depend on the
 output of /proc/mdstat, and in that order?

 You build a 4-disk RAID10 array on Linux by first building two RAID1 pairs,
 then striping both of the resulting /dev/mdX devices together via RAID0.

Actually, later versions of Linux have a direct RAID-10 level built in.
I haven't used it, and I'm not sure how it would look in /proc/mdstat either.

  You'll actually have 3 /dev/mdX devices around as a result.  I suspect
 you're trying to execute mdadm operations on the outer RAID0, when what you
 actually should be doing is fixing the bottom-level RAID1 volumes.
  Unfortunately I'm not too optimistic about your case though, because if you
 had a repairable situation you technically shouldn't have lost the array in
 the first place--it should still be running, just in degraded mode on both
 underlying RAID1 halves.

Exactly.  Sounds like both drives in a pair failed.



Re: [GENERAL] OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

2009-10-21 Thread Greg Smith

On Tue, 20 Oct 2009, Craig Ringer wrote:

You made an exact image of each drive onto new, spare drives with `dd' 
or a similar disk imaging tool before trying ANYTHING, right? Otherwise, 
you may well have made things worse, particularly since you've tried to 
resync the array. Even if the data was recoverable before, it might not 
be now.


This is actually pretty hard to screw up with Linux software RAID.  It's 
not easy to corrupt a working volume by trying to add a bogus one or 
typing simple commands wrong.  You'd have to botch the drive addition 
process altogether and screw with something else to take out a good drive.



If the problem is just a few bad sectors, you can usually just
force-re-add the drives into the array and then copy the array contents
to another drive either at a low level (with dd_rescue) or at a file
system level.


This approach has saved me more than once.  On the flip side, I have also 
more than once accidentally wiped out my only good copy of the data when 
making a mistake during an attempt at stressed-out heroics like this. 
You certainly don't want to wander down this more complicated path if 
there's a simple fix available within the context of the standard tools 
for array repairs.



On a side note: I'm personally increasingly annoyed with the tendency of
RAID controllers (and s/w raid implementations) to treat disks with
unrepairable bad sectors as dead and fail them out of the array.


Given how fast drives tend to go completely dead once the first error 
shows up, this is a reasonable policy in general.


Rather than failing a drive and as a result rendering the whole array 
unreadable in such situations, it should mark the drive defective, set 
the array to read-only, and start screaming for help.


The idea is great, but you have to ask exactly how the hardware and 
software involved are supposed to enforce making the array read-only.  I 
don't think the ATA and similar command sets have that concept implemented 
in a way that would let a hardware RAID controller act on it at the level 
where it would need to happen.  Linux software RAID could keep 
you from mounting the array read/write in this situation, but the way 
errors percolate up from the disk devices to the array devices in Linux has 
too many layers in it (especially if LVM is stuck in the middle there too) 
for that to be simple to implement either.
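
About the closest you can get by hand is flipping the md array read-only
yourself once trouble shows up, but nothing in the stack does that for you
automatically (assuming /dev/md2 is the top-level array here):

mdadm --readonly /dev/md2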


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [GENERAL] OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

2009-10-21 Thread Greg Smith

On Wed, 21 Oct 2009, Scott Marlowe wrote:


Actually, later versions of Linux have a direct RAID-10 level built in.
I haven't used it, and I'm not sure how it would look in /proc/mdstat either.


I think I actively block memory of that because the UI on it is so cryptic 
and it's been historically much more buggy than the simpler RAID0/RAID1 
implementations.  But you're right that it's completely possible Ow used 
it.  That would also explain the trouble figuring out what's going on.


There's a good example of what the result looks like with failed drives in 
one of the many bug reports related to that feature at 
https://bugs.launchpad.net/ubuntu/intrepid/+source/linux/+bug/285156 and I 
liked the discussion of some of the details here at 
http://robbat2.livejournal.com/231207.html


The other hint I forgot to mention is that you should try:

mdadm --examine /dev/XXX

Run that for each of the drives that still work, to help figure out where 
they fit into the larger array.  That and --detail are what I find myself 
using instead of /proc/mdstat, which provides an awful interface IMHO.
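
To run it across everything at once, something like this works (the sd
names below are just a guess at the layout):

for d in /dev/sd[abcd]1 ; do echo == $d == ; mdadm --examine $d ; done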


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [GENERAL] OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

2009-10-21 Thread Ow Mun Heng


-Original Message-
From: Greg Smith [mailto:gsm...@gregsmith.com] 
On Wed, 21 Oct 2009, Scott Marlowe wrote:

 Actually, later versions of Linux have a direct RAID-10 level built in.
 I haven't used it, and I'm not sure how it would look in /proc/mdstat either.

I think I actively block memory of that because the UI on it is so cryptic 
and it's been historically much more buggy than the simpler RAID0/RAID1 
implementations.  But you're right that it's completely possible Ow used 
it.  That would also explain the trouble figuring out what's going on.

You're right, newer Linux kernels support RAID10 directly and don't do the
funky RAID1-pairs-first-then-RAID0 combination.
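
That's how mine was set up; the single-step create looks something like
this (I don't have the box up, so the device names below are just
placeholders rather than the exact command I used):

mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1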

There's a good example of what the result looks like with failed drives in 
one of the many bug reports related to that feature at 
https://bugs.launchpad.net/ubuntu/intrepid/+source/linux/+bug/285156 and I 
liked the discussion of some of the details here at 
http://robbat2.livejournal.com/231207.html

I actually stumbled onto that (the 2nd link) and tried some of the methods,
but I think it's somewhat outdated.

 The other hint I forgot to mention is that you should try:

 mdadm --examine /dev/XXX

 For each of the drives that still works, to help figure out where they fit

 into the larger array.  That and --detail are what I find myself using 
 instead of /proc/mdstat , which provides an awful interface IMHO.

That's one of the problems; I'm not exactly sure.

sda1 = 1
sdb1 = 2
sdc1 = 3
sdd1 = 4

If they follow that sequence, and I'm losing sda1 and sdd1, then
theoretically I should be able to recover them, but I'm not having much
luck.

FYI, I've left the box as it is for now and have yet to connect it back up,
so I can't really post the output of /proc/mdstat and --examine.

But I will once I boot it up.






Re: [GENERAL] OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

2009-10-20 Thread Scott Marlowe
On Tue, Oct 20, 2009 at 1:11 AM, Ow Mun Heng ow.mun.h...@wdc.com wrote:
 Sorry guys, I know this is very off-track for this list, but google hasn't
 been of much help. This is my raid array on which my PG data resides.

 I have a 4 disk Raid10 array running on linux MD raid.
 sda / sdb / sdc / sdd

 One fine day, 2 of the drives just suddenly decide to die on me. (sda and
 sdd)

 I've tried multiple methods to try to determine if I can get them back
 online.

 1) replace sda w/ fresh drive and resync - Failed
 2) replace sdd w/ fresh drive and resync - Failed
 3) replace sda w/ fresh drive but keeping existing sdd and resync - Failed
 4) replace sdd w/ fresh drive but keeping existing sda and resync - Failed


 Raid10 is supposed to be able to withstand up to 2 drive failures if the
 failures are from different sides of the mirror.

 Right now, I'm not sure which drive belongs to which. How do I determine
 that? Does it depend on the output of /proc/mdstat, and in that order?

Is this software RAID in Linux?  What does

cat /proc/mdstat

say?



Re: [GENERAL] OT - 2 of 4 drives in a Raid10 array failed - Any chance of recovery?

2009-10-20 Thread Craig Ringer
On 20/10/2009 4:41 PM, Scott Marlowe wrote:

 I have a 4 disk Raid10 array running on linux MD raid.
 sda / sdb / sdc / sdd

 One fine day, 2 of the drives just suddenly decide to die on me. (sda and
 sdd)

 I've tried multiple methods to try to determine if I can get them back
 online

You made an exact image of each drive onto new, spare drives with `dd'
or a similar disk imaging tool before trying ANYTHING, right?

Otherwise, you may well have made things worse, particularly since
you've tried to resync the array. Even if the data was recoverable
before, it might not be now.
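
Even a plain dd onto spare drives of at least the same size is enough, for
example (with /dev/sde standing in for a hypothetical spare):

dd if=/dev/sda of=/dev/sde bs=64K conv=noerror,sync

conv=noerror,sync keeps dd going past read errors and pads the unreadable
blocks with zeros, so the image stays the same size as the source.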



How, exactly, have the drives failed? Are they totally dead, so that the
BIOS / disk controller don't even see them? Can the partition tables be
read? Does 'file -s /dev/sda' report any output? What's the output of:

smartctl -d ata -a /dev/sda

(repeat for sdd)

?



If the problem is just a few bad sectors, you can usually just
force-re-add the drives into the array and then copy the array contents
to another drive either at a low level (with dd_rescue) or at a file
system level.
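
A rough sketch of that path, assuming the array is /dev/md0 and all four
members are at least partly readable (adjust the names to match your
setup):

mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
dd_rescue /dev/md0 /path/to/backup.img   # or mount read-only and copy files

The --force on assemble makes mdadm accept members whose event counts are
out of date, which is essentially the force-re-add described above.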

If the problem is one or more totally fried drives, where the drive is
totally inaccessible or most of the data is hopelessly corrupt /
unreadable, then you're in a lot more trouble. RAID 10 effectively
stripes the data across the mirrored pairs, so if you lose a whole
mirrored pair you've lost half the stripes. It's not that different from
running paper through a shredder, discarding half the shreds, and then
trying to piece the rest back together.


On a side note: I'm personally increasingly annoyed with the tendency of
RAID controllers (and s/w raid implementations) to treat disks with
unrepairable bad sectors as dead and fail them out of the array. That's
OK if you have a hot spare and no other drive fails during rebuild, but
it's just not good enough if failing that drive would result in the
array going into failed state. Rather than failing a drive and as a
result rendering the whole array unreadable in such situations, it
should mark the drive defective, set the array to read-only, and start
screaming for help. Way too much data gets murdered by RAID
implementations removing mildly faulty drives from already-degraded
arrays instead of just going read-only.

--
Craig Ringer
