On 5/11/07, Gerald Timothy Quimpo <[EMAIL PROTECTED]> wrote:
> On Tue, 2007-05-08 at 16:36 +0800, Orlando Andico wrote:
> > On 5/8/07, Rogelio Serrano <[EMAIL PROTECTED]> wrote:
> > >
> > > i dont use hardware raid because it is much slower than software raid.
> > > and it willingly replicates filesystem blocks with errors. since the
> > > boot partition is mostly read only thats fine.
> >
> > Tell that to people like EMC and Veritas. They seem to have built
> > entire business models around hardware RAID..
> Hi Orly,
>
> Clearly EMC and Veritas are good arguments merely by their existence and
> business success.
>
> On the other hand, I'm interested in the original point. *Does*
> hardware RAID replicate filesystem blocks with errors? The answer
> is probably "it depends", which is fine. All good answers start with
> "it depends", but depends on what?
No. Once the HW RAID controller detects that it can't write to, say,
the secondary hard drive in a RAID 1 configuration, it drops all
reads/writes to that drive and informs you via SNMP, e-mail, etc.
Additionally, it rebuilds onto another hard drive if you have
configured it to have spares.

Please also note that HW RAID knows nothing about the underlying
filesystem or what it's doing.
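To make that concrete, here's a toy model in Python of the drop-and-rebuild
behavior described above (all the names are mine -- no real controller
exposes anything like this API):

```python
# Toy model of a RAID-1 set with a hot spare: on a write failure the
# bad drive is dropped, an alert is recorded (standing in for an SNMP
# trap or e-mail), and the spare is rebuilt from the survivor.
# Illustrative only; not any vendor's actual interface.

class Drive:
    def __init__(self, name):
        self.name = name
        self.failed = False
        self.blocks = {}

    def write(self, lba, data):
        if self.failed:
            raise IOError(f"{self.name}: write failed")
        self.blocks[lba] = data

class Raid1:
    def __init__(self, drives, spares=()):
        self.drives = list(drives)
        self.spares = list(spares)
        self.events = []            # stands in for SNMP / e-mail alerts

    def write(self, lba, data):
        for drive in list(self.drives):
            try:
                drive.write(lba, data)
            except IOError:
                self._drop(drive)   # drive is out of the array from now on

    def _drop(self, bad):
        self.drives.remove(bad)
        self.events.append(f"ALERT: {bad.name} dropped from array")
        if self.spares:                        # rebuild onto a hot spare
            spare = self.spares.pop(0)
            survivor = self.drives[0]
            spare.blocks = dict(survivor.blocks)
            self.drives.append(spare)
            self.events.append(f"REBUILD: {spare.name} now mirrors {survivor.name}")

array = Raid1([Drive("sda"), Drive("sdb")], spares=[Drive("sdc")])
array.write(0, b"boot block")
array.drives[1].failed = True       # sdb dies mid-flight
array.write(1, b"more data")
print(array.events)
```

Note that the filesystem never enters into it: the array only sees
block writes succeed or fail.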
> Which RAID types does that affect most? I don't have enough experience
> or theory with RAID to know things like, if you have RAID-1, is one
> of the mirrors a primary (for read/write) and the second is mostly for
> replication (read/write too, but preference goes to the primary). Or
> does the RAID hardware (for RAID-1, anyway) actually randomly choose
> which half of the mirror to write to and then (possibly) replicate
> disk errors on that written half to the other half?
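On the RAID-1 question: every mirroring implementation I've seen writes
to *both* halves, always -- there's no choosing. Only reads get balanced
between the mirrors. A toy sketch (round-robin is just one illustrative
read policy; real controllers pick mirrors by head position, queue
depth, and so on):

```python
# Toy RAID-1: every write hits both mirrors; reads alternate between
# them for throughput. Illustrative only.
from itertools import cycle

class Mirror:
    def __init__(self):
        self.disks = [{}, {}]
        self._reader = cycle(range(len(self.disks)))
        self.reads_per_disk = [0, 0]

    def write(self, lba, data):
        for disk in self.disks:      # no "primary": both halves get the write
            disk[lba] = data

    def read(self, lba):
        i = next(self._reader)       # balance reads across the mirrors
        self.reads_per_disk[i] += 1
        return self.disks[i][lba]

m = Mirror()
m.write(7, "hello")
print([m.read(7) for _ in range(4)])   # both mirrors serve reads
print(m.reads_per_disk)                # [2, 2]
```

The upshot is that whatever arrives at the controller gets written to
both halves -- including bad data, if the error happened above the array.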
> How about RAID-5, can data errors due to hardware errors propagate
> with that?
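On the RAID-5 question, XOR parity makes the distinction concrete:
parity can reconstruct a block that is *missing*, but it has no way of
knowing a block was *wrong* -- parity is computed from whatever the host
hands the array, so bad data is faithfully replicated into parity too.
A minimal sketch:

```python
# RAID-5 parity in miniature: parity = XOR of the data blocks.
# It recovers a missing block but cannot detect a silently wrong one.
from functools import reduce

def xor_blocks(blocks):
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Disk 1 dies: rebuild its block from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])   # True

# But if the host hands the array garbage, parity dutifully matches it:
data[1] = b"BAD!"
parity = xor_blocks(data)   # the error is now baked into parity as well
```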
> Or maybe some RAID controllers detect the error, mark that block bad
> (at the hardware RAID level, no need to badblocks) and move the data
> to some other block and replicate that?
That's usually the job of the hard drive's own controller, not the RAID
controller: in theory the drive flags the block as unusable and remaps
it to a spare sector on its own.
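You can actually watch that remapping from the OS side: the drive keeps
a running count of remapped sectors in SMART attribute 5
(Reallocated_Sector_Ct, the standard ATA name). A small sketch that
pulls it out of `smartctl -A`-style output (a sample is pasted in so it
runs anywhere; on a live box you'd feed it the output of
`smartctl -A /dev/sda`):

```python
# Parse the Reallocated_Sector_Ct raw value out of smartctl -A output.
# A steadily climbing count is the drive telling you it's dying, even
# while the RAID array still reports everything as healthy.
import re

SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  12
  9 Power_On_Hours          0x0032   099   099   000    Old_age   8423
"""

def reallocated_sectors(smart_output):
    for line in smart_output.splitlines():
        m = re.match(r"\s*5\s+Reallocated_Sector_Ct\s+.*\s(\d+)$", line)
        if m:
            return int(m.group(1))
    return None

count = reallocated_sectors(SAMPLE)
print(count)
```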
> That might cause its own problems too though, hiding the bug so that
> drives can slowly degrade and RAID still works and people who think
> that RAID means they're backed up wake up one day and find themselves
> in deep shyte.
No. To HW RAID (or SW RAID, for that matter) a drive is binary, 1 or 0:
a hard drive it manages is either working or not, nothing in between.
In my experience with a variety of HW RAID controllers and storage
arrays, "drives can slowly degrade and RAID still works" has not
happened.
> But don't let that verbosity stand in the way of explaining. I don't
> know enough about the realities of RAID and comparative advantages of
> different RAID controllers (which brands stand out overall, which
> families of controller models to stay away from) to actually know what
> I'm talking about. I'm hoping there'll be discussion here that I can
> learn from.
Bottom line: HW RAID is no different from SW RAID in this respect.

I'm a fan of HW RAID on the boot drives (I just hate the md/lvm2 combo
for SW RAID there), but I like SW RAID on application mount points
because of the flexibility it gives me.

The only thing to watch out for with HW RAID is the manufacturer:
firmware quality differs widely between them. Rule of thumb: the more
expensive it is, the better and more reliable the HW RAID controller is.
--
regards,
Andre | http://www.varon.ca
_________________________________________________
Philippine Linux Users' Group (PLUG) Mailing List
[email protected] (#PLUG @ irc.free.net.ph)
Read the Guidelines: http://linux.org.ph/lists
Searchable Archives: http://archives.free.net.ph