On 12/04/2012 11:20, Stan Hoeppner wrote:
On 4/11/2012 9:23 PM, Emmanuel Noobadmin wrote:
On 4/12/12, Stan Hoeppner <s...@hardwarefreak.com> wrote:
On 4/11/2012 11:50 AM, Ed W wrote:
One of the snags of md RAID1 vs RAID6 is the lack of checksumming in the
event of bad blocks.  (I'm not sure what actually happens when md
scrubbing finds a bad sector with RAID1..?)  For low-performance
requirements I have become paranoid and been using RAID6 rather than
RAID10; filesystems with sector checksums seem attractive...
Except we're using hardware RAID1 here and mdraid linear.  Thus the
controller takes care of sector integrity.  RAID6 yields nothing over
RAID10, except lower performance, and more usable space if more than 4
drives are used.
How would the controller ensure sector integrity unless it is writing
additional checksum information to disk? I thought only a few
filesystems like ZFS do sector checksums to detect if any data
corruption occurred. I suppose the controller could throw an error if
the two drives returned data that didn't agree with each other, but it
wouldn't know which is the accurate copy, so that wouldn't protect the
integrity of the data, at least not directly without additional human
intervention, I would think.
When a drive starts throwing uncorrectable read errors, the controller
faults the drive and tells you to replace it.  Good hardware RAID
controllers are notorious for their penchant to kick drives that would
continue to work just fine in mdraid or as a single drive for many more
years.  The mindset here is that anyone would rather spend $150-$2500
on a replacement drive than take a chance with his/her valuable data.


I'm asking a subtly different question.

The claim by the ZFS/BTRFS authors and others is that data silently "bit rots" on its own. The claim is therefore that you can have a RAID1 pair where neither drive reports a hardware failure, but each gives you different data? I can't personally claim to have observed this, so it remains someone else's theory... (for background, my experience is simply: RAID10 for high-performance arrays and RAID6 for all my personal data - I intend to investigate your linear raid idea in the future though)

I do agree that if one drive reports a read error, then it's quite easy to guess which member of the pair is wrong...
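For what it's worth, the checksum argument can be sketched in a few lines of Python. This is only an illustration of the principle, not how ZFS/BTRFS actually store or verify their checksums; the block size, the choice of hash, and the function names are all made up for the example.

    import hashlib

    BLOCK_SIZE = 4096  # hypothetical block size, just for this sketch

    def checksum(block: bytes) -> str:
        # ZFS/Btrfs use checksums such as fletcher4 or crc32c; sha256 here
        # is only for illustration.
        return hashlib.sha256(block).hexdigest()

    def pick_good_copy(copy_a: bytes, copy_b: bytes, stored_sum: str) -> bytes:
        """Given the two mirror copies of a block and the checksum recorded
        when the block was written, return a copy that still matches it."""
        if checksum(copy_a) == stored_sum:
            return copy_a              # copy A is intact
        if checksum(copy_b) == stored_sum:
            return copy_b              # copy A rotted silently; B is still good
        raise IOError("both copies fail verification - data loss")

    # Without stored_sum (plain RAID1), copy_a != copy_b only tells you the
    # mirrors disagree, not which one to trust.

The point is the last comment: a bare mirror can detect a disagreement during a scrub, but without an independently stored checksum there is nothing to arbitrate between the two copies.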

Just as an aside, I don't have a lot of failure experience. However, in the few events I have had (perhaps 6-8 now) there has been a massive correlation in failure time with RAID1, e.g. one pair I had lasted perhaps 2 years and then both drives failed within 6 hours of each other. I also had a bad experience with RAID5 that wasn't being scrubbed regularly: when one drive started reporting errors (i.e. the lack of monitoring meant it had been bad for a while), the rest of the array turned out to be a patchwork of read errors. Linux RAID then turns out to be quite fragile in the presence of a small number of read failures, and it's extremely difficult to salvage the 99% of the array which is OK because the disks get kicked out... (of course regular scrubs would have prevented getting so deep into that situation - it was a small cheap NAS box without such features)
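Incidentally, on boxes where you do control the scrubbing, md exposes it through sysfs (distributions generally wrap this in a periodic cron job, e.g. Debian's checkarray script). A rough sketch, assuming the array is /dev/md0 and it is run as root:

    import time

    MD = "/sys/block/md0/md"   # assumes the array is md0; adjust to suit

    def scrub(md=MD):
        """Start a 'check' pass (read everything and compare mirrors/parity)
        and wait for it to finish. Needs root."""
        with open(f"{md}/sync_action", "w") as f:
            f.write("check")
        while True:
            with open(f"{md}/sync_action") as f:
                if f.read().strip() == "idle":
                    break
            time.sleep(60)
        with open(f"{md}/mismatch_cnt") as f:
            mismatches = int(f.read().strip())
        # A non-zero count on RAID1/RAID10 means the copies disagreed somewhere;
        # md only reports the disagreement, it doesn't know which copy was right.
        return mismatches

    if __name__ == "__main__":
        print("mismatch_cnt after scrub:", scrub())

My understanding is that a "repair" pass on RAID1 simply overwrites one copy with the other, which ties back to the earlier point: without a checksum, md has no way to know which side was actually correct.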

Ed W
