Re: [zfs-discuss] Proposed idea for enhancement - damage control

Bob Friesenhahn Tue, 16 Feb 2010 18:20:43 -0800

On Tue, 16 Feb 2010, Christo Kutrovsky wrote:

Just finished reading the following excellent post:


http://queue.acm.org/detail.cfm?id=1670144

A nice article, even if I don't agree with all of its surmises and conclusions. :-)


In fact, I would reach a different conclusion.

I considered something like simply doing a 2-way mirror. What are the chances for a very specific drive to fail in a 2-way mirror? What if I do not want to take that chance?

The probability of whole drive failure, or individual sector failure, has not increased over the years. The probability of individual sector failure has diminished substantially over the years. The probability of losing a whole mirror pair has gone down since the probability of individual drive failure has gone down.
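
As a rough sketch of the mirror-pair claim: if drive failures are assumed to be independent and to occur at a constant rate (a simplification that is not taken from the article), then a mirror pair is lost only when the surviving drive fails before the resilver finishes. The function name and the failure-rate figures below are made up for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365

def second_failure_probability(afr, resilver_hours):
    """Chance the surviving mirror half fails during the resilver window.

    afr is the annualized failure rate as a fraction (0.03 means 3%).
    Assumes independent failures at a constant rate (hypothetical model).
    """
    # Convert the annual failure rate into a constant hourly hazard rate.
    rate_per_hour = -math.log(1.0 - afr) / HOURS_PER_YEAR
    return 1.0 - math.exp(-rate_per_hour * resilver_hours)

# Illustrative AFR values only; these are not measured figures.
for afr in (0.05, 0.03):
    p = second_failure_probability(afr, resilver_hours=24)
    print(f"AFR {afr:.0%}: P(second failure in 24 h) = {p:.6f}")
```

Under this model the risk scales with both the per-drive failure rate and the length of the resilver window, so a lower failure rate or a shorter resilver each reduce the chance of losing the pair.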

I could always put "copies=2" (or more) to my important datasets and take some risk and tolerate such a failure.

I don't believe that "copies=2" buys much at all when using mirror disks (or raidz). It assumes simultaneous media failures, which are actually quite rare. The "copies=2" setting only buys something when there is no other redundancy available.

One of the ideas that sparked is to have a "max devices" property for each data set, and limit how many mirrored devices a given data set can be spread on. I mean if you don't need the performance, you can limit (minimize) the devices, should your capacity allow this.

What you seem to be suggesting is a sort of targeted hierarchical vdev without extra RAID.

Remember. The goal is damage control. I know 2x raidz2 offers better protection for more capacity (although less performance, but that's not the point).

It seems that Adam Leventhal's excellent paper reaches the wrong conclusions because it assumes that history is a predictor for the future. However, history is a rather poor predictor in this case. Imagine if 9" floppies had increased their density to support 20GB each (up from 160KB), but that did not happen, and now we don't use floppies at all. We already see many cases where history was no longer a good predictor of the future, and (as an example) increased integration has brought us multi-core CPUs rather than 20GHz CPUs.

My own conclusions (supported by Adam Leventhal's excellent paper) are that


 - maximum device size should be constrained based on its time to
   resilver.

 - devices are growing too large and it is about time to transition to
   the next smaller physical size.

It is unreasonable to spend more than 24 hours to resilver a single drive. It is unreasonable to spend more than 6 days resilvering all of the devices in a RAID group (the 7th day is reserved for the system administrator). It is unreasonable to spend very much time at all on resilvering (using current rotating media) since the resilvering process kills performance.
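
The 24-hour limit can be turned into a device-size ceiling with simple arithmetic. The sketch below assumes the whole device is rewritten at one sustained rate; the helper names and the 50 MB/s figure are assumptions for illustration, and a real resilver depends on how much data is allocated and on how busy the pool is.

```python
def resilver_hours(capacity_tb, throughput_mb_s):
    """Hours to rewrite a whole drive at a sustained rate (simplified model)."""
    capacity_mb = capacity_tb * 1_000_000  # decimal TB -> MB
    return capacity_mb / throughput_mb_s / 3600

def max_capacity_tb(throughput_mb_s, limit_hours=24):
    """Largest drive (in TB) that can be rewritten within limit_hours."""
    return throughput_mb_s * limit_hours * 3600 / 1_000_000

# 50 MB/s is an assumed effective rate, not a measurement.
for tb in (1, 2, 4):
    print(f"{tb} TB at 50 MB/s: {resilver_hours(tb, 50):.1f} h")
print(f"24 h budget at 50 MB/s: {max_capacity_tb(50):.2f} TB")
```

Because the effective rate drops sharply when the pool is also serving normal I/O, the practical ceiling is likely to be well below what this model suggests.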

When looking at the possibility of data failure it is wise to consider physical issues such as


 - shared power supply

 - shared chassis

 - shared physical location

 - shared OS kernel or firmware instance

all of which are very bad for data reliability since a problem with anything shared can lead to destruction of all copies of the data.
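
A minimal sketch of why a shared component matters, assuming each copy fails independently with the same probability and the shared component (power supply, chassis, site, kernel) destroys every copy when it fails. The function and all of the probabilities are hypothetical.

```python
def loss_probability(p_copy, copies, p_shared=0.0):
    """Chance of losing every copy of the data.

    p_copy   -- chance an individual copy is lost on its own
    copies   -- number of independent copies
    p_shared -- chance the shared component destroys all copies at once
    """
    p_all_independent = p_copy ** copies
    # Data survives only if the shared part survives AND some copy survives.
    return 1.0 - (1.0 - p_shared) * (1.0 - p_all_independent)

# Illustrative numbers only.
for copies in (1, 2, 3):
    isolated = loss_probability(0.03, copies)
    shared = loss_probability(0.03, copies, p_shared=0.001)
    print(f"{copies} copies: isolated {isolated:.6f}, shared {shared:.6f}")
```

In this model, adding copies keeps shrinking the independent term, but the overall loss probability can never fall below that of the shared component.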

In New York City, all of the apartment doors seem to be fitted with three deadlocks, all of which lock into the same flimsy splintered door frame. It is important to consider each significant system weakness in turn in order to achieve the least chance of loss, while providing the best service.

Bob

P.S. NASA is tracking large asteroids and meteors with the hope that they will eventually be able to deflect any which will strike our planet, in an effort to save your precious data.

--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
