On Sep 28, 2009, at 11:41 AM, Bob Friesenhahn wrote:

On Mon, 28 Sep 2009, Richard Elling wrote:

In other words, I am concerned that people will replace good data protection practices with scrubs, expecting scrub to deliver better data protection (it won't).

Many people here would profoundly disagree with the above. There is no substitute for good backups, but a periodic scrub helps validate that a later resilver would succeed. A periodic scrub also helps find system problems early, when they are less likely to crater your business. It is much better to find an issue during a scrub than during a resilver of a mirror or raidz.

As I said, I am concerned that people would mistakenly expect that scrubbing
offers data protection. It doesn't.  I think you proved my point? ;-)

Scrubs are also useful for detecting broken hardware. However, normal activity will also detect broken hardware, so it is better to think of scrubs as finding degradation of old data rather than being a hardware checking service.

Do you have a scientific reference for this notion that "old data" is more likely to be corrupt than "new data" or is it just a gut feeling? This hypothesis does not sound very supportable to me. Magnetic hysteresis lasts quite a lot longer than the recommended service life for a hard drive. Studio audio tapes from the '60s are still being used to produce modern "remasters" of old audio recordings which sound better than they ever did before (other than the master tape).

Those are analog tapes... they just fade away...
For digital data, it depends on the ECC methods, the quality of the media, the environment, etc. You will find considerable attention paid to verification of data on tape in archiving products. Conditions in the tape world differ somewhat from those in the magnetic disk world, but I can't think of a single study showing that magnetic disks get more reliable over time, while there are dozens showing that they get less reliable and that latent sector errors outnumber full disk failures by as much as 5x. My studies of Sun disk failure rates have shown similar results.

Some forms of magnetic hysteresis are known to last millions of years. Media failure is more often than not mechanical or chemical and not related to loss of magnetic hysteresis. Head failures may be construed to be media failures.

Here is a good study from the University of Wisconsin-Madison which clearly shows the relationship between disk age and latent sector errors. It also shows how increases in areal density increase the latent sector error (LSE) rate. Additionally, this gets back to the ECC method, which we observe to be different on consumer-grade and enterprise-class disks: the study shows a clear win for enterprise-class drives wrt latent errors. The paper suggests a 2-week scrub cycle and notes that many RAID arrays have such policies. There are indeed many studies which show latent sector errors become a bigger problem as the disk ages.
        An Analysis of Latent Sector Errors in Disk Drives
        www.cs.wisc.edu/adsl/Publications/latent-sigmetrics07.ps


See http://en.wikipedia.org/wiki/Ferromagnetic for information on ferromagnetic materials.

For disks we worry about the superparamagnetic effect.
        http://en.wikipedia.org/wiki/Superparamagnetism
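
To get a feel for the thermal decay, the Néel-Arrhenius relation is the usual back-of-the-envelope: the expected relaxation time of a magnetic grain is tau0 * exp(KV/kT). The little Python sketch below just plugs in illustrative numbers (tau0 and the barrier ratios are assumptions, not drive specs) to show how steeply retention depends on the ratio of anisotropy energy to thermal energy, which is exactly what shrinks as grains get smaller and areal density goes up.

import math

def neel_relaxation_time(barrier_ratio, tau0=1e-9):
    # barrier_ratio is KV/kT: the grain's anisotropy energy barrier K*V
    # expressed in units of the thermal energy k*T.  tau0 ~ 1e-9 s is the
    # customary order of magnitude for the attempt time.
    return tau0 * math.exp(barrier_ratio)

SECONDS_PER_YEAR = 3.156e7
for ratio in (30, 40, 50, 60):
    years = neel_relaxation_time(ratio) / SECONDS_PER_YEAR
    print("KV/kT = %2d  ->  ~%.1e years" % (ratio, years))

Going from KV/kT of 60 down to 30 takes the expected retention from geologic time scales down to a few hours, which is the cliff the patent below is describing.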

Quoting US Patent 6987630,
        ... the superparamagnetic effect is a thermal relaxation of information
        stored on the disk surface. Because the superparamagnetic effect may
        occur at room temperature, over time, information stored on the disk
        surface will begin to decay. Once the stored information decays beyond
        a threshold level, it will be unable to be properly read by the read head
        and the information will be lost.

        The superparamagnetic effect manifests itself by a loss in amplitude in
        the readback signal over time or an increase in the mean square error
        (MSE) of the read back signal over time. In other words, the readback
        signal quality metrics are mean square error and amplitude as measured
        by the read channel integrated circuit. Decreases in the quality of the
        readback signal cause bit error rate (BER) increases. As is well known,
        the BER is the ultimate measure of drive performance in a disk drive.

This effect depends on the time since the data was written. Hence, older data can have a higher MSE, and therefore a higher BER, leading to a UER.

To be fair, newer disk technology is constantly improving. But what is consistent with the physics is that increases in bit density are spent on more capacity, with the BER budget rebalanced accordingly. IMHO, this is why we see densities increase while UER does not (hint: marketing always wins these sorts of battles).
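
To put numbers on that: with the UER spec pinned at the typical datasheet values of 1 error per 1e14 bits read for consumer drives and 1 per 1e15 for enterprise drives, the chance of tripping over at least one unrecoverable read while reading an entire disk grows with capacity. A quick sketch (the spec values are typical, and the independent-bit-error assumption is crude):

import math

def p_unrecoverable_read(capacity_bytes, bits_per_error=1e14):
    # Probability of at least one unrecoverable read error while reading
    # the whole disk once, assuming independent errors at the datasheet
    # rate of one error per bits_per_error bits read.
    bits_read = capacity_bytes * 8.0
    return 1.0 - math.exp(-bits_read / bits_per_error)

for tb in (0.5, 1.0, 2.0):
    cap = tb * 1e12
    print("%.1f TB: consumer ~%.1f%%, enterprise ~%.1f%%" % (
        tb,
        100 * p_unrecoverable_read(cap, 1e14),
        100 * p_unrecoverable_read(cap, 1e15)))

At today's capacities that works out to several percent per full-disk read on consumer drives, which is exactly the exposure a resilver faces when there is no longer any redundancy left to repair from.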

FWIW, flash memories are not affected by superparamagnetic decay.

It would be most useful if zfs incorporated a slow-scan scrub which validates data at a rate low enough not to hinder active I/O. Of course this is not a "green" energy-efficient solution.

Oprea and Juels write, "Our key insight is that more aggressive scrubbing does not always increase disk reliability, as previously believed." They show how read-induced LSEs would tend to encourage you to scrub less frequently.
They also discuss the advantage of random versus sequential scrubbing. I
would classify zfs scrubs as more random than sequential, for most workloads. Their model is even more sophisticated and considers scrubbing policy based on the age of the disk and how many errors have been previously detected.
        A Clean-Slate Look at Disk Scrubbing
        http://www.rsa.com/rsalabs/staff/bios/aoprea/publications/scrubbing.pdf
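
For the detection-latency half of that tradeoff, a toy model helps frame the question. Treat LSE arrivals as Poisson and assume every scrub finds and repairs whatever is present; then the expected number of undetected LSEs sitting on a disk at a random instant is just rate * interval / 2. This deliberately ignores the read-induced LSEs that drive their "scrub less" result, and the ambient rate below is an assumption, not a measurement.

def undetected_lse_exposure(scrub_interval_days, lse_per_disk_year=0.05):
    # Expected number of latent sector errors present but not yet detected
    # at a random instant, assuming Poisson arrivals at an assumed ambient
    # rate and a scrub that repairs everything it finds.
    rate_per_day = lse_per_disk_year / 365.0
    return rate_per_day * scrub_interval_days / 2.0

for days in (7, 14, 30, 90, 365):
    print("scrub every %3d days -> ~%.5f undetected LSEs on average" % (
        days, undetected_lse_exposure(days)))

Halving the interval halves the exposure, but once you also charge the scrub for the extra reads and the LSEs they can induce, the curve stops being monotone, which is their point.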

Finally, there are two basic types of scrubs: read-only and rewrite. ZFS does read-only. Other scrubbers can do rewrite. There is evidence that rewrites
are better for attacking superparamagnetic decay issues.

So it is still not clear what the best scrubbing model or interval should be for the general case. I suggest scrubbing periodically, but not continuously :-)
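
Something like the following from cron once a month (or every couple of weeks, per the Wisconsin paper) is plenty. The pool name is a placeholder, and the Python wrapping is only a sketch around the two zpool commands:

#!/usr/bin/env python
# Sketch of a periodic scrub job: report on the previous scrub's outcome,
# then kick off the next one.  "tank" is a placeholder pool name.
import subprocess
import sys

POOL = "tank"

# 'zpool status -x' only reports pools that have problems; anything other
# than the all-healthy message deserves an admin's attention.
status = subprocess.Popen(["zpool", "status", "-x"],
                          stdout=subprocess.PIPE).communicate()[0].decode()
if "all pools are healthy" not in status:
    sys.stderr.write(status)

# 'zpool scrub' returns immediately; the scrub itself runs in the background.
subprocess.call(["zpool", "scrub", POOL])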

Currently, scrub has the lowest priority for the vdev_queue. But I think the
vdev_queue could use more research.
 -- richard
