On Mon, Jan 25, 2010 at 05:36:35PM -0500, Miles Nordin wrote:
> >>>>> "sb" == Simon Breden <sbre...@gmail.com> writes:
> 
>     sb> 1. In simple non-RAID single drive 'desktop' PC scenarios
>     sb> where you have one drive, if your drive is experiencing
>     sb> read/write errors, as this is the only drive you have, and
>     sb> therefore you have no alternative redundant source of data to
>     sb> help with required reconstruction/recovery, you REALLY NEED
>     sb> your drive to try as much as possible to try to recover
> 
> this sounds convincing to fetishists of an ordered world where
> egg-laying mammals do not exist, but it's utter rubbish.

There's a family of platypus in the creek just down the bike path from
my house.  They're returning, thanks in large part to the removal of
rubbish sources upstream.

> As drives go bad they return errors frequently, and they don't succeed
> in recovering them.  

Typically, once a sector fails to read, that stays true for that sector
(or range of sectors).  However, if the sector is overwritten and the
drive can remap it, the drive often continues working flawlessly for a
long time afterwards.
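
For what it's worth, forcing the remap can be as simple as rewriting
the failed LBA directly.  A rough sketch in python - the device path,
LBA and sector size are made-up placeholders, and obviously only do
this to a sector whose contents you've already written off:

    import os

    DEV = "/dev/sdX"      # placeholder device path
    BAD_LBA = 123456789   # placeholder LBA from the failed read
    SECTOR = 512          # assumes 512-byte logical sectors

    # O_SYNC so the write isn't left sitting in a cache somewhere.
    fd = os.open(DEV, os.O_WRONLY | os.O_SYNC)
    try:
        os.lseek(fd, BAD_LBA * SECTOR, os.SEEK_SET)
        os.write(fd, b"\x00" * SECTOR)  # drive remaps on write if it must
    finally:
        os.close(fd)

A re-read of the same LBA and a glance at the reallocated-sector count
afterwards will tell you whether the remap actually happened.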

I have two of the original, infamous, glass-platter deathstars.  Both
were completely unusable, and would disappear into a bottomless pit of
endlessly unsuccessful resets and read attempts on just about any
attempt to get data off them.

However, a write scrub with random data and write cache turned off
allowed all the bad sectors to remap, and completely recovered them.
They saw many years of use - as scratch space, written often, but with
non-critical data.  The machine isn't used much anymore, but I'd still
expect them to work fine today if I turned it on.
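
For anyone wanting to repeat the experiment, the "write scrub" was
nothing fancier than filling the whole device end to end with random
data, with the write cache already off (see below).  A sketch, with a
made-up device node, and assuming you truly don't care about anything
on the drive - this destroys every byte:

    import os

    DEV = "/dev/sdX"        # placeholder: the drive being sacrificed
    CHUNK = 1024 * 1024     # 1 MiB per write

    fd = os.open(DEV, os.O_WRONLY | os.O_SYNC)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)   # device size in bytes
        os.lseek(fd, 0, os.SEEK_SET)
        written = 0
        while written < size:
            n = min(CHUNK, size - written)
            written += os.write(fd, os.urandom(n))
    finally:
        os.close(fd)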

I left the write cache off, because these and some other drives seemed
not to detect and remap errors on write otherwise.  They appear to
verify writes only when the cache is disabled; with it enabled, they
rely on the sector already having been found bad and put on the
pending list by prior reads, whether from the host or a background
self-test.  The self-tests are often also dumb, in that they stop at
the first error.  I've seen this from several vendors, but won't
assert that it is universal or even still common for current drives.
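
For reference, the knobs I mean look roughly like this on Linux (a
hedged sketch - the device name is a placeholder, and on Solaris the
write-cache toggle lives under 'format -e' instead, if memory serves):

    import subprocess

    DEV = "/dev/sdX"   # placeholder device

    # Turn the drive's volatile write cache off; per the behaviour
    # described above, that is what makes some drives verify (and
    # remap) at write time.
    subprocess.run(["hdparm", "-W0", DEV], check=True)

    # Kick off a long background self-test; progress and results show
    # up under 'smartctl -l selftest'.
    subprocess.run(["smartctl", "-t", "long", DEV], check=True)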

> Depending on how bad the drive is, you'll have to use a smaller or
> larger block size

Yes.  I hate to press the point, but this is another area where
CCTL/etc is useful - you can more quickly narrow down to the specific
problem sectors. 
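
In the same spirit, once the drive gives up quickly instead of hanging,
a dumb sector-by-sector pass over a suspect region will pin down the
exact LBAs.  A sketch, with made-up device and range:

    import os

    DEV = "/dev/sdX"                  # placeholder device path
    SECTOR = 512
    START, COUNT = 123400000, 4096    # placeholder suspect region (LBAs)

    # Buffered reads; kernel readahead can blur exactly which LBA
    # failed, but it's close enough to aim the rewrite at.
    bad = []
    fd = os.open(DEV, os.O_RDONLY)
    try:
        for lba in range(START, START + COUNT):
            try:
                os.lseek(fd, lba * SECTOR, os.SEEK_SET)
                os.read(fd, SECTOR)
            except OSError:           # EIO on an uncorrectable sector
                bad.append(lba)
    finally:
        os.close(fd)

    print("unreadable LBAs:", bad)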

> Anyway, these drives, once they've gone bad their behavior is very
> stupid and nothing like this imaginary world that's been pitched to

Again, it depends on the behaviour you care about: trying to recover
your only copy of crucial data, or getting back to a serviceable state
by remapping on overwrite.  My experience with the latter is
positive, and zfs users should need no convincing that relying on the
former is crazy, regardless of the specific idiocy of the specific
drive in specific (non-)recovery circumstances. 

Any incidence of errors is a concern to which you should respond;
whether with "increased vigilance" or "immediate replacement" depends
on your own preferences and paranoia.  New drives will often hit a few
of these errors in early use, before the weak sectors are found and
abandoned, and then work flawlessly thereafter.  Burn-in tests and
scrubs are well worthwhile.
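
By way of example, the scrub half of that on a zfs box is just the
following (pool name is a placeholder):

    import subprocess

    POOL = "tank"    # placeholder pool name

    # Read back every allocated block and let zfs verify the checksums.
    subprocess.run(["zpool", "scrub", POOL], check=True)

    # Later, see whether anything turned up (per-file detail with -v).
    subprocess.run(["zpool", "status", "-v", POOL], check=True)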

Once they've run out of spare sectors to remap to, or have started
consistently producing errors, then they're cactus.  Do pay attention
to the SMART error counts and predictors.
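
The counters I mean are the reallocated, pending and
offline-uncorrectable attributes.  A throwaway way to pull them out
with smartctl (device name is a placeholder; the attribute IDs are the
usual ATA convention, not something every vendor guarantees):

    import subprocess

    DEV = "/dev/sdX"   # placeholder device
    WATCH = {"5":   "Reallocated_Sector_Ct",
             "197": "Current_Pending_Sector",
             "198": "Offline_Uncorrectable"}

    out = subprocess.run(["smartctl", "-A", DEV], capture_output=True,
                         text=True, check=True).stdout
    for line in out.splitlines():
        fields = line.split()
        if fields and fields[0] in WATCH:
            # the last column of 'smartctl -A' output is the raw value
            print(WATCH[fields[0]], fields[-1])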

Or they can just fail in other strange ways.  I have one that just
works very, very, very slowly on every single read request, regardless
of size or location, even though the data all comes back correct in
the end.

The best practices of regular scrubs and sufficiently redundant pools
and separate backups stand, in spite of and indeed because of such
idiocy. 

--
Dan.
