>>>>> "sb" == Simon Breden <sbre...@gmail.com> writes:
sb> 1. In simple non-RAID single drive 'desktop' PC scenarios
sb> where you have one drive, if your drive is experiencing
sb> read/write errors, as this is the only drive you have, and
sb> therefore you have no alternative redundant source of data to
sb> help with required reconstruction/recovery, you REALLY NEED
sb> your drive to try as much as possible to try to recover

This sounds convincing to fetishists of an ordered world where egg-laying mammals do not exist, but it's utter rubbish. As drives go bad they return errors frequently, and they don't succeed in recovering from them. They do not encounter, like, one or two errors per day under general use, most of which are recoverable in 7 < x < 60 seconds: this just does not happen except in your dreams. Good drives have zero UNC errors in the 'smartctl -a' logs, and the conditional probability of soon-failure on a drive that's experienced just one UNC error is much higher than the regular probability of soon-failure.

Once a drive for which you have no backup/mirror/whatever is returning errors, the remedy is not to wait longer. This does not work, basically ever. The remedy is to shut down the OS, copy the failing drive onto a good one with 'dd conv=noerror,sync', fsck the copy, and read back your data (with a bunch of zeroes inserted for unreadable blocks).

Depending on how bad the drive is, you'll have to use a smaller or larger block size. The reason is that most unreadable areas are larger than 1 sector, but the drive is so imbecilic that if you read single sectors it will reinvoke its bogus retry timer for each and every sector within the same contiguous unreadable region: it has NO MEMORY of the fact that it already tried to read that area and failed.
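To make the block-size tradeoff concrete, here is a minimal Python sketch of roughly what 'dd conv=noerror,sync' does: read fixed-size blocks, and when a read fails, write zeroes in its place so everything after it keeps its offset. The names `rescue_copy` and `make_flaky_reader` are hypothetical, the "drive" is a simulated byte string with one unreadable region, and unlike dd the sketch pads the final block only up to the source size. It is an illustration, not a replacement for dd.

```python
import errno
import io


def rescue_copy(read_at, size, bs, dst):
    """Copy `size` bytes in blocks of `bs`, zero-filling unreadable blocks.

    `read_at(offset, length)` returns bytes or raises OSError.
    Returns the number of read requests that failed.
    """
    failed = 0
    for offset in range(0, size, bs):
        want = min(bs, size - offset)
        try:
            chunk = read_at(offset, want)
        except OSError:
            # Give up on this whole block and move on (the "noerror" part).
            chunk = b""
            failed += 1
        # Pad failed or short reads with zeroes (the "sync" part).
        dst.write(chunk.ljust(want, b"\0"))
    return failed


def make_flaky_reader(data, bad_start, bad_stop):
    """Simulated drive: any read touching [bad_start, bad_stop) fails."""
    def read_at(offset, length):
        if offset < bad_stop and offset + length > bad_start:
            raise OSError(errno.EIO, "simulated unreadable region")
        return data[offset:offset + length]
    return read_at


# A 1024-byte "disk" of 0xff bytes with bytes 300..399 unreadable.
disk = b"\xff" * 1024
reader = make_flaky_reader(disk, 300, 400)
for bs in (64, 256):
    image = io.BytesIO()
    failed = rescue_copy(reader, len(disk), bs, image)
    lost = image.getvalue().count(b"\0")
    print(f"bs={bs}: {failed} failed reads, {lost} bytes zero-filled")
```

With the small block size the copy hits three failed reads and loses 192 bytes. With the large one it hits a single failed read but loses 256 bytes, including data that was actually readable. On a real drive each failed read costs a retry timeout, which is why fewer, larger reads can be the better deal.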
60 seconds * <normal # of bad sectors> for a failing/pissed-off drive is generally somewhere between 3 days and forever. So you have to watch progress and start over with a larger bs= if you are not on target to finish the dd within three days, because the drive will get worse and worse. A larger bs= (meaning, not bothering to try to read some data that you would have been able to read) will get your data off the drive before it fails more completely, and thus actually rescue *more*.

Anyway, once these drives have gone bad their behavior is very stupid, and nothing like the imaginary world that's been pitched to you by these bogan electrical engineers who apparently have no experience using their own product.
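The arithmetic behind the three-day rule can be sketched in a few lines of Python. Everything here is an assumption made for illustration: a 60-second retry timeout per failed read request, 512-byte sectors, bad sectors that are contiguous and aligned to the read size (the best case for large reads), and zero time spent on good sectors. The function names are hypothetical.

```python
import math

RETRY_TIMEOUT_S = 60        # assumed worst-case drive retry time per failed read
BUDGET_S = 3 * 24 * 3600    # the three-day target from the text


def stall_seconds(bad_sectors, sectors_per_read, timeout_s=RETRY_TIMEOUT_S):
    """Time spent waiting on failed reads, at one timeout per failed request."""
    failed_reads = math.ceil(bad_sectors / sectors_per_read)
    return failed_reads * timeout_s


def smallest_bs_within_budget(bad_sectors, budget_s=BUDGET_S,
                              timeout_s=RETRY_TIMEOUT_S, sector_bytes=512):
    """Smallest power-of-two block size (bytes) whose stall time fits the budget."""
    sectors_per_read = 1
    while stall_seconds(bad_sectors, sectors_per_read, timeout_s) > budget_s:
        sectors_per_read *= 2
    return sectors_per_read * sector_bytes


bad = 100_000
print(f"single-sector reads: {stall_seconds(bad, 1) / 86400:.1f} days of stalls")
print(f"bs needed to fit in three days: {smallest_bs_within_budget(bad)} bytes")
```

Under these assumptions, 4320 bad sectors read one at a time already use up the whole three days, and 100,000 bad sectors would stall for about 69 days unless bs= is raised to 16 KiB or more. Scattered bad sectors behave worse than this model, so treat the result as a lower bound on the block size, not a prediction.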
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss