Re: Tool for HD analyzing

Marco Peereboom Fri, 28 Sep 2007 19:03:44 -0700

Uhm sorry Claudio but it is the other way around.  The only time you can
detect failed sectors is on reads.  When a read fails the disk goes into
a proprietary algorithm to try to recover as much of the sector as
possible.  Depending on the manufacturer they'll do something in excess
of 15000 reads of the same sector to try to recover the data.  They then
use heuristics to try to determine what the original data most likely
was.  If this process is successful enough the block gets re-assigned.
Most disks don't verify writes (too slow) and rely on the recovery
algorithm to reassign these blocks, which is triggered by subsequent
reads.  RAID manufacturers implement algorithms that continuously read
all sectors of idle disks to ensure data integrity.
When they run into a failure they first let the disk try to recover and
if that fails they use parity to recover the block.  They also will
"puncture" the sector so that the disk will skip it going forward.
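
Since failures only show up on reads, a plain sequential read of the
whole device is enough to find them from the host side.  What follows is
only a minimal sketch of that idea and not what any RAID firmware
actually does: the function name, the 64 KiB block size and the use of
lseek() to find the device size are all assumptions, and the size query
may not work on every platform's raw device (pass size explicitly then).

```python
import os
import sys


def scan_for_read_errors(path, block_size=64 * 1024, size=None):
    """Read `path` block by block and return the byte offsets of
    blocks whose read failed.  Hypothetical helper for illustration."""
    bad_offsets = []
    fd = os.open(path, os.O_RDONLY)
    try:
        if size is None:
            # Assumption: seeking to the end reports the size.  True for
            # regular files; not guaranteed for every raw device.
            size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            length = min(block_size, size - offset)
            try:
                os.pread(fd, length, offset)
            except OSError:
                # The disk (and the OS on top of it) gave up on this block.
                bad_offsets.append(offset)
            offset += length
    finally:
        os.close(fd)
    return bad_offsets


if __name__ == "__main__" and len(sys.argv) > 1:
    for off in scan_for_read_errors(sys.argv[1]):
        print(f"read error in block starting at byte {off}")
```

Expect a scan like this to stall for a long time on every bad block,
because each failed read first goes through the disk's own recovery
attempts.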

Some of the dmesg lines pasted in this message are likely LBA
relocations that take too long.  There is no set timeout in the spec and
therefore vendors try really hard and long to recover the data resulting
in OS timeouts.  A good example is calculating how long it takes to read
an LBA 15000 times on a 10000 RPM disk.  Add some fudge in there for the
head to find the exact spot and you'll see that it adds up to far more
than a few seconds.  Repeat that a few times due to various retries in various
layers and you'll see where those lengthy timeouts come from.
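
The calculation can be sketched as follows.  It assumes that every
re-read of the same sector has to wait one full revolution for the
sector to come back under the head, and it ignores seek and settle time
entirely, so the result is a lower bound.  The function name is made up
for the example; 15000 and 10000 are the figures used above.

```python
def min_retry_seconds(rpm, rereads):
    """Lower bound on the time needed to re-read one sector `rereads`
    times, assuming one full revolution per re-read (an assumption,
    not a vendor figure)."""
    # One revolution takes 60 / rpm seconds.
    return rereads * 60.0 / rpm


# 10000 RPM means 6 ms per revolution, so 15000 re-reads need about 90 s.
print(f"{min_retry_seconds(10000, 15000):.0f} s")
```

That is already longer than typical OS command timeouts before any
retries in the layers above the disk are counted.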

On Fri, Sep 28, 2007 at 02:47:48PM +0200, Claudio Jeker wrote:
> On Fri, Sep 28, 2007 at 01:37:52PM +0200, Peter N. M. Hansteen wrote:
> > "Leonardo Marques" <[EMAIL PROTECTED]> writes:
> > 
> > > I've a HD which are returning a lot of errors. Someone know some good
> > > tool to analyze this disk and tell me if i've to replace it or if
> > > exist some way to repair it?
> > 
> > To my mind that kind of errors say "run, don't walk to the store for
> > replacement".  Modern disks remap bad parts away from active use, when
> > they've run out of remappable space, they start complaining like that.
> > 
> 
> Remapping is only possible when writing to blocks. The disk can not remap
> on reads. By forcing writes to such blocks you can remap them but the data
> on them is still lost. I use some partitions with almost only read access
> that had bad blocks and I could fix the problem with dd if=/dev/zero
> of=/dev/rsd0x (not important data so I did not replace the disk)
> 
> -- 
> :wq Claudio
