ddrescue: detect repeated reads of the same data

Matheus Afonso Martins Moreira Mon, 20 May 2024 09:47:00 -0700

Allow me to begin with words of gratitude. A few weeks ago
I accidentally dropped my laptop on the floor and the old 2 TB
hard disk finally began to fail. Thanks to GNU ddrescue I've
successfully recovered at least 70% of the data on that drive.
Thank you!


I'm using the --data-preview option to monitor the data being read.
Throughout the process it seemed fine: I recognized numerous
text files as ddrescue recovered them.

Just today something weird started happening though.
At some point the data preview started showing the same
bytes over and over again. The input and output positions
kept advancing, as did the block offsets on the data preview,
but the data shown on the preview was always the same.
This happened while copying forwards and backwards.

I'm not entirely sure what's happening with the hard disk
or if it's even possible to recover more data out of it.
It's like the drive got stuck reading the same sectors
over and over again and for some reason it's not reporting
that fact as an I/O error to the operating system.

When I noticed this, I concluded that it was potentially invalid
or corrupt data. I interrupted the rescue, made note of the input
position and block offsets that were being read for later reference
and powered off the computer.

It's not the fault of ddrescue. The drive appears to be so badly
degraded that it can't even notice when it's reading garbage.
It looks like it can't even reliably report errors back up the stack.

Still, I'd like to suggest a feature based on this experience:
detection of repeatedly equal non-null reads.

Unless the drive has been filled with some zero or non-zero pattern,
it just seems astronomically unlikely to me that reading two different
regions from the storage device could return the same data.
I assume that's a sign the drive is returning invalid data
even in the absence of I/O errors in the kernel log.
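To make the idea concrete, here is a minimal sketch of the kind of check
I have in mind, written in Python rather than as a patch to ddrescue.
Everything in it is my own assumption: the function names, the 4096-byte
block size and the threshold of 4 identical blocks in a row are made up
for illustration and have nothing to do with ddrescue's internals.

```python
def read_blocks(path, block_size=4096):
    """Yield consecutive fixed-size blocks from a file (last one may be short)."""
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            yield block


def find_repeated_runs(blocks, min_run=4):
    """Return (start_index, run_length) for each run of identical non-null blocks.

    A block made only of zero bytes is never counted, since zero-filled
    regions are normal on a real disk. min_run is an arbitrary threshold.
    """
    runs = []
    prev = None    # last non-null block seen, or None
    start = 0      # index where the current run began
    length = 0     # length of the current run of identical blocks
    for i, block in enumerate(blocks):
        is_null = not any(block)
        if not is_null and block == prev:
            length += 1
        else:
            # The run ended: record it if it was long enough, then restart.
            if length >= min_run:
                runs.append((start, length))
            prev = None if is_null else block
            start = i
            length = 0 if is_null else 1
    if length >= min_run:
        runs.append((start, length))
    return runs
```

The same function could be run after the fact over an existing image,
for example find_repeated_runs(read_blocks("rescue.img")), where
"rescue.img" is a hypothetical file name. A drive deliberately filled
with a non-zero pattern would trigger it, so it could only ever be a
hint for the user and not proof of bad data.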

If ddrescue could detect this, then perhaps it could do something
smart about it. I'm not sure if there's anything that can be done;
folks much smarter than I would have to analyze the options.
Even simply halting the rescue process until the user intervenes
would be reasonable though.

In my case, I ended up reading what's likely invalid data from the disk
and writing it to the ddrescue image. This went on for an unknown
but hopefully short amount of time. The rescued data count falsely
increased and the sectors were marked as rescued in ddrescue's
map file even though the data probably couldn't be read correctly.
They won't be retried in later passes or when trimming and
scraping the disk. Data was also written needlessly to the new
storage device, which is undesirable for SSDs.

After the rescue, I planned to analyze the data loss by correlating
ext4 file system block locations with the data in ddrescue's map file.
Such analysis becomes complicated, if not impossible, when one cannot
assume that blocks rescued by ddrescue are valid data.
Detecting this condition would increase the confidence that blocks
marked by ddrescue as rescued were truly rescued.
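
For reference, the correlation I planned looks roughly like the sketch
below. It assumes the map file layout as I understand it: comment lines
start with '#', the first data line holds the current position and
status, and every following line is "pos size status" with '+' meaning
rescued. The function names are hypothetical, and it assumes positions
are written in hex with a 0x prefix or in plain decimal. An ext4 block
would be converted to a byte range beforehand (partition offset plus
block number times block size).

```python
def parse_mapfile(text):
    """Return a list of (pos, size, status) tuples from map file text."""
    entries = []
    seen_status_line = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if not seen_status_line:
            # First data line: current_pos current_status [current_pass].
            seen_status_line = True
            continue
        fields = line.split()
        # Base 0 lets int() accept both "0x..." hex and plain decimal.
        entries.append((int(fields[0], 0), int(fields[1], 0), fields[2]))
    return entries


def range_is_rescued(entries, start, length):
    """True if every byte of [start, start + length) lies in '+' blocks."""
    end = start + length
    covered = start  # everything in [start, covered) is known rescued
    for pos, size, status in sorted(entries):
        if pos + size <= covered or pos >= end:
            continue  # entry does not overlap the part still unchecked
        if status != "+" or pos > covered:
            return False  # not rescued, or a gap in the map
        covered = pos + size
        if covered >= end:
            return True
    return covered >= end
```

Of course this only tells me what ddrescue believes. If the drive
returned garbage without an I/O error, the range still shows up as '+'.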

Just a report of my extremely positive experience with GNU ddrescue
and a constructive suggestion on how to make it even better than it
already is. A gigantic thank you to everyone who ever worked on it.

  -- Matheus
