[Bug-ddrescue] ddrescue: some minor bugs, suggestions and feature requests.

Daniel Wed, 09 Mar 2011 01:14:38 -0800

I'm using ddrescue 1.14 compiled on a 64-bit Gentoo distro.

*Minor bug: `rescued:' value alternates between bytes and an appropriate
unit (KiB, MiB, etc.)*


I have invoked ddrescue in the following fashion:

ddrescue --i 0x1500AC0200 -s 0x5D9F0200 --max-retries=1
--binary-prefixes --verbose --direct /dev/sdd2 img log

What I'm noticing is that the `rescued:' value alternates between bytes
and a suffix appropriate for the size (GiB, MiB, KiB, etc.).  This
appears to happen when:

    * The value is less than one MiB
    * The outcome alternates between data recovered and failed sectors
      being recorded.

I glanced just a little bit at the code, and I recommend adding a
non-virtual Logbook::show_status(<some signature>) that performs the
generic display of the appropriate data.  Then, the
show_status(<some other signature>) override in each subclass should do
whatever specialized operations are necessary and call the generic
Logbook::show_status() function.  (Alternately, you can call the generic
function show_status_generic() or some such.)  This will eliminate
duplicate code that can get out of sync during maintenance and bloat
both the source and compiled code, and it is the cleaner way to manage
subtle differences in behavior between sibling subclasses.  Alternately,
depending upon your tastes, it can be a good place for the non-virtual
interface idiom
(http://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Non-Virtual_Interface).
Either way, the idea is to eliminate duplicate functionality in
subclasses by creating a generic function in a base class and having the
subclasses implement only the parts of the functionality that differ.
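To illustrate the non-virtual interface approach, here is a minimal
sketch.  The class names (Base_book, Copy_book, Fill_book), the hook
status_extra() and the output format are all made up for the example;
this is not ddrescue's actual code or its real status layout.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for the shared base class (not ddrescue's code).
class Base_book
  {
public:
  virtual ~Base_book() {}

  // Non-virtual: the single place where the common fields are formatted.
  std::string show_status( const long long rescued, const long long errsize ) const
    {
    std::ostringstream os;
    os << "rescued: " << rescued << " B, errsize: " << errsize << " B";
    const std::string extra = status_extra();   // subclass-specific part
    if( !extra.empty() ) os << ", " << extra;
    return os.str();
    }

private:
  // Hook: each subclass supplies only what differs from its siblings.
  virtual std::string status_extra() const { return std::string(); }
  };

class Copy_book : public Base_book
  {
  std::string status_extra() const override { return "copy pass 1"; }
  };

class Fill_book : public Base_book
  {
  std::string status_extra() const override { return "filling"; }
  };
```

With this shape, a fix to how the `rescued:' value is printed would be
made once in Base_book::show_status() and every subclass would pick it
up.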

*Feature request: SIGUSR1 to flush buffers after current read operation
and subsequent write completes.*

This is a very minor issue, but when I'm working on a drive where the
log file isn't flushed often (obviously, not using --sync), I sometimes
just want to check on the progress.  Alternately, maybe SIGUSR1 could
cause it to output a small report giving more complete details?
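A rough sketch of how such a signal could be handled, assuming a POSIX
system.  The names flush_requested, on_sigusr1(), install_flush_handler()
and take_flush_request() are hypothetical and do not exist in ddrescue.
The handler only sets a flag; the copy loop would poll it after each
read/write pair and do the actual flushing there.

```cpp
#include <csignal>

// Set by the signal handler, polled by the copy loop (hypothetical name).
volatile std::sig_atomic_t flush_requested = 0;

// Keep the handler trivial: setting a sig_atomic_t flag is safe here.
extern "C" void on_sigusr1( int ) { flush_requested = 1; }

void install_flush_handler() { std::signal( SIGUSR1, on_sigusr1 ); }

// To be called after the current read and its subsequent write complete.
// Returns true if a flush was requested since the last call; the caller
// would then write out the log file (and perhaps print a short report).
bool take_flush_request()
  {
  if( !flush_requested ) return false;
  flush_requested = 0;
  return true;
  }
```

From another terminal, the request would then be sent with something
like `kill -USR1 <pid of ddrescue>'.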

*Feature request: option to ask ddrescue to omit known damaged areas.*

This is admittedly only my second disk to rescue with ddrescue in the
last few years (and data recovery isn't my occupation).  The current one
I'm working on has areas that, when attempting to read, result in some
very loud noises that /can't/ be good!  I've been manually writing down
where these are and then doing my recovery piecemeal in chunks that I've
discovered are not as severely damaged.  My thought was perhaps being
able to specify a file containing a list of data ranges to omit.  Then,
once every other area has been initially recovered, trimmed and split a
few times, we can visit the areas that we know will lead to more rapid
deterioration of the disk.
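As a sketch of the omit-list idea, the function below subtracts a list of
ranges to omit from the rescue domain and returns what is left to copy
first.  The Range type and subtract_ranges() are hypothetical names for
this example, and the on-disk format of the omit file is left open.

```cpp
#include <algorithm>
#include <vector>

struct Range { long long begin, end; };   // half-open interval [begin, end)

// Return the parts of 'domain' that are not covered by any range in 'omit'.
std::vector<Range> subtract_ranges( const Range domain, std::vector<Range> omit )
  {
  std::sort( omit.begin(), omit.end(),
             []( const Range & a, const Range & b ) { return a.begin < b.begin; } );
  std::vector<Range> keep;
  long long pos = domain.begin;           // first position not yet decided
  for( const Range & r : omit )
    {
    if( r.end <= pos ) continue;          // already behind us
    if( r.begin >= domain.end ) break;    // past the end of the domain
    if( r.begin > pos ) keep.push_back( Range{ pos, r.begin } );
    pos = std::max( pos, r.end );
    }
  if( pos < domain.end ) keep.push_back( Range{ pos, domain.end } );
  return keep;
  }
```

Each range returned corresponds to one position/size pair of the kind I
have been passing by hand on the command line; the omitted ranges would
be visited only after everything else has been tried.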

An advanced version of this would be to possibly have a mode where
ddrescue jumps around the disk looking for non-damaged portions, reading
until it hits an error and then leaping around a bit (in very large
steps, like gigabytes) looking for long continuous strips of undamaged
disk areas.

*Feature request/idea: --mode={normal,binary-tree} (a smarter recovery)*

What I've found to be particularly helpful in this recovery project has
been initially using large clusters (1-32 MiB) and then later returning
to the failed blocks and manually splitting them -- i.e., starting from
the middle of that "failed block" in reverse and then going forward from
the middle.  But again, here I was attempting to skip past damaged data
quickly, spending as little time as possible with my heads there, and
get to the good data while it was still good.  This made me consider the
idea of having an alternate mode with the following behavior.  At
invocation, the following is specified (or default values are used).

    * The mode is set to binary-tree or some alternate name for this
      mode.  (default is "normal", which precludes the rest of this section)
    * An optional start location {beginning|middle|end} (default is
      "beginning")
    * An "errors tolerated before jumping" value is set (default is
      probably one).

So this is how it would behave in this mode:

    * ddrescue begins copying data at the specified location, going
      forward if starting at the beginning, backwards if starting at the
      end and the specified direction if starting from the middle (i.e.,
      forward, unless --reverse is specified).  For this example, we'll
      pretend we're starting from the beginning.
    * When the specified number of read errors is encountered, ddrescue
      will mark that block as bad (just as it currently does), but it
      will then jump to the end and begin reading backwards.
    * When the specified number of read errors is encountered again, it
      will jump to the center point exactly between the two previously
      abandoned locations.  It will then begin reading either forwards
      or backwards.
    * When the specified number of read errors is encountered again, it
      will jump back to the center point and begin reading in the
      opposite direction from before.

The rest of the behavior should be a continuation of fleshing out the
binary tree mapping all of these abandoned locations, iteratively
splitting the sections of untried data into increasingly smaller sizes,
always beginning a new read cycle at the center point (aligned to the
cluster size, of course) between previously abandoned locations.  The
result /should/ be, in theory, more good data read initially, resulting
in a higher net yield of recovered to lost data for a hard drive that is
failing due to physical damage and contamination.
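A small sketch of the core step, choosing where the next read cycle
starts.  Gap, aligned_center() and next_start() are hypothetical names,
positions are assumed to be non-negative, and picking the largest untried
gap first is my own assumption; the description above only requires that
the gaps keep being split at their aligned center points.

```cpp
#include <cassert>
#include <vector>

struct Gap { long long begin, end; };     // untried area, half-open [begin, end)

// Center of a gap, rounded down to a multiple of cluster_size, but never
// placed before the start of the gap.
long long aligned_center( const Gap & g, const long long cluster_size )
  {
  assert( cluster_size > 0 && g.begin >= 0 && g.begin <= g.end );
  const long long mid = g.begin + ( g.end - g.begin ) / 2;
  const long long aligned = mid - ( mid % cluster_size );
  return ( aligned < g.begin ) ? g.begin : aligned;
  }

// Pick the start of the next read cycle: the center of the largest
// untried gap.  Returns -1 when nothing is left untried.
long long next_start( const std::vector<Gap> & untried, const long long cluster_size )
  {
  const Gap * best = nullptr;
  for( const Gap & g : untried )
    if( g.end > g.begin &&
        ( !best || g.end - g.begin > best->end - best->begin ) )
      best = &g;
  return best ? aligned_center( *best, cluster_size ) : -1;
  }
```

From the returned position, the copy would run in one direction until the
error limit is hit, then return to the same position and run in the
other direction, leaving two smaller gaps to be split in later cycles.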

*Other thoughts*

I've noticed some patterns when manually jumping around, trying to avoid
bad spots.  For instance, in one section of the disk, it seemed that
there was about 23 MiB of good data, followed by about 7 MiB of bad
data.  I could almost picture a disk with tracks 30 MiB long at that
point with a bad "spot" on it.  I wonder if some trend analysis could
result in a cleaner recovery of disks with less human interaction as
well.


Thanks for the great program!
Daniel Santos
