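There is one exception to that. Unix in general, and I believe Linux still, distinguishes between 'fast' and 'slow' I/O devices. When a read from a 'fast' device, like a disk, gets hung up inside its device driver, the process sits in uninterruptible sleep and no signal, not even SIGKILL, is delivered until the driver returns. Until then there is no way to kill it.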
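
On Linux, a process stuck this way shows up in state 'D' (uninterruptible disk sleep) in ps and in /proc/<pid>/stat. Here is a rough sketch of how one might check whether a hung ddrescue is in that state. The helper names (parse_state, process_state, is_uninterruptible) are my own invention for illustration, not part of ddrescue or of any library, and the file layout assumed is the one described in proc(5).

```python
def parse_state(stat_line: str) -> str:
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The second field (the command name) is wrapped in parentheses and may
    # itself contain spaces or ')', so split after the LAST ')' in the line.
    rest = stat_line.rsplit(")", 1)[1]
    return rest.split()[0]


def process_state(pid: int) -> str:
    """Read the current state of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_state(f.read())


def is_uninterruptible(pid: int) -> bool:
    """True if the process is in 'D' state, where signals stay pending."""
    return process_state(pid) == "D"
```

If is_uninterruptible() returns True for the ddrescue PID, then SIGKILL is merely pending and will only take effect once the driver returns from the read, for example after the cable is pulled.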
On January 16, 2018 11:03:59 AM EST, Antonio Diaz Diaz <anto...@gnu.org> wrote:
>Hi Linus,
>
>Linus Lüssing wrote:
>> First of all, I need to complain vehemently: GNU ddrescue works
>> too well :-). Now for the third time, it saved one of my neighbors'
>> data! How should people learn to make backups if such an
>> awesome tool like ddrescue exists? :P
>
>How true! :-)
>
>
>> Just kidding ;) - you guys are awesome, GNU ddrescue is one of the
>> most valuable (and still too unknown) pieces of free software in my
>> opinion.
>
>Thanks.
>
>
>> During these hangs, the Ctrl-C would do nothing and even a SIGKILL
>> would not kill ddrescue. The SATA-to-USB adapter would continue
>> flashing its blue LED, seemingly still trying to read.
>
>We have already had bad experiences with USB adapters in this list.
>(The advice is to plug the drive directly to the motherboard). But in
>this case there seems to be also a bug in the kernel driver regarding
>SIGKILL. According to POSIX, SIGKILL cannot be handled or ignored. The
>GNU C library manual even states that:
>"In fact, if 'SIGKILL' fails to terminate a process, that by itself
>constitutes an operating system bug which you should report."
>
>
>> Question A): Would it be possible to reset the operation from
>> software somehow? A timeout in ddrescue? Or does this sound like a
>> hangup on an even lower level, the Linux kernel (I was using a
>> 4.14.12 kernel on a 32bit ARM device, an Odroid U3) or maybe even
>> the disk and/or SATA-USB adapter so that power cycling the disk /
>> reconnecting the adapter is the only choice?
>
>The kernel driver for a device should know and implement whatever
>timeout is required for that type of device. The problem, I think, is
>that USB is not a device, but a communication bus, and maybe the driver
>just sits and waits forever. In any case, if SIGKILL fails to terminate
>ddrescue, there is nothing that ddrescue can do.
>
>
>> Another observation: During the trimming and scraping phases (so
>> with the chunk size of 1 / 512B instead of 128x 512B chunks?) I
>> did not experience those tedious hangs anymore. Could it be a
>> firmware bug happening when requesting larger chunks?
>
>Maybe. Next time maybe you could try if --cluster-size=1 prevents the
>hangs during the copying phase.
>
>
>> Also, after pulling the USB cable, ddrescue unfroze and exited
>> with an error, as expected.
>
>This seems consistent with "the driver just sits and waits forever"
>(until the connection is interrupted).
>
>
>> Regarding the unplugging I also noticed: Pulling without a
>> previous Ctrl+C seemed like a bad idea. This led to ddrescue
>> adding many Megabytes of false negatives to the mapfile.
>>
>> Question B): Would it be possible to prevent this?
>
>Yes, using --reopen-on-error, --max-error-rate or --max-bad-areas.
>--reopen-on-error should return immediately reporting "Can't reopen
>input file". (Maybe --reopen-on-error should be enabled by default).
>
>
>> For the Ctrl+C and then unplugging I noticed: Sometimes it exits
>> with an "interrupted by user", sometimes with a "input file/device
>> vanished". I couldn't figure out when one or the other might
>> happen, the result was seemingly random.
>
>It depends on how fast the kernel removes the device name from /dev.
>ddrescue stats the device name after each read error and, if it still
>exists, moves to the next block (and then exits with "interrupted by
>user").
>
>
>> Also it seemed, that only for the latter exit case a bad cluster
>> was added to the mapfile? Which was the desirable result for me as
>> this was indeed a cluster hanging forever. For the "interrupted by
>> user" case it seemed that (usually?) no error was added to the
>> mapfile. Does that make sense?
>
>If ddrescue is blocked in the read call when unplugging, it should
>always mark the block as "bad" (non-trimmed, etc) in the mapfile. The
>user interrupt is checked before making the read call. Maybe the USB
>adapter is returning fake data, tricking ddrescue into marking the
>block as finished in the "interrupted by user" case?
>
>
>Best regards,
>Antonio.
>
>_______________________________________________
>Bug-ddrescue mailing list
>Bug-ddrescue@gnu.org
>https://lists.gnu.org/mailman/listinfo/bug-ddrescue