Subject: dd(1)-written disk has ~800MB of NULs

tl;dr: I dd(1)'d a partition to a HD in an external enclosure, then cmp(1)'d to verify the copy, and found ~800MB of NULs in the target of the copy. I'm trying to figure out what went wrong and whether I can trust the enclosure and the HD the data was written to.

---

What happened:

I booted a host from a live USB [debian buster, kernel 4.19.67-2+deb10u1] and used dd(1) to copy one of the host's partitions to a 3.5" SATA disk in an external USB 3.0 enclosure [vid 2109 pid 0711, quirks US_FL_NO_ATA_1X]. The enclosure is connected via two USB cables (the manufacturer's USB 3 B–A cable and a 3m USB A extension cable) and has its own 220V power supply.

The partition in question is ~700GB. It wasn't mounted at the time. The argument to dd(1)'s of= option was a partition on the target disk, not a regular file. dd(1) processed data at 26MB/s. (IIRC, I didn't specify any bs= argument.) dd(1)'s exit code was zero.

I turned off the host and the next day, in the same live environment, cmp(1)'d the source and target partitions. cmp(1) found a difference about 20% of the way in. A closer look revealed that 192774 4096-byte blocks (789,602,304 bytes, about 790MB) in the middle of the target partition contained only NULs. Other than those NULs, the target partition was identical to the source partition.

I have now re-written those ~790MB, which succeeded. Reading them back succeeded too, and they compare equal to the source partition.

SMART status of the source disk is clean. I can't easily get the SMART status of the target disk (the enclosure doesn't support that).

---

I'm not sure what to make of that. It seems like dd(1) silently failed to write ~790MB of data.

The target partition is in an area of the target drive that was likely never used before, so it's possible that all-NULs is what those ~790MB contained before the dd(1) run. Thus, two possibilities: either the sectors weren't written to at all, or they were written to with NULs rather than with the correct data.

---

I'd like to understand what caused the silent write failure so I can ensure it won't happen again and, more importantly, so I can ensure that disks I write will be readable when I need them.

What should be my first suspect here? A hardware issue? What part of the setup should I look at first?
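
For anyone who wants to repeat the "closer look" step, here is a minimal Python sketch of one way to do it; it is not the exact procedure used above. It compares two files or block devices block by block and reports each run of consecutive differing blocks, noting whether the target side of the run is entirely NULs. The function name `find_mismatched_runs` and the 4096-byte default block size are assumptions made for illustration.

```python
def find_mismatched_runs(source_path, target_path, block_size=4096):
    """Return a list of (first_block, block_count, target_all_nul) tuples.

    Each tuple describes a maximal run of consecutive blocks in which the
    target differs from the source. target_all_nul is True only if every
    differing block in the run is made up solely of NUL bytes on the
    target side.
    """
    runs = []
    current = None  # [first_block, count, all_nul] for the run in progress
    index = 0
    with open(source_path, "rb") as src, open(target_path, "rb") as dst:
        while True:
            a = src.read(block_size)
            b = dst.read(block_size)
            if not a and not b:
                break  # both inputs exhausted
            if a != b:
                # True only for a non-empty block consisting of NULs
                is_nul = len(b) > 0 and b.count(0) == len(b)
                if current is None:
                    current = [index, 1, is_nul]
                else:
                    current[1] += 1
                    current[2] = current[2] and is_nul
            elif current is not None:
                # A matching block ends the run in progress
                runs.append(tuple(current))
                current = None
            index += 1
    if current is not None:
        runs.append(tuple(current))
    return runs
```

Called with the source and target partitions (opened read-only, which typically requires root), a single all-NUL hole like the one described above would show up as one tuple whose last element is True. A larger `block_size` that is a multiple of 4096 reads faster but reports offsets in those larger units.
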
What should I do to make sure the data will be readable? If I verify the data after writing it [e.g., by cmp(1) against a known-good copy, or by verifying PGP signatures], does that ensure that the data will be readable /in the future/, assuming the drive is kept in storage in the meantime?

Cheers,

Daniel

P.S. I have another, verified backup of that partition, as well as a non-block-level backup of it, so no need to worry about that partition.

_______________________________________________
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il