On 7/3/20 1:41 PM, Chris Murphy wrote:
> SSDs can fail in weird ways. Some spew garbage as they're failing,
> some go read-only. I've seen both. I don't have stats on how common it
> is for an SSD to go read-only as it fails, but once it happens you
> cannot fsck it. It won't accept writes. If it won't mount, your only
> chance to recover data is some kind of offline scrape tool. And Btrfs
> does have a very very good scrape tool, in terms of its success rate -
> UX is scary. But that can and will improve.

Ok, you and Josef have both recommended the btrfs restore ("scrape")
tool as the next recovery step after fsck fails, so I figured we should
check that out, to see whether it alleviates the concerns about
recoverability of user data in the face of corruption.

I also realized that mkfs of an image isn't representative of an SSD
system typical of Fedora laptops, so I added "-m single" to mkfs,
because this will be the mkfs.btrfs default on SSDs (right?). Based on
Josef's description of fsck's algorithm of throwing away any block with
a bad CRC, this seemed worth testing.

I also turned fuzzing /down/ to hitting 2048 bytes out of the 1G image,
or a bit less than 1% of the filesystem blocks, at random. This is 1/4
the fuzzing rate from the original test.

So: -m single, fuzz 2048 bytes of the 1G image, run btrfsck --repair,
mount, mount w/ recovery, and then restore ("scrape") if all that
fails, and see what we get.

I ran 50 loops, and got:

46 btrfsck failures
20 mount failures

So it ran btrfs restore 20 times; of those, 11 runs lost all or
substantially all of the files; 17 runs lost at least 1/3 of the files.

-Eric
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
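
For readers who want to reproduce the fuzzing step described above, here is a minimal sketch of what "hit 2048 random bytes of an image" could look like. This is not the fuzzer used in the test; the function names `fuzz_image` and `max_block_fraction` are hypothetical, and the 4096-byte block size is an assumption used only to check the "a bit less than 1% of the filesystem blocks" arithmetic. The mkfs, btrfsck, mount, and restore steps are deliberately left out.

```python
import os
import random
import tempfile


def fuzz_image(path, n_bytes=2048, seed=None):
    """Overwrite n_bytes distinct, randomly chosen byte offsets in the
    file at `path` with random values. Returns the offsets touched.
    (Hypothetical helper, not the tool used in the original test.)"""
    rng = random.Random(seed)
    size = os.path.getsize(path)
    offsets = rng.sample(range(size), n_bytes)  # distinct offsets
    with open(path, "r+b") as f:
        for off in offsets:
            f.seek(off)
            f.write(bytes([rng.randrange(256)]))
    return offsets


def max_block_fraction(n_bytes, image_size, block_size=4096):
    """Upper bound on the fraction of blocks hit, reached when every
    fuzzed byte lands in a different block. block_size is an assumption."""
    return n_bytes / (image_size // block_size)


# Small demonstration on a 1 MiB scratch file instead of a real 1G image.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(1 << 20))
touched = fuzz_image(tmp.name, n_bytes=2048, seed=0)
os.remove(tmp.name)
print(len(touched))

# 2048 bytes in a 1 GiB image with assumed 4 KiB blocks:
print(max_block_fraction(2048, 1 << 30))  # 0.0078125
```

With those assumptions, at most 2048 of 262144 blocks are touched, which is about 0.78% and matches the "a bit less than 1%" figure. A fuzzed byte can randomly receive its old value, so the number of bytes that actually change may be slightly below 2048.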