On Mon, Mar 25, 2019 at 10:44:15PM +0000, berodual_xyz wrote:
> Thank you very much Hugo,
>
> the underlying devices are based on HW raid6 and effectively "stitched"
> together. Losing any of those would mean losing all data, so much is clear.
>
> My concern was not so much bitrot / silent data corruption, but I would not
> have expected disabled data checksumming to be a disadvantage at recovering
> from the supposed corruption now.
   OK, so it's not quite as bad a case as I painted. Turning off all
of the btrfs data-protection features still isn't something you'd do
to data you're friends with. However, it shouldn't directly affect the
recoverability of the data (assuming you had RAID-1 metadata).

   The main problem is that you've had a transid error, which is
pretty much universally fatal. There's a description of what that
means in the FAQ here:

https://btrfs.wiki.kernel.org/index.php/FAQ#What_does_.22parent_transid_verify_failed.22_mean.3F

> Does anyone have any input on how to restore files based on inode no.
> from the tree dump that I have?

   I'm not sure what you mean by "tree dump". Do you mean
btrfs-debug-tree? Or btrfs-image? Or something else? In any case, none
of those are likely to help all that much. The metadata is corrupted
in a way that shouldn't ever happen, and where it's really hard to
work out how to fix it, even with an actual human expert involved.
(It's why there's no "btrfs check" fix for this situation -- you
simply can't take the metadata broken in this way and make much sense
out of it.)

   Hugo.

> "usebackuproot,ro" did not succeed either.
>
> Much appreciate the input!
>
> Sent with ProtonMail Secure Email.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Monday, March 25, 2019 11:38 PM, Hugo Mills <h...@carfax.org.uk> wrote:
>
> > On Mon, Mar 25, 2019 at 10:26:29PM +0000, berodual_xyz wrote:
> >
> > > Dear all,
> > > on a large btrfs based filesystem (multi-device raid0 - all devices
> > > okay, nodatacow,nodatasum...)
> >
> >    Ouch. I think the only thing you could have done to make the FS
> > more fragile is mounting with nobarrier(*). Frankly, anything you're
> > getting off it is a bonus. RAID-0 gives you no duplicate copy,
> > nodatacow implies nodatasum, and nodatasum doesn't even give you the
> > ability to detect data corruption, let alone fix it.
> >
> >    With that configuration, I'd say pretty much by definition the
> > contents of the FS are considered to be discardable.
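   For anyone finding this thread later, the usual last-resort
sequence for a filesystem in this state looks roughly like the sketch
below. The commands are printed as a cheat-sheet rather than executed,
so the snippet itself touches no devices; /dev/sdX, /mnt/rescue and
BYTENR are placeholders (not values from this system), and with a
transid failure there's no guarantee any of it recovers more data.

```shell
# Sketch only: the recovery commands are printed, not run, so nothing
# here touches a real device.
# Placeholders, not values from this system: /dev/sdX, /mnt/rescue, BYTENR.
DEV=/dev/sdX
DEST=/mnt/rescue

recovery_steps="\
# 1. Read-only mount attempts, least to most aggressive:
mount -o ro,nologreplay $DEV /mnt
mount -o ro,nologreplay,usebackuproot $DEV /mnt
# 2. If nothing mounts, list candidate (older) tree root locations:
btrfs-find-root $DEV
# 3. For each BYTENR that reports, do a dry-run restore first:
btrfs restore -t BYTENR -i -v -D $DEV $DEST
# 4. ...then a real pass, ignoring the errors that can be ignored:
btrfs restore -t BYTENR -i -v $DEV $DEST"

printf '%s\n' "$recovery_steps"
```

   Note that btrfs-find-root can take a long time on a large
filesystem, and the -D (dry-run) pass shows what a candidate root
would recover before anything is written to the destination.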
> >    Restoring from backups is the recommended approach with transid
> > failures.
> >
> > (*) Don't do that.
> >
> > > I experienced severe filesystem corruption, most likely due to a
> > > hard reset with inflight data.
> > > The system cannot mount (also not with "ro,nologreplay" /
> > > "nospace_cache" etc.).
> >
> >    Given how close the transids are, have you tried
> > "ro,usebackuproot"? That's about your only other option at this
> > point. But, if btrfs restore isn't working, then usebackuproot
> > probably won't either.
> >
> > > Running "btrfs restore" I got a reasonable amount of data backed
> > > up, but a large chunk is missing.
> > > "btrfs check" gives the following error:

-- 
Hugo Mills             | I gave up smoking, drinking and sex once. It was the
hugo@... carfax.org.uk | scariest 20 minutes of my life.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |