On Sun, 13 Mar 2016 17:03:54 +0000 (UTC) Duncan <1i5t5.dun...@cox.net> wrote:
> With backups I'd try it, if only for the personal experience value and to
> see what the result was. But that's certainly more intensive "surgery"
> on the filesystem than --repair, and I'd only do it either for that
> experience value or if I was seriously desperate to recover files, as I'd
> not trust the filesystem's health after that intensive a surgery, and
> would blow the filesystem away after I recovered what I needed, even if
> it did appear to work successfully.

"Blowing away" a 6TB filesystem just because some block randomly went "bad",
without any explanation why, or guarantees that this won't happen again, is
not the best outcome. Sure, there might be no way to "guarantee" anything, but
let's at least figure out a robust way to recover from this failure state.

I'm running --init-extent-tree right now in a "what if" mode, using the
copy-on-write feature of 'nbd-server' (this way the original block device is
not modified, and all changes are saved in a separate file).

It's been running for a good 8 hours now, with 100% CPU use by btrfsck and
very little disk access. Unless I'm mistaken and something went majorly wrong,
these messages (100 MB worth of them by now) seem to indicate that it is
indeed recreating the extent tree.

adding new data backref on 3282190336 parent 4315246948352 owner 0 offset 0 found 1
Backref 3282190336 root 256 owner 1187677 offset 4096 num_refs 0 not found in extent tree
Incorrect local backref count on 3282190336 root 256 owner 1187677 offset 4096 found 1 wanted 0 back 0x23496e40
Backref 3282190336 parent 4315038240768 owner 0 offset 0 num_refs 0 not found in extent tree
Incorrect local backref count on 3282190336 parent 4315038240768 owner 0 offset 0 found 1 wanted 0 back 0x4b29f3a0
Backref 3282190336 parent 4315246948352 owner 0 offset 0 num_refs 0 not found in extent tree
Incorrect local backref count on 3282190336 parent 4315246948352 owner 0 offset 0 found 1 wanted 0 back 0x4c330f60
backpointer mismatch on [3282190336 4096]
ref mismatch on [3282194432 32768] extent item 0, found 1
adding new data backref on 3282194432 parent 4309109956608 owner 0 offset 0 found 1
Backref 3282194432 parent 4309109956608 owner 0 offset 0 num_refs 0 not found in extent tree
Incorrect local backref count on 3282194432 parent 4309109956608 owner 0 offset 0 found 1 wanted 0 back 0x52903a20
backpointer mismatch on [3282194432 32768]
ref mismatch on [3282227200 4096] extent item 0, found 1

Once it finishes I'll check whether the files are present and not corrupted,
and then I will have to run it once more, this time "for real".

Unfortunately this also seems to be a worse-than-O(n) operation (if I'm using
the term correctly), as the rate at which new log messages appear has been
slowing down considerably as it progresses.

-- 
With respect,
Roman
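
P.S. For keeping an eye on how far the run has got, here is a rough sketch in Python (not part of btrfs-progs, only an illustration). It counts the "ref mismatch" and "backpointer mismatch" lines in output like the above and remembers the highest extent start address seen. The function name summarize() is made up for this example. It also assumes that extents are reported in roughly increasing byte order, which I have not verified.

```python
import re
from collections import Counter

# Matches lines such as "backpointer mismatch on [3282190336 4096]".
# The two numbers in brackets are taken to be extent start and length.
BRACKET = re.compile(r"^(ref mismatch|backpointer mismatch) on \[(\d+) (\d+)\]")


def summarize(lines):
    """Count mismatch messages per kind and track the highest extent start.

    ASSUMPTION: the highest start address is only a crude progress hint,
    valid if the check walks extents in roughly increasing byte order.
    """
    kinds = Counter()
    highest = 0
    for line in lines:
        m = BRACKET.match(line.strip())
        if m is None:
            continue  # other message types are ignored in this sketch
        kinds[m.group(1)] += 1
        highest = max(highest, int(m.group(2)))
    return kinds, highest


# A few lines copied from the output above.
sample = [
    "backpointer mismatch on [3282190336 4096]",
    "ref mismatch on [3282194432 32768] extent item 0, found 1",
    "adding new data backref on 3282194432 parent 4309109956608 owner 0 offset 0 found 1",
    "backpointer mismatch on [3282194432 32768]",
    "ref mismatch on [3282227200 4096] extent item 0, found 1",
]

kinds, highest = summarize(sample)
print(dict(kinds), highest)
```

On a real run the same function can be fed an open file object instead of the sample list, for example a log that the btrfsck output was redirected to (whatever name that file has). Comparing the highest address between two points in time gives a rough idea of how quickly it is moving.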