On Sun, 13 Mar 2016 17:03:54 +0000 (UTC)
Duncan <1i5t5.dun...@cox.net> wrote:

> With backups I'd try it, if only for the personal experience value and to 
> see what the result was.  But that's certainly more intensive "surgery" 
> on the filesystem than --repair, and I'd only do it either for that 
> experience value or if I was seriously desperate to recover files, as I'd 
> not trust the filesystem's health after that intensive a surgery, and 
> would blow the filesystem away after I recovered what I needed, even if 
> it did appear to work successfully.

"Blowing away" a 6TB filesystem just because some block randomly went "bad",
without any explanation of why, or any guarantee that it won't happen again, is
not the best outcome. Sure, there might be no way to "guarantee" anything, but
let's at least figure out a robust way to recover from this failure state.

I'm running --init-extent-tree right now in a "what if" mode, using
the copy-on-write feature of 'nbd-server' (this way the original block device
is not modified, and all changes are saved in a separate file). It's been
running for a good 8 hours now, with btrfsck at 100% CPU and very little
disk access. Unless I'm mistaken and something went majorly wrong, these
messages (100 MB worth of them by now) seem to indicate that it is indeed
recreating the extent tree:

adding new data backref on 3282190336 parent 4315246948352 owner 0 offset 0 found 1
Backref 3282190336 root 256 owner 1187677 offset 4096 num_refs 0 not found in extent tree
Incorrect local backref count on 3282190336 root 256 owner 1187677 offset 4096 found 1 wanted 0 back 0x23496e40
Backref 3282190336 parent 4315038240768 owner 0 offset 0 num_refs 0 not found in extent tree
Incorrect local backref count on 3282190336 parent 4315038240768 owner 0 offset 0 found 1 wanted 0 back 0x4b29f3a0
Backref 3282190336 parent 4315246948352 owner 0 offset 0 num_refs 0 not found in extent tree
Incorrect local backref count on 3282190336 parent 4315246948352 owner 0 offset 0 found 1 wanted 0 back 0x4c330f60
backpointer mismatch on [3282190336 4096]
ref mismatch on [3282194432 32768] extent item 0, found 1
adding new data backref on 3282194432 parent 4309109956608 owner 0 offset 0 found 1
Backref 3282194432 parent 4309109956608 owner 0 offset 0 num_refs 0 not found in extent tree
Incorrect local backref count on 3282194432 parent 4309109956608 owner 0 offset 0 found 1 wanted 0 back 0x52903a20
backpointer mismatch on [3282194432 32768]
ref mismatch on [3282227200 4096] extent item 0, found 1
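
With 100 MB of such output, a rough tally may be easier to follow than the raw
stream. The following is only a sketch: it assumes the "ref mismatch" and
"backpointer mismatch" lines keep the format shown above, and that the extent
start addresses are visited in ascending order (which the excerpt suggests but
which has not been confirmed). The function name summarize() is made up for
this illustration.

```python
import re
from collections import Counter

# Matches lines such as
#   "ref mismatch on [3282194432 32768] extent item 0, found 1"
#   "backpointer mismatch on [3282190336 4096]"
# The bracket holds the extent start address and its length in bytes.
MISMATCH_RE = re.compile(r"^(ref|backpointer) mismatch on \[(\d+) (\d+)\]")


def summarize(lines):
    """Count mismatch messages and remember the highest extent start seen."""
    counts = Counter()
    highest = None
    for line in lines:
        m = MISMATCH_RE.match(line.strip())
        if m is None:
            # Other message types and wrapped continuation lines are skipped.
            continue
        counts[m.group(1)] += 1
        start = int(m.group(2))
        if highest is None or start > highest:
            highest = start
    return counts, highest


# Four lines copied from the excerpt above.
sample = """\
backpointer mismatch on [3282190336 4096]
ref mismatch on [3282194432 32768] extent item 0, found 1
backpointer mismatch on [3282194432 32768]
ref mismatch on [3282227200 4096] extent item 0, found 1
""".splitlines()

counts, highest = summarize(sample)
print(dict(counts), highest)
```

If the ascending-order assumption holds, comparing the highest address against
the size of the filesystem would give a crude idea of how far along the run is.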

Once it finishes I'll check whether the files are present and not corrupted,
then will have to run it once more, this time "for real". Unfortunately this
also seems to be worse than an O(n) operation (if I'm using the term
correctly), as the rate at which new log messages appear has been slowing down
considerably as it progresses.
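
For the "present and not corrupted" check, one possible approach is to compare
the recovered tree against a list of known-good checksums. This is a sketch
under a big assumption: that such a list exists (for example, taken from a
backup). The manifest format (relative path mapped to a SHA-256 hex digest) and
the names sha256_of() and verify() are hypothetical.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(root, manifest):
    """Compare files under root against manifest {relative path: sha256 hex}.

    Returns (missing, corrupted) as two sorted lists of relative paths.
    """
    missing, corrupted = [], []
    for rel, expected in sorted(manifest.items()):
        full = os.path.join(root, rel)
        if not os.path.isfile(full):
            missing.append(rel)
            continue
        try:
            actual = sha256_of(full)
        except OSError:
            # A read error (such as EIO) is counted as corruption here.
            corrupted.append(rel)
            continue
        if actual != expected:
            corrupted.append(rel)
    return missing, corrupted
```

Even without a manifest, reading every file from start to end should surface
read errors for data that fails the filesystem's own checksums, so a plain
full read of the tree is a useful first pass.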

-- 
With respect,
Roman
