> I don't know all the states of this file system and the copies you have.
> Right now the earliest copy is obviously broken, and the latest copy is
> probably more broken, because at the least its csum tree has been blown
> away, meaning there are no checksums to confirm whether any data
> extracted/copied from the file system is OK. You'd just have to open the
> file and look at it to see if it behaves as it should, and even then,
> depending on what kind of file it is, corruption may not be obvious
> immediately. So... catch-22.

I'm not sure I follow you. I have a clean dd image, and I made a fresh
copy of it every time I tried something new, so no copy should be any
more broken than whatever the current recovery operation did to it.
(There shouldn't be a distinction between earliest and latest if I
understand what you mean by "copy".)
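Concretely, my loop for each attempt has looked roughly like this (device
names and paths are just placeholders, not my exact setup):

  # one-time image of the damaged partition
  dd if=/dev/sda6 of=/backup/sda6.img bs=4M conv=noerror,sync status=progress

  # before each new experiment: fresh working copy, never the original image
  cp /backup/sda6.img /backup/sda6-work.img
  losetup -f --show /backup/sda6-work.img    # prints e.g. /dev/loop0
  btrfs restore -v /dev/loop0 /mnt/rescue/   # or whatever I'm trying next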

I get all those tree errors when it tries to access the "active"
versions of the @ and @home subvolumes. But by the time it starts
restoring the snapshots I don't see those anymore. So my
interpretation of the output was that the trees rooted at @ and @home
are messed up, but the tree structures for the snapshots are ok. Am I
misunderstanding that output, or are we both equally in the dark about
what it's telling us?
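For what it's worth, this is roughly how I've been poking at individual
tree roots; the root IDs below are made-up examples, not my real output:

  # list the tree roots (subvolumes and snapshots) that restore can find
  btrfs restore -l /dev/loop0

  # restore files from one specific snapshot root instead of the default tree
  btrfs restore -v -r 842 /dev/loop0 /mnt/rescue/snap-842/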

Setting aside the tree integrity issue, does anyone know what
condition causes that "offset is <number>" output during a restore
operation? I've seen several wiki pages and posts about how to use the
restore feature, but nothing that details what's going on under the
hood or how to interpret the output. If it's not complaining about the
tree, and there are none of those "offset" errors, is it safe to
assume that the entire snapshot was recovered successfully?
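For now the best I've managed is to capture everything from a dry run and
stare at it, roughly like this (paths again placeholders):

  # dry run: list what would be recovered without writing anything, and keep
  # the full output (tree errors, "offset is ..." lines) for later review
  btrfs restore -v -D -s /dev/loop0 /mnt/rescue/ 2>&1 | tee restore-dryrun.log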

> But it seems to me that in any case sda6 needs to be reformatted; the only
> question is whether you reformat it as Btrfs or something else (honestly -
> it doesn't matter whether it was the chicken soup that made you sick, the
> association is understandable).

Yep. I'll probably stick with Btrfs and just try to improve my backup
strategy. I think I need to start a) replicating my snapshots to
another file system and b) more aggressively snapshotting @home. I've
been avoiding that because so many big files come and go there that
the snapshots would take up a lot of space. But maybe a
single rolling snapshot hourly or daily would make this kind of issue
easier to deal with.
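Roughly what I'm picturing, with the names and mount points obviously
made up:

  # hourly (or daily) read-only snapshot of @home, e.g. from cron
  btrfs subvolume snapshot -r /home /.snapshots/home-$(date +%Y%m%d-%H)

  # replicate the newest snapshot to a separate Btrfs filesystem
  btrfs send -p /.snapshots/home-PREVIOUS /.snapshots/home-NEWEST | \
      btrfs receive /mnt/backupdisk/snapshots/

  # delete the oldest local snapshot so the rolling window stays small
  btrfs subvolume delete /.snapshots/home-OLDEST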

> I'm very interested in how this turns out. Is it a Btrfs bug, is it a
> hardware bug, is it some bad/unlucky collision of more than one problem?
> Is it possible metadata DUP might have changed the outcome? If so, are we
> better off in the short term using metadata DUP on SSDs? Or is the
> hindsight takeaway "metadata DUP isn't worth it on SSD, better to take
> more frequent backups than what was done", and you just have to weigh
> whether that's practical for your workflow.

I'll definitely report back here if #btrfs provides any insight. And
anyone else who comes across this and has suggestions, please, chime
in!
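
On the metadata DUP question above: when I rebuild sda6 I'll at least set
the profile explicitly, which as I understand it looks something like this
(not claiming it would have saved me here):

  # duplicate metadata even on a single SSD at mkfs time
  mkfs.btrfs -m dup -d single /dev/sda6

  # or convert an existing filesystem's metadata profile later
  btrfs balance start -mconvert=dup /mnt
  btrfs filesystem df /mnt    # should now show "Metadata, DUP"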