On 2016-04-08 14:30, Chris Murphy wrote:
On Fri, Apr 8, 2016 at 12:18 PM, Austin S. Hemmelgarn
<ahferro...@gmail.com> wrote:
On 2016-04-08 14:05, Chris Murphy wrote:
On Fri, Apr 8, 2016 at 5:29 AM, Austin S. Hemmelgarn
<ahferro...@gmail.com> wrote:
I entirely agree. If the fix doesn't require any kind of decision to be
made other than whether to fix it or not, it should be trivially fixable
with the tools. TBH though, this particular issue with devices
disappearing and reappearing could be fixed more easily in the block
layer (at least, there are things that need to be fixed WRT it in the
block layer).
Another feature needed for transient failures with large storage is
some kind of partial scrub, along the lines of md's partial resync when
there's a write-intent bitmap.
In this case, I would think the simplest way to do this would be to have
scrub check if generation matches and not further verify anything that does
(I think we might be able to prune anything below objects whose generation
matches, but I'm not 100% certain about how writes cascade up the trees). I
hadn't really thought about this before, but now that I do, it kind of
surprises me that we don't have something to do this.
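A minimal sketch of the pruning idea, as a toy Python model rather than
real Btrfs code. The Node class, blocks_to_scrub(), and last_seen_gen
(the last generation the returning device is known to have seen) are all
hypothetical names. It also assumes the property I said I'm not certain
about: that a parent's generation is never older than anything below it.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """Toy stand-in for a tree block; not a real Btrfs structure."""
    name: str
    generation: int
    children: List["Node"] = field(default_factory=list)


def blocks_to_scrub(node: Node, last_seen_gen: int) -> Iterator[Node]:
    """Yield only blocks written after the device's last seen generation.

    ASSUMPTION: a parent's generation is >= that of every block below it,
    so a subtree whose top block is not newer than last_seen_gen can be
    skipped without looking at its children.
    """
    if node.generation <= last_seen_gen:
        return  # prune: nothing below here changed while the device was gone
    yield node
    for child in node.children:
        yield from blocks_to_scrub(child, last_seen_gen)


# Device dropped out after generation 10; filesystem is now at generation 12.
root = Node("root", 12, [
    Node("a", 9, [Node("a1", 7), Node("a2", 9)]),
    Node("b", 12, [Node("b1", 10), Node("b2", 12)]),
])
stale = [n.name for n in blocks_to_scrub(root, last_seen_gen=10)]
print(stale)  # ['root', 'b', 'b2']
```

In this toy tree the whole "a" subtree and "b1" are skipped, which is
where the savings over a full scrub would come from if the assumption
holds.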
And I need to better qualify this: this scrub (or balance) needs to be
initiated automatically, perhaps with some reasonable delay after the
block layer informs Btrfs that the missing device has reappeared. Both
the requirement for a full scrub and the need to start it manually are
pretty big gotchas.
We would still ideally want some way to initiate it manually because:
1. It would make it easier to test.
2. We should have a way to do it on filesystems that have been
reassembled after a reboot, not just ones that got the device back in
the same boot (or it was missing on boot and then appeared).