On Mon, Oct 16, 2017 at 01:27:40PM -0400, Austin S. Hemmelgarn wrote:
> On 2017-10-16 12:57, Zoltan wrote:
> > On Mon, Oct 16, 2017 at 1:53 PM, Austin S. Hemmelgarn wrote:
> In an ideal situation, scrubbing should not be an 'only if needed' thing,
> even for a regular array that isn't dealing with USB issues. From a
> practical perspective, there's no way to know for certain if a scrub is
> needed short of reading every single file in the filesystem in its
> entirety, at which point, you're just better off running a scrub (because if
> you _do_ need to scrub, you'll end up reading everything twice).

> [...]  There are three things to deal with here:
> 1. Latent data corruption caused either by bit rot, or by a half-write (that
> is, one copy got written successfully, then the other device disappeared
> _before_ the other copy got written).
> 2. Single chunks generated when the array is degraded.
> 3. Half-raid1 chunks generated by newer kernels when the array is degraded.

Note that any of the above other than bit rot affects only very recent data. 
If we keep a record of the last known-good generation, all of that can be
enumerated, allowing a selective scrub that checks only a small part of the
disk.  A linear read of an 8TB disk takes about 14 hours...
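The 14-hour figure is easy to sanity-check; the ~160 MB/s sustained linear
read speed assumed below is my own ballpark for a modern 8TB spinning disk,
not a measured number:

```shell
# Back-of-envelope: hours for one full linear pass over an 8TB disk,
# assuming ~160 MB/s sustained read throughput.
awk 'BEGIN { printf "%.1f\n", 8e12 / 160e6 / 3600 }'
```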

If we ever get auto-recovery, this is a fine candidate.

> Scrub will fix problem 1 because that's what it's designed to fix.  It will
> also fix problem 3, since that behaves just like problem 1 from a
> higher-level perspective.  It won't fix problem 2 though, as it doesn't look
> at chunk types (only if the data in the chunk doesn't have the correct
> number of valid copies).

Here not even tracking generations is required: a soft-convert balance
touches only bad chunks.  Again, this would work well for auto-recovery, as
it's a no-op if all is well.
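For reference, the soft-convert balance for a raid1 array looks something
like the command below; the mount point is a placeholder, and since it needs
root and a real btrfs filesystem, the sketch only builds the command string:

```shell
# Soft-conversion balance sketch for a raid1 array mounted at /mnt
# (hypothetical mount point).  The ',soft' filter makes balance skip chunks
# that already have the target profile, so on a healthy array this is close
# to a no-op; it only rewrites single/half-raid1 chunks left over from a
# degraded period.
cmd="btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /mnt"
echo "$cmd"
```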

> In contrast, the balance command you quoted won't fix issue 1 (because it
> doesn't validate checksums or check that data has the right number of
> copies), or issue 3 (because it's been told to only operate on non-raid1
> chunks), but it will fix issue 2.
> 
> In comparison to both of the above, a full balance without filters will fix
> all three issues, although it will do so less efficiently (in terms of both
> time and disk usage) than running a soft-conversion balance followed by a
> scrub.

"less efficiently" is an understatement.  Scrub gets a good fraction of the
theoretical linear speed, while I just had a single metadata block take
14428 seconds to balance.

> In the case of normal usage, device disconnects are rare, so you should
> generally be more worried about latent data corruption.

Yeah, but certain setups (like anything USB) get disconnects quite often. 
It would be nice to handle them right.  MD, thanks to its write-intent
bitmap, can recover almost instantly; btrfs could do even better -- but the
code to do so isn't written yet.

> monitor the kernel log to watch for device disconnects, remount the
> filesystem when the device reconnects, and then run the balance command
> followed by a scrub.  With most hardware I've seen, USB disconnects tend to
> be relatively frequent unless you're using very high quality cabling and
> peripheral devices.  If, however, they happen less than once a day most of
> the time, just set up the log monitor to remount, and set the balance and
> scrub commands on the schedule I suggested above for normal usage.

A day-long recovery for an event that happens daily isn't a particularly
enticing prospect.

-- 
⢀⣴⠾⠻⢶⣦⠀ Meow!
⣾⠁⢰⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋⠀ I was born a dumb, ugly and work-loving kid, then I got swapped on
⠈⠳⣄⠀⠀⠀⠀ the maternity ward.