On 2017-10-18 07:59, Adam Borowski wrote:
> On Wed, Oct 18, 2017 at 07:30:55AM -0400, Austin S. Hemmelgarn wrote:
>> On 2017-10-17 16:21, Adam Borowski wrote:
>>> It's a single-device filesystem, thus disconnects are obviously fatal.
>>> But they never caused even a single bit of damage (as far as scrub can
>>> tell), proving that btrfs handles this kind of disconnect well.  Unlike
>>> times past, the kernel doesn't get confused, so no reboot is needed --
>>> merely an unmount, "service nbd-client restart", mount, and a restart of
>>> the rebuild jobs.
>> That's expected behavior, though.  _Single_ device BTRFS has nothing to
>> get out of sync most of the time; the only time there's any possibility
>> of an issue is if the system dies after writing the first copy of a block
>> that's in a DUP-profile chunk, but even that is not very likely to cause
>> problems (you'll just lose at most the last <commit-time> worth of data).

> How come?  In a DUP profile, the writes are: chunk 1, chunk 2, barrier,
> superblock.  The two prior writes may be arbitrarily reordered -- with
> respect to each other, or even among individual sectors inside the chunks
> -- but unless the disk lies about barriers, there's no way to have any
> corruption, so running scrub is not needed.
If the device dies after writing chunk 1 but before the barrier, you end up
needing scrub.  How much of a failure window is present is largely a
function of how fast the device is, but there is a failure window there.
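
To pin down what each of us means, here's a rough userspace sketch of that
write sequence (not btrfs code: fsync() stands in for the flush/FUA barrier
the kernel actually issues, and the file name, offsets, and sizes are made
up for illustration):

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define COPY1_OFF  (1024 * 1024)       /* first copy of the block (DUP) */
#define COPY2_OFF  (2 * 1024 * 1024)   /* second copy                   */
#define SUPER_OFF  (64 * 1024)         /* superblock                    */

int main(void)
{
    char block[4096], super[4096];
    int fd = open("disk.img", O_RDWR | O_CREAT, 0644);

    if (fd < 0)
        return 1;
    memset(block, 0xab, sizeof(block));
    memset(super, 0xcd, sizeof(super));

    /* The device may reorder these two freely...                       */
    pwrite(fd, block, sizeof(block), COPY1_OFF);
    pwrite(fd, block, sizeof(block), COPY2_OFF);
    /* ...and a power cut anywhere before the next line is the window
     * I'm talking about: copy 1 may be on media while copy 2 is not.   */

    fsync(fd);                     /* barrier: both copies durable      */

    pwrite(fd, super, sizeof(super), SUPER_OFF);
    fsync(fd);                     /* superblock now references them    */

    close(fd);
    return 0;
}

The disagreement is over whether a cut between the two pwrite()s and the
first fsync() leaves anything a scrub would have to fix.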

> CoW is there to ensure there is _no_ failure window.  The new content
> doesn't matter until there are live pointers to it -- from the filesystem's
> point of view we merely scribbled something on an unused part of the block
> device.  Only after all the pieces are in place (as ensured by the barrier)
> is the superblock updated with a reference to the new metadata->data chain.
Even with CoW there _IS_ a failure window.  At a bare minimum, when updating
the root of the tree, which has multiple copies, you have a failure window.
This window could admittedly be reduced significantly for multi-device
setups if we actually parallelized writes properly, but it would still be
there.
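
To make that concrete, here's a toy model of the commit ordering (again not
btrfs code; the two superblock copies and the handful of never-overwritten
tree blocks are just stand-ins).  Every cut point leaves a tree you can read
consistently, but between the first and second superblock write the two
copies disagree -- which is the window I mean:

#include <stdio.h>

struct super { int generation; int root_ptr; };

struct disk {
    struct super sb[2];   /* two in-place superblock copies            */
    int tree[16];         /* CoW'd blocks, never overwritten in place  */
};

/* Run one commit, but stop after 'steps' writes to simulate a power cut. */
static void commit(struct disk *d, int new_root, int value, int steps)
{
    int s = 0;

    if (s++ < steps)                     /* new CoW'd blocks first      */
        d->tree[new_root] = value;
    /* the barrier goes here: new blocks are durable before any
     * superblock copy is allowed to point at them                      */
    if (s++ < steps)                     /* first superblock copy       */
        d->sb[0] = (struct super){ d->sb[0].generation + 1, new_root };
    if (s++ < steps)                     /* second superblock copy      */
        d->sb[1] = d->sb[0];
}

int main(void)
{
    for (int steps = 0; steps <= 3; steps++) {
        struct disk d = { .sb = { { 1, 0 }, { 1, 0 } }, .tree = { 100 } };

        commit(&d, 1, 200, steps);
        printf("cut after %d writes: root data %d, super copies %s\n",
               steps, d.tree[d.sb[0].root_ptr],
               d.sb[0].generation == d.sb[1].generation
                   ? "agree" : "DISAGREE");
    }
    return 0;
}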

> Thus, no matter when a disconnect happens, after a crash you get either the
> uncorrupted old version or the uncorrupted new version.

> No scrub is ever needed for this reason on a single device, or on RAID1
> that didn't run degraded.
The whole conversation started regarding a RAID1 array that's functionally guaranteed to run degraded on a regular basis.
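
And that is the case where the copies really do diverge: while one device is
absent, writes land only on the surviving mirror, so once the missing device
comes back its copy is stale until a scrub rewrites it.  Roughly (same
disclaimer, a toy model rather than btrfs code):

#include <stdbool.h>
#include <stdio.h>

int main(void)
{
    int mirror[2] = { 1, 1 };          /* both copies at generation 1  */
    bool online[2] = { true, true };

    online[1] = false;                 /* the nbd device drops off     */
    for (int dev = 0; dev < 2; dev++)  /* a write while degraded       */
        if (online[dev])
            mirror[dev]++;

    online[1] = true;                  /* device reconnects            */
    printf("after rejoin: copies %s\n",
           mirror[0] == mirror[1] ? "agree" : "differ, scrub needed");

    for (int dev = 0; dev < 2; dev++)  /* scrub: rewrite the stale copy */
        if (mirror[dev] < mirror[0])
            mirror[dev] = mirror[0];
    printf("after scrub:  copies %s\n",
           mirror[0] == mirror[1] ? "agree" : "differ");
    return 0;
}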