On Mon, Jan 28, 2019 at 03:23:28PM +0000, Supercilious Dude wrote:
> On Mon, 28 Jan 2019 at 01:18, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
> >
> > So for current upstream kernel, there should be no major problem despite
> > write hole.
> 
> 
> Can you please elaborate on the implications of the write-hole? Does
> it mean that the transaction currently in-flight might be lost but the
> filesystem is otherwise intact?

No, losing the in-flight transaction is normal operation of every modern
filesystem -- in fact, you _want_ the transaction to be lost instead of
partially torn.

The write hole means corruption of a random _old_ piece of data.

It can be fatal (i.e., lead to data loss) if two errors happen together:
* the stripe is degraded
* there's an unexpected crash/power loss

Every RAID implementation (not just btrfs) suffers from the write hole
unless some special, costly precaution is taken.  Those include
journaling, plug extents, and varying-width stripes (ZFS: RAIDZ).  The first
two effectively require writing small writes twice; the last degrades small
writes to RAID1 in terms of disk capacity used.

The write hole affects only writes that neighbour some old (i.e., not from
the current transaction) data in the same stripe -- as long as everything in
a single stripe belongs to no more than one transaction, all is fine.
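
To make the failure sequence concrete, here is a toy sketch of a three-device
stripe (two data slots plus XOR parity).  It is not how btrfs or any real
RAID5 lays things out; the block contents and the names xor_blocks and
simulate_write_hole are made up for illustration.  A new transaction writes
into the slot next to old data, the crash hits before parity is updated, and
then the device holding the old data is lost.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Bytewise XOR of equally sized blocks (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def simulate_write_hole():
    """Return (old data, what gets rebuilt for it after crash + device loss)."""
    # Stripe state before the new transaction: slot A holds old, committed
    # data; slot B is still unused; parity is consistent with both.
    old_a = b"AAAA"
    empty_b = bytes(4)
    parity = xor_blocks(old_a, empty_b)

    # While parity is consistent, a lost slot A can be rebuilt correctly.
    assert xor_blocks(empty_b, parity) == old_a

    # New transaction writes into slot B.  The crash happens after the data
    # block reaches the disk but before the parity block is updated.
    disk_b = b"bbbb"
    disk_parity = parity  # stale: still describes the old contents of B

    # The stripe is (or becomes) degraded: the device holding A is gone, so A
    # has to be reconstructed from B and the stale parity.
    rebuilt_a = xor_blocks(disk_b, disk_parity)
    return old_a, rebuilt_a


if __name__ == "__main__":
    old_a, rebuilt_a = simulate_write_hole()
    print("old data:", old_a)
    print("rebuilt :", rebuilt_a)
    print("old data corrupted:", rebuilt_a != old_a)
```

Note that the block that ends up damaged is the old one, which the crashed
transaction never touched.  With either failure alone (crash without a lost
device, or a lost device with consistent parity) the old data is still
recoverable.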

> How does it interact with data and metadata being stored with a different
> profile (one with write hole and one without)?

If there's an unrecoverable error due to the write hole, you lose a single
stripe's worth.  For data, this means a single piece of a file is beyond
repair.  For metadata, you lose a potentially large swath of the filesystem
-- and as tree nodes close to the root get rewritten the most, a total
filesystem loss is pretty likely.  To make things worse, while data writes
are mostly linear (for small files, btrfs batches writes from the same
transaction), metadata is strewn all around, mixing pieces of different
importance and different age.  RAID5 (all implementations) is also very slow
for random writes (such as btrfs metadata), thus you really want RAID1
metadata both for safety and performance.  With metadata being only around
1-2% of disk space, the only upside of RAID5 (better use of capacity) doesn't
really matter.
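
A rough back-of-the-envelope sketch of that capacity argument, assuming
equal-sized devices, a raw-space cost of n/(n-1) per logical byte for RAID5
and 2 for RAID1, and a 2% metadata share.  The function name and the
four-device example are illustrative assumptions, not measurements.

```python
def raw_per_logical(n_disks, metadata_fraction, metadata_profile):
    """Raw bytes consumed per logical byte, with data always on RAID5.

    metadata_profile is "raid5" or "raid1" (simplified cost model).
    """
    raid5_cost = n_disks / (n_disks - 1)  # one device's worth goes to parity
    raid1_cost = 2.0                      # every block is stored twice
    meta_cost = raid1_cost if metadata_profile == "raid1" else raid5_cost
    return (1 - metadata_fraction) * raid5_cost + metadata_fraction * meta_cost


if __name__ == "__main__":
    all_raid5 = raw_per_logical(4, 0.02, "raid5")
    mixed = raw_per_logical(4, 0.02, "raid1")
    print(f"all RAID5:          {all_raid5:.4f} raw bytes per logical byte")
    print(f"RAID5 data + RAID1: {mixed:.4f} raw bytes per logical byte")
    print(f"extra raw space:    {mixed - all_raid5:.4f}")
```

Under these assumptions the mixed layout costs on the order of one percent
more raw space than putting everything on RAID5.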

I.e., RAID1 is a clear winner for btrfs metadata; mixing profiles for data
vs metadata is safe.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Remember, the S in "IoT" stands for Security, while P stands
⢿⡄⠘⠷⠚⠋⠀ for Privacy.
⠈⠳⣄⠀⠀⠀⠀
