On 2018-07-14 02:46, David Sterba wrote:
> Hi,
>
> I have some goodies that go into the RAID56 problem, although not
> implementing all the remaining features, it can be useful independently.
>
> This time my hackweek project
>
> https://hackweek.suse.com/17/projects/do-something-about-btrfs-and-raid56
>
> aimed to implement the fix for the write hole problem but I spent more
> time with analysis and design of the solution and don't have a working
> prototype for that yet.
>
> This patchset brings a feature that will be used by the raid56 log, the
> log has to be on the same redundancy level and thus we need a 3-copy
> replication for raid6. As it was easy to extend to higher replication,
> I've added a 4-copy replication, that would allow triple copy raid (that
> does not have a standardized name).
So will this special level be used only by RAID56 for now, or will it also be available for metadata, just like the current RAID1?

If the latter, the metadata scrub problem needs more consideration.

With more RAID1 copies, there is a higher chance of a scrub running while one or two devices are missing. For metadata scrub, the inline csum can't guarantee a copy is the latest one, so such a RAID1 scrub needs to read out all copies and compare their generations to find the correct one.

At least from the changeset, it doesn't look like this is addressed yet.

This also reminds me that the current scrub is not as flexible as balance. I would really like to be able to filter block groups to scrub, just as balance does, and to scrub on a per-block-group basis rather than a per-devid basis. That is, for a block group scrub we don't really care which device we're scrubbing; we just need to ensure every device in the block group stores correct data.

Thanks,
Qu

> The number of copies is fixed, so it's not N-copy for an arbitrary N.
> This would complicate the implementation too much, though I'd be willing
> to add a 5-copy replication for a small bribe.
>
> The new raid profiles and covered by an incompatibility bit, called
> extended_raid, the (idealistic) plan is to stuff as many new
> raid-related features as possible. The patch 4/4 mentions the 3- 4- copy
> raid1, configurable stripe length, write hole log and triple parity.
> If the plan turns out to be too ambitious, the ready and implemented
> features will be split and merged.
>
> An interesting question is the naming of the extended profiles. I picked
> something that can be easily understood but it's not a final proposal.
> Years ago, Hugo proposed a naming scheme that described the
> non-standard raid varieties of the btrfs flavor:
>
> https://marc.info/?l=linux-btrfs&m=136286324417767
>
> Switching to this naming would be a good addition to the extended raid.
>
> Regarding the missing raid56 features, I'll continue working on them as
> time permits in the following weeks/months, as I'm not aware of anybody
> working on that actively enough so to speak.
>
> Anyway, git branches with the patches:
>
>   kernel: git://github.com/kdave/btrfs-devel dev/extended-raid-ncopies
>   progs:  git://github.com/kdave/btrfs-progs dev/extended-raid-ncopies
>
> David Sterba (4):
>   btrfs: refactor block group replication factor calculation to a helper
>   btrfs: add support for 3-copy replication (raid1c3)
>   btrfs: add support for 4-copy replication (raid1c4)
>   btrfs: add incompatibility bit for extended raid features
>
>  fs/btrfs/ctree.h                |  1 +
>  fs/btrfs/extent-tree.c          | 45 +++++++-----------
>  fs/btrfs/relocation.c           |  1 +
>  fs/btrfs/scrub.c                |  4 +-
>  fs/btrfs/super.c                | 17 +++----
>  fs/btrfs/sysfs.c                |  2 +
>  fs/btrfs/volumes.c              | 84 ++++++++++++++++++++++++++++++---
>  fs/btrfs/volumes.h              |  6 +++
>  include/uapi/linux/btrfs.h      | 12 ++++-
>  include/uapi/linux/btrfs_tree.h |  6 +++
>  10 files changed, 134 insertions(+), 44 deletions(-)
>