WARNING in clone_finish_inode_update while de-duplicating with bedup
Hello,

I get the following WARNING while de-duplicating with bedup 0.10.1. I am running Debian with a backports kernel:

    Linux lithium 4.17.0-0.bpo.1-amd64 #1 SMP Debian 4.17.8-1~bpo9+1 (2018-07-23) x86_64 GNU/Linux

I ran:

    sudo bedup scan /media/btrfs/
    sudo bedup dedupe /media/btrfs/

/media/btrfs has 4 drives in btrfs raid1.

Aug 19 03:32:39 lithium kernel: BTRFS: Transaction aborted (error -28)
Aug 19 03:32:39 lithium kernel: WARNING: CPU: 2 PID: 4204 at /build/linux-hvYKKE/linux-4.17.8/fs/btrfs/ioctl.c:3249 clone_finish_inode_update+0xf3/0x140 [btrfs]
Aug 19 03:32:39 lithium kernel: Modules linked in: fuse ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs dm_mod xt_multiport iptable_filter iTCO_wdt iTCO_vendor_support ppdev evdev intel_powerclamp squashfs ir_rc6_decoder pcspkr serio_raw sg rc_rc6_mce lpc_ich shpchp fintek_cir parport_pc rc_core parport video button f71882fg lm78 hwmon_vid coretemp nfsd auth_rpcgss nfs_acl loop lockd grace sunrpc ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 fscrypto ecb crypto_simd cryptd glue_helper aes_x86_64 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq libcrc32c crc32c_generic sd_mod ahci libahci libata i2c_i801 psmouse scsi_mod uhci_hcd ehci_pci ehci_hcd e1000e usbcore usb_common thermal
Aug 19 03:32:39 lithium kernel: CPU: 2 PID: 4204 Comm: bedup Not tainted 4.17.0-0.bpo.1-amd64 #1 Debian 4.17.8-1~bpo9+1
Aug 19 03:32:39 lithium kernel: Hardware name: /, BIOS 4.6.5 12/11/2012
Aug 19 03:32:39 lithium kernel: RIP: 0010:clone_finish_inode_update+0xf3/0x140 [btrfs]
Aug 19 03:32:39 lithium kernel: RSP: 0018:a2e44931fc38 EFLAGS: 00010282
Aug 19 03:32:39 lithium kernel: RAX: RBX: ffe4 RCX: 0006
Aug 19 03:32:39 lithium kernel: RDX: 0007 RSI: 0086 RDI: 93571fd16730
Aug 19 03:32:39 lithium kernel: RBP: a2e44931fc68 R08: 0001 R09: 0492
Aug 19 03:32:39 lithium kernel: R10: 935687ff6ea8 R11: 0492 R12: 93561add38f0
Aug 19 03:32:39 lithium kernel: R13: 0438 R14: 93552159d288 R15: 935603ef4a10
Aug 19 03:32:39 lithium kernel: FS: 7fb3dffe6700() GS:93571fd0() knlGS:
Aug 19 03:32:39 lithium kernel: CS: 0010 DS: ES: CR0: 80050033
Aug 19 03:32:39 lithium kernel: CR2: 7fb3d87d5024 CR3: 000170bcc000 CR4: 06e0
Aug 19 03:32:39 lithium kernel: Call Trace:
Aug 19 03:32:39 lithium kernel:  btrfs_clone+0x938/0x10e0 [btrfs]
Aug 19 03:32:39 lithium kernel:  btrfs_clone_files+0x16f/0x370 [btrfs]
Aug 19 03:32:39 lithium kernel:  vfs_clone_file_range+0x120/0x200
Aug 19 03:32:39 lithium kernel:  ioctl_file_clone+0x9f/0x100
Aug 19 03:32:39 lithium kernel:  ? __vma_rb_erase+0x11a/0x230
Aug 19 03:32:39 lithium kernel:  do_vfs_ioctl+0x341/0x620
Aug 19 03:32:39 lithium kernel:  ? do_munmap+0x34a/0x460
Aug 19 03:32:39 lithium kernel:  ksys_ioctl+0x70/0x80
Aug 19 03:32:39 lithium kernel:  __x64_sys_ioctl+0x16/0x20
Aug 19 03:32:39 lithium kernel:  do_syscall_64+0x55/0x110
Aug 19 03:32:39 lithium kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Aug 19 03:32:39 lithium kernel: RIP: 0033:0x7fb3deda9dd7
Aug 19 03:32:39 lithium kernel: RSP: 002b:7fffd9e39e48 EFLAGS: 0246 ORIG_RAX: 0010
Aug 19 03:32:39 lithium kernel: RAX: ffda RBX: 40049409 RCX: 7fb3deda9dd7
Aug 19 03:32:39 lithium kernel: RDX: 0014 RSI: 40049409 RDI: 0016
Aug 19 03:32:39 lithium kernel: RBP: 557b716730e0 R08: R09: 7fffd9e39c20
Aug 19 03:32:39 lithium kernel: R10: 0100 R11: 0246 R12: 7fffd9e39e70
Aug 19 03:32:39 lithium kernel: R13: 557b7189f4c0 R14: 0016 R15: 0001
Aug 19 03:32:39 lithium kernel: Code: 89 c7 e9 67 ff ff ff 49 8b 44 24 50 f0 48 0f ba a8 30 17 00 00 02 72 15 83 fb fb 74 3b 89 de 48 c7 c7 78 1f 70 c0 e8 3d a5 5d f9 <0f> 0b 89 d9 4c 89 e7 ba b1 0c 00 00 48 c7 c6 50 62 6f c0 e8 cf
Aug 19 03:32:39 lithium kernel: ---[ end trace d8e04102b2b7c95a ]---
Aug 19 03:32:39 lithium kernel: BTRFS: error (device sda2) in clone_finish_inode_update:3249: errno=-28 No space left
Aug 19 03:32:39 lithium kernel: BTRFS info (device sda2): forced readonly
Aug 19 03:32:39 lithium kernel: BTRFS error (device sda2): pending csums is 275275776

Bedup crashed, I guess because the warning forced the filesystem readonly:

    Deduplicated:
    - '/media/btrfs/foo1/a/b/c/bar.mkv'
    - '/media/btrfs/foo2/a/b/c/d/e/bar.mkv'
    03:27:43 Size group 64/17378 (1344397706) sampled 158 hashed 150 freed 355814871095
    Traceback (most recent call last):
      File "/usr/local/lib/python3.5/dist-packages/bedup/tracking.py", line 609, in dedup_tracked1
        dedup_fileset(ds, fileset, fd_names, fd_inodes, size)
      File "/usr/local/lib/python3.5/dist-packages/bedup/tracking.py", line 632, in dedup_fileset
        deduped = clone_data(dest=dfd, src=sfd,
[PATCH v3] Btrfs: Check metadata redundancy on balance
From: Sam Tygier <samtyg...@yahoo.co.uk>
Date: Wed, 6 Jan 2016 08:46:12 +0000
Subject: [PATCH] Btrfs: Check metadata redundancy on balance

When converting a filesystem via balance check that metadata mode is at
least as redundant as the data mode. For example give warning when:
-dconvert=raid1 -mconvert=single

Signed-off-by: Sam Tygier <samtyg...@yahoo.co.uk>
---
v3:
  Use btrfs_warn()
  Mention profiles in message
v2:
  Use btrfs_get_num_tolerated_disk_barrier_failures()
---
 fs/btrfs/volumes.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a23399e..be91458 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3756,6 +3756,14 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
 		}
 	} while (read_seqretry(&fs_info->profiles_lock, seq));
 
+	if (btrfs_get_num_tolerated_disk_barrier_failures(bctl->meta.target) <
+	    btrfs_get_num_tolerated_disk_barrier_failures(bctl->data.target)) {
+		btrfs_warn(fs_info,
+			   "Warning: metadata profile %llu has lower redundancy "
+			   "than data profile %llu\n", bctl->meta.target,
+			   bctl->data.target);
+	}
+
 	if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
 		fs_info->num_tolerated_disk_barrier_failures = min(
 			btrfs_calc_num_tolerated_disk_barrier_failures(fs_info),
--
2.4.3
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] Btrfs: Check metadata redundancy on balance
Resending as previous comments did not need any changes.

Currently BTRFS allows you to make bad choices of data and metadata
levels. For example -d raid1 -m raid0 means you can only use half your
total disk space, but will lose everything if 1 disk fails. It should
give a warning in these cases.

This patch is a follow up to
[PATCH v2] btrfs-progs: check metadata redundancy
in order to cover the case of using balance to convert to such a set of
raid levels.

A simple example to hit this is to create a single device fs, which will
default to single:dup, then to add a second device and attempt to
convert to raid1 with the command

btrfs balance start -dconvert=raid1 /mnt

this will result in a filesystem with raid1:dup, which will not survive
the loss of one drive. I personally don't see why the tools should allow
this, but in the previous thread a warning was considered sufficient.

Changes in v2:
  Use btrfs_get_num_tolerated_disk_barrier_failures()

Signed-off-by: Sam Tygier <samtyg...@yahoo.co.uk>

From: Sam Tygier <samtyg...@yahoo.co.uk>
Date: Sat, 3 Oct 2015 16:43:48 +0100
Subject: [PATCH] Btrfs: Check metadata redundancy on balance

When converting a filesystem via balance check that metadata mode is at
least as redundant as the data mode. For example give warning when:
-dconvert=raid1 -mconvert=single
---
 fs/btrfs/volumes.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6fc73586..40247e9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3584,6 +3584,12 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
 		}
 	} while (read_seqretry(&fs_info->profiles_lock, seq));
 
+	if (btrfs_get_num_tolerated_disk_barrier_failures(bctl->meta.target) <
+	    btrfs_get_num_tolerated_disk_barrier_failures(bctl->data.target)) {
+		btrfs_info(fs_info,
+			   "Warning: metadata has lower redundancy than data\n");
+	}
+
 	if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
 		fs_info->num_tolerated_disk_barrier_failures = min(
 			btrfs_calc_num_tolerated_disk_barrier_failures(fs_info),
--
2.4.3
Re: [PATCH v2] Btrfs: Check metadata redundancy on balance
On 05/10/15 03:33, Anand Jain wrote:
>
> Sam,
>
> On 10/03/2015 11:50 PM, sam tygier wrote:
>> Currently BTRFS allows you to make bad choices of data and
>> metadata levels. For example -d raid1 -m raid0 means you can
>> only use half your total disk space, but will lose everything
>> if 1 disk fails. It should give a warning in these cases.
>
> Nice test case. however the way we calculate the impact of
> lost device would be per chunk, as in the upcoming patch-set.
>
> [PATCH 1/5] btrfs: Introduce a new function to check if all chunks a OK
> for degraded mount
>
> The above patch-set should catch the bug here. Would you be able to
> confirm if this patch is still needed? Or apply your patch on top of
> it?
>
> Thanks, Anand

If I understand the per-chunk work correctly, it is to handle the case
where, although there are not enough disks remaining to guarantee being
able to mount degraded, the arrangement of existing chunks happens to
allow it (e.g. all the single chunks happen to be on a surviving disk).
So while the example case in "[PATCH 0/5] Btrfs: Per-chunk degradable
check" can survive a 1 disk loss, the raid levels do not guarantee
survivability of a 1 disk loss after more data is written.

My patch is preventing combinations of raid levels that have poor
guarantees when losing disks, but waste disk space. For example
data=raid1 metadata=single, which wastes space by writing the data
twice, but would not guarantee survival of a 1 disk loss (even if the
per-chunk patches allow some 1 disk losses to survive) and could lose
everything if a bit flip happened in a critical metadata chunk.

So I think my patch is useful with or without the per-chunk work.

Thanks,
Sam
[PATCH v2] Btrfs: Check metadata redundancy on balance
Currently BTRFS allows you to make bad choices of data and metadata
levels. For example -d raid1 -m raid0 means you can only use half your
total disk space, but will lose everything if 1 disk fails. It should
give a warning in these cases.

This patch is a follow up to
[PATCH v2] btrfs-progs: check metadata redundancy
in order to cover the case of using balance to convert to such a set of
raid levels.

A simple example to hit this is to create a single device fs, which will
default to single:dup, then to add a second device and attempt to
convert to raid1 with the command

btrfs balance start -dconvert=raid1 /mnt

this will result in a filesystem with raid1:dup, which will not survive
the loss of one drive. I personally don't see why the tools should allow
this, but in the previous thread a warning was considered sufficient.

Changes in v2:
  Use btrfs_get_num_tolerated_disk_barrier_failures()

Signed-off-by: Sam Tygier <samtyg...@yahoo.co.uk>

From: Sam Tygier <samtyg...@yahoo.co.uk>
Date: Sat, 3 Oct 2015 16:43:48 +0100
Subject: [PATCH] Btrfs: Check metadata redundancy on balance

When converting a filesystem via balance check that metadata mode is at
least as redundant as the data mode. For example give warning when:
-dconvert=raid1 -mconvert=single
---
 fs/btrfs/volumes.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6fc73586..40247e9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3584,6 +3584,12 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
 		}
 	} while (read_seqretry(&fs_info->profiles_lock, seq));
 
+	if (btrfs_get_num_tolerated_disk_barrier_failures(bctl->meta.target) <
+	    btrfs_get_num_tolerated_disk_barrier_failures(bctl->data.target)) {
+		btrfs_info(fs_info,
+			   "Warning: metadata has lower redundancy than data\n");
+	}
+
 	if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
 		fs_info->num_tolerated_disk_barrier_failures = min(
 			btrfs_calc_num_tolerated_disk_barrier_failures(fs_info),
--
2.4.3
Re: [PATCH] Btrfs: Check metadata redundancy on balance
On 16/09/15 11:15, Zhao Lei wrote:
> Hi, sam tygier
>
>> -----Original Message-----
>> From: linux-btrfs-ow...@vger.kernel.org
>> [mailto:linux-btrfs-ow...@vger.kernel.org] On Behalf Of sam tygier
>> Sent: Wednesday, September 16, 2015 4:42 PM
>> To: linux-btrfs@vger.kernel.org
>> Subject: [PATCH] Btrfs: Check metadata redundancy on balance
>>
>> It was recommended that I resend after the merge window. No changes since
>> last version.
>>
>> Currently BTRFS allows you to make bad choices of data and metadata levels.
>> For example -d raid1 -m raid0 means you can only use half your total disk
>> space, but will lose everything if 1 disk fails. It should give a warning
>> in these cases.
>>
>> This patch is a follow up to
>> [PATCH v2] btrfs-progs: check metadata redundancy
>> in order to cover the case of using balance to convert to such a set of
>> raid levels.
>
> Can we check and show warning of balance operation in btrfs-progs,
> just like above patch?

I was not completely sure if this was better suited in btrfs-progs or
the kernel. The existing checks against reducing redundancy are on the
kernel side. Currently it looks like btrfs-progs does not look at the
current levels, and only passes the arguments through to the kernel. If
this was put on the btrfs-progs side, would it be an issue that some
other tool might call the kernel directly, and bypass the check?

>> A simple example to hit this is to create a single device fs, which will
>> default to single:dup, then to add a second device and attempt to convert
>> to raid1 with the command btrfs balance start -dconvert=raid1 /mnt this
>> will result in a filesystem with raid1:dup, which will not survive the
>> loss of one drive. I personally don't see why the tools should allow this,
>> but in the previous thread a warning was considered sufficient.
>>
>> Signed-off-by: Sam Tygier <samtyg...@yahoo.co.uk>
>>
>> From: Sam Tygier <samtyg...@yahoo.co.uk>
>> Date: Sat, 13 Jun 2015 18:13:06 +0100
>> Subject: [PATCH] Btrfs: Check metadata redundancy on balance
>>
>> When converting a filesystem via balance check that metadata mode is at
>> least as redundant as the data mode. For example give warning when:
>> -dconvert=raid1 -mconvert=single
>> ---
>>  fs/btrfs/volumes.c | 24 ++++++++++++++++++++++++
>>  1 file changed, 24 insertions(+)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index fbe7c10..a0ce1f7 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -3454,6 +3454,24 @@ static void __cancel_balance(struct btrfs_fs_info *fs_info)
>>  	atomic_set(&fs_info->mutually_exclusive_operation_running, 0);
>>  }
>>
>> +static int group_profile_max_safe_loss(u64 flag)
>> +{
>> +	switch (flag & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
>> +	case 0: /* single */
>> +	case BTRFS_BLOCK_GROUP_DUP:
>> +	case BTRFS_BLOCK_GROUP_RAID0:
>> +		return 0;
>> +	case BTRFS_BLOCK_GROUP_RAID1:
>> +	case BTRFS_BLOCK_GROUP_RAID5:
>> +	case BTRFS_BLOCK_GROUP_RAID10:
>> +		return 1;
>> +	case BTRFS_BLOCK_GROUP_RAID6:
>> +		return 2;
>> +	default:
>> +		return -1;
>> +	}
>> +}
>> +
>
> Maybe btrfs_get_num_tolerated_disk_barrier_failures()
> fits above request, better to use existing function if possible.

Sorry, I missed that. If it stays kernel side, I can switch to this. If
it moves to btrfs-progs I'll use the existing group_profile_max_safe_loss()

> Thanks
> Zhaolei

>> /*
>>  * Should be called with both balance and volume mutexes held
>>  */
>> @@ -3572,6 +3590,12 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
>>  	}
>>  } while (read_seqretry(&fs_info->profiles_lock, seq));
>>
>> +	if (group_profile_max_safe_loss(bctl->meta.target) <
>> +	    group_profile_max_safe_loss(bctl->data.target)) {
>> +		btrfs_info(fs_info,
>> +			   "Warning: metadata has lower redundancy than data\n");
>> +	}
>> +
>>  	if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
>>  		int num_tolerated_disk_barrier_failures;
>>  		u64 target = bctl->sys.target;
>> --
>> 2.4.3
[PATCH] Btrfs: Check metadata redundancy on balance
It was recommended that I resend after the merge window. No changes
since last version.

Currently BTRFS allows you to make bad choices of data and metadata
levels. For example -d raid1 -m raid0 means you can only use half your
total disk space, but will lose everything if 1 disk fails. It should
give a warning in these cases.

This patch is a follow up to
[PATCH v2] btrfs-progs: check metadata redundancy
in order to cover the case of using balance to convert to such a set of
raid levels.

A simple example to hit this is to create a single device fs, which will
default to single:dup, then to add a second device and attempt to
convert to raid1 with the command

btrfs balance start -dconvert=raid1 /mnt

this will result in a filesystem with raid1:dup, which will not survive
the loss of one drive. I personally don't see why the tools should allow
this, but in the previous thread a warning was considered sufficient.

Signed-off-by: Sam Tygier <samtyg...@yahoo.co.uk>

From: Sam Tygier <samtyg...@yahoo.co.uk>
Date: Sat, 13 Jun 2015 18:13:06 +0100
Subject: [PATCH] Btrfs: Check metadata redundancy on balance

When converting a filesystem via balance check that metadata mode is at
least as redundant as the data mode. For example give warning when:
-dconvert=raid1 -mconvert=single
---
 fs/btrfs/volumes.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index fbe7c10..a0ce1f7 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3454,6 +3454,24 @@ static void __cancel_balance(struct btrfs_fs_info *fs_info)
 	atomic_set(&fs_info->mutually_exclusive_operation_running, 0);
 }
 
+static int group_profile_max_safe_loss(u64 flag)
+{
+	switch (flag & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
+	case 0: /* single */
+	case BTRFS_BLOCK_GROUP_DUP:
+	case BTRFS_BLOCK_GROUP_RAID0:
+		return 0;
+	case BTRFS_BLOCK_GROUP_RAID1:
+	case BTRFS_BLOCK_GROUP_RAID5:
+	case BTRFS_BLOCK_GROUP_RAID10:
+		return 1;
+	case BTRFS_BLOCK_GROUP_RAID6:
+		return 2;
+	default:
+		return -1;
+	}
+}
+
 /*
  * Should be called with both balance and volume mutexes held
  */
@@ -3572,6 +3590,12 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
 	}
 	} while (read_seqretry(&fs_info->profiles_lock, seq));
 
+	if (group_profile_max_safe_loss(bctl->meta.target) <
+	    group_profile_max_safe_loss(bctl->data.target)) {
+		btrfs_info(fs_info,
+			   "Warning: metadata has lower redundancy than data\n");
+	}
+
 	if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
 		int num_tolerated_disk_barrier_failures;
 		u64 target = bctl->sys.target;
--
2.4.3
[PATCH] Btrfs: Check metadata redundancy on balance
Resending, as I received no comments on previous submission.

Currently BTRFS allows you to make bad choices of data and metadata
levels. For example -d raid1 -m raid0 means you can only use half your
total disk space, but will lose everything if 1 disk fails. It should
give a warning in these cases.

This patch is a follow up to
[PATCH v2] btrfs-progs: check metadata redundancy
in order to cover the case of using balance to convert to such a set of
raid levels.

A simple example to hit this is to create a single device fs, which will
default to single:dup, then to add a second device and attempt to
convert to raid1 with the command

btrfs balance start -dconvert=raid1 /mnt

this will result in a filesystem with raid1:dup, which will not survive
the loss of one drive. I personally don't see why the tools should allow
this, but in the previous thread a warning was considered sufficient.

Signed-off-by: Sam Tygier <samtyg...@yahoo.co.uk>

From: Sam Tygier <samtyg...@yahoo.co.uk>
Date: Sat, 13 Jun 2015 18:13:06 +0100
Subject: [PATCH] Btrfs: Check metadata redundancy on balance

When converting a filesystem via balance check that metadata mode is at
least as redundant as the data mode. For example give warning when:
-dconvert=raid1 -mconvert=single
---
 fs/btrfs/volumes.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index fbe7c10..a0ce1f7 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3454,6 +3454,24 @@ static void __cancel_balance(struct btrfs_fs_info *fs_info)
 	atomic_set(&fs_info->mutually_exclusive_operation_running, 0);
 }
 
+static int group_profile_max_safe_loss(u64 flag)
+{
+	switch (flag & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
+	case 0: /* single */
+	case BTRFS_BLOCK_GROUP_DUP:
+	case BTRFS_BLOCK_GROUP_RAID0:
+		return 0;
+	case BTRFS_BLOCK_GROUP_RAID1:
+	case BTRFS_BLOCK_GROUP_RAID5:
+	case BTRFS_BLOCK_GROUP_RAID10:
+		return 1;
+	case BTRFS_BLOCK_GROUP_RAID6:
+		return 2;
+	default:
+		return -1;
+	}
+}
+
 /*
  * Should be called with both balance and volume mutexes held
  */
@@ -3572,6 +3590,12 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
 	}
 	} while (read_seqretry(&fs_info->profiles_lock, seq));
 
+	if (group_profile_max_safe_loss(bctl->meta.target) <
+	    group_profile_max_safe_loss(bctl->data.target)) {
+		btrfs_info(fs_info,
+			   "Warning: metadata has lower redundancy than data\n");
+	}
+
 	if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
 		int num_tolerated_disk_barrier_failures;
 		u64 target = bctl->sys.target;
--
2.4.3
[PATCH] Check metadata redundancy on balance
Currently BTRFS allows you to make bad choices of data and metadata
levels. For example -d raid1 -m raid0 means you can only use half your
total disk space, but will lose everything if 1 disk fails. It should
give a warning in these cases.

This patch is a follow up to
[PATCH v2] btrfs-progs: check metadata redundancy
in order to cover the case of using balance to convert to such a set of
raid levels.

A simple example to hit this is to create a single device fs, which will
default to single:dup, then to add a second device and attempt to
convert to raid1 with the command

btrfs balance start -dconvert=raid1 /mnt

this will result in a filesystem with raid1:dup, which will not survive
the loss of one drive. I personally don't see why the tools should allow
this, but in the previous thread a warning was considered sufficient.

Signed-off-by: Sam Tygier <samtyg...@yahoo.co.uk>

From: Sam Tygier <samtyg...@yahoo.co.uk>
Date: Sat, 13 Jun 2015 18:13:06 +0100
Subject: [PATCH] Check metadata redundancy on balance

When converting a filesystem via balance check that metadata mode is at
least as redundant as the data mode. For example give warning when:
-dconvert=raid1 -mconvert=single
---
 fs/btrfs/volumes.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 174f5e1..875d608 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3336,6 +3336,24 @@ static void __cancel_balance(struct btrfs_fs_info *fs_info)
 	atomic_set(&fs_info->mutually_exclusive_operation_running, 0);
 }
 
+static int group_profile_max_safe_loss(u64 flag)
+{
+	switch (flag & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
+	case 0: /* single */
+	case BTRFS_BLOCK_GROUP_DUP:
+	case BTRFS_BLOCK_GROUP_RAID0:
+		return 0;
+	case BTRFS_BLOCK_GROUP_RAID1:
+	case BTRFS_BLOCK_GROUP_RAID5:
+	case BTRFS_BLOCK_GROUP_RAID10:
+		return 1;
+	case BTRFS_BLOCK_GROUP_RAID6:
+		return 2;
+	default:
+		return -1;
+	}
+}
+
 /*
  * Should be called with both balance and volume mutexes held
  */
@@ -3454,6 +3472,12 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
 	}
 	} while (read_seqretry(&fs_info->profiles_lock, seq));
 
+	if (group_profile_max_safe_loss(bctl->meta.target) <
+	    group_profile_max_safe_loss(bctl->data.target)) {
+		btrfs_info(fs_info,
+			   "Warning: metadata has lower redundancy than data\n");
+	}
+
 	if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
 		int num_tolerated_disk_barrier_failures;
 		u64 target = bctl->sys.target;
--
2.4.2
[PATCH v2] btrfs-progs: check metadata redundancy
Currently BTRFS allows you to make bad choices of data and metadata
levels. For example -d raid1 -m raid0 means you can only use half your
total disk space, but will lose everything if 1 disk fails. It should
give a warning in these cases.

When making a filesystem check that metadata mode is at least as
redundant as the data mode. For example give warning when:
-d raid1 -m raid0

V1 -> V2:
  Downgrade from error to warning as requested by David Sterba.

Signed-off-by: Sam Tygier <samtyg...@yahoo.co.uk>

From fdfcb5f733ff5ed48562366bda6f1a9c740b031a Mon Sep 17 00:00:00 2001
From: Sam Tygier <samtyg...@yahoo.co.uk>
Date: Sat, 30 May 2015 15:37:37 +0100
Subject: [PATCH] When making a filesystem check that metadata mode is at
 least as redundant as the data mode. For example give warning when:
 -d raid1 -m raid0
---
 mkfs.c  |  6 ++++++
 utils.c | 18 ++++++++++++++++++
 utils.h |  1 +
 3 files changed, 25 insertions(+)

diff --git a/mkfs.c b/mkfs.c
index 14e0fed..938840d 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1367,6 +1367,12 @@ int main(int ac, char **av)
 		exit(1);
 	}
 
+	if (group_profile_max_safe_loss(metadata_profile) <
+	    group_profile_max_safe_loss(data_profile)) {
+		fprintf(stderr,
+			"Warning: metadata has lower redundancy than data\n");
+	}
+
 	/* if we are here that means all devs are good to btrfsify */
 	printf("%s\n", PACKAGE_STRING);
 	printf("See %s for more information.\n\n", PACKAGE_URL);
diff --git a/utils.c b/utils.c
index 4b8a826..ba35b34 100644
--- a/utils.c
+++ b/utils.c
@@ -2354,6 +2354,24 @@ int test_num_disk_vs_raid(u64 metadata_profile, u64 data_profile,
 	return 0;
 }
 
+int group_profile_max_safe_loss(u64 flag)
+{
+	switch (flag & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
+	case 0: /* single */
+	case BTRFS_BLOCK_GROUP_DUP:
+	case BTRFS_BLOCK_GROUP_RAID0:
+		return 0;
+	case BTRFS_BLOCK_GROUP_RAID1:
+	case BTRFS_BLOCK_GROUP_RAID5:
+	case BTRFS_BLOCK_GROUP_RAID10:
+		return 1;
+	case BTRFS_BLOCK_GROUP_RAID6:
+		return 2;
+	default:
+		return -1;
+	}
+}
+
 /* Check if disk is suitable for btrfs
  * returns:
  *  1: something is wrong, estr provides the error
diff --git a/utils.h b/utils.h
index 5657c74..98fd812 100644
--- a/utils.h
+++ b/utils.h
@@ -144,6 +144,7 @@ int test_dev_for_mkfs(char *file, int force_overwrite, char *estr);
 int get_label_mounted(const char *mount_path, char *labelp);
 int test_num_disk_vs_raid(u64 metadata_profile, u64 data_profile,
 	u64 dev_cnt, int mixed, char *estr);
+int group_profile_max_safe_loss(u64 flag);
 int is_vol_small(char *file);
 int csum_tree_block(struct btrfs_root *root, struct extent_buffer *buf,
 	int verify);
--
2.1.4
Re: [PATCH 0/5] [RFC] RAID-level terminology change
On 10/03/13 15:43, Goffredo Baroncelli wrote:
> - DUP              -> dD (to allow more than 2 copies per disk)
> - RAID1            -> nC or *C
> - RAID0            -> mS or *S
> - RAID10           -> nCmS or *CmS or nC*s
> - RAID with parity -> mSpP or *SpP or mS*p (is it possible?)
> - single           -> 1C or 1D or 1S or single
>
> where d,n,m,p are integers; '*' is the literal '*' and means how many
> possible.

Using an asterisk '*' in something that will be used as a command line
argument risks having the shell expand it. Sticking to pure alphanumeric
names would be better.
Re: [PATCH 0/5] [RFC] RAID-level terminology change
On 09/03/13 20:31, Hugo Mills wrote:
> Some time ago, and occasionally since, we've discussed altering the
> RAID-n terminology to change it to an nCmSpP format, where n is the
> number of copies, m is the number of (data) devices in a stripe per
> copy, and p is the number of parity devices in a stripe.
>
> The current kernel implementation uses as many devices as it can in
> the striped modes (RAID-0, -10, -5, -6), and in this implementation,
> that is written as mS (with a literal m). The mS and pP sections are
> omitted if the value is 1S or 0P.
>
> The magic look-up table for old-style / new-style is:
>
> single   1C (or omitted, in btrfs fi df output)
> RAID-0   1CmS
> RAID-1   2C
> DUP      2CD
> RAID-10  2CmS
> RAID-5   1CmS1P
> RAID-6   1CmS2P

Are these the only valid options? Are 'sensible' new levels (e.g. 3C,
mirrored to 3 disks, or 1CmS3P, like raid6 but with 3 parity blocks)
allowed? Are any arbitrary levels allowed (some other comments in the
thread suggest no)? Will there be a recommended (or supported) set?
warnings for silly/pointless raid levels
Hi,

I recently got into a sticky situation because I had a btrfs volume with data in raid1 but metadata as dup. When I removed one of the drives I could not mount it degraded. This was my error, as I did not convert the metadata when I converted the volume from single/dup. But I wonder if there should be a warning for raid combinations that are probably mistakes. raid1 data with dup metadata provides no protection against a failed disk, but uses as much space as if it did, so maybe the tools could refuse to allow it without a --force.

sam
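A check along the lines suggested above could be done by parsing `btrfs filesystem df` output. This is a hypothetical sketch, not an existing tool; the sample output is hard-coded here for illustration, whereas on a real system you would capture `btrfs filesystem df /mountpoint`:

```shell
# Warn when data is RAID1 but metadata is still DUP: both metadata
# copies may sit on one device, so a disk failure loses the filesystem.
df_output='Data, RAID1: total=2.62TB, used=2.50TB
System, DUP: total=40.00MB, used=396.00KB
Metadata, DUP: total=112.00GB, used=3.84GB'

warning=""
if printf '%s\n' "$df_output" | grep -q '^Data, RAID1' && \
   printf '%s\n' "$df_output" | grep -q '^Metadata, DUP'; then
    warning="data is RAID1 but metadata is DUP: no protection against disk failure"
    echo "WARNING: $warning"
fi
```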
Re: problem replacing failing drive
On 25/10/12 22:37, Kyle Gates wrote:
> On 22/10/12 10:07, sam tygier wrote:
>> hi, I have a 2 drive btrfs raid set up. It was created first with a single drive, then adding a second and doing btrfs fi balance start -dconvert=raid1 /data. The original drive is showing smart errors, so I want to replace it. I don't easily have space in my desktop for an extra disk, so I decided to proceed by shutting down, taking out the old failing drive and putting in the new drive. This is similar to the description at https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Replacing_Failed_Devices (the other reason to try this is to simulate what would happen if a drive did completely fail).
>>
>> If I reconnect the failing drive then I can mount the filesystem with no errors; a quick glance suggests that the data is all there.
>>
>> Label: 'bdata' uuid: 1f07081c-316b-48be-af73-49e6f76535cc
>>   Total devices 2 FS bytes used 2.50TB
>>   devid 2 size 2.73TB used 2.73TB path /dev/sde1 -- this is the drive that I wish to remove
>>   devid 1 size 2.73TB used 2.73TB path /dev/sdd2
>>
>> sudo btrfs filesystem df /mnt
>> Data, RAID1: total=2.62TB, used=2.50TB
>> System, DUP: total=40.00MB, used=396.00KB
>> System: total=4.00MB, used=0.00
>> Metadata, DUP: total=112.00GB, used=3.84GB
>> Metadata: total=8.00MB, used=0.00
>>
>> is the failure to mount when I remove sde due to it being dup, rather than raid1?
>
> Yes, I would say so. Try a btrfs balance start -mconvert=raid1 /mnt so all metadata is on each drive.

Thanks, btrfs balance start -mconvert=raid1 /mnt did the trick. It gave "btrfs: 9 enospc errors during balance" errors the first few times I ran it, but got there in the end (a smaller number of errors each time). The volume is pretty full, so I'll forgive it (though is "Metadata, RAID1: total=111.84GB, used=3.83GB" a reasonable ratio?). I can now successfully remove the failed device and mount the filesystem in degraded mode. It seems like the system blocks get converted automatically.

I have added an example for how to do this at https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Adding_New_Devices

Thanks,
Sam
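The lesson from this thread is to add the new device while the old one is still present, rebalance so every profile is raid1, and only then remove the failing device. The helper below just prints that sequence; the function name, device paths, and mount point are placeholders for illustration, not commands run against a live filesystem:

```shell
# Print the drive-replacement sequence discussed in this thread.
# $1 = new device, $2 = failing device, $3 = mount point.
replace_drive_steps() {
    new_dev="$1"; old_dev="$2"; mnt="$3"
    printf 'btrfs device add %s %s\n' "$new_dev" "$mnt"
    printf 'btrfs balance start -dconvert=raid1 -mconvert=raid1 %s\n' "$mnt"
    printf 'btrfs device delete %s %s\n' "$old_dev" "$mnt"
}

replace_drive_steps /dev/sdf1 /dev/sde1 /mnt
```

`btrfs device delete` relocates the removed device's chunks onto the remaining drives, which is why the old disk should stay attached until it completes.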
Re: problem replacing failing drive
On 22/10/12 10:07, sam tygier wrote:
> hi,
>
> I have a 2 drive btrfs raid set up. It was created first with a single drive, and then adding a second and doing btrfs fi balance start -dconvert=raid1 /data. The original drive is showing smart errors so I want to replace it. I don't easily have space in my desktop for an extra disk, so I decided to proceed by shutting down, taking out the old failing drive and putting in the new drive. This is similar to the description at https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Replacing_Failed_Devices (the other reason to try this is to simulate what would happen if a drive did completely fail).
>
> So after swapping the drives and rebooting, I try to mount as degraded. I instantly get a kernel panic: http://www.hep.man.ac.uk/u/sam/pub/IMG_5397_crop.png So far all this has been with a 3.5 kernel, so I upgraded to 3.6.2 and tried to mount degraded again, first with just sudo mount /dev/sdd2 /mnt, then with sudo mount -o degraded /dev/sdd2 /mnt:
>
> [ 582.535689] device label bdata devid 1 transid 25342 /dev/sdd2
> [ 582.536196] btrfs: disk space caching is enabled
> [ 582.536602] btrfs: failed to read the system array on sdd2
> [ 582.536860] btrfs: open_ctree failed
> [ 606.784176] device label bdata devid 1 transid 25342 /dev/sdd2
> [ 606.784647] btrfs: allowing degraded mounts
> [ 606.784650] btrfs: disk space caching is enabled
> [ 606.785131] btrfs: failed to read chunk root on sdd2
> [ 606.785331] btrfs warning page private not zero on page 392922368
> [ 606.785408] btrfs: open_ctree failed
> [ 782.422959] device label bdata devid 1 transid 25342 /dev/sdd2
>
> No panic is good progress, but something is still not right. My options would seem to be:
>
> 1) reconnect the old drive (probably in a USB caddy), see if it mounts as if nothing ever happened, or possibly try to recover it back to a working raid1, then try again with adding the new drive first, then removing the old one.
> 2) give up experimenting, create a new btrfs raid1, and restore from backup.
>
> Both leave me with a worry about what would happen if a disk in a raid1 did die (unless it was the panic that did some damage that borked the filesystem).

Some more details. If I reconnect the failing drive then I can mount the filesystem with no errors; a quick glance suggests that the data is all there.

Label: 'bdata' uuid: 1f07081c-316b-48be-af73-49e6f76535cc
  Total devices 2 FS bytes used 2.50TB
  devid 2 size 2.73TB used 2.73TB path /dev/sde1 -- this is the drive that I wish to remove
  devid 1 size 2.73TB used 2.73TB path /dev/sdd2

sudo btrfs filesystem df /mnt
Data, RAID1: total=2.62TB, used=2.50TB
System, DUP: total=40.00MB, used=396.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=112.00GB, used=3.84GB
Metadata: total=8.00MB, used=0.00

Is the failure to mount when I remove sde due to it being dup, rather than raid1? Is adding a second drive to a btrfs filesystem and running btrfs fi balance start -dconvert=raid1 /mnt not sufficient to create an array that can survive the loss of a disk? Do I need -mconvert as well? Is there an -sconvert for system?

thanks

Sam
problem replacing failing drive
hi,

I have a 2 drive btrfs raid set up. It was created first with a single drive, and then adding a second and doing btrfs fi balance start -dconvert=raid1 /data. The original drive is showing smart errors so I want to replace it. I don't easily have space in my desktop for an extra disk, so I decided to proceed by shutting down, taking out the old failing drive and putting in the new drive. This is similar to the description at https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Replacing_Failed_Devices (the other reason to try this is to simulate what would happen if a drive did completely fail).

So after swapping the drives and rebooting, I try to mount as degraded. I instantly get a kernel panic: http://www.hep.man.ac.uk/u/sam/pub/IMG_5397_crop.png So far all this has been with a 3.5 kernel, so I upgraded to 3.6.2 and tried to mount degraded again, first with just sudo mount /dev/sdd2 /mnt, then with sudo mount -o degraded /dev/sdd2 /mnt:

[ 582.535689] device label bdata devid 1 transid 25342 /dev/sdd2
[ 582.536196] btrfs: disk space caching is enabled
[ 582.536602] btrfs: failed to read the system array on sdd2
[ 582.536860] btrfs: open_ctree failed
[ 606.784176] device label bdata devid 1 transid 25342 /dev/sdd2
[ 606.784647] btrfs: allowing degraded mounts
[ 606.784650] btrfs: disk space caching is enabled
[ 606.785131] btrfs: failed to read chunk root on sdd2
[ 606.785331] btrfs warning page private not zero on page 392922368
[ 606.785408] btrfs: open_ctree failed
[ 782.422959] device label bdata devid 1 transid 25342 /dev/sdd2

No panic is good progress, but something is still not right. My options would seem to be:

1) reconnect the old drive (probably in a USB caddy), see if it mounts as if nothing ever happened, or possibly try to recover it back to a working raid1, then try again with adding the new drive first, then removing the old one.
2) give up experimenting, create a new btrfs raid1, and restore from backup.

Both leave me with a worry about what would happen if a disk in a raid1 did die (unless it was the panic that did some damage that borked the filesystem).

thanks.

sam
Re: btrfs across a mix of SSDs HDDs
On 01/05/12 20:35, Martin wrote:
> The idea is to gain the random access speed of the SSDs but have the HDDs as backup in case the SSDs fail due to wear...

Have you looked at the bcache project? http://bcache.evilpiepirate.org/

sam