Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
On Mon, Jan 23, 2017 at 5:05 PM, Omar Sandoval wrote:

> Thanks! Hmm, okay, so it's coming from btrfs_update_delayed_inode()...
> That's probably us failing btrfs_lookup_inode(), but just to make sure,
> could you apply the updated diff at the same link as before
> (https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4)? If
> that's the case, I'm even more confused about what xattrs have to do
> with it.

[   35.015363] __btrfs_update_delayed_inode(): inode is missing
[   35.015372] btrfs_update_delayed_inode(ino=2) -> -2

osandov-9f223b_2-dmesg.log
https://drive.google.com/open?id=0B_2Asp8DGjJ9UnNSRXpualprWHM

--
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
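For readers following the log: the -2 above is a negated errno value, and 2 is ENOENT on Linux, which matches the "No such entry" text that btrfs prints for this error elsewhere in the thread. Below is a minimal userspace sketch of decoding such a return value, assuming the usual kernel convention of returning 0 or -errno. describe_kernel_ret() is a hypothetical helper for illustration, not a btrfs or libc function.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: turn a kernel-style return value (0 on success,
 * -errno on failure) into a short human-readable description.
 */
static const char *describe_kernel_ret(int ret)
{
	/* Kernel-style return values are 0 or a negated errno. */
	assert(ret <= 0);

	if (ret == 0)
		return "success";
	return strerror(-ret);
}

/* Print the decoded form of a logged return value such as -2. */
static void print_kernel_ret(int ret)
{
	printf("%d => %s\n", ret, describe_kernel_ret(ret));
}
```

Calling print_kernel_ret(-2) prints the libc description of ENOENT; the kernel log uses btrfs's own wording for the same code.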
[PATCH 2/2] btrfs-progs: Introduce macro to calculate backup superblock offset
Introduce a new macro, BTRFS_SB_MIRROR_OFFSET(), to calculate the backup
superblock offset. This is handy if one wants to initialize a static array
at declaration time.

Suggested-by: David Sterba
Signed-off-by: Qu Wenruo
---
 disk-io.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/disk-io.h b/disk-io.h
index c4afea3f..08ee5cee 100644
--- a/disk-io.h
+++ b/disk-io.h
@@ -98,11 +98,17 @@ enum btrfs_read_sb_flags {
 	SBREAD_PARTIAL		= (1 << 1),
 };

+/*
+ * Use macro to define mirror super block position,
+ * so we can use it in static array initialization
+ */
+#define BTRFS_SB_MIRROR_OFFSET(mirror)	((u64)(SZ_16K) << \
+		(BTRFS_SUPER_MIRROR_SHIFT * (mirror)))
+
 static inline u64 btrfs_sb_offset(int mirror)
 {
-	u64 start = SZ_16K;
 	if (mirror)
-		return start << (BTRFS_SUPER_MIRROR_SHIFT * mirror);
+		return BTRFS_SB_MIRROR_OFFSET(mirror);
 	return BTRFS_SUPER_INFO_OFFSET;
 }
--
2.11.0
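As an illustration of the static-array use case the commit message mentions, here is a minimal standalone sketch. The constant values (SZ_16K, SZ_64K, BTRFS_SUPER_INFO_OFFSET, BTRFS_SUPER_MIRROR_MAX, BTRFS_SUPER_MIRROR_SHIFT) are redefined locally and are assumed to match the btrfs-progs headers. The sb_offsets table and sb_offset_from_table() are hypothetical names that are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;

/* Assumed to match the btrfs-progs/kernel definitions. */
#define SZ_16K				0x00004000
#define SZ_64K				0x00010000
#define BTRFS_SUPER_INFO_OFFSET		SZ_64K
#define BTRFS_SUPER_MIRROR_MAX		3
#define BTRFS_SUPER_MIRROR_SHIFT	12

/* Macro from the patch: offset of backup superblock number 'mirror'. */
#define BTRFS_SB_MIRROR_OFFSET(mirror)	((u64)(SZ_16K) << \
		(BTRFS_SUPER_MIRROR_SHIFT * (mirror)))

/*
 * Hypothetical table: the macro expands to a constant expression, so it
 * can appear in a static initializer, which a function call cannot.
 */
static const u64 sb_offsets[BTRFS_SUPER_MIRROR_MAX] = {
	BTRFS_SUPER_INFO_OFFSET,
	BTRFS_SB_MIRROR_OFFSET(1),
	BTRFS_SB_MIRROR_OFFSET(2),
};

/* Inline helper as it looks after the patch. */
static inline u64 btrfs_sb_offset(int mirror)
{
	if (mirror)
		return BTRFS_SB_MIRROR_OFFSET(mirror);
	return BTRFS_SUPER_INFO_OFFSET;
}

/* Hypothetical lookup that reads the precomputed table instead. */
static u64 sb_offset_from_table(int mirror)
{
	assert(mirror >= 0 && mirror < BTRFS_SUPER_MIRROR_MAX);
	return sb_offsets[mirror];
}
```

With these values the table holds 64KiB, 64MiB and 256GiB, and both helpers agree for mirrors 0 to 2.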
[PATCH 1/2] btrfs-progs: Introduce kernel sizes to cleanup large intermediate number
Large numbers like (1024 * 1024 * 1024) cost the reader/reviewer a moment
to mentally convert to 1G.

Introduce kernel include/linux/sizes.h and replace any intermediate number
larger than 4096 (not including 4096) with SZ_*.

Signed-off-by: Qu Wenruo
---
 btrfs-map-logical.c | 2 +-
 cmds-fi-usage.c | 2 +-
 cmds-filesystem.c | 2 +-
 cmds-inspect.c | 6 +++---
 cmds-scrub.c | 2 +-
 cmds-send.c | 2 +-
 ctree.c | 7 ---
 ctree.h | 6 --
 disk-io.c | 2 +-
 disk-io.h | 5 +++--
 extent-tree.c | 15 +++
 free-space-cache.c | 2 +-
 kernel-lib/sizes.h | 47 +++
 send.h | 2 +-
 utils.c | 8
 utils.h | 9 +
 volumes.c | 20 ++--
 volumes.h | 2 +-
 18 files changed, 96 insertions(+), 45 deletions(-)
 create mode 100644 kernel-lib/sizes.h

diff --git a/btrfs-map-logical.c b/btrfs-map-logical.c
index e49a735e..bcbf2d90 100644
--- a/btrfs-map-logical.c
+++ b/btrfs-map-logical.c
@@ -30,7 +30,7 @@
 #include "list.h"
 #include "utils.h"

-#define BUFFER_SIZE (64 * 1024)
+#define BUFFER_SIZE SZ_64K

 /* we write the mirror info to stdout unless they are dumping the data
  * to stdout
diff --git a/cmds-fi-usage.c b/cmds-fi-usage.c
index 8764fef6..5d8496fe 100644
--- a/cmds-fi-usage.c
+++ b/cmds-fi-usage.c
@@ -301,7 +301,7 @@ static void get_raid56_used(int fd, struct chunk_info *chunks, int chunkcount,
 	}
 }

-#define	MIN_UNALOCATED_THRESH	(16 * 1024 * 1024)
+#define	MIN_UNALOCATED_THRESH	SZ_16M
 static int print_filesystem_usage_overall(int fd, struct chunk_info *chunkinfo,
 		int chunkcount, struct device_info *devinfo, int devcount,
 		char *path, unsigned unit_mode)
diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index c66709b3..f3949b3b 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -1044,7 +1044,7 @@ static int cmd_filesystem_defrag(int argc, char **argv)
 	 * but it does not defragment very well. The 32M will likely lead to
 	 * better results and is independent of the kernel default.
 	 */
-	thresh = 32 * 1024 * 1024;
+	thresh = SZ_32M;

 	defrag_global_errors = 0;
 	defrag_global_verbose = 0;
diff --git a/cmds-inspect.c b/cmds-inspect.c
index 5e58a284..ac3da618 100644
--- a/cmds-inspect.c
+++ b/cmds-inspect.c
@@ -173,7 +173,7 @@ static int cmd_inspect_logical_resolve(int argc, char **argv)
 	if (check_argc_exact(argc - optind, 2))
 		usage(cmd_inspect_logical_resolve_usage);

-	size = min(size, (u64)64 * 1024);
+	size = min(size, (u64)SZ_64K);
 	inodes = malloc(size);
 	if (!inodes)
 		return 1;
@@ -486,7 +486,7 @@ static void adjust_dev_min_size(struct list_head *extents,
 		 * chunk tree, so often this can lead to the need of allocating
 		 * a new system chunk too, which has a maximum size of 32Mb.
 		 */
-		*min_size += 32 * 1024 * 1024;
+		*min_size += SZ_32M;
 	}
 }

@@ -500,7 +500,7 @@ static int print_min_dev_size(int fd, u64 devid)
 	 * possibility of deprecating/removing it has been discussed, so we
 	 * ignore it here.
 	 */
-	u64 min_size = 1 * 1024 * 1024ull;
+	u64 min_size = SZ_1M;
 	struct btrfs_ioctl_search_args args;
 	struct btrfs_ioctl_search_key *sk =
 	u64 last_pos = (u64)-1;
diff --git a/cmds-scrub.c b/cmds-scrub.c
index 2cf7f308..292a5dfd 100644
--- a/cmds-scrub.c
+++ b/cmds-scrub.c
@@ -467,7 +467,7 @@ static struct scrub_file_record **scrub_read_file(int fd, int report_errors)
 {
 	int avail = 0;
 	int old_avail = 0;
-	char l[16 * 1024];
+	char l[SZ_16K];
 	int state = 0;
 	int curr = -1;
 	int i = 0;
diff --git a/cmds-send.c b/cmds-send.c
index cec11e6b..6c0a3dc3 100644
--- a/cmds-send.c
+++ b/cmds-send.c
@@ -44,7 +44,7 @@
 #include "send.h"
 #include "send-utils.h"

-#define SEND_BUFFER_SIZE (64 * 1024)
+#define SEND_BUFFER_SIZE SZ_64K

 /*
  * Default is 1 for historical reasons, changing may break scripts that expect
diff --git a/ctree.c b/ctree.c
index d07ec7d9..e3d687fb 100644
--- a/ctree.c
+++ b/ctree.c
@@ -21,6 +21,7 @@
 #include "print-tree.h"
 #include "repair.h"
 #include "internal.h"
+#include "sizes.h"

 static int split_node(struct btrfs_trans_handle *trans, struct btrfs_root
 		      *root, struct btrfs_path *path, int level);
@@ -368,7 +369,7 @@ int btrfs_cow_block(struct btrfs_trans_handle *trans,
 		return 0;
 	}

-	search_start = buf->start & ~((u64)(1024 * 1024 * 1024) - 1);
+	search_start = buf->start & ~((u64)SZ_1G - 1);
 	ret = __btrfs_cow_block(trans, root, buf, parent,
 				parent_slot,
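To make the effect of the SZ_* substitution concrete, here is a small standalone sketch. The three SZ_* values are copied locally and assumed to match the kernel's include/linux/sizes.h. round_down_pow2() is a hypothetical helper that spells out the masking idiom used in the btrfs_cow_block() hunk; it is not a btrfs-progs function.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;

/* Assumed to match the kernel's include/linux/sizes.h. */
#define SZ_64K	0x00010000
#define SZ_1M	0x00100000
#define SZ_1G	0x40000000

/*
 * Hypothetical helper: round 'addr' down to a multiple of 'align',
 * where 'align' must be a power of two. With align == SZ_1G this is
 * the same computation as: addr & ~((u64)SZ_1G - 1)
 */
static u64 round_down_pow2(u64 addr, u64 align)
{
	/* A power of two has exactly one bit set. */
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}
```

The named constants equal the arithmetic they replace (for example SZ_1G == 1024 * 1024 * 1024), so the patch is intended as a readability change only.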
Re: [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function
At 01/24/2017 01:54 AM, David Sterba wrote:
> On Mon, Dec 19, 2016 at 02:56:41PM +0800, Qu Wenruo wrote:
>> Since we have the whole facilities needed to rollback, switch to the new
>> rollback.
>
> Sorry, the change from patch 4 to patch 5 seems too big to grasp for me,
> reviewing is really hard and I'm not sure I could even do that. My
> concern is namely about patch 5/6 that throws out a lot of code that
> does not obviously map to the new code. I can try again to see if there
> are points where the patch could be split, but at the moment the
> patchset is too scary to merge.

So this implies the current implementation is not good enough for review.

I'll try to extract more set operations and make the core part more
refined, with more ASCII art comments for it.

Thanks,
Qu
Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
On Mon, Jan 23, 2017 at 04:48:54PM -0700, Chris Murphy wrote:
> On Mon, Jan 23, 2017 at 3:04 PM, Omar Sandoval wrote:
> > On Mon, Jan 23, 2017 at 02:55:21PM -0700, Chris Murphy wrote:
> >> On Mon, Jan 23, 2017 at 2:50 PM, Chris Murphy wrote:
> >> > I haven't found the commit for that patch, so maybe it's something
> >> > with the combination of that patch and the previous commit.
> >>
> >> I think that's provably not the case based on the bisect log, because
> >> I hit the problem with kernel that has only the commit, as well as the
> >> commit plus the updated patch. So the patch neither causes nor fixes
> >> the problem I'm experiencing.
> >>
> >> If it's useful the btrfs-image is here; mentioned in a previous
> >> thread, this image mounts fine, btrfs check --mode=original has no
> >> complaints, but btrfs check --mode=lowmem has complaints. There's no
> >> problem using the parent subvolume as rootfs. Only snapshots of that
> >> subvolume result in the problem.
> >> https://drive.google.com/open?id=0B_2Asp8DGjJ9ZmNxdEw1RDBPcTA
> >
> > What I meant to ask was if there were false positives/false negatives in
> > booting from the subvolume that would lead you to doubt the results of
> > the git bisect, but it sounds like it's 100% reproducible for you?
>
> OH I see. Yes. It happens within 60 seconds, during startup or shortly
> after gnome-shell login. So it's really clear that it definitely
> happens or does not happen. But it requires two things to be true:
> kernel version 4.9 or higher *and* a normal rw snapshot is used as
> rootfs. If either of those things is not true, then the problem
> doesn't happen.
>
> I've also tried building a 4.9 kernel with CONFIG_BTRFS_DEBUG and
> CONFIG_BTRFS_FS_CHECK_INTEGRITY with check_int, but the results are
> the same - no additional debug info shown by dmesg.
>
> > I'll take a look at the image. In the meantime, could you try booting
> > with https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4
> > applied on top of 4.9 so we can hopefully narrow it down? It'd also be
> > great to know if it always fails the same way or if it varies.
>
> Appears to always fail the same way.
>
> [chris@f25h ~]$ dmesg | grep -i btrfs
> [    2.705333] Btrfs loaded, crc32c=crc32c-intel
> [    2.705905] BTRFS: device label fedora devid 1 transid 113458
> /dev/nvme0n1p4
> [    2.764563] BTRFS: device label fedora devid 2 transid 113458
> /dev/nvme0n1p6
> [    3.990957] BTRFS info (device nvme0n1p6): disk space caching is enabled
> [    3.990988] BTRFS info (device nvme0n1p6): has skinny extents
> [    4.010618] BTRFS info (device nvme0n1p6): detected SSD devices,
> enabling SSD mode
> [    4.551046] BTRFS info (device nvme0n1p6): disk space caching is enabled
> [   13.906182] btrfs_update_delayed_inode() -> -2
> [   13.906261] WARNING: CPU: 0 PID: 488 at
> fs/btrfs/delayed-inode.c:1179 __btrfs_run_delayed_items+0x1b7/0x660
> [btrfs]
> [   13.906266] BTRFS: Transaction aborted (error -2)
> [   13.906460] tpm_tis acpi_thermal_rel lis3lv02d tpm_tis_core
> input_polldev acpi_pad wmi nfs_acl hp_wireless tpm lockd grace sunrpc
> btrfs i915 xor raid6_pq i2c_algo_bit drm_kms_helper drm crc32c_intel
> nvme serio_raw nvme_core i2c_hid video fjes
> [   13.906635] [] ?
> __btrfs_release_delayed_node+0x70/0x1c0 [btrfs]
> [   13.906690] []
> __btrfs_run_delayed_items+0x1b7/0x660 [btrfs]
> [   13.906743] [] btrfs_run_delayed_items+0x13/0x20 [btrfs]
> [   13.906793] []
> btrfs_commit_transaction+0x23a/0xa20 [btrfs]
> [   13.906853] [] ?
> btrfs_wait_ordered_range+0x7c/0x100 [btrfs]
> [   13.906910] [] btrfs_sync_file+0x2fb/0x3e0 [btrfs]
> [   13.906970] BTRFS: error (device nvme0n1p6) in
> __btrfs_run_delayed_items:1179: errno=-2 No such entry
> [   13.906976] BTRFS info (device nvme0n1p6): forced readonly
> [   13.906982] BTRFS warning (device nvme0n1p6): Skipping commit of
> aborted transaction.
> [   13.906989] BTRFS: error (device nvme0n1p6) in
> cleanup_transaction:1850: errno=-2 No such entry
> [   13.907943] BTRFS info (device nvme0n1p6): delayed_refs has NO entry
>
> Complete dmesg tags/v4.9 + osandov/9f223b
> https://drive.google.com/open?id=0B_2Asp8DGjJ9N1g5Wm9lVHpGWG8

Thanks! Hmm, okay, so it's coming from btrfs_update_delayed_inode()...
That's probably us failing btrfs_lookup_inode(), but just to make sure,
could you apply the updated diff at the same link as before
(https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4)? If
that's the case, I'm even more confused about what xattrs have to do
with it.
Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
On Mon, Jan 23, 2017 at 3:04 PM, Omar Sandoval wrote:
> On Mon, Jan 23, 2017 at 02:55:21PM -0700, Chris Murphy wrote:
>> On Mon, Jan 23, 2017 at 2:50 PM, Chris Murphy wrote:
>> > I haven't found the commit for that patch, so maybe it's something
>> > with the combination of that patch and the previous commit.
>>
>> I think that's provably not the case based on the bisect log, because
>> I hit the problem with kernel that has only the commit, as well as the
>> commit plus the updated patch. So the patch neither causes nor fixes
>> the problem I'm experiencing.
>>
>> If it's useful the btrfs-image is here; mentioned in a previous
>> thread, this image mounts fine, btrfs check --mode=original has no
>> complaints, but btrfs check --mode=lowmem has complaints. There's no
>> problem using the parent subvolume as rootfs. Only snapshots of that
>> subvolume result in the problem.
>> https://drive.google.com/open?id=0B_2Asp8DGjJ9ZmNxdEw1RDBPcTA
>
> What I meant to ask was if there were false positives/false negatives in
> booting from the subvolume that would lead you to doubt the results of
> the git bisect, but it sounds like it's 100% reproducible for you?

OH I see. Yes. It happens within 60 seconds, during startup or shortly
after gnome-shell login. So it's really clear that it definitely
happens or does not happen. But it requires two things to be true:
kernel version 4.9 or higher *and* a normal rw snapshot is used as
rootfs. If either of those things is not true, then the problem
doesn't happen.

I've also tried building a 4.9 kernel with CONFIG_BTRFS_DEBUG and
CONFIG_BTRFS_FS_CHECK_INTEGRITY with check_int, but the results are
the same - no additional debug info shown by dmesg.

> I'll take a look at the image. In the meantime, could you try booting
> with https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4
> applied on top of 4.9 so we can hopefully narrow it down? It'd also be
> great to know if it always fails the same way or if it varies.

Appears to always fail the same way.

[chris@f25h ~]$ dmesg | grep -i btrfs
[    2.705333] Btrfs loaded, crc32c=crc32c-intel
[    2.705905] BTRFS: device label fedora devid 1 transid 113458 /dev/nvme0n1p4
[    2.764563] BTRFS: device label fedora devid 2 transid 113458 /dev/nvme0n1p6
[    3.990957] BTRFS info (device nvme0n1p6): disk space caching is enabled
[    3.990988] BTRFS info (device nvme0n1p6): has skinny extents
[    4.010618] BTRFS info (device nvme0n1p6): detected SSD devices, enabling SSD mode
[    4.551046] BTRFS info (device nvme0n1p6): disk space caching is enabled
[   13.906182] btrfs_update_delayed_inode() -> -2
[   13.906261] WARNING: CPU: 0 PID: 488 at fs/btrfs/delayed-inode.c:1179 __btrfs_run_delayed_items+0x1b7/0x660 [btrfs]
[   13.906266] BTRFS: Transaction aborted (error -2)
[   13.906460] tpm_tis acpi_thermal_rel lis3lv02d tpm_tis_core input_polldev acpi_pad wmi nfs_acl hp_wireless tpm lockd grace sunrpc btrfs i915 xor raid6_pq i2c_algo_bit drm_kms_helper drm crc32c_intel nvme serio_raw nvme_core i2c_hid video fjes
[   13.906635] [] ? __btrfs_release_delayed_node+0x70/0x1c0 [btrfs]
[   13.906690] [] __btrfs_run_delayed_items+0x1b7/0x660 [btrfs]
[   13.906743] [] btrfs_run_delayed_items+0x13/0x20 [btrfs]
[   13.906793] [] btrfs_commit_transaction+0x23a/0xa20 [btrfs]
[   13.906853] [] ? btrfs_wait_ordered_range+0x7c/0x100 [btrfs]
[   13.906910] [] btrfs_sync_file+0x2fb/0x3e0 [btrfs]
[   13.906970] BTRFS: error (device nvme0n1p6) in __btrfs_run_delayed_items:1179: errno=-2 No such entry
[   13.906976] BTRFS info (device nvme0n1p6): forced readonly
[   13.906982] BTRFS warning (device nvme0n1p6): Skipping commit of aborted transaction.
[   13.906989] BTRFS: error (device nvme0n1p6) in cleanup_transaction:1850: errno=-2 No such entry
[   13.907943] BTRFS info (device nvme0n1p6): delayed_refs has NO entry

Complete dmesg tags/v4.9 + osandov/9f223b
https://drive.google.com/open?id=0B_2Asp8DGjJ9N1g5Wm9lVHpGWG8

--
Chris Murphy
Re: RAID56 status?
On Mon, 2017-01-23 at 18:18 -0500, Chris Mason wrote:
> We've been focusing on the single-drive use cases internally. This year
> that's changing as we ramp up more users in different places.
> Performance/stability work and raid5/6 are the top of my list right
> now.

+1

Would be nice to get some feedback on what happens behind the
scenes... actually I think a regular btrfs development blog could be
generally a nice thing :)

Cheers,
Chris.
Re: RAID56 status?
On Mon, Jan 23, 2017 at 06:53:21PM +0100, Christoph Anton Mitterer wrote:
> Just wondered... is there any larger known RAID56 deployment? I mean
> something with real-world production systems and ideally many different
> IO scenarios, failures, pulling disks randomly and perhaps even so many
> disks that it's also likely to hit something like silent data corruption
> (on the disk level)?
>
> Has CM already migrated all of Facebook's storage to btrfs RAID56?! ;-)
> Well at least facebook.com seems still online ;-P *kidding*
>
> I mean the good thing in having such a massive production-like
> environment - especially when it's not just one homogeneous usage
> pattern - is that it would help to build up quite some trust into the
> code (once the already known bugs are fixed).

We've been focusing on the single-drive use cases internally. This year
that's changing as we ramp up more users in different places.
Performance/stability work and raid5/6 are the top of my list right now.

-chris
Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
On Mon, Jan 23, 2017 at 02:55:21PM -0700, Chris Murphy wrote:
> On Mon, Jan 23, 2017 at 2:50 PM, Chris Murphy wrote:
> > I haven't found the commit for that patch, so maybe it's something
> > with the combination of that patch and the previous commit.
>
> I think that's provably not the case based on the bisect log, because
> I hit the problem with kernel that has only the commit, as well as the
> commit plus the updated patch. So the patch neither causes nor fixes
> the problem I'm experiencing.
>
> If it's useful the btrfs-image is here; mentioned in a previous
> thread, this image mounts fine, btrfs check --mode=original has no
> complaints, but btrfs check --mode=lowmem has complaints. There's no
> problem using the parent subvolume as rootfs. Only snapshots of that
> subvolume result in the problem.
> https://drive.google.com/open?id=0B_2Asp8DGjJ9ZmNxdEw1RDBPcTA

What I meant to ask was if there were false positives/false negatives in
booting from the subvolume that would lead you to doubt the results of
the git bisect, but it sounds like it's 100% reproducible for you?

I'll take a look at the image. In the meantime, could you try booting
with https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4
applied on top of 4.9 so we can hopefully narrow it down? It'd also be
great to know if it always fails the same way or if it varies.
Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
On Mon, Jan 23, 2017 at 2:50 PM, Chris Murphy wrote:
> I haven't found the commit for that patch, so maybe it's something
> with the combination of that patch and the previous commit.

I think that's provably not the case based on the bisect log, because
I hit the problem with kernel that has only the commit, as well as the
commit plus the updated patch. So the patch neither causes nor fixes
the problem I'm experiencing.

If it's useful the btrfs-image is here; mentioned in a previous
thread, this image mounts fine, btrfs check --mode=original has no
complaints, but btrfs check --mode=lowmem has complaints. There's no
problem using the parent subvolume as rootfs. Only snapshots of that
subvolume result in the problem.
https://drive.google.com/open?id=0B_2Asp8DGjJ9ZmNxdEw1RDBPcTA

--
Chris Murphy
Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
On Mon, Jan 23, 2017 at 2:31 PM, Omar Sandoval wrote:
> On Wed, Jan 18, 2017 at 02:27:13PM -0700, Chris Murphy wrote:
>> On Wed, Jan 11, 2017 at 4:13 PM, Chris Murphy wrote:
>> > Looks like there's some sort of xattr and Btrfs interaction happening
>> > here; but as it only happens with some subvolumes/snapshots not all
>> > (but 100% consistent) maybe the kernel version at the time the
>> > snapshot was taken is a factor?
>>
>> The kernel version at the time the snapshot is taken is not a factor.
>> I've taken a snapshot of a working subvolume, and booting the snapshot
>> fails during startup with the fs forced readonly with kernel 4.9 and
>> higher; the problem doesn't happen with kernel 4.8.17 and lower.
>>
>> As a further test I tried:
>>
>>
>> git checkout tags/v4.9
>> git revert 6c6ef9f26e598fb977f60935e109cd5b266c941a
>>
>> But I get a failure during compile:
>>
>> scripts/Makefile.build:293: recipe for target 'fs/xattr.o' failed
>> make[1]: *** [fs/xattr.o] Error 1
>> Makefile:988: recipe for target 'fs' failed
>> make: *** [fs] Error 2
>>
>> Anyway, the inability to boot snapshots means bootable rollbacks are
>> broken. I think this is a serious regression, what's the next step in
>> figuring out what's going on?
>
> Hm, so you're 100% sure that this exact commit caused the regression?
> I've stared at it for a little while and am not seeing anything obvious.

This is the git bisect log:
https://bugzilla.kernel.org/attachment.cgi?id=251271

I don't know how to evaluate your question. If there's a test that
helps answer the question, let me know.

Searching for this commit brings up an lkml thread:
https://lkml.org/lkml/2016/11/3/268

I haven't found the commit for that patch, so maybe it's something
with the combination of that patch and the previous commit. But I see
it's applied in at least 4.10-rc4 and this forced readonly event
happens with 4.10-rc4.

--
Chris Murphy
Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
On Wed, Jan 18, 2017 at 02:27:13PM -0700, Chris Murphy wrote:
> On Wed, Jan 11, 2017 at 4:13 PM, Chris Murphy wrote:
> > Looks like there's some sort of xattr and Btrfs interaction happening
> > here; but as it only happens with some subvolumes/snapshots not all
> > (but 100% consistent) maybe the kernel version at the time the
> > snapshot was taken is a factor?
>
> The kernel version at the time the snapshot is taken is not a factor.
> I've taken a snapshot of a working subvolume, and booting the snapshot
> fails during startup with the fs forced readonly with kernel 4.9 and
> higher; the problem doesn't happen with kernel 4.8.17 and lower.
>
> As a further test I tried:
>
>
> git checkout tags/v4.9
> git revert 6c6ef9f26e598fb977f60935e109cd5b266c941a
>
> But I get a failure during compile:
>
> scripts/Makefile.build:293: recipe for target 'fs/xattr.o' failed
> make[1]: *** [fs/xattr.o] Error 1
> Makefile:988: recipe for target 'fs' failed
> make: *** [fs] Error 2
>
> Anyway, the inability to boot snapshots means bootable rollbacks are
> broken. I think this is a serious regression, what's the next step in
> figuring out what's going on?

Hm, so you're 100% sure that this exact commit caused the regression?
I've stared at it for a little while and am not seeing anything obvious.
Re: btrfs check lowmem vs original
OK so all of these pass original check, but have problems reported by
lowmem. Separate notes about each inline.

~500MiB each, these three are data volumes, first two are raid1, third
one is single.
https://drive.google.com/open?id=0B_2Asp8DGjJ9Z3UzWnFKT3A0clU
https://drive.google.com/open?id=0B_2Asp8DGjJ9V0ROdHNoMW1BVE0
https://drive.google.com/open?id=0B_2Asp8DGjJ9Zmd1LXl6MU5WeXc

19MiB, about 15 minutes old, rootfs, OS installation only
https://drive.google.com/open?id=0B_2Asp8DGjJ9TF9LVkFlcDBzOG8

55MiB, about 1 month old, rootfs, not much activity
https://drive.google.com/open?id=0B_2Asp8DGjJ9bkJFc01qcVJxNnM

324MiB, about 5 months old, used as rootfs, all read-write snapshots
used as rootfs are forced readonly, a regression previously reported
without any dev response
https://drive.google.com/open?id=0B_2Asp8DGjJ9ZmNxdEw1RDBPcTA
http://www.spinics.net/lists/linux-btrfs/msg61817.html
https://bugzilla.kernel.org/show_bug.cgi?id=191761
Re: Hard crash on 4.9.5
On 01/23/2017 09:27 PM, Hans van Kranenburg wrote:
> [... press send without rereading ...]
>
> Anyway, it seems to point to something that's going wrong with changes
> that are *not* on disk *yet*, and the crash is preventing ...

... whatever incorrect data this situation might result in from reaching
disk, at least.

--
Hans van Kranenburg
Re: Hard crash on 4.9.5
On 01/23/2017 09:03 PM, Matt McKinnon wrote:
> Wondering what to do about this error which says 'reboot needed'. Has
> happened three times in the past week:
>
> Jan 23 14:16:17 my_machine kernel: [ 2568.595648] BTRFS error (device
> sda1): err add delayed dir index item(index: 23810) into the deletion
> tree of the delayed node(root id: 257, inode id: 2661433, errno: -17)
> Jan 23 14:16:17 my_machine kernel: [ 2568.611010] ------------[ cut here
> ]------------
> Jan 23 14:16:17 my_machine kernel: [ 2568.615628] kernel BUG at
> fs/btrfs/delayed-inode.c:1557!
> Jan 23 14:16:17 my_machine kernel: [ 2568.620942] invalid opcode:
> [#1] SMP
> [...]

The purpose of the code involved is that if you create a directory or
file and quickly remove it again, the filesystem doesn't need to do two
disk writes; it can just erase it again from its memory before writing
anything to disk.

8< more

This is when the functionality was added:
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=16cdcec736cd214350cdb591bf1091f8beedefa0

If you look for "err add delayed dir" in the source code of that commit,
you see where the error message is constructed with errno: -17, just
after it called __btrfs_add_delayed_insertion_item.

__btrfs_add_delayed_insertion_item calls __btrfs_add_delayed_item, and
the only non-0 return in that function is: return -EEXIST, which is -17.

I think this means you added a file or directory, and the kernel code
tried to add the file twice to the list of additions, which it has no
way to deal with except making the whole kernel crash.

>8

A while ago someone reported this on IRC, running a 4.8.13 kernel
(that's when I looked up the above info). I can also find it in Oct 2016
in my IRC logs, but without any info on kernel version.

Anyway, it seems to point to something that's going wrong with changes
that are *not* on disk *yet*, and the crash is preventing .

--
Hans van Kranenburg
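The failure mode described above, an insert that refuses a key already present and reports -EEXIST (-17 on Linux), can be sketched in a few lines. This is only an illustration under stated assumptions: the kernel keeps delayed items in a red-black tree, while the sketch uses a plain unbalanced binary search tree. struct demo_item and demo_add_item() are hypothetical names, not the btrfs code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a delayed item, keyed by its dir index. */
struct demo_item {
	uint64_t index;
	struct demo_item *left;
	struct demo_item *right;
};

/*
 * Insert 'item' into the tree rooted at '*root'.
 * Returns 0 on success, or -EEXIST if an item with the same index is
 * already in the tree (the tree is left unchanged in that case).
 */
static int demo_add_item(struct demo_item **root, struct demo_item *item)
{
	struct demo_item **link = root;

	assert(root != NULL && item != NULL);

	/* Walk down until we find an empty slot or an equal key. */
	while (*link) {
		if (item->index < (*link)->index)
			link = &(*link)->left;
		else if (item->index > (*link)->index)
			link = &(*link)->right;
		else
			return -EEXIST;
	}

	item->left = NULL;
	item->right = NULL;
	*link = item;
	return 0;
}
```

In this sketch the caller can handle the error and carry on; in the report above the caller treats the same return value as a bug and stops the kernel.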
Hard crash on 4.9.5
Wondering what to do about this error which says 'reboot needed'. Has
happened three times in the past week:

Jan 23 14:16:17 my_machine kernel: [ 2568.595648] BTRFS error (device sda1): err add delayed dir index item(index: 23810) into the deletion tree of the delayed node(root id: 257, inode id: 2661433, errno: -17)
Jan 23 14:16:17 my_machine kernel: [ 2568.611010] ------------[ cut here ]------------
Jan 23 14:16:17 my_machine kernel: [ 2568.615628] kernel BUG at fs/btrfs/delayed-inode.c:1557!
Jan 23 14:16:17 my_machine kernel: [ 2568.620942] invalid opcode: [#1] SMP
Jan 23 14:16:17 my_machine kernel: [ 2568.624960] Modules linked in: ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter ip_tables x_tables ipmi_devintf nfsd auth_rpcgss nfs_acl nfs lockd grace sunrpc fscache intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd dm_multipath joydev mei_me mei lpc_ich ioatdma wmi ipmi_si ipmi_msghandler btrfs shpchp mac_hid lp parport ses enclosure scsi_transport_sas raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c igb hid_generic i2c_algo_bit raid1 dca usbhid ahci raid0 ptp megaraid_sas multipath
Jan 23 14:16:17 my_machine kernel: [ 2568.697150] hid libahci pps_core linear dm_mirror dm_region_hash dm_log
Jan 23 14:16:17 my_machine kernel: [ 2568.702689] CPU: 0 PID: 2440 Comm: nfsd Tainted: GW 4.9.5-custom #1
Jan 23 14:16:17 my_machine kernel: [ 2568.710166] Hardware name: Supermicro X9DRH-7TF/7F/iTF/iF/X9DRH-7TF/7F/iTF/iF, BIOS 3.0b 04/28/2014
Jan 23 14:16:17 my_machine kernel: [ 2568.719207] task: 95a42addab80 task.stack: b9da8533
Jan 23 14:16:17 my_machine kernel: [ 2568.725124] RIP: 0010:[] [] btrfs_delete_delayed_dir_index+0x286/0x290 [btrfs]
Jan 23 14:16:17 my_machine kernel: [ 2568.735604] RSP: 0018:b9da85333be0 EFLAGS: 00010286
Jan 23 14:16:17 my_machine kernel: [ 2568.740917] RAX: RBX: 95a3b104b690 RCX:
Jan 23 14:16:17 my_machine kernel: [ 2568.748048] RDX: 0001 RSI: 95a42fc0dcc8 RDI: 95a42fc0dcc8
Jan 23 14:16:17 my_machine kernel: [ 2568.755171] RBP: b9da85333c48 R08: 0491 R09:
Jan 23 14:16:17 my_machine kernel: [ 2568.762297] R10: 0005 R11: 0006 R12: 95a3b104b6d8
Jan 23 14:16:17 my_machine kernel: [ 2568.769429] R13: 5d02 R14: 95a82953d800 R15: ffef
Jan 23 14:16:17 my_machine kernel: [ 2568.776555] FS: () GS:95a42fc0() knlGS:
Jan 23 14:16:17 my_machine kernel: [ 2568.784639] CS: 0010 DS: ES: CR0: 80050033
Jan 23 14:16:17 my_machine kernel: [ 2568.790377] CR2: 7f12ea376000 CR3: 0003e1e07000 CR4: 001406f0
Jan 23 14:16:17 my_machine kernel: [ 2568.797503] Stack:
Jan 23 14:16:17 my_machine kernel: [ 2568.799524] 9b7fe5f2 95a3b104b560 0004 95a3f96b3e80
Jan 23 14:16:17 my_machine kernel: [ 2568.806983] 95a3f96b3e80 39ff95a814eeeb68 6000289c 5d02
Jan 23 14:16:17 my_machine kernel: [ 2568.814436] 95a3f7457c40 95a3bcb74138 95a814eeeb68 00289c39
Jan 23 14:16:17 my_machine kernel: [ 2568.821891] Call Trace:
Jan 23 14:16:17 my_machine kernel: [ 2568.824343] [] ? mutex_lock+0x12/0x2f
Jan 23 14:16:17 my_machine kernel: [ 2568.829671] [] __btrfs_unlink_inode+0x198/0x4c0 [btrfs]
Jan 23 14:16:17 my_machine kernel: [ 2568.836555] [] btrfs_unlink_inode+0x1c/0x40 [btrfs]
Jan 23 14:16:17 my_machine kernel: [ 2568.843086] [] btrfs_unlink+0x6b/0xb0 [btrfs]
Jan 23 14:16:17 my_machine kernel: [ 2568.849091] [] vfs_unlink+0xda/0x190
Jan 23 14:16:17 my_machine kernel: [ 2568.854315] [] ? lookup_one_len+0xd3/0x130
Jan 23 14:16:17 my_machine kernel: [ 2568.860075] [] nfsd_unlink+0x16e/0x210 [nfsd]
Jan 23 14:16:17 my_machine kernel: [ 2568.866084] [] nfsd3_proc_remove+0x7c/0x110 [nfsd]
Jan 23 14:16:17 my_machine kernel: [ 2568.872529] [] nfsd_dispatch+0xb8/0x1f0 [nfsd]
Jan 23 14:16:17 my_machine kernel: [ 2568.878641] [] svc_process_common+0x43f/0x700 [sunrpc]
Jan 23 14:16:17 my_machine kernel: [ 2568.885432] [] svc_process+0xfc/0x1c0 [sunrpc]
Jan 23 14:16:17 my_machine kernel: [ 2568.891528] [] nfsd+0xf0/0x160 [nfsd]
Jan 23 14:16:17 my_machine kernel: [ 2568.896838] [] ? nfsd_destroy+0x60/0x60 [nfsd]
Jan 23 14:16:17 my_machine kernel: [ 2568.902931] [] kthread+0xca/0xe0
Jan 23 14:16:17 my_machine kernel: [ 2568.907807] [] ? kthread_park+0x60/0x60
Jan 23 14:16:17 my_machine kernel: [ 2568.913296] [] ret_from_fork+0x25/0x30
Jan 23 14:16:17 my_machine kernel: [ 2568.918693] Code: ff ff 48 8b 43 10 49 8b
Re: [PATCH v3 5/6] btrfs-progs: convert: Switch to new rollback function
On Mon, Dec 19, 2016 at 02:56:41PM +0800, Qu Wenruo wrote: > Since we have the whole facilities needed to rollback, switch to the new > rollback. Sorry, the change from patch 4 to patch 5 seems too big to grasp for me, reviewing is really hard and I'm not sure I could even do that. My concern is namely about patch 5/6 that throws out a lot of code that does not obviously map to the new code. I can try again to see if there are points where the patch could be split, but at the moment the patchset is too scary to merge. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RAID56 status?
Just wondered... is there any larger known RAID56 deployment? I mean something with real-world production systems and ideally many different IO scenarios, failures, pulling disks randomly and perhaps even so many disks that it's also likely to hit something like silent data corruption (on the disk level)? Has CM already migrated all of Facebook's storage to btrfs RAID56?! ;-) Well, at least facebook.com seems to be still online ;-P *kidding* I mean the good thing about having such a massive production-like environment - especially when it's not just one homogeneous usage pattern - is that it would help to build up quite some trust in the code (once the already known bugs are fixed). Cheers, Chris.
Re: [PATCH v3 2/6] btrfs-progs: utils: Introduce basic set operations for range
On Mon, Dec 19, 2016 at 02:56:38PM +0800, Qu Wenruo wrote:
> +static u64 reserved_range_starts[3] = { 0, BTRFS_SB_MIRROR_OFFSET(1),
> +					BTRFS_SB_MIRROR_OFFSET(2) };
> +static u64 reserved_range_lens[3] = { 1024 * 1024, 64 * 1024, 64 * 1024 };

Also, anywhere in the relevant code, the 3 should be a named constant rather than a literal 3, ARRAY_SIZE, or the 2 I've seen in some backward-going for loop.
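For reference, the reserved range starts in the quoted arrays can be worked out by hand. The following is a minimal sketch, not the patch's code: it assumes BTRFS_SUPER_MIRROR_SHIFT is 12 (its value in the btrfs headers, which this thread does not show), and the names SB_MIRROR_SHIFT, SB_MIRROR_OFFSET, RESERVED_RANGE_COUNT and the two accessor functions are illustrative stand-ins.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors BTRFS_SUPER_MIRROR_SHIFT, which is 12 in the btrfs headers. */
#define SB_MIRROR_SHIFT 12

/*
 * Same shape as BTRFS_SB_MIRROR_OFFSET(): a constant expression, so it
 * can be used in a static array initializer.
 */
#define SB_MIRROR_OFFSET(mirror) \
	((uint64_t)(16 * 1024) << (SB_MIRROR_SHIFT * (mirror)))

/* Hypothetical named constant replacing the bare 3. */
#define RESERVED_RANGE_COUNT 3

static const uint64_t reserved_range_starts[RESERVED_RANGE_COUNT] = {
	0, SB_MIRROR_OFFSET(1), SB_MIRROR_OFFSET(2),
};
static const uint64_t reserved_range_lens[RESERVED_RANGE_COUNT] = {
	1024 * 1024, 64 * 1024, 64 * 1024,
};

/* First byte of reserved range i. */
uint64_t reserved_range_start(int i)
{
	assert(i >= 0 && i < RESERVED_RANGE_COUNT);
	return reserved_range_starts[i];
}

/* First byte past reserved range i. */
uint64_t reserved_range_end(int i)
{
	assert(i >= 0 && i < RESERVED_RANGE_COUNT);
	return reserved_range_starts[i] + reserved_range_lens[i];
}
```

Under that assumption the three reserved ranges are [0, 1 MiB), [64 MiB, 64 MiB + 64 KiB) and [256 GiB, 256 GiB + 64 KiB), and every loop over them can be bounded by the one named constant.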
Re: [PATCH v3 2/6] btrfs-progs: utils: Introduce basic set operations for range
On Mon, Dec 19, 2016 at 02:56:38PM +0800, Qu Wenruo wrote:
> Introduce basic set operations: is_subset() and is_intersection().
>
> This is quite useful to check if a range [start, start + len) subset or
> intersection of another range.
> So we don't need to use open code to do it, which I sometimes do it
> wrong.
>
> Also use these new facilities in btrfs-convert, to check if a range is a
> subset or intersects with btrfs convert reserved ranges.

I see the range helpers used only inside convert, so I don't think we need to export them into utils. You could then introduce a helper structure with start and len members and use that instead of 2 arrays.

> --- a/disk-io.h
> +++ b/disk-io.h
> @@ -97,11 +97,16 @@ enum btrfs_read_sb_flags {
>  	SBREAD_PARTIAL		= (1 << 1),
>  };
>
> +/*
> + * Use macro to define mirror super block position
> + * So we can use it in static array initializtion
> + */
> +#define BTRFS_SB_MIRROR_OFFSET(mirror)	((u64)(16 * 1024) << \
> +		(BTRFS_SUPER_MIRROR_SHIFT * (mirror)))

This is an unrelated change and should go in a separate patch.

>  static inline u64 btrfs_sb_offset(int mirror)
>  {
> -	u64 start = 16 * 1024;
>  	if (mirror)
> -		return start << (BTRFS_SUPER_MIRROR_SHIFT * mirror);
> +		return BTRFS_SB_MIRROR_OFFSET(mirror);
>  	return BTRFS_SUPER_INFO_OFFSET;
>  }
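The helper structure suggested in this review could look roughly like the sketch below. It is only an illustration under stated assumptions: struct simple_range, range_end(), range_is_subset() and range_intersects() are hypothetical names, not the ones used in the patch, and ranges are treated as half-open intervals [start, start + len) as in the quoted commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: one half-open range [start, start + len). */
struct simple_range {
	uint64_t start;
	uint64_t len;
};

/* First byte past the range; guards against u64 wrap-around. */
uint64_t range_end(const struct simple_range *r)
{
	assert(r->len <= UINT64_MAX - r->start);
	return r->start + r->len;
}

/* True when range a lies entirely inside range b. */
bool range_is_subset(const struct simple_range *a,
		     const struct simple_range *b)
{
	return a->start >= b->start && range_end(a) <= range_end(b);
}

/* True when a and b share at least one byte; empty ranges never intersect. */
bool range_intersects(const struct simple_range *a,
		      const struct simple_range *b)
{
	if (a->len == 0 || b->len == 0)
		return false;
	return a->start < range_end(b) && b->start < range_end(a);
}
```

With one array of such structures, the start and length of a reserved range can no longer drift apart the way two parallel arrays can.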
Re: [PATCH] btrfs-progs: sanitize - Use correct source for memcpy
On Fri, Jan 20, 2017 at 01:03:33PM -0600, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues
>
> While performing a memcpy, we are copying from uninitialized dst
> as opposed to src->data. Though using eb->len is correct, I used
> src->len to make it more readable.
>
> Signed-off-by: Goldwyn Rodrigues

Applied, thanks.
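The patch itself is not quoted in this reply, so the following is only a sketch of the bug class the commit message describes, not the btrfs-progs code. struct buf and buf_clone() are hypothetical stand-ins for the extent buffer and the copying routine.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an extent buffer: a length plus payload. */
struct buf {
	size_t len;
	unsigned char data[];
};

/* Returns a newly allocated copy of src, or NULL on allocation failure. */
struct buf *buf_clone(const struct buf *src)
{
	struct buf *dst;

	assert(src != NULL);
	dst = malloc(sizeof(*dst) + src->len);
	if (!dst)
		return NULL;
	dst->len = src->len;
	/*
	 * The mistake described above is reading from the freshly
	 * allocated, still uninitialized destination. The copy must
	 * read from src->data, and src->len makes that pairing obvious.
	 */
	memcpy(dst->data, src->data, src->len);
	return dst;
}
```

Taking both the pointer and the length from the same object, as the commit message argues, makes this kind of mix-up easier to spot in review.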
Re: RAID5: btrfs rescue chunk-recover segfaults.
On Mon, 23 Jan 2017 14:15:55 +0100 Simon Waid wrote:
> I have a btrfs raid5 array that has become unmountable.

That's the third time you have sent this today. Will you keep resending every few hours until you get a reply? That's not how mailing lists work.

--
With respect,
Roman
RAID5: btrfs rescue chunk-recover segfaults.
Dear all,

I have a btrfs raid5 array that has become unmountable. When trying to mount, dmesg contains the following:

[ 5686.334384] BTRFS info (device sdb): disk space caching is enabled
[ 5688.377244] BTRFS info (device sdb): bdev /dev/sdb errs: wr 2517, rd 77, flush 0, corrupt 0, gen 0
[ 5688.377254] BTRFS info (device sdb): bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
[ 5688.377261] BTRFS info (device sdb): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
[ 5688.377268] BTRFS info (device sdb): bdev /dev/sde errs: wr 21, rd 8807, flush 0, corrupt 0, gen 0
[ 5688.744249] BTRFS error (device sdb): parent transid verify failed on 16227387371520 wanted 88711 found 88395
[ 5689.533817] BTRFS error (device sdb): parent transid verify failed on 16227388260352 wanted 88711 found 88395
[ 5689.609355] BTRFS error (device sdb): parent transid verify failed on 16227415158784 wanted 88711 found 88397
[ 5689.627715] BTRFS error (device sdb): parent transid verify failed on 16227415158784 wanted 88711 found 88397
[ 5689.627731] BTRFS error (device sdb): failed to read block groups: -5
[ 5689.675017] BTRFS error (device sdb): open_ctree failed

I tried to recover from the problem using:

btrfs rescue chunk-recover -v /dev/sdb

The command runs for a few minutes. Then it segfaults. I used gdb to debug. This is the backtrace:

Starting program: btrfs-progs/btrfs rescue chunk-recover -v /dev/sdb
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
All Devices:
Device: id = 4, name = /dev/sde
Device: id = 1, name = /dev/sdd1
Device: id = 2, name = /dev/sdc
Device: id = 3, name = /dev/sdb
[New Thread 0x76f6e700 (LWP 8155)]
[New Thread 0x7676d700 (LWP 8156)]
[New Thread 0x75f6c700 (LWP 8157)]
[New Thread 0x7576b700 (LWP 8158)]
Scanning: 24603734016 in dev0, 32581337088 in dev1, 37911248896 in dev2, 32217350144 in dev3
Thread 2 "btrfs" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x76f6e700 (LWP 8155)]
btrfs_new_device_extent_record (leaf=leaf@entry=0x78c0, key=key@entry=0x76f6dc90, slot=slot@entry=12) at cmds-check.c:6656
6656 rec->chunk_objecteid =
(gdb) backtrace
#0 btrfs_new_device_extent_record (leaf=leaf@entry=0x78c0, key=key@entry=0x76f6dc90, slot=slot@entry=12) at cmds-check.c:6656
#1 0x004370d2 in process_device_extent_item (slot=12, key=0x76f6dc90, leaf=0x78c0, devext_cache=0x7fffe410) at chunk-recover.c:332
#2 extract_metadata_record (rc=rc@entry=0x7fffe3c0, leaf=leaf@entry=0x78c0) at chunk-recover.c:727
#3 0x0043759b in scan_one_device (dev_scan_struct=0x6ae420) at chunk-recover.c:807
#4 0x7733f6ba in start_thread (arg=0x76f6e700) at pthread_create.c:333
#5 0x7707582d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Information about the system:

uname -a:
Linux 4.10.0-041000rc4-generic #201701152031 SMP Mon Jan 16 01:33:39 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

btrfs-progs --version:
v4.9 (from git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git)

sudo btrfs fi show
Label: none uuid: a27cc0cf-1665-43ba-8c63-bf236d31fcd2
Total devices 4 FS bytes used 6.51TiB
devid 1 size 2.73TiB used 2.73TiB path /dev/sdd1
devid 2 size 7.28TiB used 2.73TiB path /dev/sdc
devid 3 size 3.64TiB used 3.56TiB path /dev/sdb
devid 4 size 1.82TiB used 1.46TiB path /dev/sde

btrfs fi df won't work as the filesystem is not mountable.

Any help would be appreciated!

Best regards,
Simon

PS: I'd also like to mention how the raid array became unmountable. The system I was running at that time was:

Kernel: 4.8.0-34 generic #36~16.04.1 Ubuntu SMP
btrfs-progs --version: v4.4

- I issued a replace command on disk 2. During the replace, disk 4 was disconnected. I noticed it and rebooted the system just a few seconds after the event. After the reboot, the replace continued and eventually finished. However, dmesg showed errors like: parent transid verify failed on 16227387371520 wanted 88711 found 88395.
- I issued a resize command on the new drive to free additional space: btrfs resize 2:max, which completed without errors.
- I issued a balance without any filters in the hope it would correct the "parent transid verify failed" errors. The balance started normally. However, after about one hour, I saw that no I/O was happening and lots of errors appeared in dmesg. I tried to reboot but the command had no effect, so I disconnected the PC from the power supply.

I have attached the dmesg for the resize and balance operations.
Re: gdb log of crashed "btrfs-image -s"
On 01/18/2017 02:11 PM, Christoph Groth wrote:
> Goldwyn Rodrigues wrote:
>> Thanks Christoph for the backtrace. I am unable to reproduce it, but
>> looking at your backtrace, I found a bug. Would you be able to give it
>> a try and check if it fixes the problem?
>
> I applied your patch to v4.9, and compiled the static binaries.
> Unfortunately, it still segfaults. (Perhaps your fix is correct, and
> there's a second problem?) I attach a new backtrace. Do let me know if
> I can help in another way.

I looked hard, and could not find the reason for the failure here. The backtrace of the new one is a little different from the previous one, but I am not sure why it crashes. Until I have a reproduction scenario, I may not be able to fix this. How about a core? However, a core will have values which you are trying to mask with sanitize.

--
Goldwyn
Re: btrfs recovery
Hello again,

By the way, the init-extent-tree is still running (now almost 7 days). Is there any chance to find out how long it will take in the end?

Sebastian

On 20.01.2017 at 02:08, Qu Wenruo wrote:
At 01/19/2017 06:06 PM, Sebastian Gottschall wrote:

Hello, I have a question. After a power outage my system ended up in an unrecoverable state using btrfs (kernel 4.9). Since I have now been running --init-extent-tree for 3 days, I'm asking how long this process normally takes and why it outputs millions of lines like

--init-extent-tree will trash *ALL* current extent tree, and *REBUILD* them from fs-tree. This can take a long time depending on the size of the fs, and how many shared extents there are (snapshots and reflinks all count). Such a huge operation should only be used if you're sure only the extent tree is corrupted, and the other trees are all OK. Otherwise you'll just screw up your fs further, especially when interrupted.

Backref 1562890240 root 262 owner 483059214 offset 0 num_refs 0 not found in extent tree
Incorrect local backref count on 1562890240 root 262 owner 483059214 offset 0 found 1 wanted 0 back 0x23b0211d0
backpointer mismatch on [1562890240 4096]

This is common, since --init-extent-tree trashes the whole extent tree, so every tree block/data extent will trigger such output.

adding new data backref on 1562890240 root 262 owner 483059214 offset 0 found 1
Repaired extent references for 1562890240

But as you see, it repaired the extent tree by adding EXTENT_ITEM/METADATA_ITEM back into the extent tree, so far it works. If you see such output with the same bytenr every time, then things have gone really wrong and it may be stuck in an endless loop.

Personally speaking, a normal problem like a failed mount should not need --init-extent-tree. In particular, extent tree corruption is normally not related to mount failure, but to a sudden remount to RO and a kernel warning.

Thanks,
Qu

Please avoid typical answers like "potentially dangerous operation", since all repair options are declared as potentially dangerous.

Sebastian

--
Regards
Sebastian Gottschall / CTO
NewMedia-NET GmbH - DD-WRT
Registered office: Berliner Ring 101, 64625 Bensheim
Court of registration: Amtsgericht Darmstadt, HRB 25473
Managing directors: Peter Steinhäuser, Christian Scheele
http://www.dd-wrt.com
email: s.gottsch...@dd-wrt.com
Tel.: +496251-582650 / Fax: +496251-5826565
[PATCH 4/9] btrfs-progs: lowmem check: Fix false alert in checking data extent pointing to prealloc extent
Btrfs lowmem check can report false csum errors like:

ERROR: root 5 EXTENT_DATA[257 0] datasum missing
ERROR: root 5 EXTENT_DATA[257 4096] prealloc shouldn't have datasum

This is because the lowmem check code always compares the found csum size with the whole extent which the data extent points to.

Normally that's OK, but when a prealloc extent is written, or a reflink is done, a data extent can point to part of a larger extent, making the csum check wrong.

The fix changes the csum check to cover only the data extent's own range, rather than disk_bytenr/disk_num_bytes, which describe the larger extent.

Reported-by: Chris Murphy
Signed-off-by: Qu Wenruo
---
 cmds-check.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index f158daf9..fd176b76 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -4695,6 +4695,7 @@ static int check_file_extent(struct btrfs_root *root, struct btrfs_key *fkey,
 	u64 disk_bytenr;
 	u64 disk_num_bytes;
 	u64 extent_num_bytes;
+	u64 extent_offset;
 	u64 found;
 	unsigned int extent_type;
 	unsigned int is_hole;
@@ -4731,17 +4732,28 @@ static int check_file_extent(struct btrfs_root *root, struct btrfs_key *fkey,
 	disk_bytenr = btrfs_file_extent_disk_bytenr(node, fi);
 	disk_num_bytes = btrfs_file_extent_disk_num_bytes(node, fi);
 	extent_num_bytes = btrfs_file_extent_num_bytes(node, fi);
+	extent_offset = btrfs_file_extent_offset(node, fi);
 	is_hole = (disk_bytenr == 0) && (disk_num_bytes == 0);

-	/* Check EXTENT_DATA datasum */
-	ret = count_csum_range(root, disk_bytenr, disk_num_bytes, );
+	/*
+	 * Check EXTENT_DATA datasum
+	 *
+	 * We should only check the range we're referring to, as it's possible
+	 * that part of prealloc extent has been written, and has csum:
+	 *
+	 * |<--- Original large preallocate extent A >|
+	 * |<- Prealloc File Extent ->|<- Regular Extent ->|
+	 *	No csum				Has csum
+	 */
+	ret = count_csum_range(root, disk_bytenr + extent_offset,
+			       extent_num_bytes, );
 	if (found > 0 && nodatasum) {
 		err |= ODD_CSUM_ITEM;
 		error("root %llu EXTENT_DATA[%llu %llu] nodatasum shouldn't have datasum",
 		      root->objectid, fkey->objectid, fkey->offset);
 	} else if (extent_type == BTRFS_FILE_EXTENT_REG && !nodatasum &&
 		   !is_hole &&
-		   (ret < 0 || found == 0 || found < disk_num_bytes)) {
+		   (ret < 0 || found == 0 || found < extent_num_bytes)) {
 		err |= CSUM_ITEM_MISSING;
 		error("root %llu EXTENT_DATA[%llu %llu] datasum missing",
 		      root->objectid, fkey->objectid, fkey->offset);
--
2.11.0
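The difference between the two checked ranges can be shown with a small standalone sketch. It is not btrfs-progs code: struct file_extent and struct csum_range are hypothetical simplifications of the file extent item fields read in the patch, and compression is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of the file extent item fields used above. */
struct file_extent {
	uint64_t disk_bytenr;    /* start of the on-disk extent */
	uint64_t disk_num_bytes; /* size of the whole on-disk extent */
	uint64_t offset;         /* where this file extent starts inside it */
	uint64_t num_bytes;      /* how many bytes this file extent refers to */
};

struct csum_range {
	uint64_t start;
	uint64_t len;
};

/* Range checked before the fix: the whole on-disk extent. */
struct csum_range csum_range_whole(const struct file_extent *fe)
{
	struct csum_range r = { fe->disk_bytenr, fe->disk_num_bytes };

	return r;
}

/* Range checked after the fix: only the part this file extent refers to. */
struct csum_range csum_range_referenced(const struct file_extent *fe)
{
	struct csum_range r;

	assert(fe->offset + fe->num_bytes <= fe->disk_num_bytes);
	r.start = fe->disk_bytenr + fe->offset;
	r.len = fe->num_bytes;
	return r;
}
```

For a 128 KiB preallocated extent of which one 64 KiB half has been written, the whole-extent range covers 128 KiB and sees csums for only half of it, while the referenced range of the written half is exactly the 64 KiB that does have csums.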
[PATCH 9/9] btrfs-progs: fsck: Fix lowmem mode override to allow it skip repair work
From: Lu Fengqi

Current common.local doesn't handle lowmem mode well. It passes "--mode=lowmem" along with "--repair", making it unable to check lowmem mode.

It's caused by the following bugs:

1) Wrong variable in test/common.local
   We should check TEST_ARGS_CHECK, not TEST_CHECK, which is not defined so we never return 1.

2) Wrong parameter passed to _cmd_spec() in test/common
   This prevents us from grepping the correct parameters.

Fix it.

Signed-off-by: Lu Fengqi
---
 tests/common       | 8
 tests/common.local | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tests/common b/tests/common
index 51c2e267..7ad436e3 100644
--- a/tests/common
+++ b/tests/common
@@ -106,7 +106,7 @@ run_check()
 	ins=$(_get_spec_ins "$@")
 	spec=$(($ins-1))
 	cmd=$(eval echo "\${$spec}")
-	spec=$(_cmd_spec "$cmd")
+	spec=$(_cmd_spec "${@:$spec}")
 	set -- "${@:1:$(($ins-1))}" $spec "${@: $ins}"
 	echo "### $@" >> "$RESULTS" 2>&1
 	if [[ $TEST_LOG =~ tty ]]; then echo "CMD: $@" > /dev/tty; fi
@@ -128,7 +128,7 @@ run_check_stdout()
 	ins=$(_get_spec_ins "$@")
 	spec=$(($ins-1))
 	cmd=$(eval echo "\${$spec}")
-	spec=$(_cmd_spec "$cmd")
+	spec=$(_cmd_spec "${@:$spec}")
 	set -- "${@:1:$(($ins-1))}" $spec "${@: $ins}"
 	echo "### $@" >> "$RESULTS" 2>&1
 	if [[ $TEST_LOG =~ tty ]]; then echo "CMD(stdout): $@" > /dev/tty; fi
@@ -152,7 +152,7 @@ run_mayfail()
 	ins=$(_get_spec_ins "$@")
 	spec=$(($ins-1))
 	cmd=$(eval echo "\${$spec}")
-	spec=$(_cmd_spec "$cmd")
+	spec=$(_cmd_spec "${@:$spec}")
 	set -- "${@:1:$(($ins-1))}" $spec "${@: $ins}"
 	echo "### $@" >> "$RESULTS" 2>&1
 	if [[ $TEST_LOG =~ tty ]]; then echo "CMD(mayfail): $@" > /dev/tty; fi
@@ -188,7 +188,7 @@ run_mustfail()
 	ins=$(_get_spec_ins "$@")
 	spec=$(($ins-1))
 	cmd=$(eval echo "\${$spec}")
-	spec=$(_cmd_spec "$cmd")
+	spec=$(_cmd_spec "${@:$spec}")
 	set -- "${@:1:$(($ins-1))}" $spec "${@: $ins}"
 	echo "### $@" >> "$RESULTS" 2>&1
 	if [[ $TEST_LOG =~ tty ]]; then echo "CMD(mustfail): $@" > /dev/tty; fi
diff --git a/tests/common.local b/tests/common.local
index 9f567c27..4f56bb08 100644
--- a/tests/common.local
+++ b/tests/common.local
@@ -17,7 +17,7 @@ TEST_ARGS_CHECK=--mode=lowmem
 # break tests
 _skip_spec()
 {
-	if echo "$TEST_CHECK" | grep -q 'mode=lowmem' &&
+	if echo "$TEST_ARGS_CHECK" | grep -q 'mode=lowmem' &&
 	   echo "$@" | grep -q -- '--repair'; then
 		return 0
 	fi
--
2.11.0
[PATCH 2/9] btrfs-progs: fsck-test: Add test image for lowmem mode block group false alert
Add a minimal image which can reproduce the block group used space false alert for lowmem mode fsck.

Reported-by: Christoph Anton Mitterer
Signed-off-by: Qu Wenruo
---
 .../block_group_item_false_alert.raw.xz | Bin 0 -> 47792 bytes
 tests/fsck-tests/020-extent-ref-cases/test.sh | 15 +++
 2 files changed, 11 insertions(+), 4 deletions(-)
 create mode 100644 tests/fsck-tests/020-extent-ref-cases/block_group_item_false_alert.raw.xz

diff --git a/tests/fsck-tests/020-extent-ref-cases/block_group_item_false_alert.raw.xz b/tests/fsck-tests/020-extent-ref-cases/block_group_item_false_alert.raw.xz
new file mode 100644
index ..559c3fa9e8491f3ce1f424d1baef29853e8fb889
GIT binary patch
literal 47792
[base85-encoded binary image data omitted]
[PATCH 1/9] btrfs-progs: lowmem check: Fix wrong block group check when search slot points to slot beyong leaf
Since btrfs_search_slot() can point to a slot which is beyond the leaf's capacity, in the following case btrfs lowmem mode check will skip the block group and report a false alert:

leaf 29405184 items 37 free space 1273 generation 11 owner 2
...
	item 36 key (77594624 EXTENT_ITEM 2097152)
		extent refs 1 gen 8 flags DATA
		extent data backref root 5 objectid 265 offset 0 count 1
leaf 29409280 items 43 free space 670 generation 11 owner 2
	item 0 key (96468992 EXTENT_ITEM 2097152)
		extent refs 1 gen 8 flags DATA
		extent data backref root 5 objectid 274 offset 0 count 1
	item 1 key (96468992 BLOCK_GROUP_ITEM 33554432)
		block group used 2265088 chunk_objectid 256 flags DATA

When checking the block group item, we search for key (96468992 0 0) to start from the first item in the block group.

search_slot() will then point to leaf 29405184, slot 37, which is beyond the leaf's capacity.

When reading the key from slot 37, uninitialized data can be read out and cause us to exit the block group item check, leading to a false alert.

Fix it by checking path.slots[0] before reading out the key.

Reported-by: Christoph Anton Mitterer
Signed-off-by: Qu Wenruo
---
 cmds-check.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index 1dba2985..c39392b7 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -10961,6 +10961,11 @@ static int check_block_group_item(struct btrfs_fs_info *fs_info,
 	/* Iterate extent tree to account used space */
 	while (1) {
 		leaf = path.nodes[0];
+
+		/* Search slot can point to the last item beyond leaf nritems */
+		if (path.slots[0] >= btrfs_header_nritems(leaf))
+			goto next;
+
 		btrfs_item_key_to_cpu(leaf, _key, path.slots[0]);
 		if (extent_key.objectid >= bg_key.objectid + bg_key.offset)
 			break;
--
2.11.0
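The guard added by this patch follows a general pattern: validate the slot index against the number of items before reading the key. The sketch below illustrates that pattern on a toy leaf; struct leaf, struct key, LEAF_CAPACITY and leaf_read_key() are hypothetical and much simpler than the real btrfs structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LEAF_CAPACITY 64	/* hypothetical fixed capacity for this sketch */

struct key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

struct leaf {
	uint32_t nritems;		/* number of valid slots */
	struct key keys[LEAF_CAPACITY];	/* slots >= nritems hold stale data */
};

/*
 * Copies the key at slot into *out and returns true only when the slot
 * is valid. On false, *out is untouched and the caller should move on
 * to the next leaf instead of interpreting stale bytes.
 */
bool leaf_read_key(const struct leaf *leaf, uint32_t slot, struct key *out)
{
	assert(leaf->nritems <= LEAF_CAPACITY);
	if (slot >= leaf->nritems)
		return false;
	*out = leaf->keys[slot];
	return true;
}
```

In the case from the commit message, the leaf holds 37 items, so slot 36 is readable while slot 37 is rejected and the walk continues with the next leaf.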
[PATCH 6/9] btrfs-progs: tests: Move fsck-tests/015 to fuzz tests
The test case fsck-tests/015-check-bad-memory-access can't be repaired by btrfs check, and it's only a fortunate bug that makes original mode forget the error code from the extent tree, so that original mode passes it.

So fuzz-tests is more suitable for it.

Signed-off-by: Qu Wenruo
---
 .../images}/bko-97171-btrfs-image.raw.txt | 0
 .../images}/bko-97171-btrfs-image.raw.xz | Bin
 2 files changed, 0 insertions(+), 0 deletions(-)
 rename tests/{fsck-tests/015-check-bad-memory-access => fuzz-tests/images}/bko-97171-btrfs-image.raw.txt (100%)
 rename tests/{fsck-tests/015-check-bad-memory-access => fuzz-tests/images}/bko-97171-btrfs-image.raw.xz (100%)

diff --git a/tests/fsck-tests/015-check-bad-memory-access/bko-97171-btrfs-image.raw.txt b/tests/fuzz-tests/images/bko-97171-btrfs-image.raw.txt
similarity index 100%
rename from tests/fsck-tests/015-check-bad-memory-access/bko-97171-btrfs-image.raw.txt
rename to tests/fuzz-tests/images/bko-97171-btrfs-image.raw.txt
diff --git a/tests/fsck-tests/015-check-bad-memory-access/bko-97171-btrfs-image.raw.xz b/tests/fuzz-tests/images/bko-97171-btrfs-image.raw.xz
similarity index 100%
rename from tests/fsck-tests/015-check-bad-memory-access/bko-97171-btrfs-image.raw.xz
rename to tests/fuzz-tests/images/bko-97171-btrfs-image.raw.xz
--
2.11.0
[PATCH 7/9] btrfs-progs: fsck-tests: Make 013 compatible with lowmem mode
fsck-tests/013-extent-tree-rebuild uses "--init-extent-tree", which implies "--repair".

But the test script doesn't pass "--repair" explicitly, so the lowmem mode override cannot detect that this is a repair run. Add it so the lowmem mode test handles it correctly.

Signed-off-by: Qu Wenruo
---
 tests/fsck-tests/013-extent-tree-rebuild/test.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/fsck-tests/013-extent-tree-rebuild/test.sh b/tests/fsck-tests/013-extent-tree-rebuild/test.sh
index 37bdcd9c..08c1e50e 100755
--- a/tests/fsck-tests/013-extent-tree-rebuild/test.sh
+++ b/tests/fsck-tests/013-extent-tree-rebuild/test.sh
@@ -36,7 +36,7 @@ test_extent_tree_rebuild()
 	$SUDO_HELPER $TOP/btrfs check $TEST_DEV >& /dev/null && \
 		_fail "btrfs check should detect failure"

-	run_check $SUDO_HELPER $TOP/btrfs check --init-extent-tree $TEST_DEV
+	run_check $SUDO_HELPER $TOP/btrfs check --repair --init-extent-tree $TEST_DEV
 	run_check $SUDO_HELPER $TOP/btrfs check $TEST_DEV
 }
--
2.11.0
[PATCH 8/9] btrfs-progs: fsck-tests: Add new test case for partly written prealloc extent
This is a bug found in lowmem mode, which reports a false alert for a partly written prealloc extent.

Reported-by: Chris Murphy
Signed-off-by: Qu Wenruo
---
 tests/fsck-tests/020-extent-ref-cases/test.sh | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/tests/fsck-tests/020-extent-ref-cases/test.sh b/tests/fsck-tests/020-extent-ref-cases/test.sh
index 5dc5e55d..91340671 100755
--- a/tests/fsck-tests/020-extent-ref-cases/test.sh
+++ b/tests/fsck-tests/020-extent-ref-cases/test.sh
@@ -18,6 +18,7 @@
 source $TOP/tests/common

 check_prereq btrfs
+check_global_prereq xfs_io

 for img in *.img *.raw.xz
 do
@@ -28,3 +29,17 @@ do
 	run_check $TOP/btrfs check "$image"
 	rm -f "$image"
 done
+
+# Extra test case for partly written prealloc extents.
+test_prealloc_written()
+{
+	run_check $SUDO_HELPER $TOP/mkfs.btrfs -f $TEST_DEV
+
+	run_check_mount_test_dev
+	xfs_io -f -c "falloc 0 128k" -c "syncfs" $TEST_MNT/tmpfile
+	xfs_io -c "pwrite 0 64k" $TEST_MNT/tmpfile
+
+	run_check_umount_test_dev
+
+	run_check $TOP/btrfs check $TEST_DEV
+}
--
2.11.0
[PATCH 0/9] Lowmem mode fsck fixes with fsck-tests framework update
Patches can be fetched from github:
https://github.com/adam900710/btrfs-progs/tree/lowmem_fixes

Although there are nearly 10 patches, they are all small. No need to be scared. :)

Thanks to reports from Chris Murphy and Christoph Anton Mitterer, several new bugs were exposed in lowmem mode fsck, plus one original mode bug, which is not fixed in this patchset.

The original mode bug is triggered by fsck-tests/006, where repair doesn't fix the backrefs of a data extent. Lowmem mode detects this correctly.

1) Block group used space false alert
   If a BLOCK_GROUP_ITEM or its first EXTENT/METADATA_ITEM is located at the first slot of a leaf, search_slot() used by lowmem mode can point to the previous leaf, with path->slots[0] beyond the valid leaf slots.

   This makes us read uninitialized data, and can abort the block group used space check loop, causing a false alert.

   Fix it, with a test case image inside fsck-tests/020-extent-ref-cases.
   Reported by Christoph.

2) Partly written prealloc extent false alert
   If a prealloc extent gets partly written, lowmem mode will report that a prealloc extent shouldn't have csum.

   Lowmem mode passed the wrong variable to the csum checking code, causing it to check the whole range of the prealloc extent, which triggers the bug.

   Fix it, with a test case inside fsck-tests/020-extent-ref-cases.
   Reported by Chris Murphy and Christoph.

3) Extent item size false alert
   In certain cases, btrfs lowmem mode check reports a lost data backref. This is because the newly introduced extent item size check aborts the normal check routine. It can happen if a data/metadata extent item has no inline ref.

   Fix it. The test case was already submitted and merged, but due to fsck-tests framework bugs it never got called for lowmem mode.

4) fsck-tests lowmem mode override fixes
   Allow the lowmem mode override to get called for all tests, and allow them all to pass in lowmem mode except fsck-tests/006, which is an original repair mode bug.
Lu Fengqi (1):
  btrfs-progs: fsck: Fix lowmem mode override to allow it skip repair work

Qu Wenruo (8):
  btrfs-progs: lowmem check: Fix wrong block group check when search slot points to slot beyong leaf
  btrfs-progs: fsck-test: Add test image for lowmem mode block group false alert
  btrfs-progs: fsck: Output verbose error when fsck found a bug
  btrfs-progs: lowmem check: Fix false alert in checking data extent pointing to prealloc extent
  btrfs-progs: lowmem check: Fix extent item size false alert
  btrfs-progs: tests: Move fsck-tests/015 to fuzz tests
  btrfs-progs: fsck-tests: Make 013 compatible with lowmem mode
  btrfs-progs: fsck-tests: Add new test case for partly written prealloc extent

 cmds-check.c | 80 -
 tests/common | 8 +--
 tests/common.local | 2 +-
 tests/fsck-tests/013-extent-tree-rebuild/test.sh | 2 +-
 .../block_group_item_false_alert.raw.xz | Bin 0 -> 47792 bytes
 tests/fsck-tests/020-extent-ref-cases/test.sh | 30 ++--
 .../images}/bko-97171-btrfs-image.raw.txt | 0
 .../images}/bko-97171-btrfs-image.raw.xz | Bin
 8 files changed, 95 insertions(+), 27 deletions(-)
 create mode 100644 tests/fsck-tests/020-extent-ref-cases/block_group_item_false_alert.raw.xz
 rename tests/{fsck-tests/015-check-bad-memory-access => fuzz-tests/images}/bko-97171-btrfs-image.raw.txt (100%)
 rename tests/{fsck-tests/015-check-bad-memory-access => fuzz-tests/images}/bko-97171-btrfs-image.raw.xz (100%)
-- 
2.11.0
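Item 1) of the cover letter describes a search that returns with path->slots[0] past the last valid slot of a leaf. The usual remedy for that situation can be sketched as follows. This is only a toy model under stated assumptions: struct toy_leaf, struct toy_path and toy_settle_slot() are hypothetical names made up for illustration and are not btrfs-progs types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a b-tree leaf and a search path (hypothetical types). */
struct toy_leaf {
	int nritems;		/* number of valid slots in this leaf */
	int keys[4];		/* keys stored in slots 0..nritems-1 */
	struct toy_leaf *next;	/* right sibling, or NULL at the end of the tree */
};

struct toy_path {
	struct toy_leaf *leaf;
	int slot;
};

/*
 * Make sure the path points at a valid item before anything is read from it.
 * Returns 0 when path->leaf->keys[path->slot] is safe to read,
 * 1 when the search ran off the end of the tree.
 */
int toy_settle_slot(struct toy_path *path)
{
	assert(path != NULL && path->leaf != NULL);

	/* A slot index >= nritems means "just past this leaf": move right. */
	while (path->slot >= path->leaf->nritems) {
		if (path->leaf->next == NULL)
			return 1;
		path->leaf = path->leaf->next;
		path->slot = 0;
	}
	return 0;
}
```

Reading keys[slot] without this check is the kind of access that yields uninitialized data in the scenario described above.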
[PATCH 5/9] btrfs-progs: lowmem check: Fix extent item size false alert
If an extent item has no inline ref, btrfs lowmem mode check can give a false alert without outputting any error message.

The problem is that lowmem mode always assumes an extent item has inline refs. When it encounters an item without any, it flags the extent item as having the wrong size, but doesn't output an error message.

Although a test image for this case has already been submitted, another bug in the cmds-check return value at commit time kept the problem from being detected until that bug was fixed.

Signed-off-by: Qu Wenruo
---
 cmds-check.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index fd176b76..802d179f 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -10730,13 +10730,20 @@ static int check_extent_item(struct btrfs_fs_info *fs_info,
 	}
 	end = (unsigned long)ei + item_size;
 
-	if (ptr >= end) {
+next:
+	/* Reached extent item end normally */
+	if (ptr == end)
+		goto out;
+
+	/* Beyond extent item end, wrong item size */
+	if (ptr > end) {
 		err |= ITEM_SIZE_MISMATCH;
+		error("extent item at bytenr %llu slot %d has wrong size",
+			eb->start, slot);
 		goto out;
 	}
 
 	/* Now check every backref in this extent item */
-next:
 	iref = (struct btrfs_extent_inline_ref *)ptr;
 	type = btrfs_extent_inline_ref_type(eb, iref);
 	offset = btrfs_extent_inline_ref_offset(eb, iref);
@@ -10773,8 +10780,7 @@ next:
 	}
 
 	ptr += btrfs_extent_inline_ref_size(type);
-	if (ptr < end)
-		goto next;
+	goto next;
 
 out:
 	return err;
-- 
2.11.0
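The control flow after this patch treats "ptr == end" as a normal finish and only "ptr > end" as a size mismatch, so an item with zero inline refs passes. A minimal sketch of that loop shape, assuming record sizes are handed in as an array rather than decoded from an extent buffer (toy_check_item() and TOY_ITEM_SIZE_MISMATCH are hypothetical names, not the cmds-check.c ones):

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ITEM_SIZE_MISMATCH 1	/* hypothetical error bit */

/*
 * Walk variable-size records packed into an item body of body_size bytes.
 * rec_sizes[i] is the size of the i-th record and nrecs is how many
 * records are available to read.
 */
int toy_check_item(size_t body_size, const size_t *rec_sizes, size_t nrecs)
{
	size_t ptr = 0;
	size_t i = 0;

	assert(rec_sizes != NULL || nrecs == 0);

	for (;;) {
		/* Reached the end exactly: fine, even with zero records. */
		if (ptr == body_size)
			return 0;
		/* Overshot the end, or ran out of records before reaching it. */
		if (ptr > body_size || i == nrecs)
			return TOY_ITEM_SIZE_MISMATCH;
		ptr += rec_sizes[i++];
	}
}
```

With the old "ptr >= end" test placed before the loop, the zero-record case (ptr already equal to end) would have been reported as a mismatch, which is the false alert the patch removes.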
[PATCH 3/9] btrfs-progs: fsck: Output verbose error when fsck found a bug
Although we output an error like "errors found in extent allocation tree or chunk allocation", we lack such output for other trees, leaving the final "err is %d" to report only the last return value (which is sometimes cleared).

This patch adds extra error messages to the top-level error paths, and changes the final "err is %d" to "error(s) found" or "no error found".

Cc: Christoph Anton Mitterer
Signed-off-by: Qu Wenruo
---
 cmds-check.c | 43 +--
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index c39392b7..f158daf9 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -12913,8 +12913,10 @@ int cmd_check(int argc, char **argv)
 
 	ret = repair_root_items(info);
 	err |= !!ret;
-	if (ret < 0)
+	if (ret < 0) {
+		error("failed to repair root items: %s", strerror(-ret));
 		goto close_out;
+	}
 	if (repair) {
 		fprintf(stderr, "Fixed %d roots.\n", ret);
 		ret = 0;
@@ -12937,8 +12939,13 @@ int cmd_check(int argc, char **argv)
 	}
 	ret = check_space_cache(root);
 	err |= !!ret;
-	if (ret)
+	if (ret) {
+		if (btrfs_fs_compat_ro(info, FREE_SPACE_TREE))
+			error("errors found in free space tree");
+		else
+			error("errors found in free space cache");
 		goto out;
+	}
 
 	/*
 	 * We used to have to have these hole extents in between our real
@@ -12954,22 +12961,28 @@ int cmd_check(int argc, char **argv)
 	else
 		ret = check_fs_roots(root, &root_cache);
 	err |= !!ret;
-	if (ret)
+	if (ret) {
+		error("errors found in fs roots");
 		goto out;
+	}
 
 	fprintf(stderr, "checking csums\n");
 	ret = check_csums(root);
 	err |= !!ret;
-	if (ret)
+	if (ret) {
+		error("errors found in csum tree");
 		goto out;
+	}
 
 	fprintf(stderr, "checking root refs\n");
 	/* For low memory mode, check_fs_roots_v2 handles root refs */
 	if (check_mode != CHECK_MODE_LOWMEM) {
 		ret = check_root_refs(root, &root_cache);
 		err |= !!ret;
-		if (ret)
+		if (ret) {
+			error("errors found in root refs");
 			goto out;
+		}
 	}
 
 	while (repair && !list_empty(&root->fs_info->recow_ebs)) {
@@ -12980,8 +12993,10 @@ int cmd_check(int argc, char **argv)
 		list_del_init(&eb->recow);
 		ret = recow_extent_buffer(root, eb);
 		err |= !!ret;
-		if (ret)
+		if (ret) {
+			error("fails to fix transid errors");
 			break;
+		}
 	}
 
 	while (!list_empty(&delete_items)) {
@@ -13000,13 +13015,17 @@ int cmd_check(int argc, char **argv)
 		fprintf(stderr, "checking quota groups\n");
 		ret = qgroup_verify_all(info);
 		err |= !!ret;
-		if (ret)
+		if (ret) {
+			error("failed to check quota groups");
 			goto out;
+		}
 		report_qgroups(0);
 		ret = repair_qgroups(info, &qgroups_repaired);
 		err |= !!ret;
-		if (err)
+		if (err) {
+			error("failed to repair quota groups");
 			goto out;
+		}
 		ret = 0;
 	}
 
@@ -13027,8 +13046,12 @@ out:
 			"backup data and re-format the FS. *\n\n");
 		err |= 1;
 	}
-	printf("found %llu bytes used err is %d\n",
-	       (unsigned long long)bytes_used, ret);
+	printf("found %llu bytes used, ",
+	       (unsigned long long)bytes_used);
+	if (err)
+		printf("error(s) found\n");
+	else
+		printf("no error found\n");
 	printf("total csum bytes: %llu\n",(unsigned long long)total_csum_bytes);
 	printf("total tree bytes: %llu\n",
 	       (unsigned long long)total_btree_bytes);
-- 
2.11.0
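The pattern the patch relies on is "err |= !!ret": ret holds only the latest step's result and may later be reset to 0, while err keeps a sticky 0/1 flag for the final summary. A minimal stand-alone sketch of that idea, assuming the per-step results are supplied as an array (toy_run_checks() and its parameters are hypothetical, not part of cmds-check.c):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Run each step in order, stop at the first failure, and report a summary
 * based on the sticky err flag rather than on the last return value.
 */
int toy_run_checks(const int *step_results, const char *const *step_names,
		   int nsteps)
{
	int err = 0;
	int i;

	assert(nsteps >= 0);

	for (i = 0; i < nsteps; i++) {
		int ret = step_results[i];

		/* Any non-zero ret, positive or negative, turns err into 1. */
		err |= !!ret;
		if (ret) {
			fprintf(stderr, "ERROR: errors found in %s\n",
				step_names[i]);
			break;
		}
	}
	printf("%s\n", err ? "error(s) found" : "no error found");
	return err;
}
```

Because the summary is derived from err, a later "ret = 0" cannot hide an earlier failure, which is the weakness of printing the raw return value.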
Re: RAID56 status?
On Mon, Jan 23, 2017 at 7:57 AM, Brendan Hide wrote:
>
> raid0 stripes data in 64k chunks (I think this size is tunable) across all
> devices, which is generally far faster in terms of throughput in both
> writing and reading data.

I remember seeing some proposals for configurable stripe size in the form of patches (which changed a lot over time), but I don't think the idea reached a consensus (let alone that a final patch materialized and got merged). I think it would be a nice feature, though.
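For readers unfamiliar with striping, the quoted description (64k pieces spread round-robin across all devices) can be sketched with generic RAID0 arithmetic. This is textbook striping under the assumption of a fixed 64 KiB stripe length and equal-sized devices. It is not the btrfs chunk mapping code, and toy_raid0_map(), struct toy_stripe_loc and TOY_STRIPE_LEN are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_STRIPE_LEN (64 * 1024ULL)	/* 64 KiB, the figure quoted above */

struct toy_stripe_loc {
	unsigned int dev;	/* index of the device holding the byte */
	uint64_t dev_offset;	/* byte offset within that device's share */
};

/* Map a logical byte offset to (device, offset) for ndevs striped devices. */
struct toy_stripe_loc toy_raid0_map(uint64_t logical, unsigned int ndevs)
{
	struct toy_stripe_loc loc;
	uint64_t stripe_nr;

	assert(ndevs > 0);

	stripe_nr = logical / TOY_STRIPE_LEN;		/* which 64 KiB piece */
	loc.dev = (unsigned int)(stripe_nr % ndevs);	/* round-robin device */
	loc.dev_offset = (stripe_nr / ndevs) * TOY_STRIPE_LEN +
			 logical % TOY_STRIPE_LEN;
	return loc;
}
```

Consecutive 64 KiB pieces land on different devices, so a large sequential read or write keeps all devices busy at once. A configurable stripe size would amount to making TOY_STRIPE_LEN a parameter.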
Re: RAID56 status?
Hey, all

Long-time lurker/commenter here. Production-ready RAID5/6 and N-way mirroring are the two features I've been anticipating most, so I've commented regularly when this sort of thing pops up. :)

I'm only addressing some of the RAID-types queries as Qu already has a handle on the rest.

Small-yet-important hint: If you don't have a backup of it, it isn't important.

On 01/23/2017 02:25 AM, Jan Vales wrote:
> [ snip ]
> Correct me, if I'm wrong...
> * It seems, raid1(btrfs) is actually raid10, as there are no more than 2
> copies of data, regardless of the count of devices.

The original "definition" of raid1 is two mirrored devices. The *nix industry standard implementation (mdadm) extends this to any number of mirrored devices. Thus confusion here is understandable.

> ** Is there a way to duplicate data n-times?

This is a planned feature, especially in view of feature-parity with mdadm, though the priority isn't particularly high right now. This has been referred to as "N-way mirroring". The last time I recall discussion over this, it was hoped to get work started on it after raid5/6 was stable.

> ** If there are only 3 devices and the wrong device dies... is it dead?

Qu has the right answers. Generally if you're using anything other than dup, raid0, or single, one disk failure is "okay". More than one failure is closer to "undefined". Except with RAID6, where you need to have more than two disk failures before you have lost data.

> * What's the difference of raid1(btrfs) and raid10(btrfs)?

Some nice illustrations from Qu there. :)

> ** After reading like 5 different wiki pages, I understood, that there
> are differences ... but not what they are and how they affect me :/
> * What's the difference of raid0(btrfs) and "normal" multi-device
> operation which seems like a traditional raid0 to me?

raid0 stripes data in 64k chunks (I think this size is tunable) across all devices, which is generally far faster in terms of throughput in both writing and reading data.

By '"normal" multi-device' I will assume this means "single" with multiple devices. New writes with "single" will use a 1GB chunk on one device until the chunk is full, at which point it allocates a new chunk, which will usually be put on the disk with the most available free space. There is no particular optimisation in place comparable to raid0 here.

> Maybe rename/alias raid-levels that do not match traditional
> raid-levels, so one cannot expect some behavior that is not there.
> The extreme example is imho raid1(btrfs) vs raid1.
> I would expect that if I have 5 btrfs-raid1-devices, 4 may die and btrfs
> should be able to fully recover, which, if I understand correctly, by
> far does not hold.
> If you named that raid-level say "george" ... I would need to consult
> the docs and I obviously would not expect any behavior. :)

We've discussed this a couple of times. Hugo came up with a notation since dubbed "csp" notation: c->Copies, s->Stripes, and p->Parities.

Examples of this would be:
raid1: 2c
3-way mirroring across 3 (or more*) devices: 3c
raid0 (2-or-more-devices): 2s
raid0 (3-or-more): 3s
raid5 (5-or-more): 4s1p
raid16 (12-or-more): 2c4s2p

* note the "or more": mdadm *cannot* use fewer mirrors or stripes than devices, whereas there is no particular reason why btrfs won't be able to do this.

A minor problem with csp notation is that it implies a complete implementation of *any* combination of these, whereas the idea was simply to create a way to refer to the "raid" levels in a consistent way.

I hope this brings some clarity. :)

> regards,
> Jan Vales
> --
> I only read plaintext emails.

-- 
__
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97
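The device counts in the csp examples above all fit one formula: copies x (stripes + parities), with an omitted c or s read as 1 and an omitted p read as 0. That reading is inferred from the listed examples and is an assumption, not a definition taken from the thread. toy_csp_min_devices() is a hypothetical helper written only to check the arithmetic.

```c
/*
 * Minimum device count for a profile written as <c>c<s>s<p>p.
 * An omitted field is passed as 0: copies and stripes then default to 1,
 * parities to none.
 */
unsigned int toy_csp_min_devices(unsigned int copies, unsigned int stripes,
				 unsigned int parities)
{
	if (copies == 0)
		copies = 1;
	if (stripes == 0)
		stripes = 1;
	/* Each copy needs its own full set of data and parity stripes. */
	return copies * (stripes + parities);
}
```

For instance, 2c4s2p gives 2 x (4 + 2) = 12 and 4s1p gives 1 x (4 + 1) = 5, matching the "12-or-more" and "5-or-more" figures in the list.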