[PATCH] btrfs-progs: Check the csum tree node before go through the csum tree
[BUG] Some fsfuzzed btrfs image will cause btrfsck segfault. [REPRODUCER] Run btrfsck on a csum tree block corrupted image. [REASON] check_csums() function call btrfs_search_slot() on csum_tree but doesn't check whether the csum_tree contains a valid extent_buffer, which causes the segfault. [FIX] Check the csum_root-node before any search. Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com --- cmds-check.c | 4 1 file changed, 4 insertions(+) diff --git a/cmds-check.c b/cmds-check.c index 9471709..537376a 100644 --- a/cmds-check.c +++ b/cmds-check.c @@ -3905,6 +3905,10 @@ static int check_csums(struct btrfs_root *root) unsigned long leaf_offset; root = root-fs_info-csum_root; + if (!extent_buffer_uptodate(root-node)) { + fprintf(stderr, No valid csum tree found\n); + return -ENOENT; + } key.objectid = BTRFS_EXTENT_CSUM_OBJECTID; key.type = BTRFS_EXTENT_CSUM_KEY; -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Btrfs-progs: make BUG*() be more verbose
On Thu, Sep 25, 2014 at 04:07:10PM -0400, Josef Bacik wrote: Currently these macros just tie to assert(), which gives us line number and such but no backtrace so no actual context. This patch adds support for spitting out a backtrace so we can see how we got to the given assert. Thanks, https://patchwork.kernel.org/patch/4757041/ There are differences and I find Josef's implementation better in some respects and Naota's in others. Below is what I'm going to apply with sign-offs from both of you. Notable change is the use of backtrace_symbols_fd, it does not malloc/free the memory, though the output is slighly different from backtrace_symbols, still readable. --- a/kerncompat.h +++ b/kerncompat.h @@ -55,21 +55,17 @@ #define ULONG_MAX (~0UL) #endif +#define MAX_BACKTRACE 16 static inline void print_trace(void) { - void *array[10]; + void *array[MAX_BACKTRACE]; size_t size; - char **strings; - size_t i; - - size = backtrace(array, 10); - strings = backtrace_symbols(array, size); - for (i = 0; i size; i++) - fprintf(stderr, \t%s\n, strings[i]); - free(strings); + + size = backtrace(array, MAX_BACKTRACE); + backtrace_symbols_fd(array, size, 2); } -static inline void btr_assert(const char *assertion, const char *filename, +static inline void assert_trace(const char *assertion, const char *filename, const char *func, unsigned line, int val) { if (val) @@ -84,7 +80,7 @@ static inline void btr_assert(const char *assertion, const char *filename, exit(1); } -#define BUG() btr_assert(NULL, __FILE__, __func__, __LINE__, 0) +#define BUG() assert_trace(NULL, __FILE__, __func__, __LINE__, 0) #ifdef __CHECKER__ #define __force__attribute__((force)) @@ -268,10 +264,10 @@ static inline long IS_ERR(const void *ptr) #define kstrdup(x, y) strdup(x) #define kfree(x) free(x) -#define BUG_ON(c) btr_assert(#c, __FILE__, __func__, __LINE__, !(c)) +#define BUG_ON(c) assert_trace(#c, __FILE__, __func__, __LINE__, !(c)) #define WARN_ON(c) BUG_ON(c) -#defineASSERT(c) btr_assert(#c, __FILE__, __func__, __LINE__, (c)) +#defineASSERT(c) assert_trace(#c, __FILE__, __func__, __LINE__, (c)) #define container_of(ptr, type, member) ({ \ const typeof( ((type *)0)-member ) *__mptr = (ptr);\ -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs-progs: Don't return any fs_info that contain NULL tree_root or fs_root.
On Tue, Sep 30, 2014 at 10:39:22AM +0800, Qu Wenruo wrote: --- a/disk-io.c +++ b/disk-io.c @@ -1134,7 +1134,8 @@ static struct btrfs_fs_info *__open_ctree_fd(int fp, const char *path, return fs_info; out_failed: - if (flags OPEN_CTREE_PARTIAL) + if (flags OPEN_CTREE_PARTIAL + fs_info-tree_root fs_info-fs_root) return fs_info; I see a conflict with a pending patch https://patchwork.kernel.org/patch/4254631/ that removes the check completely but fixes the crash in another way. I like Wang's patch because it keeps the logic about partial open inside btrfs_setup_all_roots(). Please test if it fixes the crash with the corrupted image you have. Thanks. out_chunk: btrfs_release_all_roots(fs_info); -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] btrfs: introduce procfs interface for the device list
On Mon, Sep 29, 2014 at 1:09 AM, Anand Jain anand.j...@oracle.com wrote: From: Anand Jain anand.j...@oracle.com (added RFC prefix to the patch header) (as of now just an experimental interface) This patch introduces profs interface /proc/fs/btrfs/devlist, which as of now exports all the members of kernel fs_devices. The current /sys/fs/btrfs interface works when the fs is mounted, and is on the file directory hierarchy and also has the sysfs limitation max output of U64 per file. Here btrfs procfs uses seq_file to export all the members of fs_devices. Also shows the contents when device is not mounted, but have registered with btrfs kernel (useful as an alternative to buggy ready ioctl) An attempt is made to follow the some standard file format output such as ini. So that a simple warper python script will provide end user useful interfaces. Further planning to add few more members to the interface such as group profile info. The long term idea is to make procfs interface a onestop btrfs application interface for the device and fs info from the kernel, where a simple python script can make use of it. Hi Anand, We're going to have a really hard time getting a new proc interface merged in, and after we've recently fixed up all (most?) of our sysfs races, I'd rather not have to do it all over again with /proc. I know the lack of a seq interface is a difficult compromise to make in sysfs, but at this point I think we're stuck with it. Which specific part do you hope to improve by dumping more information out in a single file? -chris -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs-progs: initial reference count of extent buffer is correct?
Hi, Liu Thank you for your explanation, and I'm sorry for this long silence. Liu Bo bo.li@oracle.com writes: You may think of it twice, commit 53ee1bccf99cd5b474fe1aa857b7dd176e3a1407 is to fix a bug of assigning a free block to two different extent buffers, ie. two different extent buffers' share the same eb-start, so it's not just bumping a reference cnt. Right now we want to be consistent with the kernel side, decreasing eb-refs=0 means it'd be freed, so droping free_some_buffer() can be a good choice. Now I understand the reason why @refs = 1 initially. I'll post a patch to drop free_some_buffer() after this. And for caching extent buffer, we've increased eb-refs by 1 to keep it in the cache rbtree. thanks, -liubo Regards, Naohiro -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs-progs: do not reclaim extent buffer
We should kill free_some_buffers() to stop reclaiming extent buffers or we will hit a problem described below. As of commit 53ee1bccf99cd5b474fe1aa857b7dd176e3a1407, we are not counting a reference for tree-lru anymore. However free_some_buffers() is still left and is reclaiming extent buffers whose @refs == 1. This cause extent buffers to be reclaimed unintentionally. Thus the following steps could happen: 1. A buffer at address A is reclaimed by free_some_buffers() (address A is also free()ed) 2. Some code call alloc_extent_buffer() 3. Address A is assigned to newly allocated buffer 4. You see a buffer pointed by A suddenly changed its content This problem is also pointed out here and it has a reproducer: https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg36703.html This commit drop free_some_buffers() and related variables, and also it modify extent_io_tree_cleanup() to catch non-free'ed buffers properly. Signed-off-by: Naohiro Aota na...@elisp.net --- extent_io.c | 37 +++-- 1 file changed, 3 insertions(+), 34 deletions(-) diff --git a/extent_io.c b/extent_io.c index 1df377d..425af8a 100644 --- a/extent_io.c +++ b/extent_io.c @@ -30,9 +30,6 @@ #include ctree.h #include volumes.h -static u64 cache_soft_max = 1024 * 1024 * 256; -static u64 cache_hard_max = 1 * 1024 * 1024 * 1024; - void extent_io_tree_init(struct extent_io_tree *tree) { cache_tree_init(tree-state); @@ -77,12 +74,9 @@ void extent_io_tree_cleanup(struct extent_io_tree *tree) while(!list_empty(tree-lru)) { eb = list_entry(tree-lru.next, struct extent_buffer, lru); - if (eb-refs != 1) { - fprintf(stderr, extent buffer leak: - start %llu len %u\n, - (unsigned long long)eb-start, eb-len); - eb-refs = 1; - } + fprintf(stderr, extent buffer leak: + start %llu len %u\n, + (unsigned long long)eb-start, eb-len); free_extent_buffer(eb); } @@ -541,30 +535,6 @@ out: return ret; } -static int free_some_buffers(struct extent_io_tree *tree) -{ - u32 nrscan = 0; - struct extent_buffer *eb; - struct list_head *node, *next; - - if (tree-cache_size cache_soft_max) - return 0; - - list_for_each_safe(node, next, tree-lru) { - eb = list_entry(node, struct extent_buffer, lru); - if (eb-refs == 1 !(eb-flags EXTENT_DIRTY)) { - free_extent_buffer(eb); - if (tree-cache_size cache_hard_max) - break; - } else { - list_move_tail(eb-lru, tree-lru); - } - if (nrscan++ 64 tree-cache_size cache_hard_max) - break; - } - return 0; -} - static struct extent_buffer *__alloc_extent_buffer(struct extent_io_tree *tree, u64 bytenr, u32 blocksize) { @@ -589,7 +559,6 @@ static struct extent_buffer *__alloc_extent_buffer(struct extent_io_tree *tree, eb-cache_node.size = blocksize; INIT_LIST_HEAD(eb-recow); - free_some_buffers(tree); ret = insert_cache_extent(tree-cache, eb-cache_node); if (ret) { free(eb); -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] btrfs-progs: Don't return any fs_info that contain NULL tree_root or fs_root.
btrfs_setup_all_roots(). Please test if it fixes the crash with the corrupted image you have. Thanks. Perhaps with a test in xfstests. - z -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
On Mon, Sep 29, 2014 at 03:03:48PM -0400, Josef Bacik wrote: On 09/26/2014 12:14 AM, Eryu Guan wrote: Run btrfs balance and subvolume create/mount/umount/delete simultaneously, with fsstress running in background. Signed-off-by: Eryu Guan eg...@redhat.com --- [snip] +args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir` +echo Run fsstress $args $seqres.full +$FSSTRESS_PROG $args /dev/null 21 +fsstress_pid=$! + +echo -n Start balance worker: $seqres.full +_btrfs_stress_balance $SCRATCH_MNT /dev/null 21 +balance_pid=$! +echo $balance_pid $seqres.full + +echo -n Start subvolume worker: $seqres.full +_btrfs_stress_subvolume $SCRATCH_DEV $SCRATCH_MNT subvol_$$ $subvol_mnt /dev/null 21 +subvol_pid=$! +echo $subvol_pid $seqres.full + +echo Wait for fsstress to exit and kill all background workers $seqres.full +wait $fsstress_pid + +kill $balance_pid $subvol_pid +wait +# wait for the balance operation to finish +while ps aux | grep balance start | grep -qv grep; do +sleep 1 +done This bit isn't needed, killing the balance pid also won't do anything, just waiting is enough, it'll only exit once the balance is finished. If you really want to stop the balance you can use I think it's still needed, or the balance process would block the later umount and test fails like --- tests/btrfs/059.out 2014-09-26 11:35:43.932850826 +0800 +++ /root/xfstests/results//btrfs/059.out.bad 2014-09-30 23:29:47.538310375 +0800 @@ -1,2 +1,9 @@ QA output created by 059 Silence is golden +umount: /mnt/testarea/scratch: target is busy. +(In some cases useful info about processes that use + the device is found by lsof(8) or fuser(1)) +umount: /mnt/testarea/scratch: target is busy. +(In some cases useful info about processes that use + the device is found by lsof(8) or fuser(1)) +_check_btrfs_filesystem: filesystem on /dev/mapper/rhel_hp--dl388eg8--01-testlv2 is inconsistent (see /root/xfstests/results//btrfs/059.full) Adding debug output of lsof $SCRATCH_MNT and ps aux | grep btrfs before umount shows it's the balance process is still running and blocks the umount. lsof /mnt/testarea/scratch btrfs 13491 root3r DIR 0,40 42 256 /mnt/testarea/scratch ps aux | grep btrfs root 13491 7.0 0.0 18660 1312 pts/1D+ 23:29 0:00 /usr/sbin/btrfs balance start /mnt/testarea/scratch Killing the balance pid is to stop any further balance operation, wait is waiting for the child process (_btrfs_stress_balance), not the child's child process (balance process). So only wait is not enough, we should wait for the balance to finish explicitly. This is also true for the rest of the test cases in this series. This is kind of ugly, but I cannot figure out a better solution right now.. Thanks, Eryu -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
On 09/30/2014 11:48 AM, Eryu Guan wrote: On Mon, Sep 29, 2014 at 03:03:48PM -0400, Josef Bacik wrote: On 09/26/2014 12:14 AM, Eryu Guan wrote: Run btrfs balance and subvolume create/mount/umount/delete simultaneously, with fsstress running in background. Signed-off-by: Eryu Guan eg...@redhat.com --- [snip] + args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir` + echo Run fsstress $args $seqres.full + $FSSTRESS_PROG $args /dev/null 21 + fsstress_pid=$! + + echo -n Start balance worker: $seqres.full + _btrfs_stress_balance $SCRATCH_MNT /dev/null 21 + balance_pid=$! + echo $balance_pid $seqres.full + + echo -n Start subvolume worker: $seqres.full + _btrfs_stress_subvolume $SCRATCH_DEV $SCRATCH_MNT subvol_$$ $subvol_mnt /dev/null 21 + subvol_pid=$! + echo $subvol_pid $seqres.full + + echo Wait for fsstress to exit and kill all background workers $seqres.full + wait $fsstress_pid + + kill $balance_pid $subvol_pid + wait + # wait for the balance operation to finish + while ps aux | grep balance start | grep -qv grep; do + sleep 1 + done This bit isn't needed, killing the balance pid also won't do anything, just waiting is enough, it'll only exit once the balance is finished. If you really want to stop the balance you can use I think it's still needed, or the balance process would block the later umount and test fails like --- tests/btrfs/059.out 2014-09-26 11:35:43.932850826 +0800 +++ /root/xfstests/results//btrfs/059.out.bad 2014-09-30 23:29:47.538310375 +0800 @@ -1,2 +1,9 @@ QA output created by 059 Silence is golden +umount: /mnt/testarea/scratch: target is busy. +(In some cases useful info about processes that use + the device is found by lsof(8) or fuser(1)) +umount: /mnt/testarea/scratch: target is busy. +(In some cases useful info about processes that use + the device is found by lsof(8) or fuser(1)) +_check_btrfs_filesystem: filesystem on /dev/mapper/rhel_hp--dl388eg8--01-testlv2 is inconsistent (see /root/xfstests/results//btrfs/059.full) Adding debug output of lsof $SCRATCH_MNT and ps aux | grep btrfs before umount shows it's the balance process is still running and blocks the umount. lsof /mnt/testarea/scratch btrfs 13491 root3r DIR 0,40 42 256 /mnt/testarea/scratch ps aux | grep btrfs root 13491 7.0 0.0 18660 1312 pts/1D+ 23:29 0:00 /usr/sbin/btrfs balance start /mnt/testarea/scratch Killing the balance pid is to stop any further balance operation, wait is waiting for the child process (_btrfs_stress_balance), not the child's child process (balance process). So only wait is not enough, we should wait for the balance to finish explicitly. This is also true for the rest of the test cases in this series. This is kind of ugly, but I cannot figure out a better solution right now.. Oh duh sorry, I missed that it was a loop of btrfs balance start, not just btrfs balance start . Ok in that case instead use btrfs balance status and loop on that until it is done and then exit, that way it is clear what we are waiting on, but really I'm not married to the idea either so if it ends up being too much trouble just skip it. You can add Reviewed-by: Josef Bacik jba...@fb.com Whenever you re-roll with your whitespace changes and whatever other cleanups you add. Thanks, Josef -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
On Tue, Sep 30, 2014 at 11:54:30AM -0400, Josef Bacik wrote: On 09/30/2014 11:48 AM, Eryu Guan wrote: On Mon, Sep 29, 2014 at 03:03:48PM -0400, Josef Bacik wrote: On 09/26/2014 12:14 AM, Eryu Guan wrote: Run btrfs balance and subvolume create/mount/umount/delete simultaneously, with fsstress running in background. Signed-off-by: Eryu Guan eg...@redhat.com --- [snip] + args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir` + echo Run fsstress $args $seqres.full + $FSSTRESS_PROG $args /dev/null 21 + fsstress_pid=$! + + echo -n Start balance worker: $seqres.full + _btrfs_stress_balance $SCRATCH_MNT /dev/null 21 + balance_pid=$! + echo $balance_pid $seqres.full + + echo -n Start subvolume worker: $seqres.full + _btrfs_stress_subvolume $SCRATCH_DEV $SCRATCH_MNT subvol_$$ $subvol_mnt /dev/null 21 + subvol_pid=$! + echo $subvol_pid $seqres.full + + echo Wait for fsstress to exit and kill all background workers $seqres.full + wait $fsstress_pid + + kill $balance_pid $subvol_pid + wait + # wait for the balance operation to finish + while ps aux | grep balance start | grep -qv grep; do + sleep 1 + done This bit isn't needed, killing the balance pid also won't do anything, just waiting is enough, it'll only exit once the balance is finished. If you really want to stop the balance you can use I think it's still needed, or the balance process would block the later umount and test fails like --- tests/btrfs/059.out 2014-09-26 11:35:43.932850826 +0800 +++ /root/xfstests/results//btrfs/059.out.bad 2014-09-30 23:29:47.538310375 +0800 @@ -1,2 +1,9 @@ QA output created by 059 Silence is golden +umount: /mnt/testarea/scratch: target is busy. +(In some cases useful info about processes that use + the device is found by lsof(8) or fuser(1)) +umount: /mnt/testarea/scratch: target is busy. +(In some cases useful info about processes that use + the device is found by lsof(8) or fuser(1)) +_check_btrfs_filesystem: filesystem on /dev/mapper/rhel_hp--dl388eg8--01-testlv2 is inconsistent (see /root/xfstests/results//btrfs/059.full) Adding debug output of lsof $SCRATCH_MNT and ps aux | grep btrfs before umount shows it's the balance process is still running and blocks the umount. lsof /mnt/testarea/scratch btrfs 13491 root3r DIR 0,40 42 256 /mnt/testarea/scratch ps aux | grep btrfs root 13491 7.0 0.0 18660 1312 pts/1D+ 23:29 0:00 /usr/sbin/btrfs balance start /mnt/testarea/scratch Killing the balance pid is to stop any further balance operation, wait is waiting for the child process (_btrfs_stress_balance), not the child's child process (balance process). So only wait is not enough, we should wait for the balance to finish explicitly. This is also true for the rest of the test cases in this series. This is kind of ugly, but I cannot figure out a better solution right now.. Oh duh sorry, I missed that it was a loop of btrfs balance start, not just btrfs balance start . Ok in that case instead use btrfs balance status and loop on that until it is done and then exit, that way it is clear what we are waiting on, but really I'm not married to the idea either so if it ends up being too much trouble just skip it. You can add Reviewed-by: Josef Bacik jba...@fb.com Whenever you re-roll with your whitespace changes and whatever other cleanups you add. Thanks, Thanks for your review! Just want to confirm with you, do you mean I can add your reviewed-by for the whole patchset or just the first one? Thanks, Eryu -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
On 09/30/2014 12:08 PM, Eryu Guan wrote: On Tue, Sep 30, 2014 at 11:54:30AM -0400, Josef Bacik wrote: On 09/30/2014 11:48 AM, Eryu Guan wrote: On Mon, Sep 29, 2014 at 03:03:48PM -0400, Josef Bacik wrote: On 09/26/2014 12:14 AM, Eryu Guan wrote: Run btrfs balance and subvolume create/mount/umount/delete simultaneously, with fsstress running in background. Signed-off-by: Eryu Guan eg...@redhat.com --- [snip] + args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir` + echo Run fsstress $args $seqres.full + $FSSTRESS_PROG $args /dev/null 21 + fsstress_pid=$! + + echo -n Start balance worker: $seqres.full + _btrfs_stress_balance $SCRATCH_MNT /dev/null 21 + balance_pid=$! + echo $balance_pid $seqres.full + + echo -n Start subvolume worker: $seqres.full + _btrfs_stress_subvolume $SCRATCH_DEV $SCRATCH_MNT subvol_$$ $subvol_mnt /dev/null 21 + subvol_pid=$! + echo $subvol_pid $seqres.full + + echo Wait for fsstress to exit and kill all background workers $seqres.full + wait $fsstress_pid + + kill $balance_pid $subvol_pid + wait + # wait for the balance operation to finish + while ps aux | grep balance start | grep -qv grep; do + sleep 1 + done This bit isn't needed, killing the balance pid also won't do anything, just waiting is enough, it'll only exit once the balance is finished. If you really want to stop the balance you can use I think it's still needed, or the balance process would block the later umount and test fails like --- tests/btrfs/059.out 2014-09-26 11:35:43.932850826 +0800 +++ /root/xfstests/results//btrfs/059.out.bad 2014-09-30 23:29:47.538310375 +0800 @@ -1,2 +1,9 @@ QA output created by 059 Silence is golden +umount: /mnt/testarea/scratch: target is busy. +(In some cases useful info about processes that use + the device is found by lsof(8) or fuser(1)) +umount: /mnt/testarea/scratch: target is busy. +(In some cases useful info about processes that use + the device is found by lsof(8) or fuser(1)) +_check_btrfs_filesystem: filesystem on /dev/mapper/rhel_hp--dl388eg8--01-testlv2 is inconsistent (see /root/xfstests/results//btrfs/059.full) Adding debug output of lsof $SCRATCH_MNT and ps aux | grep btrfs before umount shows it's the balance process is still running and blocks the umount. lsof /mnt/testarea/scratch btrfs 13491 root3r DIR 0,40 42 256 /mnt/testarea/scratch ps aux | grep btrfs root 13491 7.0 0.0 18660 1312 pts/1D+ 23:29 0:00 /usr/sbin/btrfs balance start /mnt/testarea/scratch Killing the balance pid is to stop any further balance operation, wait is waiting for the child process (_btrfs_stress_balance), not the child's child process (balance process). So only wait is not enough, we should wait for the balance to finish explicitly. This is also true for the rest of the test cases in this series. This is kind of ugly, but I cannot figure out a better solution right now.. Oh duh sorry, I missed that it was a loop of btrfs balance start, not just btrfs balance start . Ok in that case instead use btrfs balance status and loop on that until it is done and then exit, that way it is clear what we are waiting on, but really I'm not married to the idea either so if it ends up being too much trouble just skip it. You can add Reviewed-by: Josef Bacik jba...@fb.com Whenever you re-roll with your whitespace changes and whatever other cleanups you add. Thanks, Thanks for your review! Just want to confirm with you, do you mean I can add your reviewed-by for the whole patchset or just the first one? Oh sorry, the whole patchset. Thanks, Josef -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs: add more superblock checks
Populate btrfs_check_super_valid() with checks that try to verify consistency of superblock by additional conditions that may arise from corrupted devices or bitflips. Some of tests are only hints and issue warnings instead of failing the mount, basically when the checks are derived from the data found in the superblock. Tested on a broken image provided by Qu. Reported-by: Qu Wenruo quwen...@cn.fujitsu.com Signed-off-by: David Sterba dste...@suse.cz --- fs/btrfs/disk-io.c | 67 -- 1 file changed, 65 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index a1d36e62179c..bfb00cb1c84c 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3814,10 +3814,73 @@ int btrfs_read_buffer(struct extent_buffer *buf, u64 parent_transid) static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info, int read_only) { + struct btrfs_super_block *sb = fs_info-super_copy; + int ret = 0; + + if (sb-root_level BTRFS_MAX_LEVEL) { + printk(KERN_ERR BTRFS: tree_root level too big: %d %d\n, + sb-root_level, BTRFS_MAX_LEVEL); + ret = -EINVAL; + } + if (sb-chunk_root_level BTRFS_MAX_LEVEL) { + printk(KERN_ERR BTRFS: chunk_root level too big: %d %d\n, + sb-chunk_root_level, BTRFS_MAX_LEVEL); + ret = -EINVAL; + } + if (sb-log_root_level BTRFS_MAX_LEVEL) { + printk(KERN_ERR BTRFS: log_root level too big: %d %d\n, + sb-log_root_level, BTRFS_MAX_LEVEL); + ret = -EINVAL; + } + /* -* Placeholder for checks +* The common minimum, we don't know if we can trust the nodesize/sectorsize +* items yet, they'll be verified later. Issue just a warning. */ - return 0; + if (!IS_ALIGNED(sb-root, 4096)) + printk(KERN_WARNING BTRFS: tree_root block unaligned: %llu\n, + sb-root); + if (!IS_ALIGNED(sb-chunk_root, 4096)) + printk(KERN_WARNING BTRFS: tree_root block unaligned: %llu\n, + sb-chunk_root); + if (!IS_ALIGNED(sb-log_root, 4096)) + printk(KERN_WARNING BTRFS: tree_root block unaligned: %llu\n, + sb-log_root); + + if (memcmp(fs_info-fsid, sb-dev_item.fsid, BTRFS_UUID_SIZE) != 0) { + printk(KERN_ERR BTRFS: dev_item UUID does not match fsid: %pU != %pU\n, + fs_info-fsid, sb-dev_item.fsid); + ret = -EINVAL; + } + + /* +* Hint to catch really bogus numbers, bitflips or so, more exact checks are +* done later +*/ + if (sb-num_devices (1UL 31)) + printk(KERN_WARNING BTRFS: suspicious number of devices: %llu\n, + sb-num_devices); + + if (sb-bytenr != BTRFS_SUPER_INFO_OFFSET) { + printk(KERN_ERR BTRFS: super offset mismatch %llu != %u\n, + sb-bytenr, BTRFS_SUPER_INFO_OFFSET); + ret = -EINVAL; + } + + /* +* The generation is a global counter, we'll trust it more than the others +* but it's still possible that it's the one that's wrong. +*/ + if (sb-generation sb-chunk_root_generation) + printk(KERN_WARNING + BTRFS: suspicious: generation chunk_root_generation: %llu %llu\n, + sb-generation, sb-chunk_root_generation); + if (sb-generation sb-cache_generation sb-cache_generation != (u64)-1) + printk(KERN_WARNING + BTRFS: suspicious: generation cache_generation: %llu %llu\n, + sb-generation, sb-cache_generation); + + return ret; } static void btrfs_error_commit_super(struct btrfs_root *root) -- 1.8.4.5 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 3.16.3 btrfs receive dies with rename o4736059-335172-0 - merlin/RC/OSD failed. Directory not empty machine
Marc MERLIN marc at merlins.org writes: Hi Filipe and others, After I moved directories around, and since then my hourly btrfs backup is failing. ... ERROR: rename o4736059-335172-0 - merlin/RC/OSD failed. Directory not empty i have the exact same problem, also 3.16-3. i did something along these lines: mv Music/2/VideoGamesLive Music/2/VolumeOne mkdir Music/2/VideoGamesLive mv Music/2/VolumeOne Music/2/VideoGamesLive and got: ERROR: rename o3854555-52412-0 - Music/2/VideoGamesLive failed. Directory not empty -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Btrfs-progs: check all slots in leaves
There's an off by one error in btrfs_check_leaf, we should be going to nritems - 1, not nritems - 2, we were missing problems with items in the very last slot. Thanks, Signed-off-by: Josef Bacik jba...@fb.com --- ctree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ctree.c b/ctree.c index 9e5b30f..c085941 100644 --- a/ctree.c +++ b/ctree.c @@ -494,7 +494,7 @@ btrfs_check_leaf(struct btrfs_root *root, struct btrfs_disk_key *parent_key, (unsigned long long)btrfs_header_bytenr(buf)); goto fail; } - for (i = 0; nritems 1 i nritems - 2; i++) { + for (i = 0; nritems 1 i nritems - 1; i++) { btrfs_item_key(buf, key, i); btrfs_item_key_to_cpu(buf, cpukey, i + 1); if (btrfs_comp_keys(key, cpukey) = 0) { -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Btrfs-progs: add shift_items to btrfs-corrupt-block
A user had a corrupted fs where his items where shifted oddly. This adds the functionality I needed to btrfs-corrupt-block in order to reproduce this corruption in order to make fsck fix this sort of problem. Thanks, Signed-off-by: Josef Bacik jba...@fb.com --- btrfs-corrupt-block.c | 26 ++ 1 file changed, 26 insertions(+) diff --git a/btrfs-corrupt-block.c b/btrfs-corrupt-block.c index 474d48f..22390b5 100644 --- a/btrfs-corrupt-block.c +++ b/btrfs-corrupt-block.c @@ -303,6 +303,7 @@ enum btrfs_file_extent_field { enum btrfs_metadata_block_field { BTRFS_METADATA_BLOCK_GENERATION, + BTRFS_METADATA_BLOCK_SHIFT_ITEMS, BTRFS_METADATA_BLOCK_BAD, }; @@ -332,6 +333,8 @@ convert_metadata_block_field(char *field) { if (!strncmp(field, generation, FIELD_BUF_LEN)) return BTRFS_METADATA_BLOCK_GENERATION; + if (!strncmp(field, shift_items, FIELD_BUF_LEN)) + return BTRFS_METADATA_BLOCK_SHIFT_ITEMS; return BTRFS_METADATA_BLOCK_BAD; } @@ -538,6 +541,26 @@ out: return ret; } +static void shift_items(struct btrfs_root *root, struct extent_buffer *eb) +{ + int nritems = btrfs_header_nritems(eb); + int shift_space = btrfs_leaf_free_space(root, eb) / 2; + int slot = nritems / 2; + int i = 0; + unsigned int data_end = btrfs_item_offset_nr(eb, nritems - 1); + + /* Shift the item data up to and including slot back by shift space */ + memmove_extent_buffer(eb, data_end - shift_space, data_end, + btrfs_item_offset_nr(eb, slot - 1) - data_end); + + /* Now update the item pointers. */ + for (i = nritems - 1; i = slot; i--) { + u32 offset = btrfs_item_offset_nr(eb, i); + offset -= shift_space; + btrfs_set_item_offset(eb, btrfs_item_nr(i), offset); + } +} + static int corrupt_metadata_block(struct btrfs_root *root, u64 block, char *field) { @@ -608,6 +631,9 @@ static int corrupt_metadata_block(struct btrfs_root *root, u64 block, bogus = generate_u64(orig); btrfs_set_header_generation(eb, bogus); break; + case BTRFS_METADATA_BLOCK_SHIFT_ITEMS: + shift_items(root, path-nodes[level]); + break; default: ret = -EINVAL; break; -- 1.8.3.1 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Btrfs-progs: add the ability to fix shifted item offsets
A user had a corrupted fs where the items had been shifted improperly. This patch adds the ability to fix this sort of problem within fsck. We will simply shift the item over to the proper offset and update the offsets to make sure they are correct. I tested this with a hand crafted fs that was broken in the same way as the user, and I've included the file as a new test. Thanks, Signed-off-by: Josef Bacik jba...@fb.com --- cmds-check.c | 113 + tests/fsck-tests/003-shift-offsets.img | Bin 0 - 4096 bytes 2 files changed, 100 insertions(+), 13 deletions(-) create mode 100644 tests/fsck-tests/003-shift-offsets.img diff --git a/cmds-check.c b/cmds-check.c index 4c01330..03b0fbd 100644 --- a/cmds-check.c +++ b/cmds-check.c @@ -2413,15 +2413,9 @@ static int swap_values(struct btrfs_root *root, struct btrfs_path *path, return 0; } -/* - * Attempt to fix basic block failures. Currently we only handle bad key - * orders, we will cycle through the keys and swap them if necessary. - */ -static int try_to_fix_bad_block(struct btrfs_trans_handle *trans, - struct btrfs_root *root, - struct extent_buffer *buf, - struct btrfs_disk_key *parent_key, - enum btrfs_tree_block_status status) +static int fix_key_order(struct btrfs_trans_handle *trans, +struct btrfs_root *root, +struct extent_buffer *buf) { struct btrfs_path *path; struct btrfs_key k1, k2; @@ -2429,9 +2423,6 @@ static int try_to_fix_bad_block(struct btrfs_trans_handle *trans, int level; int ret; - if (status != BTRFS_TREE_BLOCK_BAD_KEY_ORDER) - return -EIO; - k1.objectid = btrfs_header_owner(buf); k1.type = BTRFS_ROOT_ITEM_KEY; k1.offset = (u64)-1; @@ -2482,6 +2473,103 @@ static int try_to_fix_bad_block(struct btrfs_trans_handle *trans, return ret; } +static int fix_item_offset(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + struct extent_buffer *buf) +{ + struct btrfs_path *path; + struct btrfs_key k1; + int i; + int level; + int ret; + + k1.objectid = btrfs_header_owner(buf); + k1.type = BTRFS_ROOT_ITEM_KEY; + k1.offset = (u64)-1; + + root = btrfs_read_fs_root(root-fs_info, k1); + if (IS_ERR(root)) + return -EIO; + + record_root_in_trans(trans, root); + + path = btrfs_alloc_path(); + if (!path) + return -EIO; + + level = btrfs_header_level(buf); + path-lowest_level = level; + path-skip_check_block = 1; + if (level) + btrfs_node_key_to_cpu(buf, k1, 0); + else + btrfs_item_key_to_cpu(buf, k1, 0); + + ret = btrfs_search_slot(trans, root, k1, path, 0, 1); + if (ret) { + btrfs_free_path(path); + return -EIO; + } + + buf = path-nodes[level]; + for (i = 0; i btrfs_header_nritems(buf); i++) { + unsigned int shift = 0, offset; + + if (i == 0 btrfs_item_end_nr(buf, i) != + BTRFS_LEAF_DATA_SIZE(root)) { + if (btrfs_item_end_nr(buf, i) + BTRFS_LEAF_DATA_SIZE(root)) { + fprintf(stderr, item is off the end of the + leaf, can't fix\n); + ret = -EIO; + break; + } + shift = BTRFS_LEAF_DATA_SIZE(root) - + btrfs_item_end_nr(buf, i); + } else if (i 0 btrfs_item_end_nr(buf, i) != + btrfs_item_offset_nr(buf, i - 1)) { + if (btrfs_item_end_nr(buf, i) + btrfs_item_offset_nr(buf, i - 1)) { + fprintf(stderr, items overlap, can't fix\n); + ret = -EIO; + break; + } + shift = btrfs_item_offset_nr(buf, i - 1) - + btrfs_item_end_nr(buf, i); + } + if (!shift) + continue; + + printf(Shifting item nr %d by %u bytes in block %llu\n, + i + 1, shift, (unsigned long long)buf-start); + offset = btrfs_item_offset_nr(buf, i); + memmove_extent_buffer(buf, offset + shift, offset, + btrfs_item_size_nr(buf, i)); + btrfs_set_item_offset(buf, btrfs_item_nr(i), + offset + shift); + btrfs_mark_buffer_dirty(buf); + } + +
[PATCH v2 3/3] btrfs: fix sparse lock context warnings
Fix several sparse warnings that can easily be addressed with context annotations. These annotations also provide some sort of documentation for the internal helper functions. Signed-off-by: Omar Sandoval osan...@osandov.com Reviewed-by: David Sterba dste...@suse.cz --- fs/btrfs/extent-tree.c | 1 + fs/btrfs/extent_io.c| 2 +- fs/btrfs/free-space-cache.c | 1 + fs/btrfs/locking.c | 1 + 4 files changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 3efe1c3..281f30f 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -6387,6 +6387,7 @@ static struct btrfs_block_group_cache * btrfs_lock_cluster(struct btrfs_block_group_cache *block_group, struct btrfs_free_cluster *cluster, int delalloc) +__acquires(cluster-refill_lock) { struct btrfs_block_group_cache *used_bg; bool locked = false; diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index af0359d..cc6b1fc 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4775,8 +4775,8 @@ static inline void btrfs_release_extent_buffer_rcu(struct rcu_head *head) __free_extent_buffer(eb); } -/* Expects to have eb-eb_lock already held */ static int release_extent_buffer(struct extent_buffer *eb) +__releases(eb-refs_lock) { WARN_ON(atomic_read(eb-refs) == 0); if (atomic_dec_and_test(eb-refs)) { diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index 2b0a627..41c6cd5 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -1838,6 +1838,7 @@ static struct btrfs_free_space_op free_space_op = { static int insert_into_bitmap(struct btrfs_free_space_ctl *ctl, struct btrfs_free_space *info) +__must_hold(ctl-tree_lock) { struct btrfs_free_space *bitmap_info; struct btrfs_block_group_cache *block_group = NULL; diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c index 5665d21..47bdc2c 100644 --- a/fs/btrfs/locking.c +++ b/fs/btrfs/locking.c @@ -222,6 +222,7 @@ void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb) * blocking readers or writers */ void btrfs_tree_lock(struct extent_buffer *eb) +__acquires(eb-lock) { again: wait_event(eb-read_lock_wq, atomic_read(eb-blocking_readers) == 0); -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/3] btrfs: replace open-coded kernel_write
write_buf used by btrfs send has what is more or less a reimplementation of kernel_write. This also gets rid of a sparse address space warning. Signed-off-by: Omar Sandoval osan...@osandov.com --- fs/btrfs/send.c | 21 ++--- 1 file changed, 6 insertions(+), 15 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 6528aa6..f6b2ec3 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -508,32 +508,23 @@ static struct btrfs_path *alloc_path_for_send(void) static int write_buf(struct file *filp, const void *buf, u32 len, loff_t *off) { int ret; - mm_segment_t old_fs; u32 pos = 0; - old_fs = get_fs(); - set_fs(KERNEL_DS); - while (pos len) { - ret = vfs_write(filp, (char *)buf + pos, len - pos, off); + ret = kernel_write(filp, buf + pos, len - pos, *off); /* TODO handle that correctly */ /*if (ret == -ERESTARTSYS) { continue; }*/ if (ret 0) - goto out; - if (ret == 0) { - ret = -EIO; - goto out; - } + return ret; + if (ret == 0) + return -EIO; pos += ret; + *off += ret; } - ret = 0; - -out: - set_fs(old_fs); - return ret; + return 0; } static int tlv_put(struct send_ctx *sctx, u16 attr, const void *data, int len) -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/3] btrfs: fix several sparse warnings
This patch series fixes some warnings reported by sparse when building the btrfs module. It fixes two classes of warnings: address space warnings and lock context warnings. This didn't uncover any logical errors, but it reduces the noise when checking the source. There are a few less straightforward sparse warnings still remaining. Version 2 replaces an open-coded implementation of kernel_write in write_buf, doesn't touch unrelated style issues, and add's David's Reviewed-by. This patch series applies to 3.17-rc7. Omar Sandoval (3): btrfs: replace open-coded kernel_write btrfs: fix sparse address space warnings btrfs: fix sparse lock context warnings fs/btrfs/extent-tree.c | 1 + fs/btrfs/extent_io.c| 2 +- fs/btrfs/free-space-cache.c | 1 + fs/btrfs/ioctl.c| 6 +++--- fs/btrfs/locking.c | 1 + fs/btrfs/send.c | 21 ++--- 6 files changed, 13 insertions(+), 19 deletions(-) -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/3] btrfs: replace open-coded kernel_write
On Tue, Sep 30, 2014 at 03:01:40PM -0700, Omar Sandoval wrote: write_buf used by btrfs send has what is more or less a reimplementation of kernel_write. This also gets rid of a sparse address space warning. Seems reasonable to me: Reviewed-by: Zach Brown z...@zabbo.net - z -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html