Re: [PATCH] fstests: common: Make _test_mount to include MOUNT_OPTIONS to allow consistent _test_cycle_mount
At 05/24/2017 01:08 PM, Eryu Guan wrote: On Wed, May 24, 2017 at 12:28:34PM +0800, Qu Wenruo wrote: At 05/24/2017 12:24 PM, Eryu Guan wrote: On Wed, May 24, 2017 at 08:22:25AM +0800, Qu Wenruo wrote: At 05/23/2017 07:13 PM, Eryu Guan wrote: On Tue, May 23, 2017 at 04:02:05PM +0800, Qu Wenruo wrote: [BUG] If using MOUNT_OPTIONS="-o nodatasum" and btrfs to run genierc/142 generic/143 and generic/154, it will cause false alert like: cp: failed to clone '/mnt/test/test-154/file2' from '/mnt/test/test-154/file1': Invalid argument MOUNT_OPTIONS is for scratch mount, and TEST_FS_MOUNT_OPTS is for test dev mount, so I think setting TEST_FS_MOUNT_OPTS to "-o nodatasum" should fix your problem. Nope, the problem is the inconsistent of TEST_MNT setup. It does fix the failure for me, did I miss anything? # MOUNT_OPTIONS="-o nodatasum" TEST_FS_MOUNT_OPTS="-o nodatasum" ./check generic/142 generic/143 generic/154 FSTYP -- btrfs PLATFORM -- Linux/x86_64 dhcp-66-86-11 4.12.0-rc1 MKFS_OPTIONS -- /dev/sda6 MOUNT_OPTIONS -- -o nodatasum -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/testarea/scratch generic/142 2s ... 1s generic/143 18s generic/154 1s Ran: generic/142 generic/143 generic/154 Passed all 3 tests But if you only export MOUNT_OPTIONS, it will fail, due to the different mount options between test_cycle_mount(). That's correct. Sorry, I didn't make it clear in my first reply. I meant that you should set both TEST_FS_MOUNT_OPTS and MOUNT_OPTIONS to "-onodatasum", for both test dev and scratch dev. That's just a workaround, not a root fix. Not to mention quite a lot test cases lose its coverage. As after _test_cycle_mount(), they are just testing default mount option. To make it clear: If test mount follows TEST_FS_MOUNT_OPTS, then both the first mount and test_cycle_mount should follow TEST_FS_MOUNT_OPTS. _test_mount does follow TEST_FS_MOUNT_OPTS, not MOUNT_OPTIONS, no matter which mount it is. While test mount setup by check script follows MOUNT_OPTIONS. 
Just check the following very basic test script:

--
#! /bin/bash
# FS QA Test 010
#
# what am I here for?
#
seq=`basename $0`
seqres=$RESULT_DIR/$seq
echo "QA output created by $seq"

here=`pwd`
tmp=/tmp/$$
status=1	# failure is the default!
trap "_cleanup; exit \$status" 0 1 2 3 15

_cleanup()
{
	cd /
	rm -f $tmp.*
}

# get standard environment, filters and checks
. ./common/rc
. ./common/filter

# remove previous $seqres.full before test
rm -f $seqres.full

# real QA test starts here

# Modify as appropriate.
_supported_fs generic
_supported_os Linux
_require_test

# record the mount table before and after cycling the test mount;
# any change in mount options shows up in the diff output
mount > $tmp.mount1
_test_cycle_mount
mount > $tmp.mount2
diff $tmp.mount1 $tmp.mount2

echo "Silence is golden"

# success, all done
status=0
exit
--

Thanks,
Qu

Thanks,
Eryu
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fstests: common: Make _test_mount to include MOUNT_OPTIONS to allow consistent _test_cycle_mount
On Wed, May 24, 2017 at 12:28:34PM +0800, Qu Wenruo wrote: > > > At 05/24/2017 12:24 PM, Eryu Guan wrote: > > On Wed, May 24, 2017 at 08:22:25AM +0800, Qu Wenruo wrote: > > > > > > > > > At 05/23/2017 07:13 PM, Eryu Guan wrote: > > > > On Tue, May 23, 2017 at 04:02:05PM +0800, Qu Wenruo wrote: > > > > > [BUG] > > > > > If using MOUNT_OPTIONS="-o nodatasum" and btrfs to run genierc/142 > > > > > generic/143 and generic/154, it will cause false alert like: > > > > > cp: failed to clone '/mnt/test/test-154/file2' from > > > > > '/mnt/test/test-154/file1': Invalid argument > > > > > > > > MOUNT_OPTIONS is for scratch mount, and TEST_FS_MOUNT_OPTS is for test > > > > dev mount, so I think setting TEST_FS_MOUNT_OPTS to "-o nodatasum" > > > > should fix your problem. > > > > > > Nope, the problem is the inconsistent of TEST_MNT setup. > > > > It does fix the failure for me, did I miss anything? > > > > # MOUNT_OPTIONS="-o nodatasum" TEST_FS_MOUNT_OPTS="-o nodatasum" ./check > > generic/142 generic/143 generic/154 > > FSTYP -- btrfs > > PLATFORM -- Linux/x86_64 dhcp-66-86-11 4.12.0-rc1 > > MKFS_OPTIONS -- /dev/sda6 > > MOUNT_OPTIONS -- -o nodatasum -o context=system_u:object_r:root_t:s0 > > /dev/sda6 /mnt/testarea/scratch > > > > generic/142 2s ... 1s > > generic/143 18s > > generic/154 1s > > Ran: generic/142 generic/143 generic/154 > > Passed all 3 tests > > > > But if you only export MOUNT_OPTIONS, it will fail, due to the different > mount options between test_cycle_mount(). That's correct. Sorry, I didn't make it clear in my first reply. I meant that you should set both TEST_FS_MOUNT_OPTS and MOUNT_OPTIONS to "-onodatasum", for both test dev and scratch dev. > > To make it clear: > If test mount follows TEST_FS_MOUNT_OPTS, then both the first mount and > test_cycle_mount should follow TEST_FS_MOUNT_OPTS. _test_mount does follow TEST_FS_MOUNT_OPTS, not MOUNT_OPTIONS, no matter which mount it is. 
Thanks,
Eryu
Re: [PATCH] fstests: common: Make _test_mount to include MOUNT_OPTIONS to allow consistent _test_cycle_mount
At 05/24/2017 12:24 PM, Eryu Guan wrote: On Wed, May 24, 2017 at 08:22:25AM +0800, Qu Wenruo wrote: At 05/23/2017 07:13 PM, Eryu Guan wrote: On Tue, May 23, 2017 at 04:02:05PM +0800, Qu Wenruo wrote: [BUG] If using MOUNT_OPTIONS="-o nodatasum" and btrfs to run genierc/142 generic/143 and generic/154, it will cause false alert like: cp: failed to clone '/mnt/test/test-154/file2' from '/mnt/test/test-154/file1': Invalid argument MOUNT_OPTIONS is for scratch mount, and TEST_FS_MOUNT_OPTS is for test dev mount, so I think setting TEST_FS_MOUNT_OPTS to "-o nodatasum" should fix your problem. Nope, the problem is the inconsistent of TEST_MNT setup. It does fix the failure for me, did I miss anything? # MOUNT_OPTIONS="-o nodatasum" TEST_FS_MOUNT_OPTS="-o nodatasum" ./check generic/142 generic/143 generic/154 FSTYP -- btrfs PLATFORM -- Linux/x86_64 dhcp-66-86-11 4.12.0-rc1 MKFS_OPTIONS -- /dev/sda6 MOUNT_OPTIONS -- -o nodatasum -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/testarea/scratch generic/142 2s ... 1s generic/143 18s generic/154 1s Ran: generic/142 generic/143 generic/154 Passed all 3 tests But if you only export MOUNT_OPTIONS, it will fail, due to the different mount options between test_cycle_mount(). To make it clear: If test mount follows TEST_FS_MOUNT_OPTS, then both the first mount and test_cycle_mount should follow TEST_FS_MOUNT_OPTS. If it follows MOUNT_OPTIONS, then both. Not one follows MOUNT_OPTIONS and the other follows TEST_FS_MOUNT_OPTS. THanks, Qu Thanks, Eryu As I described, the TEST_MNT is setup *with* MOUNT_OPTIONS for the 1st time, maybe by "check" script. But _test_cycle_mount() uses TEST_FS_MOUNT_OPTS otherthan MOUNT_OPTIONS, and leads to the problem. If using TEST_FS_MOUNT_OPTS, then 1st setup should also use TEST_FS_MOUNT_OPTS. Thanks, Qu Yeah, MOUNT_OPTIONS and TEST_FS_MOUNT_OPTS (and almost all other global variables) are not documented properly.. 
Thanks,
Eryu

[REASON]
It is caused by _test_cycle_mount function, which unmount test device,
but when trying to re-mount it again using _test_mount(), we don't pass
$MOUNT_OPTIONS.

So this makes mount options differs between _test_cycle_mount().

And btrfs doesn't allow different csum flags between reflink source and
destination inodes, so it returns -EINVAL for reflink operation.

[FIX]
Fix it by passing $MOUNT_OPTIONS to _test_mount(), so that
_test_cycle_mount() won't cause different mount options.
So btrfs with "-o nodatasum" mount option can pass generic/14[23]
and generic/154 without false alert.

Signed-off-by: Qu Wenruo
---
 common/rc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/common/rc b/common/rc
index ba215961..a591907c 100644
--- a/common/rc
+++ b/common/rc
@@ -522,7 +522,8 @@ _test_mount()
 		return $?
 	fi
 	_test_options mount
-	_mount -t $FSTYP $TEST_OPTIONS $TEST_FS_MOUNT_OPTS $SELINUX_MOUNT_OPTIONS $* $TEST_DEV $TEST_DIR
+	_mount -t $FSTYP $TEST_OPTIONS $TEST_FS_MOUNT_OPTS $SELINUX_MOUNT_OPTIONS \
+		$MOUNT_OPTIONS $* $TEST_DEV $TEST_DIR
 }
 
 _test_unmount()
-- 
2.13.0
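To see what the one-line change in the patch does to the final command line, here is a minimal sketch that only prints the mount command instead of running it. The device path and the empty option values are hypothetical placeholders, and compose_old/compose_new are illustrative stand-ins for the two versions of the _mount line in _test_mount(), not functions that exist in fstests.

```shell
#!/bin/bash
# Sketch only: prints the command line, never mounts anything.
# All values below are placeholders chosen for illustration.
FSTYP=btrfs
TEST_DEV=/dev/testdev		# hypothetical device
TEST_DIR=/mnt/test
TEST_OPTIONS=""
TEST_FS_MOUNT_OPTS=""
SELINUX_MOUNT_OPTIONS=""
MOUNT_OPTIONS="-o nodatasum"

# Argument order before the patch. Variables are left unquoted on
# purpose so that empty ones vanish, as in the original _mount line.
compose_old() {
	echo _mount -t $FSTYP $TEST_OPTIONS $TEST_FS_MOUNT_OPTS \
		$SELINUX_MOUNT_OPTIONS $* $TEST_DEV $TEST_DIR
}

# Argument order after the patch: $MOUNT_OPTIONS is added before "$*".
compose_new() {
	echo _mount -t $FSTYP $TEST_OPTIONS $TEST_FS_MOUNT_OPTS \
		$SELINUX_MOUNT_OPTIONS $MOUNT_OPTIONS $* $TEST_DEV $TEST_DIR
}

compose_old
compose_new
```

With only MOUNT_OPTIONS set, the first line printed carries no nodatasum while the second does, which is the mismatch across _test_cycle_mount() that [REASON] describes. Setting TEST_FS_MOUNT_OPTS to the same value, as suggested elsewhere in the thread, makes both lines carry the option.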
Re: [PATCH] fstests: common: Make _test_mount to include MOUNT_OPTIONS to allow consistent _test_cycle_mount
On Wed, May 24, 2017 at 08:22:25AM +0800, Qu Wenruo wrote: > > > At 05/23/2017 07:13 PM, Eryu Guan wrote: > > On Tue, May 23, 2017 at 04:02:05PM +0800, Qu Wenruo wrote: > > > [BUG] > > > If using MOUNT_OPTIONS="-o nodatasum" and btrfs to run genierc/142 > > > generic/143 and generic/154, it will cause false alert like: > > > cp: failed to clone '/mnt/test/test-154/file2' from > > > '/mnt/test/test-154/file1': Invalid argument > > > > MOUNT_OPTIONS is for scratch mount, and TEST_FS_MOUNT_OPTS is for test > > dev mount, so I think setting TEST_FS_MOUNT_OPTS to "-o nodatasum" > > should fix your problem. > > Nope, the problem is the inconsistent of TEST_MNT setup. It does fix the failure for me, did I miss anything? # MOUNT_OPTIONS="-o nodatasum" TEST_FS_MOUNT_OPTS="-o nodatasum" ./check generic/142 generic/143 generic/154 FSTYP -- btrfs PLATFORM -- Linux/x86_64 dhcp-66-86-11 4.12.0-rc1 MKFS_OPTIONS -- /dev/sda6 MOUNT_OPTIONS -- -o nodatasum -o context=system_u:object_r:root_t:s0 /dev/sda6 /mnt/testarea/scratch generic/142 2s ... 1s generic/143 18s generic/154 1s Ran: generic/142 generic/143 generic/154 Passed all 3 tests Thanks, Eryu > > As I described, the TEST_MNT is setup *with* MOUNT_OPTIONS for the 1st time, > maybe by "check" script. > > But _test_cycle_mount() uses TEST_FS_MOUNT_OPTS otherthan MOUNT_OPTIONS, and > leads to the problem. > > If using TEST_FS_MOUNT_OPTS, then 1st setup should also use > TEST_FS_MOUNT_OPTS. > > Thanks, > Qu > > > > > Yeah, MOUNT_OPTIONS and TEST_FS_MOUNT_OPTS (and almost all other global > > variables) are not documented properly.. > > > > Thanks, > > Eryu > > > > > > > > [REASON] > > > It is caused by _test_cycle_mount function, which unmount test device, > > > but when trying to re-mount it again using _test_mount(), we don't pass > > > $MOUNT_OPTIONS. > > > > > > So this makes mount options differs between _test_cycle_mount(). 
> > > > > > And btrfs doesn't allow different csum flags between reflink source and > > > destination inodes, so it returns -EINVAL for reflink operation. > > > > > > [FIX] > > > Fix it by passing $MOUNT_OPTIONS to _test_mount(), so that > > > _test_cycle_mount() won't cause different mount options. > > > So btrfs with "-o nodatasum" mount option can pass generic/14[23] > > > and generic/154 without false alert. > > > > > > Signed-off-by: Qu Wenruo> > > --- > > > common/rc | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/common/rc b/common/rc > > > index ba215961..a591907c 100644 > > > --- a/common/rc > > > +++ b/common/rc > > > @@ -522,7 +522,8 @@ _test_mount() > > > return $? > > > fi > > > _test_options mount > > > -_mount -t $FSTYP $TEST_OPTIONS $TEST_FS_MOUNT_OPTS > > > $SELINUX_MOUNT_OPTIONS $* $TEST_DEV $TEST_DIR > > > +_mount -t $FSTYP $TEST_OPTIONS $TEST_FS_MOUNT_OPTS > > > $SELINUX_MOUNT_OPTIONS \ > > > + $MOUNT_OPTIONS $* $TEST_DEV $TEST_DIR > > > } > > > _test_unmount() > > > -- > > > 2.13.0 > > > > > > > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe fstests" in > > > the body of a message to majord...@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > -- > To unsubscribe from this list: send the line "unsubscribe fstests" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS converted from EXT4 becomes read-only after reboot
On 24.05.2017 00:49, Marc MERLIN wrote:
> On Tue, May 23, 2017 at 03:38:01PM -0600, Chris Murphy wrote:
>>> I've tried an ext4 to btrfs conversion 3 times in the last 3 years, it
>>> never worked properly any of those times, sadly.
>>
>> Since the 4.6 total rewrite? There are also recent bug fixes related
>> to convert in the changelog, it should be working now and if there are
>> problems Qu is probably interested in getting them fixed.
>
> It was a 4.9 kernel from debian.

btrfs-convert is actually user space, so the question is what btrfs-progs
version you used.
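A quick way to answer that question from a shell, as a minimal sketch. It assumes the "4.6" in the quoted text refers to a btrfs-progs release, that `btrfs --version` prints a line of the form "btrfs-progs v4.4", and that GNU `sort -V` is available; version_at_least and progs_version are hypothetical helpers, not part of btrfs-progs.

```shell
#!/bin/bash
# Sketch: report whether the installed btrfs-progs is at least 4.6.

version_at_least() {
	# succeeds when version $1 >= version $2 (GNU sort -V ordering)
	[ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

progs_version() {
	# Keep only the number from "btrfs-progs vX.Y[.Z]".
	# Prints 0 when the btrfs tool is not installed.
	if command -v btrfs >/dev/null 2>&1; then
		btrfs --version | sed -n 's/^btrfs-progs v\([0-9.]*\).*/\1/p' | head -n 1
	else
		echo 0
	fi
}

ver=$(progs_version)
ver=${ver:-0}	# unrecognised output format is treated as "unknown/old"

if version_at_least "$ver" 4.6; then
	echo "btrfs-progs $ver: at or after the 4.6 mentioned in the thread"
else
	echo "btrfs-progs $ver: older than the 4.6 mentioned in the thread"
fi
echo "kernel $(uname -r) (not the relevant version for btrfs-convert)"
```

The comparison goes through `sort -V` rather than a plain string or numeric test so that 4.10 is ordered after 4.6.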
Re: [PATCH v3 00/19] Btrfs-progs offline scrub
> Only recovery needs to be implemented now.
>
> Thanks,
> Qu

Once recovery is implemented, I'll try again. Just one suggestion: would it
be possible to optionally print the filename for these detected blocks? For
example, if the corruption happened in an old, unwanted archived log file
(/var/log/nginx_access_20160101.log), I don't need to panic, as I can live
without that file. Or I could delete that file, so that scrub doesn't
report errors.

--
Cheers,
Lakshmipathi.G
Re: [PATCH v5] qgroup: Retry after commit on getting EDQUOT
At 03/28/2017 02:13 AM, Goldwyn Rodrigues wrote: On 03/27/2017 12:36 PM, David Sterba wrote: On Mon, Mar 27, 2017 at 12:29:57PM -0500, Goldwyn Rodrigues wrote: From: Goldwyn RodriguesWe are facing the same problem with EDQUOT which was experienced with ENOSPC. Not sure if we require a full ticketing system such as ENOSPC, but here is a quick fix, which may be too big a hammer. Quotas are reserved during the start of an operation, incrementing qg->reserved. However, it is written to disk in a commit_transaction which could take as long as commit_interval. In the meantime there could be deletions which are not accounted for because deletions are accounted for only while committed (free_refroot). So, when we get a EDQUOT flush the data to disk and try again. This fixes fstests btrfs/139. This patch is causing hang for inode_cache mount option. Which can be easily triggered by btrfs/042 with inode_cache. The callback trace will be: Call Trace: __schedule+0x374/0xaf0 schedule+0x3d/0x90 wait_for_commit+0x4a/0x80 [btrfs] ? wake_atomic_t_function+0x60/0x60 btrfs_commit_transaction+0xe0/0xa10 [btrfs] ? start_transaction+0xad/0x510 [btrfs] qgroup_reserve+0x1f0/0x350 [btrfs] btrfs_qgroup_reserve_data+0xf8/0x2f0 [btrfs] ? _raw_spin_unlock+0x27/0x40 btrfs_check_data_free_space+0x6d/0xb0 [btrfs] btrfs_delalloc_reserve_space+0x25/0x70 [btrfs] btrfs_save_ino_cache+0x402/0x650 [btrfs] commit_fs_roots+0xb7/0x170 [btrfs] btrfs_commit_transaction+0x425/0xa10 [btrfs] qgroup_reserve+0x1f0/0x350 [btrfs] btrfs_qgroup_reserve_data+0xf8/0x2f0 [btrfs] ? _raw_spin_unlock+0x27/0x40 btrfs_check_data_free_space+0x6d/0xb0 [btrfs] btrfs_delalloc_reserve_space+0x25/0x70 [btrfs] btrfs_direct_IO+0x1c5/0x3b0 [btrfs] generic_file_direct_write+0xab/0x150 btrfs_file_write_iter+0x243/0x530 [btrfs] __vfs_write+0xc9/0x120 vfs_write+0xcb/0x1f0 SyS_pwrite64+0x79/0x90 entry_SYSCALL_64_fastpath+0x18/0xad We're calling btrfs_commit_transaction() inside btrfs_commit_transaction(). 
Which will definitely hang the system.

Any idea to fix it?

Thanks,
Qu

Signed-off-by: Goldwyn Rodrigues
---
Changes since v1:
 - Changed start_delalloc_roots() to start_delalloc_inode() to target
   the root in question only to reduce the amount of flush to be done.
 - Added wait_ordered_extents().
Changes since v2:
 - Revised patch header
 - removed comment on combining conditions
 - removed test case, to be done in fstests
Changes since v3:
 - testcase reinstated
 - return value checks
Changes since v4:
 - removed testcase since btrfs/139 got incorporated in fstests

The point was to keep the test inside the changelog as well.

Yes, that was before we had btrfs/139 in fstests. I have put it in the patch
header. However, if you really want the test script back, I can put it
there. Let me know.
[PATCH] btrfs: add compression trace points
This patch adds compression and decompression trace points for the purpose of debugging. Signed-off-by: Anand Jain--- Note: I have used same trace function for both compress and decompress as I wanted to maintain compress and decompress debug data aligned. fs/btrfs/compression.c | 10 ++ include/trace/events/btrfs.h | 36 2 files changed, 46 insertions(+) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index ee934e612f15..9b0562cd1b7f 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -938,6 +938,10 @@ int btrfs_compress_pages(int type, struct address_space *mapping, start, pages, out_pages, total_in, total_out); + + trace_btrfs_encoder(1, 0, mapping->host, type, *total_in, + *total_out, start, ret); + free_workspace(type, workspace); return ret; } @@ -968,6 +972,9 @@ static int btrfs_decompress_bio(struct compressed_bio *cb) cb->compressed_pages, cb->start, cb->orig_bio, cb->compressed_len); + trace_btrfs_encoder(0, 1, cb->inode, type, + cb->compressed_len, cb->len, cb->start, ret); + free_workspace(type, workspace); return ret; } @@ -989,6 +996,9 @@ int btrfs_decompress(int type, unsigned char *data_in, struct page *dest_page, dest_page, start_byte, srclen, destlen); + trace_btrfs_encoder(0, 0, dest_page->mapping->host, + type, srclen, destlen, start_byte, ret); + free_workspace(type, workspace); return ret; } diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h index e37973526153..1ebffcd005a1 100644 --- a/include/trace/events/btrfs.h +++ b/include/trace/events/btrfs.h @@ -1658,6 +1658,42 @@ TRACE_EVENT(qgroup_meta_reserve, show_root_type(__entry->refroot), __entry->diff) ); +TRACE_EVENT(btrfs_encoder, + + TP_PROTO(int encode, int bio, struct inode *inode, int type, + unsigned long bfr, unsigned long aft, + unsigned long start, int ret), + + TP_ARGS(encode, bio, inode, type, bfr, aft, start, ret), + + TP_STRUCT__entry_btrfs( + __field(int,encode) + __field(int,bio) + __field(unsigned long, i_ino) + 
__field(int,type) + __field(unsigned long, bfr) + __field(unsigned long, aft) + __field(unsigned long, start) + __field(int,ret) + ), + + TP_fast_assign_btrfs(btrfs_sb(inode->i_sb), + __entry->encode = encode; + __entry->bio= bio; + __entry->i_ino = inode->i_ino; + __entry->type = type; + __entry->bfr= bfr; + __entry->aft= aft; + __entry->start = start; + __entry->ret= ret; + ), + + TP_printk_btrfs("%s %s ino:%lu tfm:%d bfr:%lu aft:%lu start:%lu ret:%d", + __entry->encode ? "encode":"decode", + __entry->bio ? "bio":"pge", __entry->i_ino, __entry->type, + __entry->bfr, __entry->aft, __entry->start, __entry->ret) + +); #endif /* _TRACE_BTRFS_H */ /* This part must be outside protection */ -- 2.10.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs: btrfs_decompress_bio() could accept compressed_bio instead
Instead of sending each argument of struct compressed_bio, send the compressed_bio itself. Also by having struct compressed_bio in btrfs_decompress_bio() it would help tracing. Signed-off-by: Anand Jain--- This patch is preparatory for the up coming patch btrfs: add compression trace points fs/btrfs/compression.c | 23 +-- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 10e6b282d09d..ee934e612f15 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -81,9 +81,7 @@ struct compressed_bio { u32 sums; }; -static int btrfs_decompress_bio(int type, struct page **pages_in, - u64 disk_start, struct bio *orig_bio, - size_t srclen); +static int btrfs_decompress_bio(struct compressed_bio *cb); static inline int compressed_bio_size(struct btrfs_fs_info *fs_info, unsigned long disk_size) @@ -173,11 +171,8 @@ static void end_compressed_bio_read(struct bio *bio) /* ok, we're the last bio for this extent, lets start * the decompression. */ - ret = btrfs_decompress_bio(cb->compress_type, - cb->compressed_pages, - cb->start, - cb->orig_bio, - cb->compressed_len); + ret = btrfs_decompress_bio(cb); + csum_failed: if (ret) cb->errors = 1; @@ -961,18 +956,18 @@ int btrfs_compress_pages(int type, struct address_space *mapping, * be contiguous. They all correspond to the range of bytes covered by * the compressed extent. 
*/ -static int btrfs_decompress_bio(int type, struct page **pages_in, - u64 disk_start, struct bio *orig_bio, - size_t srclen) +static int btrfs_decompress_bio(struct compressed_bio *cb) { struct list_head *workspace; int ret; + int type = cb->compress_type; workspace = find_workspace(type); - ret = btrfs_compress_op[type-1]->decompress_bio(workspace, pages_in, -disk_start, orig_bio, -srclen); + ret = btrfs_compress_op[type-1]->decompress_bio(workspace, + cb->compressed_pages, cb->start, cb->orig_bio, + cb->compressed_len); + free_workspace(type, workspace); return ret; } -- 2.10.0 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs: tree-log.c: Wrong printk information about namelen
In verify_dir_item, the error message is meant to print the name_len of the
dir_item but actually prints data_len. Fix it by calling btrfs_dir_name_len
instead of btrfs_dir_data_len.

Signed-off-by: Su Yue
---
 fs/btrfs/dir-item.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/dir-item.c b/fs/btrfs/dir-item.c
index 60a750678a82..c24d615e3d7f 100644
--- a/fs/btrfs/dir-item.c
+++ b/fs/btrfs/dir-item.c
@@ -468,7 +468,7 @@ int verify_dir_item(struct btrfs_fs_info *fs_info,
 
 	if (btrfs_dir_name_len(leaf, dir_item) > namelen) {
 		btrfs_crit(fs_info, "invalid dir item name len: %u",
-		       (unsigned)btrfs_dir_data_len(leaf, dir_item));
+		       (unsigned)btrfs_dir_name_len(leaf, dir_item));
 		return 1;
 	}
 
-- 
2.13.0
Re: [PATCH] fstests: common: Make _test_mount to include MOUNT_OPTIONS to allow consistent _test_cycle_mount
At 05/23/2017 07:13 PM, Eryu Guan wrote: On Tue, May 23, 2017 at 04:02:05PM +0800, Qu Wenruo wrote: [BUG] If using MOUNT_OPTIONS="-o nodatasum" and btrfs to run genierc/142 generic/143 and generic/154, it will cause false alert like: cp: failed to clone '/mnt/test/test-154/file2' from '/mnt/test/test-154/file1': Invalid argument MOUNT_OPTIONS is for scratch mount, and TEST_FS_MOUNT_OPTS is for test dev mount, so I think setting TEST_FS_MOUNT_OPTS to "-o nodatasum" should fix your problem. Nope, the problem is the inconsistent of TEST_MNT setup. As I described, the TEST_MNT is setup *with* MOUNT_OPTIONS for the 1st time, maybe by "check" script. But _test_cycle_mount() uses TEST_FS_MOUNT_OPTS otherthan MOUNT_OPTIONS, and leads to the problem. If using TEST_FS_MOUNT_OPTS, then 1st setup should also use TEST_FS_MOUNT_OPTS. Thanks, Qu Yeah, MOUNT_OPTIONS and TEST_FS_MOUNT_OPTS (and almost all other global variables) are not documented properly.. Thanks, Eryu [REASON] It is caused by _test_cycle_mount function, which unmount test device, but when trying to re-mount it again using _test_mount(), we don't pass $MOUNT_OPTIONS. So this makes mount options differs between _test_cycle_mount(). And btrfs doesn't allow different csum flags between reflink source and destination inodes, so it returns -EINVAL for reflink operation. [FIX] Fix it by passing $MOUNT_OPTIONS to _test_mount(), so that _test_cycle_mount() won't cause different mount options. So btrfs with "-o nodatasum" mount option can pass generic/14[23] and generic/154 without false alert. Signed-off-by: Qu Wenruo--- common/rc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/common/rc b/common/rc index ba215961..a591907c 100644 --- a/common/rc +++ b/common/rc @@ -522,7 +522,8 @@ _test_mount() return $? 
fi _test_options mount -_mount -t $FSTYP $TEST_OPTIONS $TEST_FS_MOUNT_OPTS $SELINUX_MOUNT_OPTIONS $* $TEST_DEV $TEST_DIR +_mount -t $FSTYP $TEST_OPTIONS $TEST_FS_MOUNT_OPTS $SELINUX_MOUNT_OPTIONS \ + $MOUNT_OPTIONS $* $TEST_DEV $TEST_DIR } _test_unmount() -- 2.13.0 -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS converted from EXT4 becomes read-only after reboot
On Tue, May 23, 2017 at 02:53:21PM -0700, Marc MERLIN wrote: > On Tue, May 23, 2017 at 03:51:43PM -0600, Chris Murphy wrote: > > On Tue, May 23, 2017 at 3:49 PM, Marc MERLINwrote: > > > On Tue, May 23, 2017 at 03:38:01PM -0600, Chris Murphy wrote: > > >> > I've tried an ext4 to btrfs conversion 3 times in the last 3 years, it > > >> > never worked properly any of those times, sadly. > > >> > > >> Since the 4.6 total rewrite? There are also recent bug fixes related > > >> to convert in the changelog, it should be working now and if there are > > >> problems Qu is probably interested in getting them fixed. > > > > > > It was a 4.9 kernel from debian. > > > > Convert is done by user space code, so btrfs-progs version is relevant. > > Sigh, I think you found the issue. That system only had btrfs-tools 4.4 :( On the plus side, it's good news for btrfs, that means the next time I may need/try this again, it should actually work :) (so thanks for fixing it) Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS converted from EXT4 becomes read-only after reboot
On Tue, May 23, 2017 at 03:51:43PM -0600, Chris Murphy wrote: > On Tue, May 23, 2017 at 3:49 PM, Marc MERLINwrote: > > On Tue, May 23, 2017 at 03:38:01PM -0600, Chris Murphy wrote: > >> > I've tried an ext4 to btrfs conversion 3 times in the last 3 years, it > >> > never worked properly any of those times, sadly. > >> > >> Since the 4.6 total rewrite? There are also recent bug fixes related > >> to convert in the changelog, it should be working now and if there are > >> problems Qu is probably interested in getting them fixed. > > > > It was a 4.9 kernel from debian. > > Convert is done by user space code, so btrfs-progs version is relevant. Sigh, I think you found the issue. That system only had btrfs-tools 4.4 :( Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS converted from EXT4 becomes read-only after reboot
On Tue, May 23, 2017 at 02:49:43PM -0700, Marc MERLIN wrote:
> On Tue, May 23, 2017 at 03:38:01PM -0600, Chris Murphy wrote:
> > > I've tried an ext4 to btrfs conversion 3 times in the last 3 years, it
> > > never worked properly any of those times, sadly.
> >
> > Since the 4.6 total rewrite? There are also recent bug fixes related
> > to convert in the changelog, it should be working now and if there are
> > problems Qu is probably interested in getting them fixed.
>
> It was a 4.9 kernel from debian.

It's the userspace tools that make the difference here (and what
Chris was referring to). Conversion has nothing to do with the kernel.

Hugo.

> The conversion looked like it worked, I rebooted ok, and then it got
> corrupted quickly after I deleted the subvolumes that had the old ext4 data.
> I've since wiped that disk and done a fresh btrfs install on it, because I
> had to get some work done :)
>
> Marc

-- 
Hugo Mills             | Essex: a branch of philothophy.
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |
Re: BTRFS converted from EXT4 becomes read-only after reboot
On Tue, May 23, 2017 at 3:49 PM, Marc MERLIN wrote:
> On Tue, May 23, 2017 at 03:38:01PM -0600, Chris Murphy wrote:
>> > I've tried an ext4 to btrfs conversion 3 times in the last 3 years, it
>> > never worked properly any of those times, sadly.
>>
>> Since the 4.6 total rewrite? There are also recent bug fixes related
>> to convert in the changelog, it should be working now and if there are
>> problems Qu is probably interested in getting them fixed.
>
> It was a 4.9 kernel from debian.

Convert is done by user space code, so btrfs-progs version is relevant.

-- 
Chris Murphy
Re: BTRFS converted from EXT4 becomes read-only after reboot
On Tue, May 23, 2017 at 03:38:01PM -0600, Chris Murphy wrote:
> > I've tried an ext4 to btrfs conversion 3 times in the last 3 years, it
> > never worked properly any of those times, sadly.
>
> Since the 4.6 total rewrite? There are also recent bug fixes related
> to convert in the changelog, it should be working now and if there are
> problems Qu is probably interested in getting them fixed.

It was a 4.9 kernel from debian.
The conversion looked like it worked, I rebooted ok, and then it got
corrupted quickly after I deleted the subvolumes that had the old ext4 data.
I've since wiped that disk and done a fresh btrfs install on it, because I
had to get some work done :)

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
Re: BTRFS converted from EXT4 becomes read-only after reboot
On Tue, May 23, 2017 at 11:00 AM, Marc MERLINwrote: > On Thu, May 04, 2017 at 03:55:28AM +, Duncan wrote: >> > But that alone may not fix it, I think you need a newer kernel... >> >> Well, while the 4.4 LTS kernel series /is/ getting a bit long in the >> tooth by now, it's still the second newest LTS series available, 4.9 >> being the newest. >> >> And on-list we've long recommended staying within the latest two series >> in either the LTS or current kernel series, which means the 4.4 series >> should still be reasonably supported. > > For what it's worth, I also had an ext4 filesystem created by some > debian testing install, tried to convert it, it worked, I rebooted, then > it worked once and got corrupted and in the end I had to destroy it and > convert by hand. > > I've tried an ext4 to btrfs conversion 3 times in the last 3 years, it > never worked properly any of those times, sadly. Since the 4.6 total rewrite? There are also recent bug fixes related to convert in the changelog, it should be working now and if there are problems Qu is probably interested in getting them fixed. -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 4.11.1: cannot btrfs check --repair a filesystem, causes heavy memory stalls
On Tue, 23 May 2017 07:21:33 -0400, "Austin S. Hemmelgarn" wrote:

> On 2017-05-22 22:07, Chris Murphy wrote:
> > On Mon, May 22, 2017 at 5:57 PM, Marc MERLIN wrote:
> >> On Mon, May 22, 2017 at 05:26:25PM -0600, Chris Murphy wrote:
> [...]
> [...]
> [...]
> >>
> >> Oh, swap will work, you're sure?
> >> I already have an SSD, if that's good enough, I can give it a
> >> shot.
> >
> > Yeah although I have no idea how much swap is needed for it to
> > succeed. I'm not sure what the relationship is to fs metadata chunk
> > size to btrfs check RAM requirement is; but if it wants all of the
> > metadata in RAM, then whatever btrfs fi us shows you for metadata
> > may be a guide (?) for how much memory it's going to want.
> I think the in-memory storage is a bit more space efficient than the
> on-disk storage, but I'm not certain, and I'm pretty sure it takes up
> more space when it's actually repairing things. If I'm doing the
> math correctly, you _may_ need up to 50% _more_ than the total
> metadata size for the FS in virtual memory space.
> >
> > Another possibility is zswap, which still requires a backing device,
> > but it might be able to limit how much swap to disk is needed if the
> > data to swap out is highly compressible. *shrug*
> >
> zswap won't help in that respect, but it might make swapping stuff
> back in faster. It just keeps a compressed copy in memory in
> parallel to writing the full copy out to disk, then uses that
> compressed copy to swap in instead of going to disk if the copy is
> still in memory (but it will discard the compressed copies if memory
> gets really low). In essence, it reduces the impact of swapping when
> memory pressure is moderate (the situation for most desktops for
> example), but becomes almost useless when you have very high memory
> pressure (which is what describes this usage).

Is this really how zswap works?

I always thought it acts as a compressed write-back cache in front of
the swap devices. Pages first go to zswap compressed, and later
write-back kicks in and migrates those compressed pages to real swap,
but still compressed. This is done by zswap putting two (or up to three
in modern kernels) compressed pages into one page. It has the downside
of uncompressing all "buddy pages" when only one is needed back in. But
it stays compressed. This also tells me zswap will either achieve
around 1:2 or 1:3 effective compression ratio or none. So it cannot be
compared to how streaming compression works.

OTOH, if the page is reloaded from cache before write-back kicks in, it
will never be written to swap but just uncompressed and discarded from
the cache.

Under high memory pressure it doesn't really work that well due to high
CPU overhead if pages constantly swap out, compress, write, read,
uncompress, swap in... This usually results in very low CPU usage for
processes but high IO and disk wait and high kernel CPU usage. But it
defers memory pressure conditions to a little later in exchange for a
little more IO usage and more CPU usage. If you have a lot of inactive
memory around, it can make a difference. But it is counterproductive if
almost all your memory is active and pressure is high.

So, in this scenario, it probably still doesn't help.

--
Regards,
Kai

Replies to list-only preferred.
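The "two or three compressed pages per page" point above can be illustrated with a toy model. This is only a sketch under stated assumptions: the 4096-byte page size, the first-fit packing and the name effective_ratio are illustrative choices made here, not the kernel's actual allocator.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def effective_ratio(compressed_sizes, buddies_per_page=2, page_size=PAGE_SIZE):
    """Toy model: pack compressed pages first-fit into backing pages.

    Each backing page holds at most `buddies_per_page` compressed pages
    whose sizes sum to at most `page_size`. Returns the number of original
    pages divided by the number of backing pages used.
    """
    backing = []  # each entry is [bytes_used, buddies_stored]
    for size in compressed_sizes:
        if size > page_size:
            raise ValueError("a compressed page cannot exceed the page size")
        for slot in backing:
            if slot[1] < buddies_per_page and slot[0] + size <= page_size:
                slot[0] += size
                slot[1] += 1
                break
        else:
            # No existing backing page had room, so start a new one.
            backing.append([size, 1])
    return len(compressed_sizes) / len(backing) if backing else 1.0


if __name__ == "__main__":
    print(effective_ratio([1000] * 6, buddies_per_page=2))
    print(effective_ratio([1000] * 6, buddies_per_page=3))
    print(effective_ratio([3000] * 6, buddies_per_page=2))
```

In this model, pages that compress very well still cannot beat the buddy limit, and pages that compress poorly end up one per backing page, which matches the "1:2 or 1:3 or none" observation.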
Re: 4.11.1: cannot btrfs check --repair a filesystem, causes heavy memory stalls
On Mon, May 22, 2017 at 09:19:34AM +, Duncan wrote:
> btrfs check is userspace, not kernelspace. The btrfs-transacti threads

That was my understanding, yes, but since I got it to starve my system,
including the in-kernel OOM issues I pasted in my last message and just
referenced in https://bugzilla.kernel.org/show_bug.cgi?id=195863
I think it's not quite as black and white as running a userland
process that takes too much RAM and gets killed if it does.

> are indeed kernelspace, but the problem would appear to be either IO or
> memory starvation triggered by the userspace check hogging all available
> resources, not leaving enough for normal system, including kernel,
> processes.

Looks like it, but also memory.

> * Keeping the number of snapshots as low as possible is strongly
> recommended by pretty much everyone here, definitely under 300 per
> subvolume and if possible, to double-digits per subvolume.

I agree that fewer snapshots is better, but between recovery snapshots
and btrfs snapshots for some amount of subvolumes, things add up :)

gargamel:/mnt/btrfs_pool1# btrfs subvolume list . | wc -l
93
gargamel:/mnt/btrfs_pool2# btrfs subvolume list . | wc -l
103

> * I personally recommend disabling qgroups, unless you're actively
> working with the devs on improving them. In addition to the scaling
> issues, quotas simply aren't reliable enough on btrfs yet to rely on them
> if the use-case requires them (in which case using a mature filesystem
> where they're proven to work is recommended), and if it doesn't, there's
> simply too many remaining issues for the qgroups option to be worth it.

I had considered using them at some point to track the size of each
subvolume, but good to know they're still not quite ready yet.

> * I personally recommend keeping overall filesystem size to something one
> can reasonably manage. Most people's use-cases aren't going to allow for
> an fsck taking days and tens of GiB, but /will/ allow for multi-TB
> filesystems to be split out into multiple independent filesystems of
> perhaps a TB or two each, tops, if that's the alternative to multiple-day
> fscks taking tens of GiB. (Some use-cases are of course exceptions.)

fsck ran in 6H with bcache, but the lowmem one could take a lot longer.
Running over nbd to another host with more RAM could indeed take days
given the loss of bcache and the added latency/bandwidth limits of a
network.

> * The low-memory-mode btrfs check is being developed, tho unfortunately
> it doesn't yet do repairs. (Another reason is that it's an alternate
> implementation that provides a very useful second opinion and the ability
> to cross-check one implementation against the other in hard problem
> cases.)

True.

> >> Sadly, I tried a scrub on the same device, and it stalled after 6TB.
> >> The scrub process went zombie and the scrub never succeeded, nor could
> >> it be stopped.
>
> Quite apart from the "... after 6TB" bit setting off my own "it's too big
> to reasonably manage" alarm, the filesystem obviously is bugged, and
> scrub as well, since it shouldn't just go zombie regardless of the
> problem -- it should fail much more gracefully.

:)
In this case it's mostly big files, so it's fine metadata wise but takes
a while to scrub (<24H though).

The problem I had is that I copied all of dshelf2 onto dshelf1 while I
blew ds2, and rebuilt it. That extra metadata (many smaller files)
tipped the metadata size of ds1 over the edge.
Once I blew that backup, things became ok again.

> Meanwhile, FWIW, unlike check, scrub /is/ kernelspace.

Correct, just like balance.

> As explained, check is userspace, but as you found, it can still
> interfere with kernelspace, including unrelated btrfs-transaction
> threads. When the system's out of memory, it's out of memory.

Userspace should not take the entire system down without the OOM killer
even firing.
Also, in the logs I just sent, it showed that none of my swap space had
been used. Why would that be?

> Tho there is ongoing work into better predicting memory allocation needs
> for btrfs kernel threads and reserving memory space accordingly, so this
> sort of thing doesn't happen any more.

That would be good.

> Agreed. Lowmem mode looks like about your only option, beyond simply
> blowing it away, at this point. Too bad it doesn't do repair yet,

but it's not an option since it won't fix the small corruption issue
I had.
Thankfully deleting enough metadata allowed it to run within my RAM and
check --repair fixed it now.

> with a bit of luck it should at least give you and the devs some idea
> what's wrong, information that can in turn be used to fix both scrub and
> normal check mode, as well as low-mem repair mode, once it's available.

In this case, not useful information for the devs. It's a bad SAS card
that corrupted my data, not a bug in the kernel code.

> Of course your "days" comment is triggering my "it's too big to maintain"
> reflex again, but obviously
Re: BTRFS converted from EXT4 becomes read-only after reboot
On Thu, May 04, 2017 at 03:55:28AM +, Duncan wrote:
> > But that alone may not fix it, I think you need a newer kernel...
>
> Well, while the 4.4 LTS kernel series /is/ getting a bit long in the
> tooth by now, it's still the second newest LTS series available, 4.9
> being the newest.
>
> And on-list we've long recommended staying within the latest two series
> in either the LTS or current kernel series, which means the 4.4 series
> should still be reasonably supported.

For what it's worth, I also had an ext4 filesystem created by some
debian testing install, tried to convert it, it worked, I rebooted, then
it worked once and got corrupted and in the end I had to destroy it and
convert by hand.

I've tried an ext4 to btrfs conversion 3 times in the last 3 years, it
never worked properly any of those times, sadly.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
On Tue, May 02, 2017 at 05:01:02AM +, Duncan wrote:
> Marc MERLIN posted on Mon, 01 May 2017 20:23:46 -0700 as excerpted:
>
> > Also, how is --mode=lowmem being useful?
>
> FWIW, I just watched your talk that's linked from the wiki, and wondered
> what you were doing these days as I hadn't seen any posts from you here
> for awhile.

First, sorry for the late reply. Because you didn't Cc me in the answer,
it went to a different folder where I only saw your replies now.
Off topic, but basically I'm not dead or anything, I have btrfs working
well enough to not mess with it further because I have many other
hobbies :) that is unless I put a new SAS card in my server, hit some
corruption bugs, and now I'm back spending days fixing the system.

> Well, that you're asking that question confirms you've not been following
> the list too closely... Of course that's understandable as people have
> other stuff to do, but just sayin'.

That's exactly right. I'm subscribed to way too many lists on way too
many topics to be up to date with all, sadly :(

> Of course on-list I'm somewhat known for my arguments propounding the
> notion that any filesystem that's too big to be practically maintained
> (including time necessary to restore from backups, should that be
> necessary for whatever reason) is... too big... and should ideally be
> broken along logical and functional boundaries into a number of
> individual smaller filesystems until such point as each one is found to
> be practically maintainable within a reasonably practical time frame.
> Don't put all the eggs in one basket, and when the bottom of one of those
> baskets inevitably falls out, most of your eggs will be safe in other
> baskets. =:^)

That's a valid point, and in my case, I can back it up/restore, it just
takes a bit of time, but most of the time is manually babysitting all
those subvolumes that I need to recreate by hand with btrfs send/restore
relationships, which all get lost during backup/restore.
This is the most painful part.
What's too big? I've only ever used a filesystem that fits on a raid
of 4 data drives. That value has increased over time, but I don't have
a crazy array of 20+ drives as a single filesystem, or anything.
Since drives have gotten bigger, but not that much faster, I use bcache
to make things more acceptable in speed.

> *BUT*, and here's the "go further" part, keep in mind that subvolume-read-
> only is a property, gettable and settable by btrfs property.
>
> So you should be able to unset the read-only property of a subvolume or
> snapshot, move it, then if desired, set it again.
>
> Of course I wouldn't expect send -p to work with such a snapshot, but
> send -c /might/ still work, I'm not actually sure but I'd consider it
> worth trying. (I'd try -p as well, but expect it to fail...)

That's an interesting point, thanks for making it.
In that case, I did have to destroy and recreate the filesystem since
btrfs check --repair was unable to fix it, but knowing how to reparent
read only subvolumes may be handy in the future, thanks.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: 4.11.1: cannot btrfs check --repair a filesystem, causes heavy memory stalls
On Tue, May 23, 2017 at 07:21:33AM -0400, Austin S. Hemmelgarn wrote:
> > Yeah although I have no idea how much swap is needed for it to
> > succeed. I'm not sure what the relationship is to fs metadata chunk
> > size to btrfs check RAM requirement is; but if it wants all of the
> > metadata in RAM, then whatever btrfs fi us shows you for metadata may
> > be a guide (?) for how much memory it's going to want.
>
> I think the in-memory storage is a bit more space efficient than the on-disk
> storage, but I'm not certain, and I'm pretty sure it takes up more space
> when it's actually repairing things. If I'm doing the math correctly, you
> _may_ need up to 50% _more_ than the total metadata size for the FS in
> virtual memory space.

So I was able to rescue/fix my system by removing a bunch of temporary
data on it, which in turn freed up enough metadata for btrfs check to
work again.
The things to check were minor, so they were fixed quickly.

I seem to have been the last person to edit
https://btrfs.wiki.kernel.org/index.php/Btrfsck
and it's therefore way out of date :)

I propose the following:
1) One dev needs to confirm that as long as you have enough swap, btrfs
check should work. Give some guideline of metadata size to swap size.
Then again I think swap doesn't help, see below.

2) I still think there is an issue with either the OOM killer, or btrfs
check actually chewing up kernel RAM. I've never seen any linux system
die in the spectacular ways mine died with that btrfs check, if it were
only taking userspace RAM.
I've filed a bug, because it looks bad:
https://bugzilla.kernel.org/show_bug.cgi?id=195863

Can someone read those better than me? Is it userspace RAM that is missing?
You said that swap would help, but in the dump below, I see:
Free swap = 15366388kB
so my swap was unused and the system crashed due to OOM anyway.

btrfs-transacti: page allocation stalls for 23508ms, order:0, mode:0x1400840(GFP_NOFS|__GFP_NOFAIL), nodemask=(null) btrfs-transacti cpuset=/ mems_allowed=0 Mem-Info: active_anon:5274313 inactive_anon:378373 isolated_anon:3590 active_file:3711 inactive_file:3809 isolated_file:0 unevictable:1467 dirty:5068 writeback:49189 unstable:0 slab_reclaimable:8721 slab_unreclaimable:67310 mapped:556943 shmem:801313 pagetables:15777 bounce:0 free:89741 free_pcp:6 free_cma:0 Node 0 active_anon:21097252kB inactive_anon:1513492kB active_file:14844kB inactive_file:15236kB unevictable:5868kB isolated(anon):14360kB isolated(file):0kB mapped:2227772kB dirty:20272kB writeback:196756kB shmem:3205252kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:215184 all_unreclaimable? no Node 0 DMA free:15880kB min:168kB low:208kB high:248kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15972kB managed:15888kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB lowmem_reserve[]: 0 3201 23768 23768 23768 Node 0 DMA32 free:116720kB min:35424kB low:44280kB high:53136kB active_anon:3161376kB inactive_anon:8kB active_file:320kB inactive_file:332kB unevictable:0kB writepending:612kB present:3362068kB managed:3296500kB mlocked:0kB slab_reclaimable:460kB slab_unreclaimable:668kB kernel_stack:16kB pagetables:7292kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB lowmem_reserve[]: 0 0 20567 20567 20567 Node 0 Normal free:226664kB min:226544kB low:283180kB high:339816kB active_anon:17935552kB inactive_anon:1513564kB active_file:14524kB inactive_file:14904kB unevictable:5868kB writepending:216372kB present:21485568kB managed:21080208kB mlocked:5868kB slab_reclaimable:34412kB slab_unreclaimable:268520kB kernel_stack:12480kB pagetables:55816kB bounce:0kB free_pcp:148kB local_pcp:0kB free_cma:0kB lowmem_reserve[]: 
0 0 0 0 0 Node 0 DMA: 0*4kB 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15880kB Node 0 DMA32: 768*4kB (UME) 740*8kB (UME) 685*16kB (UME) 446*32kB (UME) 427*64kB (UME) 233*128kB (UME) 79*256kB (UME) 10*512kB (UME) 0*1024kB 0*2048kB 0*4096kB = 116720kB Node 0 Normal: 25803*4kB (UME) 11297*8kB (UME) 947*16kB (UME) 260*32kB (ME) 72*64kB (UM) 15*128kB (UM) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 223844kB Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB 858720 total pagecache pages 49221 pages in swap cache Swap cache stats: add 62319, delete 13131, find 75/76 Free swap = 15366388kB Total swap = 15616764kB 6215902 pages RAM 0 pages HighMem/MovableOnly 117753 pages reserved 4096 pages cma reserved I'm also happy to modify the wiki to 1) mention that there is a lowmem mode which in turn isn't really useful for much yet since it won't repair even a trivial thing (seen patches go around, but not in upstream yet) 2) warn that for now check --repair of a big filesystem will crash
Re: [PATCH v3 00/19] Btrfs-progs offline scrub
Okay, I did multiple (up to 11) corruptions on the same file.
It seems to report 'CORRUPTED' when we corrupt two consecutive data
stripes, and 'RECOVERABLE' whenever recovery is possible. It looks fine.
You can find the logs and test scripts at the link below. Thanks.

https://github.com/Lakshmipathi/btrfs_offline_scrub

--
Cheers,
Lakshmipathi.G
Re: 4.11.1: cannot btrfs check --repair a filesystem, causes heavy memory stalls
On 2017-05-22 22:07, Chris Murphy wrote:
> On Mon, May 22, 2017 at 5:57 PM, Marc MERLIN wrote:
>> On Mon, May 22, 2017 at 05:26:25PM -0600, Chris Murphy wrote:
>>> On Mon, May 22, 2017 at 10:31 AM, Marc MERLIN wrote:
>>>> I already have 24GB of RAM in that machine, adding more for the real
>>>> fsck repair to run, is going to be difficult and nbd would take days I
>>>> guess (then again I don't have a machine with 32 or 48 or 64GB of RAM
>>>> anyway).
>>>
>>> If you can acquire an SSD, you can give the system a bunch of swap,
>>> and at least then hopefully the check repair can complete. Yes it'll
>>> be slower than with real RAM but it's not nearly as bad as you might
>>> think it'd be, based on HDD based swap.
>>
>> Oh, swap will work, you're sure?
>> I already have an SSD, if that's good enough, I can give it a shot.
>
> Yeah although I have no idea how much swap is needed for it to
> succeed. I'm not sure what the relationship is to fs metadata chunk
> size to btrfs check RAM requirement is; but if it wants all of the
> metadata in RAM, then whatever btrfs fi us shows you for metadata may
> be a guide (?) for how much memory it's going to want.

I think the in-memory storage is a bit more space efficient than the
on-disk storage, but I'm not certain, and I'm pretty sure it takes up
more space when it's actually repairing things. If I'm doing the math
correctly, you _may_ need up to 50% _more_ than the total metadata size
for the FS in virtual memory space.

> Another possibility is zswap, which still requires a backing device,
> but it might be able to limit how much swap to disk is needed if the
> data to swap out is highly compressible. *shrug*

zswap won't help in that respect, but it might make swapping stuff back
in faster. It just keeps a compressed copy in memory in parallel to
writing the full copy out to disk, then uses that compressed copy to
swap in instead of going to disk if the copy is still in memory (but it
will discard the compressed copies if memory gets really low). In
essence, it reduces the impact of swapping when memory pressure is
moderate (the situation for most desktops for example), but becomes
almost useless when you have very high memory pressure (which is what
describes this usage).
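The "up to 50% more than the total metadata size" rule of thumb above can be turned into a quick back-of-the-envelope calculation. The sketch below is only that rule written out: the function names are made up for illustration, the 30 GiB metadata figure is a hypothetical example, and the 24 GiB of RAM is the figure quoted in the thread. It says nothing about whether btrfs check would actually make use of the swap.

```python
GiB = 1024 ** 3


def estimated_check_memory(metadata_bytes, overhead_percent=50):
    """Upper-bound guess: total metadata size plus up to 50% more."""
    return metadata_bytes * (100 + overhead_percent) // 100


def extra_swap_needed(metadata_bytes, ram_bytes, overhead_percent=50):
    """Virtual memory the estimate asks for beyond physical RAM (never negative)."""
    needed = estimated_check_memory(metadata_bytes, overhead_percent)
    return max(0, needed - ram_bytes)


if __name__ == "__main__":
    metadata = 30 * GiB  # hypothetical value, e.g. read from "btrfs fi us"
    ram = 24 * GiB       # RAM in the machine discussed in the thread
    print(extra_swap_needed(metadata, ram) // GiB, "GiB of swap by this estimate")
```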
Re: [PATCH] fstests: common: Make _test_mount to include MOUNT_OPTIONS to allow consistent _test_cycle_mount
On Tue, May 23, 2017 at 04:02:05PM +0800, Qu Wenruo wrote:
> [BUG]
> If using MOUNT_OPTIONS="-o nodatasum" and btrfs to run genierc/142
> generic/143 and generic/154, it will cause false alert like:
> cp: failed to clone '/mnt/test/test-154/file2' from
> '/mnt/test/test-154/file1': Invalid argument

MOUNT_OPTIONS is for scratch mount, and TEST_FS_MOUNT_OPTS is for test
dev mount, so I think setting TEST_FS_MOUNT_OPTS to "-o nodatasum"
should fix your problem.

Yeah, MOUNT_OPTIONS and TEST_FS_MOUNT_OPTS (and almost all other global
variables) are not documented properly..

Thanks,
Eryu

>
> [REASON]
> It is caused by _test_cycle_mount function, which unmount test device,
> but when trying to re-mount it again using _test_mount(), we don't pass
> $MOUNT_OPTIONS.
>
> So this makes mount options differs between _test_cycle_mount().
>
> And btrfs doesn't allow different csum flags between reflink source and
> destination inodes, so it returns -EINVAL for reflink operation.
>
> [FIX]
> Fix it by passing $MOUNT_OPTIONS to _test_mount(), so that
> _test_cycle_mount() won't cause different mount options.
> So btrfs with "-o nodatasum" mount option can pass generic/14[23]
> and generic/154 without false alert.
>
> Signed-off-by: Qu Wenruo
> ---
>  common/rc | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/common/rc b/common/rc
> index ba215961..a591907c 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -522,7 +522,8 @@ _test_mount()
>          return $?
>      fi
>      _test_options mount
> -    _mount -t $FSTYP $TEST_OPTIONS $TEST_FS_MOUNT_OPTS $SELINUX_MOUNT_OPTIONS $* $TEST_DEV $TEST_DIR
> +    _mount -t $FSTYP $TEST_OPTIONS $TEST_FS_MOUNT_OPTS $SELINUX_MOUNT_OPTIONS \
> +        $MOUNT_OPTIONS $* $TEST_DEV $TEST_DIR
>  }
>
>  _test_unmount()
> --
> 2.13.0
Re: [PATCH] Btrfs: skip commit transaction if we don't have enough pinned bytes
On 19.05.2017 20:39, Liu Bo wrote:
> We commit transaction in order to reclaim space from pinned bytes because
> it could process delayed refs, and in may_commit_transaction(), we check
> first if pinned bytes are enough for the required space, we then check if
> that plus bytes reserved for delayed insert are enough for the required
> space.
>
> This changes the code to the above logic.
>
> Signed-off-by: Liu Bo

Please add:
Fixes: b150a4f10d87 ("Btrfs: use a percpu to keep track of possibly pinned bytes")

> ---
>  fs/btrfs/extent-tree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index e390451c72e6..bded1ddd1bb6 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -4837,7 +4837,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
>
>  	spin_lock(&delayed_rsv->lock);
>  	if (percpu_counter_compare(&space_info->total_bytes_pinned,
> -				   bytes - delayed_rsv->size) >= 0) {
> +				   bytes - delayed_rsv->size) < 0) {
>  		spin_unlock(&delayed_rsv->lock);
>  		return -ENOSPC;
>  	}
>

With the minor nit above:

Reviewed-by: Nikolay Borisov
Tested-by: Nikolay Borisov
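For readers who want the two checks from the commit message spelled out, here is a small model of the decision. It is a sketch, not the kernel function: plain integers stand in for the per-CPU counter and the reservation, locking is omitted, the function name is invented for illustration, and the first check is taken from the commit message's description rather than from the diff.

```python
import errno


def commit_could_help(total_bytes_pinned, bytes_needed, delayed_rsv_size):
    """Return 0 if committing could reclaim enough space, else -ENOSPC."""
    # First check: pinned bytes alone already cover the request.
    if total_bytes_pinned >= bytes_needed:
        return 0
    # Second check: pinned bytes plus the delayed reservation must cover it.
    # This is the comparison the patch changes from ">= 0" to "< 0".
    if total_bytes_pinned < bytes_needed - delayed_rsv_size:
        return -errno.ENOSPC
    return 0


if __name__ == "__main__":
    print(commit_could_help(100, 80, 0))
    print(commit_could_help(50, 80, 40))
    print(commit_could_help(30, 80, 40))
```

With the pre-patch comparison, the second call (50 pinned, 80 needed, 40 in the delayed reservation) would have returned -ENOSPC even though a commit could free enough space.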
[PATCH] fstests: common: Make _test_mount to include MOUNT_OPTIONS to allow consistent _test_cycle_mount
[BUG]
If using MOUNT_OPTIONS="-o nodatasum" and btrfs to run generic/142
generic/143 and generic/154, it will cause false alert like:
cp: failed to clone '/mnt/test/test-154/file2' from '/mnt/test/test-154/file1': Invalid argument

[REASON]
It is caused by _test_cycle_mount function, which unmount test device,
but when trying to re-mount it again using _test_mount(), we don't pass
$MOUNT_OPTIONS.

So this makes mount options differs between _test_cycle_mount().

And btrfs doesn't allow different csum flags between reflink source and
destination inodes, so it returns -EINVAL for reflink operation.

[FIX]
Fix it by passing $MOUNT_OPTIONS to _test_mount(), so that
_test_cycle_mount() won't cause different mount options.
So btrfs with "-o nodatasum" mount option can pass generic/14[23]
and generic/154 without false alert.

Signed-off-by: Qu Wenruo
---
 common/rc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/common/rc b/common/rc
index ba215961..a591907c 100644
--- a/common/rc
+++ b/common/rc
@@ -522,7 +522,8 @@ _test_mount()
         return $?
     fi
     _test_options mount
-    _mount -t $FSTYP $TEST_OPTIONS $TEST_FS_MOUNT_OPTS $SELINUX_MOUNT_OPTIONS $* $TEST_DEV $TEST_DIR
+    _mount -t $FSTYP $TEST_OPTIONS $TEST_FS_MOUNT_OPTS $SELINUX_MOUNT_OPTIONS \
+        $MOUNT_OPTIONS $* $TEST_DEV $TEST_DIR
 }

 _test_unmount()
--
2.13.0
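The effect of the one-line change can be seen by modelling how the _mount argument list is assembled before and after the patch. This is an illustrative model only: the real helper is the shell function in common/rc, build_test_mount_args is a name invented here, and /dev/sdX1 is a placeholder device.

```python
import shlex


def build_test_mount_args(env, extra=(), patched=True):
    """Model the argument list _test_mount() passes to _mount."""
    names = ["TEST_OPTIONS", "TEST_FS_MOUNT_OPTS", "SELINUX_MOUNT_OPTIONS"]
    if patched:
        # The patch appends $MOUNT_OPTIONS after the SELinux options.
        names.append("MOUNT_OPTIONS")
    args = ["-t", env["FSTYP"]]
    for name in names:
        # Unset variables expand to nothing, as in the shell.
        args += shlex.split(env.get(name, ""))
    args += list(extra)  # stands in for "$*"
    args += [env["TEST_DEV"], env["TEST_DIR"]]
    return args


if __name__ == "__main__":
    env = {
        "FSTYP": "btrfs",
        "MOUNT_OPTIONS": "-o nodatasum",
        "TEST_DEV": "/dev/sdX1",  # placeholder device
        "TEST_DIR": "/mnt/test",
    }
    print(build_test_mount_args(env, patched=False))
    print(build_test_mount_args(env, patched=True))
```

With only MOUNT_OPTIONS exported, the unpatched list carries no nodatasum, so a remount through _test_mount() ends up with different options; setting TEST_FS_MOUNT_OPTS as well, as suggested in the reply, makes both lists carry it.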
Re: QGroups Semantics
Hit "Send" a little too early: More complete workaround would be delayed cleanup. What about (re-)mount time? (Should also handle qgroups remaining ... after subvolumes deleted on previous kernels.) -- With Best Regards, Marat Khalili On 23/05/17 08:38, Marat Khalili wrote: Just some user's point of view: I propose the following changes: 1) We always cleanup level-0 qgroups by default, with no opt-out. I see absolutely no reason to keep these around. It WILL break scripts that try to do this cleanup themselves. OTOH it will simplify writing new ones. Since qgroups are assigned sequential numbers, it must be possible to partially work it around by not returning error on repeated delete. But you cannot completely emulate qgroup presence without actually keeping it, so some scripts will still break. More complete workaround would be delayed cleanup. What about (re-)mount time? (Should also handle qgroups remaining ) We do not allow the creation of level-0 qgroups for (sub)volumes that do not exist. Probably I'm mistaken, but I see no reasons for doing it even now, since I don't think it's possible to reliably assign existing 0-level qgroup to a new subvolume. So this change should break nothing. Why do we allow deleting a level 0 qgroup for a currently existing subvolume? 4) Add a flag to the qgroup_delete_v2 ioctl, NO_SUBVOL_CHECK. If the flag is present, it will allow you to delete qgroups which reference active subvolumes. Some people doing cleanup in the reverse order? Other than this, I don't understand why this feature is needed, so IMO it's unlikely to be needed in a new API. Of course, this is all just one datapoint for you. 
--
With Best Regards,
Marat Khalili
Re: [PATCH 7/8] btrfs: Add code to prevent qgroup creation for a non-existent subvol
Hi Sargun, [auto build test WARNING on linus/master] [also build test WARNING on v4.12-rc2 next-20170522] [cannot apply to btrfs/next] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Sargun-Dhillon/BtrFS-QGroups-uapi-improvements/20170523-111746 reproduce: # apt-get install sparse make ARCH=x86_64 allmodconfig make C=1 CF=-D__CHECK_ENDIAN__ sparse warnings: (new ones prefixed by >>) include/linux/compiler.h:264:8: sparse: attribute 'no_sanitize_address': unknown attribute >> fs/btrfs/tests/qgroup-tests.c:232:34: sparse: not enough arguments for >> function btrfs_create_qgroup fs/btrfs/tests/qgroup-tests.c:334:34: sparse: not enough arguments for function btrfs_create_qgroup fs/btrfs/tests/qgroup-tests.c: In function 'test_no_shared_qgroup': fs/btrfs/tests/qgroup-tests.c:232:8: error: too few arguments to function 'btrfs_create_qgroup' ret = btrfs_create_qgroup(NULL, fs_info, BTRFS_FS_TREE_OBJECTID); ^~~ In file included from fs/btrfs/tests/qgroup-tests.c:24:0: fs/btrfs/tests/../qgroup.h:127:5: note: declared here int btrfs_create_qgroup(struct btrfs_trans_handle *trans, ^~~ fs/btrfs/tests/qgroup-tests.c: In function 'test_multiple_refs': fs/btrfs/tests/qgroup-tests.c:334:8: error: too few arguments to function 'btrfs_create_qgroup' ret = btrfs_create_qgroup(NULL, fs_info, BTRFS_FIRST_FREE_OBJECTID); ^~~ In file included from fs/btrfs/tests/qgroup-tests.c:24:0: fs/btrfs/tests/../qgroup.h:127:5: note: declared here int btrfs_create_qgroup(struct btrfs_trans_handle *trans, ^~~ vim +232 fs/btrfs/tests/qgroup-tests.c faa2dbf0 Josef Bacik2014-05-07 216 btrfs_free_path(path); faa2dbf0 Josef Bacik2014-05-07 217 return ret; faa2dbf0 Josef Bacik2014-05-07 218 } faa2dbf0 Josef Bacik2014-05-07 219 b9ef22de Feifei Xu 2016-06-01 220 static int test_no_shared_qgroup(struct btrfs_root *root, b9ef22de Feifei Xu 2016-06-01 221 u32 sectorsize, u32 nodesize) faa2dbf0 Josef Bacik2014-05-07 
222 { faa2dbf0 Josef Bacik2014-05-07 223 struct btrfs_trans_handle trans; faa2dbf0 Josef Bacik2014-05-07 224 struct btrfs_fs_info *fs_info = root->fs_info; 442244c9 Qu Wenruo 2015-04-16 225 struct ulist *old_roots = NULL; 442244c9 Qu Wenruo 2015-04-16 226 struct ulist *new_roots = NULL; faa2dbf0 Josef Bacik2014-05-07 227 int ret; faa2dbf0 Josef Bacik2014-05-07 228 7c55ee0c Omar Sandoval 2015-09-29 229 btrfs_init_dummy_trans(); faa2dbf0 Josef Bacik2014-05-07 230 faa2dbf0 Josef Bacik2014-05-07 231 test_msg("Qgroup basic add\n"); ef9f2db3 Feifei Xu 2016-06-01 @232 ret = btrfs_create_qgroup(NULL, fs_info, BTRFS_FS_TREE_OBJECTID); faa2dbf0 Josef Bacik2014-05-07 233 if (ret) { faa2dbf0 Josef Bacik2014-05-07 234 test_msg("Couldn't create a qgroup %d\n", ret); faa2dbf0 Josef Bacik2014-05-07 235 return ret; faa2dbf0 Josef Bacik2014-05-07 236 } faa2dbf0 Josef Bacik2014-05-07 237 442244c9 Qu Wenruo 2015-04-16 238 /* 01327610 Nicholas D Steeves 2016-05-19 239 * Since the test trans doesn't have the complicated delayed refs, 442244c9 Qu Wenruo 2015-04-16 240 * we can only call btrfs_qgroup_account_extent() directly to test :: The code at line 232 was first introduced by commit :: ef9f2db365c31433e52b0c5863793273bb632666 Btrfs: self-tests: Use macros instead of constants and add missing newline :: TO: Feifei Xu <xufei...@linux.vnet.ibm.com> :: CC: David Sterba <dste...@suse.com> --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] Btrfs: compression must free at least PAGE_SIZE
Hi Timofey,

[auto build test ERROR on v4.9-rc8]
[also build test ERROR on next-20170522]
[cannot apply to btrfs/next]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Timofey-Titovets/Btrfs-lzo-c-pr_debug-deflate-lzo/20170523-110651
config: x86_64-kexec (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64

All errors (new ones prefixed by >>):

   fs/btrfs/lzo.c: In function 'lzo_compress_pages':
>> fs/btrfs/lzo.c:233:27: error: expected expression before '>' token
      if (tot_out + PAGE_SIZE => tot_in) {
                              ^

vim +233 fs/btrfs/lzo.c

   227			in_page = find_get_page(mapping, start >> PAGE_SHIFT);
   228			data_in = kmap(in_page);
   229			in_len = min(bytes_left, PAGE_SIZE);
   230		}
   231
   232		/* Compression must save at least one PAGE_SIZE */
 > 233		if (tot_out + PAGE_SIZE => tot_in) {
   234			ret = -E2BIG;
   235			goto out;
   236		}

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation