Re: [BUG] cannot mount subvolume with selinux context
Original Message
Subject: Re: [BUG] cannot mount subvolume with selinux context
From: Eryu Guan guane...@gmail.com
To: Zach Brown z...@zabbo.net
Date: 2014-08-20 11:57

> On Tue, Aug 19, 2014 at 10:28:54AM -0700, Zach Brown wrote:
>> On Tue, Aug 19, 2014 at 11:32:16AM +0800, Eryu Guan wrote:
>>> Hi,
>>>
>>> Description of the problem:
>>> mount btrfs with an selinux context, then create a subvolume; the new
>>> subvolume cannot be mounted, even with the same context.
>>>
>>>   mkfs -t btrfs /dev/sda5
>>>   mount -o context=system_u:object_r:nfs_t:s0 /dev/sda5 /mnt/btrfs
>>>   btrfs subvolume create /mnt/btrfs/subvol
>>>   mount -o subvol=subvol,context=system_u:object_r:nfs_t:s0 /dev/sda5 /mnt/test
>>
>> Submit a xfstest?
>
> Sure, will do.
>
> Thanks,
> Eryu
>
>>> security_sb_copy_data() takes the selinux context data out into secdata,
>>> then mount_subvol() calls mount_fs() (via vfs_kern_mount()) again without
>>> the selinux context, so mount_subvol() fails, which fails the whole mount.
>>>
>>> Not sure what the proper fix is. Zach suggested that the fix will probably
>>> be to rework the vfs functions a bit, as he said in rh bugzilla[1].
>>
>> Yeah, I have no idea what'd be preferred here:
>>
>> - rework the vfs _kern_ mount api to offer one that doesn't mess with
>>   selinux mount options
>> - add a flag to have the second _kern_ mount ignore selinux (but not
>>   MS_KERNMOUNT?)
>> - binary data and fs selinux handling? (like nfs)

In fact, we can just make btrfs handle the subvol= mount option in a new way.

Currently, btrfs handles subvol= by calling vfs_kern_mount() again and using the vfs-level mount_subtree() to do the path search.

On the other hand, btrfs does not call vfs_kern_mount() when handling the default subvolume or a subvolid= mount. So I think we can do all of the path search inside btrfs instead of reusing the vfs-level functions, and convert the subvol= mount option to subvolid=, which should then be selinux friendly. (With this approach, mount_subvol() should be called just before get_default_root().)

If I am wrong, please tell me.

BTW, it seems that if the mainline kernel accepts the patchset which converts subvolid= to subvol=, it will make the bug more serious. :-(
Thank goodness, the successor patch uses get_path().

Thanks,
Qu

>> - z

-- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
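The four commands in the report can be collected into a small reproducer. The following is only a sketch: the device and mount points are the ones from the report, while `DRY_RUN`, `run` and `reproduce` are names made up here. It defaults to printing the commands, because running them for real needs root and wipes the device.

```shell
#!/bin/bash
# Reproducer sketch for the report above. DRY_RUN, run() and reproduce() are
# illustrative additions; the commands themselves are taken from the report.
# WARNING: with DRY_RUN=0 this runs mkfs and destroys the data on $DEV.
DRY_RUN=${DRY_RUN:-1}
DEV=${DEV:-/dev/sda5}
CTX=system_u:object_r:nfs_t:s0

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Per the report, the last mount (subvol= plus context=) is the step that fails.
reproduce() {
    run mkfs -t btrfs "$DEV" &&
    run mount -o "context=$CTX" "$DEV" /mnt/btrfs &&
    run btrfs subvolume create /mnt/btrfs/subvol &&
    run mount -o "subvol=subvol,context=$CTX" "$DEV" /mnt/test
}

reproduce
```

An xfstests case would use the suite's scratch device and helpers instead of hard-coded paths; this form is only meant for checking by hand whether a given kernel still shows the failure.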
Re: [PATCH] btrfs-progs: init uninitialized output buf for btrfs-restore
Hi Gui,

On Thursday, 21 August 2014, 11:35:36, Gui Hecheng wrote:
> A memory problem reported by valgrind as follows:
> === Syscall param pwrite64(buf) points to uninitialised byte(s)
>
> When running:
> # valgrind --leak-check=yes btrfs restore /dev/sda9 /mnt/backup
>
> Because the output buf is allocated with malloc, but the length of the output data is shorter than sizeof(buf), valgrind reports uninitialised byte(s).
> We could use calloc to replace malloc and clear this WARNING away.

Yes, the warning vanished. But the reads from free'd memory worry me more...

Marc
Re: [PATCH 0/3] btrfs-progs: remove full /dev scanning
A long time back there was an attempt to remove it, but that was avoided. Please refer to the link in this discussion:
https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg27272.html

Thanks, Anand

On 08/21/2014 06:21 AM, Eric Sandeen wrote:
> btrfs filesystem show and btrfs device scan today both have the -d option to scan everything under /dev. But we also have a mechanism to scan everything in /proc/partitions, which should always be sufficient.
>
> If anyone knows why we'd find something deep under /dev but not in /proc/partitions, speak now or forever hold your peace...
>
> Tested this by running through a matrix of -d, -m, or args for show/scan, for a 2-device fs, with and without a symlinked device, with and without a symlinked mountpoint. All output was identical.
>
> Thanks,
> -Eric
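To make the comparison concrete, this is roughly what "scan everything in /proc/partitions" has to work with. It is a sketch in shell, not the btrfs-progs implementation (which is C); `list_partition_devs` is a made-up name and the sample table holds invented values.

```shell
#!/bin/bash
# List candidate block devices from a file in /proc/partitions format.
# The first two lines of that file are a column header and a blank line;
# every other line has four fields: major, minor, #blocks, name.
list_partition_devs() {
    local table=${1:-/proc/partitions}
    awk 'NR > 2 && NF == 4 { print "/dev/" $4 }' "$table"
}

# A canned table (invented values) so the sketch runs anywhere.
sample=$(mktemp)
cat > "$sample" <<'EOF'
major minor  #blocks  name

   8        0  488386584 sda
   8        1     524288 sda1
   8       16  488386584 sdb
EOF
list_partition_devs "$sample"
rm -f "$sample"
```

Called without an argument, the function reads the live /proc/partitions. Each resulting path would then be probed for a btrfs superblock, which is the part a full walk of /dev otherwise performs on every node it finds.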
Re: [PATCH 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
On Thu, Aug 21, 2014 at 10:04:30AM +0800, Qu Wenruo wrote:
> Original Message
> Subject: [PATCH 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
> From: Eryu Guan eg...@redhat.com
> To: fste...@vger.kernel.org
> Date: 2014-08-21 01:33
>
>> Run btrfs balance and subvolume create/mount/umount/delete simultaneously,
>> with fsstress running in background.
>>
>> Signed-off-by: Eryu Guan eg...@redhat.com
>> ---
>>  tests/btrfs/057     | 147
>>  tests/btrfs/057.out |   2 +
>>  tests/btrfs/group   |   1 +
>>  3 files changed, 150 insertions(+)
>>  create mode 100755 tests/btrfs/057
>>  create mode 100644 tests/btrfs/057.out
>>
>> diff --git a/tests/btrfs/057 b/tests/btrfs/057
>> new file mode 100755
>> index 000..2f507a7
>> --- /dev/null
>> +++ b/tests/btrfs/057
>> @@ -0,0 +1,147 @@
>> +#! /bin/bash
>> +# FSQA Test No. btrfs/057
>> +#
>> +# Run btrfs balance and subvolume create/mount/umount/delete simultaneously,
>> +# with fsstress running in background.
>> +#
>> +#---
>> +# Copyright (C) 2014 Red Hat Inc. All rights reserved.
>> +#
>> +# This program is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU General Public License as
>> +# published by the Free Software Foundation.
>> +#
>> +# This program is distributed in the hope that it would be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program; if not, write the Free Software Foundation,
>> +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
>> +#
>> +#---
>> +#
>> +
>> +seq=`basename $0`
>> +seqres=$RESULT_DIR/$seq
>> +echo "QA output created by $seq"
>> +
>> +here=`pwd`
>> +tmp=/tmp/$$
>> +status=1
>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>> +
>> +_cleanup()
>> +{
>> +    cd /
>> +    rm -fr $tmp.*
>> +}
>> +
>> +# get standard environment, filters and checks
>> +. ./common/rc
>> +. ./common/filter
>> +
>> +# real QA test starts here
>> +_supported_fs btrfs
>> +_supported_os Linux
>> +_require_scratch
>> +_require_scratch_dev_pool 4
>> +
>> +rm -f $seqres.full
>> +
>> +# test case array
>> +tcs=(
>> +    "-m single -d single"
>> +    "-m dup -d single"
>> +    "-m raid0 -d raid0"
>> +    "-m raid1 -d raid0"
>> +    "-m raid1 -d raid1"
>> +    "-m raid10 -d raid10"
>> +    "-m raid5 -d raid5"
>> +    "-m raid6 -d raid6"
>> +)
>
> I wonder whether we should add the mkfs options there.
>
> Since xfstests already uses the environment variable MKFS_OPTIONS to do mkfs, if we really need to test all mkfs options, IMO it is better to change MKFS_OPTIONS on each test round.

Hmmm - I think you didn't read the code, because:

>> +run_test()
>> +{
>> +    local mkfs_opts=$1
>> +    local saved_mkfs_opts=$MKFS_OPTIONS
>> +    local subvol_mnt=$tmp.mnt
>> +
>> +    echo "Test $mkfs_opts" >>$seqres.full
>> +
>> +    MKFS_OPTIONS="$MKFS_OPTIONS $mkfs_opts"
>> +    # dup only works on single device

it's doing exactly what you suggest.

And it's wrong. This:

    _scratch_mkfs $mkfs_opts

is all that is needed. This wheel does not need reinventing. ;)

Cheers,
Dave.
--
Dave Chinner da...@fromorbit.com
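The point about passing the extra options to the helper can be sketched without a test suite at hand. `fake_scratch_mkfs` below is a hypothetical stand-in that only prints a command line, on the assumption that the real `_scratch_mkfs` appends its arguments after `$MKFS_OPTIONS`; the `-f` value is made up.

```shell
#!/bin/bash
# Hypothetical stand-in for the suite's _scratch_mkfs helper: it only prints
# the command line it would run, global options first, extra arguments after.
fake_scratch_mkfs() {
    echo "mkfs.btrfs $MKFS_OPTIONS $*"
}

MKFS_OPTIONS="-f"    # made-up global setting

tcs=(
    "-m single -d single"
    "-m raid1 -d raid1"
)

for mkfs_opts in "${tcs[@]}"; do
    # Unquoted on purpose so the profile string splits into separate words.
    fake_scratch_mkfs $mkfs_opts
done
```

Because the per-case options travel as arguments, `MKFS_OPTIONS` is never modified, so nothing has to be saved and restored around each round.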
Re: Significance of high number of mails on this list?
Shriramana Sharma posted on Thu, 21 Aug 2014 08:52:52 +0530 as excerpted:

> Hello. People on this list have been kind enough to reply to my technical questions. However, seeing the high number of mails on this list, esp with the title PATCH, I have a question about the development itself: Is this just an indication of a vibrant user/devel community [*] and healthy development of many new nice features to eventually come out in stable form later, or are we still at the "fixing rough edges" stage? IOW what is the proportion of commits adding new features to those stabilising/fixing features?
>
> [* Since there is no separate btrfs-users vs btrfs-dev I'm not able to gauge this difference either. i.e. if there were a dedicated -dev list I might not be alarmed by a high number of mails indicating fast development.]
>
> Mostly I have read things like "BTRFS is mostly stable but there might be a few corner cases as yet unknown since this is a totally new generation of FSs". But still given the volume of mails here I wanted to ask... I'm sorry, I realize I'm being a bit vague, but I'm not sure how to exactly express what I'm feeling about BTRFS right now...

Good question, but certainly one where opinions are likely to differ. Of course there's a more or less official answer on the wiki ( btrfs.wiki.kernel.org ), but given the previous research that the level of the questions you've been asking demonstrates, I expect you've seen that and, well, it's simply not at the level of detail you find yourself needing at this point, so here you are, asking for more detail and, I guess, personal opinion.

I'm going to frame my own answer as several general observations of fact which I don't believe there's likely to be much dispute over, then explain them, then follow that with my own opinion.

Observations of fact enumerated:

1) Definitions of stable differ.
2) Mailing lists distort reality.
3) Previous btrfs warnings have been removed.
4) Two major recent stability-affecting bugs.
5) Some bits of btrfs more stable than others.

Observations of fact explained:

1) Definitions of stable differ.

There's the stable that people like me sometimes simply omit the "b" from and call stale, and there's dogfoodable stable. In distro terms, sta(b)le is RHEL 5 or Debian old-stable. In filesystem terms, it's ext3, as ext4 is only now beginning to look stable.

Dogfoodable stable refers to the point at which developers and early testers find software stable enough to actually use in their ordinary daily routine.

I can't see anyone arguing that btrfs meets the sta(b)le standard or that it's any closer than a few years out. OTOH, I guess most regulars here have found btrfs dogfoodable stable for some time. But stable for most people means something in between, and just where btrfs is in that in-between is where the debate is.

2) Filesystem mailing lists distort reality.

It's exceedingly difficult to draw accurate stability conclusions (well, beyond the sta(b)le level) from a filesystem's mailing list. If the filesystem's under active development, there /will/ be numbers of active bugs reported, some of which will look pretty bad from the outside or simple sysadmin's perspective, and lots of patches floating around in various stages of development as well. An uninitiated outsider's reaction will almost certainly be "and THIS is what they call STABLE?"

But that's a fairly obvious first-order conclusion. The immediate natural reaction is to discount it, but it's just as easy to over-discount and see it as more stable than it is. Accurately calibrating the amount of discount without other information is basically impossible, so other information must be used as a primary gauge, with the mailing list possibly used as a reality check on /that/.

OTOH, I'd expect a fully mature and stable (post-mature? sta(b)le?) filesystem, without much /need/ for new development, only maintenance of current state, to have a much quieter list. Obviously as a filesystem falls into obsolescence and disuse, it'll have a quieter list as well.

3) Previous btrfs warnings have been removed.

In the last few kernel cycles the previous more or less official btrfs "will eat your babies" level instability warning has been removed. From the kernel option description, to the wording used on the wiki, to the warning in the manpages and printed by mkfs.btrfs, if there's any warning left at all, it's far less strident than it was.

Some here believe the complete removal of such warnings has been premature, altho arguably it was time to tone them down.

4) Two severe stability-affecting bugs have recently been traced and have patches working thru the pipeline as we speak.

One of these bugs has been there since very near the beginning, but happened to only be triggered rarely, which is why it wasn't caught until now. The patch is to be applied to stable going back as far as stable series (with btrfs) go and should already be in
Re: [PATCH 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
Original Message
Subject: Re: [PATCH 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
From: Dave Chinner da...@fromorbit.com
To: Qu Wenruo quwen...@cn.fujitsu.com
Date: 2014-08-21 17:01

> On Thu, Aug 21, 2014 at 10:04:30AM +0800, Qu Wenruo wrote:
>> Original Message
>> Subject: [PATCH 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
>> From: Eryu Guan eg...@redhat.com
>> To: fste...@vger.kernel.org
>> Date: 2014-08-21 01:33
>>
>>> Run btrfs balance and subvolume create/mount/umount/delete simultaneously,
>>> with fsstress running in background.
>>>
>>> Signed-off-by: Eryu Guan eg...@redhat.com
>>> ---
>>>  tests/btrfs/057     | 147
>>>  tests/btrfs/057.out |   2 +
>>>  tests/btrfs/group   |   1 +
>>>  3 files changed, 150 insertions(+)
>>>  create mode 100755 tests/btrfs/057
>>>  create mode 100644 tests/btrfs/057.out
>>>
>>> diff --git a/tests/btrfs/057 b/tests/btrfs/057
>>> new file mode 100755
>>> index 000..2f507a7
>>> --- /dev/null
>>> +++ b/tests/btrfs/057
>>> @@ -0,0 +1,147 @@
>>> +#! /bin/bash
>>> +# FSQA Test No. btrfs/057
>>> +#
>>> +# Run btrfs balance and subvolume create/mount/umount/delete simultaneously,
>>> +# with fsstress running in background.
>>> +#
>>> +#---
>>> +# Copyright (C) 2014 Red Hat Inc. All rights reserved.
>>> +#
>>> +# This program is free software; you can redistribute it and/or
>>> +# modify it under the terms of the GNU General Public License as
>>> +# published by the Free Software Foundation.
>>> +#
>>> +# This program is distributed in the hope that it would be useful,
>>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>>> +# GNU General Public License for more details.
>>> +#
>>> +# You should have received a copy of the GNU General Public License
>>> +# along with this program; if not, write the Free Software Foundation,
>>> +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
>>> +#
>>> +#---
>>> +#
>>> +
>>> +seq=`basename $0`
>>> +seqres=$RESULT_DIR/$seq
>>> +echo "QA output created by $seq"
>>> +
>>> +here=`pwd`
>>> +tmp=/tmp/$$
>>> +status=1
>>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>>> +
>>> +_cleanup()
>>> +{
>>> +    cd /
>>> +    rm -fr $tmp.*
>>> +}
>>> +
>>> +# get standard environment, filters and checks
>>> +. ./common/rc
>>> +. ./common/filter
>>> +
>>> +# real QA test starts here
>>> +_supported_fs btrfs
>>> +_supported_os Linux
>>> +_require_scratch
>>> +_require_scratch_dev_pool 4
>>> +
>>> +rm -f $seqres.full
>>> +
>>> +# test case array
>>> +tcs=(
>>> +    "-m single -d single"
>>> +    "-m dup -d single"
>>> +    "-m raid0 -d raid0"
>>> +    "-m raid1 -d raid0"
>>> +    "-m raid1 -d raid1"
>>> +    "-m raid10 -d raid10"
>>> +    "-m raid5 -d raid5"
>>> +    "-m raid6 -d raid6"
>>> +)
>>
>> I wonder whether we should add the mkfs options there.
>>
>> Since xfstests already uses the environment variable MKFS_OPTIONS to do mkfs, if we really need to test all mkfs options, IMO it is better to change MKFS_OPTIONS on each test round.
>
> Hmmm - I think you didn't read the code, because:
>
>>> +run_test()
>>> +{
>>> +    local mkfs_opts=$1
>>> +    local saved_mkfs_opts=$MKFS_OPTIONS
>>> +    local subvol_mnt=$tmp.mnt
>>> +
>>> +    echo "Test $mkfs_opts" >>$seqres.full
>>> +
>>> +    MKFS_OPTIONS="$MKFS_OPTIONS $mkfs_opts"
>>> +    # dup only works on single device
>
> it's doing exactly what you suggest.

I am afraid you misunderstood what I meant...
I just meant that these mount options should be set through the environment before running check, or set in local.conf.

Although Eryu Guan has already explained this, and it is still needed.

Thanks,
Qu

> And it's wrong. This:
>
>     _scratch_mkfs $mkfs_opts
>
> is all that is needed. This wheel does not need reinventing. ;)
>
> Cheers,
> Dave.
Re: [PATCH 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
On Thu, Aug 21, 2014 at 05:15:01PM +0800, Qu Wenruo wrote:
> Original Message
> Subject: Re: [PATCH 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
> From: Dave Chinner da...@fromorbit.com
> To: Qu Wenruo quwen...@cn.fujitsu.com
> Date: 2014-08-21 17:01
>
>> On Thu, Aug 21, 2014 at 10:04:30AM +0800, Qu Wenruo wrote:
>>> Original Message
>>> Subject: [PATCH 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
>>> From: Eryu Guan eg...@redhat.com
>>> To: fste...@vger.kernel.org
>>> Date: 2014-08-21 01:33
>>>
>>>> Run btrfs balance and subvolume create/mount/umount/delete simultaneously,
>>>> with fsstress running in background.
>>>>
>>>> +# test case array
>>>> +tcs=(
>>>> +    "-m single -d single"
>>>> +    "-m dup -d single"
>>>> +    "-m raid0 -d raid0"
>>>> +    "-m raid1 -d raid0"
>>>> +    "-m raid1 -d raid1"
>>>> +    "-m raid10 -d raid10"
>>>> +    "-m raid5 -d raid5"
>>>> +    "-m raid6 -d raid6"
>>>> +)
>>>
>>> I wonder whether we should add the mkfs options there.
>>>
>>> Since xfstests already uses the environment variable MKFS_OPTIONS to do mkfs, if we really need to test all mkfs options, IMO it is better to change MKFS_OPTIONS on each test round.
>>
>> Hmmm - I think you didn't read the code, because:
>>
>>>> +run_test()
>>>> +{
>>>> +    local mkfs_opts=$1
>>>> +    local saved_mkfs_opts=$MKFS_OPTIONS
>>>> +    local subvol_mnt=$tmp.mnt
>>>> +
>>>> +    echo "Test $mkfs_opts" >>$seqres.full
>>>> +
>>>> +    MKFS_OPTIONS="$MKFS_OPTIONS $mkfs_opts"
>>>> +    # dup only works on single device
>>
>> it's doing exactly what you suggest.
>
> I am afraid you misunderstood what I meant...
> I just meant that these mount options should be set through the environment before running check, or set in local.conf.

You can override or append to MKFS_OPTIONS and MOUNT_OPTIONS in tests if required - lots of tests do exactly that (e.g. any quota test you care to name). That modification, however, is only valid for the specific test being run, because the modification is to the environment of the test process, not the environment of the check process that is running the tests, i.e.

Running custom mkfs or mount options like this is perfectly acceptable, and I'm just commenting that the implementation of those custom options could be a lot better.

Cheers,
Dave.
--
Dave Chinner da...@fromorbit.com
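The scoping described above is ordinary process behaviour and can be seen in isolation. In this sketch `bash -c` stands in for a test script launched by the runner, and the `-f` value is made up; neither is taken from xfstests itself.

```shell
#!/bin/bash
export MKFS_OPTIONS="-f"    # made-up value standing in for the runner's setting

# Stand-in for a test script: it appends to its own copy of the variable
# and reports what it sees.
child_view=$(bash -c 'MKFS_OPTIONS="$MKFS_OPTIONS -m raid1 -d raid1"; echo "$MKFS_OPTIONS"')

echo "inside test: $child_view"
echo "after test:  $MKFS_OPTIONS"
```

The child inherits a copy of the environment, so whatever it appends disappears when it exits and the next test starts from the runner's original value.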
Re: [PATCH 00/15] xfstests: new btrfs stress test cases
On Wed, Aug 20, 2014 at 11:24:37AM -0700, Zach Brown wrote:
> On Thu, Aug 21, 2014 at 01:33:48AM +0800, Eryu Guan wrote:
>> This patchset adds new stress test cases for btrfs by running two different btrfs operations simultaneously under fsstress, to ensure btrfs doesn't hang or oops in such situations. btrfs scrub and btrfs check will be run after each test.
>
> Cool.
>
>> The test matrix is the combination of 6 btrfs operations:
>>
>>   balance
>>   create/mount/umount/delete subvolume
>>   replace device
>>   scrub
>>   defrag
>>   remount with different compress algorithms
>>
>> Short descriptions:
>>
>>   057: balance-subvolume
>>   058: balance-scrub
>>   059: balance-defrag
>>   060: balance-remount
>>   061: balance-replace
>>   062: subvolume-replace
>>   063: subvolume-scrub
>>   064: subvolume-defrag
>>   065: subvolume-remount
>>   066: replace-scrub
>>   067: replace-defrag
>>   068: replace-remount
>>   069: scrub-defrag
>>   070: scrub-remount
>>   071: defrag-remount
>
> But I'm not sure it should be built this way. At the very least each operation's implementation should be in a shared function somewhere instead of being duplicated in each test.
>
> But I don't think there should be a separate test for each combination. With a bit of fiddly bash you can automate generating unique combinations of operations that are defined as functions in one test.
>
>   btrfs_op_balance() {
>       echo hi
>   }
>
>   btrfs_op_scrub() {
>       echo hi
>   }
>
>   btrfs_op_defrag() {
>       echo hi
>   }
>
>   ops=($(declare -F | awk '/-f btrfs_op_/ {print $3}'))
>   nr=${#ops[@]}
>
>   for i in $(seq 0 $((nr - 2))); do
>       for j in $(seq $((i + 1)) $((nr - 1))); do
>           echo ${ops[i]} ${ops[j]}
>       done
>   done

Yes, it could be done like that, but historically that has proven to be a bad idea. Multiplexing tens of tests within a single test just makes it hard to determine what failed. It might fail one combination in 3.16, a different combo in 3.17 and yet another in 3.18. But from a reporting point of view, all we see is that a single test failed, rather than being able to see that there were three separate problems and that btrfs_op_scrub() was the common factor in all three failures.

It's trivial to write this as a bunch of helper functions and then boiler-plate the actual tests themselves. There will be little difference in terms of run time, but we get much more fine-grained control of execution and reporting.

Cheers,
Dave.
--
Dave Chinner da...@fromorbit.com
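The "helper functions plus boiler-plate tests" layout could look roughly like the sketch below. All names here are hypothetical: the `btrfs_op_*` bodies only echo, where real helpers would drive btrfs commands, and `run_pair` is not an existing xfstests function.

```shell
#!/bin/bash
# Hypothetical shared helpers; in a real suite these would live in a common
# file and run btrfs commands instead of echoing.
btrfs_op_balance() { echo "balance on $1"; }
btrfs_op_scrub()   { echo "scrub on $1"; }
btrfs_op_defrag()  { echo "defrag on $1"; }

# Run two operations concurrently against the same mount point and name the
# one that failed, so every pairing keeps its own pass/fail result.
run_pair() {
    local op1=$1 op2=$2 mnt=$3 rc=0
    local pid1 pid2
    "$op1" "$mnt" &
    pid1=$!
    "$op2" "$mnt" &
    pid2=$!
    wait "$pid1" || { echo "$op1 failed"; rc=1; }
    wait "$pid2" || { echo "$op2 failed"; rc=1; }
    return $rc
}

# Body of one boiler-plate test, here the balance-scrub pairing.
run_pair btrfs_op_balance btrfs_op_scrub /mnt/scratch
```

Each numbered test then shrinks to a single `run_pair` call with its two operations, so a failure report names the pairing directly while the operations themselves are written only once.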
Re: [PATCH] btrfs-progs: init uninitialized output buf for btrfs-restore
On Thu, 2014-08-21 at 10:14 +0200, Marc Dietrich wrote:
> Hi Gui,
>
> On Thursday, 21 August 2014, 11:35:36, Gui Hecheng wrote:
>> A memory problem reported by valgrind as follows:
>> === Syscall param pwrite64(buf) points to uninitialised byte(s)
>>
>> When running:
>> # valgrind --leak-check=yes btrfs restore /dev/sda9 /mnt/backup
>>
>> Because the output buf is allocated with malloc, but the length of the output data is shorter than sizeof(buf), valgrind reports uninitialised byte(s).
>> We could use calloc to replace malloc and clear this WARNING away.
>
> Yes, the warning vanished. But the reads from free'd memory worry me more...

Ah, yeah, I am looking into it, and hope that I can be of some help :)

> Marc
Re: btrfs restore memory corruption (bug: 82701)
On Mon, 2014-08-18 at 11:25 +0200, Marc Dietrich wrote:
> Hi,
>
> I did a checkout of the latest btrfs progs to repair my damaged filesystem. Running btrfs restore gives me several "failed to inflate: -6" and crashes with some memory corruption. I ran it again with valgrind and got:
>
> valgrind --log-file=x2 -v --leak-check=yes btrfs restore /dev/sda9 /mnt/backup
>
> ==8528== Memcheck, a memory error detector
> ==8528== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
> ==8528== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
> ==8528== Command: btrfs restore /dev/sda9 /mnt/backup
> ==8528== Parent PID: 8453
> ==8528==
> ==8528== Syscall param pwrite64(buf) points to uninitialised byte(s)
> ==8528==    at 0x59BE3C3: __pwrite_nocancel (in /lib64/libpthread-2.18.so)
> ==8528==    by 0x41F22F: search_dir (cmds-restore.c:392)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> ==8528==    by 0x4043FE: main (btrfs.c:286)
> ==8528==  Address 0x66956a0 is 7,056 bytes inside a block of size 8,192 alloc'd
> ==8528==    at 0x4C277AB: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==8528==    by 0x41EEAD: search_dir (cmds-restore.c:316)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> ==8528==    by 0x4043FE: main (btrfs.c:286)
>
> ---[snip]-
>
> ==8528== Invalid read of size 1
> ==8528==    at 0x4C2BF15: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
> ==8528==    by 0x41EC66: search_dir (cmds-restore.c:233)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> ==8528==    by 0x4043FE: main (btrfs.c:286)
> ==8528==  Address 0x684c186 is 1,110 bytes inside a block of size 4,224 free'd
> ==8528==    at 0x4C28ADC: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==8528==    by 0x437895: free_extent_buffer (extent_io.c:618)
> ==8528==    by 0x41E053: next_leaf (cmds-restore.c:202)
> ==8528==    by 0x41E50F: search_dir (cmds-restore.c:731)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> ==8528==    by 0x4043FE: main (btrfs.c:286)
> ==8528==
> ==8528== Invalid read of size 8
> ==8528==    at 0x4C2BF40: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
> ==8528==    by 0x41EC66: search_dir (cmds-restore.c:233)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> ==8528==    by 0x4043FE: main (btrfs.c:286)
> ==8528==  Address 0x684c178 is 1,096 bytes inside a block of size 4,224 free'd
> ==8528==    at 0x4C28ADC: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==8528==    by 0x437895: free_extent_buffer (extent_io.c:618)
> ==8528==    by 0x41E053: next_leaf (cmds-restore.c:202)
> ==8528==    by 0x41E50F: search_dir (cmds-restore.c:731)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
> ==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
> ==8528==    by 0x4043FE: main (btrfs.c:286)
> ==8528==
> ==8528== Invalid read of size 8
> ==8528==    at 0x4C2BF52: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
> ==8528==    by 0x41EC66:
Re: Significance of high number of mails on this list?
Hello!

On Thursday, 21 August 2014, 08:52:52, Shriramana Sharma wrote:
> Hello. People on this list have been kind enough to reply to my technical questions. However, seeing the high number of mails on this list, esp with the title PATCH, I have a question about the development itself: Is this just an indication of a vibrant user/devel community [*] and healthy development of many new nice features to eventually come out in stable form later, or are we still at the "fixing rough edges" stage? IOW what is the proportion of commits adding new features to those stabilising/fixing features?

Oh, well, sometimes I can guess from the patch descriptions whether this is more of a fix or more of a feature. And from what I have seen in the last week or so, it's mostly stabilization and fixing work. Also the last pull request was more about fixes.

Which is good, since fixes help to stabilize BTRFS further.

> [* Since there is no separate btrfs-users vs btrfs-dev I'm not able to gauge this difference either. i.e. if there were a dedicated -dev list I might not be alarmed by a high number of mails indicating fast development.]

Why would that make a difference?

> Mostly I have read things like "BTRFS is mostly stable but there might be a few corner cases as yet unknown since this is a totally new generation of FSs". But still given the volume of mails here I wanted to ask... I'm sorry, I realize I'm being a bit vague, but I'm not sure how to exactly express what I'm feeling about BTRFS right now...

Well, I do not really get what you are after. What is your *intention*? Do you want to try out BTRFS? Do you want to use BTRFS in production? Do you want to use BTRFS on a desktop, laptop, server? What BTRFS features do you want to use? Just a plain volume or RAID 1 and so on…

In any case, if you plan to use BTRFS I strongly recommend that you subscribe to this mailing list and be willing to deal with issues. There are hangs reported on 3.15 and 3.16 in space-full situations. I long thought 3.14 would be safe, but there have been problem reports for it as well.

That said, none of my BTRFS filesystems has corrupted itself so far. I have a slight glitch on my /home BTRFS RAID 1 that I have not yet been able to repair, but scrubbing tells me my data is good. And all of my BTRFS installations have been good at keeping stored data fine and healthy. According to scrubbing I never ever lost a single byte on *any* BTRFS drive. The exception is a BTRFS RAID 0 over 16 or 18 SAS disks which went completely bust quite some time ago, but first, this could have been a hardware issue, and second, maybe btrfs-zero-log, which I wasn't aware of, could have helped.

I close with a summary of where I use BTRFS right now:

- My main laptop is, except for /boot, completely BTRFS, partly RAID 1 on two SSDs, partly single-drive. It has been BTRFS since I got it, so more than 3 years, and RAID 1 for four months or so. I used snapshots manually there, but I stopped, since the lockups seem to happen more easily as BTRFS fills up more quickly. I think I will use snapshots again now that I am testing some patches to fix those lockups.

- My old music laptop was BTRFS except for /boot for years.

- I just moved /home on my new music laptop to BTRFS as well; / and /boot are Ext4. I did so by restoring from backup instead of using btrfs-convert.

- I have two large external eSATA HDDs which are BTRFS as well, one for more than a year. I use snapshots manually there to hold old backups. Still doing backups with rsync so far.

- I recently moved part of my server VM onto BTRFS with several subvolumes. /home and /srv are on it already: /home with maildir stuff and /srv with owncloud data storage. I will be moving /var soon as well, I think. I use snapshots manually there.

Many, if not all, of my BTRFS filesystems use lzo compression. All use space_cache. Most of them use big metadata and skinny extents, and extref for more hardlinks as well.

So while BTRFS seems to keep already stored data safe for me very reliably, I do have issues with reliability during runtime on *some* BTRFS setups, mostly those where the trees of BTRFS easily allocate all of the volume, so somewhat heavily used *and* somewhat space-constrained setups, where automatic tree rebalancing would be helpful. These require manual maintenance from time to time for now.

I know some are using btrfs send and receive already as well. I haven't yet got around to setting it up.

Also see the nice slides by Marc Merlin.

http://www.phoronix.com/scan.php?page=news_item&px=MTc2Njk
http://events.linuxfoundation.org/sites/events/files/slides/Btrfs.pdf

Thanks,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
Re: [PATCH 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
On Thu, Aug 21, 2014 at 07:01:05PM +1000, Dave Chinner wrote:
> On Thu, Aug 21, 2014 at 10:04:30AM +0800, Qu Wenruo wrote:
>> Original Message
>> Subject: [PATCH 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously
>> From: Eryu Guan eg...@redhat.com
>> To: fste...@vger.kernel.org
>> Date: 2014-08-21 01:33
>>
>>> Run btrfs balance and subvolume create/mount/umount/delete simultaneously,
>>> with fsstress running in background.
>>>
>>> Signed-off-by: Eryu Guan eg...@redhat.com

[snip]

>>> +# real QA test starts here
>>> +_supported_fs btrfs
>>> +_supported_os Linux
>>> +_require_scratch
>>> +_require_scratch_dev_pool 4
>>> +
>>> +rm -f $seqres.full
>>> +
>>> +# test case array
>>> +tcs=(
>>> +    "-m single -d single"
>>> +    "-m dup -d single"
>>> +    "-m raid0 -d raid0"
>>> +    "-m raid1 -d raid0"
>>> +    "-m raid1 -d raid1"
>>> +    "-m raid10 -d raid10"
>>> +    "-m raid5 -d raid5"
>>> +    "-m raid6 -d raid6"
>>> +)
>>
>> I wonder whether we should add the mkfs options there.
>>
>> Since xfstests already uses the environment variable MKFS_OPTIONS to do mkfs, if we really need to test all mkfs options, IMO it is better to change MKFS_OPTIONS on each test round.
>
> Hmmm - I think you didn't read the code, because:
>
>>> +run_test()
>>> +{
>>> +    local mkfs_opts=$1
>>> +    local saved_mkfs_opts=$MKFS_OPTIONS
>>> +    local subvol_mnt=$tmp.mnt
>>> +
>>> +    echo "Test $mkfs_opts" >>$seqres.full
>>> +
>>> +    MKFS_OPTIONS="$MKFS_OPTIONS $mkfs_opts"
>>> +    # dup only works on single device
>
> it's doing exactly what you suggest.
>
> And it's wrong. This:
>
>     _scratch_mkfs $mkfs_opts
>
> is all that is needed. This wheel does not need reinventing. ;)

I just noticed _scratch_pool_mkfs could do the same, thanks for the reminder!

Thanks,
Eryu

> Cheers,
> Dave.
> --
> Dave Chinner da...@fromorbit.com
[PATCH RFC] btrfs-progs: Show backtrace on BUGs
btrfs check is still under heavy development, so some BUGs are being hit. btrfs check may be run in a limited environment that lacks gdb to debug the abort in detail. If we can see a backtrace, it is easier to find the root cause of the BUG.

The following is my btrfs check output without and with the patch.

without patch:
ref mismatch on [1437411180544 16384] extent item 3, found 2
btrfs: extent_io.c:612: free_extent_buffer: Assertion `!(eb->flags & 1)' failed.
enabling repair mode
Checking filesystem on /dev/sdb2
UUID: a53121ee-679f-4241-bb44-ceb5a1a7beb7

with patch:
ref mismatch on [1437411180544 16384] extent item 3, found 2
btrfs check(free_extent_buffer+0xa3)[0x808ff89]
btrfs check[0x8090222]
btrfs check(alloc_extent_buffer+0xa7)[0x80903d0]
btrfs check(btrfs_find_create_tree_block+0x2e)[0x807edef]
btrfs check(btrfs_alloc_free_block+0x299)[0x808a3c4]
btrfs check(__btrfs_cow_block+0x1d4)[0x8078896]
btrfs check(btrfs_cow_block+0x150)[0x807934d]
btrfs check(btrfs_search_slot+0x136)[0x807c041]
btrfs check[0x8065a6b]
btrfs check[0x806bc0f]
btrfs check(cmd_check+0xc8f)[0x806d03a]
btrfs check(main+0x167)[0x804f5ec]
/lib/libc.so.6(__libc_start_main+0xe6)[0xf754c4f6]
btrfs check[0x804f061]
btrfsck: extent_io.c:612: free_extent_buffer: Assertion `!(eb->flags & 1)' failed.
enabling repair mode
Checking filesystem on /dev/sdb2
UUID: a53121ee-679f-4241-bb44-ceb5a1a7beb7

Now it's much clearer that something is wrong around alloc_extent_buffer.
Signed-off-by: Naohiro Aota na...@elisp.net
---
 Makefile | 2 +-
 kerncompat.h | 29 ++---
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index 76565e8..9db5441 100644
--- a/Makefile
+++ b/Makefile
@@ -5,7 +5,7 @@ CC = gcc
 LN = ln
 AR = ar
 AM_CFLAGS = -Wall -D_FILE_OFFSET_BITS=64 -DBTRFS_FLAT_INCLUDES -fno-strict-aliasing -fPIC
-CFLAGS = -g -O1 -fno-strict-aliasing
+CFLAGS = -g -O1 -fno-strict-aliasing -rdynamic
 objects = ctree.o disk-io.o radix-tree.o extent-tree.o print-tree.o \
 	  root-tree.o dir-item.o file-item.o inode-item.o inode-map.o \
 	  extent-cache.o extent_io.o volumes.o utils.o repair.o \
diff --git a/kerncompat.h b/kerncompat.h
index f370cd8..94fadc8 100644
--- a/kerncompat.h
+++ b/kerncompat.h
@@ -27,6 +27,7 @@
 #include <byteswap.h>
 #include <assert.h>
 #include <stddef.h>
+#include <execinfo.h>
 #include <linux/types.h>

 #ifndef READ
@@ -50,7 +51,18 @@
 #define ULONG_MAX (~0UL)
 #endif

-#define BUG() assert(0)
+#define MAX_BACKTRACE 128
+#define show_trace() \
+	do { \
+		void *trace[MAX_BACKTRACE]; \
+		int n = backtrace(trace, MAX_BACKTRACE); \
+		backtrace_symbols_fd(trace, n, 2); \
+	} while(0)
+#define BUG() \
+	do { \
+		show_trace(); \
+		assert(0); \
+	} while(0)
 #ifdef __CHECKER__
 #define __force __attribute__((force))
 #define __bitwise__ __attribute__((bitwise))
@@ -233,8 +245,19 @@ static inline long IS_ERR(const void *ptr)
 #define kstrdup(x, y) strdup(x)
 #define kfree(x) free(x)

-#define BUG_ON(c) assert(!(c))
-#define WARN_ON(c) assert(!(c))
+#define BUG_ON(c) \
+	do { \
+		if((c)) { \
+			show_trace(); \
+			assert(!(c)); \
+		} \
+	} while(0)
+#define WARN_ON(c) do { \
+		if((c)) { \
+			show_trace(); \
+			assert(!(c)); \
+		} \
+	} while(0)

 #define container_of(ptr, type, member) ({ \
--
2.0.4
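For readers who want to try the idea outside btrfs-progs, here is a minimal standalone sketch of the same technique, assuming a glibc system where <execinfo.h> is available. It wraps the patch's show_trace() macro in a function; check_refs() and its argument are hypothetical names invented for this illustration and are not part of btrfs-progs.

```c
#include <assert.h>
#include <execinfo.h>
#include <unistd.h>

#define MAX_BACKTRACE 128

/*
 * Print the current call stack to stderr and return the number of
 * frames captured. backtrace() and backtrace_symbols_fd() are the
 * glibc calls used by the patch above.
 */
static int show_trace(void)
{
	void *trace[MAX_BACKTRACE];
	int n = backtrace(trace, MAX_BACKTRACE);

	backtrace_symbols_fd(trace, n, STDERR_FILENO);
	return n;
}

/*
 * Hypothetical consistency check: dump a backtrace before reporting
 * the failure, the way BUG_ON() does in the patch (minus the abort).
 */
static int check_refs(int refs)
{
	if (refs < 0) {
		show_trace();
		return -1;
	}
	return 0;
}
```

As in the Makefile hunk, linking with -rdynamic is what lets backtrace_symbols_fd() print function names for the program's own symbols; without it, most frames are shown as bare addresses.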
[PATCH] btrfs-progs: Do not free dirty extent buffer
free_some_buffers() should not free dirty extent buffers. They should be left for a later commit.

Signed-off-by: Naohiro Aota na...@elisp.net
---
 extent_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/extent_io.c b/extent_io.c
index a127e54..8a668be 100644
--- a/extent_io.c
+++ b/extent_io.c
@@ -552,7 +552,7 @@ static int free_some_buffers(struct extent_io_tree *tree)
 	list_for_each_safe(node, next, &tree->lru) {
 		eb = list_entry(node, struct extent_buffer, lru);
-		if (eb->refs == 1) {
+		if (eb->refs == 1 && !(eb->flags & EXTENT_DIRTY)) {
 			free_extent_buffer(eb);
 			if (tree->cache_size < cache_hard_max)
 				break;
--
2.0.4
Re: btrfs restore memory corruption (bug: 82701)
On Thursday, 21 August 2014, 17:52:16, Gui Hecheng wrote:
On Mon, 2014-08-18 at 11:25 +0200, Marc Dietrich wrote:
Hi,

I did a checkout of the latest btrfs progs to repair my damaged filesystem. Running btrfs restore gives me several "failed to inflate: -6" messages and crashes with some memory corruption. I ran it again with valgrind and got:

valgrind --log-file=x2 -v --leak-check=yes btrfs restore /dev/sda9 /mnt/backup

==8528== Memcheck, a memory error detector
==8528== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==8528== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==8528== Command: btrfs restore /dev/sda9 /mnt/backup
==8528== Parent PID: 8453
==8528==
==8528== Syscall param pwrite64(buf) points to uninitialised byte(s)
==8528==    at 0x59BE3C3: __pwrite_nocancel (in /lib64/libpthread-2.18.so)
==8528==    by 0x41F22F: search_dir (cmds-restore.c:392)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
==8528==    by 0x4043FE: main (btrfs.c:286)
==8528==  Address 0x66956a0 is 7,056 bytes inside a block of size 8,192 alloc'd
==8528==    at 0x4C277AB: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8528==    by 0x41EEAD: search_dir (cmds-restore.c:316)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
==8528==    by 0x4043FE: main (btrfs.c:286)
---[snip]-
==8528== Invalid read of size 1
==8528==    at 0x4C2BF15: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
==8528==    by 0x41EC66: search_dir (cmds-restore.c:233)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
==8528==    by 0x4043FE: main (btrfs.c:286)
==8528==  Address 0x684c186 is 1,110 bytes inside a block of size 4,224 free'd
==8528==    at 0x4C28ADC: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8528==    by 0x437895: free_extent_buffer (extent_io.c:618)
==8528==    by 0x41E053: next_leaf (cmds-restore.c:202)
==8528==    by 0x41E50F: search_dir (cmds-restore.c:731)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
==8528==    by 0x4043FE: main (btrfs.c:286)
==8528==
==8528== Invalid read of size 8
==8528==    at 0x4C2BF40: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8528==    by 0x43818F: read_extent_buffer (string3.h:51)
==8528==    by 0x41EC66: search_dir (cmds-restore.c:233)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
==8528==    by 0x4043FE: main (btrfs.c:286)
==8528==  Address 0x684c178 is 1,096 bytes inside a block of size 4,224 free'd
==8528==    at 0x4C28ADC: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==8528==    by 0x437895: free_extent_buffer (extent_io.c:618)
==8528==    by 0x41E053: next_leaf (cmds-restore.c:202)
==8528==    by 0x41E50F: search_dir (cmds-restore.c:731)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x41F8D0: search_dir (cmds-restore.c:895)
==8528==    by 0x4204B8: cmd_restore (cmds-restore.c:1284)
==8528==    by 0x4043FE: main (btrfs.c:286)
==8528==
==8528== Invalid read of size 8
==8528==    at 0x4C2BF52: memcpy@@GLIBC_2.14
Re: [PATCH 0/3] btrfs-progs: remove full /dev scanning
On 8/21/14, 3:44 AM, Anand Jain wrote:
A long time back there was an attempt to remove it, but this avoided it. Please refer to the link in this discussion.
https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg27272.html

Hm, I guess I don't understand this. How is udev related to whether or not /proc/partitions is sufficient vs. recursive /dev?

To be clear, my patchset keeps the -d / --all-devices option. It simply discovers all devices via /proc/partitions, not via a full /dev tree walk.

Thanks,
-Eric

Thanks, Anand

On 08/21/2014 06:21 AM, Eric Sandeen wrote:
btrfs filesystem show and btrfs device scan today both have the -d option to scan everything under /dev. But we also have a mechanism to scan everything in /proc/partitions, which should always be sufficient.

If anyone knows why we'd find something deep under /dev but not in /proc/partitions, speak now or forever hold your peace...

Tested this by running through a matrix of -d, -m, or args for show/scan, for a 2-device fs, with and without a symlinked device, with and without a symlinked mountpoint. All output was identical.

Thanks,
-Eric
Re: [PATCH 00/15] xfstests: new btrfs stress test cases
It's trivial to write this as a bunch of helper functions and then boiler-plate the actual tests themselves. There will be little difference in terms of run time, but we get much more fine-grained control of execution and reporting.

Sure, that's reasonable, given the xfstests infrastructure.

- z
Re: [PATCH 0/3] btrfs-progs: remove full /dev scanning
On 08/21/2014 12:21 AM, Eric Sandeen wrote:
btrfs filesystem show and btrfs device scan today both have the -d option to scan everything under /dev. But we also have a mechanism to scan everything in /proc/partitions, which should always be sufficient.

If anyone knows why we'd find something deep under /dev but not in /proc/partitions, speak now or forever hold your peace...

In the past I worked on this. From my commit message [1]:

[...] the devices scanned are extracted from /proc/partitions. This should avoid scanning devices not suitable for a btrfs filesystem, like cdrom and floppy, or scanning nonexistent devices. [...]

Of course I (we) don't care about cdrom and/or floppy, but this raises the question: are there other block devices which aren't shown in /proc/partitions? I am thinking of some less common hardware like NON USB sd disks (I have seen this kind of hardware, but I don't have it at hand now).

GB

[1] 0dbd99fb3e117cd5f87eda492b6b4fab1b5bea23

Tested this by running through a matrix of -d, -m, or args for show/scan, for a 2-device fs, with and without a symlinked device, with and without a symlinked mountpoint. All output was identical.

Thanks,
-Eric

--
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
Re: [PATCH] btrfs-progs: init uninitialized output buf for btrfs-restore
On 8/20/14, 10:35 PM, Gui Hecheng wrote:
A memory problem reported by valgrind as follows:
=== Syscall param pwrite64(buf) points to uninitialised byte(s)
When running:
# valgrind --leak-check=yes btrfs restore /dev/sda9 /mnt/backup
Because the output buf is allocated with malloc, but the output data is shorter than sizeof(buf), valgrind reports uninitialised byte(s). We could use calloc to replace malloc and clear this WARNING away.

It clears the valgrind error away, but does it hide a real bug? The code does this:

	ram_size = btrfs_file_extent_ram_bytes(leaf, fi);
	outbuf = malloc(ram_size);
	if (!outbuf) {
		fprintf(stderr, "No memory\n");
		return -ENOMEM;
	}
	ret = decompress(buf, outbuf, len, &ram_size, compress);
	if (ret) {
		free(outbuf);
		return ret;
	}
	done = pwrite(fd, outbuf, ram_size, pos);

Now, I don't know the details of the decompression routines, but it sure *looks* to me like we have found out that ram_size is the size of the decompressed data, and so we allocate that much. If valgrind detects that when we write ram_size bytes, some of them are uninitialized, doesn't that mean that something has gone wrong in decompression? Using calloc shuts up the warning, sure, but ...

Marc, are you using zlib or lzo? If zlib, maybe this in decompress_zlib is a problem:

	(void)inflateEnd(&strm);
	return 0;
}

inflateEnd returns Z_OK on success, Z_STREAM_ERROR if the stream state was inconsistent. Josef, any idea why the return value is cast away there?

Thanks,
-Eric

Reported-by: Marc Dietrich marvi...@gmx.de
Signed-off-by: Gui Hecheng guihc.f...@cn.fujitsu.com
---
 cmds-restore.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/cmds-restore.c b/cmds-restore.c
index cbda6bb..bb72311 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -251,7 +251,7 @@ static int copy_one_inline(int fd, struct btrfs_path *path, u64 pos)
 	}

 	ram_size = btrfs_file_extent_ram_bytes(leaf, fi);
-	outbuf = malloc(ram_size);
+	outbuf = calloc(1, ram_size);
 	if (!outbuf) {
 		fprintf(stderr, "No memory\n");
 		return -ENOMEM;
@@ -320,7 +320,7 @@ static int copy_one_extent(struct btrfs_root *root, int fd,
 	}

 	if (compress != BTRFS_COMPRESS_NONE) {
-		outbuf = malloc(ram_size);
+		outbuf = calloc(1, ram_size);
 		if (!outbuf) {
 			fprintf(stderr, "No memory\n");
 			free(inbuf);
Re: [PATCH] btrfs-progs: init uninitialized output buf for btrfs-restore
On 8/21/14, 1:42 PM, Eric Sandeen wrote:
On 8/20/14, 10:35 PM, Gui Hecheng wrote:
A memory problem reported by valgrind as follows:
=== Syscall param pwrite64(buf) points to uninitialised byte(s)
When running:
# valgrind --leak-check=yes btrfs restore /dev/sda9 /mnt/backup
Because the output buf is allocated with malloc, but the output data is shorter than sizeof(buf), valgrind reports uninitialised byte(s). We could use calloc to replace malloc and clear this WARNING away.

It clears the valgrind error away, but does it hide a real bug?

Maybe the relevant question for Marc is: did you get decompression errors during restore? If so, then I guess it all makes sense, and the proposed patch seems sane after all, sorry.

-Eric
Re: Questions on using BtrFS for fileserver
On 19/08/14 12:21 PM, M G Berberich wrote:
we are thinking about using BtrFS on standard hardware for a fileserver with about 50T (100T raw) of storage (25×4TByte).
...
· Are there any reports/papers/web-pages about BtrFS-systems this size in use? Praises, complaints, performance-reviews, whatever…

For what it is worth, I am running two btrfs filesystems:

1. Primary: 25 TiB, hardware RAID-6, LVM, PCIex8 (11x3TB)
2. Backup : 25 TiB, software RAID-5, LUKS, USB 3.0 (8x4TB)

I am not using btrfs RAID (-d single -m dup), but rather hardware RAID or software MD. Neither is partitioned (as they are not bootable).

I take hourly / daily / weekly / monthly / yearly snapshots of subvolumes in the primary fs, and prune excess snapshots (example: I only keep 24 hourly snapshots).

Currently using stock Fedora 20, though I try to keep the btrfs utility up-to-date by building from GIT when an updated RPM is not available.

Overall impressions of btrfs:

* Very resilient. It has suffered many hardware-related panics and no data loss or filesystem corruption has been detected. I maintain a backup, which includes hashes of everything, and also 5% par2 recovery for some critical data. The data is fairly static though, with the vast majority of operations being reads.

* Much higher CPU load than ext4. This exposes a known reset issue with the old 3Ware 9650SE-ML16 RAID controller. Switching to the NOOP IO scheduler helped reduce the load considerably, but it can still get quite high [even without LUKS]. CPU and motherboard replacement hardware is on hand, and an upgrade is imminent (currently using an old Core2 Duo @ 3 GHz, 4 GiB DDR2).

* Slow to mount, but not unreasonably so.

~~ Andrew E. Mileski
[PATCH] btrfs-progs: fix unaligned loads in receive
A user reported corruption after receiving subvolumes. Turning up the logging during the receive showed that the commands and string attributes were being received correctly, but the u64 attributes were sometimes corrupted by having a variable number of low-order bytes introduced.

It turned out they were on a platform that corrupts unaligned userspace loads. Loading the u64s from the unaligned pointers into the received command stream with get_unaligned() fixed the problem.

Reported-By: Klaus Holler k...@gmx.at
Tested-By: Klaus Holler k...@gmx.at
Signed-off-by: Zach Brown z...@zabbo.net
---
 send-stream.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/send-stream.c b/send-stream.c
index 88e18e2..4f8dd83 100644
--- a/send-stream.c
+++ b/send-stream.c
@@ -204,7 +204,7 @@ out:
 		int __len; \
 		TLV_GET(s, attr, (void**)&__tmp, &__len); \
 		TLV_CHECK_LEN(sizeof(*__tmp), __len); \
-		*v = le##bits##_to_cpu(*__tmp); \
+		*v = get_unaligned_le##bits(__tmp); \
 	} while (0)

 #define TLV_GET_U8(s, attr, v) TLV_GET_INT(s, attr, 8, v)
--
1.9.3
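To illustrate what an alignment-safe little-endian load looks like, here is a minimal portable sketch. load_le64() is a hypothetical stand-in written for this note; it is not the get_unaligned_le64() implementation used by btrfs-progs, only one common way to get the same result without ever dereferencing a misaligned u64 pointer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for get_unaligned_le64(): build the value one
 * byte at a time, so the CPU never performs a 64-bit load from an
 * address that may not be 8-byte aligned. Works on any host endianness.
 */
static uint64_t load_le64(const void *p)
{
	const unsigned char *b = p;
	uint64_t v = 0;

	/* Most significant byte is stored last in little-endian order. */
	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}
```

In a TLV stream like the one send/receive uses, attribute payloads follow variable-length headers, so a u64 can start at any byte offset; a byte-wise load such as this one does not depend on how the platform treats misaligned accesses.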
Re: btrfs receive problem on ARM kirkwood NAS with kernel 3.16.0 and btrfs-progs 3.14.2
On Thu, Aug 21, 2014 at 09:03:16PM +0200, Klaus Holler wrote:
Hello Hugo and Zach!

a big thanks to both of you! Both Hugo's userspace workaround and Zach's patch work fine for me - the /boot snapshot can be restored completely as expected :-)

Cool, glad to hear it. I sent a proper patch to the list and added your reported-by and tested-by, hope that's OK.

- z
mkfs.btrfs vs fstrim on an SD Card (not SSD)
Short version: When I mkfs.btrfs either an SD Card or an SSD, I get a response back to the effect that the whole device specified is trimmed. However, when I use fstrim on an SD Card, I get an error that trim isn't supported. So I'm wondering if anyone knows the difference between how fstrim is trimming and how mkfs.btrfs is trimming.

Long version: It does seem like the mkfs.btrfs one has worked, because using dd to read an LBA with known information returns zeros after mkfs. And then I found the SD Card association has its own formatting tool for Windows and OS X, with the warning "Using generic formatting utilities may result in less than optimal performance for your memory cards."
https://www.sdcard.org/downloads/formatter_4/

So I downloaded the Physical Layer Simplified Specification Version 4.10 spec, and on page 38 it describes an erase command.

4.3.5 Erase
It is desirable to erase many write blocks simultaneously in order to enhance the data throughput. Identification of these write blocks is accomplished with the ERASE_WR_BLK_START (CMD32), ERASE_WR_BLK_END (CMD33) commands. The host should adhere to the following command sequence: ERASE_WR_BLK_START, ERASE_WR_BLK_END and ERASE (CMD38).

So I'm going to guess that mkfs.btrfs is leveraging something that ends up using these SD Card specific commands on SD Cards, but mkfs.btrfs itself isn't aware of this distinction. Whereas fstrim is maybe using something else?

Chris Murphy
Re: btrfs restore
On Thu, Aug 21, 2014 at 05:52:01AM +, Mihail Zaporozhets wrote:
# btrfs-zero-log /dev/sda1
warning devid 5 not found already
Check tree block failed, want=16845270495232, have=0
read block failed check_tree_block
Couldn't read tree root

You may be hitting the same problem I was a week back. See the thread titled "btrfs-zero-log fails, can't mount FS".

Download the source for btrfs-progs, and apply this patch from Chris:

diff --git a/disk-io.c b/disk-io.c
index 8db0335..d9a8e19 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -911,13 +911,13 @@ int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info, u64 root_tree_bytenr,
 		return -EIO;
 	}
 	fs_info->csum_root->track_dirty = 1;
-
+#if 0
 	ret = find_and_setup_log_root(root, fs_info, sb);
 	if (ret) {
 		printk("Couldn't setup log root tree\n");
 		return -EIO;
 	}
-
+#endif
 	fs_info->generation = generation;
 	fs_info->last_trans_committed = generation;
 	if (extent_buffer_uptodate(fs_info->extent_root->node) &&

Or if you're desperate and want a binary, I'll Email you one directly (not that you should run a binary you got from someone via Email as root, so it's only if you're desperate).

Marc
--
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: btrfs restore
I just created https://btrfs.wiki.kernel.org/index.php/Btrfs-zero-log and added the info about this failure of btrfs-zero-log as well as the patch from Chris.

Whenever it's in a new version of btrfs-zero-log, I or someone else can update that wiki page to tell people to just update to a newer version to get around this "Couldn't setup log root tree" problem.

However, re-reading your error message, you got a different error, so the patch isn't likely to work for you. "read block failed check_tree_block" is a warning. Your actual error is:

	if (!extent_buffer_uptodate(root->node)) {
		fprintf(stderr, "Couldn't read tree root\n");
		return -EIO;
	}

This looks more serious, and I'm not sure if btrfs-zero-log can help with that. I'll let someone else answer.

Marc
--
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: mkfs.btrfs vs fstrim on an SD Card (not SSD)
I may have answered my own question using strace. And for whatever reason this time fstrim worked.

fstrim:
ioctl(3, FITRIM, 0x7fffbf6b87e0)
…
write(1, "/mnt/: 13.9 MiB (14598144 bytes)"…, 41/mnt/: 13.9 MiB (14598144 bytes) trimmed

Clearly this is only erasing what the file system is aware of having been recently deleted, even though it's a bit off as I'd just deleted a 6.1MB file.

mkfs.btrfs:
ioctl(3, BLKGETSIZE64, 14729330176) = 0
ioctl(3, BLKDISCARD, {0, 7fff920ee4a0}) = 0
write(2, "Performing full device TRIM (13"..., 43Performing full device TRIM (13.72GiB) ... ) = 43
ioctl(3, BLKDISCARD, {0, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {4000, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {8000, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {c000, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {1, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {14000, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {18000, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {1c000, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {2, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {24000, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {28000, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {2c000, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {3, 7fff920ee4a0}) = 0
ioctl(3, BLKDISCARD, {34000, 7fff920ee4a0}) = 0

And this blew away everything on that partition.

Since the SD Card spec references a completely different command than the ATA spec (TRIM), I don't think either one of these is TRIM, even if functionally equivalent. Instead the SD Card ERASE_* commands are probably being used, but I can't confirm this because writes to /dev/mmcblk0 aren't showing up with:

echo scsi:scsi_dispatch_cmd_start > /sys/kernel/debug/tracing/set_event
echo 1 > /sys/kernel/debug/tracing/tracing_on
cat /sys/kernel/debug/tracing/trace_pipe

Chris Murphy
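The two ioctls seen in the strace output above can be sketched in C roughly as follows. This is only an illustration of the userspace interface on Linux, not the actual fstrim or mkfs.btrfs source; the function names discard_device_range() and trim_free_space() are made up for this note. BLKDISCARD takes a {start, length} pair of byte offsets on a block device and destroys the data in that range, while FITRIM is issued on a mounted filesystem and leaves it to the filesystem to decide which free ranges to discard.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/*
 * Discard a byte range on an open block device (hypothetical helper).
 * WARNING: on a real device this throws away the data in the range.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int discard_device_range(int fd, uint64_t start, uint64_t len)
{
	uint64_t range[2] = { start, len };

	return ioctl(fd, BLKDISCARD, &range);
}

/*
 * Ask a mounted filesystem to trim its free space (hypothetical helper).
 * fd refers to any file or directory on that filesystem. On success the
 * kernel writes the number of bytes trimmed back into the len field.
 */
static int trim_free_space(int fd, uint64_t *trimmed)
{
	struct fstrim_range r = {
		.start = 0,
		.len = UINT64_MAX,	/* whole filesystem */
		.minlen = 0,
	};
	int ret = ioctl(fd, FITRIM, &r);

	if (ret == 0 && trimmed)
		*trimmed = r.len;
	return ret;
}
```

Both calls end up in the block layer as discard requests; how a discard is translated into a device command (ATA TRIM, SCSI UNMAP, or an MMC/SD erase) is decided by the driver underneath, which this sketch does not show.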
Re: Significance of high number of mails on this list?
Hello people. Thank you for your detailed replies, esp Duncan.

In essence, I plan on using BTRFS for my production data -- mainly programs/documents I write in connection with my academic research. I'm not a professional sysadmin and I'm not running a business server. I'm just managing my own data, and as I have mentioned, my chief reason for looking at BTRFS is the ease of snapshots and backups using send/receive.

It is clear now that snapshots are by and large stable but send/receive is not. But, IIUC, even if send/receive fails I still have the older data which is not overwritten due to COW and atomic operations, and I can always retry send/receive again. Is this correct?

If yes, then I guess I can take the plunge but ensure I have daily backups (which BTRFS itself should help me do easily).

--
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा
[PATCH] btrfs-progs: corrupt-block: fix a delete and use bug corrupting extent tree.
When corrupting the extent tree, corrupt-block iterates over each child node/leaf of a node. However, when a node's child is a leaf, btrfs_corrupt_extent_leaf() may delete some items in the leaf, which may cause the number of children of the parent node to decrease.

Before this patch, corrupt-block read nritems only *ONCE* and iterated 'nritems' times. When btrfs_corrupt_extent_leaf() deletes enough items to decrease the nritems in the btrfs_header, the last few iterations access nonexistent nodes, which causes a delete-and-use bug like the following:

---
# ./btrfs-corrupt-block -E /dev/vdc
deleting extent record: key 40714240 168 16384
Couldn't map the block 3459802452797161472
btrfs-corrupt-block: volumes.c:1137: btrfs_num_copies: Assertion `!(!ce)' failed.
Aborted
---

This patch re-reads nritems in each iteration to avoid the bug.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 btrfs-corrupt-block.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/btrfs-corrupt-block.c b/btrfs-corrupt-block.c
index 6ecbe47..486ea19 100644
--- a/btrfs-corrupt-block.c
+++ b/btrfs-corrupt-block.c
@@ -264,12 +264,10 @@ static void btrfs_corrupt_extent_tree(struct btrfs_trans_handle *trans,
 				      struct extent_buffer *eb)
 {
 	int i;
-	u32 nr;

 	if (!eb)
 		return;

-	nr = btrfs_header_nritems(eb);
 	if (btrfs_is_leaf(eb)) {
 		btrfs_corrupt_extent_leaf(trans, root, eb);
 		return;
@@ -280,7 +278,7 @@ static void btrfs_corrupt_extent_tree(struct btrfs_trans_handle *trans,
 		return;
 	}

-	for (i = 0; i < nr; i++) {
+	for (i = 0; i < btrfs_header_nritems(eb); i++) {
 		struct extent_buffer *next;

 		next = read_tree_block(root, btrfs_node_blockptr(eb, i),
--
2.0.4
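The effect of caching the item count can be reproduced with a toy model. The sketch below is not btrfs code: struct node, visit_and_maybe_shrink(), walk_children() and count_stale_visits() are hypothetical names, and the "deletion" is simulated by decrementing a counter. It only shows why the loop bound has to be re-read when the loop body can shrink the container.

```c
#include <assert.h>

/* Hypothetical stand-in for a tree node: an item count plus child slots. */
struct node {
	int nritems;
	int child[8];
};

/*
 * Hypothetical visitor that may shrink the node, standing in for
 * btrfs_corrupt_extent_leaf() deleting items below a parent.
 */
static void visit_and_maybe_shrink(struct node *n, int i)
{
	(void)i;
	if (n->nritems > 2)
		n->nritems--;	/* simulate a deletion dropping the last child */
}

/* Patched pattern: the bound is re-read from the node on every iteration. */
static int walk_children(struct node *n)
{
	int visited = 0;

	for (int i = 0; i < n->nritems; i++) {
		visit_and_maybe_shrink(n, i);
		visited++;
	}
	return visited;
}

/*
 * Pre-patch pattern: the bound is read once. Instead of touching the
 * stale slots, count how many iterations would have gone past the end.
 */
static int count_stale_visits(struct node *n)
{
	int nr = n->nritems;	/* read only once, goes stale below */
	int stale = 0;

	for (int i = 0; i < nr; i++) {
		if (i >= n->nritems)
			stale++;
		else
			visit_and_maybe_shrink(n, i);
	}
	return stale;
}
```

With six initial items, the patched loop stops after three visits because the count has dropped to three by then, while the cached-count loop would run three more iterations on slots that no longer hold valid children.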
Re: Significance of high number of mails on this list?
On Fri, Aug 22, 2014 at 09:10:55AM +0530, Shriramana Sharma wrote:
Hello people. Thank you for your detailed replies, esp Duncan.

In essence, I plan on using BTRFS for my production data -- mainly programs/documents I write in connection with my academic research. I'm not a professional sysadmin and I'm not running a business server. I'm just managing my own data, and as I have mentioned, my chief reason for looking at BTRFS is the ease of snapshots and backups using send/receive.

It is clear now that snapshots are by and large stable but send/receive is not. But, IIUC, even if send/receive fails I still

I wouldn't quite agree with that; btrfs send/receive has been working fairly well for me on multiple systems for multiple backups per day. My laptop often fails to complete a btrfs send to my server remotely over the internet, and it recovers on its own at the next cron run and sends a newer, bigger diff next time, and it just works.

Marc
--
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901