Re: [PATCH 1/3] xfstests/btrfs: add qgroup rescan stress test

2014-02-17 Thread Wang Shilong

On 02/18/2014 02:46 PM, Dave Chinner wrote:

On Thu, Feb 13, 2014 at 11:18:57AM +0800, Wang Shilong wrote:

The test flow is to run fsstress after triggering a quota rescan.
The rule is simple: we remove all files and directories,
sync the filesystem, and check that the qgroup's ref and excl equal nodesize.

Signed-off-by: Wang Shilong 
---
  tests/btrfs/038 | 75 +
  tests/btrfs/038.out |  3 +++
  tests/btrfs/group   |  1 +
  3 files changed, 79 insertions(+)
  create mode 100644 tests/btrfs/038
  create mode 100644 tests/btrfs/038.out

diff --git a/tests/btrfs/038 b/tests/btrfs/038
new file mode 100644
index 000..f6bd872
--- /dev/null
+++ b/tests/btrfs/038
@@ -0,0 +1,75 @@
+#! /bin/bash
+# FSQA Test No. btrfs/038
+#
+# Quota rescan stress test, we run fsstress and quota rescan concurrently
+#
+#---
+# Copyright (C) 2014 Fujitsu.  All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+
+_cleanup()
+{
+   rm -f $tmp.*
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_need_to_be_root
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+
+rm -f $seqres.full
+
+run_check _scratch_mkfs "-b 2g --nodesize 4096"
+run_check _scratch_mount
+
+# -w ensures that the only ops are ones which cause write I/O
+run_check $FSSTRESS_PROG -d $SCRATCH_MNT -w -p 5 -n 1000 \
+   $FSSTRESS_AVOID >&/dev/null
+
+run_check $BTRFS_UTIL_PROG subvolume snapshot $SCRATCH_MNT \
+   $SCRATCH_MNT/snap1 >>$seqres.full 2>&1
+
+run_check $FSSTRESS_PROG -d $SCRATCH_MNT/snap1 -w -p 5 -n 1000 \
+   $FSSTRESS_AVOID >&/dev/null
+
+run_check $BTRFS_UTIL_PROG quota enable $SCRATCH_MNT
+run_check $BTRFS_UTIL_PROG quota rescan -w $SCRATCH_MNT

"run_check considered harmful."

http://oss.sgi.com/archives/xfs/2014-02/msg00482.html

Once I've committed Filipe's run_btrfs_util_prog, can you update
this series to remove all the unnecessary run_check calls and
repost? Thanks!

No problem. ^_^

Thanks,
Wang


Cheers,

Dave.


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] xfstests/btrfs: add qgroup rescan stress test

2014-02-17 Thread Dave Chinner
On Thu, Feb 13, 2014 at 11:18:57AM +0800, Wang Shilong wrote:
> The test flow is to run fsstress after triggering a quota rescan.
> The rule is simple: we remove all files and directories,
> sync the filesystem, and check that the qgroup's ref and excl equal nodesize.
> 
> Signed-off-by: Wang Shilong 
> ---
>  tests/btrfs/038 | 75 
> +
>  tests/btrfs/038.out |  3 +++
>  tests/btrfs/group   |  1 +
>  3 files changed, 79 insertions(+)
>  create mode 100644 tests/btrfs/038
>  create mode 100644 tests/btrfs/038.out
> 
> diff --git a/tests/btrfs/038 b/tests/btrfs/038
> new file mode 100644
> index 000..f6bd872
> --- /dev/null
> +++ b/tests/btrfs/038
> @@ -0,0 +1,75 @@
> +#! /bin/bash
> +# FSQA Test No. btrfs/038
> +#
> +# Quota rescan stress test, we run fsstress and quota rescan concurrently
> +#
> +#---
> +# Copyright (C) 2014 Fujitsu.  All rights reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#
> +#---
> +#
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1
> +
> +_cleanup()
> +{
> +   rm -f $tmp.*
> +}
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +
> +# real QA test starts here
> +_need_to_be_root
> +_supported_fs btrfs
> +_supported_os Linux
> +_require_scratch
> +
> +rm -f $seqres.full
> +
> +run_check _scratch_mkfs "-b 2g --nodesize 4096"
> +run_check _scratch_mount
> +
> +# -w ensures that the only ops are ones which cause write I/O
> +run_check $FSSTRESS_PROG -d $SCRATCH_MNT -w -p 5 -n 1000 \
> + $FSSTRESS_AVOID >&/dev/null
> +
> +run_check $BTRFS_UTIL_PROG subvolume snapshot $SCRATCH_MNT \
> +   $SCRATCH_MNT/snap1 >>$seqres.full 2>&1
> +
> +run_check $FSSTRESS_PROG -d $SCRATCH_MNT/snap1 -w -p 5 -n 1000 \
> +   $FSSTRESS_AVOID >&/dev/null
> +
> +run_check $BTRFS_UTIL_PROG quota enable $SCRATCH_MNT
> +run_check $BTRFS_UTIL_PROG quota rescan -w $SCRATCH_MNT

"run_check considered harmful."

http://oss.sgi.com/archives/xfs/2014-02/msg00482.html

Once I've committed Filipe's run_btrfs_util_prog, can you update
this series to remove all the unnecessary run_check calls and
repost? Thanks!
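
For readers following along, a minimal sketch of the pattern this review asks for: call the command directly, send its output to the full log, and fail explicitly, as in the `_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"` line of the generic/323 patch in this archive. The `_fail` stub, the `log_or_fail` helper name and the temporary `seqres` path are assumptions made only so the snippet runs on its own; they are not the harness's definitions.

```bash
# Stand-ins so the sketch runs outside the xfstests harness (assumptions).
seqres=${seqres:-/tmp/run_check_sketch.$$}
_fail() { echo "FAIL: $*" >&2; return 1; }   # the real helper aborts the test

# Hypothetical helper: run a command, append stdout and stderr to
# $seqres.full, and report a failure explicitly instead of via run_check.
log_or_fail() {
	"$@" >> "$seqres.full" 2>&1 || _fail "command failed: $*"
}

log_or_fail echo "quota rescan would run here"
log_or_fail false 2>/dev/null || echo "failure was reported"
```

The command's own output ends up in the log file rather than being discarded, which keeps it available when a failure has to be debugged.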

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: know mount location with in FS

2014-02-17 Thread Anand Jain



> For what reason?
>
> Remember that a single block device can be mounted in multiple places
> (or bind-mounted, etc), so there is not even necessarily a single
> answer to that question.
>
> -Eric


 Yes indeed. (The question is whether we could maintain all
 the mount points as a list, saved and updated per fs_devices.)

 Some of the exported symbols in fs/namei.c look closely
 related to this purpose, but they didn't help, unless
 I missed something.

 Any comment is helpful.

 The reason:
First of all, btrfs-progs uses "scan-all-disks" very
liberally, which isn't a scalable design (imagine a data
center with thousands of LUNs).
Even a simple check_mounted() does a scan-all-disks (when
total_disk > 1), which wouldn't be necessary if the kernel
could provide that information.
Scanning for btrfs involves the expensive step of reading each
superblock, and the effect is that most btrfs-progs commands
are very slow when something like scrub is running.
check_mounted() fails when seeding is used, since
/proc/self/mounts shows the disk with the lowest devid, and in
the most common scenario that is the seed disk (which has a
different FSID from the actual disk in question).
The most severe problem is that some btrfs-progs threads
scan all disks more than once during the thread's lifetime.
So a total revamp of this design has become an immediate need.

What I am planning is:
   - btrfs-progs initializes the btrfs-disk-list once per required thread
     (mostly using BTRFS_IOC_GET_DEVS, which would dump anything
     and everything about the btrfs devices)
   - the btrfs-disk-list is obtained from the kernel first, and is then
     filled in with the remaining disks which the kernel isn't aware of
   - if step one also provides the mount point(s) from the
     kernel, that would complete the loop with what the end user
     would want to know.
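
The second step of the plan above, taking the kernel's device list first and then adding only the disks it does not know about, can be sketched as a simple ordered merge. The function name and the one-path-per-line input format are assumptions for illustration; this is not btrfs-progs code, and it does not call any ioctl.

```bash
# merge_device_lists KERNEL_LIST SCANNED_LIST
# Both arguments are newline-separated device paths (assumed format).
# Prints the kernel-known devices first, then any scanned device that the
# kernel list did not already contain, without duplicates.
merge_device_lists() {
	printf '%s\n%s\n' "$1" "$2" | awk 'NF && !seen[$0]++'
}

kernel_devs='/dev/sdb
/dev/sdc'
scanned_devs='/dev/sdc
/dev/sdd'
merge_device_lists "$kernel_devs" "$scanned_devs"
```

With this ordering, a full scan is only needed to find devices the kernel has not registered, which is the cost the plan is trying to reduce.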


Thanks, Anand


Re: [PATCH v2] xfstests: test for atime-related mount options

2014-02-17 Thread Dave Chinner
On Mon, Feb 17, 2014 at 02:40:45PM -0600, Eric Sandeen wrote:
> On 2/17/14, 2:25 PM, Koen De Wit wrote:
> > Tests the noatime, relatime, strictatime and nodiratime mount options.
> > 
> > There is an extra check for Btrfs to ensure that the access time is
> > never updated on read-only subvolumes. (Regression test for bug fixed
> > with commit 93fd63c2f001ca6797c6b15b696a484b165b4800)
> 
> I think this looks ok.  Only little nit is that _require_relatime
> now implicitly requires scratch, but *shrug* it was Dave's idea.  ;)

There's a _require_scratch call in there, so it's all good. ;)
Thanks Koen!

> Thanks!
> 
> Reviewed-by: Eric Sandeen 

And thanks for the review, Eric ;)

Cheers,

Dave.

-- 
Dave Chinner
da...@fromorbit.com


Re: btrfs "possible irq lock inversion dependency detected"

2014-02-17 Thread Chris Murphy

On Feb 17, 2014, at 1:09 PM, Tommi Rantala  wrote:

> Hello,
> 
> Saw this while fuzzing the kernel with Trinity.
> 
> Tommi
> 
> 
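> [  396.136048] =========================================================
> [  396.136048] [ INFO: possible irq lock inversion dependency detected ]
> [  396.136048] 3.14.0-rc3 #1 Not tainted
> [  396.136048] ---------------------------------------------------------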
> [  396.136048] kswapd0/1482 just changed the state of lock:
> [  396.136048]  (&delayed_node->mutex){+.+.-.}, at: [] 
> __btrfs_release_delayed_node+0x4b/0x1e0
> [  396.136048] but this lock took another, RECLAIM_FS-unsafe lock in the past:
> [  396.136048]  (&found->groups_sem){+.}

Looks like this is the same thing previously reported on the Btrfs list with
3.14.0-rc1 here:
https://bugzilla.redhat.com/show_bug.cgi?id=1062439

Which points to this:
https://bugzilla.redhat.com/show_bug.cgi?id=1062833#c24

Which points to this patch:
http://marc.info/?l=linux-netdev&m=139233546723342&q=raw


Chris Murphy


Re: [PATCH][BTRFS-PROGS][v4] Enhance btrfs fi df

2014-02-17 Thread Goffredo Baroncelli
On 02/17/2014 07:41 PM, David Sterba wrote:
> series as-is into the -next part of the integration
> branch so we have something to test, let the review and comments phase
> continue.
Hi David,

many thanks

BR
G.Baroncelli

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: [PATCH v2] xfstests: test for atime-related mount options

2014-02-17 Thread Eric Sandeen
On 2/17/14, 2:25 PM, Koen De Wit wrote:
> Tests the noatime, relatime, strictatime and nodiratime mount options.
> 
> There is an extra check for Btrfs to ensure that the access time is
> never updated on read-only subvolumes. (Regression test for bug fixed
> with commit 93fd63c2f001ca6797c6b15b696a484b165b4800)

I think this looks ok.  Only little nit is that _require_relatime
now implicitly requires scratch, but *shrug* it was Dave's idea.  ;)

Thanks!

Reviewed-by: Eric Sandeen 

> Signed-off-by: Koen De Wit 
> ---
> 
> v1->v2:
> - Fix typo in _cleanup()
> - Explicitly passing relatime mount option
> - Adding _require_relatime method to common/rc
> - Adding extra test for read-only mounts 
> 
> diff --git a/common/rc b/common/rc
> index e91568b..e55d09e 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -2159,6 +2159,14 @@ _verify_reflink()
> || echo "$1 and $2 are not reflinks: different extents"
>  }
>  
> +_require_relatime()
> +{
> +_scratch_mkfs > /dev/null 2>&1
> +_mount -t $FSTYP -o relatime $SCRATCH_DEV $SCRATCH_MNT || \
> +_notrun "relatime not supported by the current kernel"
> + _scratch_unmount
> +}
> +
>  _create_loop_device()
>  {
>   file=$1
> diff --git a/tests/generic/323 b/tests/generic/323
> new file mode 100644
> index 000..54f2739
> --- /dev/null
> +++ b/tests/generic/323
> @@ -0,0 +1,199 @@
> +# Tests the noatime, relatime, strictatime and nodiratime mount options.
> +# There is an extra check for Btrfs to ensure that the access time is
> +# never updated on read-only subvolumes. (Regression test for bug fixed
> +# with commit 93fd63c2f001ca6797c6b15b696a484b165b4800)
> +#
> +#---
> +# Copyright (c) 2014, Oracle and/or its affiliates.  All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#---
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1 # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +cd /
> +rm -rf $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +
> +# real QA test starts here
> +
> +_supported_fs generic
> +_supported_os Linux
> +_require_scratch
> +_require_relatime
> +
> +rm -f $seqres.full
> +
> +_stat() {
> +stat --printf="%x;%y;%z" $1
> +}
> +
> +_compare_stat_times() {
> +updated=$1  # 3 chars indicating if access, modify and
> +# change times should be updated (Y) or not (N)
> +IFS=';' read -a first_stat <<< "$2"   # Convert first stat to array
> +IFS=';' read -a second_stat <<< "$3"  # Convert second stat to array
> +test_step=$4# Will be printed to output stream in case of an
> +# error, to make debugging easier
> +types=( access modify change )
> +
> +for i in 0 1 2; do
> +if [ "${first_stat[$i]}" == "${second_stat[$i]}" ]; then
> +if [ "${updated:$i:1}" == "Y" ]; then
> +echo -n "ERROR: ${types[$i]} time has not been updated "
> +echo $test_step
> +fi
> +else
> +if [ "${updated:$i:1}" == "N" ]; then
> +echo -n "ERROR: ${types[$i]} time has changed "
> +echo $test_step
> +fi
> +fi
> +done
> +}
> +
> +_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
> +_scratch_mount "-o relatime"
> +
> +if [ "$FSTYP" = "btrfs" ]; then
> +TPATH=$SCRATCH_MNT/sub1
> +$BTRFS_UTIL_PROG subvolume create $TPATH > $seqres.full
> +else
> +TPATH=$SCRATCH_MNT
> +fi
> +
> +mkdir $TPATH/dir1
> +echo "aaa" > $TPATH/dir1/file1
> +file1_stat_before_first_access=`_stat $TPATH/dir1/file1`
> +
> +# Accessing file1 the first time
> +cat $TPATH/dir1/file1 > /dev/null
> +file1_stat_after_first_access=`_stat $TPATH/dir1/file1`
> +_compare_stat_times YNN "$file1_stat_before_first_access" \
> +"$file1_stat_after_first_access" "after accessing file1 first time"
> +
> +# Accessing file1 a second time
> +cat $TPATH/dir1/file1 > /dev/null
> +file1_stat_after_second_access=`_stat $TPATH/dir1/file1`
> +_compare_stat_times NNN "$file1_stat_after_fir

Re: [PATCH] xfstests: test for atime-related mount options

2014-02-17 Thread Koen De Wit

Thanks for the review, Eric!
Comments inline.

On 02/13/2014 05:42 PM, Eric Sandeen wrote:

On 2/13/14, 9:23 AM, Koen De Wit wrote:

Tests the noatime, relatime, strictatime and nodiratime mount options.

There is an extra check for Btrfs to ensure that the access time is
never updated on read-only subvolumes. (Regression test for bug fixed
with commit 93fd63c2f001ca6797c6b15b696a484b165b4800)

Signed-off-by: Koen De Wit 
---
  tests/generic/323 |  186 +
  tests/generic/323.out |2 +
  tests/generic/group   |1 +
  3 files changed, 189 insertions(+), 0 deletions(-)
  create mode 100644 tests/generic/323
  create mode 100644 tests/generic/323.out

diff --git a/tests/generic/323 b/tests/generic/323
new file mode 100644
index 000..423b141
--- /dev/null
+++ b/tests/generic/323
@@ -0,0 +1,186 @@
+# Tests the noatime, relatime, strictatime and nodiratime mount options.
+# There is an extra check for Btrfs to ensure that the access time is
+# never updated on read-only subvolumes. (Regression test for bug fixed
+# with commit 93fd63c2f001ca6797c6b15b696a484b165b4800)
+#
+#---
+# Copyright (c) 2014, Oracle and/or its affiliates.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+cd /
+rm -f $*
+}

is "$*" really what you meant?  Normally this is $tmp.*

$* is positional parameters for the script, and I don't think it takes any.
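
To illustrate the point about `$*`: it expands to the positional parameters of the script (or of the enclosing function), so with no arguments `rm -f $*` removes nothing, whereas `$tmp.*` is a glob over the test's temporary files. A small sketch, where `count_args` and `show_star` are hypothetical helpers used only to make the expansion visible:

```bash
# Print how many words an unquoted expansion produced.
count_args() { echo "$#"; }

# Inside a function, $* refers to that function's own arguments.
show_star() { count_args $*; }

show_star            # no parameters: $* expands to nothing, prints 0
show_star a b c      # three parameters, prints 3
```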


That's a typo indeed. Fixed in v2.


+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+
+rm -f $seqres.full
+
+_stat() {
+stat --printf="%x;%y;%z" $1
+}
+
+_compare_stat_times() {
+updated=$1  # 3 chars indicating if access, modify and
+# change times should be updated (Y) or not (N)
+IFS=';' read -a first_stat <<< "$2"   # Convert first stat to array
+IFS=';' read -a second_stat <<< "$3"  # Convert second stat to array
+test_step=$4# Will be printed to output stream in case of an
+# error, to make debugging easier
+types=( access modify change )
+
+for i in 0 1 2; do
+if [ "${first_stat[$i]}" == "${second_stat[$i]}" ]; then
+if [ "${updated:$i:1}" == "Y" ]; then
+echo -n "ERROR: ${types[$i]} time has not been updated "
+echo $test_step
+fi
+else
+if [ "${updated:$i:1}" == "N" ]; then
+echo -n "ERROR: ${types[$i]} time has changed "
+echo $test_step
+fi
+fi
+done
+}
+
+_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
+_scratch_mount
+
+cat /proc/mounts | grep "$SCRATCH_MNT" | grep relatime >> $seqres.full
+[ $? -ne 0 ] && echo "The relatime mount option should be the default."

Ok, I guess "relatime" in /proc/mounts is from core vfs code and should be 
there for the foreseeable future, so seems ok.

But - relatime was added in v2.6.20, and made default in 2.6.30.  So testing older 
kernels may not go as expected; it'd probably be best to catch situations where 
relatime isn't available (< 2.6.20) or not default (< 2.6.30), by explicitly 
mounting with relatime, and skipping relatime/strictatime tests if that fails?
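
One way to make the /proc/mounts check less fragile than chained greps is to match the mount point field exactly and split the option list on commas. This is only a sketch; `has_mount_option` and its optional mounts-file argument are hypothetical and not part of common/rc.

```bash
# has_mount_option MOUNTPOINT OPTION [MOUNTS_FILE]
# Succeeds if MOUNTPOINT is listed in MOUNTS_FILE (default /proc/mounts)
# with OPTION among its comma-separated mount options (field 4).
has_mount_option() {
	awk -v mnt="$1" -v opt="$2" '
		$2 == mnt {
			n = split($4, opts, ",")
			for (i = 1; i <= n; i++)
				if (opts[i] == opt) found = 1
		}
		END { exit(found ? 0 : 1) }
	' "${3:-/proc/mounts}"
}

# Example with a sample mounts table instead of the live one.
sample=$(mktemp)
echo '/dev/sdb1 /mnt/scratch btrfs rw,relatime,space_cache 0 0' > "$sample"
has_mount_option /mnt/scratch relatime "$sample" && echo "relatime is set"
rm -f "$sample"
```

Matching the whole field avoids false positives from mount points that merely share a prefix with the scratch mount.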


From the mailing list discussions in the last days, I think we can conclude 
it's best to specify the relatime mount option explicitly and include a new 
_require_relatime method as proposed by Dave Chinner. I implemented it this way 
in v2.


The rest of the test is awfully dense, but nice long understandable variable 
names, so that's good.  ;)

I wonder if in the spirit of testing a btrfs RO snapshot, you could also add a 
readonly mount test, to be sure that an RO mount doesn't update atime.  Of 
course it shouldn't, but it might be worth adding for basic sanity?


Good idea! I added a read-only mount test in v2.

Thanks,
Koen.



Thanks,
-Eric


+
+if [ "$FSTYP" = "btrfs" ]; then
+TPATH=$SCRATCH_MNT/sub1

[PATCH v2] xfstests: test for atime-related mount options

2014-02-17 Thread Koen De Wit
Tests the noatime, relatime, strictatime and nodiratime mount options.

There is an extra check for Btrfs to ensure that the access time is
never updated on read-only subvolumes. (Regression test for bug fixed
with commit 93fd63c2f001ca6797c6b15b696a484b165b4800)

Signed-off-by: Koen De Wit 
---

v1->v2:
- Fix typo in _cleanup()
- Explicitly passing relatime mount option
- Adding _require_relatime method to common/rc
- Adding extra test for read-only mounts 
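
For context on what the relatime checks expect, here is a sketch of the relatime rule as commonly documented: the access time is updated when the old atime is not newer than mtime or ctime, or when it is at least a day old. The function name is hypothetical and the comparison is simplified to whole epoch seconds; it is an illustration, not the kernel's code.

```bash
# relatime_should_update ATIME MTIME CTIME NOW  (all in epoch seconds)
# Succeeds when a relatime mount would update the access time: the old
# atime is not newer than mtime or ctime, or it is at least a day old.
relatime_should_update() {
	atime=$1 mtime=$2 ctime=$3 now=$4
	[ "$mtime" -ge "$atime" ] && return 0
	[ "$ctime" -ge "$atime" ] && return 0
	[ $((now - atime)) -ge 86400 ] && return 0
	return 1
}

# Freshly written file: atime == mtime, so the first read updates atime.
relatime_should_update 1000 1000 1000 1005 && echo "first access: update"
# After that read atime (1005) is newer than mtime and ctime: no update.
relatime_should_update 1005 1000 1000 1010 || echo "second access: no update"
```

This matches the YNN expectation for the first access to file1 and the NNN expectation for the second.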

diff --git a/common/rc b/common/rc
index e91568b..e55d09e 100644
--- a/common/rc
+++ b/common/rc
@@ -2159,6 +2159,14 @@ _verify_reflink()
|| echo "$1 and $2 are not reflinks: different extents"
 }
 
+_require_relatime()
+{
+_scratch_mkfs > /dev/null 2>&1
+_mount -t $FSTYP -o relatime $SCRATCH_DEV $SCRATCH_MNT || \
+_notrun "relatime not supported by the current kernel"
+   _scratch_unmount
+}
+
 _create_loop_device()
 {
file=$1
diff --git a/tests/generic/323 b/tests/generic/323
new file mode 100644
index 000..54f2739
--- /dev/null
+++ b/tests/generic/323
@@ -0,0 +1,199 @@
+# Tests the noatime, relatime, strictatime and nodiratime mount options.
+# There is an extra check for Btrfs to ensure that the access time is
+# never updated on read-only subvolumes. (Regression test for bug fixed
+# with commit 93fd63c2f001ca6797c6b15b696a484b165b4800)
+#
+#---
+# Copyright (c) 2014, Oracle and/or its affiliates.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+cd /
+rm -rf $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_relatime
+
+rm -f $seqres.full
+
+_stat() {
+stat --printf="%x;%y;%z" $1
+}
+
+_compare_stat_times() {
+updated=$1  # 3 chars indicating if access, modify and
+# change times should be updated (Y) or not (N)
+IFS=';' read -a first_stat <<< "$2"   # Convert first stat to array
+IFS=';' read -a second_stat <<< "$3"  # Convert second stat to array
+test_step=$4# Will be printed to output stream in case of an
+# error, to make debugging easier
+types=( access modify change )
+
+for i in 0 1 2; do
+if [ "${first_stat[$i]}" == "${second_stat[$i]}" ]; then
+if [ "${updated:$i:1}" == "Y" ]; then
+echo -n "ERROR: ${types[$i]} time has not been updated "
+echo $test_step
+fi
+else
+if [ "${updated:$i:1}" == "N" ]; then
+echo -n "ERROR: ${types[$i]} time has changed "
+echo $test_step
+fi
+fi
+done
+}
+
+_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
+_scratch_mount "-o relatime"
+
+if [ "$FSTYP" = "btrfs" ]; then
+TPATH=$SCRATCH_MNT/sub1
+$BTRFS_UTIL_PROG subvolume create $TPATH > $seqres.full
+else
+TPATH=$SCRATCH_MNT
+fi
+
+mkdir $TPATH/dir1
+echo "aaa" > $TPATH/dir1/file1
+file1_stat_before_first_access=`_stat $TPATH/dir1/file1`
+
+# Accessing file1 the first time
+cat $TPATH/dir1/file1 > /dev/null
+file1_stat_after_first_access=`_stat $TPATH/dir1/file1`
+_compare_stat_times YNN "$file1_stat_before_first_access" \
+"$file1_stat_after_first_access" "after accessing file1 first time"
+
+# Accessing file1 a second time
+cat $TPATH/dir1/file1 > /dev/null
+file1_stat_after_second_access=`_stat $TPATH/dir1/file1`
+_compare_stat_times NNN "$file1_stat_after_first_access" \
+"$file1_stat_after_second_access" "after accessing file1 second time"
+
+# Remounting with nodiratime option
+_scratch_unmount
+_scratch_mount "-o nodiratime"
+file1_stat_after_remount=`_stat $TPATH/dir1/file1`
+_compare_stat_times NNN "$file1_stat_after_second_access" \
+"$file1_stat_after_remount" "for file1 after remount"
+
+# Creating dir2 and file2, checking directory stats
+mkdir $TPATH/dir2
+dir2_stat_before_file_creation=`_stat $TPATH/dir2`
+echo "bbb" > 

btrfs "possible irq lock inversion dependency detected"

2014-02-17 Thread Tommi Rantala
Hello,

Saw this while fuzzing the kernel with Trinity.

Tommi


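[  396.136048] =========================================================
[  396.136048] [ INFO: possible irq lock inversion dependency detected ]
[  396.136048] 3.14.0-rc3 #1 Not tainted
[  396.136048] ---------------------------------------------------------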
[  396.136048] kswapd0/1482 just changed the state of lock:
[  396.136048]  (&delayed_node->mutex){+.+.-.}, at: [] 
__btrfs_release_delayed_node+0x4b/0x1e0
[  396.136048] but this lock took another, RECLAIM_FS-unsafe lock in the past:
[  396.136048]  (&found->groups_sem){+.}

and interrupts could create inverse lock ordering between them.

[  396.136048]
[  396.136048] other info that might help us debug this:
[  396.136048]  Possible interrupt unsafe locking scenario:
[  396.136048]
[  396.136048]        CPU0                    CPU1
[  396.136048]        ----                    ----
[  396.136048]   lock(&found->groups_sem);
[  396.136048]                                local_irq_disable();
[  396.136048]                                lock(&delayed_node->mutex);
[  396.136048]                                lock(&found->groups_sem);
[  396.136048]   <Interrupt>
[  396.136048]     lock(&delayed_node->mutex);
[  396.136048]
[  396.136048]  *** DEADLOCK ***
[  396.136048]
[  396.136048] 2 locks held by kswapd0/1482:
[  396.136048]  #0:  (shrinker_rwsem){..}, at: [] 
shrink_slab+0x3a/0x170
[  396.136048]  #1:  (&type->s_umount_key#25){.+}, at: [] 
grab_super_passive+0x4f/0x80
[  396.136048]
[  396.136048] the shortest dependencies between 2nd lock and 1st lock:
[  396.136048]  -> (&found->groups_sem){+.} ops: 38935 {
[  396.136048] HARDIRQ-ON-W at:
[  396.136048]   [] 
__lock_acquire+0x88e/0x1d90
[  396.136048]   [] 
lock_acquire+0x182/0x210
[  396.136048]   [] down_write+0x5c/0xc0
[  396.136048]   [] 
__link_block_group+0x3d/0xf0
[  396.136048]   [] 
btrfs_read_block_groups+0x392/0x690
[  396.136048]   [] 
open_ctree+0x1ad7/0x2140
[  396.136048]   [] 
btrfs_mount+0x44e/0x8e0
[  396.136048]   [] mount_fs+0x7a/0x1a0
[  396.136048]   [] 
vfs_kern_mount+0x71/0x150
[  396.136048]   [] 
btrfs_mount+0x831/0x8e0
[  396.136048]   [] mount_fs+0x7a/0x1a0
[  396.136048]   [] 
vfs_kern_mount+0x71/0x150
[  396.136048]   [] do_mount+0x954/0xb90
[  396.136048]   [] SyS_mount+0x94/0xe0
[  396.136048]   [] 
do_mount_root+0x1a/0x93
[  396.136048]   [] 
mount_block_root+0xe5/0x203
[  396.136048]   [] mount_root+0xe1/0xea
[  396.136048]   [] 
prepare_namespace+0x13c/0x174
[  396.136048]   [] 
kernel_init_freeable+0x242/0x251
[  396.136048]   [] kernel_init+0x9/0xf0
[  396.136048]   [] 
ret_from_fork+0x7c/0xb0
[  396.136048] HARDIRQ-ON-R at:
[  396.136048]   [] 
__lock_acquire+0x847/0x1d90
[  396.136048]   [] 
lock_acquire+0x182/0x210
[  396.136048]   [] down_read+0x4c/0xa0
[  396.136048]   [] 
btrfs_calc_num_tolerated_disk_barrier_failures+0x24a/0x310
[  396.136048]   [] 
open_ctree+0x1b0f/0x2140
[  396.136048]   [] 
btrfs_mount+0x44e/0x8e0
[  396.136048]   [] mount_fs+0x7a/0x1a0
[  396.136048]   [] 
vfs_kern_mount+0x71/0x150
[  396.136048]   [] 
btrfs_mount+0x831/0x8e0
[  396.136048]   [] mount_fs+0x7a/0x1a0
[  396.136048]   [] 
vfs_kern_mount+0x71/0x150
[  396.136048]   [] do_mount+0x954/0xb90
[  396.136048]   [] SyS_mount+0x94/0xe0
[  396.136048]   [] 
do_mount_root+0x1a/0x93
[  396.136048]   [] 
mount_block_root+0xe5/0x203
[  396.136048]   [] mount_root+0xe1/0xea
[  396.136048]   [] 
prepare_namespace+0x13c/0x174
[  396.136048]   [] 
kernel_init_freeable+0x242/0x251
[  396.136048]   [] kernel_init+0x9/0xf0
[  396.136048]   [] 
ret_from_fork+0x7c/0xb0
[  396.136048] SOFTIRQ-ON-W at:
[  396.136048]   [] 
__lock_acquire+0x8c3/0x1d90
[  396.136048]   [] 
lock_acquire+0x182/0x210
[  396.136048]   [] down_write+0x5c/0xc0
[  396.136048]   [] 
__link_block_group+0x3d/0xf0
[  396.136048]   [] 
btrfs_read_block_groups+0x392/0x690
[  396.136048]   [] 
open_ctree+0x1ad7/0x2140
[  396.136048]   [] 
btrfs_mount+0x44e/0x8e0
[  396.136048]   [

Re: [PATCH][BTRFS-PROGS][v4] Enhance btrfs fi df

2014-02-17 Thread David Sterba
On Thu, Feb 13, 2014 at 08:18:10PM +0100, Goffredo Baroncelli wrote:
> This is the 4th attempt of my patches related to show how the data
> are stored in a btrfs filesystem. I rebased all the patches on the v3.13
> btrfs-progs. 

FYI, I've added this series as-is into the -next part of the integration
branch so we have something to test, let the review and comments phase
continue.


Re: [PATCH 5/8] Add command btrfs filesystem disk-usage

2014-02-17 Thread Goffredo Baroncelli
On 02/15/2014 11:23 PM, Chris Murphy wrote:
> 
> On Feb 14, 2014, at 11:34 AM, Hugo Mills  wrote:
> 
>> On Fri, Feb 14, 2014 at 07:27:57PM +0100, Goffredo Baroncelli wrote:
>>> On 02/14/2014 07:11 PM, Roman Mamedov wrote:
>>>> On Fri, 14 Feb 2014 18:57:03 +0100
>>>> Goffredo Baroncelli  wrote:
>>>>
>>>>> On 02/13/2014 10:00 PM, Roman Mamedov wrote:
>>>>>> On Thu, 13 Feb 2014 20:49:08 +0100
>>>>>> Goffredo Baroncelli  wrote:
>>>>>>
>>>>>>> Thanks for the comments, however I don't like du not usage; but you are right
>>>>>>> when you don't like "disk-usage". What about "btrfs filesystem chunk-usage" ?
>>>>>>
>>>>>> Personally I don't see the point of being super-pedantic here, i.e. "look this
>>>>>> is not just filesystem usage, this is filesystem CHUNK usage"... Consistency
>>>>>> of having a matching "dev usage" and "fi usage" would have been nicer.
>>>>>
>>>>>
>>>>> What about "btrfs filesystem chunk-usage" ?
>>>>
>>>> Uhm? Had to reread this several times, but it looks like you're repeating
>>>> exactly the same question that I was already answering in the quoted part.
>>>>
>>>> To clarify even more, personally I'd like if there would have been "btrfs dev
>>>> usage" and "btrfs fi usage". Do not see the need to specifically make the 2nd
>>>> one "chunk-usage" instead of simply "usage".
>>>
>>> I don't like "usage" because to me it seems too generic.
>>> Because both "btrfs filesystem disk-usage" and "btrfs device disk-usage"
>>> report chunk (and/or block group) information, I am considering:
>>> - btrfs filesystem chunk-usage
>>> - btrfs device chunk-usage
>>
>>   Most people aren't going to know (or care) what a chunk is. I'm
>> much happier with Roman's suggestion of btrfs {fi,dev} usage.
> 
> Or btrfs filesystem examine, or btrfs filesystem detail, which are
> semi-consistent with mdadm for obtaining similar data.
> 

I have to agree with Chris: looking at the output of "btrfs fi disk-usage"

$ sudo ./btrfs filesystem disk-usage -t /mnt/btrfs1/
         Data   Data    Metadata Metadata System System
         Single RAID6   Single   RAID5    Single RAID5   Unallocated

/dev/vdb 8.00MB  1.00GB   8.00MB   1.00GB 4.00MB  4.00MB     97.98GB
/dev/vdc      -  1.00GB        -   1.00GB      -  4.00MB     98.00GB
/dev/vdd      -  1.00GB        -   1.00GB      -  4.00MB     98.00GB
/dev/vde      -  1.00GB        -   1.00GB      -  4.00MB     98.00GB
         ====== ======= ======== ======== ====== ======= ===========
Total    8.00MB  2.00GB   8.00MB   3.00GB 4.00MB 12.00MB    391.97GB
Used       0.00 11.25MB     0.00  36.00KB   0.00  4.00KB

it is hard to argue that this should be called "filesystem usage". I think
that "detail" or "examine" is a better name.

"btrfs device usage" seems more coherent to me. But as noted before,
consistency also matters, so I am now inclined to use "detail" (or
"examine") for "btrfs device" as well.

> 
> Chris Murphy
> 
Regards
Goffredo

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: know mount location with in FS

2014-02-17 Thread Eric Sandeen
On 2/16/14, 9:02 AM, Anand Jain wrote:
> 
> Hello,
> 
> I wonder if there is any known way to get the mount point directory name
> from within the btrfs kernel code?

For what reason?

Remember that a single block device can be mounted in multiple places (or 
bind-mounted, etc), so there is not even necessarily a single answer to that 
question.
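
To illustrate the point from user space, here is a minimal sketch that groups
mount points by mount source, assuming input in the /proc/self/mountinfo
format. The sample text, device names and paths are made up for the example,
and mount_points_by_source() is a hypothetical helper, not part of any btrfs
tool.

```python
from collections import defaultdict

# Hypothetical sample in /proc/self/mountinfo format (devices and paths are made up).
SAMPLE_MOUNTINFO = """\
36 25 0:32 / /mnt/data rw,relatime shared:1 - btrfs /dev/sdb1 rw,space_cache
37 25 0:32 /subvol /srv/www rw,relatime shared:2 - btrfs /dev/sdb1 rw,space_cache
38 25 8:1 / /boot rw,relatime shared:3 - ext4 /dev/sda1 rw
"""


def mount_points_by_source(mountinfo_text):
    """Return a dict mapping each mount source to the list of its mount points."""
    result = defaultdict(list)
    for line in mountinfo_text.splitlines():
        if not line.strip():
            continue
        # The variable-length optional fields end at the " - " separator.
        left, _, right = line.partition(" - ")
        mount_point = left.split()[4]   # fifth field: mount point
        source = right.split()[1]       # after the separator: fstype, source, options
        result[source].append(mount_point)
    return dict(result)


mounts = mount_points_by_source(SAMPLE_MOUNTINFO)
for source, points in mounts.items():
    print(source, "->", points)
```

In this sample /dev/sdb1 maps to two mount points, so "the" mount point of the
device is not well defined. On a live system the same function could be fed
the contents of /proc/self/mountinfo.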

-Eric



Re: What to do about df and btrfs fi df

2014-02-17 Thread David Sterba
On Mon, Feb 10, 2014 at 01:41:23PM -0500, Josef Bacik wrote:
> 
> 
> On 02/10/2014 01:36 PM, cwillu wrote:
> >IMO, used should definitely include metadata, especially given that we
> >inline small files.
> >
> >I can convince myself both that this implies that we should roll it
> >into b_avail, and that we should go the other way and only report the
> >actual used number for metadata as well, so I might just plead
> >insanity here.
> >
> 
> I could be convinced to do this.  So we have
> 
> total: (total disk bytes) / (raid multiplier)
> used: (total used in data block groups) +
>   (total used in metadata block groups)
> avail: total - (total used in data block groups +
>   total metadata block groups)

The size of the global block reserve should IMO be subtracted from 'avail';
it is currently reported as free space, but in fact it is not.

The "used" amount of the global reserve might be included in the
filesystem 'used', but I've observed the global reserve used for short
periods of time under some heavy stress; I'm convinced it needs to be
accounted for in the df report.
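
A minimal sketch of the accounting discussed above, assuming the quoted
total/used/avail formulas plus the global reserve subtracted from 'avail'.
The function name, parameter names and all numbers are hypothetical and are
not taken from the kernel code.

```python
GIB = 1024 ** 3


def df_report(total_disk_bytes, raid_multiplier,
              data_used, metadata_used, metadata_total, global_reserve):
    """Compute (total, used, avail) in bytes following the proposal in this thread."""
    total = total_disk_bytes // raid_multiplier
    used = data_used + metadata_used
    # avail counts whole metadata block groups as unavailable, and (assumption
    # from the reply above) also excludes the global block reserve.
    avail = total - (data_used + metadata_total) - global_reserve
    return total, used, avail


# Hypothetical filesystem: 200 GiB of raw disk with a raid multiplier of 2.
total, used, avail = df_report(
    total_disk_bytes=200 * GIB,
    raid_multiplier=2,
    data_used=40 * GIB,
    metadata_used=2 * GIB,
    metadata_total=4 * GIB,
    global_reserve=GIB // 2,
)
print(total / GIB, used / GIB, avail / GIB)
```

With these made-up numbers the report is 100 GiB total, 42 GiB used and
55.5 GiB available; without the reserve subtraction 'avail' would read 56 GiB.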


Re: [3.14-rc1] BUG: soft lockup - CPU#1 stuck for 22s with 255 GiB BTRFS with only 6 GiB free

2014-02-17 Thread Martin Steigerwald
On Monday, 17 February 2014, 08:06:50, Chris Mason wrote:
> On 02/17/2014 05:35 AM, Martin Steigerwald wrote:
> > On Tuesday, 11 February 2014, 15:50:12, Dave wrote:
> >> On Tue, Feb 11, 2014 at 10:36 AM, Martin Steigerwald wrote:
> >>> Today I started getting those on 3.14-rc. One core was displayed at 100%
> >>> system CPU. I rebooted because the system didn't respond consistently to
> >>> user input anymore.
> >>
> >> Does 3.14-rc1 have Josef's delayed refs throttling code?  I had two
> >> separate machines that exhibited similar symptoms.  Chris's for-linus
> >> branch has a fix for this which solved my problems:
> >> https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/?h=for-linus&id=27a377db745ed4d11b3b9b340756857cb8dde07f
> >
> > I also got this now under 3.14-rc3 with almost 16 GiB left on heavy KMail /
> > Akonadi activity. 3.14-rc3 includes the above commit.
> >
> > As I now also got it with more free space and never saw this with kernels
> > up to 3.13, I think this is a regression.
> 
> Do we eventually recover or is it stuck like this forever?

Well, I got the lockup again and again and watched it for some minutes until I
lost patience and did a hard reboot, so I actually don't know. Each lockup
lasted about 22 or 23 seconds.

I will try to trigger that workload again. Since I freed 2-3 more GB, it may
not trigger, but if it does, how long do you suggest I wait for it to
recover?

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: [3.14-rc1] BUG: soft lockup - CPU#1 stuck for 22s with 255 GiB BTRFS with only 6 GiB free

2014-02-17 Thread Chris Mason



On 02/17/2014 05:35 AM, Martin Steigerwald wrote:
> On Tuesday, 11 February 2014, 15:50:12, Dave wrote:
>> On Tue, Feb 11, 2014 at 10:36 AM, Martin Steigerwald wrote:
>>> Today I started getting those on 3.14-rc. One core was displayed at 100%
>>> system CPU. I rebooted because the system didn't respond consistently to
>>> user input anymore.
>>
>> Does 3.14-rc1 have Josef's delayed refs throttling code?  I had two
>> separate machines that exhibited similar symptoms.  Chris's for-linus
>> branch has a fix for this which solved my problems:
>> https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/?h=for-linus&id=27a377db745ed4d11b3b9b340756857cb8dde07f
>
> I also got this now under 3.14-rc3 with almost 16 GiB left on heavy KMail /
> Akonadi activity. 3.14-rc3 includes the above commit.
>
> As I now also got it with more free space and never saw this with kernels
> up to 3.13, I think this is a regression.

Do we eventually recover or is it stuck like this forever?

-chris


Re: [3.14-rc1] BUG: soft lockup - CPU#1 stuck for 22s with 255 GiB BTRFS with only 6 GiB free

2014-02-17 Thread Martin Steigerwald
On Tuesday, 11 February 2014, 15:50:12, Dave wrote:
> On Tue, Feb 11, 2014 at 10:36 AM, Martin Steigerwald wrote:
> > Today I started getting those on 3.14-rc. One core was displayed at 100%
> > system CPU. I rebooted because the system didn't respond consistently to
> > user input anymore.
>
> Does 3.14-rc1 have Josef's delayed refs throttling code?  I had two
> separate machines that exhibited similar symptoms.  Chris's for-linus
> branch has a fix for this which solved my problems:
> https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/?h=for-linus&id=27a377db745ed4d11b3b9b340756857cb8dde07f

I also got this now under 3.14-rc3 with almost 16 GiB left on heavy KMail /
Akonadi activity. 3.14-rc3 includes the above commit.

As I now also got it with more free space and never saw this with kernels
up to 3.13, I think this is a regression.

Another trace:

Feb 17 11:22:38 merkaba kernel: [ 1816.159830] BUG: soft lockup - CPU#1 stuck for 22s! [btrfs-transacti:1618]
Feb 17 11:22:38 merkaba kernel: [ 1816.159834] Modules linked in: ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs libcrc32c rfcomm bnep bluetooth 6lowpan_iphc ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) binfmt_misc vboxdrv(O) uinput ext4 crc16 mbcache jbd2 sbs sbshc hdaps(O) tp_smapi(O) thinkpad_ec(O) loop firewire_sbp2 fuse ecryptfs dm_crypt joydev snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm intel_rapl x86_pkg_temp_thermal intel_powerclamp thinkpad_acpi nvram coretemp kvm_intel snd_seq_midi kvm snd_seq_midi_event iwldvm mac80211 snd_rawmidi microcode snd_seq pcspkr psmouse iwlwifi serio_raw snd_seq_device snd_timer lpc_ich cfg80211 mfd_core i2c_i801 snd soundcore rfkill battery ac tpm_tis tpm evdev processor btrfs xor raid6_pq md_mod dm_mirror dm_region_hash dm_log dm_mod sg sr_mod cdrom sd_mod crc_t10dif hid_generic usbhid hid crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel firewire_ohci aesni_intel sdhci_pci aes_x86_64 lrw gf128mul sdhci glue_helper ablk_helper firewire_core mmc_core crc_itu_t sata_sil24 ahci ehci_pci libahci libata cryptd ehci_hcd scsi_mod usbcore e1000e usb_common ptp pps_core thermal
Feb 17 11:22:38 merkaba kernel: [ 1816.159946] CPU: 1 PID: 1618 Comm: btrfs-transacti Tainted: G   O 3.14.0-rc3-tp520 #46
Feb 17 11:22:38 merkaba kernel: [ 1816.159948] Hardware name: LENOVO 42433WG/42433WG, BIOS 8AET63WW (1.43 ) 05/08/2013
Feb 17 11:22:38 merkaba kernel: [ 1816.159951] task: 88020ed19850 ti: 88020ed14000 task.ti: 88020ed14000
Feb 17 11:22:38 merkaba kernel: [ 1816.159953] RIP: 0010:[]  [] do_raw_spin_lock+0x15/0x21
Feb 17 11:22:38 merkaba kernel: [ 1816.159961] RSP: 0018:88020ed15d30  EFLAGS: 0297
Feb 17 11:22:38 merkaba kernel: [ 1816.159963] RAX: 3130 RBX: 8111239e RCX: 
Feb 17 11:22:38 merkaba kernel: [ 1816.159965] RDX: 0031 RSI: 8050 RDI: 880036dfc640
Feb 17 11:22:38 merkaba kernel: [ 1816.159967] RBP: 88020ed15d30 R08: 0001af40 R09: 
Feb 17 11:22:38 merkaba kernel: [ 1816.159969] R10: 0009 R11: 0008 R12: ea00058e2640
Feb 17 11:22:38 merkaba kernel: [ 1816.159971] R13: 88021e29af80 R14: 0193ecd5 R15: 88021135d900
Feb 17 11:22:38 merkaba kernel: [ 1816.159974] FS:  () GS:88021e28() knlGS:
Feb 17 11:22:38 merkaba kernel: [ 1816.159976] CS:  0010 DS:  ES:  CR0: 80050033
Feb 17 11:22:38 merkaba kernel: [ 1816.159978] CR2: 7fe3963e1000 CR3: 01a0b000 CR4: 000407e0
Feb 17 11:22:38 merkaba kernel: [ 1816.159979] Stack:
Feb 17 11:22:38 merkaba kernel: [ 1816.159981]  88020ed15d48 81441c6d 880036dfc640 88020ed15da0
Feb 17 11:22:38 merkaba kernel: [ 1816.159985]  a02d8c6c 88020ed15d90 a028d46a 
Feb 17 11:22:38 merkaba kernel: [ 1816.159989]  003fe0a04000 8800cf824c00 003fe0a04000 880037232000
Feb 17 11:22:38 merkaba kernel: [ 1816.159993] Call Trace:
Feb 17 11:22:38 merkaba kernel: [ 1816.16]  [] _raw_spin_lock+0x1a/0x1d
Feb 17 11:22:38 merkaba kernel: [ 1816.160031]  [] __btrfs_add_free_space+0x47/0x2bd [btrfs]
Feb 17 11:22:38 merkaba kernel: [ 1816.160048]  [] ? block_group_cache_tree_search+0xb7/0xc5 [btrfs]
Feb 17 11:22:38 merkaba kernel: [ 1816.160063]  [] unpin_extent_range.isra.54+0xa2/0x194 [btrfs]
Feb 17 11:22:38 merkaba kernel: [ 1816.160080]  [] btrfs_finish_extent_commit+0xa9/0xb9 [btrfs]
Feb 17 11:22:38 merkaba kernel: [ 1816.160097]  [] btrfs_commit_transaction+0x6cc/0x83c [btrfs]
Feb 17 11:22:38 merkaba kernel: [ 1816.160114]  [] transaction_kthread+0xf3/0x1a6 [btrfs]
Feb 17 11:2

[PATCH v2 4/4] btrfs-progs: fix fsck leaks on error returns

2014-02-17 Thread Gui Hecheng
Add close_ctree() calls before the error returns that follow open_ctree().
Also merge the error returns into the "goto + single return" pattern.

Signed-off-by: Gui Hecheng 
---
changelog:
v1->v2: merge err returns into "goto + single return" pattern
---
 cmds-check.c | 32 
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index eef7c6c..c053126 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -6444,18 +6444,21 @@ int cmd_check(int argc, char **argv)
 
if((ret = check_mounted(argv[optind])) < 0) {
fprintf(stderr, "Could not check mount status: %s\n", strerror(-ret));
-   return ret;
+   goto err_out;
} else if(ret) {
fprintf(stderr, "%s is currently mounted. Aborting.\n", argv[optind]);
-   return -EBUSY;
+   ret = -EBUSY;
+   goto err_out;
}
 
info = open_ctree_fs_info(argv[optind], bytenr, 0, ctree_flags);
if (!info) {
fprintf(stderr, "Couldn't open file system\n");
-   return -EIO;
+   ret = -EIO;
+   goto err_out;
}
 
+   root = info->fs_root;
uuid_unparse(info->super_copy->fsid, uuidbuf);
printf("Checking filesystem on %s\nUUID: %s\n", argv[optind], uuidbuf);
 
@@ -6463,19 +6466,20 @@ int cmd_check(int argc, char **argv)
!extent_buffer_uptodate(info->dev_root->node) ||
!extent_buffer_uptodate(info->chunk_root->node)) {
fprintf(stderr, "Critical roots corrupted, unable to fsck the FS\n");
-   return -EIO;
+   ret = -EIO;
+   goto close_out;
}
 
-   root = info->fs_root;
if (init_extent_tree) {
printf("Creating a new extent tree\n");
ret = reinit_extent_tree(info);
if (ret)
-   return ret;
+   goto close_out;
}
if (!extent_buffer_uptodate(info->extent_root->node)) {
fprintf(stderr, "Critical roots corrupted, unable to fsck the FS\n");
-   return -EIO;
+   ret = -EIO;
+   goto close_out;
}
 
fprintf(stderr, "checking extents\n");
@@ -6486,13 +6490,15 @@ int cmd_check(int argc, char **argv)
trans = btrfs_start_transaction(info->csum_root, 1);
if (IS_ERR(trans)) {
fprintf(stderr, "Error starting transaction\n");
-   return PTR_ERR(trans);
+   ret = PTR_ERR(trans);
+   goto close_out;
}
 
ret = btrfs_fsck_reinit_root(trans, info->csum_root, 0);
if (ret) {
fprintf(stderr, "crc root initialization failed\n");
-   return -EIO;
+   ret = -EIO;
+   goto close_out;
}
 
ret = btrfs_commit_transaction(trans, info->csum_root);
@@ -6562,9 +6568,6 @@ int cmd_check(int argc, char **argv)
ret = 1;
}
 out:
-   free_root_recs_tree(&root_cache);
-   close_ctree(root);
-
if (found_old_backref) { /*
 * there was a disk format change when mixed
 * backref was in testing tree. The old format
@@ -6591,5 +6594,10 @@ out:
(unsigned long long)data_bytes_allocated,
(unsigned long long)data_bytes_referenced);
printf("%s\n", BTRFS_BUILD_VERSION);
+
+   free_root_recs_tree(&root_cache);
+close_out:
+   close_ctree(root);
+err_out:
return ret;
 }
-- 
1.8.1.4



Re: btrfsck does not fix

2014-02-17 Thread Goswin von Brederlow
On Mon, Feb 17, 2014 at 03:20:58AM +, Duncan wrote:
> Chris Murphy posted on Sun, 16 Feb 2014 12:54:44 -0700 as excerpted:
> > Also, 10 hours to balance two disks at 2.3TB seems like a long time. I'm
> > not sure if that's expected.
> 
> FWIW, I think you may not realize how big 2.3 TiB is, and/or how slow 
> spinning rust can be when dealing with TiBs of potentially fragmented 
> data...
> 
> 2.3TiB * 1024GiB/TiB * 1024 MiB/GiB / 10 hours / 60 min/hr / 60 sec/min =
> 
> 66.99... real close to 67 MiB/sec
> 
> Since it's multiple TiB we're talking and only two devices, that's almost 
> certainly spinning rust, not SSD, and on spinning rust, 67 MiB/sec really 
> isn't /that/ bad, especially if the filesystem wasn't new and had been 
> reasonably used, thus likely had some fragmentation to deal with.

Don't forget that that is 67MiB/s reading data and 67MiB/s writing
data giving a total of 134MiB/s. 

Still, on a good system each disk should have about that speed so it's
about 50% of theoretical maximum. Which is quite good given that the
disks will need to seek between every read and write. In comparison
moving data with LVM gets only about half that speed and that doesn't
even have the overhead of a filesystem to deal with.
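
The arithmetic in this exchange can be checked with a short sketch. It
assumes the figures quoted above (2.3 TiB balanced in 10 hours) and that
every byte is read once and written once; the function name is made up for
the example.

```python
MIB_PER_TIB = 1024 * 1024


def balance_rate_mib_per_s(size_tib, hours):
    """One-way throughput in MiB/s for moving size_tib TiB in the given hours."""
    return size_tib * MIB_PER_TIB / (hours * 3600)


rate = balance_rate_mib_per_s(2.3, 10)   # data moved, counted once
combined = 2 * rate                      # each byte is both read and written
print(f"{rate:.2f} MiB/s one way, {combined:.2f} MiB/s read + write")
```

This gives roughly 67 MiB/s one way and roughly 134 MiB/s of combined disk
traffic, matching the numbers in the two mails.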

Kind regards
Goswin


Re: ENOSPC with 270GiB free

2014-02-17 Thread Goswin von Brederlow
On Mon, Feb 17, 2014 at 08:42:23AM +0100, Dan van der Ster wrote:
> Did you already try this?? [1]:
> 
>btrfs fi balance start -dusage=5 /mnt/nas3
> 
> Cheers, dan
> 
> [1] 
> https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#I_get_.22No_space_left_on_device.22_errors.2C_but_df_says_I.27ve_got_lots_of_space
> 
> On Sun, Feb 16, 2014 at 2:58 PM, Goswin von Brederlow wrote:
> > Hi,
> >
> > I'm getting an ENOSPC error from btrfs despite there still being plenty
> > of space left:
> >
> > % df -m /mnt/nas3
> > Filesystem         1M-blocks     Used Available Use% Mounted on
> > /dev/mapper/nas3-a  19077220 18805132    270773  99% /mnt/nas3
> >
> > % btrfs fi show
> > Label: none  uuid: 4b18f84e-2499-41ca-81ff-fe1783c11491
> >         Total devices 1 FS bytes used 17.91TiB
> >         devid    1 size 18.19TiB used 17.94TiB path /dev/mapper/nas3-a
> >
> > Btrfs v3.12
> >
> > % btrfs fi df
> > Data, single: total=17.89TiB, used=17.88TiB
> > System, DUP: total=32.00MiB, used=1.92MiB
> > Metadata, DUP: total=25.50GiB, used=24.89GiB
> >
> > As you can see there are still 270GiB free and plenty of block groups
> > free on the device too.
> >
> > So why isn't btrfs allocating a new block group to store more data?
> >
> > Kind regards
> > Goswin

I did and that isn't the problem. Balancing only frees up partially
used block groups so they can be reused. But the problem is that the
remaining free block groups are not getting used.
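
To show where the free space sits according to the quoted output, here is a
rough sketch using only the figures from "btrfs fi show" and "btrfs fi df"
above. The tools round to two decimals, so the results are approximate, and
the helper name is made up for the example.

```python
GIB_PER_TIB = 1024


def tib_to_gib(value_tib):
    """Convert a TiB figure to GiB."""
    return value_tib * GIB_PER_TIB


# From "btrfs fi show": devid 1 size 18.19TiB used 17.94TiB
unallocated_gib = tib_to_gib(18.19 - 17.94)

# From "btrfs fi df": Data total=17.89TiB used=17.88TiB
data_slack_gib = tib_to_gib(17.89 - 17.88)

# From "btrfs fi df": Metadata total=25.50GiB used=24.89GiB
metadata_slack_gib = 25.50 - 24.89

print(f"unallocated:                 ~{unallocated_gib:.0f} GiB")
print(f"free inside data groups:     ~{data_slack_gib:.1f} GiB")
print(f"free inside metadata groups: ~{metadata_slack_gib:.2f} GiB")
```

By these rounded figures about 256 GiB is unallocated, while the already
allocated data and metadata block groups are nearly full. That is consistent
with the point above: the space is there, but it would have to come from new
block groups.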

Kind regards
Goswin