[RFC PATCH] fstests: Check if a fs can survive random (emulated) power loss

2018-02-25 Thread Qu Wenruo
This test case was originally designed to expose unexpected corruption
in btrfs, following several reports of serious btrfs metadata corruption
after power loss.

The test case runs a heavy fsstress workload on the fs, then uses
dm-flakey to emulate power loss by dropping all later writes.
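
As a rough illustration of the "drop all later writes" step, the sketch below builds a dm-flakey table line that uses the drop_writes feature. The function name, device path and sector count are placeholders chosen for this example, not taken from the patch. The table layout follows the kernel's dm-flakey documentation as I understand it.

```shell
#!/bin/bash
# Sketch only: build a dm-flakey table line that silently drops writes.
# Assumed table layout:
#   <start> <len> flakey <dev> <offset> <up_secs> <down_secs> \
#       [<nr_features> <feature> ...]

flakey_drop_writes_table()
{
	local dev=$1		# backing block device (placeholder path)
	local sectors=$2	# device size in 512-byte sectors

	# Up interval 0 with a long down interval keeps the device in
	# its "down" state, where drop_writes discards every write
	# while reads still succeed.
	echo "0 $sectors flakey $dev 0 0 180 1 drop_writes"
}

flakey_drop_writes_table /dev/sdX 2097152
# prints: 0 2097152 flakey /dev/sdX 0 0 180 1 drop_writes
```

On a real system the resulting line would be loaded into an existing dm device with dmsetup (suspend, load --table, resume), which needs root. The test itself relies on the _load_flakey_table helper from common/dmflakey for that step.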

For btrfs, this should be completely fine as long as the superblock write
(a FUA write) finishes atomically: with metadata CoW, the superblock
points to either the old trees or the new trees, so the fs should be as
atomic as the superblock.

For journal-based filesystems, each metadata update should be journaled,
so each metadata operation is as atomic as the journal update.

The results show that XFS does the best among the tested filesystems
(Btrfs, XFS, ext4): no kernel or xfs_repair problems at all.

For btrfs, although btrfs check doesn't report any problem, the kernel
reports some data checksum errors. This is a little unexpected, as data
is CoWed by default and should be as atomic as the superblock.
(Unfortunately, this is still not the exact problem I'm chasing.)

For ext4, the kernel is fine, but a later e2fsck reports problems, which
may indicate there is still something to be improved.

Signed-off-by: Qu Wenruo 
---
 tests/generic/479 | 109 ++
 tests/generic/479.out |   2 +
 tests/generic/group   |   1 +
 3 files changed, 112 insertions(+)
 create mode 100755 tests/generic/479
 create mode 100644 tests/generic/479.out

diff --git a/tests/generic/479 b/tests/generic/479
new file mode 100755
index ..ab530231
--- /dev/null
+++ b/tests/generic/479
@@ -0,0 +1,109 @@
+#! /bin/bash
+# FS QA Test 479
+#
+# Test if a filesystem can survive emulated powerloss.
+#
+# No matter what the solution a filesystem uses (journal or CoW),
+# it should survive unexpected powerloss, without major metadata
+# corruption.
+#
+#---
+# Copyright (c) 2018 SuSE.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   ps -e | grep fsstress > /dev/null 2>&1
+   while [ $? -eq 0 ]; do
+      $KILLALL_PROG -KILL fsstress > /dev/null 2>&1
+      wait > /dev/null 2>&1
+      ps -e | grep fsstress > /dev/null 2>&1
+   done
+   _unmount_flakey &> /dev/null
+   _cleanup_flakey
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmflakey
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_dm_target flakey
+_require_command "$KILLALL_PROG" "killall"
+
+runtime=$(($TIME_FACTOR * 15))
+loops=$(($LOAD_FACTOR * 4))
+
+for i in $(seq -w $loops); do
+   echo "=== Loop $i: $(date) ===" >> $seqres.full
+
+   _scratch_mkfs >/dev/null 2>&1
+   _init_flakey
+   _mount_flakey
+
+   ($FSSTRESS_PROG $FSSTRESS_AVOID -w -d $SCRATCH_MNT -n 100 \
+      -p 100 >> $seqres.full &) > /dev/null 2>&1
+
+   sleep $runtime
+
+   # Here we only want to drop all write, don't need to umount the fs
+   _load_flakey_table $FLAKEY_DROP_WRITES
+
+   ps -e | grep fsstress > /dev/null 2>&1
+   while [ $? -eq 0 ]; do
+      $KILLALL_PROG -KILL fsstress > /dev/null 2>&1
+      wait > /dev/null 2>&1
+      ps -e | grep fsstress > /dev/null 2>&1
+   done
+
+   _unmount_flakey
+   _cleanup_flakey
+
+   # Mount the fs to do proper log replay for journal based fs
+   # so later check won't report annoying dirty log and only
+   # report real problem.
+   _scratch_mount
+   _scratch_unmount
+
+   _check_scratch_fs
+done
+
+echo "Silence is golden"
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/479.out b/tests/generic/479.out
new file mode 100644
index ..290f18b3
--- /dev/null
+++ b/tests/generic/479.out
@@ -0,0 +1,2 @@
+QA output created by 479
+Silence is golden
diff --git 

Re: Help with leaf parent key incorrect

2018-02-25 Thread Anand Jain



On 02/25/2018 06:16 PM, Paul Jones wrote:

Hi all,

I was running dedupe on my filesystem and something went wrong overnight;
by the time I noticed, the fs was read-only.


 Thanks for the report. I have a few questions:
  Which RAID profile is used here?
  Which dedupe tool was used?
  Was the fs full before the dedupe run?
  Were there any I/O errors?

Thanks, Anand



[PATCH v3] btrfs: verify max_inline mount parameter

2018-02-25 Thread Anand Jain
We aren't verifying the parameter passed to the max_inline mount option.
So we won't fail the mount if a junk value is specified, for example,
-o max_inline=abc. This patch checks if input is valid.
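
To make the intended behaviour concrete, here is a userspace approximation of the check in shell. It is not the kernel code. The function name parse_size and the EINVAL-style return code are choices made for this sketch. The kernel's memparse() also accepts hex and octal prefixes, which this decimal-only version ignores, and unlike memparse() it rejects an empty string.

```shell
#!/bin/bash
# Sketch only: accept a decimal number with at most one size suffix
# and reject the value if any other characters are present. This
# mirrors the idea of checking that nothing is left after parsing.

parse_size()
{
	local val=$1
	local re='^([0-9]+)([KkMmGgTtPpEe]?)$'
	local bits=0

	# Anything that is not "<digits>[suffix]" counts as junk.
	[[ $val =~ $re ]] || return 22	# 22 == EINVAL

	case ${BASH_REMATCH[2]} in
	[Kk]) bits=10 ;;
	[Mm]) bits=20 ;;
	[Gg]) bits=30 ;;
	[Tt]) bits=40 ;;
	[Pp]) bits=50 ;;
	[Ee]) bits=60 ;;
	esac

	# Force base 10 so a leading zero is not read as octal.
	echo $((10#${BASH_REMATCH[1]} << bits))
}

parse_size 4K				# prints 4096
parse_size abc || echo "rejected"	# prints rejected
```

With this shape, "4K" and "2048" are accepted, while "abc" or "4Kx" fail. Those are the cases the patch wants the mount to refuse.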

Signed-off-by: Anand Jain 
---
v2->v3: Handle parameter with unit, such as 4K. Use memparse() 2nd arg.
v1->v2: use match_int ret value if error
use %u instead of %d for parser

 fs/btrfs/super.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 77e0537e1db5..76b58da8d56d 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -605,7 +605,14 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 		case Opt_max_inline:
 			num = match_strdup(&args[0]);
 			if (num) {
-				info->max_inline = memparse(num, NULL);
+				char *retptr;
+
+				info->max_inline = memparse(num, &retptr);
+				if (*retptr != '\0') {
+					ret = -EINVAL;
+					kfree(num);
+					goto out;
+				}
 				kfree(num);
 
 				if (info->max_inline) {
-- 
2.15.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Help with leaf parent key incorrect

2018-02-25 Thread Paul Jones
Hi all,

I was running dedupe on my filesystem and something went wrong overnight;
by the time I noticed, the fs was read-only.
When I try to check it, this is what I get:
vm-server ~ # btrfs check /dev/mapper/a-backup--a
parent transid verify failed on 2371034071040 wanted 62977 found 62893
parent transid verify failed on 2371034071040 wanted 62977 found 62893
parent transid verify failed on 2371034071040 wanted 62977 found 62893
parent transid verify failed on 2371034071040 wanted 62977 found 62893
Ignoring transid failure
leaf parent key incorrect 2371034071040
ERROR: cannot open file system

Is there a way to fix this? I'm using kernel 4.15.5
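
In these messages, "wanted" is the generation the parent node expects for the tree block and "found" is the generation stored in the block that was read. A small sketch like the one below can condense a long run of such lines into one line per block, with the generation gap. The function name summarize_transid is made up for this example, and it assumes each message sits on a single, unwrapped line.

```shell
#!/bin/bash
# Sketch only: summarize "parent transid verify failed" lines read
# from stdin, one output line per distinct block/generation triple.

summarize_transid()
{
	awk '/parent transid verify failed on/ {
		block = want = found = ""
		for (i = 1; i < NF; i++) {
			if ($i == "on")     block = $(i + 1)
			if ($i == "wanted") want  = $(i + 1)
			if ($i == "found")  found = $(i + 1)
		}
		key = block " wanted " want " found " found
		count[key]++
		lag[key] = want - found
	}
	END {
		for (k in count)
			printf "%s (lag %d, seen %d times)\n", k, lag[k], count[k]
	}'
}

printf '%s\n' \
	"parent transid verify failed on 2371034071040 wanted 62977 found 62893" \
	"parent transid verify failed on 2371034071040 wanted 62977 found 62893" |
	summarize_transid
# prints: 2371034071040 wanted 62977 found 62893 (lag 84, seen 2 times)
```

In practice the input would come from something like the output of btrfs check or dmesg piped into the function. The order of output lines is not defined when several blocks are involved.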

This is the last part of dmesg

[  +0.02] BTRFS error (device dm-6): parent transid verify failed on 2374016368640 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 2374016368640 wanted 63210 found 63208
[  +1.107963] BTRFS error (device dm-6): parent transid verify failed on 2374016368640 wanted 63210 found 63208
[  +0.05] BTRFS error (device dm-6): parent transid verify failed on 2374016368640 wanted 63210 found 63208
[  +1.473598] BTRFS error (device dm-6): parent transid verify failed on 2373996298240 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 2373996298240 wanted 63210 found 63208
[  +0.001927] BTRFS error (device dm-6): parent transid verify failed on 2373996298240 wanted 63210 found 63208
[  +0.03] BTRFS error (device dm-6): parent transid verify failed on 2373996298240 wanted 63210 found 63208
[  +0.60] BTRFS error (device dm-6): parent transid verify failed on 2373996298240 wanted 63210 found 63208
[  +0.01] BTRFS error (device dm-6): parent transid verify failed on 2373996298240 wanted 63210 found 63208
[  +2.676048] verify_parent_transid: 10362 callbacks suppressed
[  +0.02] BTRFS error (device dm-6): parent transid verify failed on 2373991677952 wanted 63210 found 63208
[  +0.03] BTRFS error (device dm-6): parent transid verify failed on 2373991677952 wanted 63210 found 63208
[  +0.078432] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
[  +0.43] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
[  +0.01] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
[  +0.058638] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
[  +0.139174] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 2373996232704 wanted 63210 found 63208
[Feb25 20:48] BTRFS info (device dm-6): using free space tree
[  +0.02] BTRFS error (device dm-6): Remounting read-write after error is not allowed
[Feb25 20:49] BTRFS error (device dm-6): cleaner transaction attach returned -30
[  +0.238718] BTRFS warning (device dm-6): page private not zero on page 1596642967552
[  +0.03] BTRFS warning (device dm-6): page private not zero on page 1596642971648
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 1596642975744
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 1596642979840
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643672064
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643676160
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643680256
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643684352
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643704832
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643708928
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643713024
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 1596643717120
[  +0.28] BTRFS warning (device dm-6): page private not zero on page 2363051098112
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 2363051102208
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 2363051106304
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 2363051110400
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 2368056344576
[  +0.00] BTRFS warning (device dm-6): page private not zero on page 2368056348672
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 2368056352768
[  +0.01] BTRFS warning (device dm-6): page private not zero on page