RE: [GIT PULL] Fix for btrfs/070 checksum error

2015-07-29 Thread Zhao Lei
Hi, Chris

> -----Original Message-----
> From: linux-btrfs-ow...@vger.kernel.org
> [mailto:linux-btrfs-ow...@vger.kernel.org] On Behalf Of Qu Wenruo
> Sent: Tuesday, July 28, 2015 3:11 PM
> To: Chris Mason; btrfs
> Subject: Re: [GIT PULL] Fix for btrfs/070 checksum error
> 
> Chris Mason wrote on 2015/07/23 21:57 -0400:
> > On Fri, Jul 24, 2015 at 08:29:05AM +0800, Qu Wenruo wrote:
> >
> > [ deadlock with the 070 patches ]
> >
> >> Thanks Chris
> >>
> >> We will investigate it with highest priority.
> >>
> >> Thanks,
> >> Qu
> >>
> >
> > Thanks!  I'm doing a few more runs to make sure the lockup is new with
> > these patches.
> >
> > -chris
> >
> Hi Chris,
> 
> I'm very sorry that we are unable to fix the lockup in a short time, so it
> may not fit in the v4.2 merge window.
> 
> Please ignore this patchset for now.
> 

Sorry for taking so long to investigate; the problem only happens
randomly.

We found the reason the process blocks:
1: In some cases, this patch causes
  __btrfs_cow_block()->btrfs_reloc_cow_block() to fail during a
  btrfs_balance operation (needs more investigation).

2: __btrfs_cow_block()'s error handling code does not unlock/free the
  newly allocated tree block before returning the error.

3: do_relocation(), the caller of __btrfs_cow_block(), has error handling
  code, but it can't help in this case either, because the newly
  allocated eb is not returned to it.

4: Subsequent code in do_relocation() tries to lock the above eb again,
  which causes the deadlock.

In short:
do_relocation()
-> __btrfs_cow_block() fails without unlocking the eb *1
...
-> btrfs_search_slot() tries to lock the above eb again
...
*1: this failure is caused by scrub

Because the eb locking code is not a normal lock, we can't get any
information from lockdep in this case.

Things to do:
1: Fix this patch so that it does not make __btrfs_cow_block() fail.
2: Fix __btrfs_cow_block() to do enough cleanup in its error handling code.
3: Enhance eb locking to report some information that helps with
  similar errors.
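
To illustrate item 2, here is a minimal userspace sketch of the cleanup an
error path like this needs. It is not the kernel code: every toy_* name is
hypothetical, the "lock" is just a flag, and the failing hook merely stands
in for btrfs_reloc_cow_block() returning an error.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for an extent buffer; "locked" models the eb lock. */
struct toy_eb {
	bool locked;
};

/* Number of live toy buffers, so a caller can check for leaks. */
int toy_live_buffers;

/* Allocate a new buffer and return it locked (hypothetical helper). */
struct toy_eb *toy_alloc_locked(void)
{
	struct toy_eb *eb = calloc(1, sizeof(*eb));

	if (eb) {
		eb->locked = true;
		toy_live_buffers++;
	}
	return eb;
}

/* Unlock and release a buffer (hypothetical helper). */
void toy_unlock_free(struct toy_eb *eb)
{
	eb->locked = false;
	free(eb);
	toy_live_buffers--;
}

/*
 * Toy version of the COW step: allocate and lock a new block, then run a
 * hook that may fail. On failure the new block is unlocked and freed
 * before returning, because the caller never gets a pointer to it and so
 * cannot clean it up itself.
 */
int toy_cow_block(int (*reloc_hook)(void), struct toy_eb **cow_ret)
{
	struct toy_eb *cow = toy_alloc_locked();
	int ret;

	if (!cow)
		return -ENOMEM;

	ret = reloc_hook();
	if (ret) {
		toy_unlock_free(cow);	/* the cleanup step this sketch is about */
		return ret;
	}

	*cow_ret = cow;	/* success: the caller now owns the locked block */
	return 0;
}

/* Example hooks: one succeeds, one fails like an I/O error would. */
int toy_hook_ok(void) { return 0; }
int toy_hook_fail(void) { return -EIO; }
```

Calling toy_cow_block(toy_hook_fail, &eb) returns -EIO and leaves no locked
buffer behind, so a later attempt to lock the same block cannot hang on it.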

Thanks
Zhaolei

> Thanks,
> Qu
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the 
> body
> of a message to majord...@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html



Re: [GIT PULL] Fix for btrfs/070 checksum error

2015-07-29 Thread Chris Mason
On Wed, Jul 29, 2015 at 04:21:33PM +0800, Zhao Lei wrote:
> Hi, Chris
> 
> > -----Original Message-----
> > From: linux-btrfs-ow...@vger.kernel.org
> > [mailto:linux-btrfs-ow...@vger.kernel.org] On Behalf Of Qu Wenruo
> > Sent: Tuesday, July 28, 2015 3:11 PM
> > To: Chris Mason; btrfs
> > Subject: Re: [GIT PULL] Fix for btrfs/070 checksum error
> > 
> > Chris Mason wrote on 2015/07/23 21:57 -0400:
> > > On Fri, Jul 24, 2015 at 08:29:05AM +0800, Qu Wenruo wrote:
> > >
> > > [ deadlock with the 070 patches ]
> > >
> > >> Thanks Chris
> > >>
> > >> We will investigate it with highest priority.
> > >>
> > >> Thanks,
> > >> Qu
> > >>
> > >
> > > Thanks!  I'm doing a few more runs to make sure the lockup is new with
> > > these patches.
> > >
> > > -chris
> > >
> > Hi Chris,
> > 
> > I'm very sorry that we are unable to fix the lockup in a short time, so it
> > may not fit in the v4.2 merge window.
> > 
> > Please ignore this patchset for now.
> > 
> 
> Sorry for taking so long to investigate; the problem only happens
> randomly.
> 
> We found the reason the process blocks:
> 1: In some cases, this patch causes
>   __btrfs_cow_block()->btrfs_reloc_cow_block() to fail during a
>   btrfs_balance operation (needs more investigation).
> 
> 2: __btrfs_cow_block()'s error handling code does not unlock/free the
>   newly allocated tree block before returning the error.
> 
> 3: do_relocation(), the caller of __btrfs_cow_block(), has error handling
>   code, but it can't help in this case either, because the newly
>   allocated eb is not returned to it.
> 
> 4: Subsequent code in do_relocation() tries to lock the above eb again,
>   which causes the deadlock.

Excellent, thanks for tracking this down.  I agree investigating #1 is
the top priority, since it's possible the patches are just making it
happen more often.

-chris


[PATCH] Btrfs: teach backref walking about backrefs with underflowed offset values

2015-07-29 Thread fdmanana
From: Filipe Manana 

When cloning/deduplicating file extents (through the clone and extent_same
ioctls) we can get data back references with offset values that are a
result of an unsigned integer arithmetic underflow, that is, values that
are much larger than they could be otherwise.

This is not a problem when decrementing or dropping the back references
(happens when we overwrite the extents or punch a hole for example, through
__btrfs_drop_extents()), since we compute the same too large offset value,
but it is a problem for the backref walking code, used by an incremental
send and the ioctls that are used by the btrfs tool "inspect-internal"
commands, as it makes it miss the corresponding file extent items because
the search key is set for an extent item that starts at an offset matching
the exceptionally large offset value of the data back reference. For an
incremental send this causes the send ioctl to fail with -EIO.

So teach the backref walking code to deal with these cases by setting the
search key's offset to 0 if the backref's offset value is larger than
LLONG_MAX (the largest possible file offset). This makes sure the backref
walking code finds the corresponding file extent items at the expense of
scanning more items and leafs in the btree.
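
As a rough illustration of the arithmetic and of the workaround described
above, here is a small standalone sketch. It is not the kernel code: both
function names are made up for this example, and only the LLONG_MAX rule
itself comes from the description.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Offset stored in a data back reference: file offset minus the data
 * offset into the extent. The subtraction is unsigned 64-bit, so a data
 * offset larger than the file offset wraps around modulo 2^64.
 */
uint64_t backref_offset(uint64_t file_offset, uint64_t data_offset)
{
	return file_offset - data_offset;
}

/*
 * Offset to start the file extent item search from. A backref offset
 * above LLONG_MAX cannot be a real file offset, so fall back to 0 and
 * accept scanning more items.
 */
uint64_t search_key_offset(uint64_t ref_offset)
{
	if (ref_offset > (uint64_t)LLONG_MAX)
		return 0;
	return ref_offset;
}
```

With a file offset of 0 and a data offset of 16K, backref_offset() yields
18446744073709535232, and search_key_offset() maps that value back to 0.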

Fixing the clone/dedup ioctls to not produce such underflowed results would
require major changes breaking backward compatibility, updating user space
tools, etc.

Simple reproducer case for fstests:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"

  tmp=/tmp/$$
  status=1  # failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      rm -fr $send_files_dir
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter

  # real QA test starts here
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_cloner
  _need_to_be_root

  send_files_dir=$TEST_DIR/btrfs-test-$seq

  rm -f $seqres.full
  rm -fr $send_files_dir
  mkdir $send_files_dir

  _scratch_mkfs >>$seqres.full 2>&1
  _scratch_mount

  # Create our test file with a single extent of 64K starting at file
  # offset 128K.
  $XFS_IO_PROG -f -c "pwrite -S 0xaa 128K 64K" $SCRATCH_MNT/foo \
      | _filter_xfs_io

  _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \
      $SCRATCH_MNT/mysnap1

  # Now clone parts of the original extent into lower offsets of the file.
  #
  # The first clone operation adds a file extent item to file offset 0
  # that points to our initial extent with a data offset of 16K. The
  # corresponding data back reference in the extent tree has an offset of
  # 18446744073709535232, which is the result of file_offset - data_offset
  # = 0 - 16K.
  #
  # The second clone operation adds a file extent item to file offset 16K
  # that points to our initial extent with a data offset of 48K. The
  # corresponding data back reference in the extent tree has an offset of
  # 18446744073709518848, which is the result of file_offset - data_offset
  # = 16K - 48K.
  #
  # Those large back reference offsets (result of unsigned arithmetic
  # underflow) confused the back reference walking code (used by an
  # incremental send and the multiple inspect-internal ioctls) and made it
  # miss the back references, which for the case of an incremental send it
  # made it fail with -EIO and print a message like the following to
  # dmesg:
  #
  # "BTRFS error (device sdc): did not find backref in send_root. \
  #  inode=257, offset=0, disk_byte=12845056 found extent=12845056"
  #
  $CLONER_PROG -s $(((128 + 16) * 1024)) -d 0 -l $((16 * 1024)) \
      $SCRATCH_MNT/foo $SCRATCH_MNT/foo
  $CLONER_PROG -s $(((128 + 48) * 1024)) -d $((16 * 1024)) \
      -l $((16 * 1024)) $SCRATCH_MNT/foo $SCRATCH_MNT/foo

  _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \
      $SCRATCH_MNT/mysnap2

  _run_btrfs_util_prog send $SCRATCH_MNT/mysnap1 -f $send_files_dir/1.snap
  _run_btrfs_util_prog send -p $SCRATCH_MNT/mysnap1 $SCRATCH_MNT/mysnap2 \
      -f $send_files_dir/2.snap

  echo "File digest in the original filesystem:"
  md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch

  # Now recreate the filesystem by receiving both send streams and verify
  # we get the same file contents that the original filesystem had.
  _scratch_unmount
  _scratch_mkfs >>$seqres.full 2>&1
  _scratch_mount

  _run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/1.snap
  _run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/2.snap

  echo "File digest in the new filesystem:"
  md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch

  status=0
  exit

The test's expected golden output is:

  wrote 65536/65536 bytes at offset 131072
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  File digest in the original filesystem:
  6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2/foo
  File digest in the new filesystem:
  6c6079335cff141b8a31233ead04cbff  SCRATCH_MNT/mysnap2

Re: Strange data backref offset?

2015-07-29 Thread Filipe David Manana
On Fri, Jul 17, 2015 at 3:38 AM, Qu Wenruo  wrote:
> Hi all,
>
> While developing a new btrfs in-band dedup mechanism, I found that btrfsck
> and the kernel behave strangely for clones.
>
> [Reproducer]
> # mount /dev/sdc -t btrfs /mnt/test
> # dd if=/dev/zero of=/mnt/test/file1 bs=4K count=4
> # sync
> # ~/xfstests/src/cloner -s 4096 -l 4096 /mnt/test/file1 /mnt/test/file2
> # sync
>
> Then btrfs-debug-tree gives quite strange result on the data backref:
> --
> 
> item 4 key (12845056 EXTENT_ITEM 16384) itemoff 16047 itemsize 111
> extent refs 3 gen 6 flags DATA
> extent data backref root 5 objectid 257 offset 0 count 1
> extent data backref root 5 objectid 258 offset
> 18446744073709547520 count 1
>
> 
> item 8 key (257 EXTENT_DATA 0) itemoff 15743 itemsize 53
> extent data disk byte 12845056 nr 16384
> extent data offset 0 nr 16384 ram 16384
> extent compression 0
> item 9 key (257 EXTENT_DATA 16384) itemoff 15690 itemsize 53
> extent data disk byte 12845056 nr 16384
> extent data offset 4096 nr 4096 ram 16384
> extent compression 0
> --
>
> The offset is the file extent's key.offset - the file extent's offset,
> which is 0 - 4096, causing the overflowed result.
>
> The kernel and fsck both use that behavior, so fsck passes despite the
> strange value.
>
> But shouldn't the offset in the data backref match the key.offset of the
> file extent?
>
> I'm quite sure that changing the behavior could badly break fsck and the
> kernel, but I'm wondering whether this is a known bug or a feature, and
> whether it will be handled.

Obviously a bug.

I was recently investigating incremental send failures after
cloning/deduping extents and that led me to this as well.
It's a bug, but it's not too bad as it affects only backref walking,
which can have a simple workaround (I just sent a patch for it). For
the purposes of incrementing/decrementing the data backref's count we
do the same calculation everywhere, always leading to the same large
and unexpected value, so we don't get bogus backrefs added/left
around.
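
A toy model of why the counts stay consistent, using the values from the
reproducer quoted above (root 5, objectid 258, 0 - 4096). This is only a
sketch: the struct and both functions are hypothetical and much simpler
than the real extent tree code.

```c
#include <stdint.h>

/* Toy data backref, identified by (root, objectid, offset). */
struct toy_backref {
	uint64_t root;
	uint64_t objectid;
	uint64_t offset;
	int count;
};

/* Offset that both the add and the drop path derive (wraps modulo 2^64). */
uint64_t toy_ref_offset(uint64_t key_offset, uint64_t extent_offset)
{
	return key_offset - extent_offset;
}

/*
 * Adjust the count only when the computed identity matches the stored
 * one. Returns 0 on a match and -1 otherwise.
 */
int toy_mod_ref(struct toy_backref *ref, uint64_t root, uint64_t objectid,
		uint64_t key_offset, uint64_t extent_offset, int delta)
{
	if (ref->root != root || ref->objectid != objectid ||
	    ref->offset != toy_ref_offset(key_offset, extent_offset))
		return -1;
	ref->count += delta;
	return 0;
}
```

Because the drop path repeats the same subtraction, it arrives at the same
wrapped value (18446744073709547520 here) and finds the backref it needs to
decrement; only code that treats the value as a real file offset misses.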

>
> Thanks,
> Qu



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


[PATCH] fstests: test for btrfs incremental send after file extent cloning

2015-07-29 Thread fdmanana
From: Filipe Manana 

Test that an incremental send works after a file gets one of its extents
cloned/deduplicated into lower file offsets.

This is a regression test for the problem fixed by the Linux kernel patch
titled:

  "Btrfs: teach backref walking about backrefs with underflowed
   offset values"

Signed-off-by: Filipe Manana 
---
 tests/btrfs/097 | 113 
 tests/btrfs/097.out |   7 
 tests/btrfs/group   |   1 +
 3 files changed, 121 insertions(+)
 create mode 100755 tests/btrfs/097
 create mode 100644 tests/btrfs/097.out

diff --git a/tests/btrfs/097 b/tests/btrfs/097
new file mode 100755
index 000..d9138ea
--- /dev/null
+++ b/tests/btrfs/097
@@ -0,0 +1,113 @@
+#! /bin/bash
+# FS QA Test No. btrfs/097
+#
+# Test that an incremental send works after a file gets one of its extents
+# cloned/deduplicated into lower file offsets.
+#
+#---
+# Copyright (C) 2015 SUSE Linux Products GmbH. All Rights Reserved.
+# Author: Filipe Manana 
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   rm -fr $send_files_dir
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_cloner
+_need_to_be_root
+
+send_files_dir=$TEST_DIR/btrfs-test-$seq
+
+rm -f $seqres.full
+rm -fr $send_files_dir
+mkdir $send_files_dir
+
+_scratch_mkfs >>$seqres.full 2>&1
+_scratch_mount
+
+# Create our test file with a single extent of 64K starting at file offset 128K.
+$XFS_IO_PROG -f -c "pwrite -S 0xaa 128K 64K" $SCRATCH_MNT/foo | _filter_xfs_io
+
+_run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/mysnap1
+
+# Now clone parts of the original extent into lower offsets of the file.
+#
+# The first clone operation adds a file extent item to file offset 0 that points
+# to our initial extent with a data offset of 16K. The corresponding data back
+# reference in the extent tree has an offset of 18446744073709535232, which is
+# the result of file_offset - data_offset = 0 - 16K.
+#
+# The second clone operation adds a file extent item to file offset 16K that
+# points to our initial extent with a data offset of 48K. The corresponding data
+# back reference in the extent tree has an offset of 18446744073709518848, which
+# is the result of file_offset - data_offset = 16K - 48K.
+#
+# Those large back reference offsets (result of unsigned arithmetic underflow)
+# confused the back reference walking code (used by an incremental send and
+# the multiple inspect-internal ioctls) and made it miss the back references,
+# which for the case of an incremental send it made it fail with -EIO and print
+# a message like the following to dmesg:
+#
+# "BTRFS error (device sdc): did not find backref in send_root. inode=257, \
+#  offset=0, disk_byte=12845056 found extent=12845056"
+#
+$CLONER_PROG -s $(((128 + 16) * 1024)) -d 0 -l $((16 * 1024)) \
+   $SCRATCH_MNT/foo $SCRATCH_MNT/foo
+$CLONER_PROG -s $(((128 + 48) * 1024)) -d $((16 * 1024)) -l $((16 * 1024)) \
+   $SCRATCH_MNT/foo $SCRATCH_MNT/foo
+
+_run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/mysnap2
+
+_run_btrfs_util_prog send $SCRATCH_MNT/mysnap1 -f $send_files_dir/1.snap
+_run_btrfs_util_prog send -p $SCRATCH_MNT/mysnap1 $SCRATCH_MNT/mysnap2 \
+   -f $send_files_dir/2.snap
+
+echo "File digest in the original filesystem:"
+md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch
+
+# Now recreate the filesystem by receiving both send streams and verify we get
+# the same file contents that the original filesystem had.
+_scratch_unmount
+_scratch_mkfs >>$seqres.full 2>&1
+_scratch_mount
+
+_run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/1.snap
+_run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/2.snap
+
+echo "File digest in the new filesystem:"
+md5sum $SCRATCH_MNT/mysnap2/foo | _filter_scratch
+
+status=0
+exit
diff --git a/tests/btrfs/097.out b/tests/btrfs/097.out
new file mode 100644
index 0

fs got readonly after "btrfs_run_delayed_refs:2783: errno=-5 IO failure"

2015-07-29 Thread Anatol Pomozov
Hi

On my home machine I use btrfs with the latest Linux kernel (Arch Linux).

A few days ago I started a rebalance, but unfortunately the machine got
rebooted. It looks like the rebalance operation is not interrupt-tolerant,
and now my filesystem is corrupted.

I see a lot of checksum errors, but as I use RAID most of these errors
got fixed. I started a scrub operation to find/fix all the problems, but
the scrub got cancelled at the very beginning. I see the
following error in the kernel logs: "(device sdb):
run_one_delayed_ref returned -5" and after that "(device sdb): forced
readonly". What is that supposed to mean? I expected scrub either to
fix the filesystem inconsistencies or to tell me which files are not
recoverable, so I can delete them or restore the data from backup. But now
I have a read-only filesystem and scrub refuses to recover it.

I see the same issue with current HEAD (v4.2-rc3). I enabled btrfs
debugging to get more info about what is going on here:

[  609.802479] BTRFS: read error corrected: ino 1 off 25324783960064
(dev /dev/sdc sector 2530931648)
[  609.814791] BTRFS (device sdb): parent transid verify failed on
25324789198848 wanted 443932 found 413701
[  609.835181] BTRFS: read error corrected: ino 1 off 25324789198848
(dev /dev/sdc sector 2530941880)
[  609.846056] BTRFS (device sdb): parent transid verify failed on
25324789448704 wanted 443932 found 413701
[  609.858280] BTRFS: read error corrected: ino 1 off 25324789448704
(dev /dev/sdc sector 2530942368)
[  609.859835] BTRFS (device sdb): parent transid verify failed on
25324789387264 wanted 443932 found 413701
[  609.870867] BTRFS: read error corrected: ino 1 off 25324789387264
(dev /dev/sdc sector 2530942248)
[  609.872679] BTRFS (device sdb): parent transid verify failed on
25324822609920 wanted 443938 found 441825
[  609.909616] BTRFS: read error corrected: ino 1 off 25324822609920
(dev /dev/sdc sector 2531007136)
[  609.967041] BTRFS (device sdb): parent transid verify failed on
25324678742016 wanted 443932 found 441820
[  609.970855] BTRFS: read error corrected: ino 1 off 25324678742016
(dev /dev/sdc sector 2530726144)
[  610.008460] BTRFS (device sdb): parent transid verify failed on
25324908392448 wanted 443938 found 415080
[  610.041669] BTRFS: read error corrected: ino 1 off 25324908392448
(dev /dev/sdc sector 2531174680)
[  610.116968] BTRFS (device sdb): parent transid verify failed on
25325058904064 wanted 443941 found 441828
[  610.123595] BTRFS: read error corrected: ino 1 off 25325058904064
(dev /dev/sdc sector 4024575336)
[  610.128482] BTRFS: read error corrected: ino 1 off 25324674007040
(dev /dev/sdc sector 2530716896)
[  640.028885] verify_parent_transid: 19 callbacks suppressed
[  640.030377] BTRFS (device sdb): parent transid verify failed on
25324845932544 wanted 443938 found 441825
[  640.062917] repair_io_failure: 18 callbacks suppressed
[  640.064486] BTRFS: read error corrected: ino 1 off 25324845932544
(dev /dev/sdc sector 2531052688)
[  640.119903] BTRFS (device sdb): parent transid verify failed on
25324845969408 wanted 443938 found 441827
[  640.125322] BTRFS: read error corrected: ino 1 off 25324845969408
(dev /dev/sdc sector 2531052760)
[  640.142157] BTRFS (device sdb): parent transid verify failed on
25325043716096 wanted 443940 found 441827
[  640.174974] BTRFS: read error corrected: ino 1 off 25325043716096
(dev /dev/sdc sector 4024545672)
[  640.185464] BTRFS (device sdb): parent transid verify failed on
25325503774720 wanted 443950 found 441837
[  640.238762] BTRFS: read error corrected: ino 1 off 25325503774720
(dev /dev/sdc sector 4025444224)
[  641.718129] BTRFS (device sdb): parent transid verify failed on
25325006667776 wanted 443940 found 441827
[  641.721734] BTRFS: read error corrected: ino 1 off 25325006667776
(dev /dev/sdc sector 4024473312)
[  641.723841] BTRFS (device sdb): parent transid verify failed on
25325006692352 wanted 443940 found 441827
[  641.725775] BTRFS: read error corrected: ino 1 off 25325006692352
(dev /dev/sdc sector 4024473360)
[  641.742454] BTRFS (device sdb): parent transid verify failed on
25325006716928 wanted 443940 found 441827
[  641.744649] BTRFS: read error corrected: ino 1 off 25325006716928
(dev /dev/sdc sector 4024473408)
[  641.778807] BTRFS (device sdb): parent transid verify failed on
25324804997120 wanted 443937 found 413700
[  641.819483] BTRFS: read error corrected: ino 1 off 25324804997120
(dev /dev/sdc sector 2530972736)
[  641.821201] BTRFS (device sdb): parent transid verify failed on
25324782997504 wanted 443937 found 441823
[  641.834794] BTRFS: read error corrected: ino 1 off 25324782997504
(dev /dev/sdc sector 2530929768)
[  641.836415] BTRFS (device sdb): parent transid verify failed on
25324805001216 wanted 443937 found 413700
[  641.838488] BTRFS: read error corrected: ino 1 off 25324805001216
(dev /dev/sdc sector 2530972744)
[  644.531005] BTRFS error (device sdb): run_one_delayed_ref returned -5
[  644.534562] [ cut here ]---

[PATCH 1/1] btrfs-progs: compilation errors when using musl libc

2015-07-29 Thread Brendan Heading
- limits.h must be included to pick up PATH_MAX.
- remove double declaration of BTRFS_DISABLE_BACKTRACE

kerncompat.h assumed that if __GLIBC__ was not defined,
it could safely define BTRFS_DISABLE_BACKTRACE, however this can be
defined by the configure script. Added a check to ensure it is not
defined first.

Signed-off-by: Brendan Heading 
---
 cmds-inspect.c | 1 +
 cmds-receive.c | 1 +
 cmds-scrub.c   | 1 +
 cmds-send.c| 1 +
 kerncompat.h   | 2 ++
 5 files changed, 6 insertions(+)

diff --git a/cmds-inspect.c b/cmds-inspect.c
index 71451fe..9712581 100644
--- a/cmds-inspect.c
+++ b/cmds-inspect.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "kerncompat.h"
 #include "ioctl.h"
diff --git a/cmds-receive.c b/cmds-receive.c
index 071bea9..d4b3103 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/cmds-scrub.c b/cmds-scrub.c
index b7aa809..5a85dc4 100644
--- a/cmds-scrub.c
+++ b/cmds-scrub.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "ctree.h"
 #include "ioctl.h"
diff --git a/cmds-send.c b/cmds-send.c
index 20bba18..a0b7f95 100644
--- a/cmds-send.c
+++ b/cmds-send.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "ctree.h"
 #include "ioctl.h"
diff --git a/kerncompat.h b/kerncompat.h
index 5d92856..7c627ba 100644
--- a/kerncompat.h
+++ b/kerncompat.h
@@ -33,7 +33,9 @@
 #include 
 
 #ifndef __GLIBC__
+#ifndef BTRFS_DISABLE_BACKTRACE
 #define BTRFS_DISABLE_BACKTRACE
+#endif
 #define __always_inline __inline __attribute__ ((__always_inline__))
 #endif
 
-- 
2.4.3



Re: fs got readonly after "btrfs_run_delayed_refs:2783: errno=-5 IO failure"

2015-07-29 Thread Duncan
Anatol Pomozov posted on Wed, 29 Jul 2015 09:26:00 -0700 as excerpted:

> On my home machine I use btrfs with the latest Linux kernel (Arch
> Linux).

Similar here, but on gentoo.  And to be clear, I'm just a list regular and 
btrfs user like yourself, not a dev.  As such, this reply isn't intended to 
directly help you fix the issue at hand, but it does address a possible 
misconception I saw, below, and provide some more general information 
that could be helpful.

> A few days ago I started a rebalance, but unfortunately the machine got
> rebooted. It looks like the rebalance operation is not interrupt-tolerant,
> and now my filesystem is corrupted.

In _theory_ btrfs operations are atomic and thus even unplug-the-running-
machine tolerant, let alone reboot tolerant.  However, in _both_ theory 
and practice, btrfs is still not fully stable and mature yet, and bugs 
negatively affect the operation of the theory above...

In theory rebalance simply moves big chunks of data/metadata around, and 
if interrupted, all addresses will either point to the new location, for 
previously balanced chunks, or the old location, for those not yet 
balanced and for the one that was being processed at the time of the 
reboot.

And a balance definitely can and normally does pick up where it left off 
after a reboot.

But...

> I see a lot of checksum errors, but as I use RAID most of these errors
> got fixed. I started a scrub operation to find/fix all the problems, but
> the scrub got cancelled at the very beginning. I see the following
> error in the kernel logs: "(device sdb): run_one_delayed_ref
> returned -5" and after that "(device sdb): forced readonly". What is
> that supposed to mean? I expected scrub either to fix the filesystem
> inconsistencies or to tell me which files are not recoverable, so I
> can delete them or restore the data from backup. But now I have a
> read-only filesystem and scrub refuses to recover it.

Scrub detects one kind of error, the checksum errors you mentioned, and 
fixes them in the dup/raid1/5/6/10 case, where there's either a redundant 
copy or parity information from which it can rebuild.  It does _not_, 
however, and this is the possible misconception I mentioned above, fix 
other types of filesystem inconsistency problems, unless they're a direct 
result of the checksum-validated data integrity errors it does detect and 
fix if possible.  For other errors, the kernel itself catches and fixes 
many problems on-mount, with others recoverable with the recovery mount 
option, and still others fixable using btrfs check, though AFAIK, the 
recommendation remains not to use btrfs check in --repair mode (without 
--repair it'll only report any problems it finds, not attempt to fix them) 
unless you have to, because with problems it doesn't understand it might 
make the problem worse instead of better.

Of course with btrfs' immaturity, the rule about having backups if you 
care about the data, and if you don't have backups, by definition you 
don't care about the data, applies double, but you already mentioned the 
possibility of restoring from backups, so you have that one covered. =:^)

As for the read-only, the kernel btrfs code forces a filesystem read-only 
when it detects a filesystem inconsistency that could result in further 
damage were it to continue to write to the filesystem.  Since at that 
point it's read-only, you can't damage it further by rebooting, and it's 
possible btrfs' self-healing properties will fix the problem on reboot.  
However, since it's also possible the damage is bad enough it might not 
mount at all on reboot, you might wish to take advantage of the current 
read-only state to freshen your backups while you can still access the 
filesystem.

(If you do get caught with an unmountable filesystem and stale backups, 
btrfs restore can be used to restore still readable files from the 
unmounted filesystem.  And because restore doesn't actually change the 
filesystem it's restoring from but writes restored files elsewhere, if it 
comes to that, restore is recommended before more risky interventions, 
like btrfs check in --repair mode.  I've done that a couple times when my 
backups were stale, and was quite happy with the results.  Of course that 
does mean you need space on a mounted filesystem to restore to...)

As for the problem at hand itself, I'll let those with more expertise 
address that.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: fs got readonly after "btrfs_run_delayed_refs:2783: errno=-5 IO failure"

2015-07-29 Thread Anand Jain


Hi,


> I see a lot of checksum errors, but as I use RAID most of these errors
> got fixed. I started a scrub operation to find/fix all the problems, but
> the scrub got cancelled at the very beginning. I see the
> following error in the kernel logs: "(device sdb):
> run_one_delayed_ref returned -5" and after that "(device sdb): forced
> readonly".


  Are you using the mount -o degraded option? If not, could you please try it?

Thanks, Anand


Re: [PATCH][RESEND] btrfs: fix search key advancing condition

2015-07-29 Thread Naohiro Aota
Hello, list.

Could anyone take a look at this? I believe this is an issue that slows
down ioctl(BTRFS_IOC_TREE_SEARCH) when the target key is missing.

On Tue, Jun 30, 2015 at 11:25 AM, Naohiro Aota  wrote:
> The search key advancing condition used in copy_to_sk() is too loose. It can
> advance the key even if it has reached sk->max_*: e.g. when the max key =
> (512, 1024, -1) and the current key = (512, 1025, 10), it increments the
> offset by 1 and continues a hopeless search from (512, 1025, 11). This issue
> makes the ioctl() take an unexpectedly long time, scanning all the leaf
> blocks one by one.
>
> This commit fixes the problem by using the standard way of comparing keys:
> btrfs_comp_cpu_keys()
>
> Signed-off-by: Naohiro Aota 
> ---
>  fs/btrfs/ioctl.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 1c22c65..07dc01d 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1932,6 +1932,7 @@ static noinline int copy_to_sk(struct btrfs_root *root,
> u64 found_transid;
> struct extent_buffer *leaf;
> struct btrfs_ioctl_search_header sh;
> +   struct btrfs_key test;
> unsigned long item_off;
> unsigned long item_len;
> int nritems;
> @@ -2015,12 +2016,17 @@ static noinline int copy_to_sk(struct btrfs_root *root,
> }
>  advance_key:
> ret = 0;
> -   if (key->offset < (u64)-1 && key->offset < sk->max_offset)
> +   test.objectid = sk->max_objectid;
> +   test.type = sk->max_type;
> +   test.offset = sk->max_offset;
> +   if (btrfs_comp_cpu_keys(key, &test) >= 0)
> +   ret = 1;
> +   else if (key->offset < (u64)-1)
> key->offset++;
> -   else if (key->type < (u8)-1 && key->type < sk->max_type) {
> +   else if (key->type < (u8)-1) {
> key->offset = 0;
> key->type++;
> -   } else if (key->objectid < (u64)-1 && key->objectid < sk->max_objectid) {
> +   } else if (key->objectid < (u64)-1) {
> key->offset = 0;
> key->type = 0;
> key->objectid++;
> --
> 2.4.4
>
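
For readers who want to try the advancing logic outside the kernel, here is
a small standalone sketch of the approach in the quoted patch. It is an
approximation, not the patch itself: struct key and both functions are
simplified stand-ins (comp_keys() plays the role of btrfs_comp_cpu_keys()),
and the example values below use types that fit in a uint8_t field.

```c
#include <stdint.h>

/* Simplified stand-in for a btrfs key in CPU byte order. */
struct key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Compare keys as (objectid, type, offset) tuples: returns <0, 0 or >0. */
int comp_keys(const struct key *a, const struct key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/*
 * Advance *key to the next possible key. Returns 1 when the search is
 * finished (the key is already at or past max, or no larger key exists)
 * and 0 when *key was advanced.
 */
int advance_key(struct key *key, const struct key *max)
{
	if (comp_keys(key, max) >= 0)
		return 1;

	if (key->offset < UINT64_MAX) {
		key->offset++;
	} else if (key->type < UINT8_MAX) {
		key->offset = 0;
		key->type++;
	} else if (key->objectid < UINT64_MAX) {
		key->offset = 0;
		key->type = 0;
		key->objectid++;
	} else {
		return 1;
	}
	return 0;
}
```

With max = (512, 84, UINT64_MAX) and a current key of (512, 85, 10),
advance_key() reports that the search is finished instead of moving on to
(512, 85, 11), which is the case the patch description is concerned with.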