date:20150814

RE: trim not working and irreparable errors from btrfsck

2015-08-14 Thread Paul Jones

 -Original Message-
 From: linux-btrfs-ow...@vger.kernel.org [mailto:linux-btrfs-ow...@vger.kernel.org] On Behalf Of Marc Joliet
 Sent: Friday, 14 August 2015 6:06 PM
 To: linux-btrfs@vger.kernel.org
 Subject: Re: trim not working and irreparable errors from btrfsck

 Am Thu, 13 Aug 2015 17:14:36 -0600
 schrieb Chris Murphy li...@colorremedies.com:

  Right now I think there's no status because a.) no bug report and b.)
  not enough information.

 I was mainly asking because apparently there *is* a patch that helps some
 people affected by this, but nobody ever commented on it.  Perhaps there's
 a reason for that, but I found it curious.  (I see now that it was submitted
 in early January, in the thread "[PATCH V2] Btrfs: really fix trim 0 bytes
 after a device delete".)

 I can open a bug (I mean, that's part of being a user of btrfs at this
 stage), I'm just surprised that nobody else has.

I have to use that patch on one of my systems. I just assumed it was never 
merged because it wasn't quite ready yet. It seems to work fine for me though.

Paul.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v5 1/3] xfstests: btrfs: add functions to create dm-error device

2015-08-14 Thread Eryu Guan

On Fri, Aug 14, 2015 at 06:47:02PM +0800, Anand Jain wrote:
 From: Anand Jain anand.j...@oracle.com
 
 Controlled EIO from the device is achieved using the dm device.
 Helper functions are at common/dmerror.
 
 Broadly steps will include calling _init_dmerror().
 _init_dmerror() will use SCRATCH_DEV to create dm linear device and assign
 DMERROR_DEV to /dev/mapper/error-test.
 
 When the test script is ready to get EIO, the test case can call
 _load_dmerror_table(), which loads the dm error table so that
 reading DMERROR_DEV will cause EIO. After the test case is
 complete, cleanup must be done by calling _cleanup_dmerror().
 
 Signed-off-by: Anand Jain anand.j...@oracle.com
 Reviewed-by: Filipe Manana fdman...@suse.com
 ---
 v4-v5: No Change. keep up with the patch set
 v3-v4: rebase on latest xfstests code
 v2.1-v3: accepts Filipe Manana's review comments, thanks
 v2-v2.1: fixed missed typo error fixup in the commit.
 v1-v2: accepts Dave Chinner's review comments, thanks
  common/dmerror | 69 ++
  common/rc  |  9 
  2 files changed, 78 insertions(+)
  create mode 100644 common/dmerror
 
 diff --git a/common/dmerror b/common/dmerror
 new file mode 100644
 index 000..f895d90
 --- /dev/null
 +++ b/common/dmerror
 @@ -0,0 +1,69 @@
 +##/bin/bash
 +#
 +# Copyright (c) 2015 Oracle.  All Rights Reserved.
 +#
 +# This program is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU General Public License as
 +# published by the Free Software Foundation.
 +#
 +# This program is distributed in the hope that it would be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +# GNU General Public License for more details.
 +#
 +# You should have received a copy of the GNU General Public License
 +# along with this program; if not, write the Free Software Foundation,
 +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
 +#
 +#
 +# common functions for setting up and tearing down a dmerror device
 +
 +_init_dmerror()
 +{
 + $DMSETUP_PROG remove error-test > /dev/null 2>&1
 +
 + local BLK_DEV_SIZE=`blockdev --getsz $SCRATCH_DEV`
 +
 + DMERROR_DEV='/dev/mapper/error-test'
 +
 + DMLINEAR_TABLE="0 $BLK_DEV_SIZE linear $SCRATCH_DEV 0"
 +
 + $DMSETUP_PROG create error-test --table "$DMLINEAR_TABLE" || \
 + _fatal "failed to create dm linear device"
 +
 + DMERROR_TABLE="0 $BLK_DEV_SIZE error $SCRATCH_DEV 0"
 +}
 +
 +_scratch_mkfs_dmerror()
 +{
 + $MKFS_BTRFS_PROG $* $DMERROR_DEV >> $seqres.full 2>&1 || \
 + _fatal "failed to create mkfs.btrfs $* $DMERROR_DEV"

I didn't follow previous reviews, please correct me if I miss anything.

Is dmerror only for btrfs testing? I saw $MKFS_BTRFS_PROG here. And do
we need $MKFS_OPTIONS too?

 +}
 +
 +_mount_dmerror()
 +{
 + mount -t $FSTYP $MOUNT_OPTIONS $DMERROR_DEV $SCRATCH_MNT

$MOUNT_PROG ?

 +}
 +
 +_unmount_dmerror()
 +{
 + $UMOUNT_PROGS $SCRATCH_MNT

$UMOUNT_PROG, no S at the end.

Thanks,
Eryu

 +}
 +
 +_cleanup_dmerror()
 +{
 + $UMOUNT_PROG $SCRATCH_MNT > /dev/null 2>&1
 + $DMSETUP_PROG remove error-test > /dev/null 2>&1
 +}
 +
 +_load_dmerror_table()
 +{
 + $DMSETUP_PROG suspend error-test
 + [ $? -ne 0 ] && _fatal "failed to suspend error-test"
 +
 + $DMSETUP_PROG load error-test --table "$DMERROR_TABLE"
 + [ $? -ne 0 ] && _fatal "failed to load error table error-test"
 +
 + $DMSETUP_PROG resume error-test
 + [ $? -ne 0 ] && _fatal "failed to resume error-test"
 +}
 diff --git a/common/rc b/common/rc
 index 70d2fa8..8d4da0e 100644
 --- a/common/rc
 +++ b/common/rc
 @@ -1337,6 +1337,15 @@ _require_sane_bdev_flush()
   fi
  }
  
 +# this test requires the device mapper error target
 +#
 +_require_dmerror()
 +{
 + _require_command $DMSETUP_PROG dmsetup
 + $DMSETUP_PROG targets | grep error > /dev/null 2>&1
 + [ $? -ne 0 ] && _notrun "This test requires dm error support"
 +}
 +
  # this test requires the device mapper flakey target
  #
  _require_dm_flakey()
 -- 
 2.4.1
 
 --
 To unsubscribe from this list: send the line unsubscribe fstests in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

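The two device-mapper tables that _init_dmerror() prepares in the patch above differ only in the target name. A minimal sketch of how those table strings are put together; build_dm_table and /dev/sdX are hypothetical names used for illustration, and the sector count is hard-coded here instead of coming from `blockdev --getsz $SCRATCH_DEV`, so that the sketch runs without a block device:

```shell
# Hypothetical helper, not part of the patch: build a device-mapper table
# line of the form "<start> <length> <target> <device> <offset>".
build_dm_table()
{
	local target=$1 dev=$2 sectors=$3
	echo "0 $sectors $target $dev 0"
}

# Assumption: a fixed size in 512-byte sectors; the patch reads the real
# size from the scratch device.
sectors=2048
linear_table=$(build_dm_table linear /dev/sdX "$sectors")
error_table=$(build_dm_table error /dev/sdX "$sectors")

echo "$linear_table"
echo "$error_table"
```

In the patch, the linear table is passed to `dmsetup create` at init time, and the error table is loaded later with suspend, load and resume, so the device name seen by the filesystem stays the same while its behaviour changes.
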
Re: [PATCH v5 2/3] xfstests: btrfs: test device replace, with EIO on the src dev

2015-08-14 Thread Eryu Guan

On Fri, Aug 14, 2015 at 06:47:03PM +0800, Anand Jain wrote:
 From: Anand Jain anand.j...@oracle.com
 
 This test case will test to confirm the replace works with
 the failed (EIO) replacing source device. EIO condition is
 achieved using the DM device.
 
 Signed-off-by: Anand Jain anand.j...@oracle.com
 Reviewed-by: Filipe Manana fdman...@suse.com
 ---
 v4-v5: rebase on latest xfstests code and accepts Filipe comment
 v3-v4: rebase on latest xfstests code
 v2-v3: accepts Filipe Manana's review comments, thanks
 v1-v2: accepts Dave Chinner's review comments, thanks
  tests/btrfs/098 | 81 +
  tests/btrfs/098.out | 11 
  tests/btrfs/group   |  1 +
  3 files changed, 93 insertions(+)
  create mode 100755 tests/btrfs/098
  create mode 100644 tests/btrfs/098.out
 
 diff --git a/tests/btrfs/098 b/tests/btrfs/098
 new file mode 100755
 index 000..afb41d1
 --- /dev/null
 +++ b/tests/btrfs/098
 @@ -0,0 +1,81 @@
 +#! /bin/bash
 +# FS QA Test No. btrfs/098
 +#
 +#test device replace works when the source device has EIO

Nitpick here, need a space after # :)

 +#
 +# Copyright (c) 2015 Oracle.  All Rights Reserved.
 +#
 +# This program is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU General Public License as
 +# published by the Free Software Foundation.
 +#
 +# This program is distributed in the hope that it would be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +# GNU General Public License for more details.
 +#
 +# You should have received a copy of the GNU General Public License
 +# along with this program; if not, write the Free Software Foundation,
 +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
 +#
 +
 +seq=`basename $0`
 +seqres=$RESULT_DIR/$seq
 +echo QA output created by $seq
 +
 +here=`pwd`
 +tmp=/tmp/$$
 +
 +status=1 # failure is the default!
 +trap "_cleanup; exit \$status" 0 1 2 3 15
 +
 +
 +_cleanup()
 +{
 + _cleanup_dmerror
 + rm -f $tmp

should be rm -f $tmp.* as many functions in common/rc and check create
tmp files like $tmp.xxx

Thanks,
Eryu

 +}
 +
 +# get standard environment, filters and checks
 +. ./common/rc
 +. ./common/filter
 +. ./common/filter.btrfs
 +. ./common/dmerror
 +
 +_supported_fs btrfs
 +_supported_os Linux
 +_need_to_be_root
 +_require_scratch_dev_pool 3
 +_require_dmerror
 +
 +rm -f $seqres.full
 +
 +dev1=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $2}'`
 +dev2=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $3}'`
 +
 +_init_dmerror
 +_scratch_mkfs_dmerror -f -d raid1 -m raid1 $dev1
 +_mount_dmerror
 +
 +_run_btrfs_util_prog filesystem show -m $SCRATCH_MNT
 +$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT | _filter_btrfs_filesystem_show
 +
 +error_devid=`$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT |\
 + egrep $DMERROR_DEV | $AWK_PROG '{print $2}'`
 +
 +snapshot_cmd="$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT"
 +snapshot_cmd="$snapshot_cmd $SCRATCH_MNT/snap_\`date +'%H_%M_%S_%N'\`"
 +run_check $FSSTRESS_PROG -d $SCRATCH_MNT -n 200 -p 8 $FSSTRESS_AVOID -x \
 + $snapshot_cmd -X 50 
 /dev/null
 +
 +# now load the error into the DMERROR_DEV
 +_load_dmerror_table
 +
 +_run_btrfs_util_prog replace start -B $error_devid $dev2 $SCRATCH_MNT
 +
 +_run_btrfs_util_prog filesystem show -m $SCRATCH_MNT
 +$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT | _filter_btrfs_filesystem_show
 +
 +echo === device replace completed
 +
 +status=0; exit
 diff --git a/tests/btrfs/098.out b/tests/btrfs/098.out
 new file mode 100644
 index 000..eb2f87f
 --- /dev/null
 +++ b/tests/btrfs/098.out
 @@ -0,0 +1,11 @@
 +QA output created by 098
 +Label: none  uuid:  UUID
 + Total devices NUM FS bytes used SIZE
 + devid DEVID size SIZE used SIZE path SCRATCH_DEV
 + devid DEVID size SIZE used SIZE path /dev/mapper/error-test
 +
 +Label: none  uuid:  UUID
 + Total devices NUM FS bytes used SIZE
 + devid DEVID size SIZE used SIZE path SCRATCH_DEV
 +
 +=== device replace completed
 diff --git a/tests/btrfs/group b/tests/btrfs/group
 index e13865a..c8a53b5 100644
 --- a/tests/btrfs/group
 +++ b/tests/btrfs/group
 @@ -100,3 +100,4 @@
  095 auto quick metadata
  096 auto quick clone
  097 auto quick send clone
 +098 auto quick replace
 -- 
 2.4.1
 

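The test above derives error_devid by filtering `btrfs filesystem show -m` output with egrep and awk. A minimal sketch of the same lookup done in a single awk call; devid_for_path is a hypothetical name, and the sample text assumes the usual "devid N size ... path DEV" column layout rather than real output captured from the test:

```shell
# Hypothetical helper: print the devid whose path column matches $1.
# Reads "btrfs filesystem show" style output on stdin.
devid_for_path()
{
	awk -v dev="$1" '$1 == "devid" && $NF == dev { print $2 }'
}

# Assumed sample, modelled on the output format seen on this list.
sample='Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
	Total devices 2 FS bytes used 109.56GiB
	devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
	devid    4 size 1.82TiB used 63.03GiB path /dev/drbd3'

printf '%s\n' "$sample" | devid_for_path /dev/drbd3
```

Matching the last field exactly avoids the partial matches a plain egrep on the device name could produce, for example /dev/sdb also matching /dev/sdb1.
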
Re: [PATCH v5 3/3] xfstests: btrfs: test device delete with EIO on src dev

2015-08-14 Thread Eryu Guan

On Fri, Aug 14, 2015 at 06:47:04PM +0800, Anand Jain wrote:
 From: Anand Jain anand.j...@oracle.com
 
 This test case tests if the device delete works with
 the failed (EIO) source device. EIO errors are achieved
 using the DM device.
 
 This test would need following btrfs-progs and btrfs
 kernel patch
btrfs-progs: device delete to accept devid
Btrfs: device delete by devid
 
 However when btrfs-progs patch is not found this test will
 not run, and when kernel patch is not found btrfs-progs
 will fail gracefully and thus the test script.
 
 Signed-off-by: Anand Jain anand.j...@oracle.com
 ---
 v4-v5: rebase on latest xfstests code, and accepts Filipe comment
 v3-v4: rebase on latest xfstests code
 v2-v3: accepts Filipe Manana's review comments, thanks
 v1-v2: accepts Dave Chinner's review comments, thanks
  common/rc   |  7 +
  tests/btrfs/099 | 82 +
  tests/btrfs/099.out | 11 +++
  tests/btrfs/group   |  1 +
  4 files changed, 101 insertions(+)
  create mode 100755 tests/btrfs/099
  create mode 100644 tests/btrfs/099.out
 
 diff --git a/common/rc b/common/rc
 index 8d4da0e..31a0328 100644
 --- a/common/rc
 +++ b/common/rc
 @@ -2737,6 +2737,13 @@ _require_meta_uuid()
   umount $SCRATCH_MNT
  }
  
 +_require_btrfs_dev_del_by_devid()
 +{
 + $BTRFS_UTIL_PROG device delete --help | egrep devid > /dev/null 2>&1
 + [ $? -eq 0 ] || _notrun "$BTRFS_UTIL_PROG too old "\
 + "(must support 'btrfs device delete <devid> /mnt')"
 +}
 +
  _get_total_inode()
  {
   if [ -z $1 ]; then
 diff --git a/tests/btrfs/099 b/tests/btrfs/099
 new file mode 100755
 index 000..4464e24
 --- /dev/null
 +++ b/tests/btrfs/099
 @@ -0,0 +1,82 @@
 +#! /bin/bash
 +# FS QA Test No. btrfs/099
 +#
 +# test device delete when the source device has EIO
 +#
 +# Copyright (c) 2015 Oracle.  All Rights Reserved.
 +#
 +# This program is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU General Public License as
 +# published by the Free Software Foundation.
 +#
 +# This program is distributed in the hope that it would be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +# GNU General Public License for more details.
 +#
 +# You should have received a copy of the GNU General Public License
 +# along with this program; if not, write the Free Software Foundation,
 +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
 +#
 +
 +seq=`basename $0`
 +seqres=$RESULT_DIR/$seq
 +echo QA output created by $seq
 +
 +here=`pwd`
 +tmp=/tmp/$$
 +
 +status=1 # failure is the default!
 +trap "_cleanup; exit \$status" 0 1 2 3 15
 +
 +
 +_cleanup()
 +{
 + _cleanup_dmerror
 + rm -f $tmp

And here too, rm -f $tmp.*

Thanks,
Eryu

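The review comment about `rm -f $tmp.*` can be checked with a small experiment. This is a sketch under the assumption that helpers create files named $tmp.<suffix>; the suffixes and the scratch directory below are made up for the demonstration:

```shell
# Work in a private directory so the demo cannot touch real files.
workdir=$(mktemp -d)
tmp=$workdir/$$

# Assumption: helpers create files named $tmp.<suffix>, never $tmp itself.
touch "$tmp.mkfs" "$tmp.out"

# Removing $tmp alone leaves both files behind.
rm -f "$tmp"
left_after_plain=$(ls "$workdir" | wc -l | tr -d ' ')

# The glob form removes them.
rm -f "$tmp".*
left_after_glob=$(ls "$workdir" | wc -l | tr -d ' ')

echo "after rm -f \$tmp:   $left_after_plain file(s) left"
echo "after rm -f \$tmp.*: $left_after_glob file(s) left"
rmdir "$workdir"
```
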
Re: [PATCH v5 2/3] xfstests: btrfs: test device replace, with EIO on the src dev

2015-08-14 Thread Filipe David Manana

On Fri, Aug 14, 2015 at 11:47 AM, Anand Jain anand.j...@oracle.com wrote:
 From: Anand Jain anand.j...@oracle.com

 This test case will test to confirm the replace works with
 the failed (EIO) replacing source device. EIO condition is
 achieved using the DM device.

 Signed-off-by: Anand Jain anand.j...@oracle.com
 Reviewed-by: Filipe Manana fdman...@suse.com
 ---
 v4-v5: rebase on latest xfstests code and accepts Filipe comment
 v3-v4: rebase on latest xfstests code
 v2-v3: accepts Filipe Manana's review comments, thanks
 v1-v2: accepts Dave Chinner's review comments, thanks
  tests/btrfs/098 | 81 +
  tests/btrfs/098.out | 11 
  tests/btrfs/group   |  1 +
  3 files changed, 93 insertions(+)
  create mode 100755 tests/btrfs/098
  create mode 100644 tests/btrfs/098.out

 diff --git a/tests/btrfs/098 b/tests/btrfs/098
 new file mode 100755
 index 000..afb41d1
 --- /dev/null
 +++ b/tests/btrfs/098
 @@ -0,0 +1,81 @@
 +#! /bin/bash
 +# FS QA Test No. btrfs/098
 +#
 +#test device replace works when the source device has EIO
 +#
 +# Copyright (c) 2015 Oracle.  All Rights Reserved.
 +#
 +# This program is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU General Public License as
 +# published by the Free Software Foundation.
 +#
 +# This program is distributed in the hope that it would be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +# GNU General Public License for more details.
 +#
 +# You should have received a copy of the GNU General Public License
 +# along with this program; if not, write the Free Software Foundation,
 +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
 +#
 +
 +seq=`basename $0`
 +seqres=$RESULT_DIR/$seq
 +echo QA output created by $seq
 +
 +here=`pwd`
 +tmp=/tmp/$$
 +
 +status=1   # failure is the default!
 +trap "_cleanup; exit \$status" 0 1 2 3 15
 +
 +
 +_cleanup()
 +{
 +   _cleanup_dmerror
 +   rm -f $tmp
 +}
 +
 +# get standard environment, filters and checks
 +. ./common/rc
 +. ./common/filter
 +. ./common/filter.btrfs
 +. ./common/dmerror
 +
 +_supported_fs btrfs
 +_supported_os Linux
 +_need_to_be_root
 +_require_scratch_dev_pool 3
 +_require_dmerror
 +
 +rm -f $seqres.full
 +
 +dev1=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $2}'`
 +dev2=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $3}'`
 +
 +_init_dmerror
 +_scratch_mkfs_dmerror -f -d raid1 -m raid1 $dev1
 +_mount_dmerror
 +
 +_run_btrfs_util_prog filesystem show -m $SCRATCH_MNT
 +$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT | _filter_btrfs_filesystem_show
 +
 +error_devid=`$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT |\
 +   egrep $DMERROR_DEV | $AWK_PROG '{print $2}'`
 +
 +snapshot_cmd="$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT"
 +snapshot_cmd="$snapshot_cmd $SCRATCH_MNT/snap_\`date +'%H_%M_%S_%N'\`"
 +run_check $FSSTRESS_PROG -d $SCRATCH_MNT -n 200 -p 8 $FSSTRESS_AVOID -x \
 +   $snapshot_cmd -X 50 
 /dev/null

Sorry missed this before, but you don't need to redirect stdout/stderr
to /dev/null.
run_check redirects them to $seqres.full where it's actually useful -
when we have the test failing, we can check $seqres.full to see what
seed fsstress used (fsstress prints it to stdout/stderr). That's for
the case where it's failing only for some seeds of course.

Same observation applies to the other test/patch.

Thanks.

 +
 +# now load the error into the DMERROR_DEV
 +_load_dmerror_table
 +
 +_run_btrfs_util_prog replace start -B $error_devid $dev2 $SCRATCH_MNT
 +
 +_run_btrfs_util_prog filesystem show -m $SCRATCH_MNT
 +$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT | _filter_btrfs_filesystem_show
 +
 +echo === device replace completed
 +
 +status=0; exit
 diff --git a/tests/btrfs/098.out b/tests/btrfs/098.out
 new file mode 100644
 index 000..eb2f87f
 --- /dev/null
 +++ b/tests/btrfs/098.out
 @@ -0,0 +1,11 @@
 +QA output created by 098
 +Label: none  uuid:  UUID
 +   Total devices NUM FS bytes used SIZE
 +   devid DEVID size SIZE used SIZE path SCRATCH_DEV
 +   devid DEVID size SIZE used SIZE path /dev/mapper/error-test
 +
 +Label: none  uuid:  UUID
 +   Total devices NUM FS bytes used SIZE
 +   devid DEVID size SIZE used SIZE path SCRATCH_DEV
 +
 +=== device replace completed
 diff --git a/tests/btrfs/group b/tests/btrfs/group
 index e13865a..c8a53b5 100644
 --- a/tests/btrfs/group
 +++ b/tests/btrfs/group
 @@ -100,3 +100,4 @@
  095 auto quick metadata
  096 auto quick clone
  097 auto quick send clone
 +098 auto quick replace
 --
 2.4.1




-- 
Filipe David Manana,

Reasonable men adapt themselves to the

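The point about keeping fsstress output in $seqres.full can be illustrated with a stand-in wrapper. This is only a sketch: log_and_run is a hypothetical function, the real run_check in common/rc may differ in detail, and an echo stands in for fsstress printing its seed:

```shell
# Hypothetical stand-in for a run_check-style wrapper: record the command
# line and everything it prints in a log file, and report failures.
log_and_run()
{
	local log=$1
	shift
	echo "# $*" >> "$log"
	"$@" >> "$log" 2>&1 || { echo "failed: $*" >&2; return 1; }
}

full=$(mktemp)

# Stand-in for fsstress, which prints the seed it used.
log_and_run "$full" echo "seed = 12345"

# The seed is still available after the run, which is what a redirect
# to /dev/null would throw away.
cat "$full"
```

With the output kept in the log, a failure that only shows up for some seeds can be rerun with the seed recorded there.
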
Re: RAID0 wrong (raw) device?

2015-08-14 Thread Austin S Hemmelgarn


On 2015-08-13 19:29, Gareth Pye wrote:

I would have been surprised if any generic file system copes well with
being mounted in several locations at once, DRBD appears to fight
really hard to avoid that happening :)

And yeah I'm doing the second thing, I've successfully switched which
of the servers is active a few times with no ill effect (I would
expect scrub to give me some significant warnings if one of the disks
was a couple of months out of date) so I'm presuming that DRBD copes
reasonably well or I've been very lucky. Either that luck is very
deterministic, DRBD copes correctly, or I've been very very lucky.

Very very lucky doesn't sound likely.

Yeah, I'd be willing to bet that DRBD does cope well with direct writes 
to the backing store (either that or it prevents the kernel from doing 
that, which would be even better and would not surprise me at all).  In 
my experience it's one of the most resilient shared storage options out 
there.






Re: RAID0 wrong (raw) device?

2015-08-14 Thread Ulli Horlacher

On Fri 2015-08-14 (00:24), Anand Jain wrote:

  root@toy02:~# btrfs filesystem show
  Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
   Total devices 2 FS bytes used 106.51GiB
   devid    3 size 1.82TiB used 82.03GiB path /dev/drbd2
   devid    4 size 1.82TiB used 82.03GiB path /dev/drbd3
 
  And now, after a reboot:
 
  root@toy02:~/bin# btrfs filesystem show
  Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
   Total devices 2 FS bytes used 119.82GiB
   devid    3 size 1.82TiB used 82.03GiB path /dev/drbd2
   devid    4 size 1.82TiB used 82.03GiB path /dev/sde
 
  GRMPF!
 
 Please use 'btrfs fi show -m' and ignore the output given with no option or
 with -d while the fs is mounted, as -m reads from the kernel.

There is now a new behaviour: right after the btrfs mount, I briefly see the
wrong raw device /dev/sde, and a few seconds later the correct /dev/drbd3:


root@toy02:/etc# umount /data
root@toy02:/etc# mount /data
root@toy02:/etc# btrfs filesystem show
Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
Total devices 2 FS bytes used 109.56GiB
devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
devid    4 size 1.82TiB used 63.03GiB path /dev/sde

Btrfs v3.12
root@toy02:/etc# btrfs filesystem show
Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
Total devices 2 FS bytes used 109.56GiB
devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
devid    4 size 1.82TiB used 63.03GiB path /dev/drbd3

Btrfs v3.12

root@toy02:/etc# btrfs filesystem show -m
Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
Total devices 2 FS bytes used 109.56GiB
devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
devid    4 size 1.82TiB used 63.03GiB path /dev/drbd3

Btrfs v3.12


Still, the kernel sees 3 instead of (really) 2 HGST drives:

root@toy02:/etc# hdparm -I /dev/sdb | grep Number:
Model Number:   HGST HUS724020ALA640
Serial Number:  PN2134P5G2P2AX

root@toy02:/etc# hdparm -I /dev/sde | grep Number:
Model Number:   HGST HUS724020ALA640
Serial Number:  PN2134P5G2P2AX

root@toy02:/etc# hdparm -I /dev/sdd | grep Number:
Model Number:   HGST HUS724020ALA640
Serial Number:  PN2134P5G2P2XX

-- 
Ullrich Horlacher  Informationssysteme und Serverbetrieb
IZUS/TIK   E-Mail: horlac...@rus.uni-stuttgart.de
Universitaet Stuttgart Tel:++49-711-68565868
Allmandring 30aFax:++49-711-682357
70550 Stuttgart (Germany)  WWW:http://www.tik.uni-stuttgart.de/
REF:55ccc4ab.2080...@oracle.com

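The hdparm output above shows one serial number under two device names. A minimal sketch for spotting that automatically; find_shared_serials is a hypothetical helper, and the device/serial pairs are fed in by hand here, since collecting them with `hdparm -I` needs root and real disks:

```shell
# Hypothetical helper: print every serial number that appears under more
# than one device name. Input: lines of "<device> <serial>" on stdin.
find_shared_serials()
{
	awk '{
		devs[$2] = (devs[$2] == "" ? $1 : devs[$2] " " $1)
		n[$2]++
	}
	END {
		for (s in n)
			if (n[s] > 1)
				print s ": " devs[s]
	}'
}

# Values copied from the hdparm output in the message above.
printf '%s\n' \
	'/dev/sdb PN2134P5G2P2AX' \
	'/dev/sde PN2134P5G2P2AX' \
	'/dev/sdd PN2134P5G2P2XX' | find_shared_serials
```

For this input the sketch reports that /dev/sdb and /dev/sde share a serial, which matches the observation that the kernel lists three names for two physical drives.
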
re: Btrfs: don't start the log transaction if the log tree init fails

2015-08-14 Thread Dan Carpenter

Hello Miao Xie,

The patch e87ac1368700: "Btrfs: don't start the log transaction if
the log tree init fails" from Feb 20, 2014, leads to the following
static checker warning:

fs/btrfs/tree-log.c:178 start_log_trans()
warn: we tested 'root->log_root' before and it was 'false'

fs/btrfs/tree-log.c
   147		if (root->log_root) {

We test root->log_root here.

   148			if (btrfs_need_log_full_commit(root->fs_info, trans)) {
   149				ret = -EAGAIN;
   150				goto out;
   151			}
   152			if (!root->log_start_pid) {
   153				root->log_start_pid = current->pid;
   154				clear_bit(BTRFS_ROOT_MULTI_LOG_TASKS, &root->state);
   155			} else if (root->log_start_pid != current->pid) {
   156				set_bit(BTRFS_ROOT_MULTI_LOG_TASKS, &root->state);
   157			}
   158
   159			atomic_inc(&root->log_batch);
   160			atomic_inc(&root->log_writers);
   161			if (ctx) {
   162				index = root->log_transid % 2;
   163				list_add_tail(&ctx->list, &root->log_ctxs[index]);
   164				ctx->log_transid = root->log_transid;
   165			}
   166			mutex_unlock(&root->log_mutex);
   167			return 0;
   168		}
   169
   170		ret = 0;
   171		mutex_lock(&root->fs_info->tree_log_mutex);
   172		if (!root->fs_info->log_root_tree)
   173			ret = btrfs_init_log_root_tree(trans, root->fs_info);
   174		mutex_unlock(&root->fs_info->tree_log_mutex);
   175		if (ret)
   176			goto out;
   177
   178		if (!root->log_root) {

Couldn't we just remove this condition here?  This is a new Smatch thing
I am working on and I am investigating false positives.

   179			ret = btrfs_add_log_tree(trans, root);
   180			if (ret)
   181				goto out;
   182		}

regards,
dan carpenter

Re: trim not working and irreparable errors from btrfsck

2015-08-14 Thread Marc Joliet

Am Fri, 14 Aug 2015 10:05:55 +0200
schrieb Marc Joliet mar...@gmx.de:

 (I mean, that's part of being a user of btrfs at this stage)

I meant *being prepared* to file a bug report, not that one constantly has to
file bug reports :) .

-- 
Marc Joliet
--
People who think they know everything really annoy those of us who know we
don't - Bjarne Stroustrup



Re: trim not working and irreparable errors from btrfsck

2015-08-14 Thread Marc Joliet

Am Thu, 13 Aug 2015 17:14:36 -0600
schrieb Chris Murphy li...@colorremedies.com:

 On Thu, Aug 13, 2015 at 3:23 AM, Marc Joliet mar...@gmx.de wrote:
 
  Speaking as a user, since fstrim -av still always outputs 0 bytes trimmed
  on my system: what's the status of this?  Did anybody ever file a bug 
  report?
 
 Since I'm not having this problem with my SSD, I'm not in a position
 to provide any meaningful information for such a report.
 
 The bug report should state whether this problem is reproducible with ext4 and XFS
 on the same device, and the complete details of the stacking (if this
 is not the full device or partition of it; e.g. if LVM, md, or
 encryption is between fs and physical device). And also the bug should
 include full dmesg as attachment, and strace of the fstrim command
 that results in 0 bytes trimmed. And probably separate bugs for each
 make/model of SSD, with the bug including make/model and firmware
 version.
 
 Right now I think there's no status because a.) no bug report and b.)
 not enough information.

I was mainly asking because apparently there *is* a patch that helps some
people affected by this, but nobody ever commented on it.  Perhaps there's a
reason for that, but I found it curious.  (I see now that it was submitted in
early January, in the thread [PATCH V2] Btrfs: really fix trim 0 bytes after a
device delete.)

I can open a bug (I mean, that's part of being a user of btrfs at this stage),
I'm just surprised that nobody else has.

BTW, is there a way to tell if the discard mount option does anything?  I'm
curious about whether it could behave differently.

-- 
Marc Joliet
--
People who think they know everything really annoy those of us who know we
don't - Bjarne Stroustrup



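Several messages in this thread hinge on whether fstrim reports 0 bytes. A minimal sketch for pulling the byte count out of verbose fstrim output; trimmed_bytes is a hypothetical helper, the line format "<mount point>: <size> (<N> bytes) trimmed" is assumed from the "0 B (0 bytes) trimmed" quoted in the thread, and the /boot line with its numbers is made up:

```shell
# Hypothetical helper: extract the byte count from lines such as
#   "/: 0 B (0 bytes) trimmed"
trimmed_bytes()
{
	sed -n 's/.*(\([0-9][0-9]*\) bytes) trimmed.*/\1/p'
}

# Assumed sample lines; on a real system they would come from "fstrim -av".
printf '%s\n' \
	'/: 0 B (0 bytes) trimmed' \
	'/boot: 120 MiB (125829120 bytes) trimmed' |
while read -r line; do
	bytes=$(printf '%s\n' "$line" | trimmed_bytes)
	if [ "$bytes" -eq 0 ]; then
		echo "nothing trimmed: $line"
	fi
done
```

A per-mount-point result like this, together with the dmesg and strace output asked for above, is the kind of detail a bug report could include.
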
Re: trim not working and irreparable errors from btrfsck

2015-08-14 Thread Jeff Mahoney

On 6/18/15 1:25 AM, Duncan wrote:
 Austin S Hemmelgarn posted on Wed, 17 Jun 2015 13:17:22 -0400 as excerpted:
 
 On 2015-06-17 11:40, Christian wrote:
 On 06/17/2015 11:28 AM, Chris Murphy wrote:
 
 
 However, fstrim still gives me 0 B (0 bytes) trimmed, so
 that may be another problem. Is there a way to check if
 trim works?
 
 That sounds like maybe your SSD is blacklisted for trim, is
 all I can think of. So trim shouldn't be the cause of the
 problem if it's being blacklisted. The recent problems appear
 to be around newer SSDs that support queue trim and newer
 kernels that issue queued trim. There have been some patches
 related to trim to the kernel, but the existence of
 blacklisting and claims of bugs in firmware make it difficult
 to test and isolate.
 
 http://techreport.com/news/28473/some-samsung-ssds-may-suffer-from-a-buggy-trim-implementation
 
 
 This is an Intel SSD in a Lenovo Thinkpad X1 Carbon. Trim
 worked until a few weeks ago and still works for my small ext4
 boot partition (just ran it to check). I will keep looking for
 a solution. Thanks!
 
 I'm seeing the same issue here, but with a Crucial brand SSD.
 Somewhat interestingly, I don't see any issues like this with
 BTRFS on top of LVM's thin-provisioning volumes, or with any
 other filesystems, so I think it has something to do with how
 BTRFS is reporting unused space or how it is submitting the
 discard requests.
 
 FWIW, there's a current btrfs patch in progress that relates to
 problems with btrfs trim.
 
 But while I do have SSDs, I purposefully overprovisioned them by
 nearly 100% (IOW I partitioned only about 55%, the rest is entirely
 unused), so trim isn't as critical here as it is for many.  I don't
 use the discard mount option, and have a systemd timer job setup to
 automate my fstrims and don't worry about the output too much, so I
 haven't been following the patch progress /that/ closely.
 
 But I do know that recent kernel btrfs trims (either fstrim or
 discard mount option triggered) haven't been working as originally
 intended due to some bug, and this patch is supposed to fix it.
 
 I'd thus conclude that you're very likely hitting this known issue,
 and that either for 4.1 or 4.2 (again, I'm not following progress
 that closely, and don't remember for sure if it's in 4.1, altho
 I've been running the rcs since rc6 or so), the problem should be
 fixed as that patch gets into mainline.
 
 Anyone wishing to investigate further can of course check the list
 (and/ or possibly the kernel's git log) for discard/trim related
 patches and follow the progress once found.
 
 ... Actually, just checked myself.  Looks like the patches were
 first posted on March 30 @ 15:12:17 -0400 or so (that's the time
 for one of them).  There's one for the discard mount option, and
 another for FITRIM (which may or may not be a typo for FSTRIM, I'm
 not actually sure).  Jeff Mahoney je...@suse.com author.  That
 should be enough to find the threads.  And I don't see the patches
 in the late 4.1-rc I'm running so either my git log search foo is
 bad or it'll be (at least) 4.2.

It's not a typo. FITRIM is the name of the ioctl that fstrim calls.

The final version of that patch set is ready to go.  Mostly.  It
probably needs to be re-integrated now.  The reason it was delayed for
inclusion is that it makes other bugs more obvious and irrecoverable
since the data is completely gone.

I'm not sure what Chris's timeline for inclusion is.

-Jeff

-- 
Jeff Mahoney
SUSE Labs

Re: can we make balance delete missing devices?

2015-08-14 Thread Chris Murphy

On Fri, Aug 14, 2015 at 6:12 AM, Russell Coker russ...@coker.com.au wrote:
 [ 2918.502237] BTRFS info (device loop1): disk space caching is enabled
 [ 2918.503213] BTRFS: failed to read chunk tree on loop1
 [ 2918.540082] BTRFS: open_ctree failed

 I just had a test RAID-1 filesystem with a missing device.  I mounted it with
 the degraded option and added a new device.  I balanced it (to make it do
 RAID-1 again) and thought everything was good.  Then when I tried to mount it
 again it gave errors such as the above (not sure why).  Then I tried wiping
 /dev/loop1 and it refused to mount entirely due to having 2 missing devices.

 Obviously it was my mistake to not remove the missing device, and wiping
 /dev/loop1 was a bad idea.  Failing to remove a missing device seems likely to
 be a common mistake.  Could we make the balance operation automatically delete
 the missing device?  I can't imagine a situation in which a balance would be
 desired but deleting the missing device wouldn't be desired.

I think this is specious, because balance doesn't at all convey that a
missing device will be silently dropped. If a device is missing and
balancing is a bad idea, then balance should probably fail rather than
automatically delete the missing device.

The proper way to avoid this is to use btrfs replace start. Maybe device
add + device delete + balance is just an old habit that needs purging;
this is the exact use case replace was meant to address.
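
To make the difference concrete, here is a minimal dry-run sketch of the two
workflows. Nothing is executed: the run helper only prints each command. The
device names, the devid and the mount point are placeholders chosen for
illustration, not values from Russell's setup.

```shell
#!/bin/bash
# Dry run: print each command instead of executing it.
run() { echo "+ $*"; }

# Hypothetical values, for illustration only.
MNT=/mnt/test
OLD_DEV=/dev/loop0
NEW_DEV=/dev/loop2
MISSING_DEVID=2

# The old habit: add a new device, then drop the missing one.
old_habit() {
    run mount -o degraded "$OLD_DEV" "$MNT"
    run btrfs device add "$NEW_DEV" "$MNT"
    run btrfs device delete missing "$MNT"
}

# The replace workflow: swap the missing device (by devid) in one step.
with_replace() {
    run mount -o degraded "$OLD_DEV" "$MNT"
    run btrfs replace start -B "$MISSING_DEVID" "$NEW_DEV" "$MNT"
}

old_habit
with_replace
```

With replace there is no separate step that can be forgotten, which is the
mistake described in the original report.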


-- 
Chris Murphy

[PATCH v6 2/3] xfstests: btrfs: test device replace, with EIO on the src dev

2015-08-14 Thread Anand Jain

From: Anand Jain anand.j...@oracle.com

This test case confirms that replace works when the source
device being replaced is failing with EIO. The EIO condition is
produced using the DM device.

Signed-off-by: Anand Jain anand.j...@oracle.com
Reviewed-by: Filipe Manana fdman...@suse.com
---
v5-v6: accepts Eryu and Filipe's comments
v4-v5: rebase on latest xfstests code and accepts Filipe comment
v3-v4: rebase on latest xfstests code
v2-v3: accepts Filipe Manana's review comments, thanks
v1-v2: accepts Dave Chinner's review comments, thanks
 tests/btrfs/098 | 81 +
 tests/btrfs/098.out | 11 
 tests/btrfs/group   |  1 +
 3 files changed, 93 insertions(+)
 create mode 100755 tests/btrfs/098
 create mode 100644 tests/btrfs/098.out

diff --git a/tests/btrfs/098 b/tests/btrfs/098
new file mode 100755
index 000..a41ea86
--- /dev/null
+++ b/tests/btrfs/098
@@ -0,0 +1,81 @@
+#! /bin/bash
+# FS QA Test No. btrfs/098
+#
+# Test device replace works when the source device has EIO
+#
+# Copyright (c) 2015 Oracle.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo QA output created by $seq
+
+here=`pwd`
+tmp=/tmp/$$
+
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+
+_cleanup()
+{
+   _cleanup_dmerror
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/filter.btrfs
+. ./common/dmerror
+
+_supported_fs btrfs
+_supported_os Linux
+_need_to_be_root
+_require_scratch_dev_pool 3
+_require_dmerror
+
+rm -f $seqres.full
+
+dev1=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $2}'`
+dev2=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $3}'`
+
+_init_dmerror
+_scratch_mkfs_dmerror -f -d raid1 -m raid1 $dev1
+_mount_dmerror
+
+_run_btrfs_util_prog filesystem show -m $SCRATCH_MNT
+$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT | _filter_btrfs_filesystem_show
+
+error_devid=`$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT |\
+   egrep $DMERROR_DEV | $AWK_PROG '{print $2}'`
+
+snapshot_cmd="$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT"
+snapshot_cmd="$snapshot_cmd $SCRATCH_MNT/snap_\`date +'%H_%M_%S_%N'\`"
+run_check $FSSTRESS_PROG -d $SCRATCH_MNT -n 200 -p 8 $FSSTRESS_AVOID -x \
+   "$snapshot_cmd" -X 50
+
+# now load the error into the DMERROR_DEV
+_load_dmerror_table
+
+_run_btrfs_util_prog replace start -B $error_devid $dev2 $SCRATCH_MNT
+
+_run_btrfs_util_prog filesystem show -m $SCRATCH_MNT
+$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT | _filter_btrfs_filesystem_show
+
+echo === device replace completed
+
+status=0; exit
diff --git a/tests/btrfs/098.out b/tests/btrfs/098.out
new file mode 100644
index 000..eb2f87f
--- /dev/null
+++ b/tests/btrfs/098.out
@@ -0,0 +1,11 @@
+QA output created by 098
+Label: none  uuid:  UUID
+   Total devices NUM FS bytes used SIZE
+   devid DEVID size SIZE used SIZE path SCRATCH_DEV
+   devid DEVID size SIZE used SIZE path /dev/mapper/error-test
+
+Label: none  uuid:  UUID
+   Total devices NUM FS bytes used SIZE
+   devid DEVID size SIZE used SIZE path SCRATCH_DEV
+
+=== device replace completed
diff --git a/tests/btrfs/group b/tests/btrfs/group
index e13865a..c8a53b5 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -100,3 +100,4 @@
 095 auto quick metadata
 096 auto quick clone
 097 auto quick send clone
+098 auto quick replace
-- 
2.4.1


[PATCH v6 1/3] xfstests: btrfs: add functions to create dm-error device

2015-08-14 Thread Anand Jain

From: Anand Jain anand.j...@oracle.com

Controlled EIO from the device is achieved using the dm device.
Helper functions are at common/dmerror.

Broadly, the steps start with a call to _init_dmerror().
_init_dmerror() uses SCRATCH_DEV to create a dm linear device and sets
DMERROR_DEV to /dev/mapper/error-test.

When the test script is ready to get EIO, it calls
_load_dmerror_table(), which loads the dm error table so that
reading DMERROR_DEV causes EIO. After the test case is
complete, cleanup must be done by calling _cleanup_dmerror().
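
As a rough illustration of the two device-mapper tables these helpers switch
between, here is a small sketch. dm_linear_table and dm_error_table are
hypothetical names, not part of common/dmerror; the size and backing device
are made-up values, and the dmsetup commands are only printed, never run. As
far as I know the error target needs no arguments after its name, so the
sketch leaves them out.

```shell
#!/bin/bash
# usage: dm_linear_table <size-in-512-byte-sectors> <backing-device>
# Maps the whole dm device 1:1 onto the backing device.
dm_linear_table() { echo "0 $1 linear $2 0"; }

# usage: dm_error_table <size-in-512-byte-sectors>
# Every I/O to the dm device fails with EIO.
dm_error_table() { echo "0 $1 error"; }

size=2097152    # assumed: a 1 GiB device, as blockdev --getsz would report it
dev=/dev/sdX    # placeholder backing device

echo "create: dmsetup create error-test --table '$(dm_linear_table $size $dev)'"
echo "switch: dmsetup suspend error-test"
echo "        dmsetup load error-test --table '$(dm_error_table $size)'"
echo "        dmsetup resume error-test"
```

The suspend/load/resume sequence is what lets the table change underneath a
mounted filesystem.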

Signed-off-by: Anand Jain anand.j...@oracle.com
Reviewed-by: Filipe Manana fdman...@suse.com
---
v5-v6: accepts Eryu's comments
v4-v5: No Change. keep up with the patch set
v3-v4: rebase on latest xfstests code
v2.1-v3: accepts Filipe Manana's review comments, thanks
v2-v2.1: fixed missed typo error fixup in the commit.
v1-v2: accepts Dave Chinner's review comments, thanks
 common/dmerror | 69 ++
 common/rc  |  9 
 2 files changed, 78 insertions(+)
 create mode 100644 common/dmerror

diff --git a/common/dmerror b/common/dmerror
new file mode 100644
index 000..928e998
--- /dev/null
+++ b/common/dmerror
@@ -0,0 +1,69 @@
+##/bin/bash
+#
+# Copyright (c) 2015 Oracle.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#
+# common functions for setting up and tearing down a dmerror device
+
+_init_dmerror()
+{
+   $DMSETUP_PROG remove error-test > /dev/null 2>&1
+
+   local BLK_DEV_SIZE=`blockdev --getsz $SCRATCH_DEV`
+
+   DMERROR_DEV='/dev/mapper/error-test'
+
+   DMLINEAR_TABLE="0 $BLK_DEV_SIZE linear $SCRATCH_DEV 0"
+
+   $DMSETUP_PROG create error-test --table "$DMLINEAR_TABLE" || \
+   _fatal "failed to create dm linear device"
+
+   DMERROR_TABLE="0 $BLK_DEV_SIZE error $SCRATCH_DEV 0"
+}
+
+_scratch_mkfs_dmerror()
+{
+   $MKFS_BTRFS_PROG $MKFS_OPTIONS $* $DMERROR_DEV >> $seqres.full 2>&1 || \
+   _fatal "failed to create mkfs.btrfs $* $DMERROR_DEV"
+}
+
+_mount_dmerror()
+{
+   $MOUNT_PROG -t $FSTYP $MOUNT_OPTIONS $DMERROR_DEV $SCRATCH_MNT
+}
+
+_unmount_dmerror()
+{
+   $UMOUNT_PROG $SCRATCH_MNT
+}
+
+_cleanup_dmerror()
+{
+   $UMOUNT_PROG $SCRATCH_MNT > /dev/null 2>&1
+   $DMSETUP_PROG remove error-test > /dev/null 2>&1
+}
+
+_load_dmerror_table()
+{
+   $DMSETUP_PROG suspend error-test
+   [ $? -ne 0 ] && _fatal "failed to suspend error-test"
+
+   $DMSETUP_PROG load error-test --table "$DMERROR_TABLE"
+   [ $? -ne 0 ] && _fatal "failed to load error table error-test"
+
+   $DMSETUP_PROG resume error-test
+   [ $? -ne 0 ] && _fatal "failed to resume error-test"
+}
diff --git a/common/rc b/common/rc
index 70d2fa8..8d4da0e 100644
--- a/common/rc
+++ b/common/rc
@@ -1337,6 +1337,15 @@ _require_sane_bdev_flush()
fi
 }
 
+# this test requires the device mapper error target
+#
+_require_dmerror()
+{
+   _require_command $DMSETUP_PROG dmsetup
+   $DMSETUP_PROG targets | grep error > /dev/null 2>&1
+   [ $? -ne 0 ] && _notrun "This test requires dm error support"
+}
+
 # this test requires the device mapper flakey target
 #
 _require_dm_flakey()
-- 
2.4.1


The performance is not as expected when used several disks on raid0.

2015-08-14 Thread Eduardo Bach

Hi all,

This is my first email to this list, so please excuse any gaffe.

I am in the early stages of evaluating a new storage system, an SGI MIS,
currently with two LSI HBAs and 32 disks.
The HBA controllers are LSI 9207-8i and the disks are Seagate 6TB,
model ST6000NM0004-1FT17Z.

To evaluate the performance I am using IOzone on a raid0 across all
32 disks, with the parameters: iozone -i0 -i1 -t5 -s 20G  -P0.

With btrfs the result approaches 3.5GB/s. When using mdadm+xfs the
result reaches 6gb/s, which is the expected value when compared with
parallel dd made on discs.
When using btrfs with only half of the disks the result is about 3GB/s.
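
Broken down per disk, these figures look as follows. This is only a
back-of-the-envelope sketch: it assumes 1GB/s = 1000MB/s, that "half of the
disks" means 16, and it uses integer division. per_disk is a name made up
for this illustration.

```shell
#!/bin/bash
# Per-disk share of the aggregate throughput, in MB/s (integer division).
# usage: per_disk <aggregate-MB/s> <number-of-disks>
per_disk() { echo $(( $1 / $2 )); }

echo "mdadm+xfs, 32 disks: $(per_disk 6000 32) MB/s per disk"
echo "btrfs,     32 disks: $(per_disk 3500 32) MB/s per disk"
echo "btrfs,     16 disks: $(per_disk 3000 16) MB/s per disk"
```

With 16 disks the per-disk rate on btrfs is about the same as with mdadm+xfs
on 32, so the drop seems to appear only as more disks are added.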

More information:

# uname -a
Linux spstrg13 4.2.0-999-generic #201508132200 SMP Fri Aug 14 02:01:52
UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

# btrfs --version
btrfs-progs v4.0

# btrfs fi show
Label: none  uuid: be2a5671-87d1-4b89-ac4a-04efabb5912f
Total devices 32 FS bytes used 3.66MiB
devid    1 size 5.46TiB used 1.07GiB path /dev/sdc
devid    2 size 5.46TiB used 1.06GiB path /dev/sdd
devid    3 size 5.46TiB used 1.06GiB path /dev/sde
devid    4 size 5.46TiB used 1.06GiB path /dev/sdf
devid    5 size 5.46TiB used 1.06GiB path /dev/sdg
devid    6 size 5.46TiB used 1.06GiB path /dev/sdh
devid    7 size 5.46TiB used 1.06GiB path /dev/sdi
devid    8 size 5.46TiB used 1.06GiB path /dev/sdj
devid    9 size 5.46TiB used 1.06GiB path /dev/sdk
devid   10 size 5.46TiB used 1.06GiB path /dev/sdl
devid   11 size 5.46TiB used 1.06GiB path /dev/sdm
devid   12 size 5.46TiB used 1.06GiB path /dev/sdn
devid   13 size 5.46TiB used 1.06GiB path /dev/sdo
devid   14 size 5.46TiB used 1.06GiB path /dev/sdp
devid   15 size 5.46TiB used 1.06GiB path /dev/sdq
devid   16 size 5.46TiB used 1.06GiB path /dev/sdr
devid   17 size 5.46TiB used 1.06GiB path /dev/sds
devid   18 size 5.46TiB used 1.06GiB path /dev/sdt
devid   19 size 5.46TiB used 1.06GiB path /dev/sdu
devid   20 size 5.46TiB used 1.06GiB path /dev/sdv
devid   21 size 5.46TiB used 1.06GiB path /dev/sdw
devid   22 size 5.46TiB used 1.06GiB path /dev/sdx
devid   23 size 5.46TiB used 1.06GiB path /dev/sdy
devid   24 size 5.46TiB used 1.06GiB path /dev/sdz
devid   25 size 5.46TiB used 1.06GiB path /dev/sdaa
devid   26 size 5.46TiB used 1.06GiB path /dev/sdab
devid   27 size 5.46TiB used 1.06GiB path /dev/sdac
devid   28 size 5.46TiB used 1.06GiB path /dev/sdad
devid   29 size 5.46TiB used 1.06GiB path /dev/sdae
devid   30 size 5.46TiB used 1.06GiB path /dev/sdaf
devid   31 size 5.46TiB used 1.06GiB path /dev/sdag
devid   32 size 5.46TiB used 1.06GiB path /dev/sdah

btrfs-progs v4.0

# btrfs fi df /root/backup/root/storageTestes/mbtr
Data, RAID0: total=30.00GiB, used=3.50MiB
System, RAID0: total=32.00MiB, used=16.00KiB
Metadata, RAID0: total=4.00GiB, used=128.00KiB
Metadata, single: total=8.00MiB, used=16.00KiB
GlobalReserve, single: total=16.00MiB, used=0.00B

The dmesg is attached.

The results are about the same using kernel 3.16 and btrfs tools 3.12.
I am far from being able to isolate the problem, so please ask me for any
information you think is relevant.
Thanks in advance.

Eduardo.

[PATCH v6 0/3] dm error based test cases

2015-08-14 Thread Anand Jain

This is v6 of this patch set. It mainly addresses Filipe's latest review
comments and Eryu's review comments, with thanks.


Anand Jain (3):
  xfstests: btrfs: add functions to create dm-error device
  xfstests: btrfs: test device replace, with EIO on the src dev
  xfstests: btrfs: test device delete with EIO on src dev

 common/dmerror  | 69 
 common/rc   | 16 +++
 tests/btrfs/098 | 81 
 tests/btrfs/098.out | 11 +++
 tests/btrfs/099 | 82 +
 tests/btrfs/099.out | 11 +++
 tests/btrfs/group   |  2 ++
 7 files changed, 272 insertions(+)
 create mode 100644 common/dmerror
 create mode 100755 tests/btrfs/098
 create mode 100644 tests/btrfs/098.out
 create mode 100755 tests/btrfs/099
 create mode 100644 tests/btrfs/099.out

-- 
2.4.1


[PATCH v6 3/3] xfstests: btrfs: test device delete with EIO on src dev

2015-08-14 Thread Anand Jain

From: Anand Jain anand.j...@oracle.com

This test case checks that device delete works when the
source device is failing with EIO. The EIO errors are produced
using the DM device.

This test needs the following btrfs-progs and btrfs
kernel patches:
   btrfs-progs: device delete to accept devid
   Btrfs: device delete by devid

When the btrfs-progs patch is missing, this test will
not run. When the kernel patch is missing, btrfs-progs
will fail gracefully, and so will the test script.
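
The "will not run" part relies on probing the help text of the tool. A
minimal sketch of that idea follows; supports_devid is a hypothetical helper
and both help strings are invented for the illustration, they are not the
real btrfs-progs output.

```shell
#!/bin/bash
# Succeeds if the help text on stdin mentions "devid".
supports_devid() { grep -q devid; }

# Invented help texts, standing in for an old and a patched btrfs-progs.
old_help='usage: btrfs device delete <device> [<device>...] <path>'
new_help='usage: btrfs device delete <device>|<devid> [<device>|<devid>...] <path>'

for h in "$old_help" "$new_help"; do
    if echo "$h" | supports_devid; then
        echo "run test"
    else
        echo "notrun: btrfs-progs too old"
    fi
done
```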

Signed-off-by: Anand Jain anand.j...@oracle.com
---
v5-v6: accepts Eryu and Filipe's comments, thanks
v4-v5: rebase on latest xfstests code, and accepts Filipe comment
v3-v4: rebase on latest xfstests code
v2-v3: accepts Filipe Manana's review comments, thanks
v1-v2: accepts Dave Chinner's review comments, thanks
 common/rc   |  7 +
 tests/btrfs/099 | 82 +
 tests/btrfs/099.out | 11 +++
 tests/btrfs/group   |  1 +
 4 files changed, 101 insertions(+)
 create mode 100755 tests/btrfs/099
 create mode 100644 tests/btrfs/099.out

diff --git a/common/rc b/common/rc
index 8d4da0e..31a0328 100644
--- a/common/rc
+++ b/common/rc
@@ -2737,6 +2737,13 @@ _require_meta_uuid()
umount $SCRATCH_MNT
 }
 
+_require_btrfs_dev_del_by_devid()
+{
+   $BTRFS_UTIL_PROG device delete --help | egrep devid > /dev/null 2>&1
+   [ $? -eq 0 ] || _notrun "$BTRFS_UTIL_PROG too old \
+   (must support 'btrfs device delete <devid> /mnt')"
+}
+
 _get_total_inode()
 {
if [ -z $1 ]; then
diff --git a/tests/btrfs/099 b/tests/btrfs/099
new file mode 100755
index 000..a0761c7
--- /dev/null
+++ b/tests/btrfs/099
@@ -0,0 +1,82 @@
+#! /bin/bash
+# FS QA Test No. btrfs/099
+#
+# test device delete when the source device has EIO
+#
+# Copyright (c) 2015 Oracle.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo QA output created by $seq
+
+here=`pwd`
+tmp=/tmp/$$
+
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+
+_cleanup()
+{
+   _cleanup_dmerror
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/filter.btrfs
+. ./common/dmerror
+
+_supported_fs btrfs
+_supported_os Linux
+_need_to_be_root
+_require_scratch_dev_pool 3
+_require_btrfs_dev_del_by_devid
+_require_dmerror
+
+rm -f $seqres.full
+
+dev1=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $2}'`
+dev2=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $3}'`
+
+_init_dmerror
+_scratch_mkfs_dmerror -f -d raid1 -m raid1 $dev1 $dev2
+_mount_dmerror
+
+_run_btrfs_util_prog filesystem show -m $SCRATCH_MNT
+$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT | _filter_btrfs_filesystem_show
+
+error_devid=`$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT |\
+   egrep $DMERROR_DEV | $AWK_PROG '{print $2}'`
+
+snapshot_cmd="$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT"
+snapshot_cmd="$snapshot_cmd $SCRATCH_MNT/snap_\`date +'%H_%M_%S_%N'\`"
+run_check $FSSTRESS_PROG -d $SCRATCH_MNT -n 200 -p 8 $FSSTRESS_AVOID -x \
+   "$snapshot_cmd" -X 50
+
+# now load the error into the DMERROR_DEV
+_load_dmerror_table
+
+_run_btrfs_util_prog device delete $error_devid $SCRATCH_MNT
+
+_run_btrfs_util_prog filesystem show -m $SCRATCH_MNT
+$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT | _filter_btrfs_filesystem_show
+
+echo === device delete completed
+
+status=0; exit
diff --git a/tests/btrfs/099.out b/tests/btrfs/099.out
new file mode 100644
index 000..ec74e45
--- /dev/null
+++ b/tests/btrfs/099.out
@@ -0,0 +1,11 @@
+QA output created by 099
+Label: none  uuid:  UUID
+   Total devices NUM FS bytes used SIZE
+   devid DEVID size SIZE used SIZE path SCRATCH_DEV
+   devid DEVID size SIZE used SIZE path /dev/mapper/error-test
+
+Label: none  uuid:  UUID
+   Total devices NUM FS bytes used SIZE
+   devid DEVID size SIZE used SIZE path SCRATCH_DEV
+
+=== device delete completed
diff --git a/tests/btrfs/group b/tests/btrfs/group
index c8a53b5..968ee63 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -101,3 +101,4 @@
 096 auto quick clone
 097 auto quick send clone
 098 auto quick replace
+099 auto quick replace
-- 
2.4.1


Re: [PATCH v2] fstests: generic/018: expand write backwards sync but contiguous to test regression in btrfs

2015-08-14 Thread Eric Sandeen

On 8/13/15 3:47 AM, Liu Bo wrote:
 Btrfs has a problem when defragging a file which has a large fragmented
 range: it'd leave the tail extent as a separate extent instead of merging it
 with previous extents.
 
 This makes generic/018 recognize the above regression.

Sorry for the late review, but here it is ;)

In 2 years (heck, even now) we'll have no idea why this change was made.

What regression is that?  Can you describe it?  Is there already an upstream
fix/commit you can refer to?

I see 3 changes here:

1) You change xfs_io's for loop from seq 9 -1 0 to seq 64 -1 0 -
presumably this matters to btrfs.  Why does this matter?

 Meanwhile, I find that in the case of 'write backwards sync but contiguous',
 ext4 doesn't produce fragments like btrfs and xfs, so I modify 018.out a
 little bit to let ext4 pass.

2) You stop expecting 10 extents initially in the backwards-write test for the
above reason, I guess.  I'm a little unsure about this.  For me, this passes
as-is.  If it isn't working for you, we should understand why, instead of
making the test ignore it.

(And bundling this ext4 change into a btrfs-specific commit isn't great, anyway)

 Moreover, I follow Filipe's suggestion to filter xfs_io's output in order to
 check these writes actually succeed.

3) You stop redirecting xfs_io to /dev/null, and save it to the golden output
file instead.

Honestly, I find hundreds of extra xfs_io output lines to be rather unhelpful,
because the old output file used to be quite easy to read, to see what's
going on.

Today it only redirects stdout:

$XFS_IO_PROG -f -c "pwrite -b $((4 * bsize)) 0 $((4 * bsize))" $fragfile \
        > /dev/null

so if a write fails, I *think* stderr will get output, and the test *should*
fail as a result.[1]  You could add a || _fail "xfs_io failed" for good
measure...
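
A small sketch of that pattern, for what it's worth. fake_io is a made-up
stand-in for xfs_io, and _fail here just prints and returns non-zero instead
of aborting a test run. The explicit check only helps if the command really
returns a non-zero status on failure.

```shell
#!/bin/bash
# Made-up stand-in for xfs_io: progress goes to stdout, errors to stderr,
# and the exit status is non-zero on failure.
fake_io() {
    echo "wrote 4096/4096 bytes at offset 0"
    if [ "$1" = "fail" ]; then
        echo "pwrite: Input/output error" >&2
        return 1
    fi
}

# Simplified _fail: report and return non-zero.
_fail() { echo "FAIL: $*"; return 1; }

# stdout is discarded, stderr still reaches the test output,
# and the exit status is checked explicitly.
check_write() {
    fake_io "$1" > /dev/null || _fail "fake_io failed"
}

check_write ok
check_write fail 2>&1 || true
```

In the failing case both the stderr line and the explicit failure message
end up in the output, so the golden output comparison would catch it either
way.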

-Eric

[1] oh, maybe not, I guess xfs_io is kind of notorious for not returning
errors...

 Signed-off-by: Liu Bo bo.li@oracle.com
 ---
 v2: fix typo in title, s/expend/expand/g
 
  tests/generic/018 |  16 ++--
  tests/generic/018.out | 198 
 +-
  2 files changed, 203 insertions(+), 11 deletions(-)
 
 diff --git a/tests/generic/018 b/tests/generic/018
 index d97bb88..3693874 100755
 --- a/tests/generic/018
 +++ b/tests/generic/018
 @@ -68,28 +68,24 @@ $XFS_IO_PROG -f -c truncate 1m $fragfile
  _defrag --before 0 --after 0 $fragfile
  
  echo Contiguous file: | tee -a $seqres.full
 -$XFS_IO_PROG -f -c pwrite -b $((4 * bsize)) 0 $((4 * bsize)) $fragfile \
 -  /dev/null
 +$XFS_IO_PROG -f -c pwrite -b $((4 * bsize)) 0 $((4 * bsize)) $fragfile | 
 _filter_xfs_io
  _defrag --before 1 --after 1 $fragfile
  
  echo Write backwards sync, but contiguous - should defrag to 1 extent | 
 tee -a $seqres.full
 -for i in `seq 9 -1 0`; do
 - $XFS_IO_PROG -fs -c pwrite -b $bsize $((i * bsize)) $bsize $fragfile \
 -  /dev/null
 +for i in `seq 64 -1 0`; do
 + $XFS_IO_PROG -fd -c pwrite -b $bsize $((i * bsize)) $bsize $fragfile 
 | _filter_xfs_io
  done
 -_defrag --before 10 --after 1 $fragfile
 +_defrag --after 1 $fragfile
  
  echo Write backwards sync leaving holes - defrag should do nothing | tee 
 -a $seqres.full
  for i in `seq 31 -2 0`; do
 - $XFS_IO_PROG -fs -c pwrite -b $bsize $((i * bsize)) $bsize $fragfile \
 -  /dev/null
 + $XFS_IO_PROG -fs -c pwrite -b $bsize $((i * bsize)) $bsize $fragfile 
 | _filter_xfs_io
  done
  _defrag --before 16 --after 16 $fragfile
  
  echo Write forwards sync leaving holes - defrag should do nothing | tee -a 
 $seqres.full
  for i in `seq 0 2 31`; do
 - $XFS_IO_PROG -fs -c pwrite -b $bsize $((i * bsize)) $bsize $fragfile \
 -  /dev/null
 + $XFS_IO_PROG -fs -c pwrite -b $bsize $((i * bsize)) $bsize $fragfile 
 | _filter_xfs_io
  done
  _defrag --before 16 --after 16 $fragfile
  
 diff --git a/tests/generic/018.out b/tests/generic/018.out
 index 5f265d1..0886a9a 100644
 --- a/tests/generic/018.out
 +++ b/tests/generic/018.out
 @@ -6,14 +6,210 @@ Sparse file (no blocks):
  Before: 0
  After: 0
  Contiguous file:
 +wrote 16384/16384 bytes at offset 0
 +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  Before: 1
  After: 1
  Write backwards sync, but contiguous - should defrag to 1 extent
 -Before: 10
 +wrote 4096/4096 bytes at offset 262144
 +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 +wrote 4096/4096 bytes at offset 258048
 +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 +wrote 4096/4096 bytes at offset 253952
 +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 +wrote 4096/4096 bytes at offset 249856
 +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 +wrote 4096/4096 bytes at offset 245760
 +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 +wrote 4096/4096 bytes at offset

Re: The performance is not as expected when used several disks on raid0.

2015-08-14 Thread Chris Murphy

On Fri, Aug 14, 2015 at 1:50 PM, Austin S Hemmelgarn
ahferro...@gmail.com wrote:
 On 2015-08-14 14:31, Chris Murphy wrote:

 On Fri, Aug 14, 2015 at 9:16 AM, Eduardo Bach hellb...@gmail.com wrote:

 With btrfs the result approaches 3.5GB/s. When using mdadm+xfs the
 result reaches 6gb/s, which is the expected value when compared with
 parallel dd made on discs.


 mdadm with what chunk (strip) size? The default for mdadm is 512KiB.
 On Btrfs it's fixed at 64KiB. While testing with 64KiB chunk with XFS
 on md RAID might improve its performance relative to Btrfs, at least
 it's a more apples to apples comparison.

 I have a feeling that XFS will still win this.  It is one of the slower
 filesystems for Linux, but it still beats BTRFS senseless when it comes to
 performance as of right now.

Yeah I was suggesting with a 64KiB chunk the XFS case might get even faster.
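
For reference, a quick sketch of what the two chunk sizes mean for the
full-stripe width across 32 disks, plus the kind of mdadm invocation that
would match the Btrfs chunk size. stripe_kib is a name made up here, and the
mdadm line is only printed, with /dev/md0 and the member list as
placeholders.

```shell
#!/bin/bash
# Full-stripe width in KiB for a given chunk size and number of disks.
# usage: stripe_kib <chunk-KiB> <disks>
stripe_kib() { echo $(( $1 * $2 )); }

echo "md default, 512KiB chunk x 32 disks: $(stripe_kib 512 32) KiB per full stripe"
echo "Btrfs,       64KiB chunk x 32 disks: $(stripe_kib 64 32) KiB per full stripe"

# Printed only, not executed.
echo "mdadm --create /dev/md0 --level=0 --raid-devices=32 --chunk=64 <member devices>"
```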


-- 
Chris Murphy

Re: The performance is not as expected when used several disks on raid0.

2015-08-14 Thread Austin S Hemmelgarn


On 2015-08-14 15:54, Chris Murphy wrote:

On Fri, Aug 14, 2015 at 1:50 PM, Austin S Hemmelgarn
ahferro...@gmail.com wrote:

On 2015-08-14 14:31, Chris Murphy wrote:


On Fri, Aug 14, 2015 at 9:16 AM, Eduardo Bach hellb...@gmail.com wrote:


With btrfs the result approaches 3.5GB/s. When using mdadm+xfs the
result reaches 6gb/s, which is the expected value when compared with
parallel dd made on discs.



mdadm with what chunk (strip) size? The default for mdadm is 512KiB.
On Btrfs it's fixed at 64KiB. While testing with 64KiB chunk with XFS
on md RAID might improve its performance relative to Btrfs, at least
it's a more apples to apples comparison.


I have a feeling that XFS will still win this.  It is one of the slower
filesystems for Linux, but it still beats BTRFS senseless when it comes to
performance as of right now.


Yeah I was suggesting with a 64KiB chunk the XFS case might get even faster.


Ah, misunderstood what you meant.  Yeah, that will almost certainly make
things faster for XFS.

FWIW, running BTRFS on top of MDRAID actually works very well,
especially for BTRFS raid1 on top of MD-RAID0 (I get an almost 50%
performance increase for this usage over BTRFS raid10, although most of
this is probably due to how btrfs dispatches I/Os to disks in
multi-disk setups).
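
A rough dry-run sketch of such a layered setup, with four placeholder disks.
The commands are only printed by the run helper, and the device names are
assumptions for illustration, not the layout described above.

```shell
#!/bin/bash
# Dry run: print each command instead of executing it.
run() { echo "+ $*"; }

layered_setup() {
    # two MD RAID0 arrays of two (placeholder) disks each
    run mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sda /dev/sdb
    run mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdc /dev/sdd
    # BTRFS raid1 for data and metadata across the two arrays
    run mkfs.btrfs -d raid1 -m raid1 /dev/md0 /dev/md1
}

layered_setup
```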





Re: Can't mount degraded. How to remove/add drives OFFLINE?

2015-08-14 Thread Timothy Normand Miller

I'm not sure my situation is quite like the one you linked, so here's
my bug report:

https://bugzilla.kernel.org/show_bug.cgi?id=102881

On Fri, Aug 14, 2015 at 2:44 PM, Chris Murphy li...@colorremedies.com wrote:
 On Fri, Aug 14, 2015 at 12:12 PM, Timothy Normand Miller
 theo...@gmail.com wrote:
 Sorry about that empty email.  I hit a wrong key, and gmail decided to send.

 Anyhow, my replacement drive is going to arrive this evening, and I
 need to know how to add it to my btrfs array.  Here's the situation:

 - I had a drive fail, so I removed it and mounted degraded.
 - I hooked up a replacement drive, did an add on that one, and did a
 delete missing.
 - During the rebalance, the replacement drive failed, there were OOPSes, etc.
 - Now, although all of my data is there, I can't mount degraded,
 because btrfs is complaining that too many devices are missing (3 are
 there, but it sees 2 missing).

 It might be related to this (long) bug:
 https://bugzilla.kernel.org/show_bug.cgi?id=92641

 While Btrfs RAID 1 can tolerate only a single device failure, what you
 have is an in-progress rebuild of a missing device. If it becomes
 missing, the volume should be no worse off than it was before. But
 Btrfs doesn't see it this way; instead it sees this as two separate
 missing devices and now too many devices missing and it refuses to
 proceed. And there's no mechanism to remove missing devices unless you
 can mount rw. So it's stuck.


 So I could use some help with cleaning up this mess.  All the data is
 there, so I need to know how to either force it to mount degraded, or
 add and remove devices offline.  Where do I begin?

 You can try to ask on IRC. I have no ideas for this scenario, I've
 tried and failed. My case was throw away, what should still be
 possible is using btrfs restore.


 Also, doesn't it seem a bit arbitrary that there are too many
 missing, when all of the data is there?  If I understand correctly,
 all four drives in my RAID1 should all have copies of the metadata,

 No that's not correct. RAID 1 means 2 copies of metadata. In a 4
 device RAID 1 that's still only 2 copies. It is not n-way RAID 1.

 But that doesn't matter here, the problem is that Btrfs has a narrow
 idea of the volume, it assumes without context that once the number of
 devices is below the minimum, the volume can't be mounted. In reality,
 an exception exists if the failure is for an in-progress rebuild of a
 missing drive. That drive failing should mean the volume is no worse
 off than before but Btrfs doesn't know that.

 Pretty sure about that anyway.


 and of the remaining three good drives, there should be one or two
 copies of every data block.  So it's all there, but btrfs has decided,
 based on the NUMBER of missing devices, that it won't mount.
 Shouldn't it refuse to mount if it knows there is data missing?  For
 that matter, why should it even refuse in that case?  So some data
 might be missing, so it should throw some errors if you try to access
 that missing data.  Right?

 I think no data is missing, no metadata is missing, and Btrfs is
 confused and stuck in this case.

 --
 Chris Murphy



-- 
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project

Re: Can't mount degraded. How to remove/add drives OFFLINE?

2015-08-14 Thread Chris Murphy

On Fri, Aug 14, 2015 at 1:03 PM, Timothy Normand Miller
theo...@gmail.com wrote:
 I'm not sure my situation is quite like the one you linked, so here's
 my bug report:

 https://bugzilla.kernel.org/show_bug.cgi?id=102881

I can easily reproduce with just 2 device RAID. I updated the bug.
It's best these are separate bugs, but I think the underlying problems
are related.

The workaround is to mount -o ro,degraded, and then move the data to a
new Btrfs volume with btrfs send/receive, or with a conventional copy for
data that's not already in a read-only snapshot.
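
Sketched as a dry run, that could look like the following. All paths and
device names are placeholders, the snapshot name snap_ro is invented, and
the run helper only prints the commands.

```shell
#!/bin/bash
# Dry run: print each command instead of executing it.
run() { echo "+ $*"; }

# Placeholders for illustration only.
OLD_DEV=/dev/sdb
OLD_MNT=/mnt/old
NEW_MNT=/mnt/new    # an already-mounted new Btrfs volume

rescue() {
    run mount -o ro,degraded "$OLD_DEV" "$OLD_MNT"
    # existing read-only snapshots can be sent as they are
    echo "+ btrfs send $OLD_MNT/snap_ro | btrfs receive $NEW_MNT"
    # everything else: conventional copy
    run cp -a "$OLD_MNT/data" "$NEW_MNT/"
}

rescue
```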


-- 
Chris Murphy

Re: Can't mount degraded. How to remove/add drives OFFLINE?

2015-08-14 Thread Chris Murphy

On Fri, Aug 14, 2015 at 12:12 PM, Timothy Normand Miller
theo...@gmail.com wrote:
 Sorry about that empty email.  I hit a wrong key, and gmail decided to send.

 Anyhow, my replacement drive is going to arrive this evening, and I
 need to know how to add it to my btrfs array.  Here's the situation:

 - I had a drive fail, so I removed it and mounted degraded.
 - I hooked up a replacement drive, did an add on that one, and did a
 delete missing.
 - During the rebalance, the replacement drive failed, there were OOPSes, etc.
 - Now, although all of my data is there, I can't mount degraded,
 because btrfs is complaining that too many devices are missing (3 are
 there, but it sees 2 missing).

It might be related to this (long) bug:
https://bugzilla.kernel.org/show_bug.cgi?id=92641

While Btrfs RAID 1 can tolerate only a single device failure, what you
have is an in-progress rebuild of a missing device. If it becomes
missing, the volume should be no worse off than it was before. But
Btrfs doesn't see it this way; instead it sees this as two separate
missing devices and now too many devices missing and it refuses to
proceed. And there's no mechanism to remove missing devices unless you
can mount rw. So it's stuck.


 So I could use some help with cleaning up this mess.  All the data is
 there, so I need to know how to either force it to mount degraded, or
 add and remove devices offline.  Where do I begin?

You can try asking on IRC. I have no ideas for this scenario; I've
tried and failed. My case was throwaway. What should still be
possible is using btrfs restore.


 Also, doesn't it seem a bit arbitrary that there are too many
 missing, when all of the data is there?  If I understand correctly,
 all four drives in my RAID1 should all have copies of the metadata,

No that's not correct. RAID 1 means 2 copies of metadata. In a 4
device RAID 1 that's still only 2 copies. It is not n-way RAID 1.

But that doesn't matter here, the problem is that Btrfs has a narrow
idea of the volume, it assumes without context that once the number of
devices is below the minimum, the volume can't be mounted. In reality,
an exception exists if the failure is for an in-progress rebuild of a
missing drive. That drive failing should mean the volume is no worse
off than before but Btrfs doesn't know that.

Pretty sure about that anyway.
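The difference between the two views can be sketched in a few lines. This is only an illustration of the argument above, not how the kernel implements the check: the chunk layout, the device IDs, and both function names are assumptions made up for the example.

```python
def mountable_by_count(num_missing, max_tolerated=1):
    """Device-count view: refuse as soon as too many devices are missing."""
    return num_missing <= max_tolerated


def mountable_by_chunks(chunks, present):
    """Per-chunk view: fine as long as every chunk has one copy present.

    chunks  -- list of sets; each set holds the devids storing one
               RAID 1 chunk (two copies, whatever the device count)
    present -- set of devids that are currently available
    """
    return all(copies & present for copies in chunks)


# Hypothetical layout: devid 4 failed first, devid 5 was the replacement
# that failed during the rebuild. No chunk can live on both 4 and 5,
# because 4 was already gone when 5 was added.
chunks = [{1, 2}, {2, 4}, {3, 5}, {1, 3}]
present = {1, 2, 3}

print(mountable_by_count(num_missing=2))      # False
print(mountable_by_chunks(chunks, present))   # True
```

With two devices missing the count-based check refuses, while the per-chunk check still finds a surviving copy of everything.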


 and of the remaining three good drives, there should be one or two
 copies of every data block.  So it's all there, but btrfs has decided,
 based on the NUMBER of missing devices, that it won't mount.
 Shouldn't it refuse to mount if it knows there is data missing?  For
 that matter, why should it even refuse in that case?  So some data
 might missing, so it should throw some errors if you try to access
 that missing data.  Right?

I think no data is missing, no metadata is missing, and Btrfs is
confused and stuck in this case.

-- 
Chris Murphy

Re: The performance is not as expected when used several disks on raid0.

2015-08-14 Thread Austin S Hemmelgarn


On 2015-08-14 14:31, Chris Murphy wrote:

On Fri, Aug 14, 2015 at 9:16 AM, Eduardo Bach hellb...@gmail.com wrote:


With btrfs the result approaches 3.5GB/s. When using mdadm+xfs the
result reaches 6GB/s, which is the expected value when compared with
parallel dd made on the disks.


mdadm with what chunk (strip) size? The default for mdadm is 512KiB.
On Btrfs it's fixed at 64KiB. While testing with 64KiB chunk with XFS
on md RAID might improve its performance relative to Btrfs, at least
it's a more apples to apples comparison.

I have a feeling that XFS will still win this.  It is one of the slower 
filesystems for Linux, but it still beats BTRFS senseless when it comes 
to performance as of right now.





Can't mount degraded. How to remove/add drives OFFLINE?

2015-08-14 Thread Timothy Normand Miller

My

-- 
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project

Re: The performance is not as expected when used several disks on raid0.

2015-08-14 Thread Calvin Walton

On Fri, 2015-08-14 at 12:16 -0300, Eduardo Bach wrote:
 Hi all,
 
 This is my first email to this list, so please excuse any gaffe.
 
 I am in the early stages of evaluating a new storage system, an SGI MIS,
 currently with two LSI HBAs and 32 disks.
 The HBA controllers are LSI 9207-8i and the disks are Seagate 6TB,
 model ST6000NM0004-1FT17Z.
 
 To evaluate the performance I am using IOzone over a raid0 using all
 the 32 disks, with the parameters: iozone -i0 -i1 -t5 -s 20G  -P0.
 
 With btrfs the result approaches 3.5GB/s. When using mdadm+xfs the
 result reaches 6GB/s, which is the expected value when compared with
 parallel dd made on the disks.
 When btrfs is used with only half of the disks, the result is about 3GB/s.

There are two things in particular to pay attention to on btrfs with
this sort of setup right now:

   1. btrfs's raid0 is not an n-way stripe; it's a 2-way stripe only.
      (n-way stripe is a long requested feature, but there is no timeline
      on its completion.) A single-threaded disk write will only ever be
      writing to two disks at the same time. The total throughput you get
      for multithreaded writes is up to which blocks the allocator happens
      to pick; it will probably often happen that multiple threads will
      both be using the same chunk, sharing IO from only 2 disks.
   2. Btrfs development is currently primarily focused on functionality
      over performance. There are several places where placeholder or
      untuned algorithms are used (e.g. the multi-mirror io read
      scheduling just does pid % number_of_mirrors to pick a mirror).
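The placeholder mirror selection mentioned in point 2 can be sketched as follows. This is a toy model of the "pid % number_of_mirrors" rule as described above, not the kernel code; the function names are made up for illustration.

```python
from collections import Counter


def pick_mirror(pid, num_mirrors):
    """Choose a mirror purely from the reader's process ID."""
    return pid % num_mirrors


def mirror_load(pids, num_mirrors):
    """Count how many reads land on each mirror for a sequence of pids."""
    return Counter(pick_mirror(pid, num_mirrors) for pid in pids)


# One process issuing 1000 reads always hits the same mirror.
print(mirror_load([4242] * 1000, 2))

# Several readers whose pids share the same parity also pile onto one mirror.
print(mirror_load([100, 102, 104], 2))
```

The rule ignores queue depth and device speed, so a single reader never spreads its reads across mirrors.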

This kind of a performance difference on large performance-oriented
RAID systems between btrfs's built-in raid and mdadm is interesting to
see, but for the moment I'd say it's mostly expected.

One of the developers here might have some more precise information on
exactly why you're seeing such a performance difference.

As an aside, you have 192TB in RAID0? That's certainly pretty
impressive, but as soon as one disk dies, you're going to lose a *lot*
of data.

-- 
Calvin Walton calvin.wal...@kepstin.ca


Re: The performance is not as expected when used several disks on raid0.

2015-08-14 Thread Calvin Walton

On Fri, 2015-08-14 at 12:30 -0400, Calvin Walton wrote:
 On Fri, 2015-08-14 at 12:16 -0300, Eduardo Bach wrote:
  Hi all,
  
  This is my first email to this list, so please excuse any gaffe.
  
  I am in the early stages of evaluating a new storage system, an SGI MIS,
  currently with two LSI HBAs and 32 disks.
  The HBA controllers are LSI 9207-8i and the disks are Seagate 6TB,
  model ST6000NM0004-1FT17Z.
  
  To evaluate the performance I am using IOzone over a raid0 using all
  the 32 disks, with the parameters: iozone -i0 -i1 -t5 -s 20G  -P0.
  
  With btrfs the result approaches 3.5GB/s. When using mdadm+xfs the
  result reaches 6GB/s, which is the expected value when compared with
  parallel dd made on the disks.
  When btrfs is used with only half of the disks, the result is about 3GB/s.
 
 There are two things in particular to pay attention to on btrfs with
 this sort of setup right now:

Umm, Ok, I made a mistake. You can ignore paragraph #1 - I got some
details about the btrfs raid1 and raid0 modes mixed up!
Btrfs RAID0 is n-way striping across all available drives which have
room for allocations.
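As a rough picture of what n-way striping means for a sequential write, here is a minimal sketch of the textbook RAID 0 address mapping. It is not taken from the btrfs source: the function name is made up, and the 64KiB stripe length is simply the figure quoted elsewhere in this thread.

```python
STRIPE_LEN = 64 * 1024  # assumed stripe element size, in bytes


def raid0_map(offset, num_devices, stripe_len=STRIPE_LEN):
    """Map a logical byte offset to (device index, offset on that device)."""
    stripe_nr, within = divmod(offset, stripe_len)
    device = stripe_nr % num_devices
    dev_offset = (stripe_nr // num_devices) * stripe_len + within
    return device, dev_offset


# A 2MiB sequential write over 32 devices touches every device once.
touched = {raid0_map(off, 32)[0] for off in range(0, 2 * 1024 * 1024, STRIPE_LEN)}
print(len(touched))
```

With a small stripe element, even a modest sequential write fans out over all devices, so each device sees many small requests rather than a few large ones.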

 1. btrfs's raid0 is not an n-way stripe; it's a 2-way stripe only.
    (n-way stripe is a long requested feature, but there is no timeline
    on its completion.) A single-threaded disk write will only ever be
    writing to two disks at the same time. The total throughput you get
    for multithreaded writes is up to which blocks the allocator happens
    to pick; it will probably often happen that multiple threads will
    both be using the same chunk, sharing IO from only 2 disks.
 2. Btrfs development is currently primarily focused on functionality
    over performance. There are several places where placeholder or
    untuned algorithms are used (e.g. the multi-mirror io read
    scheduling just does pid % number_of_mirrors to pick a mirror).
 
 This kind of a performance difference on large performance-oriented
 RAID systems between btrfs's built-in raid and mdadm is interesting to
 see, but for the moment I'd say it's mostly expected.
 
 One of the developers here might have some more precise information on
 exactly why you're seeing such a performance difference.
 
 As an aside, you have 192TB in RAID0? That's certainly pretty
 impressive, but as soon as one disk dies, you're going to lose a *lot*
 of data.
 

-- 
Calvin Walton calvin.wal...@kepstin.ca


Can't mount degraded. How to remove/add drives OFFLINE?

2015-08-14 Thread Timothy Normand Miller

Sorry about that empty email.  I hit a wrong key, and gmail decided to send.

Anyhow, my replacement drive is going to arrive this evening, and I
need to know how to add it to my btrfs array.  Here's the situation:

- I had a drive fail, so I removed it and mounted degraded.
- I hooked up a replacement drive, did an add on that one, and did a
delete missing.
- During the rebalance, the replacement drive failed, there were OOPSes, etc.
- Now, although all of my data is there, I can't mount degraded,
because btrfs is complaining that too many devices are missing (3 are
there, but it sees 2 missing).

So I could use some help with cleaning up this mess.  All the data is
there, so I need to know how to either force it to mount degraded, or
add and remove devices offline.  Where do I begin?

Also, doesn't it seem a bit arbitrary that there are too many
missing, when all of the data is there?  If I understand correctly,
all four drives in my RAID1 should all have copies of the metadata,
and of the remaining three good drives, there should be one or two
copies of every data block.  So it's all there, but btrfs has decided,
based on the NUMBER of missing devices, that it won't mount.
Shouldn't it refuse to mount only if it knows there is data missing?  For
that matter, why should it even refuse in that case?  So some data
might be missing, so it should throw some errors if you try to access
that missing data.  Right?

Thanks!

-- 
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project

Re: Can't mount degraded. How to remove/add drives OFFLINE?

2015-08-14 Thread Anand Jain




On 08/15/2015 02:12 AM, Timothy Normand Miller wrote:

Sorry about that empty email.  I hit a wrong key, and gmail decided to send.

Anyhow, my replacement drive is going to arrive this evening, and I
need to know how to add it to my btrfs array.  Here's the situation:

- I had a drive fail, so I removed it and mounted degraded.


That's a bit dangerous to do without the patch below. The patch description
has more details on why.



- I hooked up a replacement drive, did an add on that one, and did a
delete missing.
- During the rebalance, the replacement drive failed, there were OOPSes, etc.
- Now, although all of my data is there, I can't mount degraded,
because btrfs is complaining that too many devices are missing (3 are
there, but it sees 2 missing).



This is addressed in the patch

  [PATCH 23/23] Btrfs: allow -o rw,degraded for single group profile


Thanks, Anand



So I could use some help with cleaning up this mess.  All the data is
there, so I need to know how to either force it to mount degraded, or
add and remove devices offline.  Where do I begin?

Also, doesn't it seem a bit arbitrary that there are too many
missing, when all of the data is there?  If I understand correctly,
all four drives in my RAID1 should all have copies of the metadata,
and of the remaining three good drives, there should be one or two
copies of every data block.  So it's all there, but btrfs has decided,
based on the NUMBER of missing devices, that it won't mount.
Shouldn't it refuse to mount if it knows there is data missing?  For
that matter, why should it even refuse in that case?  So some data
might missing, so it should throw some errors if you try to access
that missing data.  Right?

Thanks!



Re: Can't mount degraded. How to remove/add drives OFFLINE?

2015-08-14 Thread Anand Jain





Just to be clear, I removed the drive (the original failed drive) when
the power was off, then powered up, and then mounted degraded.  That's
not dangerous that I know of.


The patch has the details; please refer to it.


Where is this patch, and what kernel versions can this be applied to?



https://patchwork.kernel.org/patch/7014141/

It's based on 4.3, but should apply cleanly to earlier kernels.

thanks
Anand

Re: Deleted files cause btrfs-send to fail

2015-08-14 Thread Marc Joliet

On Thu, 13 Aug 2015 10:54:58 +0200,
Marc Joliet mar...@gmx.de wrote:

 On Thu, 13 Aug 2015 08:29:19 + (UTC),
 Duncan 1i5t5.dun...@cox.net wrote:
 
  Marc Joliet posted on Thu, 13 Aug 2015 09:05:41 +0200 as excerpted:
  
   Here's the actual output now, obtained via btrfs-progs 4.0.1 from an
   initramfs emergency shell:
   
   checking extents
   checking free space cache
   checking fs roots
   root 5 inode 8338813 errors 2000, link count wrong
       unresolved ref dir 26699 index 50500 namelen 4 name root filetype 0 errors 3, no dir item, no dir index
   root 5 inode 8338814 errors 2000, link count wrong
       unresolved ref dir 26699 index 50502 namelen 6 name marcec filetype 0 errors 3, no dir item, no dir index
   root 5 inode 8338815 errors 2000, link count wrong
       unresolved ref dir 26699 index 50504 namelen 6 name systab filetype 0 errors 3, no dir item, no dir index
   root 5 inode 8710030 errors 2000, link count wrong
       unresolved ref dir 26699 index 59588 namelen 6 name marcec filetype 0 errors 3, no dir item, no dir index
   root 5 inode 8710031 errors 2000, link count wrong
       unresolved ref dir 26699 index 59590 namelen 4 name root filetype 0 errors 3, no dir item, no dir index
   Checking filesystem on /dev/sda1
   UUID: 0267d8b3-a074-460a-832d-5d5fd36bae64
   found 63467610172 bytes used err is 1
   total csum bytes: 59475016
   total tree bytes: 1903411200
   total fs tree bytes: 1691504640
   total extent tree bytes: 130322432
   btree space waste bytes: 442495212
   file data blocks allocated: 555097092096
    referenced 72887840768
   btrfs-progs v4.0.1
   
   Again: is this fixable?
  
  FWIW, root 5 (which you asked about upthread) is the main filesystem 
  root.  So all these appear to be on the main filesystem, not on snapshots/
  subvolumes.
 
[...]
  But if it's critical, you may wish to wait and have someone else confirm 
  that before acting on it, just in case I have it wrong.
 
 I can wait until tonight, at least.  The FS still mounts, and it's just the
 root subvolume that's affected; running btrfs-send on the /home subvolume
 still works.

Well, I got impatient, and just went ahead and did it (I have backups, after
all).  It looks like it worked: the affected files were moved to /lost+found/,
where I deleted them again, and btrfs-send works again.  The output of btrfs
check after --repair:

checking extents
checking free space cache
checking fs roots
checking csums
There are no extents for csum range 0-69632
Csum exists for 0-69632 but there is no extent record
Checking filesystem on /dev/sda1
UUID: 0267d8b3-a074-460a-832d-5d5fd36bae64
block group 274307481600 has wrong amount of free space
failed to load free space cache for block group 274307481600
found 60980420666 bytes used err is 1
total csum bytes: 57521732
total tree bytes: 199680
total fs tree bytes: 1791721472
total extent tree bytes: 127942656
btree space waste bytes: 460072661
file data blocks allocated: 478650343424
 referenced 73326161920
btrfs-progs v4.1.2

If I notice anything amiss, I'll report back.

(One other thing I found interesting was that btrfs scrub didn't care about
the link count errors.)

Greetings.
-- 
Marc Joliet
--
People who think they know everything really annoy those of us who know we
don't - Bjarne Stroustrup



Re: Can't mount degraded. How to remove/add drives OFFLINE?

2015-08-14 Thread Timothy Normand Miller

On Fri, Aug 14, 2015 at 7:49 PM, Anand Jain anand.j...@oracle.com wrote:



 - I had a drive fail, so I removed it and mounted degraded.


 that bit dangerous to do without the below patch. patch has more details
 why.

Just to be clear, I removed the drive (the original failed drive) when
the power was off, then powered up, and then mounted degraded.  That's
not dangerous that I know of.


 - I hooked up a replacement drive, did an add on that one, and did a
 delete missing.
 - During the rebalance, the replacement drive failed, there were OOPSes,
 etc.
 - Now, although all of my data is there, I can't mount degraded,
 because btrfs is complaining that too many devices are missing (3 are
 there, but it sees 2 missing).



 This is addressed in the patch

   [PATCH 23/23] Btrfs: allow -o rw,degraded for single group profile


Where is this patch, and what kernel versions can this be applied to?



-- 
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project

Re: RAID0 wrong (raw) device?

2015-08-14 Thread Anand Jain



First of all, there is a known issue in btrfs with handling multiple paths /
instances of the same device image. Fixing this caused a
regression earlier, and my survey
   [survey]  BTRFS_IOC_DEVICES_READY return status
almost told me not to fix the bug.

But this is just a reporting issue. It would confuse users, so it should
be fixed.




There is now a new behaviour: after the btrfs mount, I can briefly see the
wrong raw device /dev/sde, and a few seconds later there is the correct
/dev/drbd3:


Yep, that's possible, but it does not mean that the btrfs kernel code is using
the new path; it's just a reporting bug.




(Please use the -m option.)


root@toy02:/etc# umount /data
root@toy02:/etc# mount /data
root@toy02:/etc# btrfs filesystem show
Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
 Total devices 2 FS bytes used 109.56GiB
 devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
 devid    4 size 1.82TiB used 63.03GiB path /dev/sde

Btrfs v3.12
root@toy02:/etc# btrfs filesystem show
Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
 Total devices 2 FS bytes used 109.56GiB
 devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
 devid    4 size 1.82TiB used 63.03GiB path /dev/drbd3

Btrfs v3.12

root@toy02:/etc# btrfs filesystem show -m
Label: data  uuid: 411af13f-6cae-4f03-99dc-5941acb3135b
 Total devices 2 FS bytes used 109.56GiB
 devid    3 size 1.82TiB used 63.03GiB path /dev/drbd2
 devid    4 size 1.82TiB used 63.03GiB path /dev/drbd3

Btrfs v3.12






Still, the kernel sees 3 instead of (really) 2 HGST drives:

root@toy02:/etc# hdparm -I /dev/sdb | grep Number:
 Model Number:   HGST HUS724020ALA640
 Serial Number:  PN2134P5G2P2AX

root@toy02:/etc# hdparm -I /dev/sde | grep Number:
 Model Number:   HGST HUS724020ALA640
 Serial Number:  PN2134P5G2P2AX


This is important to know, but it is not a btrfs issue. Do you have multiple
host paths reaching this device with serial # PN2134P5G2P2AX?


Re: Major qgroup regression in 4.2?

2015-08-14 Thread Mark Fasheh

On Thu, Aug 13, 2015 at 04:13:08PM -0700, Mark Fasheh wrote:
 If there *is* a plan to make this all work again, can I please hear it? The
 comment mentions something about adding those nodes to a dirty_extent_root.
 Why wasn't that done?

Ok so I had more time to look through the changes today and came up with
this naive patch, it simply inserts dirty extent records where we were doing
our qgroup refs before. This passes my micro-test but I'm unclear on whether
there's some pitfall I'm unaware of (I'm guessing there must be?). Please
advise.

Thanks,
--Mark

--
Mark Fasheh


From: Mark Fasheh mfas...@suse.de

btrfs: qgroup: account shared subtree during snapshot delete (again)

Commit 0ed4792 ('btrfs: qgroup: Switch to new extent-oriented qgroup
mechanism.') removed our qgroup accounting during
btrfs_drop_snapshot(). Predictably, this results in qgroup numbers
going bad shortly after a snapshot is removed.

Fix this by adding a dirty extent record when we encounter extents
during our shared subtree walk. This effectively restores the
functionality we had with the original shared subtree walking code in
commit 1152651 (btrfs: qgroup: account shared subtrees during snapshot
delete)

This patch also moves the open coded allocation handling for
qgroup_extent_record into their own functions.

Signed-off-by: Mark Fasheh mfas...@suse.de

diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index ac3e81d..8156f50 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -486,7 +486,7 @@ add_delayed_ref_head(struct btrfs_fs_info *fs_info,
 		qexisting = btrfs_qgroup_insert_dirty_extent(delayed_refs,
 							     qrecord);
 		if (qexisting)
-			kfree(qrecord);
+			btrfs_qgroup_free_extent_record(qrecord);
 	}
 
 	spin_lock_init(&head_ref->lock);
@@ -654,7 +654,7 @@ int btrfs_add_delayed_tree_ref(struct btrfs_fs_info *fs_info,
 		goto free_ref;
 
 	if (fs_info->quota_enabled && is_fstree(ref_root)) {
-		record = kmalloc(sizeof(*record), GFP_NOFS);
+		record = btrfs_qgroup_alloc_extent_record();
 		if (!record)
 			goto free_head_ref;
 	}
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 07204bf..ab81135 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -7756,18 +7756,31 @@ reada:
 	wc->reada_slot = slot;
 }
 
-/*
- * TODO: Modify related function to add related node/leaf to dirty_extent_root,
- * for later qgroup accounting.
- *
- * Current, this function does nothing.
- */
+static int record_one_item(struct btrfs_trans_handle *trans, u64 bytenr,
+			   u64 num_bytes)
+{
+	struct btrfs_qgroup_extent_record *qrecord = btrfs_qgroup_alloc_extent_record();
+	struct btrfs_delayed_ref_root *delayed_refs = &trans->transaction->delayed_refs;
+
+	if (!qrecord)
+		return -ENOMEM;
+
+	qrecord->bytenr = bytenr;
+	qrecord->num_bytes = num_bytes;
+	qrecord->old_roots = NULL;
+
+	if (btrfs_qgroup_insert_dirty_extent(delayed_refs, qrecord))
+		btrfs_qgroup_free_extent_record(qrecord);
+
+	return 0;
+}
+
 static int account_leaf_items(struct btrfs_trans_handle *trans,
 			      struct btrfs_root *root,
 			      struct extent_buffer *eb)
 {
 	int nr = btrfs_header_nritems(eb);
-	int i, extent_type;
+	int i, extent_type, ret;
 	struct btrfs_key key;
 	struct btrfs_file_extent_item *fi;
 	u64 bytenr, num_bytes;
@@ -7790,6 +7803,10 @@ static int account_leaf_items(struct btrfs_trans_handle *trans,
 			continue;
 
 		num_bytes = btrfs_file_extent_disk_num_bytes(eb, fi);
+
+		ret = record_one_item(trans, bytenr, num_bytes);
+		if (ret)
+			return ret;
 	}
 	return 0;
 }
@@ -7858,8 +7875,6 @@ static int adjust_slots_upwards(struct btrfs_root *root,
 
 /*
  * root_eb is the subtree root and is locked before this function is called.
- * TODO: Modify this function to mark all (including complete shared node)
- * to dirty_extent_root to allow it get accounted in qgroup.
  */
 static int account_shared_subtree(struct btrfs_trans_handle *trans,
 				  struct btrfs_root *root,
@@ -7937,6 +7952,11 @@ walk_down:
 			btrfs_tree_read_lock(eb);
 			btrfs_set_lock_blocking_rw(eb, BTRFS_READ_LOCK);
 			path->locks[level] = BTRFS_READ_LOCK_BLOCKING;
+
+			ret = record_one_item(trans, child_bytenr,
+					      root->nodesize);
+			if (ret)
+				goto out;
 		}
 
 		if (level == 0) {
diff --git a/fs/btrfs/qgroup.c

[survey] sysfs layout for btrfs

2015-08-14 Thread Anand Jain


Hello,

As of now, the btrfs sysfs layout does not include attributes for the volume
manager part, so these are being developed. There are two candidate
layouts, described below, and this is a quick survey to find out which
is preferred. The contenders are:

1. FS and VM (volume manager) attributes[1] merged sysfs layout
2. FS and VM attributes separated sysfs layout.

These two choices differ in whether the VM attributes are amalgamated with
the existing FS attributes or put under a kobject named
pools/volumes under /sys/fs/btrfs. The examples below
highlight the trade-off between the two.


Eg for #1 (above):
The existing sysfs for btrfs, has the top kobject fsid

  /sys/fs/btrfs/fsid -- holds FS attr, VM attr will be added here.
  /sys/fs/btrfs/fsid/devices/uuid [2]  -- btrfs_devices attr here
  /sys/fs/btrfs/fsid/devices/uuid/state
  /sys/fs/btrfs/fsid/devices/uuid/offline

We won't be able to change the sysfs entries that are already there.
However, we could change the context in which they are created and
destroyed: from mount and unmount to device scan and module
unload, respectively. This would enable us to implement #1
(above).


Eg for #2 (above):
For the 2nd choice, a new 'pools or volumes' kobject will be created 
under existing /sys/fs/btrfs/ which will hold the VM attributes. 
(however note that: there will be duplicate kobjects like fsid both 
under FS and VM in this choice #2).


 /sys/fs/btrfs/fsid --- as is, will continue to hold fs attributes.
 /sys/fs/btrfs/pools/fsid/ -- will hold VM attributes
 /sys/fs/btrfs/pools/fsid/devices/sdx -- btrfs_devices attr here
 /sys/fs/btrfs/pools/fsid/devices/sdx/state
 /sys/fs/btrfs/pools/fsid/devices/sdx/offline

There is certainly a small trade-off between these two. Your comments / 
feedback are kindly appreciated.


Thanks, Anand

[1] The attributes will be those of the btrfs_fs_devices structure, plus a few
newly introduced attributes such as 'state', which reports the volume's
current state.


[2] note that we can not use sdx here since a link to the block device 
already exists with that name.


Re: Can't mount degraded. How to remove/add drives OFFLINE?

2015-08-14 Thread Timothy Normand Miller

I applied that patch to my 4.1.4, it mounted degraded, and now it's
balancing to the new drive.

Thanks for all the help!

On Fri, Aug 14, 2015 at 8:28 PM, Anand Jain anand.j...@oracle.com wrote:


 Just to be clear, I removed the drive (the original failed drive) when
 the power was off, then powered up, and then mounted degraded.  That's
 not dangerous that I know of.


 patch has details. pls refer.


 Where is this patch, and what kernel versions can this be applied to?



 https://patchwork.kernel.org/patch/7014141/

 its on 4.3. but should apply nice on below.

 thanks
 Anand



-- 
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project

Re: Deleted files cause btrfs-send to fail

2015-08-14 Thread Duncan

Marc Joliet posted on Fri, 14 Aug 2015 23:37:37 +0200 as excerpted:

 (One other thing I found interesting was that btrfs scrub didn't care
 about the link count errors.)

A lot of people are confused about exactly what btrfs scrub does, and 
expect it to detect and possibly fix stuff it has nothing to do with.  
It's *not* an fsck.

Scrub does one very useful, but limited, thing.  It systematically 
verifies that the computed checksums for all data and metadata covered by 
checksums match the corresponding recorded checksums.  For dup/raid1/
raid10 modes, if there's a match failure, it will look up the other copy 
and see if it matches, replacing the invalid block with a new copy of the 
other one, assuming it's valid.  For raid56 modes, it attempts to compute 
the valid copy from parity and, again assuming a match after doing so, 
does the replace.  If a valid copy cannot be found or computed, either 
because it's damaged too or because there's no second copy or parity to 
fall back on (single and raid0 modes), then scrub will detect but cannot 
correct the error.
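The behaviour described above can be sketched for a single block. This is a simplified model, not btrfs code: zlib.crc32 stands in for the filesystem's checksum, the function names are made up, and only the mirrored (dup/raid1/raid10) case is modelled, with no parity reconstruction.

```python
import zlib


def checksum(block):
    """Stand-in checksum for this sketch (plain CRC32 from zlib)."""
    return zlib.crc32(block)


def scrub_block(copies, recorded):
    """Verify every copy of one block against its recorded checksum.

    copies   -- list of bytes objects, one per stored copy (modified in place)
    recorded -- checksum recorded when the block was written
    Returns "ok", "repaired", or "uncorrectable".
    """
    good = next((c for c in copies if checksum(c) == recorded), None)
    if good is None:
        # No valid copy to fall back on (e.g. single or raid0 mode).
        return "uncorrectable"
    repaired = False
    for i, copy in enumerate(copies):
        if checksum(copy) != recorded:
            copies[i] = good  # rewrite the bad copy from the good one
            repaired = True
    return "repaired" if repaired else "ok"


data = b"hello btrfs"
copies = [data, b"hellO btrfs"]            # second copy is corrupted
print(scrub_block(copies, checksum(data)))  # repaired
print(copies[1] == data)                    # True
```

Note that the only question asked is whether the stored bytes match the recorded checksum. A block whose contents are logically wrong, but whose checksum was computed over those same contents, comes back as "ok".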

In routine usage, btrfs automatically does the same thing if it happens 
to come across checksum errors in its normal IO stream, but it has to 
come across them first.  Scrub's benefit is that it systematically 
verifies (and corrects errors where it can) checksums on the entire 
filesystem, not just the parts that happen to appear in the normal IO 
stream.

Such checksum errors can be for a few reasons...

I have one ssd that's gradually failing and returns checksum errors 
fairly regularly.  Were I using a normal filesystem I'd have had to 
replace it some time ago.  But with btrfs in raid1 mode and regular 
scrubs (and backups, should they be needed; sometimes I let them get a 
bit stale, but I do have them and am prepared to live with the stale 
restored data if I have to), I've been able to keep using the failing 
device.  When the scrubs hit errors and btrfs does the rewrite from the 
good copy, a block relocation on the failing device is triggered as well, 
with the bad block taken out of service and a new one from the set of 
spares all modern devices have takes its place.  Currently, smartctl -A 
reports 904 reallocated sectors raw value, with a standardized value of 
92.  Before the first reallocated sector, the standardized value was 253, 
perfect.  With the first reallocated sector, it immediately dropped to 
100, apparently the rounded percentage of spare sectors left.  It has 
gradually dropped since then to its current 92, with a threshold value of 
36.  So while it's gradually failing, there's still plenty of spare 
sectors left.  Normally I would have replaced the device even so, but 
I've never actually had the opportunity to actually watch a slow failure 
continue to get worse over time, and now that I do I'm a bit curious how 
things will go, so I'm just letting it happen, tho I do have a 
replacement device already purchased and ready, when the time comes. 

So real media failure, bitrot, is one reason for bad checksums.  The data 
read back from the device simply isn't the same data that was stored to 
it, and the checksum fails as a result.

Of course bad connector cables or storage chipset firmware or hardware is 
another hardware cause.

Sudden reboot or power loss, with data being actively written and one 
copy either already updated or not yet touched, while the other is 
actually being written at the time of the crash so the write isn't 
completed, is yet another reason for checksum failure.  This one is 
actually why a scrub can appear to do so much more than it does, because 
where there's a second copy (or parity) of the data available, scrub can 
use it to recover the partially written copy (which being partially 
written fails its checksum verification) to either the completed write 
state, if the other copy was already written, or the pre-write state, if 
the other copy hadn't been written at all, yet.  In this way the result 
is often the same one an fsck would normally produce, detecting and 
fixing the error, but the mechanism is entirely different -- it only 
detected and fixed the error because the checksum was bad and it had a 
good copy it could replace it with, not because it had any smarts about 
how the filesystem actually worked, and could actually tell what the 
error was and correct it by actually correcting it.


Meanwhile, in your case the problem was an actual btrfs logic bug -- it 
didn't track the inode ref-counts correctly, and didn't remove the inode 
when the last reference to it was deleted, because it still thought there 
were more references.  So the metadata actually written to storage was 
incorrect due to the logic flaw, but the checksum covering it was indeed 
the correct checksum for that metadata, as wrong as the metadata actually 
happened to be.  So scrub couldn't detect the error, because it was an 
error not in checksum, which was computed correctly over the metadata, 
but in

lockup

2015-08-14 Thread Russell Coker

I have a Xen server with 14 DomUs that are being used for BTRFS and ZFS 
training.  About 5 people are corrupting virtual disks and scrubbing them, 
lots of IO.

All the virtual machine disk images are snapshots of a master image with copy 
on write.  I just had the following error which ended with a NMI.  I copied 
what I could.  It's running the latest Debian/Jessie kernel 3.16.7.

[15780.056002] Code: 44 24 10 e9 1c ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 
66 
66 66 90 41 54 55 48 89 fd 53 4c 8b 67 50 66 66 66 66 90 f0 ff 4d 4c 74 35 5b 
5d 41 5c c3 48 8b 1d a9 07 07 00 48 85 db 74 1c 48 8b 
[15808.056003] BUG: soft lockup - CPU#1 stuck for 22s! [qemu-system-i38:22730]
[15808.056003] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables 
x_tables xen_netback xen_gntdev xen_evtchn xenfs xen_privcmd nfsd auth_rpcgss 
oid_registry nfs_acl nfs lockd fscache sunrpc bridge stp llc ext4 crc16 
mbcache jbd2 ppdev psmouse serio_raw pcspkr k8temp joydev evdev ipmi_si ns558 
gameport parport_pc parport ipmi_msghandler snd_mpu401_uart snd_rawmidi 
snd_seq_device snd processor button soundcore edac_mce_amd edac_core 
i2c_nforce2 i2c_core shpchp thermal_sys loop autofs4 crc32c_generic btrfs xor 
raid6_pq raid1 md_mod sd_mod crc_t10dif crct10dif_generic crct10dif_common 
hid_generic usbhid hid sg sr_mod cdrom ata_generic ohci_pci mptsas 
scsi_transport_sas mptscsih mptbase e1000 pata_amd ehci_pci ohci_hcd ehci_hcd 
libata forcedeth scsi_mod usbcore usb_common
[15808.056003] CPU: 1 PID: 22730 Comm: qemu-system-i38 Not tainted 3.16.0-4-
amd64 #1 Debian 3.16.7-ckt11-1+deb8u3
[15808.056003] Hardware name: Sun Microsystems Sun Fire X4100 M2/Sun Fire 
X4100 M2, BIOS 0ABJX102 11/03/2008
[15808.056003] task: 8812e010 ti: 880001e9c000 task.ti: 
880001e9c000
[15808.056003] RIP: e030:[a024edb9]  [a024edb9] 
btrfs_put_ordered_extent+0x19/0xc0 [btrfs]
[15808.056003] RSP: e02b:880001e9fe08  EFLAGS: 0202
[15808.056003] RAX: 0583 RBX: 88000a4f0580 RCX: 06a4
[15808.056003] RDX: 88000a4f0580 RSI: 88000a4f0508 RDI: 88000a4f0508
[15808.056003] RBP: 88000a4f0508 R08: 88000a4f0560 R09: 8800502f29b0
[15808.056003] R10: 7ff0 R11: 0005 R12: 880053821950
[15808.056003] R13: 88000a4f0508 R14: 880004f7cf00 R15: 880001e9fe50
[15808.056003] FS:  7fdc312f5700() GS:88007744() 
knlGS:
[15808.056003] CS:  e033 DS:  ES:  CR0: 8005003b
[15808.056003] CR2: 7f4af0c74000 CR3: 2e534000 CR4: 
0660
[15808.056003] Stack:
[15808.056003]  88000a4f0580 880052d76800 880002503800 
a02342f4
[15808.056003]  880004f7cfa8 880002503000  
a02881e2
[15808.056003]  8800  880052d76800 
88000b7f7b18
[15808.056003] Call Trace:
[15808.056003]  [a02342f4] ? btrfs_wait_pending_ordered+0xc4/0x100 
[btrfs]
[15808.056003]  [a02881e2] ? __btrfs_run_delayed_items+0xf2/0x1d0 
[btrfs]
[15808.056003]  [a0236356] ? btrfs_commit_transaction+0x2d6/0xa10 
[btrfs]
[15808.056003]  [810a7a40] ? prepare_to_wait_event+0xf0/0xf0
[15808.056003]  [a0246529] ? btrfs_sync_file+0x1c9/0x2f0 [btrfs]
[15808.056003]  [811d53cb] ? do_fsync+0x4b/0x70
[15808.056003]  [811d564f] ? SyS_fdatasync+0xf/0x20
[15808.056003]  [8151158d] ? system_call_fast_compare_end+0x10/0x15
[15808.056003] Code: 44 24 10 e9 1c ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 66 
66 
66 66 90 41 54 55 48 89 fd 53 4c 8b 67 50 66 66 66 66 90 f0 ff 4d 4c 74 35 5b 
5d 41 5c c3 48 8b 1d a9 07 07 00 48 85 db 74 1c 48 8b 
[15818.440002] INFO: rcu_sched self-detected stall on CPU { 1}  (t=68266 
jiffies 
g=236497 c=236496 q=6784)
[15818.440002] sending NMI to all CPUs:
[15818.440002] NMI backtrace for cpu 1
[15818.440002] CPU: 1 PID: 22730 Comm: qemu-system-i38 Not tainted 3.16.0-4-
amd64 #1 Debian 3.16.7-ckt11-1+deb8u3
[15818.440002] Hardware name: Sun Microsystems Sun Fire X4100 M2/Sun Fire 
X4100 M2, BIOS 0ABJX102 11/03/2008
[15818.440002] task: 8812e010 ti: 880001e9c000 task.ti: 
880001e9c000
[15818.440002] RIP: e030:[8100130a]  [8100130a] 
xen_hypercall_vcpu_op+0xa/0x20
[15818.440002] RSP: e02b:880077443cc8  EFLAGS: 0046
[15818.440002] RAX:  RBX: 0001 RCX: 8100130a
[15818.440002] RDX:  RSI: 0001 RDI: 
000b
[15818.440002] RBP: 818e2900 R08: 818e23e0 R09: 880bcc40
[15818.440002] R10: 0855 R11: 0246 R12: 818e23e0
[15818.440002] R13: 0005 R14: 1a80 R15: 81853680
[15818.440002] FS:  7fdc312f5700() GS:88007744() 
knlGS:
[15818.440002] CS:  e033 DS:  ES:  CR0: 8005003b
[15818.440002]

Re: delete missing with two missing devices doesn't delete both missing, only does a partial reconstruction

2015-08-14 Thread Timothy Normand Miller

BTW, when this is all over with, how do I make sure there are really
two copies of everything?  Will a scrub verify this?  Should I run a
balance operation?

On Fri, Aug 14, 2015 at 11:29 PM, Timothy Normand Miller
theo...@gmail.com wrote:
 After applying Anand's patch, I was able to mount my 4-drive RAID1 and
 bring a new fourth drive online.  However, something weird happened
 where the first delete missing only deleted one missing drive and
 only did a partial duplication.  I've posted a bug report here:

 https://bugzilla.kernel.org/show_bug.cgi?id=102901

 --
 Timothy Normand Miller, PhD
 Assistant Professor of Computer Science, Binghamton University
 http://www.cs.binghamton.edu/~millerti/
 Open Graphics Project



-- 
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project

Re: Can't mount degraded. How to remove/add drives OFFLINE?

2015-08-14 Thread Chris Murphy

I thought for a second that maybe the problem is due to the phantom
single chunk(s) created at mkfs time. I redid the test, and did a
balance to get rid of the single chunk. I did this right after
populating volume with some data. But the problem still happens.

---
Chris Murphy

Re: delete missing with two missing devices doesn't delete both missing, only does a partial reconstruction

2015-08-14 Thread Anand Jain



 BTW, when this is all over with, how do I make sure there are really
 two copies of everything?  Will a scrub verify this?  Should I run a
 balance operation?

Please use 'btrfs bal' with the profile convert filters to migrate any 
single chunks (created while fewer RW-able devices were available) back 
to your desired raid1. Do this when all the devices are back online. 
Kindly note there is a bug in the btrfs volume manager (VM): you won't be 
able to bring a device online without an unmount - mount (I am working on 
a fix). btrfs-progs will be wrong in this case, so don't depend too much 
on it.

To understand the kernel's internal view of a btrfs volume I generally use:
https://patchwork.kernel.org/patch/5816011/

In there, if bdev is null, it indicates the device is scanned but not part 
of the VM yet. An unmount - mount will then bring the device back into the VM.


 After applying Anand's patch, I was able to mount my 4-drive RAID1
 and bring a new fourth drive online.

 However, something weird happened
 where the first delete missing only deleted one missing drive and
 only did a partial duplication.  I've posted a bug report here:

That seems normal to me, unless I am missing something or it needs more clarity.


Thanks, Anand

delete missing with two missing devices doesn't delete both missing, only does a partial reconstruction

2015-08-14 Thread Timothy Normand Miller

After applying Anand's patch, I was able to mount my 4-drive RAID1 and
bring a new fourth drive online.  However, something weird happened
where the first delete missing only deleted one missing drive and
only did a partial duplication.  I've posted a bug report here:

https://bugzilla.kernel.org/show_bug.cgi?id=102901

-- 
Timothy Normand Miller, PhD
Assistant Professor of Computer Science, Binghamton University
http://www.cs.binghamton.edu/~millerti/
Open Graphics Project

[PATCH 05/23] Btrfs: rename super_kobj to fsid_kobj

2015-08-14 Thread Anand Jain

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/sysfs.c   | 36 ++--
 fs/btrfs/volumes.c |  2 +-
 fs/btrfs/volumes.h |  2 +-
 3 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 52319d1..e0ac859 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -437,24 +437,24 @@ static const struct attribute *btrfs_attrs[] = {
NULL,
 };
 
-static void btrfs_release_super_kobj(struct kobject *kobj)
+static void btrfs_release_fsid_kobj(struct kobject *kobj)
 {
struct btrfs_fs_devices *fs_devs = to_fs_devs(kobj);
 
-   memset(fs_devs-super_kobj, 0, sizeof(struct kobject));
+   memset(fs_devs-fsid_kobj, 0, sizeof(struct kobject));
complete(fs_devs-kobj_unregister);
 }
 
 static struct kobj_type btrfs_ktype = {
.sysfs_ops  = kobj_sysfs_ops,
-   .release= btrfs_release_super_kobj,
+   .release= btrfs_release_fsid_kobj,
 };
 
 static inline struct btrfs_fs_devices *to_fs_devs(struct kobject *kobj)
 {
if (kobj-ktype != btrfs_ktype)
return NULL;
-   return container_of(kobj, struct btrfs_fs_devices, super_kobj);
+   return container_of(kobj, struct btrfs_fs_devices, fsid_kobj);
 }
 
 static inline struct btrfs_fs_info *to_fs_info(struct kobject *kobj)
@@ -502,12 +502,12 @@ static int addrm_unknown_feature_attrs(struct 
btrfs_fs_info *fs_info, bool add)
attrs[0] = fa-kobj_attr.attr;
if (add) {
int ret;
-   ret = 
sysfs_merge_group(fs_info-fs_devices-super_kobj,
+   ret = 
sysfs_merge_group(fs_info-fs_devices-fsid_kobj,
agroup);
if (ret)
return ret;
} else
-   
sysfs_unmerge_group(fs_info-fs_devices-super_kobj,
+   
sysfs_unmerge_group(fs_info-fs_devices-fsid_kobj,
agroup);
}
 
@@ -523,9 +523,9 @@ static void __btrfs_sysfs_remove_fsid(struct 
btrfs_fs_devices *fs_devs)
fs_devs-device_dir_kobj = NULL;
}
 
-   if (fs_devs-super_kobj.state_initialized) {
-   kobject_del(fs_devs-super_kobj);
-   kobject_put(fs_devs-super_kobj);
+   if (fs_devs-fsid_kobj.state_initialized) {
+   kobject_del(fs_devs-fsid_kobj);
+   kobject_put(fs_devs-fsid_kobj);
wait_for_completion(fs_devs-kobj_unregister);
}
 }
@@ -555,8 +555,8 @@ void btrfs_sysfs_remove_mounted(struct btrfs_fs_info 
*fs_info)
kobject_put(fs_info-space_info_kobj);
}
addrm_unknown_feature_attrs(fs_info, false);
-   sysfs_remove_group(fs_info-fs_devices-super_kobj, 
btrfs_feature_attr_group);
-   sysfs_remove_files(fs_info-fs_devices-super_kobj, btrfs_attrs);
+   sysfs_remove_group(fs_info-fs_devices-fsid_kobj, 
btrfs_feature_attr_group);
+   sysfs_remove_files(fs_info-fs_devices-fsid_kobj, btrfs_attrs);
btrfs_sysfs_rm_device_link(fs_info-fs_devices, NULL);
 }
 
@@ -675,7 +675,7 @@ int btrfs_sysfs_add_device(struct btrfs_fs_devices *fs_devs)
 {
if (!fs_devs-device_dir_kobj)
fs_devs-device_dir_kobj = kobject_create_and_add(devices,
-   fs_devs-super_kobj);
+   fs_devs-fsid_kobj);
 
if (!fs_devs-device_dir_kobj)
return -ENOMEM;
@@ -730,8 +730,8 @@ int btrfs_sysfs_add_fsid(struct btrfs_fs_devices *fs_devs,
int error;
 
init_completion(fs_devs-kobj_unregister);
-   fs_devs-super_kobj.kset = btrfs_kset;
-   error = kobject_init_and_add(fs_devs-super_kobj,
+   fs_devs-fsid_kobj.kset = btrfs_kset;
+   error = kobject_init_and_add(fs_devs-fsid_kobj,
btrfs_ktype, parent, %pU, fs_devs-fsid);
return error;
 }
@@ -740,7 +740,7 @@ int btrfs_sysfs_add_mounted(struct btrfs_fs_info *fs_info)
 {
int error;
struct btrfs_fs_devices *fs_devs = fs_info-fs_devices;
-   struct kobject *super_kobj = fs_devs-super_kobj;
+   struct kobject *fsid_kobj = fs_devs-fsid_kobj;
 
btrfs_set_fs_info_ptr(fs_info);
 
@@ -748,13 +748,13 @@ int btrfs_sysfs_add_mounted(struct btrfs_fs_info *fs_info)
if (error)
return error;
 
-   error = sysfs_create_files(super_kobj, btrfs_attrs);
+   error = sysfs_create_files(fsid_kobj, btrfs_attrs);
if (error) {
btrfs_sysfs_rm_device_link(fs_devs, NULL);
return error;
}
 
-   error = sysfs_create_group(super_kobj,
+   error = sysfs_create_group(fsid_kobj,

[PATCH 02/23] Btrfs: rename btrfs_sysfs_remove_one to btrfs_sysfs_remove_mounted

2015-08-14 Thread Anand Jain

---
 fs/btrfs/ctree.h   | 2 +-
 fs/btrfs/disk-io.c | 4 ++--
 fs/btrfs/sysfs.c   | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index afce306..4484063 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -4005,7 +4005,7 @@ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
 int btrfs_init_sysfs(void);
 void btrfs_exit_sysfs(void);
 int btrfs_sysfs_add_mounted(struct btrfs_fs_info *fs_info);
-void btrfs_sysfs_remove_one(struct btrfs_fs_info *fs_info);
+void btrfs_sysfs_remove_mounted(struct btrfs_fs_info *fs_info);
 
 /* xattr.c */
 ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size);
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 376a6ef..8571025 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3108,7 +3108,7 @@ fail_cleaner:
filemap_write_and_wait(fs_info-btree_inode-i_mapping);
 
 fail_sysfs:
-   btrfs_sysfs_remove_one(fs_info);
+   btrfs_sysfs_remove_mounted(fs_info);
 
 fail_fsdev_sysfs:
btrfs_sysfs_remove_fsid(fs_info-fs_devices);
@@ -3791,7 +3791,7 @@ void close_ctree(struct btrfs_root *root)
   percpu_counter_sum(fs_info-delalloc_bytes));
}
 
-   btrfs_sysfs_remove_one(fs_info);
+   btrfs_sysfs_remove_mounted(fs_info);
btrfs_sysfs_remove_fsid(fs_info-fs_devices);
 
btrfs_free_fs_roots(fs_info);
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index cabf840..095a302 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -545,7 +545,7 @@ void btrfs_sysfs_remove_fsid(struct btrfs_fs_devices 
*fs_devs)
}
 }
 
-void btrfs_sysfs_remove_one(struct btrfs_fs_info *fs_info)
+void btrfs_sysfs_remove_mounted(struct btrfs_fs_info *fs_info)
 {
btrfs_reset_fs_info_ptr(fs_info);
 
@@ -776,7 +776,7 @@ int btrfs_sysfs_add_mounted(struct btrfs_fs_info *fs_info)
 
return 0;
 failure:
-   btrfs_sysfs_remove_one(fs_info);
+   btrfs_sysfs_remove_mounted(fs_info);
return error;
 }
 
-- 
2.4.1


[PATCH 17/23] Btrfs: kernel operation should come after user input has been verified

2015-08-14 Thread Anand Jain

As a general rule of thumb, there shouldn't be any way for user land
to trigger a kernel operation just by sending wrong arguments.

Here, move the transaction commit so that it runs only after the user
input has been verified.
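
The ordering this patch aims for can be sketched outside the kernel as
below. This is an illustration only, not the kernel code: the function
start_replace() and the find_device / commit_transaction callbacks are
hypothetical stand-ins, and Python exceptions stand in for the
-EINVAL / -ENOENT returns.

```python
def start_replace(srcdevid, srcdev_name, tgtdev_name,
                  find_device, commit_transaction):
    # Step 1: pure argument checks, no side effects.
    if (srcdevid == 0 and not srcdev_name) or not tgtdev_name:
        raise ValueError("EINVAL: source or target not specified")

    # Step 2: lookups that can still fail because of what the caller passed.
    src = find_device(srcdevid, srcdev_name)
    if src is None:
        raise LookupError("ENOENT: no such source device")

    # Step 3: only now trigger the expensive operation.
    commit_transaction()
    return src


# Hypothetical callbacks used to observe the ordering.
commits = []
devices = {1: "/dev/sdd", 2: "/dev/sde"}


def find_device(devid, name):
    if devid:
        return devices.get(devid)
    return name if name in devices.values() else None


def commit_transaction():
    commits.append("commit")
```

With this ordering a call carrying bad arguments fails before
commit_transaction() is ever reached, so wrong input alone cannot
force a commit.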

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/dev-replace.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 673a2c3..937e53b 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -325,19 +325,6 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
 	    args->start.tgtdev_name[0] == '\0')
 		return -EINVAL;
 
-	/*
-	 * Here we commit the transaction to make sure commit_total_bytes
-	 * of all the devices are updated.
-	 */
-	trans = btrfs_attach_transaction(root);
-	if (!IS_ERR(trans)) {
-		ret = btrfs_commit_transaction(trans, root);
-		if (ret)
-			return ret;
-	} else if (PTR_ERR(trans) != -ENOENT) {
-		return PTR_ERR(trans);
-	}
-
 	/* the disk copy procedure reuses the scrub code */
 	mutex_lock(&fs_info->volume_mutex);
 	ret = btrfs_find_device_by_user_input(root, args->start.srcdevid,
@@ -354,6 +341,19 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
 	if (ret)
 		return ret;
 
+	/*
+	 * Here we commit the transaction to make sure commit_total_bytes
+	 * of all the devices are updated.
+	 */
+	trans = btrfs_attach_transaction(root);
+	if (!IS_ERR(trans)) {
+		ret = btrfs_commit_transaction(trans, root);
+		if (ret)
+			return ret;
+	} else if (PTR_ERR(trans) != -ENOENT) {
+		return PTR_ERR(trans);
+	}
+
 	btrfs_dev_replace_lock(dev_replace);
 	switch (dev_replace->replace_state) {
 	case BTRFS_IOCTL_DEV_REPLACE_STATE_NEVER_STARTED:
-- 
2.4.1


[PATCH 21/23] Btrfs: fix fs logging for multi device

2015-08-14 Thread Anand Jain

In the case of a multi-device btrfs fs, using one of the devices
for logging purposes is quite confusing; use the fsid instead.
The FSID is a bit long, but the device path can be long as well
in some cases.
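
A small sketch of the difference between the two prefixes, outside the
kernel. The two format strings follow the printk() calls in the diff
below; the helper names and the FSID value are made up for
illustration.

```python
import uuid


def prefix_by_device(level, device, msg):
    # Old style: the message is tagged with one member device.
    return f"BTRFS {level} (device {device}): {msg}"


def prefix_by_fsid(level, fsid, msg):
    # New style: the message is tagged with the filesystem UUID.
    return f"BTRFS: {fsid} {level}: {msg}"


fsid = uuid.UUID("0b3a1f9e-6c2d-4e55-9a7b-1f2e3d4c5b6a")  # made-up FSID

# The same filesystem seen through two of its member devices.
old_a = prefix_by_device("info", "sdd", "disk added")
old_b = prefix_by_device("info", "sde", "disk added")
new_a = prefix_by_fsid("info", fsid, "disk added")
new_b = prefix_by_fsid("info", fsid, "disk added")
```

With the device-based prefix the two lines look like they belong to
different filesystems; with the fsid they are tagged identically.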

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/super.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 56c0174..a8a0109 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -190,12 +190,12 @@ static const char * const logtypes[] = {
 
 void btrfs_printk(const struct btrfs_fs_info *fs_info, const char *fmt, ...)
 {
-	struct super_block *sb = fs_info->sb;
 	char lvl[4];
 	struct va_format vaf;
 	va_list args;
 	const char *type = logtypes[4];
 	int kern_level;
+	struct btrfs_fs_devices *fs_devs = fs_info->fs_devices;
 
 	va_start(args, fmt);
 
@@ -212,7 +212,7 @@ void btrfs_printk(const struct btrfs_fs_info *fs_info, const char *fmt, ...)
 	vaf.fmt = fmt;
 	vaf.va = &args;
 
-	printk("%sBTRFS %s (device %s): %pV\n", lvl, type, sb->s_id, &vaf);
+	printk("%sBTRFS: %pU %s: %pV\n", lvl, fs_devs->fsid, type, &vaf);
 
 	va_end(args);
 }
-- 
2.4.1


[PATCH 12/23] Btrfs: use btrfs_find_device_by_user_input()

2015-08-14 Thread Anand Jain

btrfs_rm_device() has a section of code which can be replaced by
btrfs_find_device_by_user_input().

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/volumes.c | 84 --
 1 file changed, 19 insertions(+), 65 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index f1b36b9..1d35332 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1689,7 +1689,6 @@ int btrfs_rm_device(struct btrfs_root *root, char 
*device_path, u64 devid)
struct btrfs_super_block *disk_super = NULL;
struct btrfs_fs_devices *cur_devices;
u64 num_devices;
-   u8 *dev_uuid;
int ret = 0;
bool clear_super = false;
char *dev_name = NULL;
@@ -1700,62 +1699,19 @@ int btrfs_rm_device(struct btrfs_root *root, char 
*device_path, u64 devid)
if (ret)
goto out;
 
-   if (devid) {
-   device = btrfs_find_device(root-fs_info, devid,
-   NULL, NULL);
-   if (!device) {
-   ret = -ENOENT;
-   goto out;
-   }
-   device_path = rcu_str_deref(device-name);
-   } else if (strcmp(device_path, missing) == 0) {
-   struct list_head *devices;
-   struct btrfs_device *tmp;
-
-   device = NULL;
-   devices = root-fs_info-fs_devices-devices;
-   /*
-* It is safe to read the devices since the volume_mutex
-* is held.
-*/
-   list_for_each_entry(tmp, devices, dev_list) {
-   if (tmp-in_fs_metadata 
-   !tmp-is_tgtdev_for_dev_replace 
-   !tmp-bdev) {
-   device = tmp;
-   break;
-   }
-   }
-   if (!device) {
-   ret = BTRFS_ERROR_DEV_MISSING_NOT_FOUND;
-   goto out;
-   }
-   } else {
-   ret = btrfs_get_bdev_and_sb(device_path,
-   FMODE_WRITE | FMODE_EXCL,
-   root-fs_info-bdev_holder, 0,
-   bdev, bh);
-   if (ret)
-   goto out;
-   disk_super = (struct btrfs_super_block *)bh-b_data;
-   devid = btrfs_stack_device_id(disk_super-dev_item);
-   dev_uuid = disk_super-dev_item.uuid;
-   device = btrfs_find_device(root-fs_info, devid, dev_uuid,
-  disk_super-fsid);
-   if (!device) {
-   ret = -ENOENT;
-   goto error_brelse;
-   }
-   }
+   ret = btrfs_find_device_by_user_input(root, devid, device_path,
+   device);
+   if (ret)
+   goto out;
 
if (device-is_tgtdev_for_dev_replace) {
ret = BTRFS_ERROR_DEV_TGT_REPLACE;
-   goto error_brelse;
+   goto out;
}
 
if (device-writeable  root-fs_info-fs_devices-rw_devices == 1) {
ret = BTRFS_ERROR_DEV_ONLY_WRITABLE;
-   goto error_brelse;
+   goto out;
}
 
if (device-writeable) {
@@ -1865,7 +1821,7 @@ int btrfs_rm_device(struct btrfs_root *root, char 
*device_path, u64 devid)
 * to fail. So return success
 */
ret = 0;
-   goto done;
+   goto out;
}
 
disk_super = (struct btrfs_super_block *)bh-b_data;
@@ -1876,6 +1832,7 @@ int btrfs_rm_device(struct btrfs_root *root, char 
*device_path, u64 devid)
memset(disk_super-magic, 0, sizeof(disk_super-magic));
set_buffer_dirty(bh);
sync_dirty_buffer(bh);
+   brelse(bh);
 
/* clear the mirror copies of super block on the disk
 * being removed, 0th copy is been taken care above and
@@ -1887,7 +1844,6 @@ int btrfs_rm_device(struct btrfs_root *root, char 
*device_path, u64 devid)
i_size_read(bdev-bd_inode))
break;
 
-   brelse(bh);
bh = __bread(bdev, bytenr / 4096,
BTRFS_SUPER_INFO_SIZE);
if (!bh)
@@ -1897,35 +1853,33 @@ int btrfs_rm_device(struct btrfs_root *root, char 
*device_path, u64 devid)
 
if (btrfs_super_bytenr(disk_super) != bytenr ||
btrfs_super_magic(disk_super) != BTRFS_MAGIC) {
+   brelse(bh);

[PATCH 10/23] Btrfs: rename btrfs_dev_replace_find_srcdev()

2015-08-14 Thread Anand Jain

The patch renames btrfs_dev_replace_find_srcdev() to
btrfs_find_device_by_user_input() so that it can be used
by btrfs_rm_device() as well in the next patches.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/dev-replace.c | 24 +---
 fs/btrfs/volumes.c | 19 +++
 fs/btrfs/volumes.h |  3 +++
 3 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 6eb9324..673a2c3 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -44,9 +44,6 @@ static void btrfs_dev_replace_update_device_in_mapping_tree(
struct btrfs_fs_info *fs_info,
struct btrfs_device *srcdev,
struct btrfs_device *tgtdev);
-static int btrfs_dev_replace_find_srcdev(struct btrfs_root *root, u64 srcdevid,
-char *srcdev_name,
-struct btrfs_device **device);
 static u64 __btrfs_dev_replace_cancel(struct btrfs_fs_info *fs_info);
 static int btrfs_dev_replace_kthread(void *data);
 static int btrfs_dev_replace_continue_on_mount(struct btrfs_fs_info *fs_info);
@@ -343,7 +340,7 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
 
/* the disk copy procedure reuses the scrub code */
mutex_lock(fs_info-volume_mutex);
-   ret = btrfs_dev_replace_find_srcdev(root, args-start.srcdevid,
+   ret = btrfs_find_device_by_user_input(root, args-start.srcdevid,
args-start.srcdev_name,
src_device);
if (ret) {
@@ -626,25 +623,6 @@ static void 
btrfs_dev_replace_update_device_in_mapping_tree(
write_unlock(em_tree-lock);
 }
 
-static int btrfs_dev_replace_find_srcdev(struct btrfs_root *root, u64 srcdevid,
-char *srcdev_name,
-struct btrfs_device **device)
-{
-   int ret;
-
-   if (srcdevid) {
-   ret = 0;
-   *device = btrfs_find_device(root-fs_info, srcdevid, NULL,
-   NULL);
-   if (!*device)
-   ret = -ENOENT;
-   } else {
-   ret = btrfs_find_device_missing_or_by_path(root, srcdev_name,
-  device);
-   }
-   return ret;
-}
-
 void btrfs_dev_replace_status(struct btrfs_fs_info *fs_info,
  struct btrfs_ioctl_dev_replace_args *args)
 {
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1573997..101a473 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2089,6 +2089,25 @@ int btrfs_find_device_missing_or_by_path(struct 
btrfs_root *root,
}
 }
 
+int btrfs_find_device_by_user_input(struct btrfs_root *root, u64 srcdevid,
+char *srcdev_name,
+struct btrfs_device **device)
+{
+   int ret;
+
+   if (srcdevid) {
+   ret = 0;
+   *device = btrfs_find_device(root-fs_info, srcdevid, NULL,
+   NULL);
+   if (!*device)
+   ret = -ENOENT;
+   } else {
+   ret = btrfs_find_device_missing_or_by_path(root, srcdev_name,
+  device);
+   }
+   return ret;
+}
+
 /*
  * does all the dirty work required for changing file system's UUID.
  */
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index f4b0ed8..a093b36 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -429,6 +429,9 @@ void btrfs_close_extra_devices(struct btrfs_fs_devices 
*fs_devices, int step);
 int btrfs_find_device_missing_or_by_path(struct btrfs_root *root,
 char *device_path,
 struct btrfs_device **device);
+int btrfs_find_device_by_user_input(struct btrfs_root *root, u64 srcdevid,
+char *srcdev_name,
+struct btrfs_device **device);
 struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info,
const u64 *devid,
const u8 *uuid);
-- 
2.4.1


[PATCH 00/23] btrfs device related patch set

2015-08-14 Thread Anand Jain

This patch set includes patches which have been sent before
independently; here they are consolidated on top of the
current integration 4.3.

Most of them are cleanup and preparatory work for the RFEs
which were published before, viz. the addition of sys volume
attributes and the introduction of a method to offline a device.

The exceptions are the patch
 Btrfs: device delete by devid
which provides a way to delete a device using its devid (assuming
that device has failed), and thus fixes the issue reported by a
user in the community,

and the patch
 Btrfs: allow -o rw,degraded for single group profile
which fixes an important btrfs volume availability issue.


Anand Jain (22):
  Btrfs: rename btrfs_sysfs_add_one to btrfs_sysfs_add_mounted
  Btrfs: rename btrfs_sysfs_remove_one to btrfs_sysfs_remove_mounted
  Btrfs: rename btrfs_kobj_add_device to btrfs_sysfs_add_device_link
  Btrfs: rename btrfs_kobj_rm_device to btrfs_sysfs_rm_device_link
  Btrfs: rename super_kobj to fsid_kobj
  Btrfs: SB read failure should return EIO for __bread failure
  Btrfs: __btrfs_std_error() logic should be consistent w/out
CONFIG_PRINTK defined
  Btrfs: device delete by devid
  Btrfs: move check for min number of devices to a function
  Btrfs: rename btrfs_dev_replace_find_srcdev()
  Btrfs: use BTRFS_ERROR_DEV_MISSING_NOT_FOUND when missing device is
not found
  Btrfs: use btrfs_find_device_by_user_input()
  Btrfs: add btrfs_read_dev_one_super() to read one specific SB
  Btrfs: fix btrfs_scratch_superblock() with fixes from device delete
  Btrfs: use btrfs_scratch_superblock() in btrfs_rm_device()
  Btrfs: device path change must be logged
  Btrfs: kernel operation should come after user input has been verified
  Btrfs: check device_path in btrfs_find_device_by_user_input()
  Btrfs: avoid user cli usage error logging into the sys log
  Btrfs: move device close to btrfs_close_one_device
  Btrfs: fix fs logging for multi device
  Btrfs: allow -o rw,degraded for single group profile

Liu Bo (1):
  Btrfs: move kobj stuff out of dev_replace lock range

 fs/btrfs/ctree.h   |   4 +-
 fs/btrfs/dev-replace.c |  64 +++--
 fs/btrfs/disk-io.c |  65 ++---
 fs/btrfs/disk-io.h |   2 +
 fs/btrfs/ioctl.c   |  50 ++-
 fs/btrfs/super.c   |  34 ++---
 fs/btrfs/sysfs.c   |  52 +++
 fs/btrfs/sysfs.h   |   4 +-
 fs/btrfs/volumes.c | 345 +
 fs/btrfs/volumes.h |  10 +-
 include/uapi/linux/btrfs.h |   8 ++
 11 files changed, 331 insertions(+), 307 deletions(-)

-- 
2.4.1


[PATCH 01/23] Btrfs: rename btrfs_sysfs_add_one to btrfs_sysfs_add_mounted

2015-08-14 Thread Anand Jain

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/ctree.h   | 2 +-
 fs/btrfs/disk-io.c | 2 +-
 fs/btrfs/sysfs.c   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 938efe3..afce306 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -4004,7 +4004,7 @@ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
 /* sysfs.c */
 int btrfs_init_sysfs(void);
 void btrfs_exit_sysfs(void);
-int btrfs_sysfs_add_one(struct btrfs_fs_info *fs_info);
+int btrfs_sysfs_add_mounted(struct btrfs_fs_info *fs_info);
 void btrfs_sysfs_remove_one(struct btrfs_fs_info *fs_info);
 
 /* xattr.c */
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index cc15514b..376a6ef 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2928,7 +2928,7 @@ retry_root_backup:
 		goto fail_fsdev_sysfs;
 	}
 
-	ret = btrfs_sysfs_add_one(fs_info);
+	ret = btrfs_sysfs_add_mounted(fs_info);
 	if (ret) {
 		pr_err("BTRFS: failed to init sysfs interface: %d\n", ret);
 		goto fail_fsdev_sysfs;
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 603b0cc..cabf840 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -736,7 +736,7 @@ int btrfs_sysfs_add_fsid(struct btrfs_fs_devices *fs_devs,
 	return error;
 }
 
-int btrfs_sysfs_add_one(struct btrfs_fs_info *fs_info)
+int btrfs_sysfs_add_mounted(struct btrfs_fs_info *fs_info)
 {
 	int error;
 	struct btrfs_fs_devices *fs_devs = fs_info->fs_devices;
-- 
2.4.1


[PATCH 08/23] Btrfs: device delete by devid

2015-08-14 Thread Anand Jain

This introduces BTRFS_IOC_RM_DEV_V2, which can accept devid as
an argument to delete the device.

Currently the only choice is to pass the device path to the device
delete cli, but if btrfs is unable to read the device SB, the cli
fails and the user won't be able to delete the device.

With this patch the user can now specify the devid of the device
to delete.

The patch doesn't remove the old interface, so the kernel will
remain compatible with older user-interface programs like
btrfs-progs.
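
Why a devid helps when the path does not can be sketched with a toy
model. This is not the kernel implementation: the device table, the
"sb_readable" flag and the function name below are assumptions made
for illustration. A by-path lookup has to read the superblock from the
device, while a by-devid lookup only consults the in-memory table.

```python
def find_device_by_user_input(devices, devid, path):
    """Look a device up by devid if one is given, otherwise by path."""
    if devid:
        # No I/O to the device itself, so this works for a failed disk.
        for dev in devices:
            if dev["devid"] == devid:
                return dev
        raise LookupError("ENOENT: no device with that devid")

    for dev in devices:
        if dev["path"] == path:
            if not dev["sb_readable"]:
                # Identifying a device by path means reading its superblock.
                raise OSError("EIO: cannot read superblock")
            return dev
    raise LookupError("ENOENT: no device with that path")


# Toy device table; devid 3 stands for the disk that now returns I/O errors.
devices = [
    {"devid": 1, "path": "/dev/sdd", "sb_readable": True},
    {"devid": 2, "path": "/dev/sde", "sb_readable": True},
    {"devid": 3, "path": "/dev/mapper/bad_disk", "sb_readable": False},
]

by_devid = find_device_by_user_input(devices, 3, "")
```

In this model the path-based lookup of /dev/mapper/bad_disk raises an
I/O error, while the devid-based lookup still finds the device.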

Test case/script:
echo 0 $(blockdev --getsz /dev/sdf) linear /dev/sdf 0 | dmsetup create bad_disk
mkfs.btrfs -f -d raid1 -m raid1 /dev/sdd /dev/sde /dev/mapper/bad_disk
mount /dev/sdd /btrfs
dmsetup suspend bad_disk
echo 0 $(blockdev --getsz /dev/sdf) error /dev/sdf 0 | dmsetup load bad_disk
dmsetup resume bad_disk
echo bad disk failed. now deleting/replacing
btrfs dev del  3  /btrfs
echo $?
btrfs fi show /btrfs
umount /btrfs
btrfs-show-super /dev/sdd | egrep num_device
dmsetup remove bad_disk
wipefs -a /dev/sdf

v3: commit update, included test case

v2: don't use device->name after free
commit update with the test script which I have been using

Signed-off-by: Anand Jain anand.j...@oracle.com
Reported-by: Martin m_bt...@ml1.co.uk
---
 fs/btrfs/ioctl.c   | 50 -
 fs/btrfs/volumes.c | 51 --
 fs/btrfs/volumes.h |  2 +-
 include/uapi/linux/btrfs.h |  8 
 4 files changed, 98 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 0adf542..6c9e58c 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2653,6 +2653,52 @@ out:
return ret;
 }
 
+static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
+{
+	struct btrfs_root *root = BTRFS_I(file_inode(file))->root;
+	struct btrfs_ioctl_vol_args_v3 *vol_args;
+	int ret;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	ret = mnt_want_write_file(file);
+	if (ret)
+		return ret;
+
+	vol_args = memdup_user(arg, sizeof(*vol_args));
+	if (IS_ERR(vol_args)) {
+		ret = PTR_ERR(vol_args);
+		goto err_drop;
+	}
+
+	vol_args->name[BTRFS_PATH_NAME_MAX] = '\0';
+
+	if (atomic_xchg(&root->fs_info->mutually_exclusive_operation_running,
+			1)) {
+		ret = BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS;
+		goto out;
+	}
+
+	mutex_lock(&root->fs_info->volume_mutex);
+	ret = btrfs_rm_device(root, vol_args->name, vol_args->devid);
+	mutex_unlock(&root->fs_info->volume_mutex);
+	atomic_set(&root->fs_info->mutually_exclusive_operation_running, 0);
+
+	if (!ret) {
+		if (vol_args->devid)
+			btrfs_info(root->fs_info, "disk devid %llu deleted",
+					vol_args->devid);
+		else
+			btrfs_info(root->fs_info, "disk deleted - %s", vol_args->name);
+	}
+out:
+	kfree(vol_args);
+err_drop:
+	mnt_drop_write_file(file);
+	return ret;
+}
+
 static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
 {
 	struct btrfs_root *root = BTRFS_I(file_inode(file))->root;
@@ -2681,7 +2727,7 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
 	}
 
 	mutex_lock(&root->fs_info->volume_mutex);
-	ret = btrfs_rm_device(root, vol_args->name);
+	ret = btrfs_rm_device(root, vol_args->name, 0);
 	mutex_unlock(&root->fs_info->volume_mutex);
 	atomic_set(&root->fs_info->mutually_exclusive_operation_running, 0);
 
@@ -5419,6 +5465,8 @@ long btrfs_ioctl(struct file *file, unsigned int
return btrfs_ioctl_add_dev(root, argp);
case BTRFS_IOC_RM_DEV:
return btrfs_ioctl_rm_dev(file, argp);
+   case BTRFS_IOC_RM_DEV_V2:
+   return btrfs_ioctl_rm_dev_v2(file, argp);
case BTRFS_IOC_FS_INFO:
return btrfs_ioctl_fs_info(root, argp);
case BTRFS_IOC_DEV_INFO:
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a3fde18..2f8b974 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1637,21 +1637,21 @@ out:
return ret;
 }
 
-int btrfs_rm_device(struct btrfs_root *root, char *device_path)
+int btrfs_rm_device(struct btrfs_root *root, char *device_path, u64 devid)
 {
struct btrfs_device *device;
struct btrfs_device *next_device;
-   struct block_device *bdev;
+   struct block_device *bdev = NULL;
struct buffer_head *bh = NULL;
-   struct btrfs_super_block *disk_super;
+   struct btrfs_super_block *disk_super = NULL;
struct btrfs_fs_devices *cur_devices;
u64 all_avail;
-   u64 devid;
u64 num_devices;
u8 *dev_uuid;
unsigned seq;
int ret = 0;
bool clear_super = false;
+   char *dev_name = NULL;

[PATCH 22/23] Btrfs: move kobj stuff out of dev_replace lock range

2015-08-14 Thread Anand Jain

From: Liu Bo bo.li@oracle.com

To avoid the deadlock described in commit 084b6e7c7607 ("btrfs: Fix a lockdep
warning when running xfstest."), we should move the kobj stuff out of the
dev_replace lock range.

Signed-off-by: Liu Bo bo.li@oracle.com
Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/dev-replace.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 0df3d9b..c326d51 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -369,10 +369,6 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
 	WARN_ON(!tgt_device);
 	dev_replace->tgtdev = tgt_device;
 
-	ret = btrfs_sysfs_add_device_link(tgt_device->fs_devices, tgt_device);
-	if (ret)
-		btrfs_err(root->fs_info, "kobj add dev failed %d\n", ret);
-
 	printk_in_rcu(KERN_INFO
 		      "BTRFS: dev_replace from %s (devid %llu) to %s started\n",
 		      src_device->missing ? "<missing disk>" :
@@ -395,6 +391,10 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
 	args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_ERROR;
 	btrfs_dev_replace_unlock(dev_replace);
 
+	ret = btrfs_sysfs_add_device_link(tgt_device->fs_devices, tgt_device);
+	if (ret)
+		btrfs_err(root->fs_info, "kobj add dev failed %d\n", ret);
+
 	btrfs_wait_ordered_roots(root->fs_info, -1);
 
 	/* force writing the updated state information to disk */
-- 
2.4.1

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 09/23] Btrfs: move check for min number of devices to a function

2015-08-14 Thread Anand Jain

btrfs_rm_device() has a section of code that checks the minimum number
of devices required by the various group profiles. This patch moves
that part of the code into the function __check_raid_min_devices().

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/volumes.c | 78 ++
 1 file changed, 43 insertions(+), 35 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2f8b974..1573997 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1637,61 +1637,69 @@ out:
return ret;
 }
 
-int btrfs_rm_device(struct btrfs_root *root, char *device_path, u64 devid)
+static int __check_raid_min_devices(struct btrfs_fs_info *fs_info)
 {
-   struct btrfs_device *device;
-   struct btrfs_device *next_device;
-   struct block_device *bdev = NULL;
-   struct buffer_head *bh = NULL;
-   struct btrfs_super_block *disk_super = NULL;
-   struct btrfs_fs_devices *cur_devices;
u64 all_avail;
u64 num_devices;
-   u8 *dev_uuid;
unsigned seq;
-   int ret = 0;
-   bool clear_super = false;
-   char *dev_name = NULL;
-
-	mutex_lock(&uuid_mutex);
 
-	do {
-		seq = read_seqbegin(&root->fs_info->profiles_lock);
-
-		all_avail = root->fs_info->avail_data_alloc_bits |
-			    root->fs_info->avail_system_alloc_bits |
-			    root->fs_info->avail_metadata_alloc_bits;
-	} while (read_seqretry(&root->fs_info->profiles_lock, seq));
-
-	num_devices = root->fs_info->fs_devices->num_devices;
-	btrfs_dev_replace_lock(&root->fs_info->dev_replace);
-	if (btrfs_dev_replace_is_ongoing(&root->fs_info->dev_replace)) {
+	num_devices = fs_info->fs_devices->num_devices;
+	btrfs_dev_replace_lock(&fs_info->dev_replace);
+	if (btrfs_dev_replace_is_ongoing(&fs_info->dev_replace)) {
 		WARN_ON(num_devices < 1);
 		num_devices--;
 	}
-	btrfs_dev_replace_unlock(&root->fs_info->dev_replace);
+	btrfs_dev_replace_unlock(&fs_info->dev_replace);
+
+	do {
+		seq = read_seqbegin(&fs_info->profiles_lock);
+
+		all_avail = fs_info->avail_data_alloc_bits |
+			    fs_info->avail_system_alloc_bits |
+			    fs_info->avail_metadata_alloc_bits;
+	} while (read_seqretry(&fs_info->profiles_lock, seq));
 
 	if ((all_avail & BTRFS_BLOCK_GROUP_RAID10) && num_devices <= 4) {
-		ret = BTRFS_ERROR_DEV_RAID10_MIN_NOT_MET;
-		goto out;
+		return BTRFS_ERROR_DEV_RAID10_MIN_NOT_MET;
 	}
 
 	if ((all_avail & BTRFS_BLOCK_GROUP_RAID1) && num_devices <= 2) {
-		ret = BTRFS_ERROR_DEV_RAID1_MIN_NOT_MET;
-		goto out;
+		return BTRFS_ERROR_DEV_RAID1_MIN_NOT_MET;
 	}
 
 	if ((all_avail & BTRFS_BLOCK_GROUP_RAID5) &&
-	    root->fs_info->fs_devices->rw_devices <= 2) {
-		ret = BTRFS_ERROR_DEV_RAID5_MIN_NOT_MET;
-		goto out;
+	    fs_info->fs_devices->rw_devices <= 2) {
+		return BTRFS_ERROR_DEV_RAID5_MIN_NOT_MET;
 	}
+
 	if ((all_avail & BTRFS_BLOCK_GROUP_RAID6) &&
-	    root->fs_info->fs_devices->rw_devices <= 3) {
-		ret = BTRFS_ERROR_DEV_RAID6_MIN_NOT_MET;
-		goto out;
+	    fs_info->fs_devices->rw_devices <= 3) {
+		return BTRFS_ERROR_DEV_RAID6_MIN_NOT_MET;
 	}
 
+   return 0;
+}
+
+int btrfs_rm_device(struct btrfs_root *root, char *device_path, u64 devid)
+{
+   struct btrfs_device *device;
+   struct btrfs_device *next_device;
+   struct block_device *bdev = NULL;
+   struct buffer_head *bh = NULL;
+   struct btrfs_super_block *disk_super = NULL;
+   struct btrfs_fs_devices *cur_devices;
+   u64 num_devices;
+   u8 *dev_uuid;
+   int ret = 0;
+   bool clear_super = false;
+   char *dev_name = NULL;
+
+	mutex_lock(&uuid_mutex);
+
+	ret = __check_raid_min_devices(root->fs_info);
+	if (ret)
+		goto out;
+
 	if (devid) {
 		device = btrfs_find_device(root->fs_info, devid,
 				NULL, NULL);
-- 
2.4.1
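
For readers following the refactoring above, here is a minimal standalone
sketch of the minimum-device check as a pure function. It is not the kernel
code: the profile bits, the error enum and the function name are hypothetical
stand-ins, locking is left out, and the `<=` comparisons are an assumption
because the archived diff lost its comparison operators.

```c
#include <stdint.h>

/* Hypothetical profile bits for this sketch only; these are not the
 * BTRFS_BLOCK_GROUP_* values from the kernel headers. */
#define PROFILE_RAID1   (1ULL << 0)
#define PROFILE_RAID10  (1ULL << 1)
#define PROFILE_RAID5   (1ULL << 2)
#define PROFILE_RAID6   (1ULL << 3)

/* Hypothetical result codes standing in for BTRFS_ERROR_DEV_*_MIN_NOT_MET. */
enum min_dev_error {
	MIN_DEV_OK = 0,
	ERR_RAID10_MIN_NOT_MET,
	ERR_RAID1_MIN_NOT_MET,
	ERR_RAID5_MIN_NOT_MET,
	ERR_RAID6_MIN_NOT_MET,
};

/*
 * Decide whether removing one more device would violate the minimum
 * device count of any profile in use. all_avail is the OR of the
 * profile bits used for data, system and metadata.
 */
static int check_raid_min_devices(uint64_t all_avail, uint64_t num_devices,
				  uint64_t rw_devices, int replace_ongoing)
{
	/* A running replace temporarily accounts for one extra device. */
	if (replace_ongoing && num_devices > 0)
		num_devices--;

	if ((all_avail & PROFILE_RAID10) && num_devices <= 4)
		return ERR_RAID10_MIN_NOT_MET;
	if ((all_avail & PROFILE_RAID1) && num_devices <= 2)
		return ERR_RAID1_MIN_NOT_MET;
	if ((all_avail & PROFILE_RAID5) && rw_devices <= 2)
		return ERR_RAID5_MIN_NOT_MET;
	if ((all_avail & PROFILE_RAID6) && rw_devices <= 3)
		return ERR_RAID6_MIN_NOT_MET;

	return MIN_DEV_OK;
}
```

Keeping the check free of side effects is what lets the helper return early
instead of using the `goto out` pattern of the caller.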


[PATCH 23/23] Btrfs: allow -o rw,degraded for single group profile

2015-08-14 Thread Anand Jain

As of now the only exception that allows a mount when the number of
missing devices exceeds the group profile's tolerance count is
 RDONLY
This patch adds another lateral exception:
 DEGRADED

This will enable the user to recover from the following and
similar volume unavailability issues.

raid1 volume:
 mkfs.btrfs -draid1 -mraid1 /dev/sdc /dev/sdd

Clear the kernel's device scan state:
 modprobe -r btrfs && modprobe btrfs   <= scanned device list is cleared

Since the kernel does not know about /dev/sdd, use the degraded
option to mount:
 mount -o degraded /dev/sdc /btrfs   <= sdd is not used
 umount /btrfs

Problem: after the umount, the mount fails even with the degraded option:
 mount -o degraded /dev/sdc /btrfs   <== fails

Because: the commit triggered by the unmount used the single profile,
which needs all the disks.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/disk-io.c | 3 ++-
 fs/btrfs/super.c   | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 2f2379d..3377f1a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2949,7 +2949,8 @@ retry_root_backup:
 		btrfs_calc_num_tolerated_disk_barrier_failures(fs_info);
 	if (fs_info->fs_devices->missing_devices >
 	     fs_info->num_tolerated_disk_barrier_failures &&
-	    !(sb->s_flags & MS_RDONLY)) {
+	    !(sb->s_flags & MS_RDONLY ||
+	    btrfs_test_opt(fs_info->dev_root, DEGRADED))) {
 		pr_warn("BTRFS: missing devices(%llu) exceeds the limit(%d), writeable mount is not allowed\n",
 			fs_info->fs_devices->missing_devices,
 			fs_info->num_tolerated_disk_barrier_failures);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index a8a0109..315035a2 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1666,7 +1666,8 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 
 		if (fs_info->fs_devices->missing_devices >
 		     fs_info->num_tolerated_disk_barrier_failures &&
-		    !(*flags & MS_RDONLY)) {
+		    !(*flags & MS_RDONLY ||
+		    btrfs_test_opt(root, DEGRADED))) {
 			btrfs_warn(fs_info,
 				"too many missing devices, writeable remount is not allowed");
 			ret = -EACCES;
-- 
2.4.1
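
The condition changed by both hunks above can be read as a single predicate.
The sketch below is only an illustration of that logic as I read it from the
diff; the function and parameter names are hypothetical and nothing here is
kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when a mount (or remount) must be refused: more devices
 * are missing than the profiles tolerate, and neither of the two
 * exceptions (read-only, degraded) applies.
 */
static bool writable_mount_refused(uint64_t missing_devices,
				   uint64_t tolerated_failures,
				   bool rdonly, bool degraded)
{
	return missing_devices > tolerated_failures &&
	       !(rdonly || degraded);
}
```

Before the patch only `rdonly` could override the limit; with it, passing
`-o degraded` overrides the limit as well, which is what makes the second
degraded mount in the description succeed.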


[PATCH 15/23] Btrfs: use btrfs_scratch_superblock() in btrfs_rm_device()

2015-08-14 Thread Anand Jain

With the previous patches, btrfs_scratch_superblock() is now ready
to be used in btrfs_rm_device(), so use it.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/volumes.c | 69 --
 1 file changed, 5 insertions(+), 64 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b2a19ea..ebf37a9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1684,9 +1684,6 @@ int btrfs_rm_device(struct btrfs_root *root, char *device_path, u64 devid)
 {
struct btrfs_device *device;
struct btrfs_device *next_device;
-   struct block_device *bdev = NULL;
-   struct buffer_head *bh = NULL;
-   struct btrfs_super_block *disk_super = NULL;
struct btrfs_fs_devices *cur_devices;
u64 num_devices;
int ret = 0;
@@ -1807,68 +1804,12 @@ int btrfs_rm_device(struct btrfs_root *root, char *device_path, u64 devid)
 	 * remove it from the devices list and zero out the old super
 	 */
 	if (clear_super) {
-		u64 bytenr;
-		int i;
-
-		if (!disk_super) {
-			ret = btrfs_get_bdev_and_sb(dev_name,
-					FMODE_WRITE | FMODE_EXCL,
-					root->fs_info->bdev_holder, 0,
-					&bdev, &bh);
-			if (ret) {
-				/*
-				 * It could be a failed device ok for clear_super
-				 * to fail. So return success
-				 */
-				ret = 0;
-				goto out;
-			}
-
-			disk_super = (struct btrfs_super_block *)bh->b_data;
-		}
-		/* make sure this device isn't detected as part of
-		 * the FS anymore
-		 */
-		memset(&disk_super->magic, 0, sizeof(disk_super->magic));
-		set_buffer_dirty(bh);
-		sync_dirty_buffer(bh);
-		brelse(bh);
-
-		/* clear the mirror copies of super block on the disk
-		 * being removed, 0th copy is been taken care above and
-		 * the below would take of the rest
-		 */
-		for (i = 1; i < BTRFS_SUPER_MIRROR_MAX; i++) {
-			bytenr = btrfs_sb_offset(i);
-			if (bytenr + BTRFS_SUPER_INFO_SIZE >=
-					i_size_read(bdev->bd_inode))
-				break;
-
-			bh = __bread(bdev, bytenr / 4096,
-					BTRFS_SUPER_INFO_SIZE);
-			if (!bh)
-				continue;
-
-			disk_super = (struct btrfs_super_block *)bh->b_data;
-
-			if (btrfs_super_bytenr(disk_super) != bytenr ||
-				btrfs_super_magic(disk_super) != BTRFS_MAGIC) {
-				brelse(bh);
-				continue;
-			}
-			memset(&disk_super->magic, 0,
-						sizeof(disk_super->magic));
-			set_buffer_dirty(bh);
-			sync_dirty_buffer(bh);
-			brelse(bh);
-		}
-
-		if (bdev) {
-			/* Notify udev that device has changed */
-			btrfs_kobject_uevent(bdev, KOBJ_CHANGE);
+		struct block_device *bdev;
 
-			/* Update ctime/mtime for device path for libblkid */
-			update_dev_time(dev_name);
+		bdev = blkdev_get_by_path(dev_name, FMODE_READ | FMODE_EXCL,
+						root->fs_info->bdev_holder);
+		if (!IS_ERR(bdev)) {
+			btrfs_scratch_superblock(bdev, dev_name);
 			blkdev_put(bdev, FMODE_READ | FMODE_EXCL);
 		}
 	}
-- 
2.4.1


[PATCH 0/2] Btrfs-progs: device delete to accept devid

2015-08-14 Thread Anand Jain

This is btrfs-progs side of the patch.
Kernel patch is
   Btrfs: device delete by devid

Anand Jain (2):
  btrfs-progs: move is_numerical to utils-lib.h and make it non static
  btrfs-progs: device delete to accept devid

 Documentation/btrfs-device.asciidoc |  2 +-
 cmds-device.c   | 45 -
 cmds-replace.c  | 11 -
 ioctl.h |  8 +++
 utils-lib.c | 11 +
 utils.h |  1 +
 6 files changed, 56 insertions(+), 22 deletions(-)

-- 
2.4.1


[PATCH 1/2] btrfs-progs: move is_numerical to utils-lib.h and make it non static

2015-08-14 Thread Anand Jain

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 cmds-replace.c | 11 ---
 utils-lib.c| 11 +++
 utils.h|  1 +
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/cmds-replace.c b/cmds-replace.c
index 85365e3..e6a27e3 100644
--- a/cmds-replace.c
+++ b/cmds-replace.c
@@ -65,17 +65,6 @@ static const char * const replace_cmd_group_usage[] = {
NULL
 };
 
-static int is_numerical(const char *str)
-{
-	if (!(*str >= '0' && *str <= '9'))
-		return 0;
-	while (*str >= '0' && *str <= '9')
-		str++;
-	if (*str != '\0')
-		return 0;
-	return 1;
-}
-
 static int dev_replace_cancel_fd = -1;
 static void dev_replace_sigint_handler(int signal)
 {
diff --git a/utils-lib.c b/utils-lib.c
index 79ef35e..9ac0b7b 100644
--- a/utils-lib.c
+++ b/utils-lib.c
@@ -38,3 +38,14 @@ u64 arg_strtou64(const char *str)
}
return value;
 }
+
+int is_numerical(const char *str)
+{
+	if (!(*str >= '0' && *str <= '9'))
+		return 0;
+	while (*str >= '0' && *str <= '9')
+		str++;
+	if (*str != '\0')
+		return 0;
+	return 1;
+}
diff --git a/utils.h b/utils.h
index 94606ed..0975301 100644
--- a/utils.h
+++ b/utils.h
@@ -243,5 +243,6 @@ int btrfs_tree_search2_ioctl_supported(int fd);
 int btrfs_check_nodesize(u32 nodesize, u32 sectorsize);
 
 const char *get_argv0_buf(void);
+int is_numerical(const char *str);
 
 #endif
-- 
2.4.1
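
As a quick illustration of what the moved helper accepts, here is the same
function as a standalone snippet. The comparison and `&&` operators are
restored on the assumption that the archive stripped them; the function is
made static here only so that the snippet stands alone.

```c
/*
 * Returns 1 if str is a non-empty string consisting only of ASCII
 * digits, 0 otherwise. Signs, whitespace and hex prefixes are rejected.
 */
static int is_numerical(const char *str)
{
	if (!(*str >= '0' && *str <= '9'))
		return 0;
	while (*str >= '0' && *str <= '9')
		str++;
	if (*str != '\0')
		return 0;
	return 1;
}
```

So "3" and "123" are treated as numbers, while "", "-1", "12a" and
"/dev/sdf" are not. That is the property a caller needs in order to tell a
devid argument apart from a device path.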


[PATCH 2/2] btrfs-progs: device delete to accept devid

2015-08-14 Thread Anand Jain

This patch introduces a new option, devid, for the command

  btrfs device delete <device_path|devid>[..]  <mnt>

In a user-reported issue on a 3-disk RAID1, one disk failed with its
SB unreadable. With this patch the user has the choice of deleting
the device by devid.

The other method we could use is to match the input device_path
against the device paths available within the kernel. But that won't
work in all cases, for example when the user provides a mapper path
while the path within the kernel is a non-mapper path.

This patch depends on the following kernel patch for the new feature to
work; on a kernel without it, btrfs-progs falls back to the old
interface:

  Btrfs: device delete by devid

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 Documentation/btrfs-device.asciidoc |  2 +-
 cmds-device.c   | 45 -
 ioctl.h |  8 +++
 3 files changed, 44 insertions(+), 11 deletions(-)

diff --git a/Documentation/btrfs-device.asciidoc b/Documentation/btrfs-device.asciidoc
index 2827598..61ede6e 100644
--- a/Documentation/btrfs-device.asciidoc
+++ b/Documentation/btrfs-device.asciidoc
@@ -74,7 +74,7 @@ do not perform discard by default
 -f|--force
 force overwrite of existing filesystem on the given disk(s)
 
-*remove* <dev> [<dev>...] <path>::
+*remove* <dev>|<devid> [<dev>|<devid>...] <path>::
 Remove device(s) from a filesystem identified by <path>.
 
 *delete* <dev> [<dev>...] <path>::
diff --git a/cmds-device.c b/cmds-device.c
index 0e60500..eb4358d 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -164,16 +164,34 @@ static int _cmd_rm_dev(int argc, char **argv, const char * const *usagestr)
 		struct	btrfs_ioctl_vol_args arg;
 		int	res;
 
-		if (!is_block_device(argv[i])) {
+		struct	btrfs_ioctl_vol_args_v3 argv3 = {0};
+		int its_num = false;
+
+		if (is_numerical(argv[i])) {
+			argv3.devid = arg_strtou64(argv[i]);
+			its_num = true;
+		} else if (is_block_device(argv[i])) {
+			strncpy_null(argv3.name, argv[i]);
+		} else {
 			fprintf(stderr,
-				"ERROR: %s is not a block device\n", argv[i]);
+				"ERROR: %s is not a block device or devid\n", argv[i]);
 			ret++;
 			continue;
 		}
-		memset(&arg, 0, sizeof(arg));
-		strncpy_null(arg.name, argv[i]);
-		res = ioctl(fdmnt, BTRFS_IOC_RM_DEV, &arg);
+		res = ioctl(fdmnt, BTRFS_IOC_RM_DEV_V2, &argv3);
 		e = errno;
+		if (res && e == ENOTTY) {
+			if (its_num) {
+				fprintf(stderr,
+				"Error: Kernel does not support delete by devid\n");
+				ret = 1;
+				continue;
+			}
+			memset(&arg, 0, sizeof(arg));
+			strncpy_null(arg.name, argv[i]);
+			res = ioctl(fdmnt, BTRFS_IOC_RM_DEV, &arg);
+			e = errno;
+		}
 		if (res) {
 			const char *msg;
 
@@ -181,9 +199,16 @@ static int _cmd_rm_dev(int argc, char **argv, const char * const *usagestr)
 				msg = btrfs_err_str(res);
 			else
 				msg = strerror(e);
-			fprintf(stderr,
-				"ERROR: error removing the device '%s' - %s\n",
-				argv[i], msg);
+
+			if (its_num)
+				fprintf(stderr,
+				"ERROR: error removing the devid '%llu' - %s\n",
+					argv3.devid, msg);
+			else
+				fprintf(stderr,
+				"ERROR: error removing the device '%s' - %s\n",
+					argv[i], msg);
+
 			ret++;
 		}
 	}
@@ -193,7 +218,7 @@ static int _cmd_rm_dev(int argc, char **argv, const char * const *usagestr)
 }
 
 static const char * const cmd_rm_dev_usage[] = {
-	"btrfs device remove <device> [<device>...] <path>",
+	"btrfs device remove <device>|<devid> [<device>|<devid>...] <path>",
 	"Remove a device from a filesystem",
 	NULL
 };
@@ -204,7 +229,7 @@ static int cmd_rm_dev(int argc, char **argv)
 }
 
 static const char * const cmd_del_dev_usage[] = {
-	"btrfs device delete <device> [<device>...] <path>",
+	"btrfs device delete <device>|<devid> [<device>|<devid>...] <path>",
 	"Remove a device from a filesystem",
 	NULL
 };
diff --git a/ioctl.h b/ioctl.h
index dff015a..6870931 100644
--- a/ioctl.h
+++ b/ioctl.h
@@
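
The fallback behaviour described in the commit message (try the new ioctl,
fall back to the old one on ENOTTY, but only for a path argument) can be
sketched independently of the real ioctl numbers. Everything below is a
hypothetical stand-in: the callback type, the two stub functions and the
choice of ENOTSUP as the "devid not supported" result are assumptions made
for this illustration, not btrfs-progs code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for one ioctl() call: returns 0 on success or a
 * positive errno value on failure. */
typedef int (*rm_dev_call)(const char *arg);

/* Stub that behaves like a kernel lacking the ioctl. */
static int stub_unsupported(const char *arg)
{
	(void)arg;
	return ENOTTY;
}

/* Stub that behaves like a successful removal. */
static int stub_ok(const char *arg)
{
	(void)arg;
	return 0;
}

/*
 * Try the newer interface first. Only when the kernel does not know it
 * (ENOTTY) and the argument is a device path, retry with the older
 * interface; a devid cannot be expressed through the older one.
 */
static int remove_device(const char *arg, bool arg_is_devid,
			 rm_dev_call new_call, rm_dev_call old_call)
{
	int err = new_call(arg);

	if (err == ENOTTY) {
		if (arg_is_devid)
			return ENOTSUP;
		err = old_call(arg);
	}
	return err;
}
```

With these stubs, `remove_device("/dev/sdf", false, stub_unsupported, stub_ok)`
succeeds through the old path, while the same call with a devid argument
reports that the kernel cannot do it.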

[PATCH v5 3/3] xfstests: btrfs: test device delete with EIO on src dev

2015-08-14 Thread Anand Jain

From: Anand Jain anand.j...@oracle.com

This test case tests whether device delete works with
a failed (EIO) source device. EIO errors are produced
using the DM device.

This test needs the following btrfs-progs and btrfs
kernel patches:
   btrfs-progs: device delete to accept devid
   Btrfs: device delete by devid

When the btrfs-progs patch is not present this test will
not run, and when the kernel patch is not present btrfs-progs
will fail gracefully, and the test script with it.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
v4-v5: rebase on latest xfstests code, and accepts Filipe comment
v3-v4: rebase on latest xfstests code
v2-v3: accepts Filipe Manana's review comments, thanks
v1-v2: accepts Dave Chinner's review comments, thanks
 common/rc   |  7 +
 tests/btrfs/099 | 82 +
 tests/btrfs/099.out | 11 +++
 tests/btrfs/group   |  1 +
 4 files changed, 101 insertions(+)
 create mode 100755 tests/btrfs/099
 create mode 100644 tests/btrfs/099.out

diff --git a/common/rc b/common/rc
index 8d4da0e..31a0328 100644
--- a/common/rc
+++ b/common/rc
@@ -2737,6 +2737,13 @@ _require_meta_uuid()
umount $SCRATCH_MNT
 }
 
+_require_btrfs_dev_del_by_devid()
+{
+	$BTRFS_UTIL_PROG device delete --help | egrep devid > /dev/null 2>&1
+	[ $? -eq 0 ] || _notrun "$BTRFS_UTIL_PROG too old" \
+		"(must support 'btrfs device delete <devid> /mnt')"
+}
+
 _get_total_inode()
 {
if [ -z $1 ]; then
diff --git a/tests/btrfs/099 b/tests/btrfs/099
new file mode 100755
index 000..4464e24
--- /dev/null
+++ b/tests/btrfs/099
@@ -0,0 +1,82 @@
+#! /bin/bash
+# FS QA Test No. btrfs/099
+#
+# test device delete when the source device has EIO
+#
+# Copyright (c) 2015 Oracle.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo QA output created by $seq
+
+here=`pwd`
+tmp=/tmp/$$
+
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+
+_cleanup()
+{
+   _cleanup_dmerror
+   rm -f $tmp
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/filter.btrfs
+. ./common/dmerror
+
+_supported_fs btrfs
+_supported_os Linux
+_need_to_be_root
+_require_scratch_dev_pool 3
+_require_btrfs_dev_del_by_devid
+_require_dmerror
+
+rm -f $seqres.full
+
+dev1=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $2}'`
+dev2=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $3}'`
+
+_init_dmerror
+_scratch_mkfs_dmerror -f -d raid1 -m raid1 $dev1 $dev2
+_mount_dmerror
+
+_run_btrfs_util_prog filesystem show -m $SCRATCH_MNT
+$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT | _filter_btrfs_filesystem_show
+
+error_devid=`$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT |\
+	egrep $DMERROR_DEV | $AWK_PROG '{print $2}'`
+
+snapshot_cmd="$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT"
+snapshot_cmd="$snapshot_cmd $SCRATCH_MNT/snap_\`date +'%H_%M_%S_%N'\`"
+run_check $FSSTRESS_PROG -d $SCRATCH_MNT -n 200 -p 8 $FSSTRESS_AVOID -x \
+	"$snapshot_cmd" -X 50 > /dev/null
+
+# now load the error into the DMERROR_DEV
+_load_dmerror_table
+
+_run_btrfs_util_prog device delete $error_devid $SCRATCH_MNT
+
+_run_btrfs_util_prog filesystem show -m $SCRATCH_MNT
+$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT | _filter_btrfs_filesystem_show
+
+echo === device delete completed
+
+status=0; exit
diff --git a/tests/btrfs/099.out b/tests/btrfs/099.out
new file mode 100644
index 000..ec74e45
--- /dev/null
+++ b/tests/btrfs/099.out
@@ -0,0 +1,11 @@
+QA output created by 099
+Label: none  uuid:  UUID
+   Total devices NUM FS bytes used SIZE
+   devid DEVID size SIZE used SIZE path SCRATCH_DEV
+   devid DEVID size SIZE used SIZE path /dev/mapper/error-test
+
+Label: none  uuid:  UUID
+   Total devices NUM FS bytes used SIZE
+   devid DEVID size SIZE used SIZE path SCRATCH_DEV
+
+=== device delete completed
diff --git a/tests/btrfs/group b/tests/btrfs/group
index c8a53b5..968ee63 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -101,3 +101,4 @@
 096 auto quick clone
 097 auto quick send clone
 098 auto quick replace
+099 auto quick replace
-- 
2.4.1


[PATCH v5 1/3] xfstests: btrfs: add functions to create dm-error device

2015-08-14 Thread Anand Jain

From: Anand Jain anand.j...@oracle.com

Controlled EIO from the device is achieved using the dm device.
Helper functions are at common/dmerror.

Broadly, the steps start with calling _init_dmerror().
_init_dmerror() will use SCRATCH_DEV to create a dm linear device and assign
DMERROR_DEV to /dev/mapper/error-test.

When the test script is ready to get EIO, the test case can call
_load_dmerror_table(), which loads the dm error table
so that reading DMERROR_DEV will cause EIO. After the test case is
complete, cleanup must be done by calling _cleanup_dmerror().

Signed-off-by: Anand Jain anand.j...@oracle.com
Reviewed-by: Filipe Manana fdman...@suse.com
---
v4-v5: No Change. keep up with the patch set
v3-v4: rebase on latest xfstests code
v2.1-v3: accepts Filipe Manana's review comments, thanks
v2-v2.1: fixed missed typo error fixup in the commit.
v1-v2: accepts Dave Chinner's review comments, thanks
 common/dmerror | 69 ++
 common/rc  |  9 
 2 files changed, 78 insertions(+)
 create mode 100644 common/dmerror

diff --git a/common/dmerror b/common/dmerror
new file mode 100644
index 000..f895d90
--- /dev/null
+++ b/common/dmerror
@@ -0,0 +1,69 @@
+##/bin/bash
+#
+# Copyright (c) 2015 Oracle.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#
+# common functions for setting up and tearing down a dmerror device
+
+_init_dmerror()
+{
+	$DMSETUP_PROG remove error-test > /dev/null 2>&1
+
+	local BLK_DEV_SIZE=`blockdev --getsz $SCRATCH_DEV`
+
+	DMERROR_DEV='/dev/mapper/error-test'
+
+	DMLINEAR_TABLE="0 $BLK_DEV_SIZE linear $SCRATCH_DEV 0"
+
+	$DMSETUP_PROG create error-test --table "$DMLINEAR_TABLE" || \
+		_fatal "failed to create dm linear device"
+
+	DMERROR_TABLE="0 $BLK_DEV_SIZE error $SCRATCH_DEV 0"
+}
+
+_scratch_mkfs_dmerror()
+{
+	$MKFS_BTRFS_PROG $* $DMERROR_DEV >> $seqres.full 2>&1 || \
+		_fatal "failed to create mkfs.btrfs $* $DMERROR_DEV"
+}
+
+_mount_dmerror()
+{
+	mount -t $FSTYP $MOUNT_OPTIONS $DMERROR_DEV $SCRATCH_MNT
+}
+
+_unmount_dmerror()
+{
+	$UMOUNT_PROG $SCRATCH_MNT
+}
+
+_cleanup_dmerror()
+{
+	$UMOUNT_PROG $SCRATCH_MNT > /dev/null 2>&1
+	$DMSETUP_PROG remove error-test > /dev/null 2>&1
+}
+
+_load_dmerror_table()
+{
+	$DMSETUP_PROG suspend error-test
+	[ $? -ne 0 ] && _fatal "failed to suspend error-test"
+
+	$DMSETUP_PROG load error-test --table "$DMERROR_TABLE"
+	[ $? -ne 0 ] && _fatal "failed to load error table error-test"
+
+	$DMSETUP_PROG resume error-test
+	[ $? -ne 0 ] && _fatal "failed to resume error-test"
+}
diff --git a/common/rc b/common/rc
index 70d2fa8..8d4da0e 100644
--- a/common/rc
+++ b/common/rc
@@ -1337,6 +1337,15 @@ _require_sane_bdev_flush()
fi
 }
 
+# this test requires the device mapper error target
+#
+_require_dmerror()
+{
+	_require_command "$DMSETUP_PROG" dmsetup
+	$DMSETUP_PROG targets | grep error > /dev/null 2>&1
+	[ $? -ne 0 ] && _notrun "This test requires dm error support"
+}
+
 # this test requires the device mapper flakey target
 #
 _require_dm_flakey()
-- 
2.4.1


[PATCH v5 0/3] dm error based test cases

2015-08-14 Thread Anand Jain

This is v5 of this patch set. It mainly incorporates Filipe's latest review comments.

Anand Jain (3):
  xfstests: btrfs: add functions to create dm-error device
  xfstests: btrfs: test device replace, with EIO on the src dev
  xfstests: btrfs: test device delete with EIO on src dev

 common/dmerror  | 69 
 common/rc   | 16 +++
 tests/btrfs/098 | 81 
 tests/btrfs/098.out | 11 +++
 tests/btrfs/099 | 82 +
 tests/btrfs/099.out | 11 +++
 tests/btrfs/group   |  2 ++
 7 files changed, 272 insertions(+)
 create mode 100644 common/dmerror
 create mode 100755 tests/btrfs/098
 create mode 100644 tests/btrfs/098.out
 create mode 100755 tests/btrfs/099
 create mode 100644 tests/btrfs/099.out

-- 
2.4.1


[PATCH v5 2/3] xfstests: btrfs: test device replace, with EIO on the src dev

2015-08-14 Thread Anand Jain

From: Anand Jain anand.j...@oracle.com

This test case confirms that replace works when the source
device being replaced has failed (EIO). The EIO condition is
achieved using the DM device.

Signed-off-by: Anand Jain anand.j...@oracle.com
Reviewed-by: Filipe Manana fdman...@suse.com
---
v4-v5: rebase on latest xfstests code and accepts Filipe comment
v3-v4: rebase on latest xfstests code
v2-v3: accepts Filipe Manana's review comments, thanks
v1-v2: accepts Dave Chinner's review comments, thanks
 tests/btrfs/098 | 81 +
 tests/btrfs/098.out | 11 
 tests/btrfs/group   |  1 +
 3 files changed, 93 insertions(+)
 create mode 100755 tests/btrfs/098
 create mode 100644 tests/btrfs/098.out

diff --git a/tests/btrfs/098 b/tests/btrfs/098
new file mode 100755
index 000..afb41d1
--- /dev/null
+++ b/tests/btrfs/098
@@ -0,0 +1,81 @@
+#! /bin/bash
+# FS QA Test No. btrfs/098
+#
+#test device replace works when the source device has EIO
+#
+# Copyright (c) 2015 Oracle.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo QA output created by $seq
+
+here=`pwd`
+tmp=/tmp/$$
+
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+
+_cleanup()
+{
+   _cleanup_dmerror
+   rm -f $tmp
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/filter.btrfs
+. ./common/dmerror
+
+_supported_fs btrfs
+_supported_os Linux
+_need_to_be_root
+_require_scratch_dev_pool 3
+_require_dmerror
+
+rm -f $seqres.full
+
+dev1=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $2}'`
+dev2=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $3}'`
+
+_init_dmerror
+_scratch_mkfs_dmerror -f -d raid1 -m raid1 $dev1
+_mount_dmerror
+
+_run_btrfs_util_prog filesystem show -m $SCRATCH_MNT
+$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT | _filter_btrfs_filesystem_show
+
+error_devid=`$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT |\
+	egrep $DMERROR_DEV | $AWK_PROG '{print $2}'`
+
+snapshot_cmd="$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT"
+snapshot_cmd="$snapshot_cmd $SCRATCH_MNT/snap_\`date +'%H_%M_%S_%N'\`"
+run_check $FSSTRESS_PROG -d $SCRATCH_MNT -n 200 -p 8 $FSSTRESS_AVOID -x \
+	"$snapshot_cmd" -X 50 > /dev/null
+
+# now load the error into the DMERROR_DEV
+_load_dmerror_table
+
+_run_btrfs_util_prog replace start -B $error_devid $dev2 $SCRATCH_MNT
+
+_run_btrfs_util_prog filesystem show -m $SCRATCH_MNT
+$BTRFS_UTIL_PROG filesystem show -m $SCRATCH_MNT | _filter_btrfs_filesystem_show
+
+echo === device replace completed
+
+status=0; exit
diff --git a/tests/btrfs/098.out b/tests/btrfs/098.out
new file mode 100644
index 0000000..eb2f87f
--- /dev/null
+++ b/tests/btrfs/098.out
@@ -0,0 +1,11 @@
+QA output created by 098
+Label: none  uuid:  UUID
+   Total devices NUM FS bytes used SIZE
+   devid DEVID size SIZE used SIZE path SCRATCH_DEV
+   devid DEVID size SIZE used SIZE path /dev/mapper/error-test
+
+Label: none  uuid:  UUID
+   Total devices NUM FS bytes used SIZE
+   devid DEVID size SIZE used SIZE path SCRATCH_DEV
+
+=== device replace completed
diff --git a/tests/btrfs/group b/tests/btrfs/group
index e13865a..c8a53b5 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -100,3 +100,4 @@
 095 auto quick metadata
 096 auto quick clone
 097 auto quick send clone
+098 auto quick replace
-- 
2.4.1


can we make balance delete missing devices?

2015-08-14 Thread Russell Coker

[ 2918.502237] BTRFS info (device loop1): disk space caching is enabled
[ 2918.503213] BTRFS: failed to read chunk tree on loop1
[ 2918.540082] BTRFS: open_ctree failed

I just had a test RAID-1 filesystem with a missing device.  I mounted it with 
the degraded option and added a new device.  I balanced it (to make it do 
RAID-1 again) and thought everything was good.  Then when I tried to mount it 
again it gave errors such as the above (not sure why).  Then I tried wiping 
/dev/loop1 and it refused to mount entirely due to having 2 missing devices.

Obviously it was my mistake to not remove the missing device, and wiping 
/dev/loop1 was a bad idea.  Failing to remove a missing device seems likely to 
be a common mistake.  Could we make the balance operation automatically delete 
the missing device?  I can't imagine a situation in which a balance would be 
desired but deleting the missing device wouldn't be desired.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/

[PATCH 07/23] Btrfs: __btrfs_std_error() logic should be consistent w/out CONFIG_PRINTK defined

2015-08-14 Thread Anand Jain

The error handling logic behaves differently depending on whether
CONFIG_PRINTK is defined, since there are two copies of the same
function with slightly different logic.

One, when CONFIG_PRINTK is defined, the code is

__btrfs_std_error(..)
{
::
   save_error_info(fs_info);
   if (sb->s_flags & MS_BORN)
           btrfs_handle_error(fs_info);
}

and two, when CONFIG_PRINTK is not defined, the code is

__btrfs_std_error(..)
{
::
   if (sb->s_flags & MS_BORN) {
           save_error_info(fs_info);
           btrfs_handle_error(fs_info);
   }
}

I doubt this was intentional. It appears to have happened because
we maintain two copies of the same function and they diverged
over successive commits.

To decide which logic is correct, I reviewed the changes below:

 533574c6bc30cf526cc1c41bde050c854a945efb
This commit added the two copies of this function.

 cf79ffb5b79e8a2b587fbf218809e691bb396c98
This commit changed only one copy of the function, the
copy used when CONFIG_PRINTK is defined.

To fix this, instead of maintaining two copies of the same
function, maintain a single function and put only the extra
portion of the code under #ifdef CONFIG_PRINTK.

This patch does just that, and keeps the logic of the copy built
when CONFIG_PRINTK is defined.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/super.c | 27 +--
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index c389c13..56c0174 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -130,7 +130,6 @@ static void btrfs_handle_error(struct btrfs_fs_info *fs_info)
}
 }
 
-#ifdef CONFIG_PRINTK
 /*
  * __btrfs_std_error decodes expected errors from the caller and
  * invokes the approciate error response.
@@ -140,7 +139,9 @@ void __btrfs_std_error(struct btrfs_fs_info *fs_info, const char *function,
   unsigned int line, int errno, const char *fmt, ...)
 {
struct super_block *sb = fs_info->sb;
+#ifdef CONFIG_PRINTK
const char *errstr;
+#endif
 
/*
 * Special case: if the error is EROFS, and we're already
@@ -149,6 +150,7 @@ void __btrfs_std_error(struct btrfs_fs_info *fs_info, const char *function,
if (errno == -EROFS && (sb->s_flags & MS_RDONLY))
return;
 
+#ifdef CONFIG_PRINTK
errstr = btrfs_decode_error(errno);
if (fmt) {
struct va_format vaf;
@@ -166,6 +168,7 @@ void __btrfs_std_error(struct btrfs_fs_info *fs_info, const char *function,
printk(KERN_CRIT "BTRFS: error (device %s) in %s:%d: errno=%d %s\n",
sb->s_id, function, line, errno, errstr);
}
+#endif
 
/* Don't go through full error handling during mount */
save_error_info(fs_info);
@@ -173,6 +176,7 @@ void __btrfs_std_error(struct btrfs_fs_info *fs_info, const char *function,
btrfs_handle_error(fs_info);
 }
 
+#ifdef CONFIG_PRINTK
 static const char * const logtypes[] = {
"emergency",
"alert",
@@ -212,27 +216,6 @@ void btrfs_printk(const struct btrfs_fs_info *fs_info, const char *fmt, ...)
 
va_end(args);
 }
-
-#else
-
-void __btrfs_std_error(struct btrfs_fs_info *fs_info, const char *function,
-  unsigned int line, int errno, const char *fmt, ...)
-{
-   struct super_block *sb = fs_info->sb;
-
-   /*
-* Special case: if the error is EROFS, and we're already
-* under MS_RDONLY, then it is safe here.
-*/
-   if (errno == -EROFS && (sb->s_flags & MS_RDONLY))
-   return;
-
-   /* Don't go through full error handling during mount */
-   if (sb->s_flags & MS_BORN) {
-   save_error_info(fs_info);
-   btrfs_handle_error(fs_info);
-   }
-}
 #endif
 
 /*
-- 
2.4.1
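
Note: the divergence described in the commit message above can be modelled in user space. What follows is a minimal sketch, not kernel code. `struct model_fs`, `std_error_printk()` and `std_error_noprintk()` are hypothetical names, the two booleans stand in for save_error_info() and btrfs_handle_error(), and `MODEL_MS_BORN` assumes the (1 << 29) value that MS_BORN has in kernels of this era.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: same bit as MS_BORN in include/linux/fs.h of that era. */
#define MODEL_MS_BORN (1UL << 29)

/* Hypothetical stand-in for the state touched by __btrfs_std_error(). */
struct model_fs {
	unsigned long s_flags;
	bool error_saved;    /* models save_error_info() */
	bool error_handled;  /* models btrfs_handle_error() */
};

/* Copy built with CONFIG_PRINTK: the error is always recorded. */
void std_error_printk(struct model_fs *fs)
{
	assert(fs != NULL);
	fs->error_saved = true;
	if (fs->s_flags & MODEL_MS_BORN)
		fs->error_handled = true;
}

/* Copy built without CONFIG_PRINTK: nothing happens until MS_BORN is set. */
void std_error_noprintk(struct model_fs *fs)
{
	assert(fs != NULL);
	if (fs->s_flags & MODEL_MS_BORN) {
		fs->error_saved = true;
		fs->error_handled = true;
	}
}
```

While MS_BORN is not yet set (that is, during mount) only the first variant records the error state; once the flag is set the two behave the same. The patch keeps the first behaviour for both configurations.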


[PATCH 13/23] Btrfs: add btrfs_read_dev_one_super() to read one specific SB

2015-08-14 Thread Anand Jain

This moves a chunk of code from btrfs_read_dev_super() into a new
function, btrfs_read_dev_one_super(), so that the next patch
can use it to scratch the superblock.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/disk-io.c | 54 ++
 fs/btrfs/disk-io.h |  2 ++
 2 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index faf5b8d..2f2379d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3183,6 +3183,37 @@ static void btrfs_end_buffer_write_sync(struct buffer_head *bh, int uptodate)
put_bh(bh);
 }
 
+int btrfs_read_dev_one_super(struct block_device *bdev, int copy_num,
+   struct buffer_head **bh)
+{
+   struct buffer_head *bufhead;
+   struct btrfs_super_block *super;
+   u64 bytenr;
+
+   bytenr = btrfs_sb_offset(copy_num);
+   if (bytenr + BTRFS_SUPER_INFO_SIZE >= i_size_read(bdev->bd_inode))
+   return -EINVAL;
+
+   bufhead = __bread(bdev, bytenr / 4096, BTRFS_SUPER_INFO_SIZE);
+   /*
+* If we fail to read from the underlaying drivers, as of now
+* the best option we have is to mark it EIO.
+*/
+   if (!bufhead)
+   return -EIO;
+
+   super = (struct btrfs_super_block *)bufhead->b_data;
+   if (btrfs_super_bytenr(super) != bytenr ||
+   btrfs_super_magic(super) != BTRFS_MAGIC) {
+   brelse(bufhead);
+   return -EINVAL;
+   }
+
+   *bh = bufhead;
+   return 0;
+}
+
+
 struct buffer_head *btrfs_read_dev_super(struct block_device *bdev)
 {
struct buffer_head *bh;
@@ -3190,7 +3221,6 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev)
struct btrfs_super_block *super;
int i;
u64 transid = 0;
-   u64 bytenr;
int ret = -EINVAL;
 
/* we would like to check all the supers, but that would make
@@ -3199,28 +3229,12 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev)
 * later supers, using BTRFS_SUPER_MIRROR_MAX instead
 */
for (i = 0; i < 1; i++) {
-   bytenr = btrfs_sb_offset(i);
-   if (bytenr + BTRFS_SUPER_INFO_SIZE >=
-   i_size_read(bdev->bd_inode))
-   break;
-   bh = __bread(bdev, bytenr / 4096,
-   BTRFS_SUPER_INFO_SIZE);
-   /*
-* If we fail to read from the underlaying drivers, as of now
-* the best option we have is to mark it EIO.
-*/
-   if (!bh) {
-   ret = -EIO;
+
+   ret = btrfs_read_dev_one_super(bdev, i, &bh);
+   if (ret)
continue;
-   }
 
super = (struct btrfs_super_block *)bh->b_data;
-   if (btrfs_super_bytenr(super) != bytenr ||
-   btrfs_super_magic(super) != BTRFS_MAGIC) {
-   brelse(bh);
-   ret = -EINVAL;
-   continue;
-   }
 
if (!latest || btrfs_super_generation(super) > transid) {
brelse(latest);
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index d4cbfee..8dc9ff1 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -60,6 +60,8 @@ void close_ctree(struct btrfs_root *root);
 int write_ctree_super(struct btrfs_trans_handle *trans,
  struct btrfs_root *root, int max_mirrors);
 struct buffer_head *btrfs_read_dev_super(struct block_device *bdev);
+int btrfs_read_dev_one_super(struct block_device *bdev, int copy_num,
+   struct buffer_head **bh);
 int btrfs_commit_super(struct btrfs_root *root);
 struct extent_buffer *btrfs_find_tree_block(struct btrfs_fs_info *fs_info,
u64 bytenr);
-- 
2.4.1
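
Note: the offset calculation and bounds check that btrfs_read_dev_one_super() performs for each copy can be sketched in user space. The constants below are believed to match the btrfs headers (primary copy at 64KiB, mirrors at 16KiB << (12 * copy_num), at most three copies), but treat them as assumptions. `sb_offset()` and `sb_copy_fits()` are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed values, mirroring the btrfs on-disk layout constants. */
#define SB_INFO_OFFSET  (64ULL * 1024)  /* primary superblock at 64KiB */
#define SB_INFO_SIZE    4096ULL         /* size of one superblock copy */
#define SB_MIRROR_MAX   3               /* number of superblock copies */
#define SB_MIRROR_SHIFT 12

/* Byte offset of superblock copy copy_num (0 is the primary copy). */
uint64_t sb_offset(int copy_num)
{
	uint64_t start = 16ULL * 1024;

	assert(copy_num >= 0 && copy_num < SB_MIRROR_MAX);
	if (copy_num)
		return start << (SB_MIRROR_SHIFT * copy_num);
	return SB_INFO_OFFSET;
}

/*
 * Same bounds check as in the patch: 0 when the copy lies inside a
 * device of dev_size bytes, -EINVAL otherwise.
 */
int sb_copy_fits(uint64_t dev_size, int copy_num)
{
	if (sb_offset(copy_num) + SB_INFO_SIZE >= dev_size)
		return -EINVAL;
	return 0;
}
```

With these values the three copies land at 64KiB, 64MiB and 256GiB, so on a 1GiB device only copies 0 and 1 pass the bounds check.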


[PATCH 06/23] Btrfs: SB read failure should return EIO for __bread failure

2015-08-14 Thread Anand Jain

Return -EIO, instead of -EINVAL, when __bread() fails to read
the superblock.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/disk-io.c | 18 +++---
 fs/btrfs/volumes.c |  8 
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8571025..faf5b8d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2648,8 +2648,8 @@ int open_ctree(struct super_block *sb,
 * Read super block and check the signature bytes only
 */
bh = btrfs_read_dev_super(fs_devices->latest_bdev);
-   if (!bh) {
-   err = -EINVAL;
+   if (IS_ERR(bh)) {
+   err = PTR_ERR(bh);
goto fail_alloc;
}
 
@@ -3191,6 +3191,7 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev)
int i;
u64 transid = 0;
u64 bytenr;
+   int ret = -EINVAL;
 
/* we would like to check all the supers, but that would make
 * a btrfs mount succeed after a mkfs from a different FS.
@@ -3204,13 +3205,20 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev)
break;
bh = __bread(bdev, bytenr / 4096,
BTRFS_SUPER_INFO_SIZE);
-   if (!bh)
+   /*
+* If we fail to read from the underlaying drivers, as of now
+* the best option we have is to mark it EIO.
+*/
+   if (!bh) {
+   ret = -EIO;
continue;
+   }
 
super = (struct btrfs_super_block *)bh->b_data;
if (btrfs_super_bytenr(super) != bytenr ||
btrfs_super_magic(super) != BTRFS_MAGIC) {
brelse(bh);
+   ret = -EINVAL;
continue;
}
 
@@ -3222,6 +3230,10 @@ struct buffer_head *btrfs_read_dev_super(struct block_device *bdev)
brelse(bh);
}
}
+
+   if (!latest)
+   return ERR_PTR(ret);
+
return latest;
 }
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 8368393..a3fde18 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -211,8 +211,8 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
}
invalidate_bdev(*bdev);
*bh = btrfs_read_dev_super(*bdev);
-   if (!*bh) {
-   ret = -EINVAL;
+   if (IS_ERR(*bh)) {
+   ret = PTR_ERR(*bh);
blkdev_put(*bdev, flags);
goto error;
}
@@ -6842,8 +6842,8 @@ int btrfs_scratch_superblock(struct btrfs_device *device)
struct btrfs_super_block *disk_super;
 
bh = btrfs_read_dev_super(device->bdev);
-   if (!bh)
-   return -EINVAL;
+   if (IS_ERR(bh))
+   return PTR_ERR(bh);
disk_super = (struct btrfs_super_block *)bh->b_data;
 
memset(&disk_super->magic, 0, sizeof(disk_super->magic));
-- 
2.4.1
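
Note: this patch switches btrfs_read_dev_super() from returning NULL on failure to returning an error pointer, which is why every caller moves from `!bh` to IS_ERR()/PTR_ERR(). The following is a user-space sketch of that convention under stated assumptions: the `model_*` helpers imitate the kernel's ERR_PTR(), PTR_ERR() and IS_ERR() and are not the kernel implementations, and `model_read_super()` is a hypothetical reader.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: error codes occupy the top 4095 values of the address space. */
#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the errno value carried by an error pointer. */
long model_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for pointers that carry an errno; NULL is not an error pointer. */
int model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/*
 * Hypothetical reader: a failed read reports -EIO, a bad signature
 * reports -EINVAL, and success returns the buffer itself.
 */
void *model_read_super(int io_ok, int magic_ok, void *buf)
{
	if (!io_ok)
		return model_err_ptr(-EIO);
	if (!magic_ok)
		return model_err_ptr(-EINVAL);
	assert(buf != NULL);
	return buf;
}
```

An error pointer is never NULL, so a caller that still tests `!bh` would treat a failure as a valid buffer. That is the reason all three callers are converted in the same patch.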


[PATCH 03/23] Btrfs: rename btrfs_kobj_add_device to btrfs_sysfs_add_device_link

2015-08-14 Thread Anand Jain

---
 fs/btrfs/dev-replace.c | 2 +-
 fs/btrfs/sysfs.c   | 4 ++--
 fs/btrfs/sysfs.h   | 2 +-
 fs/btrfs/volumes.c | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 564a7de..c1bf0d6 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -376,7 +376,7 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
WARN_ON(!tgt_device);
dev_replace-tgtdev = tgt_device;
 
-   ret = btrfs_kobj_add_device(tgt_device->fs_devices, tgt_device);
+   ret = btrfs_sysfs_add_device_link(tgt_device->fs_devices, tgt_device);
if (ret)
btrfs_err(root->fs_info, "kobj add dev failed %d\n", ret);
 
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 095a302..df67f6b 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -683,7 +683,7 @@ int btrfs_sysfs_add_device(struct btrfs_fs_devices *fs_devs)
return 0;
 }
 
-int btrfs_kobj_add_device(struct btrfs_fs_devices *fs_devices,
+int btrfs_sysfs_add_device_link(struct btrfs_fs_devices *fs_devices,
struct btrfs_device *one_device)
 {
int error = 0;
@@ -744,7 +744,7 @@ int btrfs_sysfs_add_mounted(struct btrfs_fs_info *fs_info)
 
btrfs_set_fs_info_ptr(fs_info);
 
-   error = btrfs_kobj_add_device(fs_devs, NULL);
+   error = btrfs_sysfs_add_device_link(fs_devs, NULL);
if (error)
return error;
 
diff --git a/fs/btrfs/sysfs.h b/fs/btrfs/sysfs.h
index 6392527..6529680 100644
--- a/fs/btrfs/sysfs.h
+++ b/fs/btrfs/sysfs.h
@@ -82,7 +82,7 @@ char *btrfs_printable_features(enum btrfs_feature_set set, u64 flags);
 extern const char * const btrfs_feature_set_names[3];
 extern struct kobj_type space_info_ktype;
 extern struct kobj_type btrfs_raid_ktype;
-int btrfs_kobj_add_device(struct btrfs_fs_devices *fs_devices,
+int btrfs_sysfs_add_device_link(struct btrfs_fs_devices *fs_devices,
struct btrfs_device *one_device);
 int btrfs_kobj_rm_device(struct btrfs_fs_devices *fs_devices,
 struct btrfs_device *one_device);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 7c84a81..18ea1eb 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2309,7 +2309,7 @@ int btrfs_init_new_device(struct btrfs_root *root, char *device_path)
tmp + 1);
 
/* add sysfs device entry */
-   btrfs_kobj_add_device(root->fs_info->fs_devices, device);
+   btrfs_sysfs_add_device_link(root->fs_info->fs_devices, device);
 
/*
 * we've got more storage, clear any full flags on the space
-- 
2.4.1


[PATCH 11/23] Btrfs: use BTRFS_ERROR_DEV_MISSING_NOT_FOUND when missing device is not found

2015-08-14 Thread Anand Jain

Use the btrfs-specific error code BTRFS_ERROR_DEV_MISSING_NOT_FOUND instead
of -ENOENT.
This also removes the logging done when the user specifies "missing" and we
don't find it in the kernel device list. Logging is for system events, not
for user input errors.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/volumes.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 101a473..f1b36b9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2078,10 +2078,8 @@ int btrfs_find_device_missing_or_by_path(struct btrfs_root *root,
}
}
 
-   if (!*device) {
-   btrfs_err(root->fs_info, "no missing device found");
-   return -ENOENT;
-   }
+   if (!*device)
+   return BTRFS_ERROR_DEV_MISSING_NOT_FOUND;
 
return 0;
} else {
-- 
2.4.1
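
Note: the change replaces a negative errno with a btrfs-specific code. A rough user-space sketch of how a caller might tell the two apart follows. It assumes that btrfs-specific codes are positive while errno values are negative. Everything named `model_*` is hypothetical, and the numeric value given to `MODEL_ERR_DEV_MISSING_NOT_FOUND` is a placeholder, not the value from the btrfs headers.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder enum: only the sign matters for this illustration. */
enum model_err_code {
	MODEL_ERR_NONE = 0,
	MODEL_ERR_DEV_MISSING_NOT_FOUND = 1, /* placeholder value */
};

enum model_err_kind { MODEL_OK, MODEL_ERRNO, MODEL_FS_SPECIFIC };

/* Classify a return value by its sign. */
enum model_err_kind model_classify(int ret)
{
	if (ret < 0)
		return MODEL_ERRNO;        /* generic errno such as -ENOENT */
	if (ret > 0)
		return MODEL_FS_SPECIFIC;  /* filesystem-specific code */
	return MODEL_OK;
}

/* Models the lookup of the "missing" device after this patch. */
int model_find_missing(int have_missing_device)
{
	/* Positive codes must not collide with negative errno values. */
	assert(MODEL_ERR_DEV_MISSING_NOT_FOUND > 0);
	return have_missing_device ? 0 : MODEL_ERR_DEV_MISSING_NOT_FOUND;
}
```

Under that assumption the caller can print a precise message for the user without the kernel writing anything to the system log.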


[PATCH 16/23] Btrfs: device path change must be logged

2015-08-14 Thread Anand Jain

To make issues easier to diagnose, log when the device path
changes.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/volumes.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index ebf37a9..dcb10fa 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -595,6 +595,10 @@ static noinline int device_list_add(const char *path,
return -EEXIST;
}
 
+   printk_in_rcu(KERN_INFO \
+   "BTRFS: device fsid %pU devid %llu old path %s new path %s\n",
+   disk_super->fsid, devid, rcu_str_deref(device->name), path);
+
name = rcu_string_strdup(path, GFP_NOFS);
if (!name)
return -ENOMEM;
-- 
2.4.1


[PATCH 20/23] Btrfs: move device close to btrfs_close_one_device

2015-08-14 Thread Anand Jain

This will help to add the proposed device offline RFE.
---
 fs/btrfs/volumes.c | 66 +-
 fs/btrfs/volumes.h |  1 +
 2 files changed, 37 insertions(+), 30 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index f3ca87d..00ca858 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -768,36 +768,7 @@ static int __btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
 
mutex_lock(&fs_devices->device_list_mutex);
list_for_each_entry_safe(device, tmp, &fs_devices->devices, dev_list) {
-   struct btrfs_device *new_device;
-   struct rcu_string *name;
-
-   if (device->bdev)
-   fs_devices->open_devices--;
-
-   if (device->writeable &&
-   device->devid != BTRFS_DEV_REPLACE_DEVID) {
-   list_del_init(&device->dev_alloc_list);
-   fs_devices->rw_devices--;
-   }
-
-   if (device->missing)
-   fs_devices->missing_devices--;
-
-   new_device = btrfs_alloc_device(NULL, &device->devid,
-   device->uuid);
-   BUG_ON(IS_ERR(new_device)); /* -ENOMEM */
-
-   /* Safe because we are under uuid_mutex */
-   if (device->name) {
-   name = rcu_string_strdup(device->name->str, GFP_NOFS);
-   BUG_ON(!name); /* -ENOMEM */
-   rcu_assign_pointer(new_device->name, name);
-   }
-
-   list_replace_rcu(&device->dev_list, &new_device->dev_list);
-   new_device->fs_devices = device->fs_devices;
-
-   call_rcu(&device->rcu, free_device);
+   btrfs_close_one_device(device);
}
mutex_unlock(&fs_devices->device_list_mutex);
 
@@ -6890,3 +6861,38 @@ void btrfs_reset_fs_info_ptr(struct btrfs_fs_info *fs_info)
fs_devices = fs_devices->seed;
}
 }
+
+void btrfs_close_one_device(struct btrfs_device *device)
+{
+   struct btrfs_fs_devices *fs_devices = device->fs_devices;
+   struct btrfs_device *new_device;
+   struct rcu_string *name;
+
+   if (device->bdev)
+   fs_devices->open_devices--;
+
+   if (device->writeable &&
+   device->devid != BTRFS_DEV_REPLACE_DEVID) {
+   list_del_init(&device->dev_alloc_list);
+   fs_devices->rw_devices--;
+   }
+
+   if (device->missing)
+   fs_devices->missing_devices--;
+
+   new_device = btrfs_alloc_device(NULL, &device->devid,
+   device->uuid);
+   BUG_ON(IS_ERR(new_device)); /* -ENOMEM */
+
+   /* Safe because we are under uuid_mutex */
+   if (device->name) {
+   name = rcu_string_strdup(device->name->str, GFP_NOFS);
+   BUG_ON(!name); /* -ENOMEM */
+   rcu_assign_pointer(new_device->name, name);
+   }
+
+   list_replace_rcu(&device->dev_list, &new_device->dev_list);
+   new_device->fs_devices = device->fs_devices;
+
+   call_rcu(&device->rcu, free_device);
+}
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 32a66c7..5f4911a 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -550,5 +550,6 @@ static inline void unlock_chunks(struct btrfs_root *root)
 struct list_head *btrfs_get_fs_uuids(void);
 void btrfs_set_fs_info_ptr(struct btrfs_fs_info *fs_info);
 void btrfs_reset_fs_info_ptr(struct btrfs_fs_info *fs_info);
+void btrfs_close_one_device(struct btrfs_device *device);
 
 #endif
-- 
2.4.1


[PATCH 04/23] Btrfs: rename btrfs_kobj_rm_device to btrfs_sysfs_rm_device_link

2015-08-14 Thread Anand Jain

---
 fs/btrfs/dev-replace.c | 2 +-
 fs/btrfs/sysfs.c   | 6 +++---
 fs/btrfs/sysfs.h   | 2 +-
 fs/btrfs/volumes.c | 6 +++---
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index c1bf0d6..6eb9324 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -587,7 +587,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
mutex_unlock(&uuid_mutex);
 
/* replace the sysfs entry */
-   btrfs_kobj_rm_device(fs_info->fs_devices, src_device);
+   btrfs_sysfs_rm_device_link(fs_info->fs_devices, src_device);
btrfs_rm_dev_replace_free_srcdev(fs_info, src_device);
 
/* write back the superblocks */
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index df67f6b..52319d1 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -557,7 +557,7 @@ void btrfs_sysfs_remove_mounted(struct btrfs_fs_info *fs_info)
addrm_unknown_feature_attrs(fs_info, false);
sysfs_remove_group(&fs_info->fs_devices->super_kobj, &btrfs_feature_attr_group);
sysfs_remove_files(&fs_info->fs_devices->super_kobj, btrfs_attrs);
-   btrfs_kobj_rm_device(fs_info->fs_devices, NULL);
+   btrfs_sysfs_rm_device_link(fs_info->fs_devices, NULL);
 }
 
 const char * const btrfs_feature_set_names[3] = {
@@ -637,7 +637,7 @@ static void init_feature_attrs(void)
 
 /* when one_device is NULL, it removes all device links */
 
-int btrfs_kobj_rm_device(struct btrfs_fs_devices *fs_devices,
+int btrfs_sysfs_rm_device_link(struct btrfs_fs_devices *fs_devices,
struct btrfs_device *one_device)
 {
struct hd_struct *disk;
@@ -750,7 +750,7 @@ int btrfs_sysfs_add_mounted(struct btrfs_fs_info *fs_info)
 
error = sysfs_create_files(super_kobj, btrfs_attrs);
if (error) {
-   btrfs_kobj_rm_device(fs_devs, NULL);
+   btrfs_sysfs_rm_device_link(fs_devs, NULL);
return error;
}
 
diff --git a/fs/btrfs/sysfs.h b/fs/btrfs/sysfs.h
index 6529680..9c09522 100644
--- a/fs/btrfs/sysfs.h
+++ b/fs/btrfs/sysfs.h
@@ -84,7 +84,7 @@ extern struct kobj_type space_info_ktype;
 extern struct kobj_type btrfs_raid_ktype;
 int btrfs_sysfs_add_device_link(struct btrfs_fs_devices *fs_devices,
struct btrfs_device *one_device);
-int btrfs_kobj_rm_device(struct btrfs_fs_devices *fs_devices,
+int btrfs_sysfs_rm_device_link(struct btrfs_fs_devices *fs_devices,
 struct btrfs_device *one_device);
 int btrfs_sysfs_add_fsid(struct btrfs_fs_devices *fs_devs,
struct kobject *parent);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 18ea1eb..4c7c344 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1801,7 +1801,7 @@ int btrfs_rm_device(struct btrfs_root *root, char *device_path)
if (device->bdev) {
device->fs_devices->open_devices--;
/* remove sysfs entry */
-   btrfs_kobj_rm_device(root->fs_info->fs_devices, device);
+   btrfs_sysfs_rm_device_link(root->fs_info->fs_devices, device);
}
 
call_rcu(&device->rcu, free_device);
@@ -1971,7 +1971,7 @@ void btrfs_destroy_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
WARN_ON(!tgtdev);
mutex_lock(&fs_info->fs_devices->device_list_mutex);
 
-   btrfs_kobj_rm_device(fs_info->fs_devices, tgtdev);
+   btrfs_sysfs_rm_device_link(fs_info->fs_devices, tgtdev);
 
if (tgtdev->bdev) {
btrfs_scratch_superblock(tgtdev);
@@ -2388,7 +2388,7 @@ int btrfs_init_new_device(struct btrfs_root *root, char *device_path)
 error_trans:
btrfs_end_transaction(trans, root);
rcu_string_free(device->name);
-   btrfs_kobj_rm_device(root->fs_info->fs_devices, device);
+   btrfs_sysfs_rm_device_link(root->fs_info->fs_devices, device);
kfree(device);
 error:
blkdev_put(bdev, FMODE_EXCL);
-- 
2.4.1


[PATCH 18/23] Btrfs: check device_path in btrfs_find_device_by_user_input()

2015-08-14 Thread Anand Jain

Move the check here so that btrfs_dev_replace_start() can be sleeker;
btrfs_rm_device() will also need this check.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/dev-replace.c | 4 
 fs/btrfs/volumes.c | 3 +++
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 937e53b..0df3d9b 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -321,10 +321,6 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
return -EINVAL;
}
 
-   if ((args->start.srcdevid == 0 && args->start.srcdev_name[0] == '\0') ||
-   args->start.tgtdev_name[0] == '\0')
-   return -EINVAL;
-
/* the disk copy procedure reuses the scrub code */
mutex_lock(&fs_info->volume_mutex);
ret = btrfs_find_device_by_user_input(root, args->start.srcdevid,
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index dcb10fa..5803c45 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2001,6 +2001,9 @@ int btrfs_find_device_by_user_input(struct btrfs_root *root, u64 srcdevid,
if (!*device)
ret = -ENOENT;
} else {
+   if (!srcdev_name || !srcdev_name[0])
+   return -EINVAL;
+
ret = btrfs_find_device_missing_or_by_path(root, srcdev_name,
   device);
}
-- 
2.4.1


[PATCH 14/23] Btrfs: fix btrfs_scratch_superblock() with fixes from device delete

2015-08-14 Thread Anand Jain

This patch updates btrfs_scratch_superblock() (which is used
by the replace device thread) with the fixes from the
scratch superblock code section of btrfs_rm_device(). The fixes are:
  Scratch all copies of the superblock
  Notify kobject that the superblock has been changed
  Update time on the device

so that btrfs_rm_device() can use btrfs_scratch_superblock()
instead of its own scratch code. Furthermore, the replace device code,
which similarly releases a device back to the system, will gain the
fixes from btrfs device delete.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/volumes.c | 40 
 fs/btrfs/volumes.h |  2 +-
 2 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1d35332..b2a19ea 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1915,7 +1915,8 @@ void btrfs_rm_dev_replace_remove_srcdev(struct btrfs_fs_info *fs_info,
if (srcdev->writeable) {
fs_devices->rw_devices--;
/* zero out the old super if it is writable */
-   btrfs_scratch_superblock(srcdev);
+   btrfs_scratch_superblock(srcdev->bdev,
+   rcu_str_deref(srcdev->name));
}
 
if (srcdev->bdev)
@@ -1965,7 +1966,8 @@ void btrfs_destroy_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
btrfs_sysfs_rm_device_link(fs_info->fs_devices, tgtdev);
 
if (tgtdev->bdev) {
-   btrfs_scratch_superblock(tgtdev);
+   btrfs_scratch_superblock(tgtdev->bdev,
+   rcu_str_deref(tgtdev->name));
fs_info->fs_devices->open_devices--;
}
fs_info->fs_devices->num_devices--;
@@ -6844,22 +6846,36 @@ int btrfs_get_dev_stats(struct btrfs_root *root,
return 0;
 }
 
-int btrfs_scratch_superblock(struct btrfs_device *device)
+void btrfs_scratch_superblock(struct block_device *bdev, char *device_path)
 {
struct buffer_head *bh;
struct btrfs_super_block *disk_super;
+   int copy_num;
 
-   bh = btrfs_read_dev_super(device->bdev);
-   if (IS_ERR(bh))
-   return PTR_ERR(bh);
-   disk_super = (struct btrfs_super_block *)bh->b_data;
+   if (!bdev)
+   return;
 
-   memset(&disk_super->magic, 0, sizeof(disk_super->magic));
-   set_buffer_dirty(bh);
-   sync_dirty_buffer(bh);
-   brelse(bh);
+   for (copy_num = 0; copy_num < BTRFS_SUPER_MIRROR_MAX;
+   copy_num++) {
 
-   return 0;
+   if (btrfs_read_dev_one_super(bdev, copy_num, &bh))
+   continue;
+
+   disk_super = (struct btrfs_super_block *)bh->b_data;
+
+   memset(&disk_super->magic, 0, sizeof(disk_super->magic));
+   set_buffer_dirty(bh);
+   sync_dirty_buffer(bh);
+   brelse(bh);
+   }
+
+   /* Notify udev that device has changed */
+   btrfs_kobject_uevent(bdev, KOBJ_CHANGE);
+
+   /* Update ctime/mtime for device path for libblkid */
+   update_dev_time(device_path);
+
+   return;
 }
 
 /*
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index a093b36..32a66c7 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -477,7 +477,7 @@ void btrfs_destroy_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
  struct btrfs_device *tgtdev);
 void btrfs_init_dev_replace_tgtdev_for_resume(struct btrfs_fs_info *fs_info,
  struct btrfs_device *tgtdev);
-int btrfs_scratch_superblock(struct btrfs_device *device);
+void btrfs_scratch_superblock(struct block_device *bdev, char *device_path);
 int btrfs_is_parity_mirror(struct btrfs_mapping_tree *map_tree,
   u64 logical, u64 len, int mirror_num);
 unsigned long btrfs_full_stripe_len(struct btrfs_root *root,
-- 
2.4.1
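
Note: the new loop in btrfs_scratch_superblock() can be pictured with a small user-space model. This is only a sketch under stated assumptions: `struct model_super` is a hypothetical, heavily reduced stand-in for struct btrfs_super_block, a NULL entry models a copy that btrfs_read_dev_one_super() could not read or validate, and `model_scratch_all()` is not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_SB_COPIES 3  /* assumed to equal BTRFS_SUPER_MIRROR_MAX */

/* Reduced stand-in for the on-disk superblock. */
struct model_super {
	uint64_t bytenr;
	uint8_t magic[8];
};

/*
 * Zero the magic of every copy that could be read and skip the rest,
 * as the patched loop does. Returns the number of copies scratched.
 */
int model_scratch_all(struct model_super *copies[MODEL_SB_COPIES])
{
	int scratched = 0;
	int i;

	assert(copies != NULL);
	for (i = 0; i < MODEL_SB_COPIES; i++) {
		if (!copies[i])
			continue;  /* unreadable copy: move on to the next one */
		memset(copies[i]->magic, 0, sizeof(copies[i]->magic));
		scratched++;
	}
	return scratched;
}
```

The old code cleared only the copy returned by btrfs_read_dev_super(), which could leave a valid signature in a mirror copy. Clearing each readable copy and continuing past failures avoids that.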


[PATCH 19/23] Btrfs: avoid user cli usage error logging into the sys log

2015-08-14 Thread Anand Jain

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 fs/btrfs/volumes.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 5803c45..f3ca87d 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -198,7 +198,6 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
 
if (IS_ERR(*bdev)) {
ret = PTR_ERR(*bdev);
-   printk(KERN_INFO "BTRFS: open %s failed\n", device_path);
goto error;
}
 
-- 
2.4.1

