Re: Help interpreting RAID1 space allocation

2013-08-23 Thread Chris Murphy

On Aug 23, 2013, at 10:24 PM, Chris Murphy  wrote:
> 
> When replacing a failed disk, I'd like btrfs to compare states between the 
> available drives and know that it needs to catch up the newly added device, 
> but this doesn't yet happen. It's necessary to call btrfs balance.

I can only test device replacement, not a readd. Upon 'btrfs device delete 
missing /mnt' there's a delay, and it's doing a balance. I don't know what 
happens for a readd, if the whole volume needs balancing or if it's able to 
just write the changes.

> And still further, somehow the data profile has reverted to single even 
> though the mkfs.btfs was raid1. [SNIP] That is a huge bug. I'll try to come 
> up with some steps to reproduce.

If I create the file system, mount it, but I do not copy any data, upon adding 
new and deleting missing, the data profile is changed from raid1 to single. If 
I've first copied data to the volume prior to device failure/missing, this 
doesn't happen, it remains raid1.

Also, again after the missing device is replaced, and the volume rebalanced, 
while the mount option is degraded, subsequent file copies end up on both 
virtual disks. So it says it's degraded but it really isn't? I'm not sure 
what's up here.


Chris Murphy--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Help interpreting RAID1 space allocation

2013-08-23 Thread Chris Murphy

On Aug 23, 2013, at 6:05 PM, Joel Johnson  wrote:

> What is the expectation on hot-adding a failed drive, is an explicit 'device 
> add' or 'replace' expected/required?

I'd expect to have to add a device and then remove missing. There isn't a readd 
option in btrfs, which in md parlance is used for readding a device previously 
part of an array. 

When replacing a failed disk, I'd like btrfs to compare states between the 
available drives and know that it needs to catch up the newly added device, but 
this doesn't yet happen. It's necessary to call btrfs balance.

However, after adding a device and deleting missing, I see what you see, the 
btrfs volume is still mounted degraded. 

Further, if I use:

# mount -o remount /dev/sdb /mnt
# mount | grep btrfs
/dev/sdc on /mnt type btrfs (rw,relatime,seclabel,degraded,space_cache)

So it's still degraded. I had to unmount and mount again to clear degraded. 
That seems to be a problem, to have to unmount the volume in order to remove 
the degraded flag, which is needed to begin the rebalance. And what if btrfs is 
the root file system? It needs to be rebooted to clear the degraded option.

And still further, somehow the data profile has reverted to single even though 
the mkfs.btfs was raid1. So even though the volume now has two devices, has 
been balanced, is not degraded, the data profile is single and presumably I no 
longer actually have mirrored data at all on this volume. That is a huge bug. 
I'll try to come up with some steps to reproduce.


Chris Murphy--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Help interpreting RAID1 space allocation

2013-08-23 Thread Joel Johnson
I've created a test volume and copied a bulk of data to it, however the 
results of the space allocation are confusing at best. I've tried to 
capture the history of events leading up to the current state. This is 
all on a Debian Wheezy system using a 3.10.5 kernel package 
(linux-image-3.10-2-amd64) and btrfs tools v0.20-rc1 (Debian package 
0.19+20130315-5). The host uses an Intel Atom 330 processor, and runs 
the 64-bit kernel with a 32-bit userland.


I initially created the volume as RAID1 data, then removed (hotplugged 
out from under the system) one of the drives while empty as a test. I 
then unmounted and remounted it with the degraded option and copied a 
small amount of data. Once verifying that the space was used, I 
hotplugged the second original drive, which was detected and added back 
to the volume (showing up in a filesystem show instead of missing). I 
then tried to copy over more data than a RAID1 should be expected to 
hold (~650GB onto 2 x 500GB disks in RAID1), got out of space reported 
as expected. I then deleted all data from the volume (did not recreate 
the filesystem), and copied just over 300GB of data onto the volume, 
which is the current state.


Only as I was typing this up did I notice that the mount options still 
show degraded from the original mount. I expected that once the second 
drive was readded, since it showed up as part of the volume 
automatically (I assume because the UUIDs matched?), however since all 
data appears to have been written to the first drive, I am led to 
believe that the second drive was present but not readded, even though 
it reappeared as devid 2 in the listing.


If the above is correct, then I have two questions that I haven't found 
any documentation on:


1. What is the expectation on hot-adding a failed drive, is an explicit 
'device add' or 'replace' expected/required? In my case it appeared to 
be auto-added, but that may have been spurious or misleading. I'd 
consider that if an explicit readd is required, that the device not be 
listed at all, however I would be much more interested to see hotplug of 
a previously missing device (with older modifications from the same 
volume) be readded and synced automatically.


2. If initially mounted as degraded, once a new drive is added, is a 
remount required? I'd hope not, but since the mount flag can't be 
changed later on, what is the best way to confirm health of the volume? 
Until this issue I'd assumed using 'filesystem show'. Since the mount 
flag is at mount time only, degraded seems to mean "be degraded if 
needed" instead of a positive indicator that the volume is indeed 
degraded"




$ mount | grep btrfs
/dev/sdc on /mnt/new-store type btrfs (rw,relatime,degraded,space_cache)

$ du -hsx /mnt/new-store
305G/mnt/new-store

$ df -h | grep new-store
/dev/sdc 932G  307G  160G  66% /mnt/new-store

$ btrfs fi show /dev/sdc
Label: 'new-store'  uuid: 14e6e9c7-b249-40ff-8be1-78fc8b26b53d
Total devices 2 FS bytes used 540.00KB
devid2 size 465.76GB used 2.01GB path /dev/sdd
devid1 size 465.76GB used 453.03GB path /dev/sdc

$ btrfs fi df /mnt/new-store
Data, RAID1: total=1.00GB, used=997.21MB
Data: total=450.01GB, used=303.18GB
System, RAID1: total=8.00MB, used=56.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=1.00GB, used=617.14MB
Metadata: total=1.01GB, used=0.00

I may be missing or mis-remembering some of the order of events leading 
to the current state, however the space usage numbers don't reflect 
anything close to what I would expect.


On the data used on sdc I've assumed that's old from when I filled the 
volume and hasn't been reclaimed by a balance or other operation. 
However, the "used=997.21MB" from fi df, as well as the "FS bytes used 
540.00KB" from fi show seem suspect based on what


Thanks for help understanding the space allocation and usage patterns, 
I've tried to put pieces together based on man pages, wiki and other 
postings, but can't seem to reconcile what I think I should be seeing 
based on that reading with what I'm actually seeing.


Joel
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Samba strict allocate = yes stops btrfs compression working

2013-08-23 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 23/08/13 01:20, Mark Ridley wrote:
> The main reason I started using strict allocate = yes on samba was out
> of desperation/exasperation with BTRFS.

The most effective performance option is to turn oplocks on.
Opportunistic locks are granted to a client when it is the only one with a
file open.  That lets it do caching and not even tell the server about
record locking.  Once another client opens the same file then the first
client has to flush all outstanding writes, its caches, record locking etc.

I don't know what Samba currently uses as the default value, but it is
traditionally set to off because there are scenarios under which it isn't
sufficiently robust (eg access on the Unix server side, a network break
when there is outstanding data).

http://www.samba.org/samba/docs/man/Samba-HOWTO-Collection/locking.html#id2616903

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlIXuX0ACgkQmOOfHg372QSRwgCgwwwoo7QqRql8+CS5xggRBVRt
krAAn1wFiIIliZJ7WbWh/oh8crEWLgfR
=HZG9
-END PGP SIGNATURE-

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][RESEND] xfstests: add a test for btrfs device replace operation

2013-08-23 Thread Josef Bacik
On Fri, Aug 23, 2013 at 03:07:11PM +0200, Stefan Behrens wrote:
> This test performs btrfs device replace tests with all possible profiles
> (single/dup/mixed/raid0/raid1/raid10), one round with the '-r' option
> to 'btrfs replace start' and one round without this option. The
> cancelation is tested only once and with the dup/single profile for
> metadata/data.
> 
> This test takes 181 seconds on my SSD equiped test box and 237s on
> spinning disks. Almost all the time is spent when the filesystem is
> populated with test data. The replace operation itself takes less than
> a second for all the tests, except for the test that is marked as
> 'thorough' which will run for about 8 seconds on my test box.
> 
> The amount of tests done depends on the number of devices in the
> SCRATCH_DEV_POOL. For full test coverage, at least 5 devices should
> be available (e.g. 5 partitions). With less than 2 entries in
> SCRATCH_DEV_POOL, the test is not executed.
> 
> The source and target devices for the replace operation are arbitrarily
> chosen out of SCRATCH_DEV_POOl. Since the target device mustn't be
> smaller than the source device, the requirement for this test is that
> all devices have _exactly_ the same size. If this is not the case, the
> test terminates with _notrun.
> 
> To check the filesystems after replacing a device, a scrub run is
> performed, a btrfsck run, and finally the filesystem is remounted.
> 
> This commit depends on my other commit:
> "xfstest: don't remove the two first devices from SCRATCH_DEV_POOL"
> 
> Signed-off-by: Stefan Behrens 

This worked well, thanks,

Reviewed-by: Josef Bacik 

Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] xfstests: update _filter_size() for Btrfs

2013-08-23 Thread Josef Bacik
On Fri, Aug 23, 2013 at 03:07:12PM +0200, Stefan Behrens wrote:
> From: root 
> 
> The btrfs-progs tools changed the output:
> - 100GiB instead of 100GB
> 
> xfstest btrfs/006 is one that failed due to this change.
> 
> Signed-off-by: Stefan Behrens 

Thank you for this, it was super annoying with the new btrfs-progs breaking
this.

Reviewed-by: Josef Bacik 

Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] xfstest: fix btrfs/006 for 10+ devices in SCRATCH_DEV_POOL

2013-08-23 Thread Josef Bacik
On Fri, Aug 23, 2013 at 03:07:13PM +0200, Stefan Behrens wrote:
> One problem was the output of "uniq -c" which added spaces depending
> on the size of the count value (e.g. one space less for 10+ devices).
> 
> The second problem was that "btrfs fi show" was doing the same:
> "devid %4llu size %s used %s path %s".
> 

Got 11 devices together and this works

Reviewed-by: Josef Bacik 

Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V4] xfstests: don't remove the two first devices from SCRATCH_DEV_POOL

2013-08-23 Thread Josef Bacik
On Fri, Aug 23, 2013 at 03:07:10PM +0200, Stefan Behrens wrote:
> Since common/config is executed twice, if SCRATCH_DEV_POOL is configured
> via the environment, the current code removes the first device entry twice
> which means that you lose the second device for the test.
> 
> The fix is to not remove anything from SCRATCH_DEV_POOL anymore.
> That used to be done (I can only guess) to allow to pass the
> SCRATCH_DEV_POOL as an argument to _scratch_mkfs. Since _scratch_mkfs adds
> the SCRATCH_DEV, the pool mustn't contain that device anymore.
> 
> A new function _scratch_pool_mkfs is introduced that does the expected
> thing.
> 

This didn't break anything and makes sense

Reviewed-by: Josef Bacik 

Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Btrfs: fix for patch "cleanup: don't check the same thing twice"

2013-08-23 Thread Mitch Harder
On Fri, Aug 23, 2013 at 4:03 AM, Miao Xie  wrote:
> On fri, 23 Aug 2013 10:34:42 +0200, Stefan Behrens wrote:
>> Mitch Harder noticed that the patch 3c64a1a mentioned in the subject
>> line was causing a kernel BUG() on snapshot deletion.
>>
>> The patch was wrong. It did not handle cached roots correctly. The
>> check for root_refs == 0 was removed everywhere where
>> btrfs_read_fs_root_no_name() had been used to retrieve the root,
>> because this check was already dealt with in
>> btrfs_read_fs_root_no_name(). But in the case when the root was
>> found in the cache, there was no such check.
>>
>> This patch adds the missing check in the case where the root is
>> found in the cache.
>>
>> Reported-by: Mitch Harder 
>> Signed-off-by: Stefan Behrens 
>> ---
>>  fs/btrfs/disk-io.c | 5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index 43ec3c6..7078554 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -1583,8 +1583,11 @@ struct btrfs_root *btrfs_read_fs_root_no_name(struct 
>> btrfs_fs_info *fs_info,
>>   ERR_PTR(-ENOENT);
>>  again:
>>   root = btrfs_lookup_fs_root(fs_info, location->objectid);
>> - if (root)
>> + if (root) {
>> + if (btrfs_root_refs(&root->root_item) == 0)
>> + return ERR_PTR(-ENOENT);
>>   return root;
>> + }
>
> It seems good to me.
>
> Reviewed-by: Miao Xie 
>
>>
>>   root = btrfs_read_fs_root(fs_info->tree_root, location);
>>   if (IS_ERR(root))
>>

Tested-by: Mitch Harder 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Kernel BUG on Snapshot Deletion (3.11.0-rc5)

2013-08-23 Thread Mitch Harder
On Fri, Aug 23, 2013 at 3:48 AM, Stefan Behrens
 wrote:
> On Wed, 21 Aug 2013 08:44:55 -0500, Mitch Harder wrote:
>> I've had a hard time assembling a portable reproducer for this issue.
>>
>> I discovered that my reproducer was highly dependent on a local
>> archive of out-of-date git kernel sources.  My efforts to reproduce
>> the error with a portable set of scripts with publicly available
>> kernel git sources weren't successful.
>>
>> It seems like this issue is related to a corner-case workload that is
>> difficult to reproduce.
>>
>> So I've bisected the error I was seeing with my local script, and
>> identified the following commit as triggering my issue:
>>
>> commit:3c64a1aba7cfcb04f79e76f859b3d0275d59
>> Btrfs: cleanup: don't check the same thing twice
>> https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/fs/btrfs?h=for-linus&id=3c64a1aba7cfcb04
>>
>> I tested a kernel which reverted this change, and also added WARN_ON
>> lines to provide a back trace.
> [...]
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index cd46e2c..a1091f7 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -2302,6 +2302,12 @@ static noinline int
>> relink_extent_backref(struct btrfs_path *path,
>>  return 0;
>>  return PTR_ERR(root);
>>  }
>> +if (btrfs_root_refs(&root->root_item) == 0) {
>> +srcu_read_unlock(&fs_info->subvol_srcu, index);
>> +/* parse ENOENT to 0 */
>> +WARN_ON(1);
>> +return 0;
>> +}
> [...]
>> [ 1616.886868] [ cut here ]
>> [ 1616.886912] WARNING: at fs/btrfs/inode.c:2308 
>> relink_extent_backref+0x103/0x721 [btrfs]()
>> [ 1616.887050] Call Trace:
>> [ 1616.887064] [] dump_stack+0x19/0x1b
>> [ 1616.887071] [] warn_slowpath_common+0x67/0x80
>> [ 1616.887077] [] warn_slowpath_null+0x1a/0x1c
>> [ 1616.887100] [] relink_extent_backref+0x103/0x721
>> [ 1616.887205] [] btrfs_finish_ordered_io+0x742/0x829
>
> Mitch,
>
> Thank you for this excellent work to find the cause of the issue. I've sent a 
> patch "Btrfs: fix for patch "cleanup: don't check the same thing twice"" and 
> would appreciate if you could repeat your test, just to make sure, because I 
> was never able to reproduce this issue myself.
>

Thanks.

I've tested my "special" workload with your patch on the latest
3.11_rc6 kernel, and the patch corrects the errors I was encountering.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Samba strict allocate = yes stops btrfs compression working

2013-08-23 Thread Clemens Eisserer
Hi,

> The speed improvement for dumping large databases through samba with
> strict allocate = yes to BTRFS was amazing.  It reduced a 1 hour dump down
> to 20 minutes.

What you want btrfs to do is to allocate a file of fixed-size on disk
in advance, without knowing how large the file will be after
compression (which is the only thing of interest to btrfs). So the
whole idea doesn't make a lot of sence.

Sure, you could generate a 50gb file filled with zeros, but once you
put real data in it, it will fragment a lot - probably even more than
the file which was created without strict allocate.

Regards
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH][RESEND] xfstests: add a test for btrfs device replace operation

2013-08-23 Thread Stefan Behrens
This test performs btrfs device replace tests with all possible profiles
(single/dup/mixed/raid0/raid1/raid10), one round with the '-r' option
to 'btrfs replace start' and one round without this option. The
cancelation is tested only once and with the dup/single profile for
metadata/data.

This test takes 181 seconds on my SSD equiped test box and 237s on
spinning disks. Almost all the time is spent when the filesystem is
populated with test data. The replace operation itself takes less than
a second for all the tests, except for the test that is marked as
'thorough' which will run for about 8 seconds on my test box.

The amount of tests done depends on the number of devices in the
SCRATCH_DEV_POOL. For full test coverage, at least 5 devices should
be available (e.g. 5 partitions). With less than 2 entries in
SCRATCH_DEV_POOL, the test is not executed.

The source and target devices for the replace operation are arbitrarily
chosen out of SCRATCH_DEV_POOl. Since the target device mustn't be
smaller than the source device, the requirement for this test is that
all devices have _exactly_ the same size. If this is not the case, the
test terminates with _notrun.

To check the filesystems after replacing a device, a scrub run is
performed, a btrfsck run, and finally the filesystem is remounted.

This commit depends on my other commit:
"xfstest: don't remove the two first devices from SCRATCH_DEV_POOL"

Signed-off-by: Stefan Behrens 
---
V1 -> V2:
Major reworking in order to address all the comments from Eric Sandeen's
review.

V2 -> V3:
- Rebased.
- The check that the devices have the same size was missing the target
  device.

V3 -> V4:
Somehow I managed to add code that used a hardcoded path instead of
$SCRATCH_MNT. Fixed this issue.

 common/config   |   1 +
 tests/btrfs/010 | 277 
 tests/btrfs/010.out |   3 +
 tests/btrfs/group   |   1 +
 4 files changed, 282 insertions(+)

diff --git a/common/config b/common/config
index 586870b..db086fb 100644
--- a/common/config
+++ b/common/config
@@ -207,6 +207,7 @@ case "$HOSTOS" in
 export MKFS_UDF_PROG="`set_prog_path mkudffs`"
 export MKFS_BTRFS_PROG="`set_btrfs_mkfs_prog_path_with_opts`"
 export BTRFS_UTIL_PROG="`set_prog_path btrfs`"
+export BTRFS_SHOW_SUPER_PROG="`set_prog_path btrfs-show-super`"
 export XFS_FSR_PROG="`set_prog_path xfs_fsr`"
 export MKFS_NFS_PROG="false"
 ;;
diff --git a/tests/btrfs/010 b/tests/btrfs/010
new file mode 100755
index 000..36800ff
--- /dev/null
+++ b/tests/btrfs/010
@@ -0,0 +1,277 @@
+#! /bin/bash
+# FSQA Test No. btrfs/010
+#
+# Test of the btrfs replace operation.
+#
+# The amount of tests done depends on the number of devices in the
+# SCRATCH_DEV_POOL. For full test coverage, at least 5 devices should
+# be available (e.g. 5 partitions).
+#
+# The source and target devices for the replace operation are
+# arbitrarily chosen out of SCRATCH_DEV_POOl. Since the target device
+# mustn't be smaller than the source device, the requirement for this
+# test is that all devices have _exactly_ the same size. If this is
+# not the case, this test is not run.
+#
+# To check the filesystems after replacing a device, a scrub run is
+# performed, a btrfsck run, and finally the filesystem is remounted.
+#
+#---
+# Copyright (C) 2013 STRATO.  All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+noise_pid=0
+
+_cleanup()
+{
+   if [ $noise_pid -ne 0 ] && ps -p $noise_pid | grep -q $noise_pid; then
+   kill -TERM $noise_pid
+   fi
+   wait
+   rm -f $tmp.tmp
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_need_to_be_root
+_supported_fs btrfs
+_require_scratch
+_require_scratch_dev_pool
+_require_command $BTRFS_SHOW_SUPER_PROG btrfs-show-super
+
+rm -f $seqres.full
+rm -f $tmp.tmp
+
+echo "*** test btrfs replace"
+
+workout()
+{
+   local mkfs_options="$1"
+   local num_devs4raid="$2"
+   local with_cancel="$3"
+   lo

[PATCH] xfstests: update _filter_size() for Btrfs

2013-08-23 Thread Stefan Behrens
From: root 

The btrfs-progs tools changed the output:
- 100GiB instead of 100GB

xfstest btrfs/006 is one that failed due to this change.

Signed-off-by: Stefan Behrens 
---
 common/filter | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/common/filter b/common/filter
index dbb1674..ee738ca 100644
--- a/common/filter
+++ b/common/filter
@@ -262,7 +262,7 @@ _filter_uuid()
 # Filter out sizes like 6.14MB etc
 _filter_size()
 {
-   sed -e "s/[0-9\.]\+\s\?[b|k|m|g|t][b]\?//ig"
+   sed -e "s/[0-9\.]\+\s\?[b|k|m|g|t][i]\?[b]\?//ig"
 }
 
 # Convert string read from stdin like 128K to bytes and print it to stdout
-- 
1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] xfstest: fix btrfs/006 for 10+ devices in SCRATCH_DEV_POOL

2013-08-23 Thread Stefan Behrens
One problem was the output of "uniq -c" which added spaces depending
on the size of the count value (e.g. one space less for 10+ devices).

The second problem was that "btrfs fi show" was doing the same:
"devid %4llu size %s used %s path %s".

Signed-off-by: Stefan Behrens 
---
 common/filter.btrfs |  4 ++--
 tests/btrfs/006.out | 14 +++---
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/common/filter.btrfs b/common/filter.btrfs
index e9a2bc2..29512cd 100644
--- a/common/filter.btrfs
+++ b/common/filter.btrfs
@@ -10,7 +10,7 @@ _filter_btrfs_version()
 
 _filter_devid()
 {
-   sed -e "s/\(devid\s\+\)[0-9]\+/\1 /g"
+   sed -e "s/\(devid\)\s\+[0-9]\+/\1 /g"
 }
 
 # If passed a number as first arg, filter that number of devices
@@ -53,7 +53,7 @@ _filter_btrfs_device_stats()
 
_filter_scratch | _filter_scratch_pool | \
sed -e "s/[0-9]\+$//g" | sort | uniq $UNIQ_OPT | \
-   sed -e "s/$NUMDEVS / /g"
+   sed -e "s/ *$NUMDEVS / /g"
 }
 
 # make sure this script returns success
diff --git a/tests/btrfs/006.out b/tests/btrfs/006.out
index ab33b7e..22bcb77 100644
--- a/tests/btrfs/006.out
+++ b/tests/btrfs/006.out
@@ -6,21 +6,21 @@ TestLabel.006
 == Show filesystem by label
 Label: 'TestLabel.006'  uuid: 
Total devices  FS bytes used 
-   devid  size  used  path SCRATCH_DEV
+   devid  size  used  path SCRATCH_DEV
 
 == Show filesystem by UUID
 Label: 'TestLabel.006'  uuid: 
Total devices  FS bytes used 
-   devid  size  used  path SCRATCH_DEV
+   devid  size  used  path SCRATCH_DEV
 
 == Sync filesystem
 FSSync 'SCRATCH_MNT'
 == Show device stats by mountpoint
-   [SCRATCH_DEV].corruption_errs 
-   [SCRATCH_DEV].flush_io_errs   
-   [SCRATCH_DEV].generation_errs 
-   [SCRATCH_DEV].read_io_errs
-   [SCRATCH_DEV].write_io_errs   
+ [SCRATCH_DEV].corruption_errs 
+ [SCRATCH_DEV].flush_io_errs   
+ [SCRATCH_DEV].generation_errs 
+ [SCRATCH_DEV].read_io_errs
+ [SCRATCH_DEV].write_io_errs   
 == Show device stats by first/scratch dev
 [SCRATCH_DEV].corruption_errs 
 [SCRATCH_DEV].flush_io_errs   
-- 
1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V4] xfstests: don't remove the two first devices from SCRATCH_DEV_POOL

2013-08-23 Thread Stefan Behrens
Since common/config is executed twice, if SCRATCH_DEV_POOL is configured
via the environment, the current code removes the first device entry twice
which means that you lose the second device for the test.

The fix is to not remove anything from SCRATCH_DEV_POOL anymore.
That used to be done (I can only guess) to allow to pass the
SCRATCH_DEV_POOL as an argument to _scratch_mkfs. Since _scratch_mkfs adds
the SCRATCH_DEV, the pool mustn't contain that device anymore.

A new function _scratch_pool_mkfs is introduced that does the expected
thing.

Signed-off-by: Stefan Behrens 
---
V1 -> V2:
- rebased.
- fixed a missing adaption to the new way to deal with SCRATCH_DEV_POOL
  in  _test_replace() in tests/btrfs/003.

V2 -> V3:
- V2 was the result of a git rebase mistake (a user error), forget it,
  sorry. V3 now contains the correct fix for a missing adaption to the
  new way to deal with SCRATCH_DEV_POOL in _test_replace() in
  tests/btrfs/003.

V3 -> V4:
- rebased, because it didn't apply cleanly anymore today.

 common/config   |  1 -
 common/rc   | 12 
 tests/btrfs/002 |  3 ++-
 tests/btrfs/003 | 16 
 tests/btrfs/006 |  4 ++--
 5 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/common/config b/common/config
index 39dd469..586870b 100644
--- a/common/config
+++ b/common/config
@@ -267,7 +267,6 @@ get_next_config() {
exit 1
fi
SCRATCH_DEV=`echo $SCRATCH_DEV_POOL | awk '{print $1}'`
-   SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | awk '{ ORS=" "; for 
(i = 2; i <= NF; i++) print $i}'`
fi
 
echo $SCRATCH_DEV | grep -q ":" > /dev/null 2>&1
diff --git a/common/rc b/common/rc
index ae80b12..77e96c4 100644
--- a/common/rc
+++ b/common/rc
@@ -550,6 +550,18 @@ _scratch_mkfs()
 esac
 }
 
+_scratch_pool_mkfs()
+{
+case $FSTYP in
+btrfs)
+   $MKFS_BTRFS_PROG $MKFS_OPTIONS $* $SCRATCH_DEV_POOL > /dev/null
+   ;;
+*)
+   echo "_scratch_pool_mkfs is not implemented for $FSTYP" 1>&2
+   ;;
+esac
+}
+
 # Create fs of certain size on scratch device
 # _scratch_mkfs_sized  [optional blocksize]
 _scratch_mkfs_sized()
diff --git a/tests/btrfs/002 b/tests/btrfs/002
index 03e9137..f4389ae 100755
--- a/tests/btrfs/002
+++ b/tests/btrfs/002
@@ -45,8 +45,9 @@ _need_to_be_root
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
+_require_scratch_dev_pool
 
-_scratch_mkfs $SCRATCH_DEV_POOL > /dev/null 2>&1 || _fail "mkfs failed"
+_scratch_pool_mkfs > /dev/null 2>&1 || _fail "mkfs failed"
 _scratch_mount
 
 # Create and save sha256sum
diff --git a/tests/btrfs/003 b/tests/btrfs/003
index 5c88651..262b1d5 100755
--- a/tests/btrfs/003
+++ b/tests/btrfs/003
@@ -58,7 +58,7 @@ rm -f $seqres.full
 _test_raid0()
 {
export MKFS_OPTIONS="-m raid0 -d raid0"
-   _scratch_mkfs $SCRATCH_DEV_POOL >> $seqres.full 2>&1 || _fail "mkfs 
failed"
+   _scratch_pool_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
_scratch_mount
dirp=`mktemp -duq $SCRATCH_MNT/dir.XX`
_populate_fs -n 1 -f 20 -d 10 -r $dirp -s 10
@@ -68,7 +68,7 @@ _test_raid0()
 _test_raid1()
 {
export MKFS_OPTIONS="-m raid1 -d raid1"
-   _scratch_mkfs $SCRATCH_DEV_POOL >> $seqres.full 2>&1 || _fail "mkfs 
failed"
+   _scratch_pool_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
_scratch_mount
dirp=`mktemp -duq $SCRATCH_MNT/dir.XX`
_populate_fs -n 1 -f 20 -d 10 -r $dirp -s 10
@@ -78,7 +78,7 @@ _test_raid1()
 _test_raid10()
 {
export MKFS_OPTIONS="-m raid10 -d raid10"
-   _scratch_mkfs $SCRATCH_DEV_POOL >> $seqres.full 2>&1 || _fail "mkfs 
failed"
+   _scratch_pool_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
_scratch_mount
dirp=`mktemp -duq $SCRATCH_MNT/dir.XX`
_populate_fs -n 1 -f 20 -d 10 -r $dirp -s 10
@@ -88,7 +88,7 @@ _test_raid10()
 _test_single()
 {
export MKFS_OPTIONS="-m single -d single"
-   _scratch_mkfs $SCRATCH_DEV_POOL >> $seqres.full 2>&1 || _fail "mkfs 
failed"
+   _scratch_pool_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
_scratch_mount
dirp=`mktemp -duq $SCRATCH_MNT/dir.XX`
_populate_fs -n 1 -f 20 -d 10 -r $dirp -s 10
@@ -108,7 +108,7 @@ _test_add()
_scratch_mount
dirp=`mktemp -duq $SCRATCH_MNT/dir.XX`
_populate_fs -n 1 -f 20 -d 10 -r $dirp -s 10
-   for i in `seq 1 $n`; do
+   for i in `seq 2 $n`; do
$BTRFS_UTIL_PROG device add ${devs[$i]} $SCRATCH_MNT >> 
$seqres.full 2>&1 || _fail "device add failed"
done
$BTRFS_UTIL_PROG filesystem balance $SCRATCH_MNT >> $seqres.full 2>&1 
|| _fail "balance failed"
@@ -124,9 +124,9 @@ _test_replace()
local d
local DEVHTL=""
 
-   # exclude the last disk in the disk pool
+   # exclude the first and the last disk in the disk pool
n=$(($n-1))
-   ds=${devs[@]:0:$n}
+   ds=${devs[@]:1:$

Re: [PATCH] xfstests: update filters and output of btrfs/006

2013-08-23 Thread Stefan Behrens
On Fri, 16 Aug 2013 12:10:31 -0500, Eric Sandeen wrote:
> On 8/16/13 12:02 PM, Stefan Behrens wrote:
>> The btrfs-progs tools changed the output:
>> - 100GiB instead of 100GB
>> - The number of spaces was changed
> 
> ugh.
> 
>>
>> Signed-off-by: Stefan Behrens 
>> ---
>>  common/filter   |  2 +-
>>  common/filter.btrfs |  3 ++-
>>  tests/btrfs/006 |  6 +++---
>>  tests/btrfs/006.out | 36 ++--
>>  4 files changed, 24 insertions(+), 23 deletions(-)
>>
>> diff --git a/common/filter b/common/filter
>> index dbb1674..ee738ca 100644
>> --- a/common/filter
>> +++ b/common/filter
>> @@ -262,7 +262,7 @@ _filter_uuid()
>>  # Filter out sizes like 6.14MB etc
>>  _filter_size()
>>  {
>> -sed -e "s/[0-9\.]\+\s\?[b|k|m|g|t][b]\?//ig"
>> +sed -e "s/[0-9\.]\+\s\?[b|k|m|g|t][i]\?[b]\?//ig"
>>  }
> 
> makes sense
> 
> But for the rest, is the output change intentional, or sloppiness/accidental?
> 
> If it's really intentional, then:
> 
> Reviewed-by: Eric Sandeen 
> 

Thanks for the review, Eric!

Since this commit is not yet added to the repo, I just took the time to
create a cleaner fix.

The tools didn't change the number of spaces in the output. The root
cause was that I had 11 devices in SCRATCH_DEV_POOL for the first time.
And this caused changes in the number of spaces in the output wherever
things like %4d was used.

One problem was the output of "uniq -c" which added spaces depending
on the size of the count value (e.g. one space less for 10+ devices).

The second problem was that "btrfs fi show" was doing the same:
"devid %4llu size %s used %s path %s".

Please scratch this patch out, I'll send two better patches instead
which address the real problems.


>>  # Convert string read from stdin like 128K to bytes and print it to stdout
>> diff --git a/common/filter.btrfs b/common/filter.btrfs
>> index e9a2bc2..1584596 100644
>> --- a/common/filter.btrfs
>> +++ b/common/filter.btrfs
>> @@ -33,6 +33,7 @@ _filter_btrfs_filesystem_show()
>>  # the uniq collapses all device lines into 1
>>  _filter_uuid $UUID | _filter_scratch | _filter_scratch_pool | \
>>  _filter_size | _filter_btrfs_version | _filter_devid | \
>> +_filter_spaces | \
>>  sed -e "s/\(Total devices\) $NUMDEVS/\1 $NUM_SUBST/g" | \
>>  uniq
>>  }
>> @@ -51,7 +52,7 @@ _filter_btrfs_device_stats()
>>  UNIQ_OPT=""
>>  fi
>>  
>> -_filter_scratch | _filter_scratch_pool | \
>> +_filter_scratch | _filter_scratch_pool | _filter_spaces | \
>>  sed -e "s/[0-9]\+$//g" | sort | uniq $UNIQ_OPT | \
>>  sed -e "s/$NUMDEVS / /g"
>>  }
>> diff --git a/tests/btrfs/006 b/tests/btrfs/006
>> index 9f7beff..f323cc4 100755
>> --- a/tests/btrfs/006
>> +++ b/tests/btrfs/006
>> @@ -82,13 +82,13 @@ echo "== Sync filesystem"
>>  $BTRFS_UTIL_PROG filesystem sync $SCRATCH_MNT | _filter_scratch
>>  
>>  echo "== Show device stats by mountpoint"
>> -$BTRFS_UTIL_PROG device stats $SCRATCH_MNT | _filter_btrfs_device_stats 
>> $TOTAL_DEVS
>> +$BTRFS_UTIL_PROG device stats $SCRATCH_MNT | _filter_btrfs_device_stats 
>> $TOTAL_DEVS | _filter_spaces
>>  echo "== Show device stats by first/scratch dev"
>>  $BTRFS_UTIL_PROG device stats $SCRATCH_DEV | _filter_btrfs_device_stats
>>  echo "== Show device stats by second dev"
>> -$BTRFS_UTIL_PROG device stats $FIRST_POOL_DEV | sed -e 
>> "s,$FIRST_POOL_DEV,FIRST_POOL_DEV,g"
>> +$BTRFS_UTIL_PROG device stats $FIRST_POOL_DEV | sed -e 
>> "s,$FIRST_POOL_DEV,FIRST_POOL_DEV,g" | _filter_spaces
>>  echo "== Show device stats by last dev"
>> -$BTRFS_UTIL_PROG device stats $LAST_POOL_DEV | sed -e 
>> "s,$LAST_POOL_DEV,LAST_POOL_DEV,g"
>> +$BTRFS_UTIL_PROG device stats $LAST_POOL_DEV | sed -e 
>> "s,$LAST_POOL_DEV,LAST_POOL_DEV,g" | _filter_spaces
>>  
>>  # success, all done
>>  status=0
>> diff --git a/tests/btrfs/006.out b/tests/btrfs/006.out
>> index ab33b7e..413a5a8 100644
>> --- a/tests/btrfs/006.out
>> +++ b/tests/btrfs/006.out
>> @@ -4,38 +4,38 @@
>>  TestLabel.006
>>  == Mount.
>>  == Show filesystem by label
>> -Label: 'TestLabel.006'  uuid: 
>> +Label: 'TestLabel.006' uuid: 
>>  Total devices  FS bytes used 
>> -devid  size  used  path SCRATCH_DEV
>> +devid  size  used  path SCRATCH_DEV
>>  
>>  == Show filesystem by UUID
>> -Label: 'TestLabel.006'  uuid: 
>> +Label: 'TestLabel.006' uuid: 
>>  Total devices  FS bytes used 
>> -devid  size  used  path SCRATCH_DEV
>> +devid  size  used  path SCRATCH_DEV
>>  
>>  == Sync filesystem
>>  FSSync 'SCRATCH_MNT'
>>  == Show device stats by mountpoint
>> -   [SCRATCH_DEV].corruption_errs 
>> -   [SCRATCH_DEV].flush_io_errs   
>> -   [SCRATCH_DEV].generation_errs 
>> -   [SCRATCH_DEV].read_io_errs
>> -   [SCRATCH_DEV].write_io_errs   
>> +  [SCRATCH_DEV].corruption_errs 
>> +  [SCRATCH_DEV].flush_io_errs 
>> +  [SCRATCH_DEV].generation_errs 
>> +  [SCRATCH_DEV].read_io_errs 
>> +  [SCRATCH_DEV].write_io_errs 
>>  == Show device stats by first/scratch

Re: default mount options 3.10

2013-08-23 Thread Martin Steigerwald
Am Freitag, 23. August 2013, 12:29:42 schrieb Xavier Bassery:
> On Fri, 23 Aug 2013 11:38:56 +0200
> 
> David Kofler  wrote:
> > Hi,
> > can someone tell me which mount options are included in "defaults"
> > mount option? Couldn't find this in BTRFS Wiki. I'm using Debian
> > Wheezy 7.1 and Linux kernel 3.10.6.
> > Thanks in advance.
> 
> Hi,
> you've looked at the wrong place.
> From mount man page:
> 
> FILESYSTEM INDEPENDENT MOUNT OPTIONS
> defaults : Use default options: rw, suid, dev, exec, auto, nouser, async
> 
> You can also have a look here [1].

That are just VFS options, no BTRFS specific ones.

A good way is to look at output of "mount" and "cat /proc/mounts". It can 
differ from situation as well, for example SSD or not.

But some hints at BTRFS default options are also on (search for "default"):

https://btrfs.wiki.kernel.org/index.php/Mount_options

I donĀ“t know wether there is a complete documentation on the defaults.

> [1] https://wiki.debian.org/fstab#Field_definitions
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: default mount options 3.10

2013-08-23 Thread Xavier Bassery
On Fri, 23 Aug 2013 11:38:56 +0200
David Kofler  wrote:

> Hi,
> can someone tell me which mount options are included in "defaults"
> mount option? Couldn't find this in BTRFS Wiki. I'm using Debian
> Wheezy 7.1 and Linux kernel 3.10.6.
> Thanks in advance.

Hi,
you've looked at the wrong place.
>From mount man page:

FILESYSTEM INDEPENDENT MOUNT OPTIONS
defaults : Use default options: rw, suid, dev, exec, auto, nouser, async

You can also have a look here [1].

Best regards,
Xavier

[1] https://wiki.debian.org/fstab#Field_definitions
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/4] Btrfs: add btrfs_alloc_device and switch to it

2013-08-23 Thread Ilya Dryomov
Currently btrfs_device is allocated ad-hoc in a few different places,
and as a result not all fields are initialized properly.  In particular,
readahead state is only initialized in device_list_add (at scan time),
and not in btrfs_init_new_device (when the new device is added with
'btrfs dev add').  Fix this by adding an allocation helper and switch
everybody but __btrfs_close_devices to it.  (__btrfs_close_devices is
dealt with in a later commit.)

Signed-off-by: Ilya Dryomov 
---
v2:
- fixed a dev-replace regression spotted by Stefan Behrens

 fs/btrfs/volumes.c |  151 
 fs/btrfs/volumes.h |3 ++
 2 files changed, 97 insertions(+), 57 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index ae1bcb0..100c21c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -101,6 +101,27 @@ void btrfs_cleanup_fs_uuids(void)
}
 }
 
+static struct btrfs_device *__alloc_device(void)
+{
+   struct btrfs_device *dev;
+
+   dev = kzalloc(sizeof(*dev), GFP_NOFS);
+   if (!dev)
+   return ERR_PTR(-ENOMEM);
+
+   INIT_LIST_HEAD(&dev->dev_list);
+   INIT_LIST_HEAD(&dev->dev_alloc_list);
+
+   spin_lock_init(&dev->io_lock);
+
+   spin_lock_init(&dev->reada_lock);
+   atomic_set(&dev->reada_in_flight, 0);
+   INIT_RADIX_TREE(&dev->reada_zones, GFP_NOFS & ~__GFP_WAIT);
+   INIT_RADIX_TREE(&dev->reada_extents, GFP_NOFS & ~__GFP_WAIT);
+
+   return dev;
+}
+
 static noinline struct btrfs_device *__find_device(struct list_head *head,
   u64 devid, u8 *uuid)
 {
@@ -414,17 +435,12 @@ static noinline int device_list_add(const char *path,
if (fs_devices->opened)
return -EBUSY;
 
-   device = kzalloc(sizeof(*device), GFP_NOFS);
-   if (!device) {
+   device = btrfs_alloc_device(NULL, &devid,
+   disk_super->dev_item.uuid);
+   if (IS_ERR(device)) {
/* we can safely leave the fs_devices entry around */
-   return -ENOMEM;
+   return PTR_ERR(device);
}
-   device->devid = devid;
-   device->dev_stats_valid = 0;
-   device->work.func = pending_bios_fn;
-   memcpy(device->uuid, disk_super->dev_item.uuid,
-  BTRFS_UUID_SIZE);
-   spin_lock_init(&device->io_lock);
 
name = rcu_string_strdup(path, GFP_NOFS);
if (!name) {
@@ -432,15 +448,6 @@ static noinline int device_list_add(const char *path,
return -ENOMEM;
}
rcu_assign_pointer(device->name, name);
-   INIT_LIST_HEAD(&device->dev_alloc_list);
-
-   /* init readahead state */
-   spin_lock_init(&device->reada_lock);
-   device->reada_curr_zone = NULL;
-   atomic_set(&device->reada_in_flight, 0);
-   device->reada_next = 0;
-   INIT_RADIX_TREE(&device->reada_zones, GFP_NOFS & ~__GFP_WAIT);
-   INIT_RADIX_TREE(&device->reada_extents, GFP_NOFS & ~__GFP_WAIT);
 
mutex_lock(&fs_devices->device_list_mutex);
list_add_rcu(&device->dev_list, &fs_devices->devices);
@@ -491,8 +498,9 @@ static struct btrfs_fs_devices *clone_fs_devices(struct 
btrfs_fs_devices *orig)
list_for_each_entry(orig_dev, &orig->devices, dev_list) {
struct rcu_string *name;
 
-   device = kzalloc(sizeof(*device), GFP_NOFS);
-   if (!device)
+   device = btrfs_alloc_device(NULL, &orig_dev->devid,
+   orig_dev->uuid);
+   if (IS_ERR(device))
goto error;
 
/*
@@ -506,13 +514,6 @@ static struct btrfs_fs_devices *clone_fs_devices(struct 
btrfs_fs_devices *orig)
}
rcu_assign_pointer(device->name, name);
 
-   device->devid = orig_dev->devid;
-   device->work.func = pending_bios_fn;
-   memcpy(device->uuid, orig_dev->uuid, sizeof(device->uuid));
-   spin_lock_init(&device->io_lock);
-   INIT_LIST_HEAD(&device->dev_list);
-   INIT_LIST_HEAD(&device->dev_alloc_list);
-
list_add(&device->dev_list, &fs_devices->devices);
device->fs_devices = fs_devices;
fs_devices->num_devices++;
@@ -1956,10 +1957,10 @@ int btrfs_init_new_device(struct btrfs_root *root, char 
*device_path)
}
mutex_unlock(&root->fs_info->fs_devices->device_list_mutex);
 
-   device = kzalloc(sizeof(*device), GFP_NOFS);
-   if (!device) {
+   device = btrfs_alloc_device(root->fs_info, NULL, NULL);
+   if (IS_ERR(device)) {
/* we can safely leave the fs

Re: Deduplication

2013-08-23 Thread mpe

On 23/08/2013 09:42, Liu Bo wrote:

Hi,

On Thu, Aug 22, 2013 at 09:28:09PM +0200, Florian Lindner wrote:

Hello,

some questions regarding btrfs deduplication.

- What is the state of it? Is it "safe" to use?
https://btrfs.wiki.kernel.org/index.php/Deduplication does not yield
much information.


For inband dedup, it's experimental because it introduces some format changes.



In our tests we are trying btrfs for our internal storage backup backend 
and we like a lot compression! Now would like to test inband dedup.
Does those patchs (are v6 the last?) can be used on ubuntu lts? Do have 
a simple howt-to to follow?


Thanks,
Michele

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: don't allow the replace procedure on read only filesystems

2013-08-23 Thread Stefan Behrens
On Fri, 23 Aug 2013 14:54:50 +0800, Wang Shilong wrote:
> Hey Stefan,
> 
> On 08/20/2013 12:51 AM, Stefan Behrens wrote:
>> If you start the replace procedure on a read only filesystem, at
>> the end the procedure fails to write the updated dev_items to the
>> chunk tree. The problem is that this error is not indicated except
>> for a WARN_ON(). If the user now thinks that everything was done
>> as expected and destroys the source device (with mkfs or with a
>> hammer). The next mount fails with "failed to read chunk root" and
>> the filesystem is gone.
>>
>> This commit adds code to fail the attempt to start the replace
>> procedure if the filesystem is mounted read-only.
>>
>> Signed-off-by: Stefan Behrens 
>> Cc:  # 3.10+
>> ---
>>  fs/btrfs/ioctl.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index 3e36626..bf42d41 100644
>> --- a/fs/btrfs/ioctl.c
>> +++ b/fs/btrfs/ioctl.c
>> @@ -3653,6 +3653,9 @@ static long btrfs_ioctl_dev_replace(struct btrfs_root 
>> *root, void __user *arg)
>>  
>>  switch (p->cmd) {
>>  case BTRFS_IOCTL_DEV_REPLACE_CMD_START:
>> +if (root->fs_info->sb->s_flags & MS_RDONLY)
>> +return -EROFS;
>> +
> 
> This is not really safe. Consideringļ¼š
> 
> Task 1Task2   
> |--->sb->s_flags & MS_RDONLY  
> 
>   Remount filesyste RO
> 
> |--->do replace operations
> 
> For the above case, we will continue to do device replace while the 
> filesystem is READONLY.
> and i think mnt_want_file() will be a right choice.

You are right that a window for a race condition remains and that this
is therefore not a correct solution.

The problem that I have with surrounding the long running replace
procedure with mnt_want_write_file/mnt_drop_write_file is that I am
unable to remount read-only during this time. remount read-only usually
suspends the replace operation, but with mnt_want_write_file it fails
and the Btrfs remount code is not called.
This would be a problem if some (non-Btrfs-replace-aware) software tries
to remount the filesystem read-only.

Therefore I still think that the quick & dirty read-only check that I
added is the better solution.

This happens when mnt_want_write_file() is used:
# btrfs replace start /dev/sdi /dev/sde /mnt2
# mount -o remount,ro /dev/sdi /mnt2
mount: you must specify the filesystem type
# echo $?
32
# btrfs replace cancel /mnt2
# mount -o remount,ro /dev/sdi /mnt2
# echo $?
0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


default mount options 3.10

2013-08-23 Thread David Kofler

Hi,
can someone tell me which mount options are included in "defaults" mount 
option? Couldn't find this in BTRFS Wiki. I'm using Debian Wheezy 7.1 
and Linux kernel 3.10.6.

Thanks in advance.

Best Regards
David
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Samba strict allocate = yes stops btrfs compression working

2013-08-23 Thread Mark Ridley
That would be fine, but nodatacow (according to the btrfs wiki) stops
compression, so I might as well get the speed benefits of 'strict allocate
= yes' which also disables compression.

If you want to use BTRFS to store backups then compression has be turned
on.

Database files like MSSQL usually compress down by 7-10 times, so its a
shame to not get this benefit.


On 23/08/2013 10:08, "Duncan" <1i5t5.dun...@cox.net> wrote:

>Mark Ridley posted on Fri, 23 Aug 2013 09:20:04 +0100 as excerpted:
>
>> I don't want to try nodatacow (which would probably fix the issue), but
>> you lose compression on the whole filesystem, autodefrag doesn't fix it
>> either.
>
>I don't do servantware (in the context of my sig) and thus don't do samba
>here, so I'll just black-box that side of things entirely.
>
>But from the btrfs side, are you considering nodatacow at the filesystem
>or individual file level?  It seems doing it at the individual file level
>might be what you need.
>
>Note that nodatacow must be set on the file at creation in ordered to
>work, which is impractical to do directly.  However, if you can arrange
>for the files in question to appear in a particular directory, you can
>set the nodatacow attribute on the directory, and files created within it
>will inherit that.
>
>It seems to me that should do what you need, PROVIDED that you can
>arrange for the files to appear in a particular dir/dir-tree, not more or
>less randomly written throughout the entire (btrfs) filesystem.
>
>-- 
>Duncan - List replies preferred.   No HTML msgs.
>"Every nonfree program has a lord, a master --
>and if you use the program, he is your master."  Richard Stallman
>
>--
>To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>the body of a message to majord...@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Samba strict allocate = yes stops btrfs compression working

2013-08-23 Thread Mark Ridley
The speed improvement for dumping large databases through samba with
strict allocate = yes to BTRFS was amazing.  It reduced a 1 hour dump down
to 20 minutes.

On 23/08/2013 09:01, "Roger Binns"  wrote:

>-BEGIN PGP SIGNED MESSAGE-
>Hash: SHA1
>
>On 22/08/13 07:07, Josef Bacik wrote:
>> Not sure what strict allocate = yes does,
>
>I've worked on SMB servers before and can answer that.  Historically the
>way Windows apps (right back into the 16 bit days) have made sure there is
>space for a file about to be written is to ask the OS to allocate all the
>space for it.  (Unix by default leaves holes making a sparse file.)
>
>For example if a 10MB file is going to be written then an allocation will
>be done of 10MB.  (The exact underlying protocol commands vary, but
>originally were similar to the Unix seek to end and write.)  After that
>seeks and writes are done.  Because the allocation succeeded the app knows
>that it won't get an out of space error.
>
>Separately from that, it turns out that some filesystems do benefit from
>preallocating the file to the expected size, and then writing the contents
>in dribs and drabs into the allocated space.
>
>Consequently Samba gives you the option of really allocating all the file,
>either for Windows semantics compatibility, or because it results in
>improved performance on the Unix filesystem.
>
>However I can't see it being of any benefit on a COW filesystem like
>btrfs.
>
>Roger
>
>
>
>-BEGIN PGP SIGNATURE-
>Version: GnuPG v1.4.12 (GNU/Linux)
>
>iEYEARECAAYFAlIXFtsACgkQmOOfHg372QR7cwCggRyQxtxj9E7dNKV94M/Tv5o6
>LC0AoN9icJNVxzkV0kDQSgf3Vt0N3g3V
>=wBHz
>-END PGP SIGNATURE-
>
>--
>To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>the body of a message to majord...@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Btrfs: fix for patch "cleanup: don't check the same thing twice"

2013-08-23 Thread Miao Xie
On fri, 23 Aug 2013 10:34:42 +0200, Stefan Behrens wrote:
> Mitch Harder noticed that the patch 3c64a1a mentioned in the subject
> line was causing a kernel BUG() on snapshot deletion.
> 
> The patch was wrong. It did not handle cached roots correctly. The
> check for root_refs == 0 was removed everywhere where
> btrfs_read_fs_root_no_name() had been used to retrieve the root,
> because this check was already dealt with in
> btrfs_read_fs_root_no_name(). But in the case when the root was
> found in the cache, there was no such check.
> 
> This patch adds the missing check in the case where the root is
> found in the cache.
> 
> Reported-by: Mitch Harder 
> Signed-off-by: Stefan Behrens 
> ---
>  fs/btrfs/disk-io.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 43ec3c6..7078554 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1583,8 +1583,11 @@ struct btrfs_root *btrfs_read_fs_root_no_name(struct 
> btrfs_fs_info *fs_info,
>   ERR_PTR(-ENOENT);
>  again:
>   root = btrfs_lookup_fs_root(fs_info, location->objectid);
> - if (root)
> + if (root) {
> + if (btrfs_root_refs(&root->root_item) == 0)
> + return ERR_PTR(-ENOENT);
>   return root;
> + }

It seems good to me. 

Reviewed-by: Miao Xie 

>  
>   root = btrfs_read_fs_root(fs_info->tree_root, location);
>   if (IS_ERR(root))
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Samba strict allocate = yes stops btrfs compression working

2013-08-23 Thread Duncan
Mark Ridley posted on Fri, 23 Aug 2013 09:20:04 +0100 as excerpted:

> I don't want to try nodatacow (which would probably fix the issue), but
> you lose compression on the whole filesystem, autodefrag doesn't fix it
> either.

I don't do servantware (in the context of my sig) and thus don't do samba 
here, so I'll just black-box that side of things entirely.

But from the btrfs side, are you considering nodatacow at the filesystem 
or individual file level?  It seems doing it at the individual file level 
might be what you need.

Note that nodatacow must be set on the file at creation in ordered to 
work, which is impractical to do directly.  However, if you can arrange 
for the files in question to appear in a particular directory, you can 
set the nodatacow attribute on the directory, and files created within it 
will inherit that.

It seems to me that should do what you need, PROVIDED that you can 
arrange for the files to appear in a particular dir/dir-tree, not more or 
less randomly written throughout the entire (btrfs) filesystem.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Samba strict allocate = yes stops btrfs compression working

2013-08-23 Thread Mark Ridley

I tried defrag -c and it does nothing to files that have come in with
strict allocate = yes.

On 22/08/2013 19:29, "Kai Krakow"  wrote:

>Josef Bacik  schrieb:
>
>> Not sure what strict allocate = yes does, but I assume it probably does
>> fallocate() in which case yeah we aren't going to compress, we'll just
>> write
>> into the preallocated space.  We don't support compressed writes into
>> preallocated space ATM, and I'm not sure we ever will.  Thanks,
>
>Good to know, this renders btrfs as efficient storage backend for Windows
>file shares pretty useless. Does this also happen with compress-force?
>
>As a work-around one could write a cronjob that regularly defrags all
>files 
>changed since the last run with -c option...
>
>Thanks,
>Kai
>
>--
>To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>the body of a message to majord...@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] Btrfs: add btrfs_alloc_device and switch to it

2013-08-23 Thread Ilya Dryomov
On Fri, Aug 23, 2013 at 10:35 AM, Stefan Behrens
 wrote:
> This WARN_ON(1) is triggered with the device replace procedure because
> BTRFS_DEV_REPLACE_DEVID is zero.

Ah, a dreaded 0 in C.  What a screw up.  This came from my rebuild
branch where btrfs_init_dev_replace_tgtdev is a bit different, and I
just auto-merged it.  I'll fix it up.

Thanks,

Ilya
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Kernel BUG on Snapshot Deletion (3.11.0-rc5)

2013-08-23 Thread Stefan Behrens
On Wed, 21 Aug 2013 08:44:55 -0500, Mitch Harder wrote:
> I've had a hard time assembling a portable reproducer for this issue.
> 
> I discovered that my reproducer was highly dependent on a local
> archive of out-of-date git kernel sources.  My efforts to reproduce
> the error with a portable set of scripts with publicly available
> kernel git sources weren't successful.
> 
> It seems like this issue is related to a corner-case workload that is
> difficult to reproduce.
> 
> So I've bisected the error I was seeing with my local script, and
> identified the following commit as triggering my issue:
> 
> commit:3c64a1aba7cfcb04f79e76f859b3d0275d59
> Btrfs: cleanup: don't check the same thing twice
> https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/fs/btrfs?h=for-linus&id=3c64a1aba7cfcb04
> 
> I tested a kernel which reverted this change, and also added WARN_ON
> lines to provide a back trace.
[...]
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index cd46e2c..a1091f7 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -2302,6 +2302,12 @@ static noinline int
> relink_extent_backref(struct btrfs_path *path,
>  return 0;
>  return PTR_ERR(root);
>  }
> +if (btrfs_root_refs(&root->root_item) == 0) {
> +srcu_read_unlock(&fs_info->subvol_srcu, index);
> +/* parse ENOENT to 0 */
> +WARN_ON(1);
> +return 0;
> +}
[...]
> [ 1616.886868] [ cut here ]
> [ 1616.886912] WARNING: at fs/btrfs/inode.c:2308 
> relink_extent_backref+0x103/0x721 [btrfs]()
> [ 1616.887050] Call Trace:
> [ 1616.887064] [] dump_stack+0x19/0x1b
> [ 1616.887071] [] warn_slowpath_common+0x67/0x80
> [ 1616.887077] [] warn_slowpath_null+0x1a/0x1c
> [ 1616.887100] [] relink_extent_backref+0x103/0x721
> [ 1616.887205] [] btrfs_finish_ordered_io+0x742/0x829

Mitch,

Thank you for this excellent work to find the cause of the issue. I've sent a 
patch "Btrfs: fix for patch "cleanup: don't check the same thing twice"" and 
would appreciate if you could repeat your test, just to make sure, because I 
was never able to reproduce this issue myself.


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] Btrfs: fix for patch "cleanup: don't check the same thing twice"

2013-08-23 Thread Stefan Behrens
Mitch Harder noticed that the patch 3c64a1a mentioned in the subject
line was causing a kernel BUG() on snapshot deletion.

The patch was wrong. It did not handle cached roots correctly. The
check for root_refs == 0 was removed everywhere where
btrfs_read_fs_root_no_name() had been used to retrieve the root,
because this check was already dealt with in
btrfs_read_fs_root_no_name(). But in the case when the root was
found in the cache, there was no such check.

This patch adds the missing check in the case where the root is
found in the cache.

Reported-by: Mitch Harder 
Signed-off-by: Stefan Behrens 
---
 fs/btrfs/disk-io.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 43ec3c6..7078554 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1583,8 +1583,11 @@ struct btrfs_root *btrfs_read_fs_root_no_name(struct 
btrfs_fs_info *fs_info,
ERR_PTR(-ENOENT);
 again:
root = btrfs_lookup_fs_root(fs_info, location->objectid);
-   if (root)
+   if (root) {
+   if (btrfs_root_refs(&root->root_item) == 0)
+   return ERR_PTR(-ENOENT);
return root;
+   }
 
root = btrfs_read_fs_root(fs_info->tree_root, location);
if (IS_ERR(root))
-- 
1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] Btrf: cleanup: don't check for root_refs == 0 twice

2013-08-23 Thread Stefan Behrens
btrfs_read_fs_root_no_name() already checks if btrfs_root_refs()
is zero and returns ENOENT in this case. There is no need to do
it again in three more places.

Signed-off-by: Stefan Behrens 
---
 fs/btrfs/file.c   | 5 -
 fs/btrfs/relocation.c | 3 ---
 fs/btrfs/scrub.c  | 5 -
 3 files changed, 13 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 5c63229..fce83dd 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -310,11 +310,6 @@ static int __btrfs_run_defrag_inode(struct btrfs_fs_info 
*fs_info,
goto cleanup;
}
 
-   if (btrfs_root_refs(&inode_root->root_item) == 0) {
-   ret = -ENOENT;
-   goto cleanup;
-   }
-
key.objectid = defrag->ino;
btrfs_set_key_type(&key, BTRFS_INODE_ITEM_KEY);
key.offset = 0;
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index cf5e30f..aacc212 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2354,9 +2354,6 @@ again:
if (IS_ERR(root))
continue;
 
-   if (btrfs_root_refs(&root->root_item) == 0)
-   continue;
-
trans = btrfs_join_transaction(root);
BUG_ON(IS_ERR(trans));
 
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index ec6a33a..0afcd45 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -3217,11 +3217,6 @@ static int copy_nocow_pages_for_inode(u64 inum, u64 
offset, u64 root, void *ctx)
return PTR_ERR(local_root);
}
 
-   if (btrfs_root_refs(&local_root->root_item) == 0) {
-   srcu_read_unlock(&fs_info->subvol_srcu, srcu_index);
-   return -ENOENT;
-   }
-
key.type = BTRFS_INODE_ITEM_KEY;
key.objectid = inum;
key.offset = 0;
-- 
1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Samba strict allocate = yes stops btrfs compression working

2013-08-23 Thread Mark Ridley
The main reason I started using strict allocate = yes on samba was out of
desperation/exasperation with BTRFS.

BTRFS stalls from time to time causing SAMBA and/or MSSQL to give up on
the dump of a database.

>From what I have noticed, if for example you dump a 50GB database to samba
without strict allocate, then sometimes the file size gets set to 15GB
(for example), MSSQL then fills it in and then the file size jumps up to a
next amount and gets filled in.

But a lot of times, especially on a system that has been going for about 6
months and may be it is fragmented, but the next allocation of size just
stalls for a while, you cannot even ls the directory.

Either Samba or MSSQL gets fed up of this and the dump fails with a write
error.

I am seeing this on 3.3.2, 3.6.11 and even worse on 3.9 and 3.10.

I don't want to try nodatacow (which would probably fix the issue), but
you lose compression on the whole filesystem, autodefrag doesn't fix it
either.

Any suggestions?

Many thanks.

On 23/08/2013 09:01, "Roger Binns"  wrote:

>-BEGIN PGP SIGNED MESSAGE-
>Hash: SHA1
>
>On 22/08/13 07:07, Josef Bacik wrote:
>> Not sure what strict allocate = yes does,
>
>I've worked on SMB servers before and can answer that.  Historically the
>way Windows apps (right back into the 16 bit days) have made sure there is
>space for a file about to be written is to ask the OS to allocate all the
>space for it.  (Unix by default leaves holes making a sparse file.)
>
>For example if a 10MB file is going to be written then an allocation will
>be done of 10MB.  (The exact underlying protocol commands vary, but
>originally were similar to the Unix seek to end and write.)  After that
>seeks and writes are done.  Because the allocation succeeded the app knows
>that it won't get an out of space error.
>
>Separately from that, it turns out that some filesystems do benefit from
>preallocating the file to the expected size, and then writing the contents
>in dribs and drabs into the allocated space.
>
>Consequently Samba gives you the option of really allocating all the file,
>either for Windows semantics compatibility, or because it results in
>improved performance on the Unix filesystem.
>
>However I can't see it being of any benefit on a COW filesystem like
>btrfs.
>
>Roger
>
>
>
>-BEGIN PGP SIGNATURE-
>Version: GnuPG v1.4.12 (GNU/Linux)
>
>iEYEARECAAYFAlIXFtsACgkQmOOfHg372QR7cwCggRyQxtxj9E7dNKV94M/Tv5o6
>LC0AoN9icJNVxzkV0kDQSgf3Vt0N3g3V
>=wBHz
>-END PGP SIGNATURE-
>
>--
>To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>the body of a message to majord...@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Samba strict allocate = yes stops btrfs compression working

2013-08-23 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 22/08/13 07:07, Josef Bacik wrote:
> Not sure what strict allocate = yes does,

I've worked on SMB servers before and can answer that.  Historically the
way Windows apps (right back into the 16 bit days) have made sure there is
space for a file about to be written is to ask the OS to allocate all the
space for it.  (Unix by default leaves holes making a sparse file.)

For example if a 10MB file is going to be written then an allocation will
be done of 10MB.  (The exact underlying protocol commands vary, but
originally were similar to the Unix seek to end and write.)  After that
seeks and writes are done.  Because the allocation succeeded the app knows
that it won't get an out of space error.

Separately from that, it turns out that some filesystems do benefit from
preallocating the file to the expected size, and then writing the contents
in dribs and drabs into the allocated space.

Consequently Samba gives you the option of really allocating all the file,
either for Windows semantics compatibility, or because it results in
improved performance on the Unix filesystem.

However I can't see it being of any benefit on a COW filesystem like btrfs.

Roger



-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlIXFtsACgkQmOOfHg372QR7cwCggRyQxtxj9E7dNKV94M/Tv5o6
LC0AoN9icJNVxzkV0kDQSgf3Vt0N3g3V
=wBHz
-END PGP SIGNATURE-

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Deduplication

2013-08-23 Thread Liu Bo
Hi,

On Thu, Aug 22, 2013 at 09:28:09PM +0200, Florian Lindner wrote:
> Hello,
> 
> some questions regarding btrfs deduplication.
> 
> - What is the state of it? Is it "safe" to use?
> https://btrfs.wiki.kernel.org/index.php/Deduplication does not yield
> much information.

For inband dedup, it's experimental because it introduces some format changes.

> 
> - https://pypi.python.org/pypi/bedup says: "bedup looks for new and
> changed files, making sure that multiple copies of identical files
> share space on disk. It integrates deeply with btrfs so that scans are
> incremental and low-impact."
> 
> Is is file-based deduplcation only or is there also a block-based
> mode? Does not seem so according to the docs.

Both inband dedup and offband dedup patchset are floating on the list, and
inband is block-based, but I'm not sure about offband dedup.

-liubo

> 
> If so, what is the advantage to numerous other tools that replace
> duplicates by hard links? Performance?
> 
> Thanks,
> 
> Florian
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: remove ourselves from the cluster list under lock

2013-08-23 Thread Miao Xie
On  thu, 22 Aug 2013 17:04:59 -0400, Josef Bacik wrote:
> A user was reporting weird warnings from btrfs_put_delayed_ref() and I noticed
> that we were doing this list_del_init() on our head ref outside of
> delayed_refs->lock.  This is a problem if we have people still on the list, we
> could end up modifying old pointers and such.  Fix this by removing us from 
> the
> list before we do our run_delayed_ref on our head ref.  Thanks,

Looks good.

Reviewed-by: Miao Xie 

> 
> Signed-off-by: Josef Bacik 
> ---
>  fs/btrfs/extent-tree.c |3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 4c89566..95c6539 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -2440,6 +2440,8 @@ static noinline int run_clustered_refs(struct 
> btrfs_trans_handle *trans,
>   default:
>   WARN_ON(1);
>   }
> + } else {
> + list_del_init(&locked_ref->cluster);
>   }
>   spin_unlock(&delayed_refs->lock);
>  
> @@ -2462,7 +2464,6 @@ static noinline int run_clustered_refs(struct 
> btrfs_trans_handle *trans,
>* list before we release it.
>*/
>   if (btrfs_delayed_ref_is_head(ref)) {
> - list_del_init(&locked_ref->cluster);
>   btrfs_delayed_ref_unlock(locked_ref);
>   locked_ref = NULL;
>   }
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] Btrfs: add btrfs_alloc_device and switch to it

2013-08-23 Thread Stefan Behrens
On Mon, 12 Aug 2013 14:33:02 +0300, Ilya Dryomov wrote:
> Currently btrfs_device is allocated ad-hoc in a few different places,
> and as a result not all fields are initialized properly.  In particular,
> readahead state is only initialized in device_list_add (at scan time),
> and not in btrfs_init_new_device (when the new device is added with
> 'btrfs dev add').  Fix this by adding an allocation helper and switch
> everybody but __btrfs_close_devices to it.  (__btrfs_close_devices is
> dealt with in a later commit.)
> 
> Signed-off-by: Ilya Dryomov 
> ---
>  fs/btrfs/volumes.c |  150 
> 
>  fs/btrfs/volumes.h |3 ++
>  2 files changed, 96 insertions(+), 57 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index ae1bcb0..fe52704 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
[...]


> @@ -2142,9 +2133,9 @@ int btrfs_init_dev_replace_tgtdev(struct btrfs_root 
> *root, char *device_path,
>   }
>   }
>  
> - device = kzalloc(sizeof(*device), GFP_NOFS);
> - if (!device) {
> - ret = -ENOMEM;
> + device = btrfs_alloc_device(NULL, BTRFS_DEV_REPLACE_DEVID, NULL);
> + if (IS_ERR(device)) {
> + ret = PTR_ERR(device);

BTRFS_DEV_REPLACE_DEVID is not a "const u64 *devid".


> +struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info,
> + const u64 *devid,
> + const u8 *uuid)
> +{
> + struct btrfs_device *dev;
> + u64 tmp;
> +
> + if (!devid && !fs_info) {
> + WARN_ON(1);
> + return ERR_PTR(-EINVAL);
> + }

This WARN_ON(1) is triggered with the device replace procedure because
BTRFS_DEV_REPLACE_DEVID is zero.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html