Re: [PATCH] btrfs: add btrfs resize unit t/p/e support

2014-03-27 Thread Brendan Hide

On 2014/03/27 04:51 AM, Gui Hecheng wrote:

[snip]

> We add t/p/e support by replacing lib/cmdline.c:memparse
> with btrfs_memparse. The btrfs_memparse copies memparse's code
> and adds unit t/p/e parsing.

Is there a conflict preventing adding this to memparse directly?

--
__
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] btrfs-progs: make device discard process interruptible

2014-03-27 Thread David Sterba
The ioctl for the whole range is not interruptible, which can be
annoying when the discard is not wanted but the user forgot to use the -K
option.

Signed-off-by: David Sterba dste...@suse.cz
---
 utils.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/utils.c b/utils.c
index 013d74f9a0cd..3e9c527a492c 100644
--- a/utils.c
+++ b/utils.c
@@ -52,8 +52,10 @@
 #define BLKDISCARD _IO(0x12,119)
 #endif
 
-static int
-discard_blocks(int fd, u64 start, u64 len)
+/*
+ * Discard the given range in one go
+ */
+static int discard_range(int fd, u64 start, u64 len)
 {
 	u64 range[2] = { start, len };
 
@@ -62,6 +64,26 @@ discard_blocks(int fd, u64 start, u64 len)
 	return 0;
 }
 
+/*
+ * Discard blocks in the given range in 1G chunks, the process is interruptible
+ */
+static int discard_blocks(int fd, u64 start, u64 len)
+{
+	while (len > 0) {
+		/* 1G granularity */
+		u64 chunk_size = min_t(u64, len, 1*1024*1024*1024);
+		int ret;
+
+		ret = discard_range(fd, start, chunk_size);
+		if (ret)
+			return ret;
+		len -= chunk_size;
+		start += chunk_size;
+	}
+
+	return 0;
+}
+
 static u64 reference_root_table[] = {
 	[1] =	BTRFS_ROOT_TREE_OBJECTID,
 	[2] =	BTRFS_EXTENT_TREE_OBJECTID,
-- 
1.9.0
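
For readers following along, the chunking loop in the patch above can be
sketched in plain userspace C. This is only an illustration under stated
assumptions: process_in_chunks(), chunk_fn, struct chunk_stats and
count_chunk() are hypothetical names that stand in for discard_blocks() and
discard_range(), and the callback just counts calls instead of issuing a
BLKDISCARD ioctl. The point is that each call covers at most 1 GiB, so
control returns to the caller between pieces instead of staying inside one
long ioctl.

```c
#include <stdint.h>

#define CHUNK_SIZE (1ULL << 30)	/* 1 GiB, same granularity as the patch */

/* Hypothetical callback type standing in for discard_range(). */
typedef int (*chunk_fn)(uint64_t start, uint64_t len, void *ctx);

/* Walk [start, start + len) in pieces of at most CHUNK_SIZE bytes. */
static int process_in_chunks(uint64_t start, uint64_t len,
			     chunk_fn fn, void *ctx)
{
	while (len > 0) {
		uint64_t chunk = len < CHUNK_SIZE ? len : CHUNK_SIZE;
		int ret = fn(start, chunk, ctx);

		if (ret)
			return ret;	/* stop at the first failure */
		start += chunk;
		len -= chunk;
	}
	return 0;
}

/* Example callback state: number of calls and total bytes covered. */
struct chunk_stats {
	unsigned int calls;
	uint64_t total;
};

/* Example callback: records the piece instead of discarding anything. */
static int count_chunk(uint64_t start, uint64_t len, void *ctx)
{
	struct chunk_stats *s = ctx;

	(void)start;
	s->calls++;
	s->total += len;
	return 0;
}
```

Calling process_in_chunks(0, 2.5 GiB, count_chunk, &stats) would invoke the
callback three times (1 GiB, 1 GiB, 0.5 GiB).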



Re: [PATCH] btrfs: add btrfs resize unit t/p/e support

2014-03-27 Thread David Sterba
On Thu, Mar 27, 2014 at 09:35:41AM +0200, Brendan Hide wrote:
> On 2014/03/27 04:51 AM, Gui Hecheng wrote:
> [snip]
>
> > We add t/p/e support by replacing lib/cmdline.c:memparse
> > with btrfs_memparse. The btrfs_memparse copies memparse's code
> > and adds unit t/p/e parsing.
> Is there a conflict preventing adding this to memparse directly?

Agreed, there's no reason to duplicate this function.
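
As a rough illustration of what extending the suffix parsing amounts to,
here is a userspace sketch. It is not the kernel's memparse: parse_size() is
a hypothetical name, strtoull() replaces the kernel's string helper, and
there is no overflow checking. Each suffix falls through to the next smaller
one, so adding t/p/e is three more cases at the top of the switch.

```c
#include <stdlib.h>

/*
 * Parse a number with an optional binary suffix (k, m, g, t, p, e,
 * upper or lower case).  If retptr is not NULL it is set to the first
 * character after the parsed value and suffix.
 */
static unsigned long long parse_size(const char *str, char **retptr)
{
	char *end;
	unsigned long long val = strtoull(str, &end, 0);

	switch (*end) {
	case 'E': case 'e':
		val <<= 10;
		/* fall through */
	case 'P': case 'p':
		val <<= 10;
		/* fall through */
	case 'T': case 't':
		val <<= 10;
		/* fall through */
	case 'G': case 'g':
		val <<= 10;
		/* fall through */
	case 'M': case 'm':
		val <<= 10;
		/* fall through */
	case 'K': case 'k':
		val <<= 10;
		end++;		/* consume the suffix character */
		break;
	default:
		break;
	}

	if (retptr)
		*retptr = end;
	return val;
}
```

With this sketch, parse_size("2t", NULL) yields 2 << 40 and
parse_size("3k", NULL) yields 3072.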


Re: [PATCH] Btrfs: fix memory leak in btrfs_create_tree()

2014-03-27 Thread Alex Lyakas
Hi Tsutomu Itoh,

On Thu, Mar 21, 2013 at 6:32 AM, Tsutomu Itoh t-i...@jp.fujitsu.com wrote:
> We should free leaf and root before returning from the error
> handling code.
>
> Signed-off-by: Tsutomu Itoh t-i...@jp.fujitsu.com
> ---
>  fs/btrfs/disk-io.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 7d84651..b1b5baa 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1291,6 +1291,7 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
>  				      0, objectid, NULL, 0, 0, 0);
>  	if (IS_ERR(leaf)) {
>  		ret = PTR_ERR(leaf);
> +		leaf = NULL;
>  		goto fail;
>  	}
>
> @@ -1334,11 +1335,16 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
>
>  	btrfs_tree_unlock(leaf);
>
> +	return root;
> +
>  fail:
> -	if (ret)
> -		return ERR_PTR(ret);
> +	if (leaf) {
> +		btrfs_tree_unlock(leaf);
> +		free_extent_buffer(leaf);
I believe this is not enough. A few lines above, another reference on
the root is taken by
root->commit_root = btrfs_root_node(root);

So I believe the proper fix would be:
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d9698fd..260af79 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1354,10 +1354,10 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 	return root;

 fail:
-	if (leaf) {
+	if (leaf)
 		btrfs_tree_unlock(leaf);
-		free_extent_buffer(leaf);
-	}
+	free_extent_buffer(root->node);
+	free_extent_buffer(root->commit_root);
 	kfree(root);

 	return ERR_PTR(ret);



Thanks,
Alex.



> +	}
> +	kfree(root);
>
> -	return root;
> +	return ERR_PTR(ret);
>  }
>
>  static struct btrfs_root *alloc_log_tree(struct btrfs_trans_handle *trans,
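
To make the reference-counting point in this reply concrete, here is a small
userspace sketch. Everything in it is hypothetical (struct buf, buf_get(),
buf_put(), create_root(), the live_refs counter); it only models the idea
that when two pointers each hold a reference on the same buffer, the error
path has to drop both before freeing the owner, which is what the suggested
fix does for root->node and root->commit_root.

```c
#include <stdlib.h>

static int live_refs;	/* references currently held, for illustration */

struct buf {
	int refs;
};

/* Allocate a buffer holding one reference. */
static struct buf *buf_alloc(void)
{
	struct buf *b = calloc(1, sizeof(*b));

	if (b) {
		b->refs = 1;
		live_refs++;
	}
	return b;
}

/* Take an additional reference. */
static struct buf *buf_get(struct buf *b)
{
	b->refs++;
	live_refs++;
	return b;
}

/* Drop one reference; NULL is accepted and ignored. */
static void buf_put(struct buf *b)
{
	if (!b)
		return;
	live_refs--;
	if (--b->refs == 0)
		free(b);
}

struct root {
	struct buf *node;
	struct buf *commit_root;
};

/* fail_late simulates an error after both references were taken. */
static struct root *create_root(int fail_late)
{
	struct root *root = calloc(1, sizeof(*root));

	if (!root)
		return NULL;

	root->node = buf_alloc();
	if (!root->node)
		goto fail;
	root->commit_root = buf_get(root->node);	/* second reference */

	if (fail_late)
		goto fail;
	return root;

fail:
	/* Drop every reference the root holds, not just the first one. */
	buf_put(root->node);
	buf_put(root->commit_root);
	free(root);
	return NULL;
}
```

After create_root(1) returns NULL, live_refs is back to 0; dropping only one
of the two references would leave it at 1, which is the leak being described.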



Re: [PATCH V2 02/10] Btrfs: wake up the tasks that wait for the io earlier

2014-03-27 Thread David Sterba
On Thu, Mar 06, 2014 at 01:54:56PM +0800, Miao Xie wrote:
> @@ -349,10 +349,13 @@ int btrfs_dec_test_first_ordered_pending(struct inode *inode,
>  	if (!uptodate)
>  		set_bit(BTRFS_ORDERED_IOERR, &entry->flags);
>
> -	if (entry->bytes_left == 0)
> +	if (entry->bytes_left == 0) {
>  		ret = test_and_set_bit(BTRFS_ORDERED_IO_DONE, &entry->flags);
> -	else

waitqueue_active() should be preceded by a barrier (either implicit or
explicit), which is missing here and below. A missing barrier can lead to
a missed wakeup. I don't think it's required here, but for consistency I
suggest adding it, or adding a comment that explains why it's not needed.

> +		if (waitqueue_active(&entry->wait))
> +			wake_up(&entry->wait);
> +	} else {
>  		ret = 1;
> +	}
>  out:
>  	if (!ret && cached && entry) {
>  		*cached = entry;
> @@ -410,10 +413,13 @@ have_entry:
>  	if (!uptodate)
>  		set_bit(BTRFS_ORDERED_IOERR, &entry->flags);
>
> -	if (entry->bytes_left == 0)
> +	if (entry->bytes_left == 0) {
>  		ret = test_and_set_bit(BTRFS_ORDERED_IO_DONE, &entry->flags);
> -	else
> +		if (waitqueue_active(&entry->wait))

^^^

> +			wake_up(&entry->wait);
> +	} else {
>  		ret = 1;
> +	}
>  out:
>  	if (!ret && cached && entry) {
>  		*cached = entry;
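
The ordering requirement discussed in this reply can be illustrated with a
userspace analogue built on C11 atomics. This is a sketch under assumptions,
not kernel code: struct waitq, complete_and_wake() and should_sleep() are
hypothetical, the waiters counter stands in for a non-empty wait queue, and
incrementing wakeups stands in for wake_up(). The waker must make the
condition visible before it checks for waiters, and the waiter must register
itself before it checks the condition; a full fence on each side keeps both
from missing each other. In the quoted patch the preceding
test_and_set_bit() is a value-returning atomic operation, which may be why
an explicit barrier is not strictly needed there.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct waitq {
	atomic_int waiters;	/* analogue of a non-empty wait queue */
	atomic_bool done;	/* the condition waiters sleep on */
	int wakeups;		/* wake-ups issued, for illustration */
};

/* Waker side: publish the condition, then look for waiters. */
static void complete_and_wake(struct waitq *q)
{
	atomic_store_explicit(&q->done, true, memory_order_relaxed);
	/* Full fence: order the store above before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&q->waiters, memory_order_relaxed) > 0)
		q->wakeups++;	/* stands in for wake_up() */
}

/* Waiter side: register first, then re-check the condition. */
static bool should_sleep(struct waitq *q)
{
	atomic_fetch_add_explicit(&q->waiters, 1, memory_order_relaxed);
	/* Full fence: order the registration before the condition load. */
	atomic_thread_fence(memory_order_seq_cst);
	return !atomic_load_explicit(&q->done, memory_order_relaxed);
}
```

Without the fence in complete_and_wake(), the load of waiters could be
ordered before the store to done, so the waker might see no waiters while
the waiter sees the condition still unset and goes to sleep.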


[PATCH] btrfs-progs: fix listing deleted subvolumes

2014-03-27 Thread David Sterba
The real check whether to show deleted or live subvolumes was skipped if
just '-d' was specified without other filters. The 'deleted' filter was
not accounted for.

Signed-off-by: David Sterba dste...@suse.cz
---
 btrfs-list.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 912b27c3deca..c34f85e9991f 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -1218,6 +1218,7 @@ int btrfs_list_setup_filter(struct btrfs_list_filter_set **filter_set,
 
 	if (filter == BTRFS_LIST_FILTER_DELETED) {
 		set->only_deleted = 1;
+		set->nfilters++;
 		return 0;
 	}
 
-- 
1.7.9
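
The effect of the one-line fix can be sketched with a simplified model. The
types and helpers below (struct filter_set, struct subvol,
setup_deleted_filter(), subvol_matches()) are hypothetical and much smaller
than the real btrfs-list code; the assumption is that the matching code
takes a fast path when no filters are counted, which is how an uncounted
'deleted' filter ends up being ignored.

```c
#include <stdbool.h>
#include <stddef.h>

struct filter_set {
	int nfilters;		/* number of active filters */
	bool only_deleted;	/* show deleted subvolumes only */
};

struct subvol {
	bool deleted;
};

/* count_it toggles the fix: without it the filter is set but not counted. */
static void setup_deleted_filter(struct filter_set *set, bool count_it)
{
	set->only_deleted = true;
	if (count_it)
		set->nfilters++;
}

static bool subvol_matches(const struct filter_set *set,
			   const struct subvol *sv)
{
	/* Fast path: nothing counted, so everything is shown. */
	if (set == NULL || set->nfilters == 0)
		return true;
	if (set->only_deleted && !sv->deleted)
		return false;
	if (!set->only_deleted && sv->deleted)
		return false;
	return true;
}
```

With count_it false, a live subvolume still matches even though only deleted
ones were requested; with count_it true the deleted/live check is reached
and the live subvolume is filtered out.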



[PATCH] Btrfs: do not reset last_snapshot after relocation

2014-03-27 Thread Josef Bacik
This was done to allow NO_COW to continue to be NO_COW after relocation, but it
is not right.  When relocating we will convert the blocks that we relocate to
FULL_BACKREF.  We can leave some of these full backref blocks behind if they are
not cow'ed out during the relocation, for example if the relocation fails with
ENOSPC and we then just drop the reloc tree.  Then when we go to cow the block
again we won't look up the extent flags, because we won't think there has been a
snapshot recently, which means we will do our normal ref drop thing instead of
adding back a tree ref and dropping the shared ref.  This will cause
btrfs_free_extent to blow up because it can't find the ref we are trying to
free.  This was found with my ref verifying tool.  Thanks,

Signed-off-by: Josef Bacik jba...@fb.com
---
 fs/btrfs/relocation.c | 21 ---------------------
 1 file changed, 21 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index ec00777..f026a82 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2318,7 +2318,6 @@ void free_reloc_roots(struct list_head *list)
 static noinline_for_stack
 int merge_reloc_roots(struct reloc_control *rc)
 {
-	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root;
 	struct btrfs_root *reloc_root;
 	u64 last_snap;
@@ -2376,26 +2375,6 @@ again:
 					list_add_tail(&reloc_root->root_list,
 						      &reloc_roots);
 				goto out;
-			} else if (!ret) {
-				/*
-				 * recover the last snapshot tranid to avoid
-				 * the space balance break NOCOW.
-				 */
-				root = read_fs_root(rc->extent_root->fs_info,
-						    objectid);
-				if (IS_ERR(root))
-					continue;
-
-				trans = btrfs_join_transaction(root);
-				BUG_ON(IS_ERR(trans));
-
-				/* Check if the fs/file tree was snapshoted or not. */
-				if (btrfs_root_last_snapshot(&root->root_item) ==
-				    otransid - 1)
-					btrfs_set_root_last_snapshot(&root->root_item,
-								     last_snap);
-
-				btrfs_end_transaction(trans, root);
 			}
 		}
 
-- 
1.8.3.1



[PATCH] Btrfs: send, fix more issues related to directory renames

2014-03-27 Thread Filipe David Borba Manana
This is a continuation of the previous changes titled:

   Btrfs: fix incremental send's decision to delay a dir move/rename
   Btrfs: part 2, fix incremental send's decision to delay a dir move/rename

There are a few more cases where a directory rename/move must be delayed which
were previously overlooked. If our immediate ancestor has a lower inode number
than ours and it doesn't have a delayed rename/move operation associated to it,
it doesn't mean there isn't any non-direct ancestor of our current inode that
needs to be renamed/moved before our current inode (i.e. with a higher inode
number than ours).

So we can't stop the search if our immediate ancestor has a lower inode number
than ours, we need to navigate the directory hierarchy upwards until we hit the
root or:

1) find an ancestor with a higher inode number that was renamed/moved in the
   send root too (or already has a pending rename/move registered);
2) find an ancestor that is a new directory (higher inode number than ours and
   exists only in the send root).
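
The upward search described above can be sketched with a simplified in-memory
model. This is not the send.c implementation: struct dir and
find_blocking_ancestor() are hypothetical, and the three flags are assumed
to be precomputed, whereas the real code derives them from the send and
parent roots. The sketch only shows that a lower-numbered immediate parent
does not end the walk.

```c
#include <stdbool.h>
#include <stddef.h>

struct dir {
	unsigned long long ino;
	const struct dir *parent;	/* NULL for the root */
	bool moved;		/* renamed/moved in the send root */
	bool pending_move;	/* already has a delayed move registered */
	bool is_new;		/* exists only in the send root */
};

/*
 * Return the ancestor whose processing the move of 'd' has to wait for,
 * or NULL if the walk reaches the root without finding one.
 */
static const struct dir *find_blocking_ancestor(const struct dir *d)
{
	const struct dir *a;

	for (a = d->parent; a != NULL; a = a->parent) {
		if (a->ino <= d->ino)
			continue;	/* lower inode number: keep climbing */
		if (a->moved || a->pending_move || a->is_new)
			return a;	/* case 1) or case 2) above */
	}
	return NULL;
}
```

For a chain root(1) <- grand(10, moved) <- parent(3) <- d(5), the immediate
parent has a lower inode number than d, but the walk continues and returns
grand as the dependency.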

Reproducer for case 1)

$ mkfs.btrfs -f /dev/sdd
$ mount /dev/sdd /mnt

$ mkdir -p /mnt/a/b
$ mkdir -p /mnt/a/c/d
$ mkdir /mnt/a/b/e
$ mkdir /mnt/a/c/d/f
$ mv /mnt/a/b /mnt/a/c/d/2b
$ mkdir /mnt/a/x
$ mkdir /mnt/a/y

$ btrfs subvolume snapshot -r /mnt /mnt/snap1
$ btrfs send /mnt/snap1 -f /tmp/base.send

$ mv /mnt/a/x /mnt/a/y
$ mv /mnt/a/c/d/2b/e /mnt/a/c/d/2b/2e
$ mv /mnt/a/c/d /mnt/a/h/2d
$ mv /mnt/a/c /mnt/a/h/2d/2b/2c

$ btrfs subvolume snapshot -r /mnt /mnt/snap2
$ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send

Simple reproducer for case 2)

$ mkfs.btrfs -f /dev/sdd
$ mount /dev/sdd /mnt

$ mkdir -p /mnt/a/b
$ mkdir /mnt/a/c
$ mv /mnt/a/b /mnt/a/c/b2
$ mkdir /mnt/a/e

$ btrfs subvolume snapshot -r /mnt /mnt/snap1
$ btrfs send /mnt/snap1 -f /tmp/base.send

$ mv /mnt/a/c/b2 /mnt/a/e/b3
$ mkdir /mnt/a/e/b3/f
$ mkdir /mnt/a/h
$ mv /mnt/a/c /mnt/a/e/b3/f/c2
$ mv /mnt/a/e /mnt/a/h/e2

$ btrfs subvolume snapshot -r /mnt /mnt/snap2
$ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send

Another simple reproducer for case 2)

$ mkfs.btrfs -f /dev/sdd
$ mount /dev/sdd /mnt

$ mkdir -p /mnt/a/b
$ mkdir /mnt/a/c
$ mkdir /mnt/a/b/d
$ mkdir /mnt/a/c/e

$ btrfs subvolume snapshot -r /mnt /mnt/snap1
$ btrfs send /mnt/snap1 -f /tmp/base.send

$ mkdir /mnt/a/b/d/f
$ mkdir /mnt/a/b/g
$ mv /mnt/a/c/e /mnt/a/b/g/e2
$ mv /mnt/a/c /mnt/a/b/d/f/c2
$ mv /mnt/a/b/d/f /mnt/a/b/g/e2/f2

$ btrfs subvolume snapshot -r /mnt /mnt/snap2
$ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send

More complex reproducer for case 2)

$ mkfs.btrfs -f /dev/sdd
$ mount /dev/sdd /mnt

$ mkdir -p /mnt/a/b
$ mkdir -p /mnt/a/c/d
$ mkdir /mnt/a/b/e
$ mkdir /mnt/a/c/d/f
$ mv /mnt/a/b /mnt/a/c/d/2b
$ mkdir /mnt/a/x
$ mkdir /mnt/a/y

$ btrfs subvolume snapshot -r /mnt /mnt/snap1
$ btrfs send /mnt/snap1 -f /tmp/base.send

$ mv /mnt/a/x /mnt/a/y
$ mv /mnt/a/c/d/2b/e /mnt/a/c/d/2b/2e
$ mv /mnt/a/c/d /mnt/a/h/2d
$ mv /mnt/a/c /mnt/a/h/2d/2b/2c

$ btrfs subvolume snapshot -r /mnt /mnt/snap2
$ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send

For both cases the incremental send would enter an infinite loop when building
path strings.

While solving these cases, this change also re-implements the code to detect
when directory moves/renames should be delayed. Instead of dealing with several
specific cases separately, it's now more generic, handling all cases with a
simple detection algorithm. If a path loop is detected when applying a delayed
move/rename, it further delays the move/rename, registering a new ancestor
inode as the dependency inode (so our rename happens after that ancestor is
renamed).

Tests for these cases are being added to xfstests too.

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---
 fs/btrfs/send.c |  190 ---
 1 file changed, 96 insertions(+), 94 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 2952889..e2e422c 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -2914,7 +2914,9 @@ static void free_waiting_dir_move(struct send_ctx *sctx,
 static int add_pending_dir_move(struct send_ctx *sctx,
 				u64 ino,
 				u64 ino_gen,
-				u64 parent_ino)
+				u64 parent_ino,
+				struct list_head *new_refs,
+				struct list_head *deleted_refs)
 {
 	struct rb_node **p = &sctx->pending_dir_moves.rb_node;
 	struct rb_node *parent = NULL;
@@ -2946,12 +2948,12 @@ static int add_pending_dir_move(struct send_ctx *sctx,
}
}
 
-   

[PATCH] Btrfs: send, don't crash if we attempt to build a too long path

2014-03-27 Thread Filipe David Borba Manana
Issues were recently fixed where an incremental send would enter an infinite
loop when building a path string, which made it krealloc the path buffer over
and over. This eventually led to a kernel crash because we track the buffer's
size in a 15-bit unsigned integer and eventually we ended up assigning it
32768 (returned by ksize), which made it get a value of 0. We then use this
size to compute an offset into our buffer which falls outside its range (by
1 byte to the left) when the size is 0, which would make the memmove operation
crash with the following trace:

[ 8541.781613] BUG: unable to handle kernel paging request at 88009069c000
[ 8541.781618] IP: [8136cf91] memmove+0x81/0x1a0
[ 8541.781623] PGD 2a2b067 PUD 21fb01067 PMD 21fa7d067 PTE 80009069c060
[ 8541.781626] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
[ 8541.781628] Modules linked in: btrfs raid6_pq xor bnep rfcomm bluetooth 
binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc 
parport_pc psmouse serio_raw parport i2c_piix4 pcspkr evbug e1000 floppy [last 
unloaded: btrfs]
[ 8541.781641] CPU: 3 PID: 28970 Comm: btrfs Not tainted 
3.13.0-fdm-btrfs-next-24+ #1
[ 8541.781642] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 8541.781643] task: 88020967c920 ti: 880091246000 task.ti: 
880091246000
[ 8541.781644] RIP: 0010:[8136cf91]  [8136cf91] 
memmove+0x81/0x1a0
[ 8541.781647] RSP: 0018:8800912477c0  EFLAGS: 00010206
[ 8541.781647] RAX: 88009069c000 RBX: 8800caed46d8 RCX: 0800
[ 8541.781648] RDX: 4000 RSI: 8800906a RDI: 88009069c000
[ 8541.781649] RBP: 8800912477e8 R08: 0001 R09: 
[ 8541.781650] R10: 88009069fff8 R11: 6b6b6b6b6b6b6b6b R12: 
[ 8541.781651] R13: 4000 R14: 0113 R15: 
[ 8541.781652] FS:  7fb6c960e800() GS:88021640() 
knlGS:
[ 8541.781653] CS:  0010 DS:  ES:  CR0: 8005003b
[ 8541.781656] CR2: 88009069c000 CR3: cce87000 CR4: 06e0
[ 8541.781660] Stack:
[ 8541.781660]  a02dbfd3 ea0007a54300 8800caed46d8 
880091247830
[ 8541.781663]  0003 880091247818 a02dc316 
000c
[ 8541.781665]  8800caed5918 8800caed5918 0003 
880091247848
[ 8541.781667] Call Trace:
[ 8541.781679]  [a02dbfd3] ? fs_path_ensure_buf+0xf3/0x110 [btrfs]
[ 8541.781687]  [a02dc316] fs_path_prepare_for_add+0x46/0xc0 [btrfs]
[ 8541.781694]  [a02dc418] fs_path_add_path+0x28/0x50 [btrfs]
[ 8541.781701]  [a02de5a3] get_cur_path+0x1f3/0x5a0 [btrfs]
(...)

Since we can't have path strings larger than PATH_MAX, just return an
ENAMETOOLONG error. A path that long is most likely caused by an infinite
path build loop due to changes in the directory hierarchy. This is better than
crashing the kernel and requiring a system reboot.
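
The wrap-around described in the commit message is easy to reproduce in
userspace. The sketch below is illustrative only: struct path_buf,
store_len() and ensure_len() are hypothetical names, and the bit-field is
declared as unsigned int for portability. Storing 32768 in a 15-bit unsigned
field reduces the value modulo 2^15, giving 0; rejecting lengths above
PATH_MAX before the store avoids ever reaching that point.

```c
#include <errno.h>
#include <limits.h>

#ifndef PATH_MAX
#define PATH_MAX 4096	/* fallback for platforms that do not define it */
#endif

struct path_buf {
	unsigned int buf_len:15;	/* can hold 0..32767 only */
};

/* Unchecked store: the value is reduced modulo 2^15. */
static unsigned int store_len(struct path_buf *p, unsigned int len)
{
	p->buf_len = len;
	return p->buf_len;
}

/* Checked store, in the spirit of the fix: refuse oversized paths. */
static int ensure_len(struct path_buf *p, unsigned int len)
{
	if (len > PATH_MAX)
		return -ENAMETOOLONG;
	p->buf_len = len;
	return 0;
}
```

Here store_len(&p, 32768) returns 0 while store_len(&p, 32767) returns 32767,
and ensure_len(&p, 32768) returns -ENAMETOOLONG without touching the field.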

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---
 fs/btrfs/send.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index e2e422c..41a4a45 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -349,6 +349,9 @@ static int fs_path_ensure_buf(struct fs_path *p, int len)
 	if (p->buf_len >= len)
 		return 0;
 
+	if (unlikely(len > PATH_MAX))
+		return -ENAMETOOLONG;
+
 	path_len = p->end - p->start;
 	old_buf_len = p->buf_len;
 
@@ -2140,6 +2143,7 @@ static int get_cur_path(struct send_ctx *sctx, u64 ino, u64 gen,
 	u64 parent_inode = 0;
 	u64 parent_gen = 0;
 	int stop = 0;
+	u64 start_ino = ino;
 
 	name = fs_path_alloc();
 	if (!name) {
@@ -2187,6 +2191,12 @@ out:
 	fs_path_free(name);
 	if (!ret)
 		fs_path_unreverse(dest);
+	else if (unlikely(ret == -ENAMETOOLONG))
+		btrfs_warn(sctx->send_root->fs_info,
+			   "Possible path build loop in send operation, inode %llu, send root %llu, parent root %llu",
+			   start_ino, sctx->send_root->root_key.objectid,
+			   sctx->parent_root ?
+			   sctx->parent_root->root_key.objectid : 0);
 	return ret;
 }
 
-- 
1.7.10.4



[PATCH v5] xfstests: add test for btrfs send regarding directory moves/renames

2014-03-27 Thread Filipe David Borba Manana
From: Filipe Manana fdman...@gmail.com

Regression test for a btrfs incremental send issue where the kernel failed
to build path strings. This resulted either in sending a wrong path string
to the send stream or entering an infinite loop when building it.
This happened in the following scenarios:

1) A directory was made a child of another directory which has a lower inode
   number and has a pending move/rename operation or there's some non-direct
   ancestor directory with a higher inode number that was renamed/moved too.
   This made the incremental send code go into an infinite loop when building
   a path string;

2) A directory was made a child of another directory which has a higher inode
   number, but the new parent wasn't moved nor renamed. Instead some other
   ancestor higher in the hierarchy, with a higher inode number too, was
   moved/renamed. This made the incremental send code go into an infinite
   loop when building a path string;

3) An orphan directory is created and at least one of its non-immediate
   descendant directories has a pending move/rename operation. This made
   the incremental send issue an invalid path string to the send stream, one
   that didn't account for the orphan ancestor directory.

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: Added more tests.
V3: Added more tests for more complex cases.
V4: Added more tests, related to case 3) mentioned above.
V5: Added more tests, related to case 1) mentioned above.

 tests/btrfs/045 |  376 +++
 tests/btrfs/045.out |1 +
 tests/btrfs/group   |1 +
 3 files changed, 378 insertions(+)
 create mode 100755 tests/btrfs/045
 create mode 100644 tests/btrfs/045.out

diff --git a/tests/btrfs/045 b/tests/btrfs/045
new file mode 100755
index 0000000..4567a3f
--- /dev/null
+++ b/tests/btrfs/045
@@ -0,0 +1,376 @@
+#! /bin/bash
+# FS QA Test No. btrfs/045
+#
+# Regression test for a btrfs incremental send issue where the kernel failed
+# to build path strings. This resulted either in sending a wrong path string
+# to the send stream or entering an infinite loop when building it.
+# This happened in the following scenarios:
+#
+# 1) A directory was made a child of another directory which has a lower inode
+#    number and has a pending move/rename operation or there's some non-direct
+#    ancestor directory with a higher inode number that was renamed/moved too.
+#    This made the incremental send code go into an infinite loop when building
+#    a path string;
+#
+# 2) A directory was made a child of another directory which has a higher inode
+#    number, but the new parent wasn't moved nor renamed. Instead some other
+#    ancestor higher in the hierarchy, with a higher inode number too, was
+#    moved/renamed. This made the incremental send code go into an infinite
+#    loop when building a path string;
+#
+# 3) An orphan directory is created and at least one of its non-immediate
+#    descendant directories has a pending move/rename operation. This made
+#    the incremental send issue an invalid path string to the send stream, one
+#    that didn't account for the orphan ancestor directory.
+#
+# These issues are fixed by the following linux kernel btrfs patches:
+#
+#   Btrfs: fix incremental send's decision to delay a dir move/rename
+#   Btrfs: part 2, fix incremental send's decision to delay a dir move/rename
+#   Btrfs: send, fix more issues related to directory renames
+#   Btrfs: send, account for orphan directories when building path strings
+#
+#---
+# Copyright (c) 2014 Filipe Manana.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+tmp=`mktemp -d`
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+    rm -fr $tmp
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_fssum
+_need_to_be_root
+
+rm -f $seqres.full
+
+_scratch_mkfs >/dev/null 2>&1
+_scratch_mount
+
+# case 1), mentioned above
+mkdir -p $SCRATCH_MNT/a/b
+mkdir $SCRATCH_MNT/a/c

RAID-1 - handling disk failures?

2014-03-27 Thread Tomasz Chmielewski
Is btrfs supposed to handle disk failures in RAID-1 mode?

It doesn't seem to be the case for me, with 3.14.0-rc8.

Right now, the system doesn't see the faulty drive anymore (i.e. hdparm -i
/dev/sdd is unable to give any info).

Accesses to most files on the btrfs filesystem just freeze the process which
is accessing the data (it waits for IO).

The other drive in RAID-1, /dev/sdc, is healthy.

# grep -i btrfs syslog
Mar 27 09:57:59 bkp010 kernel: [157256.352840] BTRFS: bdev /dev/sdd1 errs: wr 31, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.353334] BTRFS: bdev /dev/sdd1 errs: wr 32, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.353816] BTRFS: bdev /dev/sdd1 errs: wr 33, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.354338] BTRFS: bdev /dev/sdd1 errs: wr 34, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.354826] BTRFS: bdev /dev/sdd1 errs: wr 35, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.355314] BTRFS: bdev /dev/sdd1 errs: wr 36, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.355810] BTRFS: bdev /dev/sdd1 errs: wr 37, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.356302] BTRFS: bdev /dev/sdd1 errs: wr 38, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.356790] BTRFS: bdev /dev/sdd1 errs: wr 39, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.357275] BTRFS: bdev /dev/sdd1 errs: wr 40, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:58:02 bkp010 kernel: [157259.298965] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:02 bkp010 kernel: [157259.299309] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:02 bkp010 kernel: [157259.299637] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:04 bkp010 kernel: [157261.358796] btrfs_dev_stat_print_on_error: 9038 callbacks suppressed
Mar 27 09:58:04 bkp010 kernel: [157261.358844] BTRFS: bdev /dev/sdd1 errs: wr 9007, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.359215] BTRFS: bdev /dev/sdd1 errs: wr 9008, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.359585] BTRFS: bdev /dev/sdd1 errs: wr 9009, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.359954] BTRFS: bdev /dev/sdd1 errs: wr 9010, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.360323] BTRFS: bdev /dev/sdd1 errs: wr 9011, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.360693] BTRFS: bdev /dev/sdd1 errs: wr 9012, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.361063] BTRFS: bdev /dev/sdd1 errs: wr 9013, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.361433] BTRFS: bdev /dev/sdd1 errs: wr 9014, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.361802] BTRFS: bdev /dev/sdd1 errs: wr 9015, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.362172] BTRFS: bdev /dev/sdd1 errs: wr 9016, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.046550] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:09 bkp010 kernel: [157266.046931] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:09 bkp010 kernel: [157266.047307] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:09 bkp010 kernel: [157266.427724] btrfs_dev_stat_print_on_error: 13860 callbacks suppressed
Mar 27 09:58:09 bkp010 kernel: [157266.427788] BTRFS: bdev /dev/sdd1 errs: wr 22877, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.428288] BTRFS: bdev /dev/sdd1 errs: wr 22878, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.431504] BTRFS: bdev /dev/sdd1 errs: wr 22879, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.432047] BTRFS: bdev /dev/sdd1 errs: wr 22880, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.499055] BTRFS: bdev /dev/sdd1 errs: wr 22881, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.499453] BTRFS: bdev /dev/sdd1 errs: wr 22882, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.499847] BTRFS: bdev /dev/sdd1 errs: wr 22883, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.500238] BTRFS: bdev /dev/sdd1 errs: wr 22884, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.500625] BTRFS: bdev /dev/sdd1 errs: wr 22885, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.501692] BTRFS: bdev /dev/sdd1 errs: wr 22886, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:10 bkp010 kernel: [157267.726185] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:10 bkp010 kernel: [157267.726472] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:10 bkp010 kernel: 

[PATCH] Btrfs: check for an extent_op on the locked ref

2014-03-27 Thread Josef Bacik
We could have possibly added an extent_op to the locked_ref while we dropped
locked_ref->lock, so check for this case as well and loop around.  Otherwise we
could lose flag updates which would lead to extent tree corruption.  Thanks,

cc: sta...@vger.kernel.org
Signed-off-by: Josef Bacik jba...@fb.com
---
 fs/btrfs/extent-tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a050e83..af5a656 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2448,7 +2448,8 @@ static noinline int __btrfs_run_delayed_refs(struct btrfs_trans_handle *trans,
 			spin_unlock(&locked_ref->lock);
 			spin_lock(&delayed_refs->lock);
 			spin_lock(&locked_ref->lock);
-			if (rb_first(&locked_ref->ref_root)) {
+			if (rb_first(&locked_ref->ref_root) ||
+			    locked_ref->extent_op) {
 				spin_unlock(&locked_ref->lock);
 				spin_unlock(&delayed_refs->lock);
 				continue;
-- 
1.8.3.1



[PATCH] Btrfs: add a extent ref verify tool

2014-03-27 Thread Josef Bacik
We were having corruption issues that were tied back to problems with the extent
tree.  In order to track them down I built this tool to try and find the
culprit, which was pretty successful.  If you compile with this tool on, it will
live verify every ref update that the fs makes and make sure it is consistent
and valid.  This should only be used with a clean file system to start with and
then have the tests run, as it doesn't look up the actual shared refs on mount,
so it will get snapshots and things wrong.  This could be fixed in the future
easily, I just didn't need it for my particular test.  Thanks,

Signed-off-by: Josef Bacik jba...@fb.com
---
 fs/btrfs/Kconfig   |  10 +
 fs/btrfs/Makefile  |   1 +
 fs/btrfs/ctree.c   |   2 +-
 fs/btrfs/ctree.h   |   7 +
 fs/btrfs/disk-io.c |  14 +-
 fs/btrfs/extent-tree.c |  16 +
 fs/btrfs/ref-verify.c  | 892 +
 fs/btrfs/ref-verify.h  |  34 ++
 fs/btrfs/relocation.c  |   1 +
 9 files changed, 975 insertions(+), 2 deletions(-)
 create mode 100644 fs/btrfs/ref-verify.c
 create mode 100644 fs/btrfs/ref-verify.h

diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
index a66768e..1dfd411 100644
--- a/fs/btrfs/Kconfig
+++ b/fs/btrfs/Kconfig
@@ -88,3 +88,13 @@ config BTRFS_ASSERT
 	  any of the assertions trip.  This is meant for btrfs developers only.
 
 	  If unsure, say N.
+
+config BTRFS_FS_REF_VERIFY
+	bool "Btrfs with the ref verify tool compiled in"
+	depends on BTRFS_FS
+	help
+	  Enable run-time extent reference verification instrumentation.  This
+	  is meant to be used by btrfs developers for tracking down extent
+	  reference problems or verifying they didn't break something.
+
+	  If unsure, say N.
diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index f341a98..ae837d2 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -13,6 +13,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o 
root-tree.o dir-item.o \
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
+btrfs-$(CONFIG_BTRFS_FS_REF_VERIFY) += ref-verify.o
 
 btrfs-$(CONFIG_BTRFS_FS_RUN_SANITY_TESTS) += tests/free-space-tests.o \
tests/extent-buffer-tests.o tests/btrfs-tests.o \
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 9d89c16..71bbafe 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -202,7 +202,7 @@ struct extent_buffer *btrfs_lock_root_node(struct 
btrfs_root *root)
  * tree until you end up with a lock on the root.  A locked buffer
  * is returned, with a reference held.
  */
-static struct extent_buffer *btrfs_read_lock_root_node(struct btrfs_root *root)
+struct extent_buffer *btrfs_read_lock_root_node(struct btrfs_root *root)
 {
struct extent_buffer *eb;
 
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 4253ab2..2277006 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1680,6 +1680,12 @@ struct btrfs_fs_info {
 
 	struct semaphore uuid_tree_rescan_sem;
 	unsigned int update_uuid_tree_gen:1;
+
+#ifdef CONFIG_BTRFS_FS_REF_VERIFY
+	spinlock_t ref_verify_lock;
+	struct rb_root block_tree;
+	bool ref_verify_enabled;
+#endif
 };
 
 struct btrfs_subvolume_writers {
@@ -3379,6 +3385,7 @@ void btrfs_set_item_key_safe(struct btrfs_root *root, 
struct btrfs_path *path,
 struct btrfs_key *new_key);
 struct extent_buffer *btrfs_root_node(struct btrfs_root *root);
 struct extent_buffer *btrfs_lock_root_node(struct btrfs_root *root);
+struct extent_buffer *btrfs_read_lock_root_node(struct btrfs_root *root);
 int btrfs_find_next_key(struct btrfs_root *root, struct btrfs_path *path,
struct btrfs_key *key, int lowest_level,
u64 min_trans);
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index a152a96..02ae4d1 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -49,6 +49,7 @@
 #include "dev-replace.h"
 #include "raid56.h"
 #include "sysfs.h"
+#include "ref-verify.h"
 
 #ifdef CONFIG_X86
 #include <asm/cpufeature.h>
@@ -2268,6 +2269,11 @@ int open_ctree(struct super_block *sb,
 #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
 	fs_info->check_integrity_print_mask = 0;
 #endif
+#ifdef CONFIG_BTRFS_FS_REF_VERIFY
+	spin_lock_init(&fs_info->ref_verify_lock);
+	fs_info->block_tree = RB_ROOT;
+	fs_info->ref_verify_enabled = true;
+#endif
 
 	spin_lock_init(&fs_info->balance_lock);
 	mutex_init(&fs_info->balance_mutex);
@@ -2895,7 +2901,12 @@ retry_root_backup:
 
 	if (sb->s_flags & MS_RDONLY)
 		return 0;
-
+#ifdef CONFIG_BTRFS_FS_REF_VERIFY
+	if (btrfs_build_ref_tree(fs_info)) {
+		fs_info->ref_verify_enabled = false;
+		printk(KERN_ERR "BTRFS: couldn't build ref tree\n");
+	}
+#endif
 	down_read(&fs_info->cleanup_work_sem);
 	if ((ret = btrfs_orphan_cleanup(fs_info->fs_root)) ||
(ret = 

Re: [PATCH] btrfs: add btrfs resize unit t/p/e support

2014-03-27 Thread Gui Hecheng
On Thu, 2014-03-27 at 16:27 +0100, David Sterba wrote:
> On Thu, Mar 27, 2014 at 09:35:41AM +0200, Brendan Hide wrote:
> > On 2014/03/27 04:51 AM, Gui Hecheng wrote:
> > [snip]
> >
> > > We add t/p/e support by replacing lib/cmdline.c:memparse
> > > with btrfs_memparse. The btrfs_memparse copies memparse's code
> > > and adds unit t/p/e parsing.
> > Is there a conflict preventing adding this to memparse directly?
>
> Agreed, there's no reason to duplicate this function.
Yes, I will try to modify the original memparse soon.

Thanks all!

-Gui
