Re: [PATCH v2] Btrfs, lockdep: get_restripe_target: use lockdep in BUG_ON

2012-04-05 Thread Bobby Powers
On Wed, Apr 4, 2012 at 10:04 PM, Bobby Powers bobbypow...@gmail.com wrote:
 spin_is_locked always returns 0 on non-SMP systems, which causes btrfs
 to fail the mount.  There is documentation pending as to why checking

I guess I should be explicit in stating that this is a regression, so
this patch or something else that addresses it should be pulled into
3.4.

 for spin_is_locked is a bad idea:

 https://lkml.org/lkml/2012/3/27/413

 The suggested lockdep_assert_held() is not appropriate in this case,
 as what get_restripe_target() is checking for is that either
 volume_mutex is held or balance_lock is held.  Luckily
 lockdep_assert_held() is a simple macro:

 WARN_ON(debug_locks && !lockdep_is_held(l))

 We can mimic the structure in get_restripe_target(), but we need to
 make sure lockdep_is_held() is defined for the !LOCKDEP case.
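
For illustration, the either-lock check described above can be modelled in plain userspace C. This is a minimal sketch, not kernel code: `struct toy_lock`, `toy_is_held()` and `toy_neither_held()` are hypothetical stand-ins for the real locks, for lockdep_is_held() and for the expression inside BUG_ON(), and debug_locks is passed in as a parameter rather than read from a global.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a lock whose "held" state can be queried.
 * Hypothetical type for illustration; not a kernel structure. */
struct toy_lock {
	bool held;
};

/* Models lockdep_is_held(): nonzero when the lock is currently held. */
static int toy_is_held(const struct toy_lock *l)
{
	assert(l != NULL);
	return l->held;
}

/*
 * Models the condition inside the proposed BUG_ON():
 *   debug_locks && !held(volume_mutex) && !held(balance_lock)
 * It is true only when lock debugging is active AND neither lock is
 * held, so holding either one of the two locks satisfies the check.
 */
static int toy_neither_held(int debug_locks,
			    const struct toy_lock *volume_mutex,
			    const struct toy_lock *balance_lock)
{
	return debug_locks &&
	       !toy_is_held(volume_mutex) &&
	       !toy_is_held(balance_lock);
}
```

With the proposed !LOCKDEP stub, lockdep_is_held(l) is always 0, so both negations are true and the expression reduces to the value of debug_locks. The sketch does not model how debug_locks is defined in that configuration.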

 CC: Ilya Dryomov idryo...@gmail.com
 CC: Chris Mason chris.ma...@oracle.com
 CC: Andi Kleen a...@linux.intel.com
 CC: Jeff Mahoney je...@suse.de
 CC: Ingo Molnar mi...@redhat.com
 CC: linux-ker...@vger.kernel.org
 Signed-off-by: Bobby Powers bobbypow...@gmail.com
 ---
  fs/btrfs/extent-tree.c  |    5 +++--
  include/linux/lockdep.h |    1 +
  2 files changed, 4 insertions(+), 2 deletions(-)

 diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
 index a844204..4d13eb1 100644
 --- a/fs/btrfs/extent-tree.c
 +++ b/fs/btrfs/extent-tree.c
 @@ -24,6 +24,7 @@
  #include <linux/kthread.h>
  #include <linux/slab.h>
  #include <linux/ratelimit.h>
 +#include <linux/lockdep.h>
  #include "compat.h"
  #include "hash.h"
  #include "ctree.h"
 @@ -3158,8 +3159,8 @@ static u64 get_restripe_target(struct btrfs_fs_info *fs_info, u64 flags)
        struct btrfs_balance_control *bctl = fs_info->balance_ctl;
        u64 target = 0;

 -       BUG_ON(!mutex_is_locked(&fs_info->volume_mutex) &&
 -              !spin_is_locked(&fs_info->balance_lock));
 +       BUG_ON(debug_locks && !lockdep_is_held(&fs_info->volume_mutex) &&
 +              !lockdep_is_held(&fs_info->balance_lock));

        if (!bctl)
                return 0;
 diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
 index d36619e..94c0edb 100644
 --- a/include/linux/lockdep.h
 +++ b/include/linux/lockdep.h
 @@ -392,6 +392,7 @@ struct lock_class_key { };

  #define lockdep_depth(tsk)     (0)

 +#define lockdep_is_held(l)     (0)
  #define lockdep_assert_held(l)                 do { } while (0)

  #define lockdep_recursing(tsk)                 (0)
 --
 1.7.10.rc3.3.g19a6c

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Btrfs, lockdep: get_restripe_target: use lockdep in BUG_ON

2012-04-05 Thread Ilya Dryomov
On Thu, Apr 05, 2012 at 12:23:01PM -0400, Bobby Powers wrote:
 On Wed, Apr 4, 2012 at 10:04 PM, Bobby Powers bobbypow...@gmail.com wrote:
  spin_is_locked always returns 0 on non-SMP systems, which causes btrfs
  to fail the mount.  There is documentation pending as to why checking
 
 I guess I should be explicit in stating that this is a regression, so
 this patch or something else that addresses it should be pulled into
 3.4.

Yes, this is a regression and spin_is_locked() definitely has to go.  I
don't have a strong opinion on this assert.  If there are objections to
v2, I'm OK with ripping that BUG_ON out entirely and replacing it with a
comment (this function and its callers are WIP).

Thanks,

Ilya


Re: [PATCH 1/2] Btrfs: make clear_extent_bit() always return 0 on success

2012-04-05 Thread David Sterba
On Mon, Mar 12, 2012 at 04:39:28PM +0800, Li Zefan wrote:
 Currently it returns a set of bits that were cleared, but this return
 value is not used at all.
 
 Moreover it doesn't seem to be useful, because we may clear the bits
 of a few extent_states, but only the cleared bits of last one is
 returned.
 
 Signed-off-by: Li Zefan l...@cn.fujitsu.com
 ---
  fs/btrfs/extent_io.c |   19 +++
  1 files changed, 7 insertions(+), 12 deletions(-)
 
 diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
 index a55fbe6..c968c95 100644
 --- a/fs/btrfs/extent_io.c
 +++ b/fs/btrfs/extent_io.c
 @@ -394,18 +394,16 @@ static int split_state(struct extent_io_tree *tree, 
 struct extent_state *orig,
  
  /*
   * utility function to clear some bits in an extent state struct.
 - * it will optionally wake up any one waiting on this state (wake == 1), or
 - * forcibly remove the state from the tree (delete == 1).
 + * it will optionally wake up any one waiting on this state (wake == 1)
   *
   * If no bits are set on the state struct after clearing things, the
   * struct is freed and removed from the tree
   */
 -static int clear_state_bit(struct extent_io_tree *tree,
 +static void clear_state_bit(struct extent_io_tree *tree,
   struct extent_state *state,
   int *bits, int wake)
  {
   int bits_to_clear = *bits & ~EXTENT_CTLBITS;
 - int ret = state->state & bits_to_clear;
  
   if ((bits_to_clear & EXTENT_DIRTY) && (state->state & EXTENT_DIRTY)) {
   u64 range = state->end - state->start + 1;
 @@ -427,7 +425,6 @@ static int clear_state_bit(struct extent_io_tree *tree,
   } else {
   merge_state(tree, state);
   }
 - return ret;
  }
  
  static struct extent_state *

The above part of the patch still applies and, with only the subject
changed to something like

  Btrfs: return void from clear_state_bit

is rc2 material. So, Li, if you're OK with this change, I'm adding it
(with the 2/2 patch) to my local queue of rc patches for Chris.


david

(the rest of the patch was done within the error handling series)

 @@ -449,8 +446,7 @@ alloc_extent_state_atomic(struct extent_state *prealloc)
   *
   * the range [start, end] is inclusive.
   *
 - * This takes the tree lock, and returns < 0 on error, > 0 if any of the
 - * bits were already set, or zero if none of the bits were already set.
 + * This takes the tree lock, and returns < 0 on error.
   */
  int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
int bits, int wake, int delete,
 @@ -464,7 +460,6 @@ int clear_extent_bit(struct extent_io_tree *tree, u64 
 start, u64 end,
   struct rb_node *node;
   u64 last_end;
   int err;
 - int set = 0;
   int clear = 0;
  
   if (delete)
 @@ -547,7 +542,7 @@ hit_next:
   if (err)
   goto out;
   if (state->end <= end) {
 - set |= clear_state_bit(tree, state, &bits, wake);
 + clear_state_bit(tree, state, &bits, wake);
   if (last_end == (u64)-1)
   goto out;
   start = last_end + 1;
 @@ -568,13 +563,13 @@ hit_next:
   if (wake)
   wake_up(&state->wq);
  
 - set |= clear_state_bit(tree, prealloc, &bits, wake);
 + clear_state_bit(tree, prealloc, &bits, wake);
  
   prealloc = NULL;
   goto out;
   }
  
 - set |= clear_state_bit(tree, state, &bits, wake);
 + clear_state_bit(tree, state, &bits, wake);
  next:
   if (last_end == (u64)-1)
   goto out;
 @@ -591,7 +586,7 @@ out:
   if (prealloc)
   free_extent_state(prealloc);
  
 - return set;
 + return 0;
  
  search_again:
   if (start > end)
 -- 1.7.3.1 


[PATCH 1/3] btrfs: extended inode refs

2012-04-05 Thread Mark Fasheh
This patch adds basic support for extended inode refs. This includes support
for link and unlink of the refs, which basically gets us support for rename
as well.

Inode creation does not need changing - extended refs are only added after
the ref array is full.

Signed-off-by: Mark Fasheh mfas...@suse.de
---
 fs/btrfs/ctree.h  |   50 +--
 fs/btrfs/inode-item.c |  244 +++--
 fs/btrfs/inode.c  |   20 ++--
 3 files changed, 288 insertions(+), 26 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 80b6486..5fc77ee 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -143,6 +143,13 @@ struct btrfs_ordered_sum;
  */
 #define BTRFS_NAME_LEN 255
 
+/*
+ * Theoretical limit is larger, but we keep this down to a sane
+ * value. That should limit greatly the possibility of collisions on
+ * inode ref items.
+ */
+#define BTRFS_LINK_MAX 65535U
+
 /* 32 bytes in various csum fields */
 #define BTRFS_CSUM_SIZE 32
 
@@ -462,13 +469,16 @@ struct btrfs_super_block {
 #define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS    (1ULL << 2)
 #define BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO    (1ULL << 3)
 
+#define BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF   (1ULL << 6)
+
 #define BTRFS_FEATURE_COMPAT_SUPP              0ULL
 #define BTRFS_FEATURE_COMPAT_RO_SUPP           0ULL
 #define BTRFS_FEATURE_INCOMPAT_SUPP                    \
        (BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF |        \
         BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL |       \
         BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS |         \
-        BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO)
+        BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO |         \
+        BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF)
 
 /*
  * A leaf is full of items. offset and size tell us where to find
@@ -615,6 +625,14 @@ struct btrfs_inode_ref {
/* name goes here */
 } __attribute__ ((__packed__));
 
+struct btrfs_inode_extref {
+   __le64 parent_objectid;
+   __le64 index;
+   __le16 name_len;
+   __u8   name[0];
+   /* name goes here */
+} __attribute__ ((__packed__));
+
 struct btrfs_timespec {
__le64 sec;
__le32 nsec;
@@ -1400,6 +1418,7 @@ struct btrfs_ioctl_defrag_range_args {
  */
 #define BTRFS_INODE_ITEM_KEY           1
 #define BTRFS_INODE_REF_KEY            12
+#define BTRFS_INODE_EXTREF_KEY         13
 #define BTRFS_XATTR_ITEM_KEY           24
 #define BTRFS_ORPHAN_ITEM_KEY          48
 /* reserve 2-15 close to the inode for later flexibility */
@@ -1701,6 +1720,13 @@ BTRFS_SETGET_STACK_FUNCS(block_group_flags,
 BTRFS_SETGET_FUNCS(inode_ref_name_len, struct btrfs_inode_ref, name_len, 16);
 BTRFS_SETGET_FUNCS(inode_ref_index, struct btrfs_inode_ref, index, 64);
 
+/* struct btrfs_inode_extref */
+BTRFS_SETGET_FUNCS(inode_extref_parent, struct btrfs_inode_extref,
+  parent_objectid, 64);
+BTRFS_SETGET_FUNCS(inode_extref_name_len, struct btrfs_inode_extref,
+  name_len, 16);
+BTRFS_SETGET_FUNCS(inode_extref_index, struct btrfs_inode_extref, index, 64);
+
 /* struct btrfs_inode_item */
 BTRFS_SETGET_FUNCS(inode_generation, struct btrfs_inode_item, generation, 64);
 BTRFS_SETGET_FUNCS(inode_sequence, struct btrfs_inode_item, sequence, 64);
@@ -2791,12 +2817,12 @@ int btrfs_del_inode_ref(struct btrfs_trans_handle 
*trans,
   struct btrfs_root *root,
   const char *name, int name_len,
   u64 inode_objectid, u64 ref_objectid, u64 *index);
-struct btrfs_inode_ref *
-btrfs_lookup_inode_ref(struct btrfs_trans_handle *trans,
-   struct btrfs_root *root,
-   struct btrfs_path *path,
-   const char *name, int name_len,
-   u64 inode_objectid, u64 ref_objectid, int mod);
+int btrfs_get_inode_ref_index(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root,
+ struct btrfs_path *path,
+ const char *name, int name_len,
+ u64 inode_objectid, u64 ref_objectid, int mod,
+ u64 *ret_index);
 int btrfs_insert_empty_inode(struct btrfs_trans_handle *trans,
 struct btrfs_root *root,
 struct btrfs_path *path, u64 objectid);
@@ -2804,6 +2830,16 @@ int btrfs_lookup_inode(struct btrfs_trans_handle *trans, 
struct btrfs_root
   *root, struct btrfs_path *path,
   struct btrfs_key *location, int mod);
 
+struct btrfs_inode_extref *
+btrfs_lookup_inode_extref(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root,
+ struct btrfs_path *path,
+ const char *name, int name_len,
+ u64 inode_objectid, u64 ref_objectid, int ins_len,
+ int cow);
+
+u64 btrfs_extref_key_off(u64 parent_objectid, const 

[PATCH 2/3] btrfs: extended inode refs

2012-04-05 Thread Mark Fasheh
Teach tree-log.c about extended inode refs. In particular, we have to adjust
the behavior of inode ref replay as well as log tree recovery to account for
the existence of extended refs.

Signed-off-by: Mark Fasheh mfas...@suse.de
---
 fs/btrfs/tree-log.c |  320 +-
 fs/btrfs/tree-log.h |4 +
 2 files changed, 266 insertions(+), 58 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 966cc74..d69b07a 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -23,6 +23,7 @@
 #include "disk-io.h"
 #include "locking.h"
 #include "print-tree.h"
+#include "backref.h"
 #include "compat.h"
 #include "tree-log.h"
 
@@ -748,6 +749,7 @@ static noinline int backref_in_log(struct btrfs_root *log,
 {
struct btrfs_path *path;
struct btrfs_inode_ref *ref;
+   struct btrfs_inode_extref *extref;
unsigned long ptr;
unsigned long ptr_end;
unsigned long name_ptr;
@@ -764,8 +766,24 @@ static noinline int backref_in_log(struct btrfs_root *log,
if (ret != 0)
goto out;
 
-   item_size = btrfs_item_size_nr(path->nodes[0], path->slots[0]);
ptr = btrfs_item_ptr_offset(path->nodes[0], path->slots[0]);
+
+   if (key->type == BTRFS_INODE_EXTREF_KEY) {
+   extref = (struct btrfs_inode_extref *)ptr;
+
+   found_name_len = btrfs_inode_extref_name_len(path->nodes[0],
+extref);
+   if (found_name_len == namelen) {
+   name_ptr = (unsigned long)&extref->name;
+   ret = memcmp_extent_buffer(path->nodes[0], name,
+  name_ptr, namelen);
+   if (ret == 0)
+   match = 1;
+   }
+   goto out;
+   }
+
+   item_size = btrfs_item_size_nr(path->nodes[0], path->slots[0]);
ptr_end = ptr + item_size;
while (ptr < ptr_end) {
ref = (struct btrfs_inode_ref *)ptr;
@@ -786,7 +804,6 @@ out:
return match;
 }
 
-
 /*
  * replay one inode back reference item found in the log tree.
  * eb, slot and key refer to the buffer and key found in the log tree.
@@ -801,15 +818,20 @@ static noinline int add_inode_ref(struct 
btrfs_trans_handle *trans,
  struct btrfs_key *key)
 {
struct btrfs_inode_ref *ref;
+   struct btrfs_inode_extref *extref;
struct btrfs_dir_item *di;
+   struct btrfs_key search_key;
struct inode *dir;
struct inode *inode;
unsigned long ref_ptr;
unsigned long ref_end;
-   char *name;
-   int namelen;
+   char *name, *victim_name;
+   int namelen, victim_name_len;
int ret;
int search_done = 0;
+   int log_ref_ver = 0;
+   u64 parent_objectid, inode_objectid, ref_index;
+   struct extent_buffer *leaf;
 
/*
 * it is possible that we didn't log all the parent directories
@@ -817,32 +839,56 @@ static noinline int add_inode_ref(struct 
btrfs_trans_handle *trans,
 * copy the back ref in.  The link count fixup code will take
 * care of the rest
 */
-   dir = read_one_inode(root, key->offset);
+
+   if (key->type == BTRFS_INODE_EXTREF_KEY) {
+   log_ref_ver = 1;
+
+   ref_ptr = btrfs_item_ptr_offset(eb, slot);
+
+   /* So that we don't loop back looking for old style log refs. */
+   ref_end = ref_ptr;
+
+   extref = (struct btrfs_inode_extref *) btrfs_item_ptr_offset(eb, slot);
+   namelen = btrfs_inode_extref_name_len(eb, extref);
+   name = kmalloc(namelen, GFP_NOFS);
+
+   read_extent_buffer(eb, name, (unsigned long)&extref->name,
+  namelen);
+
+   ref_index = btrfs_inode_extref_index(eb, extref);
+   parent_objectid = btrfs_inode_extref_parent(eb, extref);
+   } else {
+   parent_objectid = key->offset;
+
+   ref_ptr = btrfs_item_ptr_offset(eb, slot);
+   ref_end = ref_ptr + btrfs_item_size_nr(eb, slot);
+
+   ref = (struct btrfs_inode_ref *)ref_ptr;
+   namelen = btrfs_inode_ref_name_len(eb, ref);
+   name = kmalloc(namelen, GFP_NOFS);
+   BUG_ON(!name);
+
+   read_extent_buffer(eb, name, (unsigned long)(ref + 1), namelen);
+
+   ref_index = btrfs_inode_ref_index(eb, ref);
+   }
+
+   inode_objectid = key->objectid;
+
+   dir = read_one_inode(root, parent_objectid);
if (!dir)
return -ENOENT;
 
-   inode = read_one_inode(root, key->objectid);
+   inode = read_one_inode(root, inode_objectid);
if (!inode) {
iput(dir);
return -EIO;
}
 
-   ref_ptr = btrfs_item_ptr_offset(eb, slot);
-   ref_end = 

[PATCH 3/3] btrfs: extended inode refs

2012-04-05 Thread Mark Fasheh
The iterate_irefs in backref.c is used to build path components from inode
refs. I had to add a 2nd iterate function callback to handle extended refs.

Both iterate callbacks eventually converge upon iref_to_path() which I was
able to keep as one function with some small code to abstract away
differences in the two disk structures.
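
The comment above iref_to_path() in the diff below describes filling the destination buffer backwards and continuing to count after it overflows, so the caller can learn the required size. That technique can be sketched in userspace C as follows. This is only an illustration under stated assumptions: `toy_build_path()` is a hypothetical helper, the path components are passed in as a plain array (last element first) instead of being read from inode refs, and it returns a signed start offset rather than a pointer that may fall below dest.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Build "a/b/c" at the END of dest[0..size) from components given
 * leaf-first (names[0] is the last path element).
 *
 * Returns the offset in dest where the string starts. A negative
 * return means the buffer was too small; (size - return value) bytes
 * would have been needed. Once the offset goes negative nothing more
 * is written, but the accounting continues.
 */
static long toy_build_path(const char *const *names, size_t count,
			   char *dest, size_t size)
{
	long bytes_left = (long)size - 1;
	size_t i;

	assert(dest != NULL || size == 0);

	if (bytes_left >= 0)
		dest[bytes_left] = '\0';

	for (i = 0; i < count; i++) {
		size_t len = strlen(names[i]);

		if (i > 0) {
			/* separator between this component and the next */
			bytes_left -= 1;
			if (bytes_left >= 0)
				dest[bytes_left] = '/';
		}
		bytes_left -= (long)len;
		if (bytes_left >= 0)
			memcpy(dest + bytes_left, names[i], len);
	}
	return bytes_left;
}
```

A caller would check the sign of the result: a non-negative value is the start of the path inside dest, while a negative value tells it how many additional bytes to allocate before retrying.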

Signed-off-by: Mark Fasheh mfas...@suse.de
---
 fs/btrfs/backref.c |  200 ++--
 fs/btrfs/backref.h |4 +-
 2 files changed, 165 insertions(+), 39 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 0436c12..f2b8952 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -22,6 +22,7 @@
 #include "ulist.h"
 #include "transaction.h"
 #include "delayed-ref.h"
+#include "tree-log.h"
 
 /*
  * this structure records all encountered refs on the way up to the root
@@ -858,62 +859,75 @@ static int inode_ref_info(u64 inum, u64 ioff, struct 
btrfs_root *fs_root,
 }
 
 /*
- * this iterates to turn a btrfs_inode_ref into a full filesystem path. elements
- * of the path are separated by '/' and the path is guaranteed to be
- * 0-terminated. the path is only given within the current file system.
- * Therefore, it never starts with a '/'. the caller is responsible to provide
- * size bytes in dest. the dest buffer will be filled backwards. finally,
- * the start point of the resulting string is returned. this pointer is within
- * dest, normally.
- * in case the path buffer would overflow, the pointer is decremented further
- * as if output was written to the buffer, though no more output is actually
- * generated. that way, the caller can determine how much space would be
- * required for the path to fit into the buffer. in that case, the returned
- * value will be smaller than dest. callers must check this!
+ * Given the parent objectid and name/name_len pairs of an inode ref
+ * (any version) this iterates to turn that information into a
+ * full filesystem path. elements of the path are separated by '/' and
+ * the path is guaranteed to be 0-terminated. the path is only given
+ * within the current file system.  Therefore, it never starts with a
+ * '/'. the caller is responsible to provide size bytes in
+ * dest. the dest buffer will be filled backwards. finally, the
+ * start point of the resulting string is returned. this pointer is
+ * within dest, normally.  in case the path buffer would overflow, the
+ * pointer is decremented further as if output was written to the
+ * buffer, though no more output is actually generated. that way, the
+ * caller can determine how much space would be required for the path
+ * to fit into the buffer. in that case, the returned value will be
+ * smaller than dest. callers must check this!
  */
 static char *iref_to_path(struct btrfs_root *fs_root, struct btrfs_path *path,
-   struct btrfs_inode_ref *iref,
-   struct extent_buffer *eb_in, u64 parent,
-   char *dest, u32 size)
+ int name_len, unsigned long name_off,
+ struct extent_buffer *eb_in, u64 parent,
+ char *dest, u32 size)
 {
-   u32 len;
int slot;
u64 next_inum;
int ret;
s64 bytes_left = size - 1;
struct extent_buffer *eb = eb_in;
struct btrfs_key found_key;
+   struct btrfs_inode_ref *iref;
+   struct btrfs_inode_extref *iref2;
 
if (bytes_left >= 0)
dest[bytes_left] = '\0';
 
while (1) {
-   len = btrfs_inode_ref_name_len(eb, iref);
-   bytes_left -= len;
+   bytes_left -= name_len;
if (bytes_left >= 0)
read_extent_buffer(eb, dest + bytes_left,
-   (unsigned long)(iref + 1), len);
+  name_off, name_len);
if (eb != eb_in)
free_extent_buffer(eb);
+
+   /* Ok, we have enough to find any refs to the parent inode. */
ret = inode_ref_info(parent, 0, fs_root, path, &found_key);
-   if (ret > 0)
-   ret = -ENOENT;
-   if (ret)
-   break;
next_inum = found_key.offset;
+   if (ret == 0) {
+   slot = path->slots[0];
+   eb = path->nodes[0];
+   /* make sure we can use eb after releasing the path */
+   if (eb != eb_in)
+   atomic_inc(&eb->refs);
+   btrfs_release_path(path);
+   iref = btrfs_item_ptr(eb, slot, struct btrfs_inode_ref);
+
+   name_len = btrfs_inode_ref_name_len(eb, iref);
+   name_off = (unsigned long)(iref + 1);
+   } else {
+   ret = 

bash completion file

2012-04-05 Thread Alfredo Esteban
Hi,

First of all, thanks to the developers for their work.

I'm a btrfs user and I wrote a bash completion file for the btrfs
command, so that I can use tab to complete the command line in bash.

I'm sending it to you in the hope that you find it useful. On Debian-like
distros you should put it in /etc/bash_completion.d/.

For btrfs developers: if you want to include it in the distro
packages, feel free.

Alfredo


btrfs
Description: Binary data


Re: [PATCH 1/2] Btrfs: make clear_extent_bit() always return 0 on success

2012-04-05 Thread Li Zefan
(Note: I've changed my email address ;)

David Sterba wrote:

 On Mon, Mar 12, 2012 at 04:39:28PM +0800, Li Zefan wrote:
 Currently it returns a set of bits that were cleared, but this return
 value is not used at all.

 Moreover it doesn't seem to be useful, because we may clear the bits
 of a few extent_states, but only the cleared bits of last one is
 returned.

 Signed-off-by: Li Zefan l...@cn.fujitsu.com
 ---
  fs/btrfs/extent_io.c |   19 +++
  1 files changed, 7 insertions(+), 12 deletions(-)

 diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
 index a55fbe6..c968c95 100644
 --- a/fs/btrfs/extent_io.c
 +++ b/fs/btrfs/extent_io.c
 @@ -394,18 +394,16 @@ static int split_state(struct extent_io_tree *tree, 
 struct extent_state *orig,
  
  /*
   * utility function to clear some bits in an extent state struct.
 - * it will optionally wake up any one waiting on this state (wake == 1), or
 - * forcibly remove the state from the tree (delete == 1).
 + * it will optionally wake up any one waiting on this state (wake == 1)
   *
   * If no bits are set on the state struct after clearing things, the
   * struct is freed and removed from the tree
   */
 -static int clear_state_bit(struct extent_io_tree *tree,
 +static void clear_state_bit(struct extent_io_tree *tree,
  struct extent_state *state,
  int *bits, int wake)
  {
  int bits_to_clear = *bits & ~EXTENT_CTLBITS;
 -int ret = state->state & bits_to_clear;
  
  if ((bits_to_clear & EXTENT_DIRTY) && (state->state & EXTENT_DIRTY)) {
  u64 range = state->end - state->start + 1;
 @@ -427,7 +425,6 @@ static int clear_state_bit(struct extent_io_tree *tree,
  } else {
  merge_state(tree, state);
  }
 -return ret;
  }
  
  static struct extent_state *
 
 The above part of the patch still applies and, with only the subject
 changed to something like
 
   Btrfs: return void from clear_state_bit
 
 is rc2 material. So, Li, if you're OK with this change, I'm adding it
 (with the 2/2 patch) to my local queue of rc patches for Chris.
 


Thanks for doing this!

--
Li Zefan

 
 david
 
 (the rest of the patch was done within the error handling series)
 
 @@ -449,8 +446,7 @@ alloc_extent_state_atomic(struct extent_state *prealloc)
   *
   * the range [start, end] is inclusive.
   *
 - * This takes the tree lock, and returns < 0 on error, > 0 if any of the
 - * bits were already set, or zero if none of the bits were already set.
 + * This takes the tree lock, and returns < 0 on error.
   */
  int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
   int bits, int wake, int delete,
 @@ -464,7 +460,6 @@ int clear_extent_bit(struct extent_io_tree *tree, u64 
 start, u64 end,
  struct rb_node *node;
  u64 last_end;
  int err;
 -int set = 0;
  int clear = 0;
  
  if (delete)
 @@ -547,7 +542,7 @@ hit_next:
  if (err)
  goto out;
  if (state->end <= end) {
 -set |= clear_state_bit(tree, state, &bits, wake);
 +clear_state_bit(tree, state, &bits, wake);
  if (last_end == (u64)-1)
  goto out;
  start = last_end + 1;
 @@ -568,13 +563,13 @@ hit_next:
  if (wake)
  wake_up(&state->wq);
  
 -set |= clear_state_bit(tree, prealloc, &bits, wake);
 +clear_state_bit(tree, prealloc, &bits, wake);
  
  prealloc = NULL;
  goto out;
  }
  
 -set |= clear_state_bit(tree, state, &bits, wake);
 +clear_state_bit(tree, state, &bits, wake);
  next:
  if (last_end == (u64)-1)
  goto out;
 @@ -591,7 +586,7 @@ out:
  if (prealloc)
  free_extent_state(prealloc);
  
 -return set;
 +return 0;
  
  search_again:
  if (start > end)
 -- 1.7.3.1 




Re: [PATCH 0/3] btrfs: extended inode refs

2012-04-05 Thread Liu Bo
On 04/06/2012 09:24 AM, Liu Bo wrote:
 On 04/06/2012 04:09 AM, Mark Fasheh wrote:
 Currently btrfs has a limitation on the maximum number of hard links an
 inode can have. Specifically, links are stored in an array of ref
 items:

 struct btrfs_inode_ref {
  __le64 index;
  __le16 name_len;
  /* name goes here */
 } __attribute__ ((__packed__));

 The ref arrays are found via key triple:

 (inode objectid, BTRFS_INODE_REF_KEY, parent dir objectid)

 Since items can not exceed the size of a leaf, the total number of links
 that can be stored for a given inode / parent dir pair is limited to under
 4k. This works fine for the most common case of few to only a handful of
 links. Once the link count gets higher however, we begin to return EMLINK.
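
To make the limit concrete, the arithmetic can be sketched as below. This is a rough illustration only: the 10-byte fixed part follows from the packed struct shown above (8-byte index plus 2-byte name_len), but the item size bound passed in is a hypothetical parameter, since the real usable space in a leaf is smaller than the leaf itself, and `toy_max_links()` is not a btrfs function.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed part of one ref array entry:
 * __le64 index (8 bytes) + __le16 name_len (2 bytes), packed. */
#define TOY_INODE_REF_HDR 10

/*
 * Upper bound on the number of links to one inode from one parent
 * directory, assuming every link name has the same length and the
 * whole ref array must fit in item_bytes.
 */
static size_t toy_max_links(size_t item_bytes, size_t name_len)
{
	assert(name_len > 0);
	return item_bytes / (TOY_INODE_REF_HDR + name_len);
}
```

For example, with an assumed bound of 4096 bytes, 8-character names allow at most 227 links and 255-character names at most 15, so a few hundred short-named links in one directory are enough to fill the array.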


 The following patches fix this situation by introducing a new ref item:

 struct btrfs_inode_extref {
  __le64 parent_objectid;
  __le64 index;
  __le16 name_len;
  __u8   name[0];
  /* name goes here */
 } __attribute__ ((__packed__));

 Extended refs behave differently from ref arrays in several key areas.

 Each extended ref is its own item, so there is no ref array (and
 therefore no limit on size).

 As a result, we must use a different addressing scheme. Extended ref keys
 look like:

 (inode objectid, BTRFS_INODE_EXTREF_KEY, hash)

 Where hash is defined as a function of the parent objectid and link name.
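
As a rough sketch of this addressing scheme, the key for an extended ref could be built as shown below. The thread does not say which hash function is used, so the sketch substitutes 64-bit FNV-1a purely as a placeholder. `struct toy_key`, `toy_extref_hash()` and `toy_extref_key()` are hypothetical names for illustration, and only the key type value 13 is taken from patch 1/3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key mirroring the (objectid, type, offset) triple. */
struct toy_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

#define TOY_INODE_EXTREF_KEY 13 /* BTRFS_INODE_EXTREF_KEY in patch 1/3 */

/*
 * Placeholder hash (ASSUMPTION: 64-bit FNV-1a, not the function used by
 * the patches). Mixes the 8 bytes of the parent objectid, low byte
 * first, followed by the bytes of the link name.
 */
static uint64_t toy_extref_hash(uint64_t parent_objectid,
				const char *name, size_t name_len)
{
	uint64_t h = 0xcbf29ce484222325ULL;      /* FNV-1a offset basis */
	const uint64_t prime = 0x100000001b3ULL; /* FNV-1a prime */
	size_t i;

	assert(name != NULL || name_len == 0);

	for (i = 0; i < 8; i++) {
		h ^= (parent_objectid >> (8 * i)) & 0xff;
		h *= prime;
	}
	for (i = 0; i < name_len; i++) {
		h ^= (unsigned char)name[i];
		h *= prime;
	}
	return h;
}

/* Build (inode objectid, EXTREF key type, hash(parent, name)). */
static struct toy_key toy_extref_key(uint64_t inode_objectid,
				     uint64_t parent_objectid,
				     const char *name, size_t name_len)
{
	struct toy_key k;

	k.objectid = inode_objectid;
	k.type = TOY_INODE_EXTREF_KEY;
	k.offset = toy_extref_hash(parent_objectid, name, name_len);
	return k;
}
```

Because the offset depends on both the parent directory and the name, two links to the same inode normally land under different keys, and a lookup for a given (parent, name) pair can compute the key directly instead of scanning an array.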

 This effectively fixes the limitation, though we have a slightly less
 efficient packing of link data. To keep the best of both worlds then, I
 implemented the following behavior:

 Extended refs don't replace the existing ref array. An inode gets an
 extended ref for a given link _only_ after the ref array has been filled.  So
 the most common cases shouldn't actually see any difference in performance
 or disk usage as they'll never get to the point where we're using an
 extended ref.

 It's important while reading the patches however that there's still the
 possibility that we can have a set of operations that grow out an inode ref
 array (adding some extended refs) and then remove only the refs in the
 array.  I don't really see this being common but it's a case we always have
 to consider when coding these changes.

 Right now there is a limitation for extrefs in that we're not handling the
 possibility of a hash collision. There are two ways I see we can deal with
 this:

 We can use a 56-bit hash and keep a generation counter in the lower 8
 bits of the offset field.  The cost would be an additional tree search
 (between offset hash00 and hashFF) if we don't find exactly the name we
 were looking for.

 An alternative solution to dealing with collisions could be to emulate the
 dir-item insertion code - specifically something like insert_with_overflow()
 which will stuff multiple items under one key. I tend to prefer the idea of
 simply including a generation in the key offset however since it maintains
 the 1:1 relationship of keys to names which turns out to be much nicer to
 code for in my honest opinion. Also none of the code which iterates the tree
 looking for refs would have to change as the only difference is in the key
 offset and not in the actual item structure.
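
The first option above (a 56-bit hash with a generation counter in the lower 8 bits of the key offset) can be sketched as simple bit packing. This is a minimal illustration of the layout as described in the text, not code from the patches, and all `toy_` names are hypothetical.

```c
#include <stdint.h>

#define TOY_GEN_BITS 8
#define TOY_GEN_MASK 0xffULL

/* Pack the low 56 bits of hash into the upper 56 bits of the offset and
 * the generation counter into the lower 8 bits. */
static uint64_t toy_pack_offset(uint64_t hash, uint8_t gen)
{
	return (hash << TOY_GEN_BITS) | gen;
}

/* Recover the 56-bit hash part of a packed offset. */
static uint64_t toy_offset_hash(uint64_t offset)
{
	return offset >> TOY_GEN_BITS;
}

/* Recover the generation counter of a packed offset. */
static uint8_t toy_offset_gen(uint64_t offset)
{
	return (uint8_t)(offset & TOY_GEN_MASK);
}

/* First and last offsets to search when the exact name was not found
 * at generation 0: hash:00 through hash:FF. */
static uint64_t toy_range_first(uint64_t hash)
{
	return toy_pack_offset(hash, 0x00);
}

static uint64_t toy_range_last(uint64_t hash)
{
	return toy_pack_offset(hash, 0xff);
}
```

Under this layout each name keeps its own key, and a colliding name is stored under the same hash with the next free generation value, which bounds the extra search to at most 256 offsets.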


 Testing wise, the patches are in an intermediate state. I've debugged a fair
 bit but I'm certain there's gremlins lurking in there.  The basic namespace
 operations work well enough (link, unlink, etc).  I've done light testing of
 my changes in backref.c by exercising BTRFS_IOC_INO_PATHS.  The changes in
 tree-log.c need the most review and testing - I haven't really figured out a
 great way to exercise the code in tree-log yet (suggestions would be
 great!).

 
 For the log recovery test, I used sysrq+b to make sure our log remains on disk.
 
 Will also test this patchset sooner or later.
 

It works fine in normal mode, except that we need to tell people to first
update their btrfs-progs with that incompat flag ;)

However, for log recovery, I used the following script:

$ touch /mnt/btrfs/foobar
$ ./fsync_self /mnt/btrfs/foobar    (fsync_self is a wrapper around fsync() that I wrote myself)
$ for i in `seq 1 1 300`; do ln /mnt/btrfs/foobar /mnt/btrfs/foobar$i; ./fsync_self /mnt/btrfs/foobar$i; done
$ echo b > /proc/sysrq-trigger
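
The fsync_self helper used in the script is not included in the message. A minimal sketch of what such a wrapper might look like is given below. It assumes Linux behaviour, where fsync() works on a descriptor opened read-only, and `fsync_path()` is a hypothetical name rather than the author's actual code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Open the file at path, flush it to disk with fsync(), and close it.
 * Returns 0 on success and -1 on failure.
 */
static int fsync_path(const char *path)
{
	int fd;
	int ret;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	ret = fsync(fd);

	/* Report a close() failure only if fsync() itself succeeded. */
	if (close(fd) < 0 && ret == 0)
		ret = -1;

	return ret;
}
```

A main() that passes argv[1] to fsync_path() and exits non-zero on failure would turn this into a command that can be used like ./fsync_self in the script above.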

when we come back,
$ mount disk /mnt/btrfs

and it hits a warning and a hang, the dmesg log shows:

Btrfs loaded
device fsid 85811dec-dd03-44f1-a8e2-005a67c6b7f5 devid 1 transid 5 /dev/sdb7
btrfs: disk space caching is enabled
Btrfs detected SSD devices, enabling SSD mode
------------[ cut here ]------------
WARNING: at fs/btrfs/ctree.c:1677 btrfs_search_slot+0x941/0x960 [btrfs]()
Hardware name: QiTianM7150
Modules linked in: btrfs(O) zlib_deflate libcrc32c ip6table_filter ip6_tables 
iptable_filter ebtable_nat ebtables ipt_REJECT ip_tables bridge stp llc nfsd 
lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq 
freq_table mperf be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb3i