Re: [PATCH v2 1/7] added helper functions to iterate backrefs

2011-06-23 Thread Jan Schmidt
On 21.06.2011 17:12, Jan Schmidt wrote:
> On 21.06.2011 16:37, David Sterba wrote:
>> [...]
> Something is going wrong here:
> 
>> Example:
>> ipath buffer and scratch are 32K, each, ie. the overly sized
>> ref_name_len will fit there:
>>
>> [ 8766.928232] btrfs: ino2name: 266  p1/
>> [ 8767.440964] i2p: [4] namelen 10, left 32766
>> [ 8767.446417] i2p: [7] path:
>> [ 8767.450445] i2p: [4] namelen 2, left 32755
>> [ 8767.455733] i2p: [7] path: /
>> [ 8767.459834] i2p: [2] p1/
> 
> Do I get that right? This is inode 266, which should refer to
> ./p1/d6/d7/d1f/c41 (as you state below). Already on second iteration,
> we're at something named "pl", which shouldn't happen as we construct
> the path bottom-up. And those namelen 0 in the following lines shouldn't
> happen, either. Furthermore, the path should change on every iteration
> (I guess, that might depend on the printk behind, of course).
> 
>> [ 8767.463593] i2p: [4] namelen 0, left 32752
>> [ 8767.468903] i2p: [7] path:
>> [ 8767.472908] i2p: [4] namelen 2, left 32751
>> [ 8767.478188] i2p: [7] path: /
>> [ 8767.482280] i2p: [2] p1/
>> [ 8767.486024] i2p: [4] namelen 0, left 32748
>> [ 8767.491293] i2p: [7] path:
>> [ 8767.495283] i2p: [4] namelen 2, left 32747
>> [ 8767.500587] i2p: [7] path: /
>> [ 8767.504680] i2p: [2] p1/
>> [ 8767.508430] i2p: [4] namelen 0, left 32744
>> [ 8767.513708] i2p: [7] path:
>> [ 8767.517702] i2p: [4] namelen 2, left 32743
>> [ 8767.522969] i2p: [7] path: /
>> [ 8767.527042] i2p: [2] p1/
>> [ 8767.530787] i2p: [4] namelen 0, left 32740
>> [ 8767.536049] i2p: [7] path:
>> [ 8767.539991] i2p: [4] namelen 2, left 32739
>> [ 8767.545282] i2p: [7] path: /
>> [ 8767.549374] i2p: [2] p1/
>> [ 8767.553109] i2p: [4] namelen 0, left 32736
>> [ 8767.558377] i2p: [7] path:
>> [ 8767.562354] i2p: [4] namelen 2, left 32735
>> [ 8767.567615] i2p: [7] path: /
>> [ 8767.571713] i2p: [2] p1/
>> [ 8767.575443] i2p: [4] namelen 0, left 32732
>> [ 8767.580701] i2p: [7] path:
>> [ 8767.584678] i2p: [4] namelen 2, left 32731
>> [ 8767.589933] i2p: [7] path: /
>> [ 8767.594007] i2p: [2] p1/
>> [ 8767.597713] i2p: [4] namelen 0, left 32728
>> [ 8767.602908] i2p: [7] path:
>> [ 8767.606832] i2p: [4] namelen 2, left 32727
>> [ 8767.612030] i2p: [7] path: /
>> [ 8767.616050] i2p: [2] p1/
>> [ 8767.619676] i2p: [4] namelen 0, left 32724
>> [ 8767.624873] i2p: [7] path:
>> [ 8767.628782] i2p: [4] namelen 2, left 32723
>> [ 8767.633970] i2p: [7] path: /
>> [ 8767.637962] i2p: [2] p1/
>> [ 8767.641639] i2p: [4] namelen 0, left 32720
>> [ 8767.646838] i2p: [7] path:
>> [ 8767.650757] i2p: [4] namelen 2, left 32719
>> [ 8767.655952] i2p: [7] path: /
>> [ 8767.659940] i2p: [2] p1/
>> [ 8767.663604] i2p: [4] namelen 0, left 32716
>> [ 8767.668790] i2p: [7] path:
>> [ 8767.672696] i2p: [4] namelen 2, left 32715
>> [ 8767.677881] i2p: [7] path: /
>> [ 8767.681878] i2p: [2] p1/
>> [ 8767.685549] i2p: [4] namelen 0, left 32712
>> [ 8767.690742] i2p: [7] path:
>> [ 8767.694655] i2p: [4] namelen 2, left 32711
>> [ 8767.699843] i2p: [7] path: /
>> [ 8767.703840] i2p: [2] p1/
>> [ 8767.707520] i2p: [4] namelen 19967, left 32708
>> [ 8767.713057] i2p: [7] path:
>> [ 8767.716955] BUG: unable to handle kernel NULL pointer dereference at  
>>  (null)
>> [ 8767.720932] IP: [] read_extent_buffer+0xa2/0x220 [btrfs]
>> [ 8767.720932] PGD 77bd0067 PUD 79d35067 PMD 0
>> [ 8767.720932] Oops:  [#1] SMP
>> [ 8767.720932] CPU 1
>> [ 8767.720932] Modules linked in: btrfs loop
>> [ 8767.720932]
>> [ 8767.720932] Pid: 10859, comm: btrfs-ino2path Not tainted 
>> 3.0.0-rc3-default+ #75 Intel Corporation Santa Rosa platform/Matanzas
>> [ 8767.720932] RIP: 0010:[]  [] 
>> read_extent_buffer+0xa2/0x220 [btrfs]
>> [ 8767.720932] RSP: 0018:880074acbc58  EFLAGS: 00010202
>> [ 8767.720932] RAX:  RBX: 3deb RCX: 
>> 1000
>> [ 8767.720932] RDX: 01e0 RSI:  RDI: 
>> 0246
>> [ 8767.720932] RBP: 880074acbcb8 R08:  R09: 
>> 0001
>> [ 8767.720932] R10:  R11: 1c04 R12: 
>> 0002
>> [ 8767.720932] R13:  R14: 880079bac1d9 R15: 
>> 880074acbfd8
>> [ 8767.720932] FS:  7f1399198740() GS:88007de0() 
>> knlGS:
>> [ 8767.720932] CS:  0010 DS:  ES:  CR0: 8005003b
>> [ 8767.720932] CR2:  CR3: 74b13000 CR4: 
>> 06e0
>> [ 8767.720932] DR0:  DR1:  DR2: 
>> 
>> [ 8767.720932] DR3:  DR6: 0ff0 DR7: 
>> 0400
>> [ 8767.720932] Process btrfs-ino2path (pid: 10859, threadinfo 
>> 880074aca000, task 880077085340)
>> [ 8767.720932] Stack:
>> [ 8767.720932]  a005f282 81b21e74 88002f2ef668 
>> 
>> [ 8767.720932]  880073f94dd8 880074acbfd8 8800780bd000 
>> 31c5
>> [ 8767.720932]  31c5 0

Re: [PATCH v2 1/7] added helper functions to iterate backrefs

2011-06-21 Thread Jan Schmidt
On 21.06.2011 16:37, David Sterba wrote:
> On Sat, Jun 18, 2011 at 01:45:58PM +0200, Jan Schmidt wrote:
>> +/*
>> + * returns 0 if the path could be dumped (probably truncated)
>> + * returns <0 in case of an error
>> + */
>> +static int inode_to_path(u64 inum, struct btrfs_inode_ref *iref,
>> +struct extent_buffer *eb_ir, int slot,
>> +void *ctx)
>> +{
>> +struct inode_fs_paths *ipath = ctx;
>> +struct extent_buffer *eb_i = ipath->eb;
>> +u32 path_len;
>> +char *fs_path;
>> +
>> +if (ipath->bytes_left < 2)
>> +return -EOVERFLOW;
>> +
>> +*ipath->dest++ = ' ';
>> +--ipath->bytes_left;
> 
> this actually prepends a space before the first entry.

Which is fine because we print the buffer with "(path:%s)" in
scrub.c:257, and a space is well appropriate there. That way we don't
need to handle the first path differently. (Makes only sense as long as
we stick to multiple files per line).

> I've played a bit with this interface, it's the missing piece to
> implement the 'inode nubmer -> directory names' search :)
> All the names are placed into one buffer, 4k sized, separated by a
> space. Not all files fit, so I suggest to print one filename at a time.

It does print all references to one *inode* to the same buffer, right.
If there are more inodes refering to that extent, those paths are
printed to a separate buffer. You can make some reflinks to try that out.

> Me-user wants to see all the filenames, even if the list is potentially
> long.

Well, maybe. However, for me, the number of hardlinks created is either
very low, or I know why a file has a whole bunch of hardlinks myself
(e.g. some kind of backup strategy). In both cases, this approach would
fit. I don't want to see some thousands of printk messages for that one
file I know to have a lot of hardlinks - maybe even missing some error
of a non-hardlinked file due to rate limiting.

On the other hand, you could argue, when I deliberately create a bunch
of reflinks, I will get the same problem. Another approach would be to
put it all into one buffer, no matter which inode it comes from. Either
way shouldn't be hard to accomplish.

> I've also noticed that in iref_to_path()
> 
> 130 len = btrfs_inode_ref_name_len(eb, iref);
> 
> returns very large values, which do not fit into the 4k buffer and the
> function returns. When trying to print the data by read_extent_buffer to
> a temporary array, a pagefault occurs (and seems to come from reading
> the extent buffer). The numbers are like 19000 or higher, which are way
> above path or filename maximum size. I don't think this is the
> intentional behaviour, it relies on some side effect rather than clear
> logic.

Something is going wrong here:

> Example:
> ipath buffer and scratch are 32K, each, ie. the overly sized
> ref_name_len will fit there:
> 
> [ 8766.928232] btrfs: ino2name: 266  p1/
> [ 8767.440964] i2p: [4] namelen 10, left 32766
> [ 8767.446417] i2p: [7] path:
> [ 8767.450445] i2p: [4] namelen 2, left 32755
> [ 8767.455733] i2p: [7] path: /
> [ 8767.459834] i2p: [2] p1/

Do I get that right? This is inode 266, which should refer to
./p1/d6/d7/d1f/c41 (as you state below). Already on second iteration,
we're at something named "pl", which shouldn't happen as we construct
the path bottom-up. And those namelen 0 in the following lines shouldn't
happen, either. Furthermore, the path should change on every iteration
(I guess, that might depend on the printk behind, of course).

> [ 8767.463593] i2p: [4] namelen 0, left 32752
> [ 8767.468903] i2p: [7] path:
> [ 8767.472908] i2p: [4] namelen 2, left 32751
> [ 8767.478188] i2p: [7] path: /
> [ 8767.482280] i2p: [2] p1/
> [ 8767.486024] i2p: [4] namelen 0, left 32748
> [ 8767.491293] i2p: [7] path:
> [ 8767.495283] i2p: [4] namelen 2, left 32747
> [ 8767.500587] i2p: [7] path: /
> [ 8767.504680] i2p: [2] p1/
> [ 8767.508430] i2p: [4] namelen 0, left 32744
> [ 8767.513708] i2p: [7] path:
> [ 8767.517702] i2p: [4] namelen 2, left 32743
> [ 8767.522969] i2p: [7] path: /
> [ 8767.527042] i2p: [2] p1/
> [ 8767.530787] i2p: [4] namelen 0, left 32740
> [ 8767.536049] i2p: [7] path:
> [ 8767.539991] i2p: [4] namelen 2, left 32739
> [ 8767.545282] i2p: [7] path: /
> [ 8767.549374] i2p: [2] p1/
> [ 8767.553109] i2p: [4] namelen 0, left 32736
> [ 8767.558377] i2p: [7] path:
> [ 8767.562354] i2p: [4] namelen 2, left 32735
> [ 8767.567615] i2p: [7] path: /
> [ 8767.571713] i2p: [2] p1/
> [ 8767.575443] i2p: [4] namelen 0, left 32732
> [ 8767.580701] i2p: [7] path:
> [ 8767.584678] i2p: [4] namelen 2, left 32731
> [ 8767.589933] i2p: [7] path: /
> [ 8767.594007] i2p: [2] p1/
> [ 8767.597713] i2p: [4] namelen 0, left 32728
> [ 8767.602908] i2p: [7] path:
> [ 8767.606832] i2p: [4] namelen 2, left 32727
> [ 8767.612030] i2p: [7] path: /
> [ 8767.616050] i2p: [2] p1/
> [ 8767.619676] i2p: [4] namelen 0, left 32724
> [ 8767.624873] i2p: [7] path:
> [ 8767.628782] i2p

Re: [PATCH v2 1/7] added helper functions to iterate backrefs

2011-06-21 Thread David Sterba
On Sat, Jun 18, 2011 at 01:45:58PM +0200, Jan Schmidt wrote:
> +/*
> + * returns 0 if the path could be dumped (probably truncated)
> + * returns <0 in case of an error
> + */
> +static int inode_to_path(u64 inum, struct btrfs_inode_ref *iref,
> + struct extent_buffer *eb_ir, int slot,
> + void *ctx)
> +{
> + struct inode_fs_paths *ipath = ctx;
> + struct extent_buffer *eb_i = ipath->eb;
> + u32 path_len;
> + char *fs_path;
> +
> + if (ipath->bytes_left < 2)
> + return -EOVERFLOW;
> +
> + *ipath->dest++ = ' ';
> + --ipath->bytes_left;

this actually prepends a space before the first entry.

I've played a bit with this interface, it's the missing piece to
implement the 'inode nubmer -> directory names' search :)
All the names are placed into one buffer, 4k sized, separated by a
space. Not all files fit, so I suggest to print one filename at a time.
Me-user wants to see all the filenames, even if the list is potentially
long.

I've also noticed that in iref_to_path()

130 len = btrfs_inode_ref_name_len(eb, iref);

returns very large values, which do not fit into the 4k buffer and the
function returns. When trying to print the data by read_extent_buffer to
a temporary array, a pagefault occurs (and seems to come from reading
the extent buffer). The numbers are like 19000 or higher, which are way
above path or filename maximum size. I don't think this is the
intentional behaviour, it relies on some side effect rather than clear
logic.

Example:
ipath buffer and scratch are 32K, each, ie. the overly sized
ref_name_len will fit there:

[ 8766.928232] btrfs: ino2name: 266  p1/
[ 8767.440964] i2p: [4] namelen 10, left 32766
[ 8767.446417] i2p: [7] path:
[ 8767.450445] i2p: [4] namelen 2, left 32755
[ 8767.455733] i2p: [7] path: /
[ 8767.459834] i2p: [2] p1/
[ 8767.463593] i2p: [4] namelen 0, left 32752
[ 8767.468903] i2p: [7] path:
[ 8767.472908] i2p: [4] namelen 2, left 32751
[ 8767.478188] i2p: [7] path: /
[ 8767.482280] i2p: [2] p1/
[ 8767.486024] i2p: [4] namelen 0, left 32748
[ 8767.491293] i2p: [7] path:
[ 8767.495283] i2p: [4] namelen 2, left 32747
[ 8767.500587] i2p: [7] path: /
[ 8767.504680] i2p: [2] p1/
[ 8767.508430] i2p: [4] namelen 0, left 32744
[ 8767.513708] i2p: [7] path:
[ 8767.517702] i2p: [4] namelen 2, left 32743
[ 8767.522969] i2p: [7] path: /
[ 8767.527042] i2p: [2] p1/
[ 8767.530787] i2p: [4] namelen 0, left 32740
[ 8767.536049] i2p: [7] path:
[ 8767.539991] i2p: [4] namelen 2, left 32739
[ 8767.545282] i2p: [7] path: /
[ 8767.549374] i2p: [2] p1/
[ 8767.553109] i2p: [4] namelen 0, left 32736
[ 8767.558377] i2p: [7] path:
[ 8767.562354] i2p: [4] namelen 2, left 32735
[ 8767.567615] i2p: [7] path: /
[ 8767.571713] i2p: [2] p1/
[ 8767.575443] i2p: [4] namelen 0, left 32732
[ 8767.580701] i2p: [7] path:
[ 8767.584678] i2p: [4] namelen 2, left 32731
[ 8767.589933] i2p: [7] path: /
[ 8767.594007] i2p: [2] p1/
[ 8767.597713] i2p: [4] namelen 0, left 32728
[ 8767.602908] i2p: [7] path:
[ 8767.606832] i2p: [4] namelen 2, left 32727
[ 8767.612030] i2p: [7] path: /
[ 8767.616050] i2p: [2] p1/
[ 8767.619676] i2p: [4] namelen 0, left 32724
[ 8767.624873] i2p: [7] path:
[ 8767.628782] i2p: [4] namelen 2, left 32723
[ 8767.633970] i2p: [7] path: /
[ 8767.637962] i2p: [2] p1/
[ 8767.641639] i2p: [4] namelen 0, left 32720
[ 8767.646838] i2p: [7] path:
[ 8767.650757] i2p: [4] namelen 2, left 32719
[ 8767.655952] i2p: [7] path: /
[ 8767.659940] i2p: [2] p1/
[ 8767.663604] i2p: [4] namelen 0, left 32716
[ 8767.668790] i2p: [7] path:
[ 8767.672696] i2p: [4] namelen 2, left 32715
[ 8767.677881] i2p: [7] path: /
[ 8767.681878] i2p: [2] p1/
[ 8767.685549] i2p: [4] namelen 0, left 32712
[ 8767.690742] i2p: [7] path:
[ 8767.694655] i2p: [4] namelen 2, left 32711
[ 8767.699843] i2p: [7] path: /
[ 8767.703840] i2p: [2] p1/
[ 8767.707520] i2p: [4] namelen 19967, left 32708
[ 8767.713057] i2p: [7] path:
[ 8767.716955] BUG: unable to handle kernel NULL pointer dereference at 
  (null)
[ 8767.720932] IP: [] read_extent_buffer+0xa2/0x220 [btrfs]
[ 8767.720932] PGD 77bd0067 PUD 79d35067 PMD 0
[ 8767.720932] Oops:  [#1] SMP
[ 8767.720932] CPU 1
[ 8767.720932] Modules linked in: btrfs loop
[ 8767.720932]
[ 8767.720932] Pid: 10859, comm: btrfs-ino2path Not tainted 3.0.0-rc3-default+ 
#75 Intel Corporation Santa Rosa platform/Matanzas
[ 8767.720932] RIP: 0010:[]  [] 
read_extent_buffer+0xa2/0x220 [btrfs]
[ 8767.720932] RSP: 0018:880074acbc58  EFLAGS: 00010202
[ 8767.720932] RAX:  RBX: 3deb RCX: 1000
[ 8767.720932] RDX: 01e0 RSI:  RDI: 0246
[ 8767.720932] RBP: 880074acbcb8 R08:  R09: 0001
[ 8767.720932] R10:  R11: 1c04 R12: 0002
[ 8767.720932] R13:  R14: 880079bac1d9 R15: 880074acbfd8
[ 8767.720932] FS:  7f1399198740() GS

[PATCH v2 1/7] added helper functions to iterate backrefs

2011-06-18 Thread Jan Schmidt
These helper functions iterate back references and call a function for each
backref. There is also a function to resolve an inode to a path in the
file system.

Signed-off-by: Jan Schmidt 
---
 fs/btrfs/Makefile  |3 +-
 fs/btrfs/backref.c |  445 
 fs/btrfs/backref.h |   59 +++
 3 files changed, 506 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 9b72dcf..c63f649 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -7,4 +7,5 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o 
root-tree.o dir-item.o \
   extent_map.o sysfs.o struct-funcs.o xattr.o ordered-data.o \
   extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
   export.o tree-log.o acl.o free-space-cache.o zlib.o lzo.o \
-  compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o
+  compression.o delayed-ref.o relocation.o delayed-inode.o backref.o \
+  scrub.o
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
new file mode 100644
index 000..64611e6
--- /dev/null
+++ b/fs/btrfs/backref.c
@@ -0,0 +1,445 @@
+/*
+ * Copyright (C) 2011 STRATO.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#include "ctree.h"
+#include "backref.h"
+
+static int __inode_info(u64 inum, u64 ioff, u8 key_type,
+   struct btrfs_root *fs_root, struct btrfs_path *path,
+   struct btrfs_key *found_key)
+{
+   int ret;
+   struct btrfs_key key;
+   struct extent_buffer *eb;
+
+   key.type = key_type;
+   key.objectid = inum;
+   key.offset = ioff;
+
+   ret = btrfs_search_slot(NULL, fs_root, &key, path, 0, 0);
+   if (ret < 0)
+   return ret;
+
+   eb = path->nodes[0];
+   if (ret && path->slots[0] >= btrfs_header_nritems(eb)) {
+   ret = btrfs_next_leaf(fs_root, path);
+   if (ret)
+   return ret;
+   eb = path->nodes[0];
+   }
+
+   btrfs_item_key_to_cpu(eb, found_key, path->slots[0]);
+   if (found_key->type != key.type || found_key->objectid != key.objectid)
+   return 1;
+
+   return 0;
+}
+
+/*
+ * this makes the path point to (inum INODE_ITEM ioff)
+ */
+int inode_item_info(u64 inum, u64 ioff, struct btrfs_root *fs_root,
+   struct btrfs_path *path)
+{
+   struct btrfs_key key;
+   return __inode_info(inum, ioff, BTRFS_INODE_ITEM_KEY, fs_root, path,
+   &key);
+}
+
+static int inode_ref_info(u64 inum, u64 ioff, struct btrfs_root *fs_root,
+   struct btrfs_path *path, int strict,
+   u64 *out_parent_inum,
+   struct extent_buffer **out_iref_eb,
+   int *out_slot)
+{
+   int ret;
+   struct btrfs_key found_key;
+
+   ret = __inode_info(inum, ioff, BTRFS_INODE_REF_KEY, fs_root, path,
+   &found_key);
+
+   if (!ret) {
+   if (out_slot)
+   *out_slot = path->slots[0];
+   if (out_iref_eb)
+   *out_iref_eb = path->nodes[0];
+   if (out_parent_inum)
+   *out_parent_inum = found_key.offset;
+   }
+
+   btrfs_release_path(path);
+   return ret;
+}
+
+/*
+ * this iterates to turn a btrfs_inode_ref into a full filesystem path. 
elements
+ * of the path are separated by '/' and the path is guaranteed to be
+ * 0-terminated. the path is only given within the current file system.
+ * Therefore, it never starts with a '/'. the caller is responsible to provide
+ * "size" bytes in "dest". the dest buffer will be filled backwards! the idea 
is
+ * that in case of an overflow, the lower part in the hierarchie is most
+ * important to the user. finally, the start point of the resulting string is
+ * returned. in case the path would overflow, "..." is added at the front of
+ * the string and iteration stops regularly.
+ */
+static char *iref_to_path(struct btrfs_root *fs_root, struct btrfs_path *path,
+   struct btrfs_inode_ref *iref,
+   struct extent_buffer *eb, u64 parent,
+   char *dest, u32 size)
+{
+