Re: TRIM support

2011-07-11 Thread Leonidas Spyropoulos
On Mon, Jul 11, 2011 at 7:04 AM, Fajar A. Nugraha  wrote:
> On Mon, Jul 11, 2011 at 5:34 AM, Leonidas Spyropoulos
>  wrote:
>> So any clues for the intel 320 series? I think it doesn't use compression.
>
> At this point your best bet is to try it yourself and see. If it
> doesn't result in poor performance, then keep on using "-o discard".
Could you suggest any tools for measuring performance?

I only know iozone and the tunefs -t parameter.
>
> --
> Fajar
>
>>
>> On Sun, Jul 10, 2011 at 10:59 PM, Fajar A. Nugraha  wrote:
>>> On Mon, Jul 11, 2011 at 3:58 AM, Leonidas Spyropoulos
>>>  wrote:
>>>> On Sun, Jul 10, 2011 at 9:33 AM, Chris Samuel  wrote:
>>>>> On Sun, 3 Jul 2011 05:45:17 AM Calvin Walton wrote:
>>>>> This LWN article from 2009 explains why it can be problematic
>>>>> (especially on SATA drives where TRIM is a non-queued command):
>>>>>
>>>>> https://lwn.net/Articles/347511/
>>>>>
>>>> So the current problem with TRIM in ATA (and SATA) is that it
>>>> introduces delays? As long as it keeps your SSD in a good shape it's
>>>> still better than not having TRIM at all, right?
>>>
>>> Not quite.
>>>
>>> Sandforce-based SSDs have their own way of reducing writes (e.g. by
>>> using internal compression), so you don't have to do anything special.
>>> Also, AFAIK currently TRIM is useless if the drives are behind a
>>> hardware raid controller anyway.
>>>
>>> My Corsair F60 (on a notebook) is actually MUCH SLOWER with -o discard
>>> (i.e. writes capped at 100 iops)



-- 
Caution: breathing may be hazardous to your health.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: TRIM support

2011-07-11 Thread Ric Wheeler

On 07/11/2011 06:53 AM, Chris Samuel wrote:
> On Mon, 11 Jul 2011 07:59:54 AM Fajar A. Nugraha wrote:
>
>> Sandforce-based SSDs have their own way of reducing writes
>> (e.g. by using internal compression), so you don't have to
>> do anything special
>
> Not just compression, but also block level de-duplication too
> (i.e. potentially removing the redundancy of btrfs's duplication
> of metadata for safety).
>
> cheers,
> Chris


How vendors implement their internal firmware at any given point is not 
something that we can know (or should know).


As mentioned in this thread, you can and should always measure the performance
of your application on your OS with and without discard enabled. Note that
you might see long-term effects (i.e., trim enabled via discard might avoid the
performance hit you see with some devices after extensive use, especially when
full).


Keep in mind that discard support is built on an industry standard command and
is used by other vendors (including Windows), so manufacturers that do a bad job
and suffer performance impacts will be *very* motivated to fix their firmware :)


Ric



Re: TRIM support

2011-07-11 Thread Fajar A. Nugraha
On Mon, Jul 11, 2011 at 2:02 PM, Leonidas Spyropoulos
 wrote:
> On Mon, Jul 11, 2011 at 7:04 AM, Fajar A. Nugraha  wrote:
>> On Mon, Jul 11, 2011 at 5:34 AM, Leonidas Spyropoulos
>>  wrote:
>>> So any clues for the intel 320 series? I think it doesn't use compression.
>>
>> At this point your best bet is to try it yourself and see. If it
>> doesn't result in poor performance, then keep on using "-o discard".

> Could you propose me any tools available for measuring performance?
>
> I only know iozone and tunefs -t parameter.

Anything that can measure random write IO is fine. I use fio with this jobfile:

$ cat randomwrite.fio
[write-test]
rw=randwrite
ioengine=libaio
blocksize=4k
iodepth=32
size=1G

the result:
  write: io=1024MB, bw=32395KB/s, iops=8098, runt= 32368msec

If you still have similar performance with and without "-o discard",
then you should add it to your mount options.
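
The comparison described above can be automated. The following is a minimal Python sketch, assuming the fio summary line has the same format as the result quoted in this message (newer fio releases may print it differently). The names write_iops and discard_slowdown are made up for illustration and are not part of fio.

```python
import re

# Assumption: the summary line looks like the one quoted above, e.g.
#   write: io=1024MB, bw=32395KB/s, iops=8098, runt= 32368msec
_WRITE_IOPS = re.compile(r"write\s*:.*?\biops=(\d+)")


def write_iops(fio_output: str) -> int:
    """Return the write IOPS from fio's text output (hypothetical helper)."""
    match = _WRITE_IOPS.search(fio_output)
    if match is None:
        raise ValueError("no 'write:' line with an iops= field found")
    return int(match.group(1))


def discard_slowdown(iops_without: int, iops_with: int) -> float:
    """Fraction of write IOPS lost with -o discard (0.0 means no change)."""
    if iops_without <= 0:
        raise ValueError("baseline IOPS must be positive")
    return (iops_without - iops_with) / iops_without
```

Run the job once on a filesystem mounted without "-o discard" and once with it, pass both outputs to write_iops(), and compare them with discard_slowdown(). How small the result must be to count as "similar" is a judgment call.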

-- 
Fajar


Re: Memory leak?

2011-07-11 Thread Stephane Chazelas
2011-07-10 19:37:28 +0100, Stephane Chazelas:
> 2011-07-10 08:44:34 -0400, Chris Mason:
> [...]
> > Great, we're on the right track.  Does it trigger with mount -o compress
> > instead of mount -o compress_force?
> [...]
> 
> It does trigger. I get that same "invalid opcode".
> 
> BTW, I tried with CONFIG_SLUB and slub_debug and no more useful
> information than with SLAB_DEBUG.
> 
> I'm trying now without dmcrypt. Then I won't have much bandwidth
> for testing.
[...]

Same without dmcrypt. So to sum up, BUG() reached in btrfs-fixup
thread when doing an 

- rsync (though I also got (back when on ubuntu and 2.6.38) at
  least one occurrence using bsdtar | bsdtar)
- of a large amount of data (with a large number of files),
  though the bug occurs quite early, probably after having
  transferred about 50-100GB
- the source FS being btrfs with compress-force on 3 devices
  (one of which slightly shorter than the others) and a lot of
  subvolumes and snapshots (I'm now copying from read-only
  snapshots but that happened with RW ones as well).
- to a newly created btrfs fs
- on one device (/dev/sdd or dmcrypt)
- mounted with compress or compress-force.

- noatime on either source or dest doesn't make a difference
  (wrt the occurrence of fixup BUG())
- can't reproduce it when dest is not mounted with compress
- besides that BUG(),
- kernel memory is being used up (mostly in
  btrfs_inode_cache) and can't be reclaimed (leading to a crash
  with the OOM killer killing everybody)
- the target FS can be unmounted but that does not reclaim
  memory. However the *source* FS (that is not the one we tried
  with and without compress) cannot be unmounted (umount hangs,
  see another email for its stack trace).
- Only way to get out of there is reboot with sysrq-b
- happens with 2.6.38, 2.6.39, 3.0.0rc6
- CONFIG_SLAB_DEBUG, CONFIG_DEBUG_PAGEALLOC,
  CONFIG_DEBUG_SLAB_LEAK, slub_debug don't tell us anything
  useful (there's more info in /proc/slabinfo when
  CONFIG_SLAB_DEBUG is on, see below)
- happens with CONFIG_SLUB as well.


slabinfo, sampled about every 60-70 seconds, including the "globalstat" fields:

slabinfo - version: 2.1 (statistics)
# name : tunables : slabdata : globalstat : cpustat

btrfs_inode_cache  77610  77610   409611 : tunables   24   128 : 
slabdata  77610  77610  0 : globalstat   77610  77610 77610000  
  000 : cpustat104  77609 98  5
btrfs_inode_cache 165696 165696   409611 : tunables   24   128 : 
slabdata 165696 165696  0 : globalstat  174592 166889 17311700   37 
   800 : cpustat  14375 174178  21198   1659
btrfs_inode_cache 173906 173906   409611 : tunables   24   128 : 
slabdata 173906 173906  0 : globalstat  231342 196133 22884880   37 
   800 : cpustat  24914 230649  75318   6338
btrfs_inode_cache 201190 201190   409611 : tunables   24   128 : 
slabdata 201190 201190  0 : globalstat  338963 201190 33145480   38 
  1100 : cpustat  53954 335583 173512  14834
btrfs_inode_cache 224106 224143   409611 : tunables   24   128 : 
slabdata 224106 224143 96 : globalstat  453173 267189 44210180   38 
  1300 : cpustat  77063 448023 277242  23875
btrfs_inode_cache 126520 126520   409611 : tunables   24   128 : 
slabdata 126520 126520  0 : globalstat  486327 267189 472461   320   38 
  1300 : cpustat  96675 479904 414073  35992
btrfs_inode_cache 144723 144723   409611 : tunables   24   128 : 
slabdata 144723 144723  0 : globalstat  537600 267189 521248   320   38 
  1500 : cpustat 114446 530048 459922  39849
btrfs_inode_cache 176590 176590   409611 : tunables   24   128 : 
slabdata 176590 176590  0 : globalstat  626027 267189 605212   320   38 
  3500 : cpustat 142336 616188 535659  46275
btrfs_inode_cache 225715 225752   409611 : tunables   24   128 : 
slabdata 225715 225752 96 : globalstat  766387 267189 739439   320   38 
  6000 : cpustat 181607 753564 653165  56404
btrfs_inode_cache 179039 179076   409611 : tunables   24   128 : 
slabdata 179039 179076 84 : globalstat  821296 267189 793315   480   38 
  6000 : cpustat 189640 808027 753396  65349
btrfs_inode_cache 139572 139609   409611 : tunables   24   128 : 
slabdata 139572 139609 96 : globalstat  890513 267189 858553   560   38 
  6000 : cpustat 214964 875265 874796  75992
btrfs_inode_cache 122064 122101   409611 : tunables   24   128 : 
slabdata 122064 122101 96 : globalstat  936515 267189 903015   720   38 
  6600 : cpustat 230345 920877 947006  82282
btrfs_inode_cache 136431 136468   409611 : tunables   24   128 : 
slabdata 136431 136468 96 : globalstat 1001394 267189 963758   880   38 
  6700 : cpustat 256274 983686 1015526  88131
btrfs_inod
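
To follow how btrfs_inode_cache grows between samples like the ones above, the second column of each row (active_objs) can be extracted and differenced. This is only a Python sketch under assumptions: it expects the usual whitespace-separated /proc/slabinfo layout, and the helper names active_objs and growth are hypothetical.

```python
def active_objs(slabinfo_text: str, cache: str = "btrfs_inode_cache") -> int:
    """Return the active_objs column (second field) for one slab cache."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if fields and fields[0] == cache:
            return int(fields[1])
    raise KeyError(cache)


def growth(samples):
    """Differences between consecutive active_objs samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]
```

Reading /proc/slabinfo usually requires root. With the first three samples reported above (77610, 165696, 173906), growth() returns [88086, 8210].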

[PATCH] Btrfs-progs: add "btrfs subvolume get-default" subcommand

2011-07-11 Thread Zhong, Xin
Add subcommand to get the default subvolume of btrfs filesystem

Reported-by: Yang, Yi 
Signed-off-by: Zhong, Xin 
---
 btrfs-list.c |   60 -
 btrfs.c  |3 ++
 btrfs_cmds.c |   31 +-
 btrfs_cmds.h |3 +-
 4 files changed, 93 insertions(+), 4 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 93766a8..e9c0266 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -536,7 +536,7 @@ build:
return full;
 }
 
-int list_subvols(int fd)
+int list_subvols(int fd, int get_default)
 {
struct root_lookup root_lookup;
struct rb_node *n;
@@ -545,10 +545,12 @@ int list_subvols(int fd)
struct btrfs_ioctl_search_key *sk = &args.key;
struct btrfs_ioctl_search_header *sh;
struct btrfs_root_ref *ref;
+   struct btrfs_dir_item *di;
unsigned long off = 0;
int name_len;
char *name;
u64 dir_id;
+   u64 subvol_id = 0;
int i;
 
root_lookup_init(&root_lookup);
@@ -642,6 +644,55 @@ int list_subvols(int fd)
n = rb_next(n);
}
 
+   memset(&args, 0, sizeof(args));
+
+   /* search in the tree of tree roots */
+   sk->tree_id = BTRFS_ROOT_TREE_OBJECTID;
+
+   /* search dir item */
+   sk->max_type = BTRFS_DIR_ITEM_KEY;
+   sk->min_type = BTRFS_DIR_ITEM_KEY;
+
+   sk->max_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
+   sk->min_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
+   sk->max_offset = (u64)-1;
+   sk->max_transid = (u64)-1;
+
+   /* just a big number, doesn't matter much */
+   sk->nr_items = 4096;
+
+   /* try to get the objectid of default subvolume */
+   if(get_default) {
+   ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
+   if (ret < 0) {
+   fprintf(stderr, "ERROR: can't perform the search\n");
+   return ret;
+   }
+   /* the ioctl returns the number of item it found in nr_items */
+   if (sk->nr_items == 0)
+   goto print;
+
+   off = 0;
+   /* go through each item to find dir item named "default" */
+   for (i = 0; i < sk->nr_items; i++) {
+   sh = (struct btrfs_ioctl_search_header *)(args.buf +
+ off);
+   off += sizeof(*sh);
+   if (sh->type == BTRFS_DIR_ITEM_KEY) {
+   di = (struct btrfs_dir_item *)(args.buf + off);
+   name_len = le16_to_cpu(di->name_len);
+   name = (char *)di + sizeof(struct btrfs_dir_item);
+   if (!strncmp("default", name, name_len)) {
+   subvol_id = btrfs_disk_key_objectid(
+   &di->location);
+   break;
+   }
+   }
+
+   off += sh->len;
+   }
+   }
+print:
/* now that we have all the subvol-relative paths filled in,
 * we have to string the subvols together so that we can get
 * a path all the way back to the FS root
@@ -650,7 +701,12 @@ int list_subvols(int fd)
while (n) {
struct root_info *entry;
entry = rb_entry(n, struct root_info, rb_node);
-   resolve_root(&root_lookup, entry);
+   if(!get_default)
+   resolve_root(&root_lookup, entry);
+   /* we only want the default subvolume */
+   else if(subvol_id == entry->root_id)
+   resolve_root(&root_lookup, entry);
+   
n = rb_prev(n);
}
 
diff --git a/btrfs.c b/btrfs.c
index 46314cf..6b73f88 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -73,6 +73,9 @@ static struct Command commands[] = {
"Set the subvolume of the filesystem  which will be 
mounted\n"
"as default."
},
+   { do_get_default_subvol, 1, "subvolume get-default", "\n"
+   "Get the default subvolume of a filesystem."
+   },
{ do_fssync, 1,
  "filesystem sync", "\n"
"Force a sync on the filesystem ."
diff --git a/btrfs_cmds.c b/btrfs_cmds.c
index 8031c58..11c56f6 100644
--- a/btrfs_cmds.c
+++ b/btrfs_cmds.c
@@ -301,7 +301,7 @@ int do_subvol_list(int argc, char **argv)
fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
return 12;
}
-   ret = list_subvols(fd);
+   ret = list_subvols(fd, 0);
if (ret)
return 19;
return 0;
@@ -834,6 +834,35 @@ int do_set_default_subvol(int nargs, char **argv)
return 0;
 }
 
+int do_get_default_subvol(int nargs, char **argv)
+{
+   int fd;

Re: feature request: btrfs-image without zeroing data

2011-07-11 Thread Stephane Chazelas
2011-07-11 02:00:51 +0200, krz...@gmail.com :
> Documentation says that btrfs-image zeros data. Feature request is for
> disabling this. btrfs-image could be used to copy filesystem to
> another drive (for example with snapshots, when copying it file by
> file would take much longer time or actually was not possible
> (snapshots)). btrfs-image in turn could be used to actually shrink loop
> devices/sparse file containing btrfs - by copying filesystem to new
> loop device/sparse file.
> 
> Also it would be nice if copying filesystem could occur without
> intermediate dump to a file...
[...]

I second that.

See also
http://thread.gmane.org/gmane.comp.file-systems.btrfs/9675/focus=9820
for a way to transfer btrfs fs.

(Add a layer of "copy-on-write" on the original devices (LVM
snapshots, nbd/qemu-nbd cow...), "btrfs add" the new device(s)
and then "btrfs del" the cow'ed original devices.)

-- 
Stephane


[PATCH V2] Btrfs-progs: add "btrfs subvolume get-default" subcommand

2011-07-11 Thread Zhong, Xin
Add subcommand to get the default subvolume of btrfs filesystem

Reported-by: Yang, Yi 
Signed-off-by: Zhong, Xin 
---
 btrfs-list.c |   57 +++--
 btrfs.c  |3 +++
 btrfs_cmds.c |   31 ++-
 btrfs_cmds.h |3 ++-
 4 files changed, 90 insertions(+), 4 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 93766a8..aa6a9b4 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -536,7 +536,7 @@ build:
return full;
 }
 
-int list_subvols(int fd)
+int list_subvols(int fd, int get_default)
 {
struct root_lookup root_lookup;
struct rb_node *n;
@@ -545,10 +545,12 @@ int list_subvols(int fd)
struct btrfs_ioctl_search_key *sk = &args.key;
struct btrfs_ioctl_search_header *sh;
struct btrfs_root_ref *ref;
+   struct btrfs_dir_item *di;
unsigned long off = 0;
int name_len;
char *name;
u64 dir_id;
+   u64 subvol_id = 0;
int i;
 
root_lookup_init(&root_lookup);
@@ -642,6 +644,52 @@ int list_subvols(int fd)
n = rb_next(n);
}
 
+   memset(&args, 0, sizeof(args));
+
+   /* search in the tree of tree roots */
+   sk->tree_id = BTRFS_ROOT_TREE_OBJECTID;
+
+   /* search dir item */
+   sk->max_type = BTRFS_DIR_ITEM_KEY;
+   sk->min_type = BTRFS_DIR_ITEM_KEY;
+
+   sk->max_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
+   sk->min_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
+   sk->max_offset = (u64)-1;
+   sk->max_transid = (u64)-1;
+
+   /* just a big number, doesn't matter much */
+   sk->nr_items = 4096;
+
+   /* try to get the objectid of default subvolume */
+   if(get_default) {
+   ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
+   if (ret < 0) {
+   fprintf(stderr, "ERROR: can't perform the search\n");
+   return ret;
+   }
+
+   off = 0;
+   /* go through each item to find dir item named "default" */
+   for (i = 0; i < sk->nr_items; i++) {
+   sh = (struct btrfs_ioctl_search_header *)(args.buf +
+ off);
+   off += sizeof(*sh);
+   if (sh->type == BTRFS_DIR_ITEM_KEY) {
+   di = (struct btrfs_dir_item *)(args.buf + off);
+   name_len = le16_to_cpu(di->name_len);
+   name = (char *)di + sizeof(struct btrfs_dir_item);
+   if (!strncmp("default", name, name_len)) {
+   subvol_id = btrfs_disk_key_objectid(
+   &di->location);
+   break;
+   }
+   }
+
+   off += sh->len;
+   }
+   }
+
/* now that we have all the subvol-relative paths filled in,
 * we have to string the subvols together so that we can get
 * a path all the way back to the FS root
@@ -650,7 +698,12 @@ int list_subvols(int fd)
while (n) {
struct root_info *entry;
entry = rb_entry(n, struct root_info, rb_node);
-   resolve_root(&root_lookup, entry);
+   if(!get_default)
+   resolve_root(&root_lookup, entry);
+   /* we only want the default subvolume */
+   else if(subvol_id == entry->root_id)
+   resolve_root(&root_lookup, entry);
+   
n = rb_prev(n);
}
 
diff --git a/btrfs.c b/btrfs.c
index 46314cf..6b73f88 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -73,6 +73,9 @@ static struct Command commands[] = {
"Set the subvolume of the filesystem  which will be 
mounted\n"
"as default."
},
+   { do_get_default_subvol, 1, "subvolume get-default", "\n"
+   "Get the default subvolume of a filesystem."
+   },
{ do_fssync, 1,
  "filesystem sync", "\n"
"Force a sync on the filesystem ."
diff --git a/btrfs_cmds.c b/btrfs_cmds.c
index 8031c58..11c56f6 100644
--- a/btrfs_cmds.c
+++ b/btrfs_cmds.c
@@ -301,7 +301,7 @@ int do_subvol_list(int argc, char **argv)
fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
return 12;
}
-   ret = list_subvols(fd);
+   ret = list_subvols(fd, 0);
if (ret)
return 19;
return 0;
@@ -834,6 +834,35 @@ int do_set_default_subvol(int nargs, char **argv)
return 0;
 }
 
+int do_get_default_subvol(int nargs, char **argv)
+{
+   int fd;
+   int ret;
+   char *subvol;
+
+   subvol = argv[1];
+
+   ret = test_issubvolume(subvol);
+   if (ret < 0) {
+   fprintf(st

Re: [GIT PULL] btrfs fixes

2011-07-11 Thread Tarkan Erimer

On 07/08/2011 09:55 PM, Chris Mason wrote:
> Hi everyone,
>
> The for-linus branch of the btrfs-unstable repo:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git for-linus
>
> Has three more fixes.  We're fixing oopsen during space balancing (btrfs
> filesystem balance /mnt) and during device removal.
>
> Dave Sterba also sent in a patch to make /proc/mounts properly match a
> few new mount options, which he (correctly I think) considers a
> regression fix because it makes it hard for testers/users to verify the
> options in a running config.

Hi Chris,

Is there any progress on the bug I reported under the subject
"[BUG] Btrfs: Corrupted root filesystem"?



Tarkan


Re: btrfs hang in flush-btrfs-5

2011-07-11 Thread Jeremy Sanders
Jeremy Sanders wrote:

> Hi - I'm trying btrfs with kernel 2.6.38.8-32.fc15.x86_64 (a Fedora
> kernel). I'm just doing a tar-to-tar copy onto the file system with
> compress-force=zlib. Here are some traces of the stuck processes.

I've managed to reproduce the hang using the latest btrfs from the
repository. To get it to compile under 2.6.38.8, I had to remove some of
the tracing lines and an ioctl which wasn't defined. Here is where it is
stuck:

[ 8390.923737] flush-btrfs-4   D 88005aeef480 0  2965  2 0x0080
[ 8390.923907]  8800026cb720 0046 8800026cb690 0001
[ 8390.924037]  00013840 00013840 00013840 88005931ae60
[ 8390.924037]  00013840 8800026cbfd8 00013840 00013840
[ 8390.924037] Call Trace:
[ 8390.924037]  [] ? sync_page+0x0/0x4d
[ 8390.924037]  [] io_schedule+0x47/0x62
[ 8390.924037]  [] sync_page+0x49/0x4d
[ 8390.924037]  [] __wait_on_bit_lock+0x46/0x8f
[ 8390.924037]  [] __lock_page+0x66/0x6d
[ 8390.924037]  [] ? wake_bit_function+0x0/0x31
[ 8390.924037]  [] ? should_resched+0xe/0x2e
[ 8390.924037]  [] lock_page+0x3d/0x41 [btrfs]
[ 8390.924037]  [] lock_delalloc_pages+0xb7/0x1a2 [btrfs]
[ 8390.924037]  [] find_lock_delalloc_range.clone.18+0xd9/0x1cb [btrfs]
[ 8390.924037]  [] ? __lookup_tag+0xb9/0x123
[ 8390.924037]  [] __extent_writepage+0x156/0x561 [btrfs]
[ 8390.924037]  [] ? radix_tree_gang_lookup_tag_slot+0x81/0xa2
[ 8390.924037]  [] ? find_get_pages_tag+0x6f/0xd5
[ 8390.924037]  [] extent_write_cache_pages.clone.9.clone.16+0x134/0x2a1 [btrfs]
[ 8390.924037]  [] extent_writepages+0x47/0x5c [btrfs]
[ 8390.924037]  [] ? btrfs_get_extent+0x0/0x77f [btrfs]
[ 8390.924037]  [] ? bit_waitqueue+0x17/0xa9
[ 8390.924037]  [] btrfs_writepages+0x27/0x29 [btrfs]
[ 8390.924037]  [] do_writepages+0x21/0x2a
[ 8390.924037]  [] writeback_single_inode+0x9c/0x19b
[ 8390.924037]  [] writeback_sb_inodes+0xa1/0x12b
[ 8390.924037]  [] writeback_inodes_wb+0x163/0x175
[ 8390.924037]  [] wb_writeback+0x24f/0x368
[ 8390.924037]  [] wb_do_writeback+0x183/0x19e
[ 8390.924037]  [] ? schedule_timeout+0xb3/0xe3
[ 8390.924037]  [] bdi_writeback_thread+0x88/0x205
[ 8390.924037]  [] ? bdi_writeback_thread+0x0/0x205
[ 8390.924037]  [] kthread+0x82/0x8a
[ 8390.924037]  [] kernel_thread_helper+0x4/0x10
[ 8390.924037]  [] ? kthread+0x0/0x8a
[ 8390.924037]  [] ? kernel_thread_helper+0x0/0x10

[ 8390.933163] tar D 880019053478 0  4195   2953 0x0084
[ 8390.933163]  8800190533d8 0086 813b9878 0010
[ 8390.933163]  00013840 00013840 00013840 880045beae60
[ 8390.933163]  00013840 880019053fd8 00013840 00013840
[ 8390.933163] Call Trace:
[ 8390.933163]  [] ? read_pmtmr+0x10/0x17
[ 8390.933163]  [] ? sync_page+0x0/0x4d
[ 8390.933163]  [] io_schedule+0x47/0x62
[ 8390.933163]  [] sync_page+0x49/0x4d
[ 8390.933163]  [] __wait_on_bit_lock+0x46/0x8f
[ 8390.933163]  [] __lock_page+0x66/0x6d
[ 8390.933163]  [] ? wake_bit_function+0x0/0x31
[ 8390.933163]  [] lock_page+0x3d/0x41
[ 8390.933163]  [] move_to_new_page+0x11e/0x195
[ 8390.933163]  [] migrate_pages+0x24e/0x38d
[ 8390.933163]  [] ? compaction_alloc+0x0/0x29a
[ 8390.933163]  [] ? zone_page_state_add+0x2f/0x34
[ 8390.933163]  [] compact_zone+0x3f0/0x5e1
[ 8390.933163]  [] compact_zone_order+0xb0/0xbf
[ 8390.933163]  [] ? get_page_from_freelist+0x627/0x670
[ 8390.933163]  [] try_to_compact_pages+0x91/0xe7
[ 8390.933163]  [] __alloc_pages_direct_compact+0xa9/0x16f
[ 8390.933163]  [] __alloc_pages_nodemask+0x469/0x762
[ 8390.933163]  [] ? signal_pending+0x17/0x21
[ 8390.933163]  [] alloc_pages_current+0xb1/0xca
[ 8390.933163]  [] alloc_slab_page+0x1c/0x4a
[ 8390.933163]  [] new_slab+0x52/0x1a7
[ 8390.933163]  [] __slab_alloc+0x224/0x302
[ 8390.933163]  [] ? radix_tree_preload+0x34/0x85
[ 8390.933163]  [] ? radix_tree_preload+0x34/0x85
[ 8390.933163]  [] kmem_cache_alloc+0x5b/0xe1
[ 8390.933163]  [] radix_tree_preload+0x34/0x85
[ 8390.933163]  [] add_to_page_cache_locked+0x58/0x124
[ 8390.933163]  [] add_to_page_cache_lru+0x2a/0x58
[ 8390.933163]  [] find_or_create_page+0x5a/0x8a
[ 8390.933163]  [] prepare_pages.clone.9+0xf1/0x30a [btrfs]
[ 8390.933163]  [] ? block_rsv_add_bytes+0x24/0x4e [btrfs]
[ 8390.933163]  [] __btrfs_buffered_write.clone.11+0x126/0x2a1 [btrfs]
[ 8390.933163]  [] ? __mark_inode_dirty+0x30/0x169
[ 8390.933163]  [] ? file_update_time+0xf7/0x111
[ 8390.933163]  [] btrfs_file_aio_write+0x3da/0x492 [btrfs]
[ 8390.933163]  [] ? pipe_read+0x3bd/0x3d2
[ 8390.933163]  [] ? __perf_event_task_sched_out+0x27/0x2c
[ 8390.933163]  [] do_sync_write+0xcb/0x108
[ 8390.933163]  [] ? security_file_permission+0x2e/0x33
[ 8390.933163]  [] vfs_write+0xac/0xff
[ 8390.933163]  [] sys_write+0x4a/0x6e
[ 8390.933163]  [] system_call_fastpath+0x16/0x1b

Jeremy



R: [PATCH V2] Btrfs-progs: add "btrfs subvolume get-default" subcommand

2011-07-11 Thread Goffredo Baroncelli
>Original message
>From: xin.zh...@intel.com
>Date: 11/07/2011 10.56
>To: 
>Cc: 
>Subject: [PATCH V2] Btrfs-progs: add "btrfs subvolume get-default" subcommand
>
>Add subcommand to get the default subvolume of btrfs filesystem
>
>Reported-by: Yang, Yi 
>Signed-off-by: Zhong, Xin 
>---
> btrfs-list.c |   57 +++--
> btrfs.c  |3 +++
> btrfs_cmds.c |   31 ++-
> btrfs_cmds.h |3 ++-
> 4 files changed, 90 insertions(+), 4 deletions(-)

Please update the man page too.

>
>diff --git a/btrfs-list.c b/btrfs-list.c
>index 93766a8..aa6a9b4 100644
>--- a/btrfs-list.c
>+++ b/btrfs-list.c
>@@ -536,7 +536,7 @@ build:
>   return full;
[...]
>+  /* search dir item */
>+  sk->max_type = BTRFS_DIR_ITEM_KEY;
>+  sk->min_type = BTRFS_DIR_ITEM_KEY;
>+
>+  sk->max_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
>+  sk->min_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
>+  sk->max_offset = (u64)-1;
>+  sk->max_transid = (u64)-1;
>+
[...]
>+  /* go through each item to find dir item named "default" */
>+  for (i = 0; i < sk->nr_items; i++) {
>+  sh = (struct btrfs_ioctl_search_header *)(args.buf +
>+off);
>+  off += sizeof(*sh);
>+  if (sh->type == BTRFS_DIR_ITEM_KEY) {
>+  di = (struct btrfs_dir_item *)(args.buf + off);
>+  name_len = le16_to_cpu(di->name_len);
>+  name = (char *)di + sizeof(struct btrfs_dir_item);
>+  if (!strncmp("default", name, name_len)) {
>+  subvol_id = btrfs_disk_key_objectid(
>+  &di->location);
>+  break;
>+  }
>+  }
>+
>+  off += sh->len;
>+  }

I am not familiar with the "default subvolume key", but are you sure that the
key is always in the first set of returned keys?

>+  }
>+
>   /* now that we have all the subvol-relative paths filled in,
>* we have to string the subvols together so that we can get
>* a path all the way back to the FS root
>@@ -650,7 +698,12 @@ int list_subvols(int fd)
>   while (n) {
>   struct root_info *entry;
>   entry = rb_entry(n, struct root_info, rb_node);
>-  resolve_root(&root_lookup, entry);
>+  if(!get_default)
>+  resolve_root(&root_lookup, entry);
>+  /* we only want the default subvolume */
>+  else if(subvol_id == entry->root_id)
>+  resolve_root(&root_lookup, entry);
>+  

What happens if there is no default subvolume (for example on a very old btrfs
filesystem, and/or after removing the "default" subvolume)?
I suggest handling this case by printing something like "No default subvolume
found".


BR
G.Baroncelli

>   n = rb_prev(n);
>   }
> 
>diff --git a/btrfs.c b/btrfs.c
>index 46314cf..6b73f88 100644
>--- a/btrfs.c
>+++ b/btrfs.c
>@@ -73,6 +73,9 @@ static struct Command commands[] = {
>   "Set the subvolume of the filesystem  which will be 
> mounted\n"
>   "as default."
>   },
>+  { do_get_default_subvol, 1, "subvolume get-default", "\n"
>+  "Get the default subvolume of a filesystem."
>+  },
>   { do_fssync, 1,
> "filesystem sync", "\n"
>   "Force a sync on the filesystem ."
>diff --git a/btrfs_cmds.c b/btrfs_cmds.c
>index 8031c58..11c56f6 100644
>--- a/btrfs_cmds.c
>+++ b/btrfs_cmds.c
>@@ -301,7 +301,7 @@ int do_subvol_list(int argc, char **argv)
>   fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
>   return 12;
>   }
>-  ret = list_subvols(fd);
>+  ret = list_subvols(fd, 0);
>   if (ret)
>   return 19;
>   return 0;
>@@ -834,6 +834,35 @@ int do_set_default_subvol(int nargs, char **argv)
>   return 0;
> }
> 
>+int do_get_default_subvol(int nargs, char **argv)
>+{
>+  int fd;
>+  int ret;
>+  char *subvol;
>+
>+  subvol = argv[1];
>+
>+  ret = test_issubvolume(subvol);
>+  if (ret < 0) {
>+  fprintf(stderr, "ERROR: error accessing '%s'\n", subvol);
>+  return 12;
>+  }
>+  if (!ret) {
>+  fprintf(stderr, "ERROR: '%s' is not a subvolume\n", subvol);
>+  return 13;
>+  }
>+
>+  fd = open_file_or_dir(subvol);
>+  if (fd < 0) {
>+  fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
>+  return 12;
>+  }
>+  ret = list_subvols(fd, 1);
>+  if (ret)
>+  return 19;
>+  return 0;
>+}
>+
> int do_df_filesystem(int nargs, char **argv)
> {
>   struct

Re: feature request: btrfs-image without zeroing data

2011-07-11 Thread krz...@gmail.com
2011/7/11 Stephane Chazelas :
> 2011-07-11 02:00:51 +0200, krz...@gmail.com :
>> Documentation says that btrfs-image zeros data. Feature request is for
>> disabling this. btrfs-image could be used to copy filesystem to
>> another drive (for example with snapshots, when copying it file by
>> file would take much longer time or actually was not possible
>> (snapshots)). btrfs-image in turn could be used to actually shrink loop
>> devices/sparse file containing btrfs - by copying filesystem to new
>> loop device/sparse file.
>>
>> Also it would be nice if copying filesystem could occur without
>> intermediate dump to a file...
> [...]
>
> I second that.
>
> See also
> http://thread.gmane.org/gmane.comp.file-systems.btrfs/9675/focus=9820
> for a way to transfer btrfs fs.
>
> (Add a layer of "copy-on-write" on the original devices (LVM
> snapshots, nbd/qemu-nbd cow...), "btrfs add" the new device(s)
> and then "btrfs del" of the cow'ed original devices.
>
> --
> Stephane
>

Copying at the block level (dd, LVM) is an old trick, but it takes the
same amount of time regardless of the actual space used in the
filesystem. Hence this feature request. Images inside the filesystem can
copy only the actually used data and metadata, which dramatically reduces
copy times on large volumes that are not filled up...


Re: feature request: btrfs-image without zeroing data

2011-07-11 Thread Stephane Chazelas
2011-07-11 14:39:18 +0200, krz...@gmail.com :
> 2011/7/11 Stephane Chazelas :
[...]
> > See also
> > http://thread.gmane.org/gmane.comp.file-systems.btrfs/9675/focus=9820
> > for a way to transfer btrfs fs.
> >
> > (Add a layer of "copy-on-write" on the original devices (LVM
> > snapshots, nbd/qemu-nbd cow...), "btrfs add" the new device(s)
> > and then "btrfs del" of the cow'ed original devices.
[...]
> Copying on block level (dd, lvm) is old trick, however this takes same
> amount of time regardless of actual space used in filesystem. Hence
> this feature request. Images inside filesystem can copy only actually
> used data and metadata, which dramatically reduces copy times in large
> volumes that are not filled up...

The method I suggest doesn't copy the whole disks, please read
more carefully. It can also work to copy from a 3 disk setup to
a 1 disk setup or the other way round.

With btrfs, you can add devices to a FS dynamically, and you can
also delete devices, in which case their data is transferred to
the other devices. The method I suggest uses that feature.

Cheers,
Stephane


Re: [GIT PULL] Btrfs updates for 3.1

2011-07-11 Thread Josef Bacik
On 07/10/2011 08:20 PM, Mitch Harder wrote:
> On Sat, Jul 2, 2011 at 4:25 PM, Josef Bacik  wrote:
>> On 07/01/2011 04:39 PM, Josef Bacik wrote:
>>> Hey Chris,
>>>
>>> Since I'm going on vacation next week I wanted to get everything ready for
>>> you in case you get bored with fsck and want to put together a 3.1 tree :).
>>> If you can pull
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git for-chris
>>>
>>> It is based on your for-linus branch.  Here is the shortlog and diffstat
>>>
>>
>> Ugh sorry I had to update the don't panic patch because I'm an idiot.
>> Thanks,
>>
>> Josef
>>
> 
> Josef:
> 
> I've been testing this series of patches in a 2.6.39.3 kernel merged
> with the latest 'for-linus' branch in Chris btrfs-unstable repository.
> 
> I'm getting a kernel BUG that looks like it could be attributed to the
> "Btrfs: try to only do one btrfs_search_slot in do_setxattr" patch.
> 
> [ 5959.860027] ------------[ cut here ]------------
> [ 5959.860032] kernel BUG at include/linux/spinlock.h:380!
> [ 5959.860035] invalid opcode:  [#1] SMP
> [ 5959.860038] last sysfs file: /sys/kernel/uevent_seqnum
> [ 5959.860041] CPU 1
> [ 5959.860042] Modules linked in: snd_seq_midi nvidia(P) lgdt330x
> cx88_dvb cx88_vp3054_i2c videobuf_dvb snd_ens1371 tuner_simple
> tuner_types snd_rawmidi tda9887 tda8290 tuner cx8800 cx88_alsa cx8802
> snd_ac97_codec r8169 i2c_i801 cx88xx tveeprom videobuf_dma_sg
> btcx_risc videobuf_core ac97_bus
> [ 5959.860062]
> [ 5959.860065] Pid: 13521, comm: install Tainted: P
> 2.6.39.3-git-local-v10+ #1 Gigabyte Technology Co., Ltd.
> P35-DS3L/P35-DS3L
> [ 5959.860071] RIP: 0010:[]  []
> btrfs_assert_tree_locked+0x21/0x30
> [ 5959.860079] RSP: 0018:880005dffaf8  EFLAGS: 00010246
> [ 5959.860082] RAX: 3300 RBX: 88007be51b40 RCX: 
> 0001
> [ 5959.860084] RDX:  RSI: 880005dfe000 RDI: 
> 880025993900
> [ 5959.860087] RBP: 880005dffaf8 R08: 0001 R09: 
> 0001
> [ 5959.860089] R10:  R11:  R12: 
> 88005a8c2800
> [ 5959.860092] R13: 0001 R14: 000b R15: 
> 88003e41ebd0
> [ 5959.860095] FS:  7f495b017700() GS:88007fd0()
> knlGS:
> [ 5959.860098] CS:  0010 DS:  ES:  CR0: 80050033
> [ 5959.860100] CR2: 7f495afdc000 CR3: 3482b000 CR4: 
> 06e0
> [ 5959.860103] DR0:  DR1:  DR2: 
> 
> [ 5959.860106] DR3:  DR6: 0ff0 DR7: 
> 0400
> [ 5959.860109] Process install (pid: 13521, threadinfo
> 880005dfe000, task 880056aaac40)
> [ 5959.860111] Stack:
> [ 5959.860113]  880005dffb88 813536f7 88000343f000
> 8138add4
> [ 5959.860117]  880078546700 000b 0072
> 0001
> [ 5959.860122]  880005dffb88 00010001 
> 0efb0343f000
> [ 5959.860126] Call Trace:
> [ 5959.860132]  [] push_leaf_left+0xc7/0x190
> [ 5959.860136]  [] ? btrfs_item_size+0xe4/0xf0
> [ 5959.860140]  [] btrfs_del_items+0x39d/0x570
> [ 5959.860144]  [] btrfs_delete_one_dir_name+0xf9/0x100
> [ 5959.860148]  [] do_setxattr+0x191/0x1f0
> [ 5959.860152]  [] ? join_transaction.clone.24+0x21/0x220
> [ 5959.860156]  [] __btrfs_setxattr+0x93/0xf0
> [ 5959.860159]  [] btrfs_set_acl+0x103/0x230
> [ 5959.860163]  [] btrfs_acl_chmod+0xea/0xf0
> [ 5959.860167]  [] ? __mark_inode_dirty+0x68/0x200
> [ 5959.860171]  [] btrfs_setattr+0x9e/0xd0
> [ 5959.860175]  [] notify_change+0x189/0x370
> [ 5959.860179]  [] sys_fchmodat+0xcd/0x100
> [ 5959.860183]  [] ? do_munmap+0x2eb/0x360
> [ 5959.860187]  [] sys_chmod+0x18/0x20
> [ 5959.860191]  [] system_call_fastpath+0x16/0x1b
> [ 5959.860193] Code: 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f
> 44 00 00 48 8b 47 38 a8 02 75 0e 8b 57 64 89 d0 c1 f8 08 31 d0 84 c0
> 74 02 c9 c3 <0f> 0b 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5
> 48 83
> [ 5959.860226] RIP  [] btrfs_assert_tree_locked+0x21/0x30
> [ 5959.860230]  RSP 
> [ 5959.860233] ---[ end trace 56253344f3b39a7c ]---
> 
> I'll try dropping the patch to confirm the source, but it's been an
> intermittent bug.  I haven't run across a simple way to reliably
> reproduce it.

Oh, I see what's going on. If we're trying to overwrite an existing xattr
we will panic, because I just take the path and try to delete the sucker,
but since the path was originally set up for insertion it unlocked the
parent's parent, and boom.  I will fix this up, thanks for reporting it,

Josef

Re: [PATCH V2] Btrfs-progs: add "btrfs subvolume get-default" subcommand

2011-07-11 Thread Andreas Philipp

Would you mind rebasing your patch on Hugo Mills' integration branch for
btrfs-progs at
http://git.darksatanic.net/repo/btrfs-progs-unstable.git
integration-20110705
since it does not apply on top of the changes that are already there?
Additionally, I spotted one whitespace error in the patch and marked it
below.
 
Thanks,
Andreas Philipp
 
On 11.07.2011 10:56, Zhong, Xin wrote:
> Add subcommand to get the default subvolume of btrfs filesystem
>
> Reported-by: Yang, Yi 
> Signed-off-by: Zhong, Xin 
> ---
> btrfs-list.c | 57 +++--
> btrfs.c | 3 +++
> btrfs_cmds.c | 31 ++-
> btrfs_cmds.h | 3 ++-
> 4 files changed, 90 insertions(+), 4 deletions(-)
>
> diff --git a/btrfs-list.c b/btrfs-list.c
> index 93766a8..aa6a9b4 100644
> --- a/btrfs-list.c
> +++ b/btrfs-list.c
> @@ -536,7 +536,7 @@ build:
> return full;
> }
>
> -int list_subvols(int fd)
> +int list_subvols(int fd, int get_default)
> {
> struct root_lookup root_lookup;
> struct rb_node *n;
> @@ -545,10 +545,12 @@ int list_subvols(int fd)
> struct btrfs_ioctl_search_key *sk = &args.key;
> struct btrfs_ioctl_search_header *sh;
> struct btrfs_root_ref *ref;
> + struct btrfs_dir_item *di;
> unsigned long off = 0;
> int name_len;
> char *name;
> u64 dir_id;
> + u64 subvol_id = 0;
> int i;
>
> root_lookup_init(&root_lookup);
> @@ -642,6 +644,52 @@ int list_subvols(int fd)
> n = rb_next(n);
> }
>
> + memset(&args, 0, sizeof(args));
> +
> + /* search in the tree of tree roots */
> + sk->tree_id = BTRFS_ROOT_TREE_OBJECTID;
> +
> + /* search dir item */
> + sk->max_type = BTRFS_DIR_ITEM_KEY;
> + sk->min_type = BTRFS_DIR_ITEM_KEY;
> +
> + sk->max_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
> + sk->min_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
> + sk->max_offset = (u64)-1;
> + sk->max_transid = (u64)-1;
> +
> + /* just a big number, doesn't matter much */
> + sk->nr_items = 4096;
> +
> + /* try to get the objectid of default subvolume */
> + if(get_default) {
> + ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
> + if (ret < 0) {
> + fprintf(stderr, "ERROR: can't perform the search\n");
> + return ret;
> + }
> +
> + off = 0;
> + /* go through each item to find dir item named "default" */
> + for (i = 0; i < sk->nr_items; i++) {
> + sh = (struct btrfs_ioctl_search_header *)(args.buf +
> + off);
> + off += sizeof(*sh);
> + if (sh->type == BTRFS_DIR_ITEM_KEY) {
> + di = (struct btrfs_dir_item *)(args.buf + off);
> + name_len = le16_to_cpu(di->name_len);
> + name = (char *)di + sizeof(struct btrfs_dir_item);
> + if (!strncmp("default", name, name_len)) {
> + subvol_id = btrfs_disk_key_objectid(
> + &di->location);
> + break;
> + }
> + }
> +
> + off += sh->len;
> + }
> + }
> +
> /* now that we have all the subvol-relative paths filled in,
> * we have to string the subvols together so that we can get
> * a path all the way back to the FS root
> @@ -650,7 +698,12 @@ int list_subvols(int fd)
> while (n) {
> struct root_info *entry;
> entry = rb_entry(n, struct root_info, rb_node);
> - resolve_root(&root_lookup, entry);
> + if(!get_default)
> + resolve_root(&root_lookup, entry);
> + /* we only want the default subvolume */
> + else if(subvol_id == entry->root_id)
> + resolve_root(&root_lookup, entry);
> +
The added line just above introduces a whitespace error.
> n = rb_prev(n);
> }
>
> diff --git a/btrfs.c b/btrfs.c
> index 46314cf..6b73f88 100644
> --- a/btrfs.c
> +++ b/btrfs.c
> @@ -73,6 +73,9 @@ static struct Command commands[] = {
> "Set the subvolume of the filesystem  which will be mounted\n"
> "as default."
> },
> + { do_get_default_subvol, 1, "subvolume get-default", "\n"
> + "Get the default subvolume of a filesystem."
> + },
> { do_fssync, 1,
> "filesystem sync", "\n"
> "Force a sync on the filesystem ."
> diff --git a/btrfs_cmds.c b/btrfs_cmds.c
> index 8031c58..11c56f6 100644
> --- a/btrfs_cmds.c
> +++ b/btrfs_cmds.c
> @@ -301,7 +301,7 @@ int do_subvol_list(int argc, char **argv)
> fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
> return 12;
> }
> - ret = list_subvols(fd);
> + ret = list_subvols(fd, 0);
> if (ret)
> return 19;
> return 0;
> @@ -834,6 +834,35 @@ int do_set_default_subvol(int nargs, char **argv)
> return 0;
> }
>
> +int do_get_default_subvol(int nargs, char **argv)
> +{
> + int fd;
> + int ret;
> + char *subvol;
> +
> + subvol = argv[1];
> +
> + ret = test_issubvolume(subvol);
> + if (ret < 0) {
> + fprintf(stderr, "ERROR: error accessing '%s'\n", subvol);
> + return 12;
> + }
> + if (!ret) {
> + fprintf(stderr, "ERROR: '%s' is not a subvolume\n", subvol);
> + return 13;
> + }
> +
> + fd = open_file_or_dir(subvol);
> + if (fd < 0) {
> + fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
> + return 12;
> + }
> + ret = list_subvols(fd, 1);
> + if (ret)
> + return 19;
> + return 0;
> +}
> +
> int do_df_filesystem(int nargs, char **argv)
> {
> struct btrfs_ioctl_space_args *sargs;
> diff --git a/btrfs_cmds.h b/btrfs_cmds.h
> index 7bde191..9c

Re: [PATCH v2 4/5] scrub userland implementation

2011-07-11 Thread Jan Schmidt
On 10.07.2011 20:23, Hugo Mills wrote:
>Yes, this is over three months after the initial posting, but since
> nobody else has looked at it yet, and the patch is in my integration
> stack...

... thanks!

>I've not reviewed the whole thing -- just the "scrub start" code so
> far. I've removed the bits I've not checked from the file below.

I rebased the old branch I found to your current integration branch and
fixed up most of what you mentioned. I'll not send a new version out
until after your complete review (or your statement that this is it, or
your statement that you would rather go on reviewing the revised
version).

Things I ripped out are accepted and corrected without resistance.
Comments follow.

> On Wed, Mar 30, 2011 at 06:53:12PM +0200, Jan Schmidt wrote:
> 
>No commit message at all?

Didn't know what to put there. Cover letter says it all. And as
mentioned, this is the initial implementation.

>> Signed-off-by: Jan Schmidt 
>> ---
>>  scrub.c | 1568 +++
>>  1 files changed, 1568 insertions(+), 0 deletions(-)
> 
>This is quite big to review in one lump... Is it possible to split
> the patch into functional sections? (Add shared infrastructure, then
> each of the four functions separately, maybe?)

Thought about that, but it doesn't make sense to me. It is the initial
implementation. A lot of the code is shared, thus adding one lump and
patching the patch with four small additional commits wouldn't help much.

>> diff --git a/scrub.c b/scrub.c
>> new file mode 100644
>> index 000..22052ed
>> --- /dev/null
>> +++ b/scrub.c
>> +#define SCRUB_DATA_FILE "/var/btrfs/scrub.status"
>> +#define SCRUB_PROGRESS_SOCKET_PATH "/var/btrfs/scrub.progress"
> 
>I'd suggest /var/lib/btrfs/[...] instead. Putting it in the top
> level of /var seems a bit presumptuous (and contravenes the FHS).

I wasn't sure whether I could expect /var/lib to be present everywhere
btrfs could run. But I changed it to what you suggested.

>> +printf("\ttotal bytes scrubbed: %s with %llu errors\n",
>> +pretty_sizes(p->data_bytes_scrubbed + p->tree_bytes_scrubbed),
>> +max(err_cnt, err_cnt2));
> 
>Memory leak: pretty_sizes() mallocs space for its result.

Pah... In a user space function of a run-once utility right before it
exits. But I fixed that one, just to please you :-)
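
For illustration only, a minimal sketch of the kind of fix discussed here.
sketch_pretty_size() is a hypothetical stand-in for pretty_sizes() (the only
property taken from the thread is that the result is malloc()ed), and
print_scrubbed() is an invented wrapper, not code from scrub.c. The point is
simply to keep the returned pointer in a variable so it can be freed once the
line has been printed:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for pretty_sizes(): returns a malloc()ed string
 * that the caller has to free, or NULL on allocation failure.
 */
char *sketch_pretty_size(unsigned long long bytes)
{
	char *s = malloc(32);

	if (s)
		snprintf(s, 32, "%lluKB", bytes / 1024);
	return s;
}

/* keep the pointer so that it can be freed after printing */
int print_scrubbed(FILE *out, unsigned long long bytes,
		   unsigned long long errors)
{
	char *size;
	int ret;

	assert(out);
	size = sketch_pretty_size(bytes);
	if (!size)
		return -1;

	ret = fprintf(out, "\ttotal bytes scrubbed: %s with %llu errors\n",
		      size, errors);
	free(size);	/* passing the call straight to printf would leak this */
	return ret < 0 ? -1 : 0;
}
```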

>> +static void init_fs_stat(struct scrub_fs_stat *fs_stat)
>> +{
>> +memset(fs_stat, 0, sizeof(*fs_stat));
>> +fs_stat->s.finished = 2;
> 
>What does 2 mean? ->s.finished seems to be a boolean everywhere
> except here. Can you turn this value into a more descriptive #define?
> Or just use 1?

Good question. I guess I once wanted to distinguish really finished
scrub runs from not-even-started ones. I changed it to 1 (which makes it
much more likely we'll need that distinction quite soon).
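
If that distinction does come back, one way to make the value 2 self-describing
would be named constants. The sketch below is an assumption throughout: the
names are invented, and the meaning given to each value is only a guess based
on the exchange above, not something taken from scrub.c.

```c
#include <assert.h>

/*
 * Hypothetical names for this sketch. scrub.c itself stores a plain
 * integer in ->s.finished.
 */
enum sketch_scrub_state {
	SKETCH_SCRUB_UNFINISHED = 0,	/* started, but not completed */
	SKETCH_SCRUB_FINISHED = 1,	/* ran to completion */
	SKETCH_SCRUB_NEVER_STARTED = 2,	/* freshly initialized, no run yet */
};

/* only a completed run has statistics worth reporting */
int sketch_scrub_has_results(int finished)
{
	assert(finished >= SKETCH_SCRUB_UNFINISHED &&
	       finished <= SKETCH_SCRUB_NEVER_STARTED);
	return finished == SKETCH_SCRUB_FINISHED;
}
```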

>> +static int cancel_fd = -1;
>> +static void scrub_sigint_record_progress(int signal)
> 
>What does this function have to do with recording progress? Seems a
> bit of a misnomer to me. (Call it scrub_sigint_cancel_scrub, maybe?)

I added a comment and left the name unchanged.

>> +{
>> +ioctl(cancel_fd, BTRFS_IOC_SCRUB_CANCEL, NULL);
>> +}
>> +
>> +static int scrub_handle_sigint_parent(void)
>> +{
>> +struct sigaction sa = {
>> +.sa_handler = SIG_IGN,
>> +.sa_flags = SA_RESTART,
>> +};
>> +
>> +return sigaction(SIGINT, &sa, NULL);
>> +}
>> +
>> +static int scrub_handle_sigint_child(int fd)
>> +{
>> +struct sigaction sa = {
>> +.sa_handler = fd == -1 ? SIG_DFL : scrub_sigint_record_progress,
>> +};
>> +
>> +cancel_fd = fd;
>> +return sigaction(SIGINT, &sa, NULL);
>> +}
>> +
>> +static int _scrub_datafile(const char *fn_base, const char *fn_local,
>> +   const char *fn_tmp, char *datafile, int max)
>> +{
>> +int ret;
>> +
>> +strncpy(datafile, fn_base, max);
> 
>You need to put a zero byte at datafile[max], otherwise it could be
> unterminated (if max <= strlen(fn_base)), and the strlen will then run
> off the end of the string.

Damn. strncpy is a mess. I want strlcpy.

I modified the code another way: I would rather return an error than throw
away bytes and continue happily.

The third argument to strncpy is now always the buffer size - 1, thus we
always have a 0 byte at the end of the buffer. I then compare strlen to
the buffer size.
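
A minimal sketch of that idea, not the code that went into scrub.c:
copy_checked() is a hypothetical helper, and it differs from the description
above in two ways. It writes the terminating zero byte itself instead of
relying on the buffer contents, and it detects truncation by comparing the
two string lengths, which also accepts a string that fits exactly.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from scrub.c: copy src into dst (which
 * holds size bytes) and report truncation as an error.
 */
int copy_checked(char *dst, const char *src, size_t size)
{
	assert(dst && src);
	if (size == 0)
		return -EOVERFLOW;

	/* copy at most size - 1 bytes so the last byte stays free ... */
	strncpy(dst, src, size - 1);
	/* ... and always holds the terminating zero byte */
	dst[size - 1] = '\0';

	/* if src did not fit, dst is now a strict prefix of it */
	if (strlen(dst) < strlen(src))
		return -EOVERFLOW;
	return 0;
}
```

With an 8 byte buffer, a 7 character string is copied completely and an
8 character string is rejected with -EOVERFLOW.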

>> +ret = strlen(datafile);
>> +
>> +if (ret + 1 >= max)
>> +return -EOVERFLOW;
> 
>This will never happen (if you put the zero terminator in)
> 
>> +datafile[ret] = '.';
>> +strncpy(datafile+ret+1, fn_local, max-ret-1);
> 
>... and add a zero byte here, too (or use strncat)
> 
>> +ret = strlen(datafile);
>> +
>> +if (ret + 1 >= max)
>> +return -EOVERFLOW;
> 
>as above: won't happen
> 
>> +if (fn_tmp) {
>> +datafile[ret] = '_';
>> +strncpy(datafile+ret+1, fn_t

Re: btrfs hang in flush-btrfs-5

2011-07-11 Thread Josef Bacik
On 07/11/2011 07:40 AM, Jeremy Sanders wrote:
> Jeremy Sanders wrote:
> 
>> Hi - I'm trying btrfs with kernel 2.6.38.8-32.fc15.x86_64 (a Fedora
>> kernel). I'm just doing a tar-to-tar copy onto the file system with
>> compress-force=zlib. Here are some traces of the stuck processes.
> 
> I've managed to reproduce the hang using the latest btrfs from the
> repository. I had to remove some of the tracing lines, and an ioctl which
> wasn't defined, to get it to compile under 2.6.38.8. Here is where it is
> stuck:
> 

Hrm well that is just unlikely and hard to hit.  Will you try this and
see if it helps you?  Thanks,

Josef

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 59cbdb1..3c8c435 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1081,7 +1081,8 @@ static noinline int prepare_pages(struct btrfs_root *root, struct file *file,

 again:
for (i = 0; i < num_pages; i++) {
-   pages[i] = grab_cache_page(inode->i_mapping, index + i);
+   pages[i] = find_or_create_page(inode->i_mapping, index + i,
+  GFP_NOFS);
if (!pages[i]) {
faili = i - 1;
err = -ENOMEM;
--


[PATCH] Btrfs: use find_or_create_page instead of grab_cache_page

2011-07-11 Thread Josef Bacik
grab_cache_page will use mapping_gfp_mask(), which for all inodes is set to
GFP_HIGHUSER_MOVABLE.  So instead use find_or_create_page in all cases where we
need GFP_NOFS so we don't deadlock.  Thanks,

Signed-off-by: Josef Bacik 
---
 fs/btrfs/file.c             |    3 ++-
 fs/btrfs/free-space-cache.c |    4 ++--
 fs/btrfs/inode.c            |    2 +-
 fs/btrfs/ioctl.c            |    4 ++--
 fs/btrfs/relocation.c       |    3 ++-
 5 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 59cbdb1..3c8c435 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1081,7 +1081,8 @@ static noinline int prepare_pages(struct btrfs_root *root, struct file *file,
 
 again:
for (i = 0; i < num_pages; i++) {
-   pages[i] = grab_cache_page(inode->i_mapping, index + i);
+   pages[i] = find_or_create_page(inode->i_mapping, index + i,
+  GFP_NOFS);
if (!pages[i]) {
faili = i - 1;
err = -ENOMEM;
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index fd7fa2a..6377713 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -307,7 +307,7 @@ int __load_free_space_cache(struct btrfs_root *root, struct inode *inode,
if (!num_entries && !num_bitmaps)
break;
 
-   page = grab_cache_page(inode->i_mapping, index);
+   page = find_or_create_page(inode->i_mapping, index, GFP_NOFS);
if (!page)
goto free_cache;
 
@@ -594,7 +594,7 @@ int __btrfs_write_out_cache(struct btrfs_root *root, struct inode *inode,
 * know and don't freak out.
 */
while (index < num_pages) {
-   page = grab_cache_page(inode->i_mapping, index);
+   page = find_or_create_page(inode->i_mapping, index, GFP_NOFS);
if (!page) {
int i;
 
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e6ede58..faf516e 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3398,7 +3398,7 @@ static int btrfs_truncate_page(struct address_space *mapping, loff_t from)
 
ret = -ENOMEM;
 again:
-   page = grab_cache_page(mapping, index);
+   page = find_or_create_page(mapping, index, GFP_NOFS);
if (!page) {
btrfs_delalloc_release_space(inode, PAGE_CACHE_SIZE);
goto out;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a3c4751..09c9a8d 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -867,8 +867,8 @@ again:
/* step one, lock all the pages */
for (i = 0; i < num_pages; i++) {
struct page *page;
-   page = grab_cache_page(inode->i_mapping,
-   start_index + i);
+   page = find_or_create_page(inode->i_mapping,
+   start_index + i, GFP_NOFS);
if (!page)
break;
 
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 5e0a3dc..59bb176 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2955,7 +2955,8 @@ static int relocate_file_extent_cluster(struct inode *inode,
page_cache_sync_readahead(inode->i_mapping,
  ra, NULL, index,
  last_index + 1 - index);
-   page = grab_cache_page(inode->i_mapping, index);
+   page = find_or_create_page(inode->i_mapping, index,
+  GFP_NOFS);
if (!page) {
btrfs_delalloc_release_metadata(inode,
PAGE_CACHE_SIZE);
-- 
1.7.5.2



Re: Memory leak?

2011-07-11 Thread Chris Mason
Excerpts from Stephane Chazelas's message of 2011-07-11 05:01:21 -0400:
> 2011-07-10 19:37:28 +0100, Stephane Chazelas:
> > 2011-07-10 08:44:34 -0400, Chris Mason:
> > [...]
> > > Great, we're on the right track.  Does it trigger with mount -o compress
> > > instead of mount -o compress_force?
> > [...]
> > 
> > It does trigger. I get that same "invalid opcode".
> > 
> > BTW, I tried with CONFIG_SLUB and slub_debug and no more useful
> > information than with SLAB_DEBUG.
> > 
> > I'm trying now without dmcrypt. Then I won't have much bandwidth
> > for testing.
> [...]
> 
> Same without dmcrypt. So to sum up, BUG() reached in btrfs-fixup
> thread when doing an 

This is some fantastic debugging, thank you.  I know you tested with
slab debugging turned on, did you have CONFIG_DEBUG_PAGEALLOC on as
well?

It's probably something to do with a specific file, but pulling that
file out without extra printks is going to be tricky.  I'll see if I can
reproduce it here.

-chris


Re: [PATCH 2/2] add detailed help messages to btrfs command

2011-07-11 Thread Jan Schmidt
Hi Hubert,

I have to admit I did not notice this patch at the time, but now Hugo is
forcing me to use the "detailed help messages" and I've got an improvement
to suggest:

On 23.01.2011 13:42, Hubert Kario wrote:
> extend the
> 
> btrfs  --help
> 
> command to print detailed help message if available but fallback to
> basic help message if detailed is unavailable
> 
> add detailed help message for 'filesystem defragment' command
> 
> little tweaks in comments
> 
> Signed-off-by: Hubert Kario 
> ---
>  btrfs.c |  101 ++
>  1 files changed, 68 insertions(+), 33 deletions(-)
> 
> diff --git a/btrfs.c b/btrfs.c
> index b84607a..bd6f6f8 100644
> --- a/btrfs.c
> +++ b/btrfs.c
> @@ -23,6 +23,9 @@
>  #include "btrfs_cmds.h"
>  #include "version.h"
>  
> +#define BASIC_HELP 0
> +#define ADVANCED_HELP 1
> +
>  typedef int (*CommandFunction)(int argc, char **argv);
>  
>  struct Command {
> @@ -31,8 +34,10 @@ struct Command {
>  if >= 0, number of arguments,
>  if < 0, _minimum_ number of arguments */
>   char*verb;  /* verb */
> - char*help;  /* help lines; form the 2nd onward they are
> -indented */
> + char*help;  /* help lines; from the 2nd line onward they 
> +   are automatically indented */
> +char*adv_help;  /* advanced help message; from the 2nd line 
> +   onward they are automatically indented */
>   /* the following fields are run-time filled by the program */
>   char**cmds; /* array of subcommands */
> @@ -47,73 +52,96 @@ static struct Command commands[] = {
>   { do_clone, 2,
> "subvolume snapshot", " [/]\n"
>   "Create a writable snapshot of the subvolume  with\n"
> - "the name  in the  directory."
> + "the name  in the  directory.",
> +  NULL
>   },
>   { do_delete_subvolume, 1,
> "subvolume delete", "\n"
> - "Delete the subvolume ."
> + "Delete the subvolume .",
> +  NULL
>   },
>   { do_create_subvol, 1,
> "subvolume create", "[/]\n"
>   "Create a subvolume in  (or the current directory if\n"
> - "not passed)."
> + "not passed).",
> +  NULL
>   },
>   { do_subvol_list, 1, "subvolume list", "\n"
> - "List the snapshot/subvolume of a filesystem."
> + "List the snapshot/subvolume of a filesystem.",
> +  NULL
>   },
>   { do_find_newer, 2, "subvolume find-new", " \n"
> - "List the recently modified files in a filesystem."
> + "List the recently modified files in a filesystem.",
> +  NULL
>   },
>   { do_defrag, -1,
> "filesystem defragment", "[-vcf] [-s start] [-l len] [-t size] 
> | [|...]\n"
> - "Defragment a file or a directory."
> + "Defragment a file or a directory.",
> +  "[-vcf] [-s start] [-l len] [-t size] | 
> [|...]\n"
> +  "Defragment file data or directory metadata.\n"
> +"-v be verbose\n"
> +"-c compress the file while defragmenting\n"
> +"-f flush data to disk immediately after 
> defragmenting\n"
> +"-s start   defragment only from byte onward\n"
> +"-l len defragment only up to len bytes\n"
> +"-t sizeminimal size of file to be considered for 
> defragmenting\n"

Lots of overlong lines.

I don't like repeating the synopsis. How about printing the general
->help along with ->adv_help? That would remove the need for
duplication.

To prove my point, looking at the current version in Hugo's integration
branch, your two synopsis lines already got inconsistent regarding the
-c option :-)
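
A rough sketch of what that could look like, under stated assumptions:
struct cmd_help is a cut-down, hypothetical stand-in for struct Command,
format_help() is an invented name, and the output layout is made up for the
example. The synopsis lives only in ->help, and ->adv_help carries just the
option details, so the two cannot drift apart.

```c
#include <assert.h>
#include <stdio.h>

/* hypothetical, cut-down stand-in for struct Command in btrfs.c */
struct cmd_help {
	const char *verb;	/* e.g. "filesystem defragment" */
	const char *help;	/* synopsis and short description */
	const char *adv_help;	/* option details only, or NULL */
};

/*
 * Format the help text for one command into out. The synopsis always
 * comes from ->help; ->adv_help is appended only when advanced help
 * was requested and the command provides it. Returns what snprintf
 * returns, i.e. the length of the complete text.
 */
int format_help(const struct cmd_help *c, int advanced,
		char *out, size_t size)
{
	assert(c && c->verb && c->help && out);

	if (advanced && c->adv_help)
		return snprintf(out, size, "btrfs %s %s\n%s\n",
				c->verb, c->help, c->adv_help);
	return snprintf(out, size, "btrfs %s %s\n", c->verb, c->help);
}
```

A command without ->adv_help falls back to the basic text even when advanced
help is requested, which matches the fallback behaviour the patch describes.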

>   },
>   { do_set_default_subvol, 2,
> "subvolume set-default", " \n"
>   "Set the subvolume of the filesystem  which will be 
> mounted\n"
> - "as default."
> + "as default.",
> +  NULL
>   },
>   { do_fssync, 1,
> "filesystem sync", "\n"
> - "Force a sync on the filesystem ."
> + "Force a sync on the filesystem .",
> +  NULL
>   },
>   { do_resize, 2,
> "filesystem resize", "[+/-][gkm]|max \n"
>   "Resize the file system. If 'max' is passed, the filesystem\n"
> - "will occupe all available space on the device."
> + "will occupe all available space on the device.",
> +  NULL
>   },
>   { do_show_filesystem, 999,
> "filesystem show", "[|]\n"
>   "Show the info of a btrfs filesystem. If no  or \n"
> - "is passed, info of all the btrfs filesystem are shown."
> + "

Re: Memory leak?

2011-07-11 Thread Stephane Chazelas
2011-07-11 11:00:19 -0400, Chris Mason:
> Excerpts from Stephane Chazelas's message of 2011-07-11 05:01:21 -0400:
> > 2011-07-10 19:37:28 +0100, Stephane Chazelas:
> > > 2011-07-10 08:44:34 -0400, Chris Mason:
> > > [...]
> > > > Great, we're on the right track.  Does it trigger with mount -o compress
> > > > instead of mount -o compress_force?
> > > [...]
> > > 
> > > It does trigger. I get that same "invalid opcode".
> > > 
> > > BTW, I tried with CONFIG_SLUB and slub_debug and no more useful
> > > information than with SLAB_DEBUG.
> > > 
> > > I'm trying now without dmcrypt. Then I won't have much bandwidth
> > > for testing.
> > [...]
> > 
> > Same without dmcrypt. So to sum up, BUG() reached in btrfs-fixup
> > thread when doing an 
[...]
> > - CONFIG_SLAB_DEBUG, CONFIG_DEBUG_PAGEALLOC,
> >   CONFIG_DEBUG_SLAB_LEAK, slub_debug don't tell us anything
> >   useful (there's more info in /proc/slabinfo when
> >   CONFIG_SLAB_DEBUG is on, see below)
[...]
> This is some fantastic debugging, thank you.  I know you tested with
> slab debugging turned on, did you have CONFIG_DEBUG_PAGEALLOC on as
> well?

Yes when using SLAB, not when using SLUB.

> It's probably something to do with a specific file, but pulling that
> file out without extra printks is going to be tricky.  I'll see if I can
> reproduce it here.
[...]

For one occurrence, I know what file was being transferred at the
time of the crash (looking at ctimes on the dest FS, see one of
my earlier emails). And after just checking on the latest BUG(),
it's a different one.

Also, when I resume the rsync (so it doesn't transfer the
already transferred files), it does BUG() again.

regards,
Stephane


Re: Memory leak?

2011-07-11 Thread Chris Mason
Excerpts from Stephane Chazelas's message of 2011-07-11 11:35:56 -0400:
> 2011-07-11 11:00:19 -0400, Chris Mason:
> > Excerpts from Stephane Chazelas's message of 2011-07-11 05:01:21 -0400:
> > > 2011-07-10 19:37:28 +0100, Stephane Chazelas:
> > > > 2011-07-10 08:44:34 -0400, Chris Mason:
> > > > [...]
> > > > > Great, we're on the right track.  Does it trigger with mount -o 
> > > > > compress
> > > > > instead of mount -o compress_force?
> > > > [...]
> > > > 
> > > > It does trigger. I get that same "invalid opcode".
> > > > 
> > > > BTW, I tried with CONFIG_SLUB and slub_debug and no more useful
> > > > information than with SLAB_DEBUG.
> > > > 
> > > > I'm trying now without dmcrypt. Then I won't have much bandwidth
> > > > for testing.
> > > [...]
> > > 
> > > Same without dmcrypt. So to sum up, BUG() reached in btrfs-fixup
> > > thread when doing an 
> [...]
> > > - CONFIG_SLAB_DEBUG, CONFIG_DEBUG_PAGEALLOC,
> > >   CONFIG_DEBUG_SLAB_LEAK, slub_debug don't tell us anything
> > >   useful (there's more info in /proc/slabinfo when
> > >   CONFIG_SLAB_DEBUG is on, see below)
> [...]
> > This is some fantastic debugging, thank you.  I know you tested with
> > slab debugging turned on, did you have CONFIG_DEBUG_PAGEALLOC on as
> > well?
> 
> Yes when using SLAB, not when using SLUB.
> 
> > It's probably something to do with a specific file, but pulling that
> > file out without extra printks is going to be tricky.  I'll see if I can
> > reproduce it here.
> [...]
> 
> For one occurrence, I know what file was being transferred at the
> time of the crash (looking at ctimes on the dest FS, see one of
> my earlier emails). And after just checking on the latest BUG(),
> it's a different one.
> 
> Also, when I resume the rsync (so it doesn't transfer the
> already transferred files), it does BUG() again.

Ok, could you please send along the exact rsync command you were
running?

Thanks,
Chris


Re: Memory leak?

2011-07-11 Thread Stephane Chazelas
2011-07-11 12:25:51 -0400, Chris Mason:
[...]
> > Also, when I resume the rsync (so it doesn't transfer the
> > already transferred files), it does BUG() again.
> 
> Ok, could you please send along the exact rsync command you were
> running?
[...]

I did earlier, but here it is again:

rsync --archive --xattrs --hard-links --numeric-ids --sparse --acls /src/ /dst/

Also with:

(cd /src && bsdtar cf - .) | pv | (cd /dst && bsdtar -xpSf - --numeric-owner)

-- 
Stephane



last_index variable in btrfs_buffered_write function

2011-07-11 Thread João Eduardo Luís
Hello.

Am I reading the code the wrong way, or is the 'last_index' variable in 
'__btrfs_buffered_write()' (and previously used in 'btrfs_file_aio_write()') 
irrelevant? 

It appears to just be used in 'prepare_pages()', passed as an argument, but 
never actually used by this function.

Furthermore, I'm not sure what is intended with this variable, but if the idea
is to assign it the index of the last page in the range, then I would say that
instead of

> last_index = (pos + iov_iter_count(i)) >> PAGE_CACHE_SHIFT;

it should be

>  last_index = (pos + iov_iter_count(i) - 1) >> PAGE_CACHE_SHIFT;

Then again, I may be missing something.
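
To make the difference concrete, here is a small user-space sketch of the two
expressions. It assumes 4 KiB pages (a shift of 12) in place of
PAGE_CACHE_SHIFT, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* assumption for this sketch: 4 KiB pages, as on x86 */
#define SKETCH_PAGE_SHIFT 12

/* index of the page that holds the first byte after the write range */
uint64_t index_past_end(uint64_t pos, uint64_t count)
{
	return (pos + count) >> SKETCH_PAGE_SHIFT;
}

/* index of the page that holds the last byte of the write range */
uint64_t index_of_last_byte(uint64_t pos, uint64_t count)
{
	assert(count > 0);	/* an empty range has no last byte */
	return (pos + count - 1) >> SKETCH_PAGE_SHIFT;
}
```

For a 4096 byte write at offset 0, index_past_end() gives 1 while
index_of_last_byte() gives 0. The two only differ when the write ends exactly
on a page boundary.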

Cheers.

---
João Eduardo Luís
gpg key: 477C26E5 from pool.keyserver.eu


[PATCH v2 1/3] Btrfs-progs: btrfs-list: split list_subvols

2011-07-11 Thread Jan Schmidt
Split list_subvols into separate functions and do the printing only in the
calling function. This lets us make use of those functions when resolving
logical addresses.

Signed-off-by: Jan Schmidt 
---
 btrfs-list.c |  104 ++---
 1 files changed, 69 insertions(+), 35 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 07b179a..dd685c2 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -199,10 +199,9 @@ static int add_root(struct root_lookup *root_lookup,
  * This can't be called until all the root_info->path fields are filled
  * in by lookup_ino_path
  */
-static int resolve_root(struct root_lookup *rl, struct root_info *ri, int print_parent)
+static int resolve_root(struct root_lookup *rl, struct root_info *ri,
+   u64 *root_id, u64 *parent_id, u64 *top_id, char **path)
 {
-   u64 top_id;
-   u64 parent_id = 0;
char *full_path = NULL;
int len = 0;
struct root_info *found;
@@ -211,6 +210,7 @@ static int resolve_root(struct root_lookup *rl, struct root_info *ri, int print_
 * we go backwards from the root_info object and add pathnames
 * from parent directories as we go.
 */
+   *parent_id = 0;
found = ri;
while (1) {
char *tmp;
@@ -234,13 +234,12 @@ static int resolve_root(struct root_lookup *rl, struct root_info *ri, int print_
 
next = found->ref_tree;
/* record the first parent */
-   if ( parent_id == 0 ) {
-   parent_id = next;
-   }
+   if (*parent_id == 0)
+   *parent_id = next;
 
/* if the ref_tree refers to ourselves, we're at the top */
if (next == found->root_id) {
-   top_id = next;
+   *top_id = next;
break;
}
 
@@ -250,20 +249,15 @@ static int resolve_root(struct root_lookup *rl, struct root_info *ri, int print_
 */
found = tree_search(&rl->root, next);
if (!found) {
-   top_id = next;
+   *top_id = next;
break;
}
}
-   if (print_parent) {
-   printf("ID %llu parent %llu top level %llu path %s\n",
-  (unsigned long long)ri->root_id, (unsigned long 
long)parent_id, (unsigned long long)top_id,
-   full_path);
-   } else {
-   printf("ID %llu top level %llu path %s\n",
-  (unsigned long long)ri->root_id, (unsigned long 
long)top_id,
-   full_path);
-   }
-   free(full_path);
+
+   *root_id = ri->root_id;
+   *parent_id = ri->root_id;
+   *path = full_path;
+
return 0;
 }
 
@@ -560,10 +554,8 @@ build:
return full;
 }
 
-int list_subvols(int fd, int print_parent)
+static int __list_subvol_search(int fd, struct root_lookup *root_lookup)
 {
-   struct root_lookup root_lookup;
-   struct rb_node *n;
int ret;
struct btrfs_ioctl_search_args args;
struct btrfs_ioctl_search_key *sk = &args.key;
@@ -574,9 +566,11 @@ int list_subvols(int fd, int print_parent)
char *name;
u64 dir_id;
int i;
-   int e;
 
-   root_lookup_init(&root_lookup);
+   root_lookup_init(root_lookup);
+   memset(&args, 0, sizeof(args));
+
+   root_lookup_init(root_lookup);
 
memset(&args, 0, sizeof(args));
 
@@ -603,12 +597,8 @@ int list_subvols(int fd, int print_parent)
 
while(1) {
ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
-   e = errno;
-   if (ret < 0) {
-   fprintf(stderr, "ERROR: can't perform the search - 
%s\n",
-   strerror(e));
+   if (ret < 0)
return ret;
-   }
/* the ioctl returns the number of item it found in nr_items */
if (sk->nr_items == 0)
break;
@@ -629,7 +619,7 @@ int list_subvols(int fd, int print_parent)
name = (char *)(ref + 1);
dir_id = btrfs_stack_root_ref_dirid(ref);
 
-   add_root(&root_lookup, sh->objectid, sh->offset,
+   add_root(root_lookup, sh->objectid, sh->offset,
 dir_id, name, name_len);
}
 
@@ -657,11 +647,15 @@ int list_subvols(int fd, int print_parent)
} else
break;
}
-   /*
-* now we have an rbtree full of root_info objects, but we need to fill
-* in their path names within the subvol that is referencing each one.
-*/
-   n = rb_first(&root_lookup.root);
+
+   return 0;
+}
+
+static int __l

[PATCH v2 2/3] Btrfs-progs: added ioctls and commands to resolve inodes and logical addrs

2011-07-11 Thread Jan Schmidt
two new commands that make use of the new path resolving functions
implemented for scrub, doing the resolving in-kernel. the result for both
commands is a list of files belonging to that inode / logical address.

Signed-off-by: Jan Schmidt 
---
 btrfs-list.c |   35 
 btrfs.c  |   10 +++
 btrfs_cmds.c |  177 ++
 btrfs_cmds.h |3 +
 ioctl.h  |   29 ++
 5 files changed, 254 insertions(+), 0 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index dd685c2..cbf6a08 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -900,3 +900,38 @@ int find_updated_files(int fd, u64 root_id, u64 oldest_gen)
printf("transid marker was %llu\n", (unsigned long long)max_found);
return ret;
 }
+
+char *path_for_root(int fd, u64 root)
+{
+   struct root_lookup root_lookup;
+   struct rb_node *n;
+   char *ret_path = NULL;
+   int ret;
+
+   ret = __list_subvol_search(fd, &root_lookup);
+   if (ret < 0)
+   return ERR_PTR(ret);
+
+   ret = __list_subvol_fill_paths(fd, &root_lookup);
+   if (ret < 0)
+   return ERR_PTR(ret);
+
+   n = rb_last(&root_lookup.root);
+   while (n) {
+   struct root_info *entry;
+   u64 root_id;
+   u64 parent_id;
+   u64 level;
+   char *path;
+   entry = rb_entry(n, struct root_info, rb_node);
+   resolve_root(&root_lookup, entry, &root_id, &parent_id, &level,
+   &path);
+   if (root_id == root)
+   ret_path = path;
+   else
+   free(path);
+   n = rb_prev(n);
+   }
+
+   return ret_path;
+}
diff --git a/btrfs.c b/btrfs.c
index 67d6f6f..5d483ed 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -178,6 +178,16 @@ static struct Command commands[] = {
"Remove a device from a filesystem.",
  NULL
},
+   { do_ino_to_path, -2,
+ "inspect-internal inode-resolve", "[-v]  \n"
+   "get file system paths for the given inode.",
+ NULL
+   },
+   { do_logical_to_ino, -2,
+ "inspect-internal logical-resolve", "[-v] [-P]  \n"
+   "get file system paths for the given logical address.",
+ NULL
+   },
{ 0, 0, 0, 0 }
 };
 
diff --git a/btrfs_cmds.c b/btrfs_cmds.c
index 0612f34..2db5d31 100644
--- a/btrfs_cmds.c
+++ b/btrfs_cmds.c
@@ -1545,3 +1545,180 @@ int do_df_filesystem(int nargs, char **argv)
 
return 0;
 }
+
+static int __ino_to_path_fd(u64 inum, int fd, int verbose, const char *prepend)
+{
+   int ret;
+   int i;
+   struct btrfs_ioctl_ino_path_args ipa;
+   struct btrfs_data_container *fspath;
+
+   fspath = malloc(4096);
+   if (!fspath)
+   return 1;
+
+   ipa.inum = inum;
+   ipa.size = 4096;
+   ipa.fspath = fspath;
+
+   ret = ioctl(fd, BTRFS_IOC_INO_PATHS, &ipa);
+   if (ret) {
+   printf("ioctl ret=%d, error: %s\n", ret, strerror(errno));
+   goto out;
+   }
+
+   if (verbose)
+   printf("ioctl ret=%d, size=%d, cnt=%d, missed=%d\n", ret,
+   fspath->size, fspath->elem_cnt, fspath->elem_missed);
+
+   for (i = 0; i < fspath->elem_cnt; ++i) {
+   fspath->str[i] += (unsigned long)fspath->str;
+   if (prepend)
+   printf("%s/%s\n", prepend, fspath->str[i]);
+   else
+   printf("%s\n", fspath->str[i]);
+   }
+
+out:
+   free(fspath);
+   return ret;
+}
+
+int do_ino_to_path(int nargs, char **argv)
+{
+   int fd;
+   int verbose = 0;
+
+   optind = 1;
+   while (1) {
+   int c = getopt(nargs, argv, "v");
+   if (c < 0)
+   break;
+   switch (c) {
+   case 'v':
+   verbose = 1;
+   break;
+   default:
+   fprintf(stderr, "invalid arguments for ipath\n");
+   return 1;
+   }
+   }
+   if (nargs - optind != 2) {
+   fprintf(stderr, "invalid arguments for ipath\n");
+   return 1;
+   }
+
+   fd = open_file_or_dir(argv[optind+1]);
+   if (fd < 0) {
+   fprintf(stderr, "ERROR: can't access '%s'\n", argv[optind+1]);
+   return 12;
+   }
+
+   return __ino_to_path_fd(atoll(argv[optind]), fd, verbose,
+   argv[optind+1]);
+}
+
+int do_logical_to_ino(int nargs, char **argv)
+{
+   int ret;
+   int fd;
+   int i;
+   int verbose = 0;
+   int getpath = 1;
+   int bytes_left;
+   struct btrfs_ioctl_logical_ino_args loi;
+   struct btrfs_data_container *inodes;
+   char full_path[4096];
+   char *path_ptr;
+
+   optind = 1;
+   

[PATCH v2 3/3] Btrfs-progs: added resolve commands to man page

2011-07-11 Thread Jan Schmidt
Added "inspect-internal inode-resolve" and "inspect-internal
logical-resolve" to the btrfs(8) man page.

Signed-off-by: Jan Schmidt 
---
 man/btrfs.8.in |   29 +
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index 84a60cd..6e0568b 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -51,6 +51,11 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBscrub status\fP [-d] {\fI\fP|\fI\fP}
 .PP
+\fBbtrfs\fP \fBinspect-internal inode-resolve\fP [-v] \fI\fP 
\fI\fP
+.PP
+\fBbtrfs\fP \fBinspect-internal logical-resolve\fP
+[-Pv] \fI\fP \fI\fP
+.PP
 \fBbtrfs\fP \fBhelp|\-\-help|\-h \fP\fI\fP
 .PP
 \fBbtrfs\fP \fB \-\-help \fP\fI\fP
@@ -286,6 +291,30 @@ for that filesystem or device.
 .IP -d 5
 Print separate statistics for each device of the filesystem.
 .RE
+.TP
+
+\fBinspect-internal inode-resolve\fP [-v] \fI\fP \fI\fP
+Resolves an  in subvolume  to all filesystem paths.
+.RS
+
+\fIOptions\fR
+.IP -v 5
+verbose mode. print count of returned paths and ioctl() return value
+.RE
+.TP
+
+\fBinspect-internal logical-resolve\fP [-Pv] \fI\fP \fI\fP
+Resolves a  address in the filesystem mounted at  to all inodes.
+By default, each inode is then resolved to a file system path (similar to the
+\fBinode-resolve\fP subcommand).
+.RS
+
+\fIOptions\fR
+.IP -P 5
+skip the path resolving and print the inodes instead
+.IP -v 5
+verbose mode. print count of returned paths and all ioctl() return values
+.RE
 
 .PP
 
-- 
1.7.3.4



[PATCH v2 0/3] Btrfs-progs: add the first "inspect-internal" commands

2011-07-11 Thread Jan Schmidt
This is the follow up to the patch series
   "[PATCH v1 0/2] Btrfs-progs: commands "resolve inode" and "resolve logical"
I chose to change the subject as the command names changed.

Changes v1->v2:
- commands renamed as suggested by Goffredo Baroncelli
- man pages added

The kernel patch series just sent (Subject: "Btrfs: scrub: print path to
corrupted files and trigger nodatasum fixup") introduces two new ioctls to
do in-kernel filesystem path construction. This series provides the
corresponding userspace changes, adding two new commands to the btrfs utility.

These patches are based on Hugo's current integration branch.

Please try them out and report bugs here.

-Jan

Jan Schmidt (3):
  Btrfs-progs: btrfs-list: split list_subvols
  Btrfs-progs: added ioctls and commands to resolve inodes and logical
addrs
  Btrfs-progs: added resolve commands to man page

 btrfs-list.c   |  139 +---
 btrfs.c|   10 +++
 btrfs_cmds.c   |  177 
 btrfs_cmds.h   |3 +
 ioctl.h|   29 +
 man/btrfs.8.in |   29 +
 6 files changed, 352 insertions(+), 35 deletions(-)

-- 
1.7.3.4



Re: [PATCH 2/2] add detailed help messages to btrfs command

2011-07-11 Thread Goffredo Baroncelli
Hi, all.

What about generating the man page from the detailed btrfs help
messages?

My idea is the following:
before the source of the function associated with a command, we put a
comment containing the detailed help. The comment might look like this:

[...]
/*** man:btrfs subvolume create
 *
 *  btrfs subvolume create 
 * create a new subvolume
 *
 *  The command [b]btrfs subvolume create[b] is used.
 *
 ***/

void do_create_subvolume(int argc, char **argv)
{
[...]

A script extracts from the comment in the source both:
- the text for the man page
- the text for the detailed help.

So we can reach the following goals:
- the help is linked to the code
- it is less likely that we forget to update the message
- the man page and the help messages are always aligned
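
To make the extraction step concrete, here is a minimal sketch of the
per-line helpers such a script (or a small C tool) would need. The
function names are hypothetical, and the comment layout is only the one
proposed above; nothing like this exists in btrfs-progs yet.

```c
#include <string.h>

/*
 * Hypothetical helper: if the line opens a help block, return a pointer
 * to the command name that follows the opening tag, otherwise NULL.
 */
static inline const char *man_block_start(const char *line)
{
	static const char tag[] = "/*** man:";

	if (strncmp(line, tag, sizeof(tag) - 1) == 0)
		return line + sizeof(tag) - 1;
	return NULL;
}

/* Hypothetical helper: returns 1 if the line closes a help block. */
static inline int man_block_end(const char *line)
{
	return strncmp(line, " ***/", 5) == 0;
}

/*
 * Hypothetical helper: strip the leading " *" and at most one following
 * space from a line inside the block, leaving the help text itself.
 */
static inline const char *man_block_text(const char *line)
{
	if (line[0] == ' ' && line[1] == '*') {
		line += 2;
		if (*line == ' ')
			line++;
	}
	return line;
}
```

A driver would read each source file line by line, start collecting at
man_block_start(), stop at man_block_end(), and feed the collected
man_block_text() lines to both the man page and the help table
generators.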

BR
G.Baroncelli



On 07/11/2011 05:13 PM, Jan Schmidt wrote:
> Hi Hubert,
> 
> I have to admit I did not recognize this patch but now Hugo is forcing
> me to use the "detailed help messages" and I've got an improvement to
> suggest:
> 
> On 23.01.2011 13:42, Hubert Kario wrote:
>> extend the
>>
>> btrfs  --help
>>
>> command to print detailed help message if available but fallback to
>> basic help message if detailed is unavailable
>>
>> add detailed help message for 'filesystem defragment' command
>>
>> little tweaks in comments
>>
>> Signed-off-by: Hubert Kario 
>> ---
>>  btrfs.c |  101 
>> ++
>>  1 files changed, 68 insertions(+), 33 deletions(-)
>>
>> diff --git a/btrfs.c b/btrfs.c
>> index b84607a..bd6f6f8 100644
>> --- a/btrfs.c
>> +++ b/btrfs.c
>> @@ -23,6 +23,9 @@
>>  #include "btrfs_cmds.h"
>>  #include "version.h"
>>  
>> +#define BASIC_HELP 0
>> +#define ADVANCED_HELP 1
>> +
>>  typedef int (*CommandFunction)(int argc, char **argv);
>>  
>>  struct Command {
>> @@ -31,8 +34,10 @@ struct Command {
>> if >= 0, number of arguments,
>> if < 0, _minimum_ number of arguments */
>>  char*verb;  /* verb */
>> -char*help;  /* help lines; form the 2nd onward they are
>> -   indented */
>> +char*help;  /* help lines; from the 2nd line onward they 
>> +   are automatically indented */
>> +char*adv_help;  /* advanced help message; from the 2nd line 
>> +   onward they are automatically indented */
>>  /* the following fields are run-time filled by the program */
>>  char**cmds; /* array of subcommands */
>> @@ -47,73 +52,96 @@ static struct Command commands[] = {
>>  { do_clone, 2,
>>"subvolume snapshot", " [/]\n"
>>  "Create a writable snapshot of the subvolume  with\n"
>> -"the name  in the  directory."
>> +"the name  in the  directory.",
>> +  NULL
>>  },
>>  { do_delete_subvolume, 1,
>>"subvolume delete", "\n"
>> -"Delete the subvolume ."
>> +"Delete the subvolume .",
>> +  NULL
>>  },
>>  { do_create_subvol, 1,
>>"subvolume create", "[/]\n"
>>  "Create a subvolume in  (or the current directory if\n"
>> -"not passed)."
>> +"not passed).",
>> +  NULL
>>  },
>>  { do_subvol_list, 1, "subvolume list", "\n"
>> -"List the snapshot/subvolume of a filesystem."
>> +"List the snapshot/subvolume of a filesystem.",
>> +  NULL
>>  },
>>  { do_find_newer, 2, "subvolume find-new", " \n"
>> -"List the recently modified files in a filesystem."
>> +"List the recently modified files in a filesystem.",
>> +  NULL
>>  },
>>  { do_defrag, -1,
>>"filesystem defragment", "[-vcf] [-s start] [-l len] [-t size] 
>> | [|...]\n"
>> -"Defragment a file or a directory."
>> +"Defragment a file or a directory.",
>> +  "[-vcf] [-s start] [-l len] [-t size] | 
>> [|...]\n"
>> +  "Defragment file data or directory metadata.\n"
>> +"-v be verbose\n"
>> +"-c compress the file while defragmenting\n"
>> +"-f flush data to disk immediately after 
>> defragmenting\n"
>> +"-s start   defragment only from byte onward\n"
>> +"-l len defragment only up to len bytes\n"
>> +"-t sizeminimal size of file to be considered for 
>> defragmenting\n"
> 
> Lots of too long lines.
> 
> I don't like to repeat the synopsis passage. How about adding the
> general ->help when printing ->adv_help as well? This reduces the need
> of duplication.
> 
> To prove my point, looking at the current version in Hugo's integration
> branch, your two synopsis lines already got inconsistent regarding the
> -c option :-)
> 
>>  },
>>  { do_set_default_subvol, 2,
>>"subvolume set-default", " \n"
>>  

Re: [PATCH 2/2] add detailed help messages to btrfs command

2011-07-11 Thread Jan Schmidt
On 07/11/2011 08:38 PM, Goffredo Baroncelli wrote:
> what about generating the man page on the basis of the btrfs help
> detailed messages ?
> 
> My idea is the following:
> before the function source associated to the command we can put a
> comment with a detailed help. The comment may be:
> 
> [...]
> /*** man:btrfs subvolume create
>  *
>  *  btrfs subvolume create 
>  * create a new subvolume
>  *
>  *  The command [b]btrfs subvolume create[b] is used.
>  *
>  ***/
> 
> void do_create_subvolume(int argc, char **argv)
> {
> [...]

Reminds me of java, but nevertheless, I like the general idea.

> A script extracts from the comment in the source both:
> - the text for the man page
> - the text for the detailed help.

Does anybody have such a script around? I suppose we're not the first
ones writing help texts and man pages.

> So we can reach the following goals:
> - the help is linked to the code
> - is less likely to forget to update the message
> - the man page, the helps are always aligned

However, we will still need both a short and a long help. E.g. the full
text in the man page may be inappropriate as a --help message. Also, we
need a clever idea to get the indentation right in the man pages. I
fiddled a lot with the scrub man page to get the parameter indentation
right (that is, to get the second line describing a command line option
indented so that it starts below the text of the first line).

-Jan


Re: last_index variable in btrfs_buffered_write function

2011-07-11 Thread Mitch Harder
2011/7/11 João Eduardo Luís :
> Hello.
>
> Am I reading the code the wrong way, or is the 'last_index' variable in 
> '__btrfs_buffered_write()' (and previously used in 'btrfs_file_aio_write()') 
> irrelevant?
>
> It appears to just be used in 'prepare_pages()', passed as an argument, but 
> never actually used by this function.
>
> Furthermore, I'm not sure what is intended with this variable, but if the 
> idea is to assign it with the  last page in the range, then I would say that 
> instead of
>
>> last_index = (pos + iov_iter_count(i)) >> PAGE_CACHE_SHIFT;
>
> it should be
>
>>  last_index = (pos + iov_iter_count(i) - 1) >> PAGE_CACHE_SHIFT;
>
> Then again, I may be missing something.
>
> Cheers.
>

I came to the same conclusion a few months ago when looking at a bug
in the same area of code.

The calculation appears to be wrong, but since it's not used anywhere,
you can't say for certain.  :)

I just haven't gotten around to testing a patch to confirm the hypothesis.
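
For illustration only, here is a small userspace sketch of the two
calculations. It is not the kernel code: SKETCH_PAGE_SHIFT and both
function names are made up for this example, and 4 KiB pages are
assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* The formula as quoted above: index of the page at offset pos + count. */
static inline uint64_t last_index_as_posted(uint64_t pos, size_t count)
{
	return (pos + count) >> SKETCH_PAGE_SHIFT;
}

/*
 * The suggested variant: index of the page that holds the final byte
 * of the write. Only meaningful for count > 0.
 */
static inline uint64_t last_index_inclusive(uint64_t pos, size_t count)
{
	return (pos + count - 1) >> SKETCH_PAGE_SHIFT;
}
```

The two only differ when the write ends exactly on a page boundary: for
pos = 0 and count = 4096 the posted formula yields page 1, although the
last byte written lives in page 0.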


Re: [PATCH 2/2] add detailed help messages to btrfs command

2011-07-11 Thread Goffredo Baroncelli
On 07/11/2011 09:11 PM, Jan Schmidt wrote:
> On 07/11/2011 08:38 PM, Goffredo Baroncelli wrote:
>> what about generating the man page on the basis of the btrfs help
>> detailed messages ?
>>
>> My idea is the following:
>> before the function source associated to the command we can put a
>> comment with a detailed help. The comment may be:
>>
>> [...]
>> /*** man:btrfs subvolume create
>>  *
>>  *  btrfs subvolume create 
>>  * create a new subvolume
>>  *
>>  *  The command [b]btrfs subvolume create[b] is used.
>>  *
>>  ***/
>>
>> void do_create_subvolume(int argc, char **argv)
>> {
>> [...]
> 
> Reminds me of java, but nevertheless, I like the general idea.
> 
>> A script extracts from the comment in the source both:
>> - the text for the man page
>> - the text for the detailed help.
> 
> Does anybody have such a script around? I suppose we're not the first
> ones writing help texts and man pages.
> 

I am trying to write my own (yes, I know, I suffer from NIH syndrome
:-) ).

>> So we can reach the following goals:
>> - the help is linked to the code
>> - is less likely to forget to update the message
>> - the man page, the helps are always aligned
> 
> Only, we still will need like short and long help. E.g. the full text in
> the man page may be inappropriate as a --help message. Also, we do need
> a clever idea to get indentation right in the man pages. I fiddled a lot
> on the man pages for scrub parameter indentation (to get the second line
> describing a command line option indented correctly to start below the
> text of the first line, that was).

I agree about the short and long help. I think we need some tags for
bold, italic, empty lines...

> 
> -Jan



Re: [PATCH v2 4/5] scrub userland implementation

2011-07-11 Thread Hugo Mills
   OK, here's the remainder of my comments for this file. Not much for
this bit -- just one comment about locking, a reminder, and an
observation.

On Wed, Mar 30, 2011 at 06:53:12PM +0200, Jan Schmidt wrote:
[...]

> +static int _scrub_write_buf(int fd, const void *data, int len)
> +{
> + int ret;
> + ret = write(fd, data, len);
> + return ret - len;
> +}
> +
> +static int _scrub_writev(int fd, char *buf, int max, const char *fmt, ...)
> + __attribute__ ((format (printf, 4, 5)));
> +static int _scrub_writev(int fd, char *buf, int max, const char *fmt, ...)
> +{
> + int ret;
> + va_list args;
> + 
> + va_start(args, fmt);
> + ret = vsnprintf(buf, max, fmt, args);
> + va_end(args);
> + if (ret >= max)
> + return ret - max;
> + return _scrub_write_buf(fd, buf, ret);
> +}
> +
> +#define _SCRUB_SUM(dest, data, name) dest->scrub_args.progress.name = \
> + data->resumed->p.name + data->scrub_args.progress.name
> +static struct scrub_progress *_scrub_resumed_stats(struct scrub_progress *data,
> +   struct scrub_progress *dest)
> +{
> + if (!data->resumed || data->skip)
> + return data;
> +
> + _SCRUB_SUM(dest, data, data_extents_scrubbed);
> + _SCRUB_SUM(dest, data, tree_extents_scrubbed);
> + _SCRUB_SUM(dest, data, data_bytes_scrubbed);
> + _SCRUB_SUM(dest, data, tree_bytes_scrubbed);
> + _SCRUB_SUM(dest, data, read_errors);
> + _SCRUB_SUM(dest, data, csum_errors);
> + _SCRUB_SUM(dest, data, verify_errors);
> + _SCRUB_SUM(dest, data, no_csum);
> + _SCRUB_SUM(dest, data, csum_discards);
> + _SCRUB_SUM(dest, data, super_errors);
> + _SCRUB_SUM(dest, data, malloc_errors);
> + _SCRUB_SUM(dest, data, uncorrectable_errors);
> + _SCRUB_SUM(dest, data, corrected_errors);
> + _SCRUB_SUM(dest, data, last_physical);
> + dest->stats.canceled = data->stats.canceled;
> + dest->stats.finished = data->stats.finished;
> + dest->stats.t_resumed = data->stats.t_start;
> + dest->stats.t_start = data->resumed->stats.t_start;
> + dest->stats.duration = data->resumed->stats.duration +
> + data->stats.duration;
> + dest->scrub_args.devid = data->scrub_args.devid;
> + return dest;
> +}
> +
> +#define _SCRUB_KVWRITE(fd, buf, name, use)   \
> + _scrub_kvwrite(fd, buf, sizeof(buf), #name, \
> +use->scrub_args.progress.name)
> +#define _SCRUB_KVWRITE_STATS(fd, buf, name, use) \
> + _scrub_kvwrite(fd, buf, sizeof(buf), #name, \
> +use->stats.name)
> +static int _scrub_kvwrite(int fd, char *buf, int max,
> +  const char *key, u64 val)
> +{
> + return _scrub_writev(fd, buf, max, "|%s:%lld", key, val);
> +}
> +
> +static int scrub_write_file(int fd, const char *fsid,
> +struct scrub_progress* data, int n)
> +{
> + int ret = 0;
> + int i;
> + char buf[1024];
> + struct scrub_progress local;
> + struct scrub_progress *use;
> +
> + if (n < 1) {
> + return -EINVAL;
> + }
> +
> + ret = _scrub_write_buf(fd, SCRUB_FILE_VERSION_PREFIX SCRUB_FILE_VERSION
> +"\n", sizeof(SCRUB_FILE_VERSION_PREFIX)-1
> ++ sizeof(SCRUB_FILE_VERSION)-1 + 1);
> + if (ret)
> + return -EOVERFLOW;
> +
> + for (i=0; i<n; ++i) {
> + use = _scrub_resumed_stats(&data[i], &local);
> + if (_scrub_write_buf(fd, fsid, strlen(fsid)) ||
> + _scrub_write_buf(fd, ":", 1) ||
> + _scrub_writev(fd, buf, sizeof(buf), "%lld",
> +   use->scrub_args.devid) ||
> + _scrub_write_buf(fd, buf, ret) ||
> + _SCRUB_KVWRITE(fd, buf, data_extents_scrubbed, use) ||
> + _SCRUB_KVWRITE(fd, buf, tree_extents_scrubbed, use) ||
> + _SCRUB_KVWRITE(fd, buf, data_bytes_scrubbed, use) ||
> + _SCRUB_KVWRITE(fd, buf, tree_bytes_scrubbed, use) ||
> + _SCRUB_KVWRITE(fd, buf, read_errors, use) ||
> + _SCRUB_KVWRITE(fd, buf, csum_errors, use) ||
> + _SCRUB_KVWRITE(fd, buf, verify_errors, use) ||
> + _SCRUB_KVWRITE(fd, buf, no_csum, use) ||
> + _SCRUB_KVWRITE(fd, buf, csum_discards, use) ||
> + _SCRUB_KVWRITE(fd, buf, super_errors, use) ||
> + _SCRUB_KVWRITE(fd, buf, malloc_errors, use) ||
> + _SCRUB_KVWRITE(fd, buf, uncorrectable_errors, use) ||
> + _SCRUB_KVWRITE(fd, buf, corrected_errors, use) ||
> + _SCRUB_KVWRITE(fd, buf, last_physical, use) ||
> + _SCRUB_KVWRITE_STATS(fd, buf, t_start, use) ||
> + _SCRUB_KVWRITE_STATS(fd, buf,

Re: [PATCH v2 4/5] scrub userland implementation

2011-07-11 Thread Hugo Mills
On Mon, Jul 11, 2011 at 04:29:24PM +0200, Jan Schmidt wrote:
> On 10.07.2011 20:23, Hugo Mills wrote:
> >Yes, this is over three months after the initial posting, but since
> > nobody else has looked at it yet, and the patch is in my integration
> > stack...
> 
> ... thanks!
> 
> >I've not reviewed the whole thing -- just the "scrub start" code so
> > far. I've removed the bits I've not checked from the file below.
> 
> I rebased the old branch I found to your current integration branch and
> fixed up a most of what you mentioned. I'll not send a new version out
> until after your complete review (or your statement that this is it or
> your statement that you would rather going on reviewing the revised
> version).

   Thanks. The other half has just gone out (with few comments).

> Things I ripped out are accepted and corrected without resistance.
> Comments follow.

   Only a couple of rejoinders below.

> > On Wed, Mar 30, 2011 at 06:53:12PM +0200, Jan Schmidt wrote:
[...]

> >> +  case 4: /* read dev id */
> >> +  for (j=0; isdigit(l[i+j]) && i+j < avail; ++j)
> >> +  ;
> >> +  if (!j || i+j+1 >= avail)
> > 
> >j == 0 is clearer than !j here, IMO
> > 
> >> +  _SCRUB_ILLEGAL;
> >> +  p[curr]->devid = atoll(&l[i]);
> >> +  i += j + 1;
> > 
> >Is there any reason that you couldn't just use strtoull here? We
> > know that the string is terminated with a \n (by the guarantee of
> > state 1), so strtoull will always finish within the buffer.
> 
> I just found it way easier to use atoll. We already know the first
> character really is a digit, so why bother with a more cumbersome function?

   Ah, my mistake for not being clearer, I think: I was talking about
the for loop at the head of the state 4 code as well. That only exists
in order to find the end of the number read in by atoll, and strtoull
would do that for you.
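
   Something along these lines is what I have in mind. This is only a
sketch: parse_devid() is a hypothetical helper, not code from the
patch, and it assumes the buffer is terminated by a non-digit (the \n
guaranteed by state 1) so that strtoull stops inside it.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal device id at the start of s.
 * On success, store the value in *devid and the number of characters
 * consumed in *consumed, and return 0. Return -1 if s does not start
 * with a digit or the value does not fit.
 */
static inline int parse_devid(const char *s, unsigned long long *devid,
			      size_t *consumed)
{
	char *end;
	unsigned long long val;

	/* strtoull would also accept leading blanks and signs; we do not */
	if (!isdigit((unsigned char)s[0]))
		return -1;

	errno = 0;
	val = strtoull(s, &end, 10);
	if (errno == ERANGE)
		return -1;

	*devid = val;
	*consumed = (size_t)(end - s);	/* replaces the for loop's j */
	return 0;
}
```

   The end pointer gives you what the for loop computes as j, and the
ERANGE check covers overflow, which atoll cannot report.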

[...]

> >> +  char fsid[37];
> > 
> >Magic number. is there a #define in libuuid for this value?
> 
> At least the man page of uuid_parse clearly states uuids have 36 bytes
> plus a \0 in printf format. uuid/uuid.h doesn't contain such a constant.
> But volumes.c, print-tree.c and others do it with 37, too.

   OK, if that's conventional (and not defined in uuid.h), then go
with the magic number.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- "I will not be pushed,  filed, stamped, indexed, briefed, ---
   debriefed or numbered.  My life is my own."   




Re: btrfs hang in flush-btrfs-5

2011-07-11 Thread Jeremy Sanders
Josef Bacik wrote:

> On 07/11/2011 07:40 AM, Jeremy Sanders wrote:
>> Jeremy Sanders wrote:
>> 
>>> Hi - I'm trying btrfs with kernel 2.6.38.8-32.fc15.x86_64 (a Fedora
>>> kernel). I'm just doing a tar-to-tar copy onto the file system with
>>> compress- force=zlib. Here are some traces of the stuck processes.
>> 
>> I've managed to reproduce the hang using the latest btrfs from the
>> repository. I had to remove some of the tracing lines to get it to
>> compile under 2.6.38.8 and an ioctl which wasn't defined. Here is is
>> where it is stuck:
>> 
> 
> Hrm well that is just unlikely and hard to hit.  Will you try this and
> see if it helps you?  Thanks,

It has got quite a bit further than where it got before and hasn't
crashed yet. I will let you know when it has finished OK.

I see that the btrfs-delalloc (rather than endio-write) thread is taking up
100% of CPU and the write speed seems to have dropped during the copy,
however. The copy started out using endio-write fully on both cores and is
now using delalloc a lot.

Jeremy




Re: [PATCH 2/2] add detailed help messages to btrfs command

2011-07-11 Thread Hugo Mills
On Mon, Jul 11, 2011 at 09:11:24PM +0200, Jan Schmidt wrote:
> On 07/11/2011 08:38 PM, Goffredo Baroncelli wrote:
> > what about generating the man page on the basis of the btrfs help
> > detailed messages ?
> > 
> > My idea is the following:
> > before the function source associated to the command we can put a
> > comment with a detailed help. The comment may be:
> > 
> > [...]
> > /*** man:btrfs subvolume create
> >  *
> >  *  btrfs subvolume create 
> >  * create a new subvolume
> >  *
> >  *  The command [b]btrfs subvolume create[b] is used.
> >  *
> >  ***/
> > 
> > void do_create_subvolume(int argc, char **argv)
> > {
> > [...]
> 
> Reminds me of java, but nevertheless, I like the general idea.

   In principle, it's attractive. I still have some reservations as to
the amount of effort involved in automating the process (vs manually
updating things), and in the levels of detail needed (see the
discussion below).

> > A script extracts from the comment in the source both:
> > - the text for the man page
> > - the text for the detailed help.

   Or possibly going the other direction: from the man page (which
contains all of the information we need to reproduce in the code), it
should be possible, with appropriate structuring, to retrieve the bits
that the code needs to know about, and insert them into a table in a
generated .c file. Just a thought. 

   Oh, and the current man page needs some major work on its
typography -- it's inconsistent with both itself, and with most other
man pages, as far as I can tell. I did have a patch for that, but it
was a long time ago, and clashed with almost everything.

> Does anybody have such a script around? I suppose we're not the first
> ones writing help texts and man pages.
> 
> > So we can reach the following goals:
> > - the help is linked to the code
> > - is less likely to forget to update the message
> > - the man page, the helps are always aligned
> 
> Only, we still will need like short and long help. E.g. the full text in
> the man page may be inappropriate as a --help message. Also, we do need
> a clever idea to get indentation right in the man pages. I fiddled a lot
> on the man pages for scrub parameter indentation (to get the second line
> describing a command line option indented correctly to start below the
> text of the first line, that was).

   We actually need three levels of help:

 * A one-liner "headline" version that comes in the synopsis section
   of the man page and the output of "btrfs --help"

= btrfs filesystem defragment [-vf] [-c{zlib,lzo}] [-s ] 
| [...]
=   Defragment a file or directory

 * A collection of one-liners summarising all the switches, that comes
   as the output of "btrfs foo bar --help": a repeat of the "headline"
   version, plus a single (half-)line for each switch.

= btrfs filesystem defragment [-vf] [-c{zlib,lzo}] [-s ] 
| [...]
=   Defragment a file or directory
=   -v  verbose output
=   -f  do the f thing
=   -c  force compression algorithm 
(zlib or lzo)
=   -s   start the defrag from offset  in the file

 * A detailed description of the command as a whole and each option,
   which appears in the detail section of the man page.

= btrfs filesystem defragment [-vf] [-c{zlib,lzo}] [-s ] 
| [...]

=   Defragment the given file(s) or directories. Defrag does not
=   operate recursively, so if you want to defragment an entire
=   subdirectory and all its children, you should use find(1) to list
=   the files, and pass the list to btrfs fi defrag. [etc]
=   -s   Start the defragmentation operation at offset
=within the file.
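
   As a rough sketch of how the three levels could sit in one table
entry (all names here are hypothetical, this is not the current struct
Command): the headline is stored once and the other two levels are
appended to it, so the synopsis never has to be repeated.

```c
#include <stdio.h>

/* Hypothetical layout: one entry per command, headline stored once. */
struct help_text {
	const char *synopsis;	/* level 1: one-line usage */
	const char *summary;	/* level 1: one-line description */
	const char *options;	/* level 2: one line per switch, or NULL */
	const char *detail;	/* level 3: man page prose, or NULL */
};

enum help_level { HELP_HEADLINE, HELP_SWITCHES, HELP_DETAIL };

/*
 * Format the help for the requested level into buf. The headline is
 * always emitted first; the extra text depends on the level. Returns
 * the snprintf() result.
 */
static inline int format_help(char *buf, size_t max,
			      const struct help_text *h,
			      enum help_level level)
{
	const char *extra = "";

	if (level == HELP_SWITCHES && h->options)
		extra = h->options;
	else if (level == HELP_DETAIL && h->detail)
		extra = h->detail;

	return snprintf(buf, max, "%s\n    %s\n%s",
			h->synopsis, h->summary, extra);
}
```

   A command without switch or detail text simply falls back to the
headline, which matches the fallback behaviour of the existing
--help patch.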

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Great oxymorons of the world, no. 2: Common Sense ---




raid1

2011-07-11 Thread krz...@gmail.com
I wanted to confirm that btrfs raid1 continues to work when one of the
devices is gone.

dd if=/dev/null of=img0 bs=1 seek=2G
dd if=/dev/null of=img1 bs=1 seek=2G
mkfs.btrfs -d raid1 -m raid1 img0 img1
losetup /dev/loop1 img0
losetup /dev/loop2 img1
mkdir dir
mount -t btrfs /dev/loop1 dir
btrfs device scan
mount -t btrfs /dev/loop1 dir
echo abc > dir/a.txt
umount dir
losetup -d /dev/loop2
btrfs device scan
mount -t btrfs /dev/loop1 dir
mount: wrong fs type, bad option, bad superblock on /dev/loop1,
   missing codepage or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

Am I missing some nuance?


Re: raid1

2011-07-11 Thread Josef Bacik

On 07/11/2011 08:22 PM, krz...@gmail.com wrote:

I wanted to confirm that btrfs will continue to work on raid1 when one
of devices will be gone.

dd if=/dev/null of=img0 bs=1 seek=2G
dd if=/dev/null of=img1 bs=1 seek=2G
mkfs.btrfs -d raid1 -m raid1 img0 img1
losetup /dev/loop1 img0
losetup /dev/loop2 img1
mkdir dir
mount -t btrfs /dev/loop1 dir
btrfs device scan
mount -t btrfs /dev/loop1 dir
echo abc>  dir/a.txt
umount dir
losetup -d /dev/loop2
btrfs device scan
mount -t btrfs /dev/loop1 dir
mount: wrong fs type, bad option, bad superblock on /dev/loop1,
missing codepage or other error
In some cases useful info is found in syslog - try
dmesg | tail  or so

am I missing some nuance?


Yeah you need to mount -o degraded.  Thanks,

Josef


after mounting with -o degraded: ioctl: LOOP_CLR_FD: Device or resource busy

2011-07-11 Thread krz...@gmail.com
dd if=/dev/null of=img5 bs=1 seek=2G
dd if=/dev/null of=img6 bs=1 seek=2G
mkfs.btrfs -d raid1 -m raid1 img5 img6
losetup /dev/loop4 img5
losetup /dev/loop5 img6
btrfs device scan
mount -t btrfs /dev/loop4 dir
umount dir
losetup -d /dev/loop5
mount -t btrfs -o degraded /dev/loop4 dir
umount dir
losetup -d /dev/loop4
ioctl: LOOP_CLR_FD: Device or resource busy
mkfs.ext3 /dev/loop4
mke2fs 1.39 (29-May-2006)
/dev/loop4 is apparently in use by the system; will not make a filesystem here!

This only happens after mounting with -o degraded. The loopback device is
unusable until the next reboot.


Re: raid1

2011-07-11 Thread krz...@gmail.com
Thanks.

I don't see why this needs another mount switch. It would prevent the
whole system from booting if the / partition were on btrfs raid1, for no
good reason.


RE: [PATCH V2] Btrfs-progs: add "btrfs subvolume get-default" subcommand

2011-07-11 Thread Zhong, Xin


> -----Original Message-----
> From: Andreas Philipp [mailto:philipp.andr...@gmail.com]
> Sent: Monday, July 11, 2011 10:11 PM
> To: Zhong, Xin
> Cc: linux-btrfs@vger.kernel.org; Hugo Mills
> Subject: Re: [PATCH V2] Btrfs-progs: add "btrfs subvolume get-default"
> subcommand
> 
> Would you mind rebasing your patch on Hugo Mills' integration branch for
> btrfs-progs at
> http://git.darksatanic.net/repo/btrfs-progs-unstable.git
> integration-20110705
> since it does not apply on top of all the changes that are already there?
> Additionally, I spotted one whitespace error in the patch and marked it
> below.

Ok. I will update my patch. Thanks!

> Thanks,
> Andreas Philipp
> 
> On 11.07.2011 10:56, Zhong, Xin wrote:
> > Add subcommand to get the default subvolume of btrfs filesystem
> >
> > Reported-by: Yang, Yi 
> > Signed-off-by: Zhong, Xin 
> > ---
> > btrfs-list.c | 57 +++--
> > btrfs.c | 3 +++
> > btrfs_cmds.c | 31 ++-
> > btrfs_cmds.h | 3 ++-
> > 4 files changed, 90 insertions(+), 4 deletions(-)
> >
> > diff --git a/btrfs-list.c b/btrfs-list.c
> > index 93766a8..aa6a9b4 100644
> > --- a/btrfs-list.c
> > +++ b/btrfs-list.c
> > @@ -536,7 +536,7 @@ build:
> > return full;
> > }
> >
> > -int list_subvols(int fd)
> > +int list_subvols(int fd, int get_default)
> > {
> > struct root_lookup root_lookup;
> > struct rb_node *n;
> > @@ -545,10 +545,12 @@ int list_subvols(int fd)
> > struct btrfs_ioctl_search_key *sk = &args.key;
> > struct btrfs_ioctl_search_header *sh;
> > struct btrfs_root_ref *ref;
> > + struct btrfs_dir_item *di;
> > unsigned long off = 0;
> > int name_len;
> > char *name;
> > u64 dir_id;
> > + u64 subvol_id = 0;
> > int i;
> >
> > root_lookup_init(&root_lookup);
> > @@ -642,6 +644,52 @@ int list_subvols(int fd)
> > n = rb_next(n);
> > }
> >
> > + memset(&args, 0, sizeof(args));
> > +
> > + /* search in the tree of tree roots */
> > + sk->tree_id = BTRFS_ROOT_TREE_OBJECTID;
> > +
> > + /* search dir item */
> > + sk->max_type = BTRFS_DIR_ITEM_KEY;
> > + sk->min_type = BTRFS_DIR_ITEM_KEY;
> > +
> > + sk->max_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
> > + sk->min_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
> > + sk->max_offset = (u64)-1;
> > + sk->max_transid = (u64)-1;
> > +
> > + /* just a big number, doesn't matter much */
> > + sk->nr_items = 4096;
> > +
> > + /* try to get the objectid of default subvolume */
> > + if(get_default) {
> > + ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
> > + if (ret < 0) {
> > + fprintf(stderr, "ERROR: can't perform the search\n");
> > + return ret;
> > + }
> > +
> > + off = 0;
> > + /* go through each item to find dir item named "default" */
> > + for (i = 0; i < sk->nr_items; i++) {
> > + sh = (struct btrfs_ioctl_search_header *)(args.buf +
> > + off);
> > + off += sizeof(*sh);
> > + if (sh->type == BTRFS_DIR_ITEM_KEY) {
> > + di = (struct btrfs_dir_item *)(args.buf + off);
> > + name_len = le16_to_cpu(di->name_len);
> > + name = (char *)di + sizeof(struct btrfs_dir_item);
> > + if (!strncmp("default", name, name_len)) {
> > + subvol_id = btrfs_disk_key_objectid(
> > + &di->location);
> > + break;
> > + }
> > + }
> > +
> > + off += sh->len;
> > + }
> > + }
> > +
> > /* now that we have all the subvol-relative paths filled in,
> > * we have to string the subvols together so that we can get
> > * a path all the way back to the FS root
> > @@ -650,7 +698,12 @@ int list_subvols(int fd)
> > while (n) {
> > struct root_info *entry;
> > entry = rb_entry(n, struct root_info, rb_node);
> > - resolve_root(&root_lookup, entry);
> > + if(!get_default)
> > + resolve_root(&root_lookup, entry);
> > + /* we only want the default subvolume */
> > + else if(subvol_id == entry->root_id)
> > + resolve_root(&root_lookup, entry);
> > +
> This line adds a whitespace error.
> > n = rb_prev(n);
> > }
> >
> > diff --git a/btrfs.c b/btrfs.c
> > index 46314cf..6b73f88 100644
> > --- a/btrfs.c
> > +++ b/btrfs.c
> > @@ -73,6 +73,9 @@ static struct Command commands[] = {
> > "Set the subvolume of the filesystem  which will be mounted\n"
> > "as default."
> > },
> > + { do_get_default_subvol, 1, "subvolume get-default", "\n"
> > + "Get the default subvolume of a filesystem."
> > + },
> > { do_fssync, 1,
> > "filesystem sync", "\n"
> > "Force a sync on the filesystem ."
> > diff --git a/btrfs_cmds.c b/btrfs_cmds.c
> > index 8031c58..11c56f6 100644
> > --- a/btrfs_cmds.c
> > +++ b/btrfs_cmds.c
> > @@ -301,7 +301,7 @@ int do_subvol_list(int argc, char **argv)
> > fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
> > return 12;
> > }
> > - ret = list_subvols(fd);
> > + ret = list_subvols(fd, 0);
> > if (ret)
> > return 19;
> > return 0;
> > @@ -834,6 +834,35 @@ int do_set_default_subvol(int nargs, char **argv)
> > return 0;
> > }
> >
> > +int do_get_default_subvol(int nargs, char **argv)
> > +{
> > + int fd;
> > + int ret;
> > + char *subvol;
> > +
> > +

RE: [PATCH V2] Btrfs-progs: add "btrfs subvolume get-default" subcommand

2011-07-11 Thread Zhong, Xin
> -----Original Message-----
> From: Goffredo Baroncelli 
> [mailto:kreij...@libero.it]
> Sent: Monday, July 11, 2011 8:03 PM
> To: Zhong, Xin
> Cc: linux-btrfs@vger.kernel.org
> Subject: R: [PATCH V2] Btrfs-progs: add "btrfs subvolume get-default"
> subcommand
> 
> >Original message
> >From: xin.zh...@intel.com
> >Date: 11/07/2011 10.56
> >To: 
> >Cc: 
> >Subject: [PATCH V2] Btrfs-progs: add "btrfs subvolume get-default" subcommand
> >
> >Add subcommand to get the default subvolume of btrfs filesystem
> >
> >Reported-by: Yang, Yi 
> >Signed-off-by: Zhong, Xin 
> >---
> > btrfs-list.c |   57 +++--
> > btrfs.c  |3 +++
> > btrfs_cmds.c |   31 ++-
> > btrfs_cmds.h |3 ++-
> > 4 files changed, 90 insertions(+), 4 deletions(-)
> 
> please update the man page too.
> 
> >
> >diff --git a/btrfs-list.c b/btrfs-list.c
> >index 93766a8..aa6a9b4 100644
> >--- a/btrfs-list.c
> >+++ b/btrfs-list.c
> >@@ -536,7 +536,7 @@ build:
> > return full;
> [...]
> >+/* search dir item */
> >+sk->max_type = BTRFS_DIR_ITEM_KEY;
> >+sk->min_type = BTRFS_DIR_ITEM_KEY;
> >+
> >+sk->max_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
> >+sk->min_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
> >+sk->max_offset = (u64)-1;
> >+sk->max_transid = (u64)-1;
> >+
> [...]
> >+/* go through each item to find dir item named "default" */
> >+for (i = 0; i < sk->nr_items; i++) {
> >+sh = (struct btrfs_ioctl_search_header *)(args.buf +
> >+  off);
> >+off += sizeof(*sh);
> >+if (sh->type == BTRFS_DIR_ITEM_KEY) {
> >+di = (struct btrfs_dir_item *)(args.buf + off);
> >+name_len = le16_to_cpu(di->name_len);
> >+name = (char *)di + sizeof(struct btrfs_dir_item);
> >+if (!strncmp("default", name, name_len)) {
> >+subvol_id = btrfs_disk_key_objectid(
> >+&di->location);
> >+break;
> >+}
> >+}
> >+
> >+off += sh->len;
> >+}
> 
> I am not familiar with the "default subvolume key", but are you sure that
> the key is always in the first set of returned keys?
>

There do not seem to be many dir items in the root tree. In fact, the only
one I see there is the one used for the default subvolume, so a single
search should be enough.
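
One detail of the quoted loop is worth a small sketch: the names in the
search buffer carry a length and no NUL terminator, and
strncmp("default", name, name_len) also returns 0 when the stored name is a
shorter prefix such as "def". The C below is a minimal illustration, not the
btrfs-progs code: struct item_header and struct dir_entry are simplified,
hypothetical stand-ins for btrfs_ioctl_search_header and btrfs_dir_item, and
put_entry only exists to build sample data. It walks packed (header, payload)
records the way the patch does, but compares the length before the bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the search header: only the payload length. */
struct item_header {
	uint32_t len;		/* payload bytes that follow this header */
};

/* Hypothetical stand-in for a dir item; the name bytes follow it. */
struct dir_entry {
	uint64_t objectid;	/* id the entry points at */
	uint16_t name_len;	/* name length, no NUL terminator stored */
};

/* Exact match of a length-prefixed name against a C string. */
static int name_equals(const char *name, size_t name_len, const char *want)
{
	return name_len == strlen(want) && memcmp(name, want, name_len) == 0;
}

/* Append one record at offset off; the caller guarantees enough room. */
static size_t put_entry(unsigned char *buf, size_t off, uint64_t objectid,
			const char *name)
{
	struct dir_entry di;
	struct item_header sh;

	memset(&di, 0, sizeof(di));
	di.objectid = objectid;
	di.name_len = (uint16_t)strlen(name);
	sh.len = (uint32_t)(sizeof(di) + di.name_len);

	memcpy(buf + off, &sh, sizeof(sh));
	off += sizeof(sh);
	memcpy(buf + off, &di, sizeof(di));
	off += sizeof(di);
	memcpy(buf + off, name, di.name_len);
	return off + di.name_len;
}

/*
 * Walk up to nr_items records and return the objectid of the entry whose
 * name is exactly `want`, or 0 if there is none in this buffer.
 */
static uint64_t find_entry(const unsigned char *buf, size_t buf_len,
			   unsigned int nr_items, const char *want)
{
	size_t off = 0;
	unsigned int i;

	assert(buf != NULL);
	for (i = 0; i < nr_items; i++) {
		struct item_header sh;
		struct dir_entry di;

		if (off + sizeof(sh) > buf_len)
			break;			/* truncated header */
		memcpy(&sh, buf + off, sizeof(sh));
		off += sizeof(sh);
		if (sh.len > buf_len - off)
			break;			/* truncated payload */

		if (sh.len >= sizeof(di)) {
			memcpy(&di, buf + off, sizeof(di));
			if (di.name_len <= sh.len - sizeof(di) &&
			    name_equals((const char *)buf + off + sizeof(di),
					di.name_len, want))
				return di.objectid;
		}
		off += sh.len;			/* skip to the next record */
	}
	return 0;
}
```

A return value of 0 only says the name was not in this buffer; if the root
tree ever held more dir items than one search returns, the caller would have
to search again from the last key seen.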

> >+}
> >+
> > /* now that we have all the subvol-relative paths filled in,
> >  * we have to string the subvols together so that we can get
> >  * a path all the way back to the FS root
> >@@ -650,7 +698,12 @@ int list_subvols(int fd)
> > while (n) {
> > struct root_info *entry;
> > entry = rb_entry(n, struct root_info, rb_node);
> >-resolve_root(&root_lookup, entry);
> >+if(!get_default)
> >+resolve_root(&root_lookup, entry);
> >+/* we only want the default subvolume */
> >+else if(subvol_id == entry->root_id)
> >+resolve_root(&root_lookup, entry);
> >+
> 
> What happens if there is no default subvolume (for example, on a very old
> btrfs filesystem, and/or after removing the "default" subvolume)?
> I suggest handling this case by printing something like "No default
> subvolume found".
>
If there's no default subvolume, the output is empty, just as when you run
"btrfs subvolume list" and there are no subvolumes at all.
Thanks for the review!
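
The explicit message suggested in the review could look roughly like the C
sketch below. It is an assumption-laden illustration: format_default_subvol
is a hypothetical helper, the wording of both lines is made up, and 0 is
treated as "not found" only because the patch initialises subvol_id to 0.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write one result line into out.
 * subvol_id == 0 is taken to mean that the search found no "default"
 * dir item. Returns what snprintf returns.
 */
static int format_default_subvol(char *out, size_t out_len,
				 uint64_t subvol_id)
{
	assert(out != NULL);

	if (subvol_id == 0)
		return snprintf(out, out_len, "No default subvolume found");
	return snprintf(out, out_len, "default subvolume ID %" PRIu64,
			subvol_id);
}
```

Printing a line in the not-found case lets a script tell "nothing found"
apart from a command that failed silently.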
> 
> BR
> G.Baroncelli
> 
> > n = rb_prev(n);
> > }
> >
> >diff --git a/btrfs.c b/btrfs.c
> >index 46314cf..6b73f88 100644
> >--- a/btrfs.c
> >+++ b/btrfs.c
> >@@ -73,6 +73,9 @@ static struct Command commands[] = {
> > "Set the subvolume of the filesystem  which will be mounted\n"
> > "as default."
> > },
> >+{ do_get_default_subvol, 1, "subvolume get-default", "\n"
> >+"Get the default subvolume of a filesystem."
> >+},
> > { do_fssync, 1,
> >   "filesystem sync", "\n"
> > "Force a sync on the filesystem ."
> >diff --git a/btrfs_cmds.c b/btrfs_cmds.c
> >index 8031c58..11c56f6 100644
> >--- a/btrfs_cmds.c
> >+++ b/btrfs_cmds.c
> >@@ -301,7 +301,7 @@ int do_subvol_list(int argc, char **argv)
> > fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
> > return 12;
> > }
> >-ret = list_subvols(fd);
> >+ret = list_subvols(fd, 0);
> > if (ret)
> > return 19;
> > return 0;
> >@@ -834,6 +834,35 @@ int do_set_default_subvol(int nargs, char **argv)
> > return 0;
> > }
> >
> >+int do_get_default_subvol(int nargs, char **argv)
> >+{
> >+int fd;
> >+int ret;
> >+

[PATCH V3] Btrfs-progs: add "btrfs subvolume get-default" subcommand

2011-07-11 Thread Zhong, Xin
Add subcommand to get the default subvolume of btrfs filesystem

V2->V3:
* add man page
* based on http://git.darksatanic.net/repo/btrfs-progs-unstable.git
  integration-20110705

Reviewed-by: Andreas Philipp 
Reviewed-by: Goffredo Baroncelli 
Reported-by: Yang, Yi 
Signed-off-by: Zhong, Xin 
---
 btrfs-list.c   |   58 ++-
 btrfs.c        |    3 ++
 btrfs_cmds.c   |   31 -
 btrfs_cmds.h   |    3 +-
 man/btrfs.8.in |    7 ++
 5 files changed, 98 insertions(+), 4 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 07b179a..016d09c 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -560,7 +560,7 @@ build:
return full;
 }
 
-int list_subvols(int fd, int print_parent)
+int list_subvols(int fd, int print_parent, int get_default)
 {
struct root_lookup root_lookup;
struct rb_node *n;
@@ -569,10 +569,12 @@ int list_subvols(int fd, int print_parent)
struct btrfs_ioctl_search_key *sk = &args.key;
struct btrfs_ioctl_search_header *sh;
struct btrfs_root_ref *ref;
+   struct btrfs_dir_item *di;
unsigned long off = 0;
int name_len;
char *name;
u64 dir_id;
+   u64 subvol_id = 0;
int i;
int e;
 
@@ -672,6 +674,52 @@ int list_subvols(int fd, int print_parent)
n = rb_next(n);
}
 
+   memset(&args, 0, sizeof(args));
+
+   /* search in the tree of tree roots */
+   sk->tree_id = BTRFS_ROOT_TREE_OBJECTID;
+
+   /* search dir item */
+   sk->max_type = BTRFS_DIR_ITEM_KEY;
+   sk->min_type = BTRFS_DIR_ITEM_KEY;
+
+   sk->max_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
+   sk->min_objectid = BTRFS_ROOT_TREE_DIR_OBJECTID;
+   sk->max_offset = (u64)-1;
+   sk->max_transid = (u64)-1;
+
+   /* just a big number, doesn't matter much */
+   sk->nr_items = 4096;
+
+   /* try to get the objectid of default subvolume */
+   if (get_default) {
+   ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
+   if (ret < 0) {
+   fprintf(stderr, "ERROR: can't perform the search\n");
+   return ret;
+   }
+
+   off = 0;
+   /* go through each item to find dir item named "default" */
+   for (i = 0; i < sk->nr_items; i++) {
+   sh = (struct btrfs_ioctl_search_header *)(args.buf +
+ off);
+   off += sizeof(*sh);
+   if (sh->type == BTRFS_DIR_ITEM_KEY) {
+   di = (struct btrfs_dir_item *)(args.buf + off);
+   name_len = le16_to_cpu(di->name_len);
+   name = (char *)di + sizeof(struct btrfs_dir_item);
+   if (!strncmp("default", name, name_len)) {
+   subvol_id = btrfs_disk_key_objectid(
+   &di->location);
+   break;
+   }
+   }
+
+   off += sh->len;
+   }
+   }
+
/* now that we have all the subvol-relative paths filled in,
 * we have to string the subvols together so that we can get
 * a path all the way back to the FS root
@@ -680,7 +728,13 @@ int list_subvols(int fd, int print_parent)
while (n) {
struct root_info *entry;
entry = rb_entry(n, struct root_info, rb_node);
-   resolve_root(&root_lookup, entry, print_parent);
+   if (!get_default)
+   resolve_root(&root_lookup, entry, print_parent);
+   /* we only want the default subvolume */
+   else if (subvol_id == entry->root_id)
+   resolve_root(&root_lookup, entry, print_parent);
+   else if (subvol_id == 0)
+   break;
n = rb_prev(n);
}
 
diff --git a/btrfs.c b/btrfs.c
index 67d6f6f..1af8360 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -94,6 +94,9 @@ static struct Command commands[] = {
"-l len defragment only up to len bytes\n"
"-t size minimal size of file to be considered for defragmenting\n"
},
+   { do_get_default_subvol, 1, "subvolume get-default", "\n"
+   "Get the default subvolume of a filesystem."
+   },
{ do_fssync, 1,
  "filesystem sync", "\n"
"Force a sync on the filesystem .",
diff --git a/btrfs_cmds.c b/btrfs_cmds.c
index 0612f34..e151c25 100644
--- a/btrfs_cmds.c
+++ b/btrfs_cmds.c
@@ -340,7 +340,7 @@ int do_subvol_list(int argc, char **argv)
fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
return 12;
}
-   ret = list_subvols(fd, print_pare