Re: [PATCH v2 3/3] btrfs-progs: handle error in the btrfs_prepare_device

2013-12-17 Thread Stefan Behrens
On Tue, 17 Dec 2013 10:33:36 +0800, Anand Jain wrote:
 This patch reports errors with strerror() instead of printing the raw
 errno, and also replaces the BUG_ON with error handling.
 
 Signed-off-by: Anand Jain anand.j...@oracle.com
 ---
  v2: commit update
 ---
  cmds-device.c  |  7 +++
  cmds-replace.c | 10 --
  mkfs.c |  9 -
  utils.c| 30 +++---
  4 files changed, 34 insertions(+), 22 deletions(-)
 
[...]
 diff --git a/cmds-replace.c b/cmds-replace.c
 index d9b0940..8160107 100644
 --- a/cmds-replace.c
 +++ b/cmds-replace.c
 @@ -276,13 +276,11 @@ static int cmd_start_replace(int argc, char **argv)
   }
   strncpy((char *)start_args.start.tgtdev_name, dstdev,
   BTRFS_DEVICE_PATH_NAME_MAX);
 - if (btrfs_prepare_device(fddstdev, dstdev, 1, dstdev_block_count, 0,
 -  mixed, 0)) {
 - fprintf(stderr, "Error: Failed to prepare device '%s'\n",
 - dstdev);
 - goto leave_with_error;
 - }
 + ret = btrfs_prepare_device(fddstdev, dstdev, 1, dstdev_block_count, 0,
 +  mixed, 0);
   close(fddstdev);
 + if (ret)
 + goto leave_with_error;
   fddstdev = -1;

You change the code to call close(fddstdev) twice.
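
One way to avoid the double close pointed out above is to reset the descriptor to -1 right after closing it, before testing ret, so that a shared error path can skip it. This is only a minimal sketch under assumptions: the helper close_fd(), the counter close_calls and the function start_replace_sketch() are hypothetical and are not taken from btrfs-progs, and prepare_ret merely stands in for the return value of btrfs_prepare_device().

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical counter so the example can show how often close() runs. */
static int close_calls;

/* Close *fd if it is still open and mark it as closed. */
static void close_fd(int *fd)
{
	if (*fd >= 0) {
		close(*fd);
		close_calls++;
		*fd = -1;	/* later callers see it is already closed */
	}
}

/* prepare_ret stands in for the result of btrfs_prepare_device(). */
static int start_replace_sketch(int prepare_ret)
{
	int ret = prepare_ret;
	int fd = open("/dev/null", O_RDONLY);

	if (fd < 0)
		return -1;

	close_fd(&fd);		/* close once and reset fd before branching */
	if (ret)
		goto leave_with_error;
	return 0;

leave_with_error:
	close_fd(&fd);		/* no-op here because fd is already -1 */
	return -1;
}
```

With this shape the error label can stay generic (close whatever is still open) without closing the same descriptor twice.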

[...]
 +zero_dev_error:
 + if (ret) {
 + ret < 0 ?
 + fprintf(stderr, "ERROR: failed to zero device start '%s' - %s\n",
 + file, strerror(-ret)) :
 + fprintf(stderr, "ERROR: failed to zero device start '%s' - %d\n",
 + file, ret);

This is not funny.



--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/3] btrfs-progs: handle error in the btrfs_prepare_device

2013-12-17 Thread Anand Jain





+   ret = btrfs_prepare_device(fddstdev, dstdev, 1, dstdev_block_count, 0,
+                              mixed, 0);
    close(fddstdev);
+   if (ret)
+           goto leave_with_error;
    fddstdev = -1;

 yeah moved this 3 lines up. thanks.

You change the code to call close(fddstdev) twice.

[...]

+zero_dev_error:
+   if (ret) {
+           ret < 0 ?
+           fprintf(stderr, "ERROR: failed to zero device start '%s' - %s\n",
+                   file, strerror(-ret)) :
+           fprintf(stderr, "ERROR: failed to zero device start '%s' - %d\n",
+                   file, ret);

This is not funny.


 hmm. I am not sure what you mean ?

Thanks, Anand


Re: Understanding subvolume hierarchy

2013-12-17 Thread Nicolas Michel
Mhmm. Thanks. I'm beginning to understand ;)
Still: how can I see that id 5 is mapped to id 0? And why is this done?
For what purpose? (Is it default btrfs behavior, or is it set by the
Ubuntu installer?)

2013/12/17 Chris Murphy li...@colorremedies.com:

 On Dec 16, 2013, at 4:54 PM, Nicolas Michel be.nicolas.mic...@gmail.com 
 wrote:

 OK. thanks for your pretty fast answer :)

 Now my last question is: in this case it was easy, as I know that I
 created all these subvolumes as parts of volume 0. But in the output of
 btrfs subv list / I don't see any information that tells me they belong
 to id 0. If I have to debug a server/desktop and I don't know the
 hierarchy that has been made, how can I know that my tmp subvolume is
 indeed a child of id 0?

 When you do a subvol list it shows you what its top level is. Example:



 # btrfs subvol list /
 ID 256 gen 1047 top level 5 path root
 ID 258 gen 983 top level 5 path home
 ID 259 gen 983 top level 5 path data
 ID 276 gen 1012 top level 5 path root_ro


 root is mounted at /
 home is mounted at /home
 data is mounted at /data
 root_ro is not mounted at all


 # cd /data
 # btrfs subvol create data2
 Create subvolume './data2'
 # btrfs subvol list /
 ID 256 gen 1047 top level 5 path root
 ID 258 gen 983 top level 5 path home
 ID 259 gen 1048 top level 5 path data
 ID 276 gen 1012 top level 5 path root_ro
 ID 277 gen 1048 top level 5 path data/data2

 # btrfs subvol list /data
 ID 256 gen 1047 top level 5 path root
 ID 258 gen 983 top level 5 path home
 ID 259 gen 1048 top level 5 path data
 ID 276 gen 1012 top level 5 path root_ro
 ID 277 gen 1048 top level 259 path data2


 So notice that "top level 5 path data/data2" means the same as "top level
 259 path data2", because top level 259 implies data.
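
To read the top level mechanically, for example when inspecting an unknown machine, a line of the listing can be split into its fields. The following is a minimal sketch that assumes the line layout shown in the output above ("ID ... gen ... top level ... path ..."); other btrfs-progs versions or options may print different columns. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields of one "btrfs subvol list" line, as printed in the thread. */
struct subvol_line {
	unsigned long long id;
	unsigned long long gen;
	unsigned long long top_level;
	char path[256];
};

/* Returns 0 on success, -1 if the line does not match the layout. */
static int parse_subvol_line(const char *line, struct subvol_line *out)
{
	int n = sscanf(line, "ID %llu gen %llu top level %llu path %255[^\n]",
		       &out->id, &out->gen, &out->top_level, out->path);

	return n == 4 ? 0 : -1;
}
```

A subvolume whose top_level is 5 sits directly under the top-level FS tree; any other value is the ID of the subvolume that contains it.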

 You can also use btrfs subvol show <subvol>, which gives you more
 information, including whether it's a snapshot and what the parent is; if
 it's a parent that has snapshots, it lists the snapshots.

 # btrfs subvol show /data
 /data
     Name:               data
     uuid:               bc45f4be-51c9-2848-bb68-d6e922b8e2bd
     Parent uuid:        -
     Creation time:      2013-12-12 16:18:13
     Object ID:          259
     Generation (Gen):   1048
     Gen at creation:    11
     Parent:             5
     Top Level:          5
     Flags:              -
     Snapshot(s):
 # btrfs subvol show /data/data2
 /data/data2
     Name:               data2
     uuid:               a66eddf9-107f-0448-a021-da417a982827
     Parent uuid:        -
     Creation time:      2013-12-16 17:11:36
     Object ID:          277
     Generation (Gen):   1048
     Gen at creation:    1048
     Parent:             259
     Top Level:          259
     Flags:              -
     Snapshot(s):



 Chris Murphy




-- 
Nicolas MICHEL


Re: Understanding subvolume hierarchy

2013-12-17 Thread Hugo Mills
On Tue, Dec 17, 2013 at 09:30:11AM +0100, Nicolas Michel wrote:
 Mhmm. Thanks. I'm beginning to understand ;)
 Still: how can I see that id 5 is mapped to id 0? And why is this done?
 For what purpose? (Is it default btrfs behavior, or is it set by the
 Ubuntu installer?)

   It's always mapped.

   Internally, every tree is identified by a number. FS trees (e.g.
subvolumes) start with numbers allocated dynamically from 256 upwards.
Other trees (chunk tree, extent tree, and all the others) have fixed
well-known numbers between 1 and 255, and the top-level FS tree is
given the number 5.

   To make things marginally simpler for the user to remember, there's
special-case code which looks at subvolume IDs passed from userspace,
and converts 0 to 5. This is all btrfs behaviour -- nothing to do with
Ubuntu.

   Hugo.
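
The numbering described above can be written down as a small sketch. The constants follow the values given in this message (top-level FS tree 5, dynamically allocated subvolume IDs from 256 upwards); the helper names are hypothetical and this is not the kernel's actual special-case code.

```c
#include <assert.h>
#include <stdint.h>

/* Values as described in the thread. */
#define FS_TREE_ID        5ULL	/* top-level FS tree */
#define FIRST_SUBVOL_ID   256ULL	/* first dynamically allocated ID */

/* Hypothetical helper: treat 0 as an alias for the top-level FS tree. */
static uint64_t normalize_subvol_id(uint64_t id)
{
	return id == 0 ? FS_TREE_ID : id;
}

/* An ID names an FS tree if it is the top level or a dynamic subvolume. */
static int is_fs_tree_id(uint64_t id)
{
	id = normalize_subvol_id(id);
	return id == FS_TREE_ID || id >= FIRST_SUBVOL_ID;
}
```

Under this model, subvolid=0 and subvolid=5 name the same tree, while an ID such as 2 belongs to one of the fixed internal trees and is not a subvolume.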


-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- You can't expect a boy to be depraved until he's gone to --- 
 a good school.  




Re: [PATCH] Btrfs: fix double initialization of the raid kobject

2013-12-17 Thread Miao Xie
On Tue, 17 Dec 2013 12:01:12 +0800, Miao Xie wrote:
 We met the following oops when doing space balance:
  kobject (88081b590278): tried to init an initialized object, something is seriously wrong.
  ...
  Call Trace:
   [81937262] dump_stack+0x49/0x5f
   [8137d259] kobject_init+0x89/0xa0
   [8137d36a] kobject_init_and_add+0x2a/0x70
   [a009bd79] ? clear_extent_bit+0x199/0x470 [btrfs]
   [a005e82c] __link_block_group+0xfc/0x120 [btrfs]
   [a006b9db] btrfs_make_block_group+0x24b/0x370 [btrfs]
   [a00a899b] __btrfs_alloc_chunk+0x54b/0x7e0 [btrfs]
   [a00a8c6f] btrfs_alloc_chunk+0x3f/0x50 [btrfs]
   [a0060123] do_chunk_alloc+0x363/0x440 [btrfs]
   [a00633d4] btrfs_check_data_free_space+0x104/0x310 [btrfs]
   [a0069f4d] btrfs_write_dirty_block_groups+0x48d/0x600 [btrfs]
   [a007aad4] commit_cowonly_roots+0x184/0x250 [btrfs]
   ...
 
 Steps to reproduce:
  # mkfs.btrfs -f <dev>
  # mount -o nospace_cache <dev> <mnt>
  # btrfs balance start <mnt>
  # dd if=/dev/zero of=<mnt>/tmpfile bs=1M count=1
 
 The cause of this problem is that we initialized the raid kobject whenever
 we added a block group to an empty raid list. When a btrfs filesystem is
 mounted, the raid list is empty, so the raid kobject is initialized when the
 first block group is added. But if no data is stored in that block group, it
 is freed during balance and the raid list becomes empty again. If we then
 allocate a new block group and add it to the raid list, we initialize the
 raid kobject a second time, and the oops happens.
 
 Fix this problem by initializing the raid kobject only when mounting the fs.
 
 Signed-off-by: Miao Xie mi...@cn.fujitsu.com

This bug was reported by Wang Shilong, so add

Reported-by: Wang Shilong wangsl.f...@cn.fujitsu.com

Thanks
Miao

 ---
  fs/btrfs/extent-tree.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)
 
 diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
 index cd4d9ca..d667aad 100644
 --- a/fs/btrfs/extent-tree.c
 +++ b/fs/btrfs/extent-tree.c
 @@ -3464,8 +3464,10 @@ static int update_space_info(struct btrfs_fs_info *info, u64 flags,
   return ret;
   }
  
 - for (i = 0; i < BTRFS_NR_RAID_TYPES; i++)
 + for (i = 0; i < BTRFS_NR_RAID_TYPES; i++) {
   INIT_LIST_HEAD(&found->block_groups[i]);
 + kobject_init(&found->block_group_kobjs[i], &btrfs_raid_ktype);
 + }
   init_rwsem(&found->groups_sem);
   spin_lock_init(&found->lock);
   found->flags = flags & BTRFS_BLOCK_GROUP_TYPE_MASK;
 @@ -8423,9 +8425,8 @@ static void __link_block_group(struct btrfs_space_info *space_info,
   int ret;
  
   kobject_get(&space_info->kobj); /* put in release */
 - ret = kobject_init_and_add(kobj, &btrfs_raid_ktype,
 -                            &space_info->kobj, "%s",
 -                            get_raid_name(index));
 + ret = kobject_add(kobj, &space_info->kobj, "%s",
 +                   get_raid_name(index));
   if (ret) {
   pr_warn("btrfs: failed to add kobject for block cache. ignoring.\n");
   kobject_put(&space_info->kobj);
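
The difference between the old and the new flow can be modelled in userspace. This is only a toy sketch under assumptions: struct toy_kobject and the toy_* functions are hypothetical stand-ins, not the kernel kobject API, and the model ignores removal of the object when the raid list becomes empty.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kobject: it only tracks its own state. */
struct toy_kobject {
	bool initialized;
	bool added;
};

/* Like the kernel, treat a second initialization as an error. */
static int toy_kobject_init(struct toy_kobject *k)
{
	if (k->initialized)
		return -1;	/* "tried to init an initialized object" */
	k->initialized = true;
	return 0;
}

static int toy_kobject_add(struct toy_kobject *k)
{
	if (!k->initialized)
		return -1;
	k->added = true;
	return 0;
}

/* Old flow: init and add each time the raid list gets its first group. */
static int link_first_group_old(struct toy_kobject *k)
{
	if (toy_kobject_init(k))
		return -1;
	return toy_kobject_add(k);
}

/* New flow: init was done once at setup, linking only adds. */
static int link_first_group_new(struct toy_kobject *k)
{
	return toy_kobject_add(k);
}
```

In the old flow the second "first block group" trips over the already initialized object; in the new flow initialization happens exactly once, so the list can become empty and be refilled any number of times.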
 



[PATCH] Btrfs-progs: receive: fix the case that we can not find subvolume

2013-12-17 Thread Wang Shilong
If the default subvolume has been changed, btrfs receive fails to find the
parent subvolume. There are two ways to fix this problem:

1. Make the btrfs snapshot ioctl accept the source subvolume's objectid.
2. When we need to use an internal subvolume path, mount the filesystem in
   another place, with subvolume 5 as its default subvolume.

The second approach is preferable because it needs no kernel change.

Reported-by: Michael Welsh Duggan m...@md5i.com
Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
Signed-off-by: Miao Xie mi...@cn.fujitsu.com
---
 cmds-receive.c | 51 +++
 utils.c| 28 
 utils.h|  1 +
 3 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index ed44107..c2cf8a3 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -40,6 +40,7 @@
 #include <sys/types.h>
 #include <sys/xattr.h>
 #include <uuid/uuid.h>
+#include <sys/mount.h>
 
 #include "ctree.h"
 #include "ioctl.h"
@@ -199,6 +200,10 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
    char uuid_str[BTRFS_UUID_UNPARSED_SIZE];
    struct btrfs_ioctl_vol_args_v2 args_v2;
    struct subvol_info *parent_subvol = NULL;
+   char *dev = NULL;
+   char tmp_name[15] = "btrfs-XX";
+   char tmp_dir[30] = "/tmp";
+   char *full_path = NULL;
 
    ret = finish_subvol(r);
    if (ret < 0)
@@ -253,13 +258,47 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
            }
    }*/
 
-   args_v2.fd = openat(r->mnt_fd, parent_subvol->path,
-                       O_RDONLY | O_NOATIME);
+   ret = mnt_to_dev(r->root_path, &dev);
+   if (ret)
+           goto out;
+   if (!mktemp(tmp_name)) {
+           fprintf(stderr, "ERROR: fail to generate a tmp file\n");
+           goto out;
+   }
+   strncat(tmp_dir, "/", 1);
+   strncat(tmp_dir, tmp_name, strlen(tmp_name));
+
+   ret = mkdir(tmp_dir, 0777);
+   if (ret) {
+           fprintf(stderr, "ERROR: fail to make dir: %s\n", tmp_dir);
+           goto out;
+   }
+   /* if we change default subvolume, using btrfs interval
+    * subvolume path to lookup may return us ENOENT.To handle
+    * such case, we mount this btrfs filesystem other place
+    * where we use fs tree as our default subvolume.
+    */
+   ret = mount(dev, tmp_dir, "btrfs", 0, "-o subvolid=5");
+   if (ret) {
+           fprintf(stderr, "ERROR: fail to mount dev: %s", dev);
+           goto out;
+   }
+
+   full_path = calloc(1, strlen(parent_subvol->path) + strlen(tmp_dir));
+   if (!full_path) {
+           ret = -ENOMEM;
+           goto out_umount;
+   }
+   strncat(full_path, tmp_dir, strlen(tmp_dir));
+   strncat(full_path, "/", 1);
+   strncat(full_path, parent_subvol->path, strlen(parent_subvol->path));
+
+   args_v2.fd = open(full_path, O_RDONLY | O_NOATIME);
    if (args_v2.fd < 0) {
            ret = -errno;
            fprintf(stderr, "ERROR: open %s failed. %s\n",
                    parent_subvol->path, strerror(-ret));
-           goto out;
+           goto out_umount;
    }
 
    ret = ioctl(r->dest_dir_fd, BTRFS_IOC_SNAP_CREATE_V2, &args_v2);
@@ -269,10 +308,14 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
            fprintf(stderr, "ERROR: creating snapshot %s -> %s "
                    "failed. %s\n", parent_subvol->path,
                    path, strerror(-ret));
-           goto out;
    }
 
+out_umount:
+   umount(tmp_dir);
+   rmdir(tmp_dir);
 out:
+   free(full_path);
+   free(dev);
    if (parent_subvol) {
            free(parent_subvol->path);
            free(parent_subvol);
diff --git a/utils.c b/utils.c
index a92696e..da5291b 100644
--- a/utils.c
+++ b/utils.c
@@ -2194,6 +2194,34 @@ out:
return ret;
 }
 
+/*
+ * Given mount point, this function will return
+ * its corresponding device
+ */
+int mnt_to_dev(const char *mnt_dir, char **dev)
+{
+   struct mntent *mnt;
+   FILE *f;
+   int ret = -1;
+
+   f = setmntent("/proc/self/mounts", "r");
+   if (f == NULL)
+           return ret;
+   while ((mnt = getmntent(f)) != NULL) {
+           if (strcmp(mnt->mnt_type, "btrfs"))
+                   continue;
+           if (strcmp(mnt->mnt_dir, mnt_dir))
+                   continue;
+           *dev = strdup(mnt->mnt_fsname);
+           if (*dev)
+                   ret = 0;
+           break;
+   }
+   endmntent(f);
+
+   return ret;
+}
+
 /* This finds the mount point for a given fsid,
  *  subvols of the same fs/fsid can be mounted
  *  so here this picks and lowest subvol id
diff --git a/utils.h b/utils.h
index 00f1c18..9b2f79c 100644
--- a/utils.h
+++ b/utils.h
@@ -98,5 +98,6 @@ int btrfs_scan_lblkid(int 
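
When the temporary mount point and the parent subvolume path are joined as in process_snapshot() above, the buffer needs room for the separator and the terminating NUL in addition to the two strings. The following is a minimal sketch of one way to size and build such a path; join_path() is a hypothetical helper and is not part of btrfs-progs.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated "dir/name", or NULL. */
static char *join_path(const char *dir, const char *name)
{
	/* +1 for the '/' separator, +1 for the terminating NUL */
	size_t len = strlen(dir) + 1 + strlen(name) + 1;
	char *full = malloc(len);

	if (!full)
		return NULL;
	snprintf(full, len, "%s/%s", dir, name);
	return full;	/* the caller frees the result */
}
```

Using snprintf() with the computed length keeps the write bounded even if the size calculation is changed later.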

[PATCH v4 01/18] btrfs: Cleanup the unused struct async_sched.

2013-12-17 Thread Qu Wenruo
The struct async_sched is not used by any code and can be removed.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
Reviewed-by: Josef Bacik jba...@fusionio.com
---
Changelog:
v1->v2:
  None.
v2->v3:
  None.
v3->v4:
  None.
---
 fs/btrfs/volumes.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 92303f4..c63ed39 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5322,13 +5322,6 @@ static void btrfs_end_bio(struct bio *bio, int err)
}
 }
 
-struct async_sched {
-   struct bio *bio;
-   int rw;
-   struct btrfs_fs_info *info;
-   struct btrfs_work work;
-};
-
 /*
  * see run_scheduled_bios for a description of why bios are collected for
  * async submit.
-- 
1.8.5.1



[PATCH v4 00/18] Replace btrfs_workers with kernel workqueue based btrfs_workqueue

2013-12-17 Thread Qu Wenruo
Add a new btrfs_workqueue_struct, which uses the kernel workqueue to
implement most of the original btrfs_workers functionality, and use it to
replace btrfs_workers.

With this patchset, the redundant workqueue code is replaced with the kernel
workqueue infrastructure, which reduces both the code size and the effort
needed to maintain it.

The sysbench results show a minor improvement on the following server:
CPU: two-way Xeon X5660
RAM: 4G 
HDD: SAS HDD, 150G total, 100G partition for btrfs test

Test result on default mount option:
https://docs.google.com/spreadsheet/ccc?key=0AhpkL3ehzX3pdENjajJTWFg5d1BWbExnYWFpMTJxeUEusp=sharing

Test result on -o compress mount option:
https://docs.google.com/spreadsheet/ccc?key=0AhpkL3ehzX3pdHdTTEJ6OW96SXJFaDR5enB1SzMzc0Eusp=sharing

Changelog:
v1->v2:
  - Fix some workqueue flags.
v2->v3:
  - Add the thresholding mechanism to simulate the old behavior.
  - Convert all the btrfs_workers to btrfs_workqueue_struct.
  - Fix some potential deadlocks when executed in an IRQ handler.
v3->v4:
  - Change the ordered workqueue implementation to fix the performance drop
    in 32K multi-thread random write.
  - Change the high priority workqueue implementation to get an independent
    high priority workqueue without the starvation problem.
  - Simplify the btrfs_alloc_workqueue parameters.
  - Coding style cleanup.
  - Remove the redundant _struct suffix.

Qu Wenruo (18):
  btrfs: Cleanup the unused struct async_sched.
  btrfs: Added btrfs_workqueue_struct implemented ordered execution
    based on kernel workqueue
  btrfs: Add high priority workqueue support for btrfs_workqueue_struct
  btrfs: Add threshold workqueue based on kernel workqueue
  btrfs: Replace fs_info->workers with btrfs_workqueue.
  btrfs: Replace fs_info->delalloc_workers with btrfs_workqueue
  btrfs: Replace fs_info->submit_workers with btrfs_workqueue.
  btrfs: Replace fs_info->flush_workers with btrfs_workqueue.
  btrfs: Replace fs_info->endio_* workqueue with btrfs_workqueue.
  btrfs: Replace fs_info->rmw_workers workqueue with btrfs_workqueue.
  btrfs: Replace fs_info->cache_workers workqueue with btrfs_workqueue.
  btrfs: Replace fs_info->readahead_workers workqueue with
    btrfs_workqueue.
  btrfs: Replace fs_info->fixup_workers workqueue with btrfs_workqueue.
  btrfs: Replace fs_info->delayed_workers workqueue with
    btrfs_workqueue.
  btrfs: Replace fs_info->qgroup_rescan_worker workqueue with
    btrfs_workqueue.
  btrfs: Replace fs_info->scrub_* workqueue with btrfs_workqueue.
  btrfs: Cleanup the old btrfs_worker.
  btrfs: Cleanup the _struct suffix in btrfs_workequeue

 fs/btrfs/async-thread.c  | 821 ---
 fs/btrfs/async-thread.h  | 117 ++-
 fs/btrfs/ctree.h |  39 ++-
 fs/btrfs/delayed-inode.c |   6 +-
 fs/btrfs/disk-io.c   | 212 +---
 fs/btrfs/extent-tree.c   |   4 +-
 fs/btrfs/inode.c |  38 +--
 fs/btrfs/ordered-data.c  |  11 +-
 fs/btrfs/qgroup.c|  15 +-
 fs/btrfs/raid56.c|  21 +-
 fs/btrfs/reada.c |   4 +-
 fs/btrfs/scrub.c |  70 ++--
 fs/btrfs/super.c |  36 +--
 fs/btrfs/volumes.c   |  16 +-
 14 files changed, 430 insertions(+), 980 deletions(-)

-- 
1.8.5.1



[PATCH v4 04/18] btrfs: Add threshold workqueue based on kernel workqueue

2013-12-17 Thread Qu Wenruo
The original btrfs_workers has thresholding functions to dynamically
create or destroy kthreads.

The kernel workqueue has no such function, because its workers are not
created manually, but we can still use workqueue_set_max_active to
simulate the behavior, mainly to achieve better HDD performance by
setting a high threshold on submit_workers.
(Sadly, no resources can be saved.)

So this patch introduces extra workqueue pending counters to dynamically
change the max active of each btrfs_workqueue_struct, in the hope of
restoring the behavior of the original thresholding function.

Also, workqueue_set_max_active uses a mutex to protect workqueue_struct
and is not meant to be called too frequently, so a new interval mechanism
is applied that only calls workqueue_set_max_active after a certain number
of work items have been queued. This is intended to balance random and
sequential performance on HDD.
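
The mechanism described above can be modelled in userspace as follows. This is only a sketch under assumptions and not the code from this patch: struct thresh_state, on_queue(), on_exec() and the explicit interval field are hypothetical names, the locking is left out, and the exact adjustment rule (raise the limit above thresh, lower it below thresh / 2) is an assumption made for illustration.

```c
#include <assert.h>

#define NO_THRESHOLD (-1)

/* Userspace model of the per-queue thresholding state. */
struct thresh_state {
	int pending;		/* queued but not yet executed work items */
	int max_active;		/* upper bound requested by the caller */
	int current_max;	/* limit currently applied to the workqueue */
	int thresh;		/* NO_THRESHOLD disables the adjustment */
	unsigned int count;	/* executed items since the last adjustment */
	unsigned int interval;	/* adjust only once per this many items */
};

static void on_queue(struct thresh_state *s)
{
	if (s->thresh != NO_THRESHOLD)
		s->pending++;
}

/*
 * Returns 1 if current_max changed, which is the point where real code
 * would call workqueue_set_max_active(), and 0 otherwise.
 */
static int on_exec(struct thresh_state *s)
{
	int new_max;

	if (s->thresh == NO_THRESHOLD)
		return 0;
	s->pending--;
	if (++s->count < s->interval)
		return 0;	/* rate-limit the expensive update */
	s->count = 0;

	new_max = s->current_max;
	if (s->pending > s->thresh)
		new_max++;
	else if (s->pending < s->thresh / 2)
		new_max--;
	if (new_max < 1)
		new_max = 1;
	if (new_max > s->max_active)
		new_max = s->max_active;
	if (new_max == s->current_max)
		return 0;
	s->current_max = new_max;
	return 1;
}
```

The interval keeps the number of limit updates small compared with the number of work items, which is the point of the rate limiting described in the commit message.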

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v2->v3:
  - Add thresholding mechanism to simulate the old thresholding mechanism.
  - Will not enable thresholding when thresh is set to a small value.
v3->v4:
  None
---
 fs/btrfs/async-thread.c | 107 
 fs/btrfs/async-thread.h |   3 +-
 2 files changed, 101 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index 73b9f94..a986be7 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -30,6 +30,9 @@
 #define WORK_ORDER_DONE_BIT 2
 #define WORK_HIGH_PRIO_BIT 3
 
+#define NO_THRESHOLD (-1)
+#define DFT_THRESHOLD (32)
+
 /*
  * container for the kthread task pointer and the list of pending work
  * One of these is allocated per thread.
@@ -736,6 +739,14 @@ struct __btrfs_workqueue_struct {
 
/* Spinlock for ordered_list */
spinlock_t list_lock;
+
+   /* Thresholding related variants */
+   atomic_t pending;
+   int max_active;
+   int current_max;
+   int thresh;
+   unsigned int count;
+   spinlock_t thres_lock;
 };
 
 struct btrfs_workqueue_struct {
@@ -744,19 +755,34 @@ struct btrfs_workqueue_struct {
 };
 
 static inline struct __btrfs_workqueue_struct
-*__btrfs_alloc_workqueue(char *name, int flags, int max_active)
+*__btrfs_alloc_workqueue(char *name, int flags, int max_active, int thresh)
 {
struct __btrfs_workqueue_struct *ret = kzalloc(sizeof(*ret), GFP_NOFS);
 
if (unlikely(!ret))
return NULL;
 
+   ret->max_active = max_active;
+   atomic_set(&ret->pending, 0);
+   if (thresh == 0)
+           thresh = DFT_THRESHOLD;
+   /* For low threshold, disabling threshold is a better choice */
+   if (thresh < DFT_THRESHOLD) {
+           ret->current_max = max_active;
+           ret->thresh = NO_THRESHOLD;
+   } else {
+           ret->current_max = 1;
+           ret->thresh = thresh;
+   }
+
    if (flags & WQ_HIGHPRI)
            ret->normal_wq = alloc_workqueue("%s-%s-high", flags,
-                                            max_active, "btrfs", name);
+                                            ret->max_active,
+                                            "btrfs", name);
    else
            ret->normal_wq = alloc_workqueue("%s-%s", flags,
-                                            max_active, "btrfs", name);
+                                            ret->max_active, "btrfs",
+                                            name);
    if (unlikely(!ret->normal_wq)) {
            kfree(ret);
            return NULL;
@@ -764,6 +790,7 @@ static inline struct __btrfs_workqueue_struct
 
    INIT_LIST_HEAD(&ret->ordered_list);
    spin_lock_init(&ret->list_lock);
+   spin_lock_init(&ret->thres_lock);
return ret;
 }
 
@@ -772,7 +799,8 @@ __btrfs_destroy_workqueue(struct __btrfs_workqueue_struct 
*wq);
 
 struct btrfs_workqueue_struct *btrfs_alloc_workqueue(char *name,
 int flags,
-int max_active)
+int max_active,
+int thresh)
 {
struct btrfs_workqueue_struct *ret = kzalloc(sizeof(*ret), GFP_NOFS);
 
@@ -780,14 +808,15 @@ struct btrfs_workqueue_struct *btrfs_alloc_workqueue(char 
*name,
return NULL;
 
    ret->normal = __btrfs_alloc_workqueue(name, flags & ~WQ_HIGHPRI,
-                                         max_active);
+                                         max_active, thresh);
    if (unlikely(!ret->normal)) {
            kfree(ret);
            return NULL;
    }
 
    if (flags & WQ_HIGHPRI) {
-           ret->high = __btrfs_alloc_workqueue(name, flags, max_active);
+           ret->high = __btrfs_alloc_workqueue(name, flags, max_active,
+                                               thresh);
if 
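
The threshold selection done in __btrfs_alloc_workqueue() above can be isolated into a small userspace sketch. The two macros are the ones from the patch; struct thresh_cfg and pick_threshold() are hypothetical names used only for this illustration.

```c
#include <assert.h>

#define NO_THRESHOLD  (-1)
#define DFT_THRESHOLD (32)

/* Result of the selection: the effective threshold and starting limit. */
struct thresh_cfg {
	int thresh;
	int current_max;
};

static struct thresh_cfg pick_threshold(int thresh, int max_active)
{
	struct thresh_cfg cfg;

	if (thresh == 0)
		thresh = DFT_THRESHOLD;		/* 0 selects the default */
	if (thresh < DFT_THRESHOLD) {
		/* low threshold: disable thresholding, run at full limit */
		cfg.thresh = NO_THRESHOLD;
		cfg.current_max = max_active;
	} else {
		/* thresholding enabled: start at 1 and grow on demand */
		cfg.thresh = thresh;
		cfg.current_max = 1;
	}
	return cfg;
}
```

So a caller passing 0 gets the default of 32, a caller passing a value below 32 gets no thresholding at all, and a caller passing 64, as the submit workqueue does later in this series, starts with a limit of 1.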

[PATCH v4 07/18] btrfs: Replace fs_info->submit_workers with btrfs_workqueue.

2013-12-17 Thread Qu Wenruo
Much like fs_info->workers, replace fs_info->submit_workers with
the same btrfs_workqueue.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  None
v3->v4:
  - Use the simplified btrfs_alloc_workqueue API.
---
 fs/btrfs/ctree.h   |  2 +-
 fs/btrfs/disk-io.c | 17 +
 fs/btrfs/super.c   |  2 +-
 fs/btrfs/volumes.c | 11 ++-
 fs/btrfs/volumes.h |  2 +-
 5 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index a86c9a1..4411a2b 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1499,7 +1499,7 @@ struct btrfs_fs_info {
struct btrfs_workers endio_meta_write_workers;
struct btrfs_workers endio_write_workers;
struct btrfs_workers endio_freespace_worker;
-   struct btrfs_workers submit_workers;
+   struct btrfs_workqueue_struct *submit_workers;
struct btrfs_workers caching_workers;
struct btrfs_workers readahead_workers;
 
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 1098435..cda9766 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2017,7 +2017,7 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
    btrfs_stop_workers(&fs_info->endio_meta_write_workers);
    btrfs_stop_workers(&fs_info->endio_write_workers);
    btrfs_stop_workers(&fs_info->endio_freespace_worker);
-   btrfs_stop_workers(&fs_info->submit_workers);
+   btrfs_destroy_workqueue(fs_info->submit_workers);
    btrfs_stop_workers(&fs_info->delayed_workers);
    btrfs_stop_workers(&fs_info->caching_workers);
    btrfs_stop_workers(&fs_info->readahead_workers);
@@ -2482,18 +2482,19 @@ int open_ctree(struct super_block *sb,
    btrfs_init_workers(&fs_info->flush_workers, "flush_delalloc",
                       fs_info->thread_pool_size, NULL);
 
-   btrfs_init_workers(&fs_info->submit_workers, "submit",
-                      min_t(u64, fs_devices->num_devices,
-                      fs_info->thread_pool_size), NULL);
 
    btrfs_init_workers(&fs_info->caching_workers, "cache",
                       fs_info->thread_pool_size, NULL);
 
-   /* a higher idle thresh on the submit workers makes it much more
+   /*
+    * a higher idle thresh on the submit workers makes it much more
     * likely that bios will be send down in a sane order to the
     * devices
     */
-   fs_info->submit_workers.idle_thresh = 64;
+   fs_info->submit_workers =
+           btrfs_alloc_workqueue("submit", flags,
+                                 min_t(u64, fs_devices->num_devices,
+                                       max_active), 64);
 
    btrfs_init_workers(&fs_info->fixup_workers, "fixup", 1,
                       &fs_info->generic_worker);
@@ -2544,7 +2545,6 @@ int open_ctree(struct super_block *sb,
     * return -ENOMEM if any of these fail.
     */
    ret = btrfs_start_workers(&fs_info->generic_worker);
-   ret |= btrfs_start_workers(&fs_info->submit_workers);
    ret |= btrfs_start_workers(&fs_info->fixup_workers);
    ret |= btrfs_start_workers(&fs_info->endio_workers);
    ret |= btrfs_start_workers(&fs_info->endio_meta_workers);
@@ -2562,7 +2562,8 @@ int open_ctree(struct super_block *sb,
            err = -ENOMEM;
            goto fail_sb_buffer;
    }
-   if (!(fs_info->workers && fs_info->delalloc_workers)) {
+   if (!(fs_info->workers && fs_info->delalloc_workers &&
+         fs_info->submit_workers)) {
            err = -ENOMEM;
            goto fail_sb_buffer;
    }
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 875560e..9f1d0a5 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1247,7 +1247,7 @@ static void btrfs_resize_thread_pool(struct btrfs_fs_info *fs_info,
    btrfs_set_max_workers(&fs_info->generic_worker, new_pool_size);
    btrfs_workqueue_set_max(fs_info->workers, new_pool_size);
    btrfs_workqueue_set_max(fs_info->delalloc_workers, new_pool_size);
-   btrfs_set_max_workers(&fs_info->submit_workers, new_pool_size);
+   btrfs_workqueue_set_max(fs_info->submit_workers, new_pool_size);
    btrfs_set_max_workers(&fs_info->caching_workers, new_pool_size);
    btrfs_set_max_workers(&fs_info->fixup_workers, new_pool_size);
    btrfs_set_max_workers(&fs_info->endio_workers, new_pool_size);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c63ed39..e07bd64 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -415,7 +415,8 @@ loop_lock:
    device->running_pending = 1;
 
    spin_unlock(&device->io_lock);
-   btrfs_requeue_work(&device->work);
+   btrfs_queue_work(fs_info->submit_workers,
+                    &device->work);
    goto done;
    }
    /* unplug every 64 requests just for good measure */
@@ -439,7 +440,7 @@ done:
    blk_finish_plug(&plug);

[PATCH v4 16/18] btrfs: Replace fs_info->scrub_* workqueue with btrfs_workqueue.

2013-12-17 Thread Qu Wenruo
Replace the fs_info->scrub_* workers with the newly created
btrfs_workqueue.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  - Use the btrfs_workqueue_struct to replace scrub_*.
v3->v4:
  - Use the simplified btrfs_alloc_workqueue API.
---
 fs/btrfs/ctree.h |  6 ++--
 fs/btrfs/scrub.c | 93 ++--
 fs/btrfs/super.c |  4 +--
 3 files changed, 55 insertions(+), 48 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index df51fa3..5d71258 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1587,9 +1587,9 @@ struct btrfs_fs_info {
atomic_t scrub_cancel_req;
wait_queue_head_t scrub_pause_wait;
int scrub_workers_refcnt;
-   struct btrfs_workers scrub_workers;
-   struct btrfs_workers scrub_wr_completion_workers;
-   struct btrfs_workers scrub_nocow_workers;
+   struct btrfs_workqueue_struct *scrub_workers;
+   struct btrfs_workqueue_struct *scrub_wr_completion_workers;
+   struct btrfs_workqueue_struct *scrub_nocow_workers;
 
 #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
u32 check_integrity_print_mask;
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 561e2f1..1618d6d 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -96,7 +96,8 @@ struct scrub_bio {
 #endif
int page_count;
int next_free;
-   struct btrfs_work   work;
+   struct btrfs_work_struct
+   work;
 };
 
 struct scrub_block {
@@ -154,7 +155,8 @@ struct scrub_fixup_nodatasum {
struct btrfs_device *dev;
u64 logical;
struct btrfs_root   *root;
-   struct btrfs_work   work;
+   struct btrfs_work_struct
+   work;
int mirror_num;
 };
 
@@ -172,7 +174,8 @@ struct scrub_copy_nocow_ctx {
    int mirror_num;
    u64 physical_for_dev_replace;
    struct list_head    inodes;
-   struct btrfs_work   work;
+   struct btrfs_work_struct
+                       work;
 };
 
 struct scrub_warning {
@@ -232,7 +235,7 @@ static int scrub_pages(struct scrub_ctx *sctx, u64 logical, 
u64 len,
   u64 gen, int mirror_num, u8 *csum, int force,
   u64 physical_for_dev_replace);
 static void scrub_bio_end_io(struct bio *bio, int err);
-static void scrub_bio_end_io_worker(struct btrfs_work *work);
+static void scrub_bio_end_io_worker(struct btrfs_work_struct *work);
 static void scrub_block_complete(struct scrub_block *sblock);
 static void scrub_remap_extent(struct btrfs_fs_info *fs_info,
   u64 extent_logical, u64 extent_len,
@@ -249,14 +252,14 @@ static int scrub_add_page_to_wr_bio(struct scrub_ctx 
*sctx,
struct scrub_page *spage);
 static void scrub_wr_submit(struct scrub_ctx *sctx);
 static void scrub_wr_bio_end_io(struct bio *bio, int err);
-static void scrub_wr_bio_end_io_worker(struct btrfs_work *work);
+static void scrub_wr_bio_end_io_worker(struct btrfs_work_struct *work);
 static int write_page_nocow(struct scrub_ctx *sctx,
u64 physical_for_dev_replace, struct page *page);
 static int copy_nocow_pages_for_inode(u64 inum, u64 offset, u64 root,
  struct scrub_copy_nocow_ctx *ctx);
 static int copy_nocow_pages(struct scrub_ctx *sctx, u64 logical, u64 len,
int mirror_num, u64 physical_for_dev_replace);
-static void copy_nocow_pages_worker(struct btrfs_work *work);
+static void copy_nocow_pages_worker(struct btrfs_work_struct *work);
 
 
 static void scrub_pending_bio_inc(struct scrub_ctx *sctx)
@@ -394,7 +397,8 @@ struct scrub_ctx *scrub_setup_ctx(struct btrfs_device *dev, int is_dev_replace)
sbio->index = i;
sbio->sctx = sctx;
sbio->page_count = 0;
-   sbio->work.func = scrub_bio_end_io_worker;
+   btrfs_init_work(&sbio->work, scrub_bio_end_io_worker,
+   NULL, NULL);
 
if (i != SCRUB_BIOS_PER_SCTX - 1)
sctx->bios[i]->next_free = i + 1;
@@ -699,7 +703,7 @@ out:
return -EIO;
 }
 
-static void scrub_fixup_nodatasum(struct btrfs_work *work)
+static void scrub_fixup_nodatasum(struct btrfs_work_struct *work)
 {
int ret;
struct scrub_fixup_nodatasum *fixup;
@@ -965,9 +969,10 @@ nodatasum_case:
fixup_nodatasum->root = fs_info->extent_root;
fixup_nodatasum->mirror_num = failed_mirror_index + 1;
scrub_pending_trans_workers_inc(sctx);
-   fixup_nodatasum->work.func = scrub_fixup_nodatasum;
-   btrfs_queue_worker(&fs_info->scrub_workers,
-  &fixup_nodatasum->work);
+   

[PATCH v4 18/18] btrfs: Cleanup the _struct suffix in btrfs_workequeue

2013-12-17 Thread Qu Wenruo
Since the _struct suffix was mainly used to distinguish the newly created
btrfs_work from the original one, the suffix is no longer needed now that
all btrfs_workers have been converted to btrfs_workqueue.

This patch also fixes the code style of some lines that had been wrapped
because of the overly long _struct suffix.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v3->v4:
  - Remove the _struct suffix.
---
 fs/btrfs/async-thread.c  | 64 
 fs/btrfs/async-thread.h  | 34 -
 fs/btrfs/ctree.h | 44 -
 fs/btrfs/delayed-inode.c |  4 +--
 fs/btrfs/disk-io.c   | 14 +--
 fs/btrfs/extent-tree.c   |  2 +-
 fs/btrfs/inode.c | 18 +++---
 fs/btrfs/ordered-data.c  |  2 +-
 fs/btrfs/ordered-data.h  |  4 +--
 fs/btrfs/qgroup.c|  2 +-
 fs/btrfs/raid56.c| 14 +--
 fs/btrfs/reada.c |  5 ++--
 fs/btrfs/scrub.c | 23 -
 fs/btrfs/volumes.c   |  2 +-
 fs/btrfs/volumes.h   |  2 +-
 15 files changed, 115 insertions(+), 119 deletions(-)

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index 16a5eec..f896426 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -32,7 +32,7 @@
 #define NO_THRESHOLD (-1)
 #define DFT_THRESHOLD (32)
 
-struct __btrfs_workqueue_struct {
+struct __btrfs_workqueue {
struct workqueue_struct *normal_wq;
/* List head pointing to ordered work list */
struct list_head ordered_list;
@@ -49,15 +49,15 @@ struct __btrfs_workqueue_struct {
spinlock_t thres_lock;
 };
 
-struct btrfs_workqueue_struct {
-   struct __btrfs_workqueue_struct *normal;
-   struct __btrfs_workqueue_struct *high;
+struct btrfs_workqueue {
+   struct __btrfs_workqueue *normal;
+   struct __btrfs_workqueue *high;
 };
 
-static inline struct __btrfs_workqueue_struct
+static inline struct __btrfs_workqueue
 *__btrfs_alloc_workqueue(char *name, int flags, int max_active, int thresh)
 {
-   struct __btrfs_workqueue_struct *ret = kzalloc(sizeof(*ret), GFP_NOFS);
+   struct __btrfs_workqueue *ret = kzalloc(sizeof(*ret), GFP_NOFS);
 
if (unlikely(!ret))
return NULL;
@@ -95,14 +95,14 @@ static inline struct __btrfs_workqueue_struct
 }
 
 static inline void
-__btrfs_destroy_workqueue(struct __btrfs_workqueue_struct *wq);
+__btrfs_destroy_workqueue(struct __btrfs_workqueue *wq);
 
-struct btrfs_workqueue_struct *btrfs_alloc_workqueue(char *name,
-int flags,
-int max_active,
-int thresh)
+struct btrfs_workqueue *btrfs_alloc_workqueue(char *name,
+ int flags,
+ int max_active,
+ int thresh)
 {
-   struct btrfs_workqueue_struct *ret = kzalloc(sizeof(*ret), GFP_NOFS);
+   struct btrfs_workqueue *ret = kzalloc(sizeof(*ret), GFP_NOFS);
 
if (unlikely(!ret))
return NULL;
@@ -131,7 +131,7 @@ struct btrfs_workqueue_struct *btrfs_alloc_workqueue(char *name,
  * This hook WILL be called in IRQ handler context,
  * so workqueue_set_max_active MUST NOT be called in this hook
  */
-static inline void thresh_queue_hook(struct __btrfs_workqueue_struct *wq)
+static inline void thresh_queue_hook(struct __btrfs_workqueue *wq)
 {
if (wq->thresh == NO_THRESHOLD)
return;
@@ -143,7 +143,7 @@ static inline void thresh_queue_hook(struct __btrfs_workqueue_struct *wq)
  * This hook is called in kthread content.
  * So workqueue_set_max_active is called here.
  */
-static inline void thresh_exec_hook(struct __btrfs_workqueue_struct *wq)
+static inline void thresh_exec_hook(struct __btrfs_workqueue *wq)
 {
int new_max_active;
long pending;
@@ -186,10 +186,10 @@ out:
}
 }
 
-static void run_ordered_work(struct __btrfs_workqueue_struct *wq)
+static void run_ordered_work(struct __btrfs_workqueue *wq)
 {
struct list_head *list = &wq->ordered_list;
-   struct btrfs_work_struct *work;
+   struct btrfs_work *work;
spinlock_t *lock = &wq->list_lock;
unsigned long flags;
 
@@ -197,7 +197,7 @@ static void run_ordered_work(struct __btrfs_workqueue_struct *wq)
spin_lock_irqsave(lock, flags);
if (list_empty(list))
break;
-   work = list_entry(list->next, struct btrfs_work_struct,
+   work = list_entry(list->next, struct btrfs_work,
  ordered_list);
if (!test_bit(WORK_DONE_BIT, &work->flags))
break;
@@ -229,10 +229,10 @@ static void run_ordered_work(struct __btrfs_workqueue_struct *wq)
 
 static void normal_work_helper(struct work_struct 

[PATCH v4 12/18] btrfs: Replace fs_info->readahead_workers workqueue with btrfs_workqueue.

2013-12-17 Thread Qu Wenruo
Replace the fs_info->readahead_workers with the newly created
btrfs_workqueue.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  - Use the btrfs_workqueue_struct to replace readahead_workers.
v3->v4:
  - Use the simplified btrfs_alloc_workqueue API.
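
Note for readers: the conversions in this series all replace "set work.func by hand, then btrfs_queue_worker()" with "btrfs_init_work(), then btrfs_queue_work()". The following is only a minimal userspace sketch of why a single initializer helps, not the kernel implementation. All demo_* names, the field names ordered_func/ordered_free, and the synchronous "queue" are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct demo_work;
typedef void (*demo_work_fn)(struct demo_work *);

/* Hypothetical stand-in for a work item; not the kernel structure. */
struct demo_work {
	demo_work_fn func;          /* main handler */
	demo_work_fn ordered_func;  /* optional callback, may be NULL */
	demo_work_fn ordered_free;  /* optional callback, may be NULL */
	unsigned long flags;
	int ran;                    /* demo-only: counts handler invocations */
};

/* One initializer sets every callback and clears flags, so a caller
 * cannot forget the "flags = 0" step that open-coded setup needed. */
static void demo_init_work(struct demo_work *work, demo_work_fn func,
			   demo_work_fn ordered_func, demo_work_fn ordered_free)
{
	assert(work != NULL && func != NULL);
	work->func = func;
	work->ordered_func = ordered_func;
	work->ordered_free = ordered_free;
	work->flags = 0;
}

/* Stand-in for queueing: simply runs the handler synchronously. */
static void demo_queue_work(struct demo_work *work)
{
	work->func(work);
}

/* Example handler that records that it was called. */
static void demo_handler(struct demo_work *work)
{
	work->ran++;
}
```

A caller would do demo_init_work(&w, demo_handler, NULL, NULL) followed by demo_queue_work(&w), mirroring the two-call shape used in reada.c below.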
---
 fs/btrfs/ctree.h   |  2 +-
 fs/btrfs/disk-io.c | 12 
 fs/btrfs/reada.c   |  9 +
 fs/btrfs/super.c   |  2 +-
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 8630986..302dc46 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1501,7 +1501,7 @@ struct btrfs_fs_info {
struct btrfs_workqueue_struct *endio_freespace_worker;
struct btrfs_workqueue_struct *submit_workers;
struct btrfs_workqueue_struct *caching_workers;
-   struct btrfs_workers readahead_workers;
+   struct btrfs_workqueue_struct *readahead_workers;
 
/*
 * fixup workers take dirty pages that didn't properly go through
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d8f42d2..4d49d87 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2019,7 +2019,7 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
btrfs_destroy_workqueue(fs_info->submit_workers);
btrfs_stop_workers(&fs_info->delayed_workers);
btrfs_destroy_workqueue(fs_info->caching_workers);
-   btrfs_stop_workers(&fs_info->readahead_workers);
+   btrfs_destroy_workqueue(fs_info->readahead_workers);
btrfs_destroy_workqueue(fs_info->flush_workers);
btrfs_stop_workers(&fs_info->qgroup_rescan_workers);
 }
@@ -2518,14 +2518,11 @@ int open_ctree(struct super_block *sb,
btrfs_init_workers(&fs_info->delayed_workers, "delayed-meta",
   fs_info->thread_pool_size,
   &fs_info->generic_worker);
-   btrfs_init_workers(&fs_info->readahead_workers, "readahead",
-  fs_info->thread_pool_size,
-  &fs_info->generic_worker);
+   fs_info->readahead_workers =
+   btrfs_alloc_workqueue("readahead", flags, max_active, 2);
btrfs_init_workers(&fs_info->qgroup_rescan_workers, "qgroup-rescan", 1,
   &fs_info->generic_worker);
 
-   fs_info->readahead_workers.idle_thresh = 2;
-
/*
 * btrfs_start_workers can really only fail because of ENOMEM so just
 * return -ENOMEM if any of these fail.
@@ -2533,7 +2530,6 @@ int open_ctree(struct super_block *sb,
ret = btrfs_start_workers(&fs_info->generic_worker);
ret |= btrfs_start_workers(&fs_info->fixup_workers);
ret |= btrfs_start_workers(&fs_info->delayed_workers);
-   ret |= btrfs_start_workers(&fs_info->readahead_workers);
ret |= btrfs_start_workers(&fs_info->qgroup_rescan_workers);
if (ret) {
err = -ENOMEM;
@@ -2545,7 +2541,7 @@ int open_ctree(struct super_block *sb,
  fs_info->endio_meta_write_workers &&
  fs_info->endio_write_workers && fs_info->endio_raid56_workers &&
  fs_info->endio_freespace_worker && fs_info->rmw_workers &&
- fs_info->caching_workers)) {
+ fs_info->caching_workers && fs_info->readahead_workers)) {
err = -ENOMEM;
goto fail_sb_buffer;
}
diff --git a/fs/btrfs/reada.c b/fs/btrfs/reada.c
index 1031b69..854b69a 100644
--- a/fs/btrfs/reada.c
+++ b/fs/btrfs/reada.c
@@ -91,7 +91,8 @@ struct reada_zone {
 };
 
 struct reada_machine_work {
-   struct btrfs_work   work;
+   struct btrfs_work_struct
+   work;
struct btrfs_fs_info*fs_info;
 };
 
@@ -732,7 +733,7 @@ static int reada_start_machine_dev(struct btrfs_fs_info 
*fs_info,
 
 }
 
-static void reada_start_machine_worker(struct btrfs_work *work)
+static void reada_start_machine_worker(struct btrfs_work_struct *work)
 {
struct reada_machine_work *rmw;
struct btrfs_fs_info *fs_info;
@@ -792,10 +793,10 @@ static void reada_start_machine(struct btrfs_fs_info *fs_info)
/* FIXME we cannot handle this properly right now */
BUG();
}
-   rmw->work.func = reada_start_machine_worker;
+   btrfs_init_work(&rmw->work, reada_start_machine_worker, NULL, NULL);
rmw->fs_info = fs_info;
 
-   btrfs_queue_worker(&fs_info->readahead_workers, &rmw->work);
+   btrfs_queue_work(fs_info->readahead_workers, &rmw->work);
 }
 
 #ifdef DEBUG
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 5bfe566..7a46e23 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1257,7 +1257,7 @@ static void btrfs_resize_thread_pool(struct btrfs_fs_info *fs_info,
btrfs_workqueue_set_max(fs_info->endio_write_workers, new_pool_size);
btrfs_workqueue_set_max(fs_info->endio_freespace_worker, new_pool_size);
btrfs_set_max_workers(&fs_info->delayed_workers, new_pool_size);
-   btrfs_set_max_workers(&fs_info->readahead_workers, 

[PATCH v4 17/18] btrfs: Cleanup the old btrfs_worker.

2013-12-17 Thread Qu Wenruo
Since every btrfs_worker has been replaced with the newly created
btrfs_workqueue, the old code can now be removed.

Signed-off-by: Quwenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  - Reuse the old async-thread.[ch] files.
v3->v4:
  - Reuse the old WORK_* bits.
---
 fs/btrfs/async-thread.c | 706 +---
 fs/btrfs/async-thread.h | 100 ---
 fs/btrfs/ctree.h|   1 -
 fs/btrfs/disk-io.c  |  12 -
 fs/btrfs/super.c|   8 -
 5 files changed, 3 insertions(+), 824 deletions(-)

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index a986be7..16a5eec 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -25,713 +25,13 @@
 #include <linux/workqueue.h>
 #include "async-thread.h"
 
-#define WORK_QUEUED_BIT 0
-#define WORK_DONE_BIT 1
-#define WORK_ORDER_DONE_BIT 2
-#define WORK_HIGH_PRIO_BIT 3
+#define WORK_DONE_BIT 0
+#define WORK_ORDER_DONE_BIT 1
+#define WORK_HIGH_PRIO_BIT 2
 
 #define NO_THRESHOLD (-1)
 #define DFT_THRESHOLD (32)
 
-/*
- * container for the kthread task pointer and the list of pending work
- * One of these is allocated per thread.
- */
-struct btrfs_worker_thread {
-   /* pool we belong to */
-   struct btrfs_workers *workers;
-
-   /* list of struct btrfs_work that are waiting for service */
-   struct list_head pending;
-   struct list_head prio_pending;
-
-   /* list of worker threads from struct btrfs_workers */
-   struct list_head worker_list;
-
-   /* kthread */
-   struct task_struct *task;
-
-   /* number of things on the pending list */
-   atomic_t num_pending;
-
-   /* reference counter for this struct */
-   atomic_t refs;
-
-   unsigned long sequence;
-
-   /* protects the pending list. */
-   spinlock_t lock;
-
-   /* set to non-zero when this thread is already awake and kicking */
-   int working;
-
-   /* are we currently idle */
-   int idle;
-};
-
-static int __btrfs_start_workers(struct btrfs_workers *workers);
-
-/*
- * btrfs_start_workers uses kthread_run, which can block waiting for memory
- * for a very long time.  It will actually throttle on page writeback,
- * and so it may not make progress until after our btrfs worker threads
- * process all of the pending work structs in their queue
- *
- * This means we can't use btrfs_start_workers from inside a btrfs worker
- * thread that is used as part of cleaning dirty memory, which pretty much
- * involves all of the worker threads.
- *
- * Instead we have a helper queue who never has more than one thread
- * where we scheduler thread start operations.  This worker_start struct
- * is used to contain the work and hold a pointer to the queue that needs
- * another worker.
- */
-struct worker_start {
-   struct btrfs_work work;
-   struct btrfs_workers *queue;
-};
-
-static void start_new_worker_func(struct btrfs_work *work)
-{
-   struct worker_start *start;
-   start = container_of(work, struct worker_start, work);
-   __btrfs_start_workers(start->queue);
-   kfree(start);
-}
-
-/*
- * helper function to move a thread onto the idle list after it
- * has finished some requests.
- */
-static void check_idle_worker(struct btrfs_worker_thread *worker)
-{
-   if (!worker->idle && atomic_read(&worker->num_pending) <
-   worker->workers->idle_thresh / 2) {
-   unsigned long flags;
-   spin_lock_irqsave(&worker->workers->lock, flags);
-   worker->idle = 1;
-
-   /* the list may be empty if the worker is just starting */
-   if (!list_empty(&worker->worker_list) &&
-   !worker->workers->stopping) {
-   list_move(&worker->worker_list,
-&worker->workers->idle_list);
-   }
-   spin_unlock_irqrestore(&worker->workers->lock, flags);
-   }
-}
-
-/*
- * helper function to move a thread off the idle list after new
- * pending work is added.
- */
-static void check_busy_worker(struct btrfs_worker_thread *worker)
-{
-   if (worker->idle && atomic_read(&worker->num_pending) >=
-   worker->workers->idle_thresh) {
-   unsigned long flags;
-   spin_lock_irqsave(&worker->workers->lock, flags);
-   worker->idle = 0;
-
-   if (!list_empty(&worker->worker_list) &&
-   !worker->workers->stopping) {
-   list_move_tail(&worker->worker_list,
- &worker->workers->worker_list);
-   }
-   spin_unlock_irqrestore(&worker->workers->lock, flags);
-   }
-}
-
-static void check_pending_worker_creates(struct btrfs_worker_thread *worker)
-{
-   struct btrfs_workers *workers = worker->workers;
-   struct worker_start *start;
-   unsigned long flags;
-
-   rmb();
-   if (!workers->atomic_start_pending)
-   return;
-
-   start = kzalloc(sizeof(*start), GFP_NOFS);

[PATCH v4 09/18] btrfs: Replace fs_info->endio_* workqueue with btrfs_workqueue.

2013-12-17 Thread Qu Wenruo
Replace the fs_info->endio_* workqueues with the newly created
btrfs_workqueue.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  - Use the btrfs_workqueue_struct to replace endio_*.
v3->v4:
  - Use the simplified btrfs_alloc_workqueue API.
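
Note for readers: end_workqueue_bio() in this patch routes each completed bio to one of the endio_* queues depending on whether it was a write and on its metadata tag. The routing can be sketched as a small userspace function; this is only a model of the if/else ladder in the diff below. The DEMO_ENDIO_* enum values are assumptions standing in for the kernel's BTRFS_WQ_ENDIO_* constants, and the function returns queue names as strings purely so the result is easy to inspect.

```c
#include <assert.h>
#include <string.h>

/* Illustrative tags; the real BTRFS_WQ_ENDIO_* values live in the kernel tree. */
enum demo_endio_type {
	DEMO_ENDIO_DATA = 0,
	DEMO_ENDIO_METADATA,
	DEMO_ENDIO_FREE_SPACE,
	DEMO_ENDIO_RAID56,
};

/* Return the name of the queue a completion would be sent to. */
static const char *demo_pick_endio_queue(int is_write, enum demo_endio_type type)
{
	if (is_write) {
		if (type == DEMO_ENDIO_METADATA)
			return "endio_meta_write_workers";
		if (type == DEMO_ENDIO_FREE_SPACE)
			return "endio_freespace_worker";
		if (type == DEMO_ENDIO_RAID56)
			return "endio_raid56_workers";
		return "endio_write_workers";
	}
	/* Read side: RAID56 first, then any other non-data tag, then data. */
	if (type == DEMO_ENDIO_RAID56)
		return "endio_raid56_workers";
	if (type != DEMO_ENDIO_DATA)
		return "endio_meta_workers";
	return "endio_workers";
}
```

Note that on the read side any non-zero tag other than RAID56 lands on the metadata queue, which matches the plain "else if (end_io_wq->metadata)" test in the diff.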
---
 fs/btrfs/ctree.h|  12 +++---
 fs/btrfs/disk-io.c  | 104 +---
 fs/btrfs/inode.c|  20 +-
 fs/btrfs/ordered-data.h |   2 +-
 fs/btrfs/super.c|  11 ++---
 5 files changed, 68 insertions(+), 81 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 097364d..5096164 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1492,13 +1492,13 @@ struct btrfs_fs_info {
struct btrfs_workqueue_struct *workers;
struct btrfs_workqueue_struct *delalloc_workers;
struct btrfs_workqueue_struct *flush_workers;
-   struct btrfs_workers endio_workers;
-   struct btrfs_workers endio_meta_workers;
-   struct btrfs_workers endio_raid56_workers;
+   struct btrfs_workqueue_struct *endio_workers;
+   struct btrfs_workqueue_struct *endio_meta_workers;
+   struct btrfs_workqueue_struct *endio_raid56_workers;
struct btrfs_workers rmw_workers;
-   struct btrfs_workers endio_meta_write_workers;
-   struct btrfs_workers endio_write_workers;
-   struct btrfs_workers endio_freespace_worker;
+   struct btrfs_workqueue_struct *endio_meta_write_workers;
+   struct btrfs_workqueue_struct *endio_write_workers;
+   struct btrfs_workqueue_struct *endio_freespace_worker;
struct btrfs_workqueue_struct *submit_workers;
struct btrfs_workers caching_workers;
struct btrfs_workers readahead_workers;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 139960f..4f8591a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -54,7 +54,7 @@
 #endif
 
 static struct extent_io_ops btree_extent_io_ops;
-static void end_workqueue_fn(struct btrfs_work *work);
+static void end_workqueue_fn(struct btrfs_work_struct *work);
 static void free_fs_root(struct btrfs_root *root);
 static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info,
int read_only);
@@ -85,7 +85,7 @@ struct end_io_wq {
int error;
int metadata;
struct list_head list;
-   struct btrfs_work work;
+   struct btrfs_work_struct work;
 };
 
 /*
@@ -681,32 +681,31 @@ static void end_workqueue_bio(struct bio *bio, int err)
 
fs_info = end_io_wq->info;
end_io_wq->error = err;
-   end_io_wq->work.func = end_workqueue_fn;
-   end_io_wq->work.flags = 0;
+   btrfs_init_work(&end_io_wq->work, end_workqueue_fn, NULL, NULL);
 
if (bio->bi_rw & REQ_WRITE) {
if (end_io_wq->metadata == BTRFS_WQ_ENDIO_METADATA)
-   btrfs_queue_worker(&fs_info->endio_meta_write_workers,
-  &end_io_wq->work);
+   btrfs_queue_work(fs_info->endio_meta_write_workers,
+&end_io_wq->work);
else if (end_io_wq->metadata == BTRFS_WQ_ENDIO_FREE_SPACE)
-   btrfs_queue_worker(&fs_info->endio_freespace_worker,
-  &end_io_wq->work);
+   btrfs_queue_work(fs_info->endio_freespace_worker,
+&end_io_wq->work);
else if (end_io_wq->metadata == BTRFS_WQ_ENDIO_RAID56)
-   btrfs_queue_worker(&fs_info->endio_raid56_workers,
-  &end_io_wq->work);
+   btrfs_queue_work(fs_info->endio_raid56_workers,
+&end_io_wq->work);
else
-   btrfs_queue_worker(&fs_info->endio_write_workers,
-  &end_io_wq->work);
+   btrfs_queue_work(fs_info->endio_write_workers,
+&end_io_wq->work);
} else {
if (end_io_wq->metadata == BTRFS_WQ_ENDIO_RAID56)
-   btrfs_queue_worker(&fs_info->endio_raid56_workers,
-  &end_io_wq->work);
+   btrfs_queue_work(fs_info->endio_raid56_workers,
+&end_io_wq->work);
else if (end_io_wq->metadata)
-   btrfs_queue_worker(&fs_info->endio_meta_workers,
-  &end_io_wq->work);
+   btrfs_queue_work(fs_info->endio_meta_workers,
+&end_io_wq->work);
else
-   btrfs_queue_worker(&fs_info->endio_workers,
-  &end_io_wq->work);
+   btrfs_queue_work(fs_info->endio_workers,
+&end_io_wq->work);
}
 }
 

[PATCH v4 03/18] btrfs: Add high priority workqueue support for btrfs_workqueue_struct

2013-12-17 Thread Qu Wenruo
Add high priority support to btrfs_workqueue.

This is implemented by wrapping two internal workqueues, a normal
priority one and a high priority one, in a btrfs_workqueue_struct, and
using helper functions to choose between them.
The high priority wq is therefore completely independent of the normal
workqueue.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  None
v3->v4:
  - Implement high priority workqueue independently.
    The high priority wq is now implemented as a normal btrfs_workqueue,
    with its own ordering/thresholding mechanism.
    This fixes the problem that the high priority wq and the normal wq
    shared one ordered wq.
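
Note for readers: the dispatch rule added by this patch (use the high priority queue only when the work item asks for it and the wrapper actually has one) can be sketched in userspace as follows. This is a hedged model, not the kernel code: every demo_* name is hypothetical, a plain bitmask replaces test_bit(), and "queueing" is just a counter increment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag standing in for WORK_HIGH_PRIO_BIT. */
#define DEMO_WORK_HIGH_PRIO (1u << 0)

/* Stand-in for the inner (double-underscore) workqueue. */
struct demo_inner_wq {
	const char *name;  /* label only, to make the routing visible */
	int queued;        /* number of work items routed here */
};

/* Stand-in for the outer wrapper holding both priorities. */
struct demo_wq {
	struct demo_inner_wq *normal;  /* always present */
	struct demo_inner_wq *high;    /* NULL when no high priority queue exists */
};

struct demo_work {
	unsigned int flags;
};

/* Choose the destination: high priority only if the work requests it
 * AND the wrapper was created with a high priority queue. */
static struct demo_inner_wq *demo_pick_queue(struct demo_wq *wq,
					     const struct demo_work *work)
{
	assert(wq != NULL && wq->normal != NULL);
	if ((work->flags & DEMO_WORK_HIGH_PRIO) && wq->high)
		return wq->high;
	return wq->normal;
}

/* "Queue" the work by counting it on the chosen inner queue. */
static void demo_queue_work(struct demo_wq *wq, struct demo_work *work)
{
	demo_pick_queue(wq, work)->queued++;
}
```

The fallback to the normal queue when no high priority queue exists is what lets callers set the high priority flag unconditionally.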
---
 fs/btrfs/async-thread.c | 89 +++--
 fs/btrfs/async-thread.h |  5 ++-
 2 files changed, 82 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index f05d57e..73b9f94 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -729,7 +729,7 @@ void btrfs_queue_worker(struct btrfs_workers *workers, struct btrfs_work *work)
spin_unlock_irqrestore(&worker->lock, flags);
 }
 
-struct btrfs_workqueue_struct {
+struct __btrfs_workqueue_struct {
struct workqueue_struct *normal_wq;
/* List head pointing to ordered work list */
struct list_head ordered_list;
@@ -738,6 +738,38 @@ struct btrfs_workqueue_struct {
spinlock_t list_lock;
 };
 
+struct btrfs_workqueue_struct {
+   struct __btrfs_workqueue_struct *normal;
+   struct __btrfs_workqueue_struct *high;
+};
+
+static inline struct __btrfs_workqueue_struct
+*__btrfs_alloc_workqueue(char *name, int flags, int max_active)
+{
+   struct __btrfs_workqueue_struct *ret = kzalloc(sizeof(*ret), GFP_NOFS);
+
+   if (unlikely(!ret))
+   return NULL;
+
+   if (flags & WQ_HIGHPRI)
+   ret->normal_wq = alloc_workqueue("%s-%s-high", flags,
+max_active, "btrfs", name);
+   else
+   ret->normal_wq = alloc_workqueue("%s-%s", flags,
+max_active, "btrfs", name);
+   if (unlikely(!ret->normal_wq)) {
+   kfree(ret);
+   return NULL;
+   }
+
+   INIT_LIST_HEAD(&ret->ordered_list);
+   spin_lock_init(&ret->list_lock);
+   return ret;
+}
+
+static inline void
+__btrfs_destroy_workqueue(struct __btrfs_workqueue_struct *wq);
+
 struct btrfs_workqueue_struct *btrfs_alloc_workqueue(char *name,
 int flags,
 int max_active)
@@ -747,19 +779,25 @@ struct btrfs_workqueue_struct *btrfs_alloc_workqueue(char *name,
if (unlikely(!ret))
return NULL;
 
-   ret->normal_wq = alloc_workqueue("%s-%s", flags, max_active,
-"btrfs", name);
-   if (unlikely(!ret->normal_wq)) {
+   ret->normal = __btrfs_alloc_workqueue(name, flags & ~WQ_HIGHPRI,
+ max_active);
+   if (unlikely(!ret->normal)) {
kfree(ret);
return NULL;
}
 
-   INIT_LIST_HEAD(&ret->ordered_list);
-   spin_lock_init(&ret->list_lock);
+   if (flags & WQ_HIGHPRI) {
+   ret->high = __btrfs_alloc_workqueue(name, flags, max_active);
+   if (unlikely(!ret->high)) {
+   __btrfs_destroy_workqueue(ret->normal);
+   kfree(ret);
+   return NULL;
+   }
+   }
return ret;
 }
 
-static void run_ordered_work(struct btrfs_workqueue_struct *wq)
+static void run_ordered_work(struct __btrfs_workqueue_struct *wq)
 {
struct list_head *list = &wq->ordered_list;
struct btrfs_work_struct *work;
@@ -832,8 +870,8 @@ void btrfs_init_work(struct btrfs_work_struct *work,
work->flags = 0;
 }
 
-void btrfs_queue_work(struct btrfs_workqueue_struct *wq,
- struct btrfs_work_struct *work)
+static inline void __btrfs_queue_work(struct __btrfs_workqueue_struct *wq,
+ struct btrfs_work_struct *work)
 {
unsigned long flags;
 
@@ -846,13 +884,42 @@ void btrfs_queue_work(struct btrfs_workqueue_struct *wq,
queue_work(wq->normal_wq, &work->normal_work);
 }
 
-void btrfs_destroy_workqueue(struct btrfs_workqueue_struct *wq)
+void btrfs_queue_work(struct btrfs_workqueue_struct *wq,
+ struct btrfs_work_struct *work)
+{
+   struct __btrfs_workqueue_struct *dest_wq;
+
+   if (test_bit(WORK_HIGH_PRIO_BIT, &work->flags) && wq->high)
+   dest_wq = wq->high;
+   else
+   dest_wq = wq->normal;
+   __btrfs_queue_work(dest_wq, work);
+}
+
+static inline void
+__btrfs_destroy_workqueue(struct __btrfs_workqueue_struct *wq)
 {
destroy_workqueue(wq->normal_wq);
kfree(wq);
 }
 
+void btrfs_destroy_workqueue(struct 

[PATCH v4 14/18] btrfs: Replace fs_info->delayed_workers workqueue with btrfs_workqueue.

2013-12-17 Thread Qu Wenruo
Replace the fs_info->delayed_workers with the newly created
btrfs_workqueue.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  None
v3->v4:
  - Use the simplified btrfs_alloc_workqueue API.
---
 fs/btrfs/ctree.h |  2 +-
 fs/btrfs/delayed-inode.c | 10 +-
 fs/btrfs/disk-io.c   | 10 --
 fs/btrfs/super.c |  2 +-
 4 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 845615e..698cebc 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1509,7 +1509,7 @@ struct btrfs_fs_info {
 * for the sys_munmap function call path
 */
struct btrfs_workqueue_struct *fixup_workers;
-   struct btrfs_workers delayed_workers;
+   struct btrfs_workqueue_struct *delayed_workers;
struct task_struct *transaction_kthread;
struct task_struct *cleaner_kthread;
int thread_pool_size;
diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 8d292fb..e4ad5ea 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -1260,10 +1260,10 @@ void btrfs_remove_delayed_node(struct inode *inode)
 struct btrfs_async_delayed_work {
struct btrfs_delayed_root *delayed_root;
int nr;
-   struct btrfs_work work;
+   struct btrfs_work_struct work;
 };
 
-static void btrfs_async_run_delayed_root(struct btrfs_work *work)
+static void btrfs_async_run_delayed_root(struct btrfs_work_struct *work)
 {
struct btrfs_async_delayed_work *async_work;
struct btrfs_delayed_root *delayed_root;
@@ -1361,11 +1361,11 @@ static int btrfs_wq_run_delayed_node(struct btrfs_delayed_root *delayed_root,
return -ENOMEM;
 
async_work->delayed_root = delayed_root;
-   async_work->work.func = btrfs_async_run_delayed_root;
-   async_work->work.flags = 0;
+   btrfs_init_work(&async_work->work, btrfs_async_run_delayed_root,
+   NULL, NULL);
async_work->nr = nr;
 
-   btrfs_queue_worker(&root->fs_info->delayed_workers, &async_work->work);
+   btrfs_queue_work(root->fs_info->delayed_workers, &async_work->work);
return 0;
 }
 
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index e5dec5a..9053df8 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2017,7 +2017,7 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
btrfs_destroy_workqueue(fs_info->endio_write_workers);
btrfs_destroy_workqueue(fs_info->endio_freespace_worker);
btrfs_destroy_workqueue(fs_info->submit_workers);
-   btrfs_stop_workers(&fs_info->delayed_workers);
+   btrfs_destroy_workqueue(fs_info->delayed_workers);
btrfs_destroy_workqueue(fs_info->caching_workers);
btrfs_destroy_workqueue(fs_info->readahead_workers);
btrfs_destroy_workqueue(fs_info->flush_workers);
@@ -2515,9 +2515,8 @@ int open_ctree(struct super_block *sb,
btrfs_alloc_workqueue("endio-write", flags, max_active, 2);
fs_info->endio_freespace_worker =
btrfs_alloc_workqueue("freespace-write", flags, max_active, 0);
-   btrfs_init_workers(&fs_info->delayed_workers, "delayed-meta",
-  fs_info->thread_pool_size,
-  &fs_info->generic_worker);
+   fs_info->delayed_workers =
+   btrfs_alloc_workqueue("delayed-meta", flags, max_active, 0);
fs_info->readahead_workers =
btrfs_alloc_workqueue("readahead", flags, max_active, 2);
btrfs_init_workers(&fs_info->qgroup_rescan_workers, "qgroup-rescan", 1,
@@ -2528,7 +2527,6 @@ int open_ctree(struct super_block *sb,
 * return -ENOMEM if any of these fail.
 */
ret = btrfs_start_workers(&fs_info->generic_worker);
-   ret |= btrfs_start_workers(&fs_info->delayed_workers);
ret |= btrfs_start_workers(&fs_info->qgroup_rescan_workers);
if (ret) {
err = -ENOMEM;
@@ -2541,7 +2539,7 @@ int open_ctree(struct super_block *sb,
  fs_info->endio_write_workers && fs_info->endio_raid56_workers &&
  fs_info->endio_freespace_worker && fs_info->rmw_workers &&
  fs_info->caching_workers && fs_info->readahead_workers &&
- fs_info->fixup_workers)) {
+ fs_info->fixup_workers && fs_info->delayed_workers)) {
err = -ENOMEM;
goto fail_sb_buffer;
}
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f7fd00c..83d3477 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1255,7 +1255,7 @@ static void btrfs_resize_thread_pool(struct btrfs_fs_info *fs_info,
new_pool_size);
btrfs_workqueue_set_max(fs_info->endio_write_workers, new_pool_size);
btrfs_workqueue_set_max(fs_info->endio_freespace_worker, new_pool_size);
-   btrfs_set_max_workers(&fs_info->delayed_workers, new_pool_size);
+   btrfs_workqueue_set_max(fs_info->delayed_workers, new_pool_size);

[PATCH v4 13/18] btrfs: Replace fs_info->fixup_workers workqueue with btrfs_workqueue.

2013-12-17 Thread Qu Wenruo
Replace the fs_info->fixup_workers with the newly created
btrfs_workqueue.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  - Use the btrfs_workqueue_struct to replace fixup_workers.
v3->v4:
  - Use the simplified btrfs_alloc_workqueue API.
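
Note for readers: with this conversion, open_ctree() no longer has a separate "start" step for the queue. It allocates each workqueue and then tests all the pointers in one combined check, returning -ENOMEM if any allocation failed. Below is a rough userspace sketch of that allocate-everything-then-check-once shape. The demo_* names and the fail parameter are invented for illustration, and the cleanup path relies on free(NULL) being a no-op, which is an assumption of this sketch rather than a statement about btrfs_destroy_workqueue().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a workqueue handle. */
struct demo_wq {
	int max_active;
	int thresh;
};

/* Allocate a queue; 'fail' simulates an out-of-memory condition. */
static struct demo_wq *demo_alloc_wq(int max_active, int thresh, int fail)
{
	struct demo_wq *wq;

	if (fail)
		return NULL;
	wq = calloc(1, sizeof(*wq));
	if (!wq)
		return NULL;
	wq->max_active = max_active;
	wq->thresh = thresh;
	return wq;
}

static void demo_destroy_wq(struct demo_wq *wq)
{
	free(wq);  /* free(NULL) is a no-op, so partial setups are safe here */
}

struct demo_fs {
	struct demo_wq *fixup;
	struct demo_wq *readahead;
};

/* Allocate every queue first, then validate them all in one test. */
static int demo_setup(struct demo_fs *fs, int fail_second)
{
	fs->fixup = demo_alloc_wq(1, 0, 0);
	fs->readahead = demo_alloc_wq(8, 2, fail_second);

	if (!(fs->fixup && fs->readahead)) {
		demo_destroy_wq(fs->fixup);
		demo_destroy_wq(fs->readahead);
		fs->fixup = fs->readahead = NULL;
		return -ENOMEM;
	}
	return 0;
}
```

The single combined check keeps the error path in one place, at the cost of attempting later allocations even after an earlier one has already failed.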
---
 fs/btrfs/ctree.h   |  2 +-
 fs/btrfs/disk-io.c | 10 +-
 fs/btrfs/inode.c   |  8 
 fs/btrfs/super.c   |  1 -
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 302dc46..845615e 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1508,7 +1508,7 @@ struct btrfs_fs_info {
 * the cow mechanism and make them safe to write.  It happens
 * for the sys_munmap function call path
 */
-   struct btrfs_workers fixup_workers;
+   struct btrfs_workqueue_struct *fixup_workers;
struct btrfs_workers delayed_workers;
struct task_struct *transaction_kthread;
struct task_struct *cleaner_kthread;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 4d49d87..e5dec5a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2006,7 +2006,7 @@ static noinline int next_root_backup(struct btrfs_fs_info *info,
 static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
 {
btrfs_stop_workers(&fs_info->generic_worker);
-   btrfs_stop_workers(&fs_info->fixup_workers);
+   btrfs_destroy_workqueue(fs_info->fixup_workers);
btrfs_destroy_workqueue(fs_info->delalloc_workers);
btrfs_destroy_workqueue(fs_info->workers);
btrfs_destroy_workqueue(fs_info->endio_workers);
@@ -2494,8 +2494,8 @@ int open_ctree(struct super_block *sb,
  min_t(u64, fs_devices->num_devices,
max_active), 64);
 
-   btrfs_init_workers(&fs_info->fixup_workers, "fixup", 1,
-  &fs_info->generic_worker);
+   fs_info->fixup_workers =
+   btrfs_alloc_workqueue("fixup", flags, 1, 0);
 
/*
 * endios are largely parallel and should have a very
@@ -2528,7 +2528,6 @@ int open_ctree(struct super_block *sb,
 * return -ENOMEM if any of these fail.
 */
ret = btrfs_start_workers(&fs_info->generic_worker);
-   ret |= btrfs_start_workers(&fs_info->fixup_workers);
ret |= btrfs_start_workers(&fs_info->delayed_workers);
ret |= btrfs_start_workers(&fs_info->qgroup_rescan_workers);
if (ret) {
@@ -2541,7 +2540,8 @@ int open_ctree(struct super_block *sb,
  fs_info->endio_meta_write_workers &&
  fs_info->endio_write_workers && fs_info->endio_raid56_workers &&
  fs_info->endio_freespace_worker && fs_info->rmw_workers &&
- fs_info->caching_workers && fs_info->readahead_workers)) {
+ fs_info->caching_workers && fs_info->readahead_workers &&
+ fs_info->fixup_workers)) {
err = -ENOMEM;
goto fail_sb_buffer;
}
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d4f8dfb..62e4fc2 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1727,10 +1727,10 @@ int btrfs_set_extent_delalloc(struct inode *inode, u64 
start, u64 end,
 /* see btrfs_writepage_start_hook for details on why this is required */
 struct btrfs_writepage_fixup {
struct page *page;
-   struct btrfs_work work;
+   struct btrfs_work_struct work;
 };
 
-static void btrfs_writepage_fixup_worker(struct btrfs_work *work)
+static void btrfs_writepage_fixup_worker(struct btrfs_work_struct *work)
 {
struct btrfs_writepage_fixup *fixup;
struct btrfs_ordered_extent *ordered;
@@ -1821,9 +1821,9 @@ static int btrfs_writepage_start_hook(struct page *page, u64 start, u64 end)
 
SetPageChecked(page);
page_cache_get(page);
-   fixup->work.func = btrfs_writepage_fixup_worker;
+   btrfs_init_work(&fixup->work, btrfs_writepage_fixup_worker, NULL, NULL);
fixup->page = page;
-   btrfs_queue_worker(&root->fs_info->fixup_workers, &fixup->work);
+   btrfs_queue_work(root->fs_info->fixup_workers, &fixup->work);
return -EBUSY;
 }
 
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 7a46e23..f7fd00c 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1249,7 +1249,6 @@ static void btrfs_resize_thread_pool(struct btrfs_fs_info *fs_info,
btrfs_workqueue_set_max(fs_info->delalloc_workers, new_pool_size);
btrfs_workqueue_set_max(fs_info->submit_workers, new_pool_size);
btrfs_workqueue_set_max(fs_info->caching_workers, new_pool_size);
-   btrfs_set_max_workers(&fs_info->fixup_workers, new_pool_size);
btrfs_workqueue_set_max(fs_info->endio_workers, new_pool_size);
btrfs_workqueue_set_max(fs_info->endio_meta_workers, new_pool_size);
btrfs_workqueue_set_max(fs_info->endio_meta_write_workers,
-- 
1.8.5.1


[PATCH v4 15/18] btrfs: Replace fs_info->qgroup_rescan_worker workqueue with btrfs_workqueue.

2013-12-17 Thread Qu Wenruo
Replace the fs_info->qgroup_rescan_worker with the newly created
btrfs_workqueue.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  - Use the btrfs_workqueue_struct to replace qgroup_rescan_workers.
v3->v4:
  - Use the simplified btrfs_alloc_workqueue API.
---
 fs/btrfs/ctree.h   |  4 ++--
 fs/btrfs/disk-io.c | 10 +-
 fs/btrfs/qgroup.c  | 17 +
 3 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 698cebc..df51fa3 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1630,9 +1630,9 @@ struct btrfs_fs_info {
/* qgroup rescan items */
struct mutex qgroup_rescan_lock; /* protects the progress item */
struct btrfs_key qgroup_rescan_progress;
-   struct btrfs_workers qgroup_rescan_workers;
+   struct btrfs_workqueue_struct *qgroup_rescan_workers;
struct completion qgroup_rescan_completion;
-   struct btrfs_work qgroup_rescan_work;
+   struct btrfs_work_struct qgroup_rescan_work;
 
/* filesystem state */
unsigned long fs_state;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 9053df8..fb94e94 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2021,7 +2021,7 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
btrfs_destroy_workqueue(fs_info->caching_workers);
btrfs_destroy_workqueue(fs_info->readahead_workers);
btrfs_destroy_workqueue(fs_info->flush_workers);
-   btrfs_stop_workers(&fs_info->qgroup_rescan_workers);
+   btrfs_destroy_workqueue(fs_info->qgroup_rescan_workers);
 }
 
 static void free_root_extent_buffers(struct btrfs_root *root)
@@ -2519,15 +2519,14 @@ int open_ctree(struct super_block *sb,
btrfs_alloc_workqueue("delayed-meta", flags, max_active, 0);
fs_info->readahead_workers =
btrfs_alloc_workqueue("readahead", flags, max_active, 2);
-   btrfs_init_workers(&fs_info->qgroup_rescan_workers, "qgroup-rescan", 1,
-  &fs_info->generic_worker);
+   fs_info->qgroup_rescan_workers =
+   btrfs_alloc_workqueue("qgroup-rescan", flags, 1, 0);
 
/*
 * btrfs_start_workers can really only fail because of ENOMEM so just
 * return -ENOMEM if any of these fail.
 */
ret = btrfs_start_workers(&fs_info->generic_worker);
-   ret |= btrfs_start_workers(&fs_info->qgroup_rescan_workers);
if (ret) {
err = -ENOMEM;
goto fail_sb_buffer;
@@ -2539,7 +2538,8 @@ int open_ctree(struct super_block *sb,
  fs_info->endio_write_workers && fs_info->endio_raid56_workers &&
  fs_info->endio_freespace_worker && fs_info->rmw_workers &&
  fs_info->caching_workers && fs_info->readahead_workers &&
- fs_info->fixup_workers && fs_info->delayed_workers)) {
+ fs_info->fixup_workers && fs_info->delayed_workers &&
+ fs_info->qgroup_rescan_workers)) {
err = -ENOMEM;
goto fail_sb_buffer;
}
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 4e6ef49..521144e 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1516,8 +1516,8 @@ int btrfs_run_qgroups(struct btrfs_trans_handle *trans,
ret = qgroup_rescan_init(fs_info, 0, 1);
if (!ret) {
qgroup_rescan_zero_tracking(fs_info);
-   btrfs_queue_worker(&fs_info->qgroup_rescan_workers,
-  &fs_info->qgroup_rescan_work);
+   btrfs_queue_work(fs_info->qgroup_rescan_workers,
+    &fs_info->qgroup_rescan_work);
}
ret = 0;
}
@@ -1981,7 +1981,7 @@ out:
return ret;
 }
 
-static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
+static void btrfs_qgroup_rescan_worker(struct btrfs_work_struct *work)
 {
struct btrfs_fs_info *fs_info = container_of(work, struct btrfs_fs_info,
 qgroup_rescan_work);
@@ -2092,7 +2092,8 @@ qgroup_rescan_init(struct btrfs_fs_info *fs_info, u64 progress_objectid,
 
memset(&fs_info->qgroup_rescan_work, 0,
   sizeof(fs_info->qgroup_rescan_work));
-   fs_info->qgroup_rescan_work.func = btrfs_qgroup_rescan_worker;
+   btrfs_init_work(&fs_info->qgroup_rescan_work,
+   btrfs_qgroup_rescan_worker, NULL, NULL);
 
if (ret) {
 err:
@@ -2155,8 +2156,8 @@ btrfs_qgroup_rescan(struct btrfs_fs_info *fs_info)
 
qgroup_rescan_zero_tracking(fs_info);
 
-   btrfs_queue_worker(&fs_info->qgroup_rescan_workers,
-  &fs_info->qgroup_rescan_work);
+   btrfs_queue_work(fs_info->qgroup_rescan_workers,
+    &fs_info->qgroup_rescan_work);
 
return 0;
 }
@@ -2187,6 +2188,6 @@ void
 btrfs_qgroup_rescan_resume(struct btrfs_fs_info *fs_info)
 {
if 

[PATCH v4 10/18] btrfs: Replace fs_info->rmw_workers workqueue with btrfs_workqueue.

2013-12-17 Thread Qu Wenruo
Replace the fs_info->rmw_workers with the newly created
btrfs_workqueue.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  - Use the btrfs_workqueue_struct to replace rmw_workers.
v3->v4:
  - Use the simplified btrfs_alloc_workqueue API.
---
 fs/btrfs/ctree.h   |  2 +-
 fs/btrfs/disk-io.c | 12 
 fs/btrfs/raid56.c  | 35 ---
 3 files changed, 21 insertions(+), 28 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 5096164..294b373 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1495,7 +1495,7 @@ struct btrfs_fs_info {
struct btrfs_workqueue_struct *endio_workers;
struct btrfs_workqueue_struct *endio_meta_workers;
struct btrfs_workqueue_struct *endio_raid56_workers;
-   struct btrfs_workers rmw_workers;
+   struct btrfs_workqueue_struct *rmw_workers;
struct btrfs_workqueue_struct *endio_meta_write_workers;
struct btrfs_workqueue_struct *endio_write_workers;
struct btrfs_workqueue_struct *endio_freespace_worker;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 4f8591a..8b2977b 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2012,7 +2012,7 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
btrfs_destroy_workqueue(fs_info->endio_workers);
btrfs_destroy_workqueue(fs_info->endio_meta_workers);
btrfs_destroy_workqueue(fs_info->endio_raid56_workers);
-   btrfs_stop_workers(&fs_info->rmw_workers);
+   btrfs_destroy_workqueue(fs_info->rmw_workers);
btrfs_destroy_workqueue(fs_info->endio_meta_write_workers);
btrfs_destroy_workqueue(fs_info->endio_write_workers);
btrfs_destroy_workqueue(fs_info->endio_freespace_worker);
@@ -2509,9 +2509,8 @@ int open_ctree(struct super_block *sb,
btrfs_alloc_workqueue("endio-meta-write", flags, max_active, 2);
fs_info->endio_raid56_workers =
btrfs_alloc_workqueue("endio-raid56", flags, max_active, 4);
-   btrfs_init_workers(&fs_info->rmw_workers,
-  "rmw", fs_info->thread_pool_size,
-  &fs_info->generic_worker);
+   fs_info->rmw_workers =
+   btrfs_alloc_workqueue("rmw", flags, max_active, 2);
fs_info->endio_write_workers =
btrfs_alloc_workqueue("endio-write", flags, max_active, 2);
fs_info->endio_freespace_worker =
@@ -2525,8 +2524,6 @@ int open_ctree(struct super_block *sb,
btrfs_init_workers(&fs_info->qgroup_rescan_workers, "qgroup-rescan", 1,
   &fs_info->generic_worker);
 
-   fs_info->rmw_workers.idle_thresh = 2;
-
fs_info->readahead_workers.idle_thresh = 2;
 
/*
@@ -2535,7 +2532,6 @@ int open_ctree(struct super_block *sb,
 */
ret = btrfs_start_workers(&fs_info->generic_worker);
ret |= btrfs_start_workers(&fs_info->fixup_workers);
-   ret |= btrfs_start_workers(&fs_info->rmw_workers);
ret |= btrfs_start_workers(&fs_info->delayed_workers);
ret |= btrfs_start_workers(&fs_info->caching_workers);
ret |= btrfs_start_workers(&fs_info->readahead_workers);
@@ -2549,7 +2545,7 @@ int open_ctree(struct super_block *sb,
  fs_info->endio_workers && fs_info->endio_meta_workers &&
  fs_info->endio_meta_write_workers &&
  fs_info->endio_write_workers && fs_info->endio_raid56_workers &&
- fs_info->endio_freespace_worker)) {
+ fs_info->endio_freespace_worker && fs_info->rmw_workers)) {
err = -ENOMEM;
goto fail_sb_buffer;
}
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 24ac218..5afa564 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -87,7 +87,7 @@ struct btrfs_raid_bio {
/*
 * for scheduling work in the helper threads
 */
-   struct btrfs_work work;
+   struct btrfs_work_struct work;
 
/*
 * bio list and bio_list_lock are used
@@ -166,8 +166,8 @@ struct btrfs_raid_bio {
 
 static int __raid56_parity_recover(struct btrfs_raid_bio *rbio);
 static noinline void finish_rmw(struct btrfs_raid_bio *rbio);
-static void rmw_work(struct btrfs_work *work);
-static void read_rebuild_work(struct btrfs_work *work);
+static void rmw_work(struct btrfs_work_struct *work);
+static void read_rebuild_work(struct btrfs_work_struct *work);
 static void async_rmw_stripe(struct btrfs_raid_bio *rbio);
 static void async_read_rebuild(struct btrfs_raid_bio *rbio);
 static int fail_bio_stripe(struct btrfs_raid_bio *rbio, struct bio *bio);
@@ -1416,20 +1416,18 @@ cleanup:
 
 static void async_rmw_stripe(struct btrfs_raid_bio *rbio)
 {
-   rbio->work.flags = 0;
-   rbio->work.func = rmw_work;
+   btrfs_init_work(&rbio->work, rmw_work, NULL, NULL);
 
-   btrfs_queue_worker(&rbio->fs_info->rmw_workers,
-  &rbio->work);
+   btrfs_queue_work(rbio->fs_info->rmw_workers,
+    &rbio->work);
 

[PATCH v4 06/18] btrfs: Replace fs_info->delalloc_workers with btrfs_workqueue

2013-12-17 Thread Qu Wenruo
Much like fs_info->workers, replace fs_info->delalloc_workers
with the same btrfs_workqueue.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  None
v3->v4:
  - Use the simplified btrfs_alloc_workqueue API.
---
 fs/btrfs/ctree.h   |  2 +-
 fs/btrfs/disk-io.c | 12 
 fs/btrfs/inode.c   | 18 --
 fs/btrfs/super.c   |  2 +-
 4 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index b3093c3..a86c9a1 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1490,7 +1490,7 @@ struct btrfs_fs_info {
 */
struct btrfs_workers generic_worker;
struct btrfs_workqueue_struct *workers;
-   struct btrfs_workers delalloc_workers;
+   struct btrfs_workqueue_struct *delalloc_workers;
struct btrfs_workers flush_workers;
struct btrfs_workers endio_workers;
struct btrfs_workers endio_meta_workers;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 258c59a..1098435 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2008,7 +2008,7 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
 {
btrfs_stop_workers(&fs_info->generic_worker);
btrfs_stop_workers(&fs_info->fixup_workers);
-   btrfs_stop_workers(&fs_info->delalloc_workers);
+   btrfs_destroy_workqueue(fs_info->delalloc_workers);
btrfs_destroy_workqueue(fs_info->workers);
btrfs_stop_workers(&fs_info->endio_workers);
btrfs_stop_workers(&fs_info->endio_meta_workers);
@@ -2476,8 +2476,8 @@ int open_ctree(struct super_block *sb,
btrfs_alloc_workqueue("worker", flags | WQ_HIGHPRI,
  max_active, 16);
 
-   btrfs_init_workers(&fs_info->delalloc_workers, "delalloc",
-  fs_info->thread_pool_size, NULL);
+   fs_info->delalloc_workers =
+   btrfs_alloc_workqueue("delalloc", flags, max_active, 2);
 
btrfs_init_workers(&fs_info->flush_workers, "flush_delalloc",
   fs_info->thread_pool_size, NULL);
@@ -2495,9 +2495,6 @@ int open_ctree(struct super_block *sb,
 */
fs_info->submit_workers.idle_thresh = 64;
 
-   fs_info->delalloc_workers.idle_thresh = 2;
-   fs_info->delalloc_workers.ordered = 1;
-
btrfs_init_workers(&fs_info->fixup_workers, "fixup", 1,
   &fs_info->generic_worker);
btrfs_init_workers(&fs_info->endio_workers, "endio",
@@ -2548,7 +2545,6 @@ int open_ctree(struct super_block *sb,
 */
ret = btrfs_start_workers(&fs_info->generic_worker);
ret |= btrfs_start_workers(&fs_info->submit_workers);
-   ret |= btrfs_start_workers(&fs_info->delalloc_workers);
ret |= btrfs_start_workers(&fs_info->fixup_workers);
ret |= btrfs_start_workers(&fs_info->endio_workers);
ret |= btrfs_start_workers(&fs_info->endio_meta_workers);
@@ -2566,7 +2562,7 @@ int open_ctree(struct super_block *sb,
err = -ENOMEM;
goto fail_sb_buffer;
}
-   if (!(fs_info->workers)) {
+   if (!(fs_info->workers && fs_info->delalloc_workers)) {
err = -ENOMEM;
goto fail_sb_buffer;
}
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index f1a7744..220db71 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -305,7 +305,7 @@ struct async_cow {
u64 start;
u64 end;
struct list_head extents;
-   struct btrfs_work work;
+   struct btrfs_work_struct work;
 };
 
 static noinline int add_async_extent(struct async_cow *cow,
@@ -980,7 +980,7 @@ out_unlock:
 /*
  * work queue call back to started compression on a file and pages
  */
-static noinline void async_cow_start(struct btrfs_work *work)
+static noinline void async_cow_start(struct btrfs_work_struct *work)
 {
struct async_cow *async_cow;
int num_added = 0;
@@ -998,7 +998,7 @@ static noinline void async_cow_start(struct btrfs_work 
*work)
 /*
  * work queue call back to submit previously compressed pages
  */
-static noinline void async_cow_submit(struct btrfs_work *work)
+static noinline void async_cow_submit(struct btrfs_work_struct *work)
 {
struct async_cow *async_cow;
struct btrfs_root *root;
@@ -1019,7 +1019,7 @@ static noinline void async_cow_submit(struct btrfs_work *work)
submit_compressed_extents(async_cow->inode, async_cow);
 }
 
-static noinline void async_cow_free(struct btrfs_work *work)
+static noinline void async_cow_free(struct btrfs_work_struct *work)
 {
struct async_cow *async_cow;
async_cow = container_of(work, struct async_cow, work);
@@ -1056,17 +1056,15 @@ static int cow_file_range_async(struct inode *inode, struct page *locked_page,
async_cow->end = cur_end;
INIT_LIST_HEAD(&async_cow->extents);
 
-   async_cow->work.func = async_cow_start;
-   async_cow->work.ordered_func = async_cow_submit;
-

[PATCH v4 05/18] btrfs: Replace fs_info->workers with btrfs_workqueue.

2013-12-17 Thread Qu Wenruo
Use the newly created btrfs_workqueue_struct to replace the original
fs_info->workers.
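
With the pointer-returning allocator, the diffs in this series assign each queue and later test all of the pointers in one condition, failing with -ENOMEM if any is NULL. A minimal userspace sketch of that allocate-then-check-once pattern is shown here. The names demo_wq, demo_info, demo_alloc_wq, demo_destroy_all and demo_open are hypothetical, and the cleanup step is an assumption about how such a failure path could look, not a copy of open_ctree.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical queue object standing in for a workqueue. */
struct demo_wq {
	int max_active;
};

struct demo_info {
	struct demo_wq *workers;
	struct demo_wq *delalloc_workers;
};

/* Hypothetical allocator: returns NULL on failure. */
static struct demo_wq *demo_alloc_wq(int max_active)
{
	struct demo_wq *wq;

	if (max_active <= 0)
		return NULL;	/* simulated allocation failure */
	wq = calloc(1, sizeof(*wq));
	if (wq)
		wq->max_active = max_active;
	return wq;
}

/* Safe on partially initialized state because free(NULL) is a no-op. */
static void demo_destroy_all(struct demo_info *info)
{
	free(info->workers);
	free(info->delalloc_workers);
	info->workers = NULL;
	info->delalloc_workers = NULL;
}

/* Allocate everything first, then test all pointers in one condition. */
static int demo_open(struct demo_info *info, int a, int b)
{
	assert(info != NULL);
	info->workers = demo_alloc_wq(a);
	info->delalloc_workers = demo_alloc_wq(b);
	if (!(info->workers && info->delalloc_workers)) {
		demo_destroy_all(info);
		return -ENOMEM;
	}
	return 0;
}
```

The single combined check keeps the error path in one place, at the cost of not reporting which allocation failed.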

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  None
v3->v4:
  - Use the simplified btrfs_alloc_workqueue API.
---
 fs/btrfs/ctree.h   |  2 +-
 fs/btrfs/disk-io.c | 41 +
 fs/btrfs/super.c   |  2 +-
 3 files changed, 23 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 54ab861..b3093c3 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1489,7 +1489,7 @@ struct btrfs_fs_info {
 * two
 */
struct btrfs_workers generic_worker;
-   struct btrfs_workers workers;
+   struct btrfs_workqueue_struct *workers;
struct btrfs_workers delalloc_workers;
struct btrfs_workers flush_workers;
struct btrfs_workers endio_workers;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8072cfa..258c59a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -107,7 +107,7 @@ struct async_submit_bio {
 * can't tell us where in the file the bio should go
 */
u64 bio_offset;
-   struct btrfs_work work;
+   struct btrfs_work_struct work;
int error;
 };
 
@@ -741,12 +741,12 @@ int btrfs_bio_wq_end_io(struct btrfs_fs_info *info, struct bio *bio,
 unsigned long btrfs_async_submit_limit(struct btrfs_fs_info *info)
 {
unsigned long limit = min_t(unsigned long,
-   info->workers.max_workers,
+   info->thread_pool_size,
info->fs_devices->open_devices);
return 256 * limit;
 }
 
-static void run_one_async_start(struct btrfs_work *work)
+static void run_one_async_start(struct btrfs_work_struct *work)
 {
struct async_submit_bio *async;
int ret;
@@ -759,7 +759,7 @@ static void run_one_async_start(struct btrfs_work *work)
async-error = ret;
 }
 
-static void run_one_async_done(struct btrfs_work *work)
+static void run_one_async_done(struct btrfs_work_struct *work)
 {
struct btrfs_fs_info *fs_info;
struct async_submit_bio *async;
@@ -786,7 +786,7 @@ static void run_one_async_done(struct btrfs_work *work)
   async-bio_offset);
 }
 
-static void run_one_async_free(struct btrfs_work *work)
+static void run_one_async_free(struct btrfs_work_struct *work)
 {
struct async_submit_bio *async;
 
@@ -814,11 +814,9 @@ int btrfs_wq_submit_bio(struct btrfs_fs_info *fs_info, struct inode *inode,
async->submit_bio_start = submit_bio_start;
async->submit_bio_done = submit_bio_done;
 
-   async->work.func = run_one_async_start;
-   async->work.ordered_func = run_one_async_done;
-   async->work.ordered_free = run_one_async_free;
+   btrfs_init_work(&async->work, run_one_async_start,
+   run_one_async_done, run_one_async_free);
 
-   async->work.flags = 0;
async->bio_flags = bio_flags;
async->bio_offset = bio_offset;
 
@@ -827,9 +825,9 @@ int btrfs_wq_submit_bio(struct btrfs_fs_info *fs_info, struct inode *inode,
atomic_inc(&fs_info->nr_async_submits);
 
if (rw & REQ_SYNC)
-   btrfs_set_work_high_prio(&async->work);
+   btrfs_set_work_high_priority(&async->work);
 
-   btrfs_queue_worker(&fs_info->workers, &async->work);
+   btrfs_queue_work(fs_info->workers, &async->work);
 
while (atomic_read(&fs_info->async_submit_draining) &&
  atomic_read(&fs_info->nr_async_submits)) {
@@ -2011,7 +2009,7 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
btrfs_stop_workers(&fs_info->generic_worker);
btrfs_stop_workers(&fs_info->fixup_workers);
btrfs_stop_workers(&fs_info->delalloc_workers);
-   btrfs_stop_workers(&fs_info->workers);
+   btrfs_destroy_workqueue(fs_info->workers);
btrfs_stop_workers(&fs_info->endio_workers);
btrfs_stop_workers(&fs_info->endio_meta_workers);
btrfs_stop_workers(&fs_info->endio_raid56_workers);
@@ -2109,6 +2107,8 @@ int open_ctree(struct super_block *sb,
int err = -EINVAL;
int num_backups_tried = 0;
int backup_index = 0;
+   int max_active;
+   int flags = WQ_MEM_RECLAIM | WQ_FREEZABLE | WQ_UNBOUND;
bool create_uuid_tree;
bool check_uuid_tree;
 
@@ -2468,12 +2468,13 @@ int open_ctree(struct super_block *sb,
goto fail_alloc;
}
 
+   max_active = fs_info->thread_pool_size;
btrfs_init_workers(&fs_info->generic_worker,
   "genwork", 1, NULL);
 
-   btrfs_init_workers(&fs_info->workers, "worker",
-  fs_info->thread_pool_size,
-  &fs_info->generic_worker);
+   fs_info->workers =
+   btrfs_alloc_workqueue("worker", flags | WQ_HIGHPRI,
+ max_active, 16);
 

[PATCH v4 08/18] btrfs: Replace fs_info->flush_workers with btrfs_workqueue.

2013-12-17 Thread Qu Wenruo
Replace the fs_info->flush_workers with the newly created
btrfs_workqueue.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  - Use the btrfs_workqueue_struct to replace flush_workers.
v3->v4:
  - Use the simplified btrfs_alloc_workqueue API.
---
 fs/btrfs/ctree.h|  4 ++--
 fs/btrfs/disk-io.c  | 10 --
 fs/btrfs/inode.c|  8 
 fs/btrfs/ordered-data.c | 13 +++--
 fs/btrfs/ordered-data.h |  2 +-
 5 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 4411a2b..097364d 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1491,7 +1491,7 @@ struct btrfs_fs_info {
struct btrfs_workers generic_worker;
struct btrfs_workqueue_struct *workers;
struct btrfs_workqueue_struct *delalloc_workers;
-   struct btrfs_workers flush_workers;
+   struct btrfs_workqueue_struct *flush_workers;
struct btrfs_workers endio_workers;
struct btrfs_workers endio_meta_workers;
struct btrfs_workers endio_raid56_workers;
@@ -3622,7 +3622,7 @@ struct btrfs_delalloc_work {
int delay_iput;
struct completion completion;
struct list_head list;
-   struct btrfs_work work;
+   struct btrfs_work_struct work;
 };
 
 struct btrfs_delalloc_work *btrfs_alloc_delalloc_work(struct inode *inode,
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index cda9766..139960f 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2021,7 +2021,7 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
btrfs_stop_workers(&fs_info->delayed_workers);
btrfs_stop_workers(&fs_info->caching_workers);
btrfs_stop_workers(&fs_info->readahead_workers);
-   btrfs_stop_workers(&fs_info->flush_workers);
+   btrfs_destroy_workqueue(fs_info->flush_workers);
btrfs_stop_workers(&fs_info->qgroup_rescan_workers);
 }
 
@@ -2479,9 +2479,8 @@ int open_ctree(struct super_block *sb,
fs_info->delalloc_workers =
btrfs_alloc_workqueue("delalloc", flags, max_active, 2);
 
-   btrfs_init_workers(&fs_info->flush_workers, "flush_delalloc",
-  fs_info->thread_pool_size, NULL);
-
+   fs_info->flush_workers =
+   btrfs_alloc_workqueue("flush_delalloc", flags, max_active, 0);
 
btrfs_init_workers(&fs_info->caching_workers, "cache",
   fs_info->thread_pool_size, NULL);
@@ -2556,14 +2555,13 @@ int open_ctree(struct super_block *sb,
ret |= btrfs_start_workers(&fs_info->delayed_workers);
ret |= btrfs_start_workers(&fs_info->caching_workers);
ret |= btrfs_start_workers(&fs_info->readahead_workers);
-   ret |= btrfs_start_workers(&fs_info->flush_workers);
ret |= btrfs_start_workers(&fs_info->qgroup_rescan_workers);
if (ret) {
err = -ENOMEM;
goto fail_sb_buffer;
}
if (!(fs_info->workers && fs_info->delalloc_workers &&
- fs_info->submit_workers)) {
+ fs_info->submit_workers && fs_info->flush_workers)) {
err = -ENOMEM;
goto fail_sb_buffer;
}
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 220db71..929f1ee 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8163,7 +8163,7 @@ out_notrans:
return ret;
 }
 
-static void btrfs_run_delalloc_work(struct btrfs_work *work)
+static void btrfs_run_delalloc_work(struct btrfs_work_struct *work)
 {
struct btrfs_delalloc_work *delalloc_work;
struct inode *inode;
@@ -8201,7 +8201,7 @@ struct btrfs_delalloc_work *btrfs_alloc_delalloc_work(struct inode *inode,
work->inode = inode;
work->wait = wait;
work->delay_iput = delay_iput;
-   work->work.func = btrfs_run_delalloc_work;
+   btrfs_init_work(&work->work, btrfs_run_delalloc_work, NULL, NULL);
 
return work;
 }
@@ -8253,8 +8253,8 @@ static int __start_delalloc_inodes(struct btrfs_root *root, int delay_iput)
goto out;
}
list_add_tail(&work->list, &works);
-   btrfs_queue_worker(&root->fs_info->flush_workers,
-  &work->work);
+   btrfs_queue_work(root->fs_info->flush_workers,
+    &work->work);
 
cond_resched();
spin_lock(&root->delalloc_lock);
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 69582d5..e0c3cf0 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -552,7 +552,7 @@ void btrfs_remove_ordered_extent(struct inode *inode,
wake_up(&entry->wait);
 }
 
-static void btrfs_run_ordered_extent_work(struct btrfs_work *work)
+static void btrfs_run_ordered_extent_work(struct btrfs_work_struct *work)
 {
struct btrfs_ordered_extent *ordered;
 
@@ -585,10 +585,11 @@ int btrfs_wait_ordered_extents(struct btrfs_root *root, int nr)
atomic_inc(&ordered->refs);

[PATCH v4 02/18] btrfs: Add btrfs_workqueue_struct implementing ordered execution based on kernel workqueue

2013-12-17 Thread Qu Wenruo
Use the kernel workqueue to implement a new btrfs_workqueue_struct, which
provides the same ordered execution feature as the btrfs_worker.

The func callbacks are executed concurrently, and the
ordered_func/ordered_free callbacks are executed in the sequence in which
the works were queued, after the corresponding func is done.

The new btrfs_workqueue works much like the original one: one workqueue
for normal work and a list for ordered work.
When a work is queued, the ordered work is added to the list and a helper
function is queued into the workqueue.
The helper function executes a normal work and then checks and executes as
many ordered works as possible, in the sequence they were queued.

In this patch, neither the high priority workqueue nor thresholding is
added yet. Both features will be added in the following patches.
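
The ordering rule described above can be sketched in a single-threaded userspace model: works finish their normal func in any order, but the ordered step only drains from the head of the queue and stops at the first unfinished work. This is an illustrative simplification, not the patch's implementation. The names demo_work, demo_queue, demo_queue_work, demo_run_ordered and demo_finish are hypothetical, and the real code uses a locked list and flag bits instead of a fixed array.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX 8

struct demo_work {
	int id;
	int done;	/* set once the normal func has finished */
};

struct demo_queue {
	struct demo_work *pending[DEMO_MAX];	/* FIFO in queueing order */
	size_t head, tail;
	int order_log[DEMO_MAX];	/* ids in the order the ordered step ran */
	size_t logged;
};

/* Queueing appends the work to the ordered list. */
static void demo_queue_work(struct demo_queue *q, struct demo_work *w)
{
	assert(q->tail < DEMO_MAX);
	w->done = 0;
	q->pending[q->tail++] = w;
}

/* Drain finished work from the head only; stop at the first unfinished one. */
static void demo_run_ordered(struct demo_queue *q)
{
	while (q->head < q->tail && q->pending[q->head]->done) {
		struct demo_work *w = q->pending[q->head++];

		q->order_log[q->logged++] = w->id;	/* stands in for ordered_func */
	}
}

/* The normal func finishes, then as much ordered work as possible runs. */
static void demo_finish(struct demo_queue *q, struct demo_work *w)
{
	w->done = 1;
	demo_run_ordered(q);
}
```

If three works are queued as 1, 2, 3 and work 3 finishes first, nothing is drained until work 1 finishes; once work 2 also finishes, the ordered step runs for 2 and then 3.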

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
Cc: Josef Bacik jba...@fusionio.com
---
Changelog:
v1->v2:
  None.
v2->v3:
  - Fix the potential deadlock discovered by kernel lockdep.
  - Reuse the async-thread.[ch] files.
  - Make the ordered_func optional, which makes it adaptable to
    all btrfs_workers.
v3->v4:
  - Use the old list method to implement the ordered workqueue.
    The previous 3-wq implementation needed extra time waiting for
    scheduling, which caused up to a 40% performance drop in compress tests.
    The old list method (after executing a normal work, check the order_list
    and execute) does not need the extra scheduling.
  - Simplify the btrfs_alloc_workqueue parameters.
    Now only one name is needed, and the ordered work mechanism is
    determined by work->ordered_func.
  - Fix memory leak in btrfs_destroy_workqueue.
---
 fs/btrfs/async-thread.c | 130 
 fs/btrfs/async-thread.h |  27 ++
 2 files changed, 157 insertions(+)

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index c1e0b0c..f05d57e 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (C) 2007 Oracle.  All rights reserved.
+ * Copyright (C) 2013 Fujitsu.  All rights reserved.
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public
@@ -21,6 +22,7 @@
 #include <linux/list.h>
 #include <linux/spinlock.h>
 #include <linux/freezer.h>
+#include <linux/workqueue.h>
 #include "async-thread.h"
 
 #define WORK_QUEUED_BIT 0
@@ -726,3 +728,131 @@ void btrfs_queue_worker(struct btrfs_workers *workers, struct btrfs_work *work)
wake_up_process(worker->task);
spin_unlock_irqrestore(&worker->lock, flags);
 }
+
+struct btrfs_workqueue_struct {
+	struct workqueue_struct *normal_wq;
+	/* List head pointing to ordered work list */
+	struct list_head ordered_list;
+
+	/* Spinlock for ordered_list */
+	spinlock_t list_lock;
+};
+
+struct btrfs_workqueue_struct *btrfs_alloc_workqueue(char *name,
+						     int flags,
+						     int max_active)
+{
+	struct btrfs_workqueue_struct *ret = kzalloc(sizeof(*ret), GFP_NOFS);
+
+	if (unlikely(!ret))
+		return NULL;
+
+	ret->normal_wq = alloc_workqueue("%s-%s", flags, max_active,
+					 "btrfs", name);
+	if (unlikely(!ret->normal_wq)) {
+		kfree(ret);
+		return NULL;
+	}
+
+	INIT_LIST_HEAD(&ret->ordered_list);
+	spin_lock_init(&ret->list_lock);
+	return ret;
+}
+
+static void run_ordered_work(struct btrfs_workqueue_struct *wq)
+{
+	struct list_head *list = &wq->ordered_list;
+	struct btrfs_work_struct *work;
+	spinlock_t *lock = &wq->list_lock;
+	unsigned long flags;
+
+	while (1) {
+		spin_lock_irqsave(lock, flags);
+		if (list_empty(list))
+			break;
+		work = list_entry(list->next, struct btrfs_work_struct,
+				  ordered_list);
+		if (!test_bit(WORK_DONE_BIT, &work->flags))
+			break;
+
+		/*
+		 * we are going to call the ordered done function, but
+		 * we leave the work item on the list as a barrier so
+		 * that later work items that are done don't have their
+		 * functions called before this one returns
+		 */
+		if (test_and_set_bit(WORK_ORDER_DONE_BIT, &work->flags))
+			break;
+		spin_unlock_irqrestore(lock, flags);
+		work->ordered_func(work);
+
+		/* now take the lock again and drop our item from the list */
+		spin_lock_irqsave(lock, flags);
+		list_del(&work->ordered_list);
+		spin_unlock_irqrestore(lock, flags);
+
+		/*
+		 * we don't want to call the ordered free functions
+		 * with the lock held though
+

[PATCH v4 11/18] btrfs: Replace fs_info->caching_workers workqueue with btrfs_workqueue.

2013-12-17 Thread Qu Wenruo
Replace the fs_info->caching_workers with the newly created
btrfs_workqueue.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
Changelog:
v1->v2:
  None
v2->v3:
  - Use the btrfs_workqueue_struct to replace caching_workers.
v3->v4:
  - Use the simplified btrfs_alloc_workqueue API.
---
 fs/btrfs/ctree.h   |  4 ++--
 fs/btrfs/disk-io.c | 10 +-
 fs/btrfs/extent-tree.c |  6 +++---
 fs/btrfs/super.c   |  2 +-
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 294b373..8630986 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1205,7 +1205,7 @@ struct btrfs_caching_control {
struct list_head list;
struct mutex mutex;
wait_queue_head_t wait;
-   struct btrfs_work work;
+   struct btrfs_work_struct work;
struct btrfs_block_group_cache *block_group;
u64 progress;
atomic_t count;
@@ -1500,7 +1500,7 @@ struct btrfs_fs_info {
struct btrfs_workqueue_struct *endio_write_workers;
struct btrfs_workqueue_struct *endio_freespace_worker;
struct btrfs_workqueue_struct *submit_workers;
-   struct btrfs_workers caching_workers;
+   struct btrfs_workqueue_struct *caching_workers;
struct btrfs_workers readahead_workers;
 
/*
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8b2977b..d8f42d2 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2018,7 +2018,7 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
btrfs_destroy_workqueue(fs_info->endio_freespace_worker);
btrfs_destroy_workqueue(fs_info->submit_workers);
btrfs_stop_workers(&fs_info->delayed_workers);
-   btrfs_stop_workers(&fs_info->caching_workers);
+   btrfs_destroy_workqueue(fs_info->caching_workers);
btrfs_stop_workers(&fs_info->readahead_workers);
btrfs_destroy_workqueue(fs_info->flush_workers);
btrfs_stop_workers(&fs_info->qgroup_rescan_workers);
@@ -2481,8 +2481,8 @@ int open_ctree(struct super_block *sb,
fs_info->flush_workers =
btrfs_alloc_workqueue("flush_delalloc", flags, max_active, 0);
 
-   btrfs_init_workers(&fs_info->caching_workers, "cache",
-  fs_info->thread_pool_size, NULL);
+   fs_info->caching_workers =
+   btrfs_alloc_workqueue("cache", flags, max_active, 0);
 
/*
 * a higher idle thresh on the submit workers makes it much more
@@ -2533,7 +2533,6 @@ int open_ctree(struct super_block *sb,
ret = btrfs_start_workers(&fs_info->generic_worker);
ret |= btrfs_start_workers(&fs_info->fixup_workers);
ret |= btrfs_start_workers(&fs_info->delayed_workers);
-   ret |= btrfs_start_workers(&fs_info->caching_workers);
ret |= btrfs_start_workers(&fs_info->readahead_workers);
ret |= btrfs_start_workers(&fs_info->qgroup_rescan_workers);
if (ret) {
@@ -2545,7 +2544,8 @@ int open_ctree(struct super_block *sb,
  fs_info->endio_workers && fs_info->endio_meta_workers &&
  fs_info->endio_meta_write_workers &&
  fs_info->endio_write_workers && fs_info->endio_raid56_workers &&
- fs_info->endio_freespace_worker && fs_info->rmw_workers)) {
+ fs_info->endio_freespace_worker && fs_info->rmw_workers &&
+ fs_info->caching_workers)) {
err = -ENOMEM;
goto fail_sb_buffer;
}
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 45d98d0..80ecc14 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -377,7 +377,7 @@ static u64 add_new_free_space(struct 
btrfs_block_group_cache *block_group,
return total_added;
 }
 
-static noinline void caching_thread(struct btrfs_work *work)
+static noinline void caching_thread(struct btrfs_work_struct *work)
 {
struct btrfs_block_group_cache *block_group;
struct btrfs_fs_info *fs_info;
@@ -547,7 +547,7 @@ static int cache_block_group(struct btrfs_block_group_cache *cache,
caching_ctl->block_group = cache;
caching_ctl->progress = cache->key.objectid;
atomic_set(&caching_ctl->count, 1);
-   caching_ctl->work.func = caching_thread;
+   btrfs_init_work(&caching_ctl->work, caching_thread, NULL, NULL);
 
spin_lock(&cache->lock);
/*
@@ -638,7 +638,7 @@ static int cache_block_group(struct btrfs_block_group_cache *cache,
 
btrfs_get_block_group(cache);
 
-   btrfs_queue_worker(&fs_info->caching_workers, &caching_ctl->work);
+   btrfs_queue_work(fs_info->caching_workers, &caching_ctl->work);
 
return ret;
 }
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index da3ec84..5bfe566 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1248,7 +1248,7 @@ static void btrfs_resize_thread_pool(struct btrfs_fs_info *fs_info,
btrfs_workqueue_set_max(fs_info->workers, new_pool_size);
btrfs_workqueue_set_max(fs_info->delalloc_workers, new_pool_size);

[PATCH v2 1/3] btrfs-progs: don't replicate the stripe_len defines

2013-12-17 Thread Anand Jain
A cleanup patch: BTRFS_STRIPE_LEN is duplicated across
btrfs-progs. The kernel defines it in volumes.h, so do the same
for progs.
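
The call sites touched by this patch use the stripe length as a power-of-two alignment mask. A small sketch of that arithmetic follows, assuming the 64 KiB value shown in the removed defines. DEMO_STRIPE_LEN and the demo_* helpers are hypothetical names used only for illustration, not functions from btrfs-progs.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as the removed defines: 64 KiB, a power of two. */
#define DEMO_STRIPE_LEN (64 * 1024)

/* Round down to the start of the containing stripe. */
static uint64_t demo_stripe_floor(uint64_t bytenr)
{
	return bytenr & ~((uint64_t)DEMO_STRIPE_LEN - 1);
}

/* Round up to the next stripe boundary (no change if already aligned). */
static uint64_t demo_stripe_ceil(uint64_t bytenr)
{
	/* Guard against wrap-around in the addition below. */
	assert(bytenr <= UINT64_MAX - (DEMO_STRIPE_LEN - 1));
	return demo_stripe_floor(bytenr + DEMO_STRIPE_LEN - 1);
}

/* Does [bytenr, bytenr + num_bytes) overlap the stripe starting at offset? */
static int demo_intersects(uint64_t bytenr, uint64_t num_bytes,
			   uint64_t offset)
{
	return bytenr < offset + DEMO_STRIPE_LEN &&
	       bytenr + num_bytes > offset;
}
```

The mask trick only works because the stripe length is a power of two, which is one reason to keep a single shared definition.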

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 v2: commit update

 btrfs-convert.c |   19 +--
 chunk-recover.c |1 -
 cmds-chunk.c|1 -
 volumes.h   |2 ++
 4 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/btrfs-convert.c b/btrfs-convert.c
index ae10eed..65fe707 100644
--- a/btrfs-convert.c
+++ b/btrfs-convert.c
@@ -43,7 +43,6 @@
 #include <ext2fs/ext2_ext_attr.h>
 
 #define INO_OFFSET (BTRFS_FIRST_FREE_OBJECTID - EXT2_ROOT_INO)
-#define STRIPE_LEN (64 * 1024)
 #define EXT2_IMAGE_SUBVOL_OBJECTID BTRFS_FIRST_FREE_OBJECTID
 
 /*
@@ -134,11 +133,11 @@ static int cache_free_extents(struct btrfs_root *root, ext2_filsys ext2_fs)
 
for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) {
bytenr = btrfs_sb_offset(i);
-   bytenr &= ~((u64)STRIPE_LEN - 1);
+   bytenr &= ~((u64)BTRFS_STRIPE_LEN - 1);
if (bytenr >= blocksize * ext2_fs->super->s_blocks_count)
break;
clear_extent_dirty(&root->fs_info->free_space_cache, bytenr,
-  bytenr + STRIPE_LEN - 1, 0);
+  bytenr + BTRFS_STRIPE_LEN - 1, 0);
}
 
clear_extent_dirty(root-fs_info-free_space_cache,
@@ -207,9 +206,9 @@ static int intersect_with_sb(u64 bytenr, u64 num_bytes)
 
for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) {
offset = btrfs_sb_offset(i);
-   offset &= ~((u64)STRIPE_LEN - 1);
+   offset &= ~((u64)BTRFS_STRIPE_LEN - 1);
 
-   if (bytenr < offset + STRIPE_LEN &&
+   if (bytenr < offset + BTRFS_STRIPE_LEN &&
bytenr + num_bytes > offset)
return 1;
}
@@ -450,8 +449,8 @@ static int block_iterate_proc(ext2_filsys ext2_fs,
}
 
if (sb_region) {
-   bytenr += STRIPE_LEN - 1;
-   bytenr &= ~((u64)STRIPE_LEN - 1);
+   bytenr += BTRFS_STRIPE_LEN - 1;
+   bytenr &= ~((u64)BTRFS_STRIPE_LEN - 1);
} else {
cache = btrfs_lookup_block_group(root->fs_info, bytenr);
BUG_ON(!cache);
@@ -1523,7 +1522,7 @@ static int create_chunk_mapping(struct btrfs_trans_handle *trans,
btrfs_set_stack_chunk_length(chunk, cache->key.offset);
btrfs_set_stack_chunk_owner(chunk,
extent_root->root_key.objectid);
-   btrfs_set_stack_chunk_stripe_len(chunk, STRIPE_LEN);
+   btrfs_set_stack_chunk_stripe_len(chunk, BTRFS_STRIPE_LEN);
btrfs_set_stack_chunk_type(chunk, cache->flags);
btrfs_set_stack_chunk_io_align(chunk, device->io_align);
btrfs_set_stack_chunk_io_width(chunk, device->io_width);
@@ -2098,10 +2097,10 @@ static int cleanup_sys_chunk(struct btrfs_root *fs_root,
}
for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) {
offset = btrfs_sb_offset(i);
-   offset &= ~((u64)STRIPE_LEN - 1);
+   offset &= ~((u64)BTRFS_STRIPE_LEN - 1);
 
ret = relocate_extents_range(fs_root, ext2_root,
-offset, offset + STRIPE_LEN);
+offset, offset + BTRFS_STRIPE_LEN);
if (ret)
goto fail;
}
diff --git a/chunk-recover.c b/chunk-recover.c
index bcde39e..b072ba6 100644
--- a/chunk-recover.c
+++ b/chunk-recover.c
@@ -41,7 +41,6 @@
 #include "btrfsck.h"
 #include "commands.h"
 
-#define BTRFS_STRIPE_LEN   (64 * 1024)
 #define BTRFS_NUM_MIRRORS  2
 
 struct recover_control {
diff --git a/cmds-chunk.c b/cmds-chunk.c
index 4d7fce0..348229c 100644
--- a/cmds-chunk.c
+++ b/cmds-chunk.c
@@ -42,7 +42,6 @@
 #include "commands.h"
 
 #define BTRFS_CHUNK_TREE_REBUILD_ABORTED   -7500
-#define BTRFS_STRIPE_LEN   (64 * 1024)
 #define BTRFS_NUM_MIRRORS  2
 
 struct recover_control {
diff --git a/volumes.h b/volumes.h
index 2802cb0..b1ff3d0 100644
--- a/volumes.h
+++ b/volumes.h
@@ -19,6 +19,8 @@
 #ifndef __BTRFS_VOLUMES_
 #define __BTRFS_VOLUMES_
 
+#define BTRFS_STRIPE_LEN   (64 * 1024)
+
 struct btrfs_device {
struct list_head dev_list;
struct btrfs_root *dev_root;
-- 
1.7.1
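
The mask expressions touched by this patch round a byte offset to a stripe boundary. A minimal sketch of that arithmetic, assuming the 64 KiB value from the patch; the helper names stripe_align_down and stripe_align_up are hypothetical and are not part of btrfs-progs:

```c
#include <assert.h>
#include <stdint.h>

/* Same value the patch moves into volumes.h. */
#define BTRFS_STRIPE_LEN (64 * 1024)

/* Hypothetical helper: round bytenr down to a stripe boundary. */
static uint64_t stripe_align_down(uint64_t bytenr)
{
	/* The mask trick is only valid for a power-of-two length. */
	assert((BTRFS_STRIPE_LEN & (BTRFS_STRIPE_LEN - 1)) == 0);
	return bytenr & ~((uint64_t)BTRFS_STRIPE_LEN - 1);
}

/* Hypothetical helper: round bytenr up to a stripe boundary. */
static uint64_t stripe_align_up(uint64_t bytenr)
{
	return stripe_align_down(bytenr + BTRFS_STRIPE_LEN - 1);
}
```

With a single define, every caller masks with the same length, which is presumably the point of removing the per-file copies.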



[PATCH v3 3/3] btrfs-progs: handle error in the btrfs_prepare_device

2013-12-17 Thread Anand Jain
This patch reports errors with strerror() instead of printing the raw
errno value, and replaces the BUG_ON with proper error handling.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 v3: fix per Stefan review, update error message
 v2: commit update

 cmds-device.c  |7 +++
 cmds-replace.c |9 -
 mkfs.c |9 -
 utils.c|   30 +++---
 4 files changed, 34 insertions(+), 21 deletions(-)

diff --git a/cmds-device.c b/cmds-device.c
index bc4a8dc..ada0bcd 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -111,13 +111,11 @@ static int cmd_add_dev(int argc, char **argv)
 
res = btrfs_prepare_device(devfd, argv[i], 1, &dev_block_count,
   0, &mixed, discard);
+   close(devfd);
if (res) {
-   fprintf(stderr, "ERROR: Unable to init '%s'\n", argv[i]);
-   close(devfd);
ret++;
-   continue;
+   goto error_out;
}
-   close(devfd);
 
strncpy_null(ioctl_args.name, argv[i]);
res = ioctl(fdmnt, BTRFS_IOC_ADD_DEV, &ioctl_args);
@@ -130,6 +128,7 @@ static int cmd_add_dev(int argc, char **argv)
 
}
 
+error_out:
close_file_or_dir(fdmnt, dirstream);
return !!ret;
 }
diff --git a/cmds-replace.c b/cmds-replace.c
index d9b0940..c683d6c 100644
--- a/cmds-replace.c
+++ b/cmds-replace.c
@@ -276,12 +276,11 @@ static int cmd_start_replace(int argc, char **argv)
}
strncpy((char *)start_args.start.tgtdev_name, dstdev,
BTRFS_DEVICE_PATH_NAME_MAX);
-   if (btrfs_prepare_device(fddstdev, dstdev, 1, &dstdev_block_count, 0,
-&mixed, 0)) {
-   fprintf(stderr, "Error: Failed to prepare device '%s'\n",
-   dstdev);
+   ret = btrfs_prepare_device(fddstdev, dstdev, 1, &dstdev_block_count, 0,
+&mixed, 0);
+   if (ret)
goto leave_with_error;
-   }
+
close(fddstdev);
fddstdev = -1;
 
diff --git a/mkfs.c b/mkfs.c
index 33369f9..18df087 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1446,6 +1446,10 @@ int main(int ac, char **av)
first_file = file;
ret = btrfs_prepare_device(fd, file, zero_end, &dev_block_count,
   block_count, &mixed, discard);
+   if (ret) {
+   close(fd);
+   exit(1);
+   }
if (block_count && block_count > dev_block_count) {
fprintf(stderr, "%s is smaller than requested size\n", file);
exit(1);
@@ -1553,8 +1557,11 @@ int main(int ac, char **av)
}
ret = btrfs_prepare_device(fd, file, zero_end, &dev_block_count,
   block_count, &mixed, discard);
+   if (ret) {
+   close(fd);
+   exit(1);
+   }
mixed = old_mixed;
-   BUG_ON(ret);
 
ret = btrfs_add_to_fsid(trans, root, fd, file, dev_block_count,
sectorsize, sectorsize, sectorsize);
diff --git a/utils.c b/utils.c
index f499023..03947bd 100644
--- a/utils.c
+++ b/utils.c
@@ -581,13 +581,13 @@ int btrfs_prepare_device(int fd, char *file, int zero_end, u64 *block_count_ret,
ret = fstat(fd, &st);
if (ret < 0) {
fprintf(stderr, "unable to stat %s\n", file);
-   exit(1);
+   return 1;
}
 
block_count = btrfs_device_size(fd, &st);
if (block_count == 0) {
fprintf(stderr, "unable to find %s size\n", file);
-   exit(1);
+   return 1;
}
if (max_block_count)
block_count = min(block_count, max_block_count);
@@ -612,26 +612,34 @@ int btrfs_prepare_device(int fd, char *file, int zero_end, u64 *block_count_ret,
}
 
ret = zero_dev_start(fd);
-   if (ret) {
-   fprintf(stderr, "failed to zero device start %d\n", ret);
-   exit(1);
-   }
+   if (ret)
+   goto zero_dev_error;
 
for (i = 0 ; i < BTRFS_SUPER_MIRROR_MAX; i++) {
bytenr = btrfs_sb_offset(i);
if (bytenr >= block_count)
break;
-   zero_blocks(fd, bytenr, BTRFS_SUPER_INFO_SIZE);
+   ret = zero_blocks(fd, bytenr, BTRFS_SUPER_INFO_SIZE);
+   if (ret)
+   goto zero_dev_error;
}
 
if (zero_end) {
ret = zero_dev_end(fd, block_count);
-   if (ret) {
-   fprintf(stderr, "failed to zero device end %d\n", ret);
-   exit(1);
-   }
+  

[PATCH v3 2/3] btrfs-progs: use stripe_len define here

2013-12-17 Thread Anand Jain
Signed-off-by: Anand Jain anand.j...@oracle.com
---
 v3: volumes.c needs BTRFS_STRIPE_LEN as well
 v2: commit update

 btrfs-convert.c |2 +-
 btrfs-image.c   |2 +-
 volumes.c   |2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/btrfs-convert.c b/btrfs-convert.c
index 65fe707..df20c15 100644
--- a/btrfs-convert.c
+++ b/btrfs-convert.c
@@ -1715,7 +1715,7 @@ static int prepare_system_chunk_sb(struct btrfs_super_block *super)
 
btrfs_set_stack_chunk_length(chunk, btrfs_super_total_bytes(super));
btrfs_set_stack_chunk_owner(chunk, BTRFS_EXTENT_TREE_OBJECTID);
-   btrfs_set_stack_chunk_stripe_len(chunk, 64 * 1024);
+   btrfs_set_stack_chunk_stripe_len(chunk, BTRFS_STRIPE_LEN);
btrfs_set_stack_chunk_type(chunk, BTRFS_BLOCK_GROUP_SYSTEM);
btrfs_set_stack_chunk_io_align(chunk, sectorsize);
btrfs_set_stack_chunk_io_width(chunk, sectorsize);
diff --git a/btrfs-image.c b/btrfs-image.c
index 7bcfc06..1b2831a 100644
--- a/btrfs-image.c
+++ b/btrfs-image.c
@@ -1350,7 +1350,7 @@ static void update_super_old(u8 *buffer)
 
btrfs_set_stack_chunk_length(chunk, (u64)-1);
btrfs_set_stack_chunk_owner(chunk, BTRFS_EXTENT_TREE_OBJECTID);
-   btrfs_set_stack_chunk_stripe_len(chunk, 64 * 1024);
+   btrfs_set_stack_chunk_stripe_len(chunk, BTRFS_STRIPE_LEN);
btrfs_set_stack_chunk_type(chunk, BTRFS_BLOCK_GROUP_SYSTEM);
btrfs_set_stack_chunk_io_align(chunk, sectorsize);
btrfs_set_stack_chunk_io_width(chunk, sectorsize);
diff --git a/volumes.c b/volumes.c
index c38da6c..7a9c955 100644
--- a/volumes.c
+++ b/volumes.c
@@ -773,7 +773,7 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
int looped = 0;
int ret;
int index;
-   int stripe_len = 64 * 1024;
+   int stripe_len = BTRFS_STRIPE_LEN;
struct btrfs_key key;
u64 offset;
 
-- 
1.7.1



Re: btrfs send in 3.12 : can't find snapshot?

2013-12-17 Thread Wang Shilong

Hello Michael,

I sent a patch to fix the issue (you are already CC'd). Could you try it
and see if it fixes your problem?

Thanks,
Wang
On 12/17/2013 09:28 AM, Michael Welsh Duggan wrote:

Wang Shilong wangshilong1...@gmail.com writes:


Hello Michael,


I built the new btrfs-progs 3.12 recently.  I note that the version
information doesn't seem to match this:

# ./btrfs --version
Btrfs v0.20-rc1-358-g194aa4a

Regardless, I was trying to use btrfs send (which worked in the older
btrfs), and failed.  Here's an example:

# ./btrfs send -vvv -p /snapshots/bo /snapshots/bp > /dev/null
At subvol /snapshots/bp
ERROR: open @/snapshots/bp failed. No such file or directory

Any idea what might be going on here?

Here's the volume information:

# ./btrfs sub show /
/
        Name:                   @
        uuid:                   e5e505d6-1309-8447-b51e-73f08c9401d1
        Parent uuid:            156f93b9-1175-dc42-a1ee-65c00c5dcc2a
        Creation time:          2013-07-17 20:44:46
        Object ID:              259
        Generation (Gen):       296321
        Gen at creation:        20
        Parent:                 5
        Top Level:              5
        Flags:                  -
        Snapshot(s):
                                snapshots/bo
                                snapshots/bp

Kernel information:

It seems that you changed your default subvolume (259, not 5).
I sent a patch earlier to fix this problem. It has not been pushed into
Chris's master branch yet; the patch URL is:

https://patchwork.kernel.org/patch/3258971/

But it has been pushed into David's latest integration branch; you
can pull from:

git pull http://github.com/kdave/btrfs-progs.git integration-20131211

After compiling this version, the above test works.  Now, however, the
receive fails:

 # ./btrfs send -p /snapshots/bo /snapshots/bp | ./btrfs receive /backup/snapshots/root/
 At subvol /snapshots/bp
 At snapshot bp
 ioctl(BTRFS_IOC_TREE_SEARCH, uuid, key 48f0ebae83fd32f1, UUID_KEY, 90139d8200afeaab) ret=-1, error: No such file or directory
 ioctl(BTRFS_IOC_TREE_SEARCH, uuid, key 48f0ebae83fd32f1, UUID_KEY, 90139d8200afeaab) ret=-1, error: No such file or directory
 ERROR: could not find parent subvolume

More volume information:

# ./btrfs sub show /backup/snapshots/root/bo
/backup/snapshots/root/bo
        Name:                   bo
        uuid:                   5e15ef24-f2d0-194f-886d-3f7afc7413a4
        Parent uuid:            9a226af3-8497-744b-90f7-d7e54d58946d
        Creation time:          2013-12-13 17:51:57
        Object ID:              1030
        Generation (Gen):       1046
        Gen at creation:        1042
        Parent:                 5
        Top Level:              5
        Flags:                  readonly
        Snapshot(s):
# ./btrfs sub show /snapshots/bo
/snapshots/bo
        Name:                   bo
        uuid:                   f132fd83-aeeb-f048-abea-af00829d1390
        Parent uuid:            e5e505d6-1309-8447-b51e-73f08c9401d1
        Creation time:          2013-12-13 17:50:15
        Object ID:              404
        Generation (Gen):       296977
        Gen at creation:        291623
        Parent:                 259
        Top Level:              259
        Flags:                  readonly
        Snapshot(s):
# ./btrfs sub show /snapshots/bp
/snapshots/bp
        Name:                   bp
        uuid:                   6f73d3f2-5f9b-4944-b2d2-3003331b2d10
        Parent uuid:            e5e505d6-1309-8447-b51e-73f08c9401d1
        Creation time:          2013-12-15 22:24:57
        Object ID:              405
        Generation (Gen):       296977
        Gen at creation:        296301
        Parent:                 259
        Top Level:              259
        Flags:                  readonly
        Snapshot(s):






Re: [PATCH 3/3] btrfs: Check read-only status of roots during send

2013-12-17 Thread Wang Shilong

Hello David,

Nice work. Before this patch, btrfs send
had to join a transaction to keep the commit root from changing.

I sent a follow-up patch that removes the transaction protection from btrfs send,
and I have one small comment below.

[...]
On 12/17/2013 12:34 AM, David Sterba wrote:

All the subvolues that are involved in send must be read-only during the
  s via SUBVOL_SETFLAGS
+*/
+   int send_in_progress;


Why not use u32 here?

Thanks,
Wang



[PATCH] Btrfs: remove transaction from btrfs send

2013-12-17 Thread Wang Shilong
Since David did the work that forces us to use read-only snapshots,
we can safely remove the transaction protection from btrfs send.

Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
---
 fs/btrfs/send.c | 33 -
 1 file changed, 33 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 945d1db..9e832f2 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -4522,7 +4522,6 @@ out:
 static int full_send_tree(struct send_ctx *sctx)
 {
int ret;
-   struct btrfs_trans_handle *trans = NULL;
struct btrfs_root *send_root = sctx->send_root;
struct btrfs_key key;
struct btrfs_key found_key;
@@ -4544,19 +4543,6 @@ static int full_send_tree(struct send_ctx *sctx)
key.type = BTRFS_INODE_ITEM_KEY;
key.offset = 0;
 
-join_trans:
-   /*
-* We need to make sure the transaction does not get committed
-* while we do anything on commit roots. Join a transaction to prevent
-* this.
-*/
-   trans = btrfs_join_transaction(send_root);
-   if (IS_ERR(trans)) {
-   ret = PTR_ERR(trans);
-   trans = NULL;
-   goto out;
-   }
-
/*
 * Make sure the tree has not changed after re-joining. We detect this
 * by comparing start_ctransid and ctransid. They should always match.
@@ -4580,19 +4566,6 @@ join_trans:
goto out_finish;
 
while (1) {
-   /*
-* When someone want to commit while we iterate, end the
-* joined transaction and rejoin.
-*/
-   if (btrfs_should_end_transaction(trans, send_root)) {
-   ret = btrfs_end_transaction(trans, send_root);
-   trans = NULL;
-   if (ret < 0)
-   goto out;
-   btrfs_release_path(path);
-   goto join_trans;
-   }
-
eb = path->nodes[0];
slot = path->slots[0];
btrfs_item_key_to_cpu(eb, &found_key, slot);
@@ -4620,12 +4593,6 @@ out_finish:
 
 out:
btrfs_free_path(path);
-   if (trans) {
-   if (!ret)
-   ret = btrfs_end_transaction(trans, send_root);
-   else
-   btrfs_end_transaction(trans, send_root);
-   }
return ret;
 }
 
-- 
1.8.3.1



Re: [PATCH 3/3] btrfs: Check read-only status of roots during send

2013-12-17 Thread David Sterba
On Tue, Dec 17, 2013 at 07:58:24PM +0800, Wang Shilong wrote:
 Nice work, Before this patch for btrfs send.
 we have to join a transaction to avoid commit root changed.

That sounds like a good improvement.

 I send a plus patch that remove transaction protection from btrfs send.
 and  a little comment below.
 
 [...]
 On 12/17/2013 12:34 AM, David Sterba wrote:
 All the subvolues that are involved in send must be read-only during the
   s via SUBVOL_SETFLAGS
 + */
 +int send_in_progress;
 
 Why not use u32 here?

The int type should be enough to hold refs for all running sends, if
this is your concern.

I thought it's a refcount, it should not go below 0 but if it does, then
it should be caught. I'll update the patch to check if send_in_progress
is not negative after the decrements.

thanks,
david


Re: [PATCH] Btrfs: move the extent buffer radix tree into the fs_info

2013-12-17 Thread David Sterba
On Tue, Dec 17, 2013 at 01:19:39AM +, Chris Mason wrote:
 On Tue, 2013-12-17 at 02:06 +0100, David Sterba wrote:
  On Mon, Dec 16, 2013 at 01:26:26PM -0500, Josef Bacik wrote:
   I need to create a fake tree to test qgroups and I don't want to have to 
   setup a
   fake btree_inode.  The fact is we only use the radix tree for the 
   fs_info, so
   everybody else who allocates an extent_io_tree is just wasting the space 
   anyway.
   This patch moves the radix tree and its lock into btrfs_fs_info so there 
   is less
   stuff I have to fake to do qgroup sanity tests.  Thanks,
  
  This would make the fs_info::buffer_lock a global hotspot if
  alloc_extent_buffer and release_extent_buffer are called frequently.
  
 
 But since the only place that was really using it was the metadata
 btree, the lock shouldn't be hotter than before right?

Right. What confused me first is that the number of trees that are
initialized by extent_io_tree_init is higher, but the only user is
metadata btree.


[no subject]

2013-12-17 Thread David Sterba
Subject: [PATCH 4/3] btrfs: check balance of send_in_progress

Warn if the balance goes below zero, which appears to be unlikely
though. Otherwise cleans up the code a bit.

Signed-off-by: David Sterba dste...@suse.cz
---

A followup to 3/3 that adds the check if send_in_progress is not going below
zero. It's a separate patch rather than folded into 3/3 so the change is
clearly visible. I'm not convinced that it's necessary to be that cautious
because it looks almost impossible to happen, but on the other hand we'd never
know that it happened.

 fs/btrfs/send.c |   38 --
 1 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 468eba26ad8c..46ea0cdfb88b 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -4618,6 +4618,21 @@ out:
return ret;
 }
 
+static void btrfs_root_dec_send_in_progress(struct btrfs_root* root)
+{
+   spin_lock(&root->root_item_lock);
+   root->send_in_progress--;
+   /*
+* Not much left to do, we don't know why it's unbalanced and
+* can't blindly reset it to 0.
+*/
+   if (root->send_in_progress < 0)
+   btrfs_err(root->fs_info,
+   "send_in_progress unbalanced %d root %llu\n",
+   root->send_in_progress, root->root_key.objectid);
+   spin_unlock(&root->root_item_lock);
+}
+
 long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
 {
int ret = 0;
@@ -4835,24 +4850,11 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
}
 
 out:
-   for (i = 0; i < clone_sources_to_rollback; i++) {
-   struct btrfs_root *r = sctx->clone_roots[i].root;
-
-   spin_lock(&r->root_item_lock);
-   r->send_in_progress--;
-   spin_unlock(&r->root_item_lock);
-   }
-   if (!IS_ERR(sctx->parent_root)) {
-   struct btrfs_root *r = sctx->parent_root;
-
-   spin_lock(&r->root_item_lock);
-   r->send_in_progress--;
-   spin_unlock(&r->root_item_lock);
-   }
-
-   spin_lock(&send_root->root_item_lock);
-   send_root->send_in_progress--;
-   spin_unlock(&send_root->root_item_lock);
+   for (i = 0; i < clone_sources_to_rollback; i++)
+   btrfs_root_dec_send_in_progress(sctx->clone_roots[i].root);
+   if (!IS_ERR(sctx->parent_root))
+   btrfs_root_dec_send_in_progress(sctx->parent_root);
+   btrfs_root_dec_send_in_progress(send_root);
 
kfree(arg);
vfree(clone_sources_tmp);
-- 
1.7.9



Re: [PATCH v2 3/3] btrfs-progs: handle error in the btrfs_prepare_device

2013-12-17 Thread David Sterba
On Tue, Dec 17, 2013 at 04:37:35PM +0800, Anand Jain wrote:
 +zero_dev_error:
 +   if (ret) {
 +   ret < 0 ?
 +   fprintf(stderr, "ERROR: failed to zero device start '%s' - %s\n",
 +   file, strerror(-ret)) :
 +   fprintf(stderr, "ERROR: failed to zero device start '%s' - %d\n",
 +   file, ret);
 
 This is not funny.
 
  hmm. I am not sure what you mean ?

It's rather obfuscated, even though it's valid C. The goal is not to minimize the
source line count but the time needed to understand the code.
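
One plainer way to write the reporting, as a sketch only. The function name format_zero_error is hypothetical, it formats into a buffer rather than printing so it can be checked, and it assumes, as the quoted hunk does, that a negative ret carries a negated errno value while a positive ret is some other failure code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not from the patch. Writes the message into buf
 * and returns whatever snprintf() returns.
 */
static int format_zero_error(char *buf, size_t len, const char *file, int ret)
{
	assert(ret != 0);	/* only called on failure */

	if (ret < 0)
		/* Negative values are treated as -errno. */
		return snprintf(buf, len,
				"ERROR: failed to zero device start '%s' - %s\n",
				file, strerror(-ret));

	/* Positive values have no errno meaning, so print the number. */
	return snprintf(buf, len,
			"ERROR: failed to zero device start '%s' - %d\n",
			file, ret);
}
```

A plain if/else costs a few more lines than the ternary but makes the two cases visible at a glance.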


Re: [PATCH] btrfs-progs: fix btrfstune silence on failure

2013-12-17 Thread David Sterba
On Fri, Dec 13, 2013 at 05:59:46PM +0800, Gui Hecheng wrote:
 Originally, btrfstune will fail without any options and just exit
 with no failure prompt.

Works for me:

$ ./btrfstune
usage: btrfstune [options] device
-S value        enable/disable seeding
-r              enable extended inode refs
-x              enable skinny metadata extent refs

 Now, the number of arguments are checked before parse options
 and error msg will show up upon failure.

No, the arguments should be parsed first. The btrfstune utility does not
use the same parser helpers like check_argc_exact and actually the bug
you see could be caused by missing optind = 1 before the while () loop.

Can you please test if this helps?

--- a/btrfstune.c
+++ b/btrfstune.c
@@ -115,6 +115,7 @@ int main(int argc, char *argv[])
int skinny_flag = 0;
int ret;

+   optind = 1;
while(1) {
int c = getopt(argc, argv, "S:rx");
if (c < 0)


Re: [PATCH] Btrfs: move the extent buffer radix tree into the fs_info

2013-12-17 Thread Josef Bacik


On 12/16/2013 08:06 PM, David Sterba wrote:

On Mon, Dec 16, 2013 at 01:26:26PM -0500, Josef Bacik wrote:

I need to create a fake tree to test qgroups and I don't want to have to setup a
fake btree_inode.  The fact is we only use the radix tree for the fs_info, so
everybody else who allocates an extent_io_tree is just wasting the space anyway.
This patch moves the radix tree and its lock into btrfs_fs_info so there is less
stuff I have to fake to do qgroup sanity tests.  Thanks,

This would make the fs_info::buffer_lock a global hotspot if
alloc_extent_buffer and release_extent_buffer are called frequently.

But, you can get rid of the buffer_lock completely, because the radix
tree can be safely protected by rcu_read_lock/_unlock:

* alloc_extent_buffer uses radix_preload that turns off preepmtion by
itself, so the lock here would be pointless


Except you still need a lock for other inserts.


* release_extent_buffer locks around radix_tree_delete, here a rcu
locking will be ok as well


No it won't.  RCU just makes sure readers don't get screwed, you still 
need to have real locking around the insertions/deletions, look at 
pagecache, we have mapping-tree_lock for this even though it uses rcu 
for the lookups.  Thanks,


Josef


Re: [PATCH] btrfs-progs: fix btrfstune silence on failure

2013-12-17 Thread Wang Shilong
Hi dave,

 On Fri, Dec 13, 2013 at 05:59:46PM +0800, Gui Hecheng wrote:
 Originally, btrfstune will fail without any options and just exit
 with no failure prompt.
 
 Works for me:
 
 $ ./btrfstune
 usage: btrfstune [options] device
   -S valueenable/disable seeding
   -r  enable extended inode refs
   -x enable skinny metadata extent refs
This is not the problem that the patch addresses.
You can try this:

# btrfstune /dev/sdb

This prints nothing, although it returns 1.

 
 Now, the number of arguments are checked before parse options
 and error msg will show up upon failure.
 
 No, the arguments should be parsed first. The btrfstune utility does not
 use the same parser helpers like check_argc_exact and actually the bug
 you see could be caused by missing optind = 1 before the while () loop.
 
 Can you please test if this helps?
 
 --- a/btrfstune.c
 +++ b/btrfstune.c
 @@ -115,6 +115,7 @@ int main(int argc, char *argv[])
int skinny_flag = 0;
int ret;
 
 +   optind = 1;

The default value of optind is 1, though it is still better to assign it explicitly.

I think Gui Hecheng's patch is the right way to fix the problem, but maybe we can add a
check after argument parsing,
something like:

if (!(seeding_flag + exrefs_flag + skinny_flag))
fprintf(stderr, "You should assign at least one option for btrfstune");

What do you think? ^_^

Thanks,
Wang
while(1) {
int c = getopt(argc, argv, S:rx);
if (c  0)



Re: [PATCH] Btrfs-progs: receive: fix the case that we can not find subvolume

2013-12-17 Thread David Sterba
On Tue, Dec 17, 2013 at 05:13:49PM +0800, Wang Shilong wrote:
 If we change our default subvolume, btrfs receive will fail to find
 subvolume. To fix this problem, i have two ideas.
 
 1.make btrfs snapshot ioctl support passing source subvolume's objectid

 2.when we want to using interval subvolume path, we mount it other place
 that use subvolume 5 as its default subvolume.

3. Tell the user to mount the toplevel subvol by himself and run receive
   again

 We'd better use the second approach because it won't bother kernel change.

I don't think that the silent mount is the right way to fix it; that way
the btrfs tool takes on the responsibility not to break anything, like the
unhandled umount failure below. I think admins and power users do not
like to see some random tool mess with the system like this.

 @@ -199,6 +200,10 @@ static int process_snapshot(const char *path, const u8 
 *uuid, u64 ctransid,
   char uuid_str[BTRFS_UUID_UNPARSED_SIZE];
   struct btrfs_ioctl_vol_args_v2 args_v2;
   struct subvol_info *parent_subvol = NULL;
 + char *dev = NULL;
 + char tmp_name[15] = btrfs-XX;
 + char tmp_dir[30] = /tmp;

Mounting valuable data under /tmp is dangerous, what if some /tmp
cleaner starts to remove old files. I've seen that happen in practice.

 @@ -269,10 +308,14 @@ static int process_snapshot(const char *path, const u8 
 *uuid, u64 ctransid,
   fprintf(stderr, ERROR: creating snapshot %s - %s 
   failed. %s\n, parent_subvol-path,
   path, strerror(-ret));
 - goto out;
   }
  
 +out_umount:
 + umount(tmp_dir);

umount fails for whatever reason,

 + rmdir(tmp_dir);

at least this does not delete the files recursively.
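
For reference, handling the umount failure could look roughly like this. It is a sketch under assumptions: cleanup_tmp_mount is a hypothetical helper that is not in the patch, and it relies on the Linux umount(2) call from <sys/mount.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the patch: unmount a temporary mount
 * point and remove its directory. Returns 0 on success or a negative
 * errno value. The directory is left in place if umount() fails,
 * because it may still be a live mount point.
 */
static int cleanup_tmp_mount(const char *dir)
{
	int err;

	assert(dir != NULL);

	if (umount(dir) < 0) {
		err = errno;
		fprintf(stderr, "ERROR: umount %s failed: %s\n",
			dir, strerror(err));
		return -err;
	}

	/* rmdir() only removes an empty directory; it never recurses. */
	if (rmdir(dir) < 0) {
		err = errno;
		fprintf(stderr, "ERROR: rmdir %s failed: %s\n",
			dir, strerror(err));
		return -err;
	}
	return 0;
}
```

The caller can then decide whether a leftover mount should make the whole receive fail.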


Re: [PATCH] Btrfs: move the extent buffer radix tree into the fs_info

2013-12-17 Thread David Sterba
On Tue, Dec 17, 2013 at 09:56:07AM -0500, Josef Bacik wrote:
 * alloc_extent_buffer uses radix_preload that turns off preepmtion by
 itself, so the lock here would be pointless
 
 Except you still need a lock for other inserts.
 
 * release_extent_buffer locks around radix_tree_delete, here a rcu
 locking will be ok as well
 
 No it won't.  RCU just makes sure readers don't get screwed, you still need
 to have real locking around the insertions/deletions, look at pagecache, we
 have mapping-tree_lock for this even though it uses rcu for the lookups.

Oh, my bad sorry, that would be too easy.


Re: [PATCH] Btrfs-progs: receive: fix the case that we can not find subvolume

2013-12-17 Thread Wang Shilong

Hi dave,

 On Tue, Dec 17, 2013 at 05:13:49PM +0800, Wang Shilong wrote:
 If we change our default subvolume, btrfs receive will fail to find
 subvolume. To fix this problem, i have two ideas.
 
 1.make btrfs snapshot ioctl support passing source subvolume's objectid
 
 2.when we want to using interval subvolume path, we mount it other place
 that use subvolume 5 as its default subvolume.
 
 3. Tell the user to mount the toplevel subvol by himself and run receive
   again

If we really don't want a kernel change, I think we can add an option
to btrfs receive (for example -f)
that forces the tool to resolve such an ENOENT and, at the same time, prints something
like:

fprintf(stderr, "Default subvolume is changed,……….")

If -f is not given, we fail here.

 
 We'd better use the second approach because it won't bother kernel change.
 
 I don't think that the silent mount is the right way to fix it, that way
 the btrfs tool tooks responsibility not to break anything.  Like the
 unhandled umount failure below. I think admins and power users do not
 like to see some random tool mess with the system like this.

 
 @@ -199,6 +200,10 @@ static int process_snapshot(const char *path, const u8 
 *uuid, u64 ctransid,
  char uuid_str[BTRFS_UUID_UNPARSED_SIZE];
  struct btrfs_ioctl_vol_args_v2 args_v2;
  struct subvol_info *parent_subvol = NULL;
 +char *dev = NULL;
 +char tmp_name[15] = btrfs-XX;
 +char tmp_dir[30] = /tmp;
 
 Mounting valuable data under /tmp is dangerous, what if some /tmp
 cleaner starts to remove old files. I've seen that happen in practice.

Agree with  this.

 
 @@ -269,10 +308,14 @@ static int process_snapshot(const char *path, const u8 
 *uuid, u64 ctransid,
  fprintf(stderr, ERROR: creating snapshot %s - %s 
  failed. %s\n, parent_subvol-path,
  path, strerror(-ret));
 -goto out;
  }
 
 +out_umount:
 +umount(tmp_dir);
 
 umount fails for whatever reason,

will fix it.

 
 +rmdir(tmp_dir);
 
 at least this does not delete the files recursively.

Why do we need to delete the files recursively here?
I only create a directory, something like /tmp/btrfs-X, and I only want to delete
the temp dir
btrfs- here…

Thanks,
Wang




Re: [PATCH] Btrfs: fix use of uninitialized err variable

2013-12-17 Thread Filipe David Manana
On Tue, Dec 17, 2013 at 3:27 PM, David Sterba dste...@suse.cz wrote:
 On Mon, Dec 16, 2013 at 05:03:25PM +, Filipe David Manana wrote:
 On Mon, Dec 16, 2013 at 2:34 PM, David Sterba dste...@suse.cz wrote:
  On Fri, Dec 13, 2013 at 07:39:34PM +, Filipe David Borba Manana wrote:
  From the compiler:
 
  fs/btrfs/file.c: In function ‘prepare_pages.isra.18’:
  fs/btrfs/file.c:1265:6: warning: ‘err’ may be used uninitialized in this 
  function [-Wuninitialized]
 
  My gcc 4.8.1 does not see this warning, nor do I while inspecting the
  sources in current next-master.

 Here it's gcc 4.6.3.

 I've seen that some versions of gcc produce bogus warnings of that sort
 and manual review is needed, but I haven't found a code path that would
 lead to uninitialized use of err.

 The warning points to

 1259 if (i == 0)
 1260 err = prepare_uptodate_page(pages[i], pos,
 1261 force_uptodate);
 1262 if (i == num_pages - 1)
 1263 err = prepare_uptodate_page(pages[i],
 1264 pos + write_bytes, false);
 1265 if (err) {
 1266 page_cache_release(pages[i]);
 1267 faili = i - 1;
 1268 goto fail;
 1269 }

 But the loop starts from i = 0 and the variable is initialized before
 the check. So it's gcc that does not see that, not a real error.

Right, my intention was to silence a compiler warning. Should have
made it more explicit in the commit message title.
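
A reduced sketch of the loop shape that triggers the warning, with the explicit initializer that silences it. The names fake_prepare and prepare_all are hypothetical stand-ins, and the "!err &&" guard on the last-page call is an addition made for this sketch rather than something taken from fs/btrfs/file.c:

```c
#include <assert.h>

/* Stand-in for prepare_uptodate_page(); returns the result it is given. */
static int fake_prepare(int result)
{
	return result;
}

/*
 * Hypothetical reduction of the loop the warning points at. err is only
 * assigned for the first and the last page, which is the pattern some
 * gcc versions fail to see through.
 */
static int prepare_all(int num_pages, int first_result, int last_result)
{
	int err = 0;	/* explicit initializer silences -Wuninitialized */
	int i;

	assert(num_pages >= 0);

	for (i = 0; i < num_pages; i++) {
		if (i == 0)
			err = fake_prepare(first_result);
		/* Sketch-only guard: keep an earlier error when there is one page. */
		if (!err && i == num_pages - 1)
			err = fake_prepare(last_result);
		if (err)
			return err;
	}
	return 0;
}
```

The initializer does not change behaviour, since i == 0 always assigns err before the first check; it only makes that obvious to the compiler.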




-- 
Filipe David Manana,

Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men.


Re: [PATCH 1/7] btrfs: subpagesize-blocksize: Define extent_buffer_head

2013-12-17 Thread David Sterba
On Mon, Dec 16, 2013 at 10:17:18AM -0600, Chandra Seetharaman wrote:
 On Mon, 2013-12-16 at 14:32 +0200, saeed bishara wrote:
  On Thu, Dec 12, 2013 at 1:38 AM, Chandra Seetharaman
  sekha...@us.ibm.com wrote:
   In order to handle multiple extent buffers per page, first we
   need to create a way to handle all the extent buffers that
   are attached to a page.
  
   This patch creates a new data structure eb_head, and moves
   fields that are common to all extent buffers in a page from
   extent buffer to eb_head.
  
   This also adds changes that are needed to handle multiple
   extent buffers per page case.
  
   Signed-off-by: Chandra Seetharaman sekha...@us.ibm.com
   ---
 
 snip
 
   diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
   index 54ab861..02de448 100644
   --- a/fs/btrfs/ctree.h
   +++ b/fs/btrfs/ctree.h
   @@ -2106,14 +2106,16 @@ static inline void btrfs_set_token_##name(struct 
   extent_buffer *eb, \
#define BTRFS_SETGET_HEADER_FUNCS(name, type, member, bits)\
static inline u##bits btrfs_##name(struct extent_buffer *eb)   \
{  \
   -   type *p = page_address(eb->pages[0]);   \
   +   type *p = page_address(eb_head(eb)->pages[0]) + \
   +   (eb->start & (PAGE_CACHE_SIZE -1)); \
  you can use PAGE_CACHE_MASK instead of PAGE_CACHE_SIZE - 1
 
  PAGE_CACHE_MASK gets the page part of the value, not the offset in the
  page, i.e. it is defined as
 
 #define PAGE_MASK (~(PAGE_SIZE-1))

Use ~PAGE_CACHE_MASK to get the offset. It's common, though not obvious
at first.
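
The two masks can be compared in a small userspace sketch. The DEMO_ names and the 4096-byte page size are assumptions made for illustration; in the kernel the page size comes from the architecture.

```c
#include <stdint.h>

/* Assumed page size for this sketch only. */
#define DEMO_PAGE_SIZE 4096ULL
/* Same shape as the definition quoted above: keeps the page part. */
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/* Page-aligned base of a byte address: value & MASK. */
static uint64_t demo_page_base(uint64_t start)
{
    return start & DEMO_PAGE_MASK;
}

/* Offset inside the page: value & ~MASK, which is the same as
 * value & (SIZE - 1). */
static uint64_t demo_offset_in_page(uint64_t start)
{
    return start & ~DEMO_PAGE_MASK;
}
```

With these definitions, 0x12345 splits into a page base of 0x12000 and an in-page offset of 0x345.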


Re: [PATCH] Btrfs: fix use of uninitialized err variable

2013-12-17 Thread David Sterba
On Mon, Dec 16, 2013 at 05:03:25PM +, Filipe David Manana wrote:
 On Mon, Dec 16, 2013 at 2:34 PM, David Sterba dste...@suse.cz wrote:
  On Fri, Dec 13, 2013 at 07:39:34PM +, Filipe David Borba Manana wrote:
  From the compiler:
 
  fs/btrfs/file.c: In function ‘prepare_pages.isra.18’:
  fs/btrfs/file.c:1265:6: warning: ‘err’ may be used uninitialized in this 
  function [-Wuninitialized]
 
  My gcc 4.8.1 does not see this warning, nor do I while inspecting the
  sources in current next-master.
 
 Here it's gcc 4.6.3.

I've seen that some versions of gcc produce bogus warnings of that sort
and manual review is needed, but I haven't found a code path that would
lead to uninitialized use of err.

The warning points to

1259         if (i == 0)
1260                 err = prepare_uptodate_page(pages[i], pos,
1261                                             force_uptodate);
1262         if (i == num_pages - 1)
1263                 err = prepare_uptodate_page(pages[i],
1264                                 pos + write_bytes, false);
1265         if (err) {
1266                 page_cache_release(pages[i]);
1267                 faili = i - 1;
1268                 goto fail;
1269         }

But the loop starts from i = 0 and the variable is initialized before
the check. So it's gcc that does not see that, not a real error.


Re: [PATCH] Btrfs-progs: receive: fix the case that we can not find subvolume

2013-12-17 Thread Michael Welsh Duggan
David Sterba dste...@suse.cz writes:

 On Tue, Dec 17, 2013 at 05:13:49PM +0800, Wang Shilong wrote:
 If we change our default subvolume, btrfs receive will fail to find
 subvolume. To fix this problem, i have two ideas.
 
 1.make btrfs snapshot ioctl support passing source subvolume's objectid

 2.when we want to using interval subvolume path, we mount it other place
 that use subvolume 5 as its default subvolume.

 3. Tell the user to mount the toplevel subvol by himself and run receive
again

Ugh.  I hope that would be considered a short-term hack waiting for a
better solution, perhaps requiring a kernel upgrade.  From a user's
perspective there is no reason this should be necessary, and requiring
this would be extraordinarily surprising.  Why is btrfs unable to find
my snapshot?  It's right there!  Moreover, this used to work just fine
in previous versions of btrfs-progs.

 We'd better use the second approach because it won't bother kernel change.

 I don't think that the silent mount is the right way to fix it, that way
 the btrfs tool takes responsibility not to break anything.  Like the
 unhandled umount failure below. I think admins and power users do not
 like to see some random tool mess with the system like this.

 @@ -199,6 +200,10 @@ static int process_snapshot(const char *path,
 const u8 *uuid, u64 ctransid,
  char uuid_str[BTRFS_UUID_UNPARSED_SIZE];
  struct btrfs_ioctl_vol_args_v2 args_v2;
  struct subvol_info *parent_subvol = NULL;
 +char *dev = NULL;
 +char tmp_name[15] = "btrfs-XX";
 +char tmp_dir[30] = "/tmp";

 Mounting valuable data under /tmp is dangerous, what if some /tmp
 cleaner starts to remove old files. I've seen that happen in practice.

Agreed.  If you _were_ to continue to implement it like this, you should
include code to respect the TMPDIR envvar at the very least.
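
If the temporary-mount approach were kept, honouring TMPDIR could look roughly like the sketch below. It only builds the path template and neither creates nor mounts anything. The helper names are hypothetical, and the "btrfs-XXXXXX" suffix is an assumed mkdtemp()-style template, not taken from the patch.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Build a mkdtemp()-style template under the directory named by
 * tmpdir_env, falling back to /tmp when it is NULL or empty.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int demo_build_template(const char *tmpdir_env, char *buf, size_t len)
{
    const char *base = (tmpdir_env && tmpdir_env[0]) ? tmpdir_env : "/tmp";
    int n = snprintf(buf, len, "%s/btrfs-XXXXXX", base);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Convenience wrapper that reads TMPDIR from the environment. */
static int demo_template_from_env(char *buf, size_t len)
{
    return demo_build_template(getenv("TMPDIR"), buf, len);
}
```

Taking the environment value as a parameter keeps the path logic easy to check in isolation; the wrapper is what a caller would normally use.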

-- 
Michael Welsh Duggan
(m...@cert.org)


What is needed to build an AFS fileserver on top of BTRFS?

2013-12-17 Thread David Howells

It has occurred to me and others that something like BTRFS could be a good fit
to build an AFS fileserver directly on top of.  The question is what facilities
would be needed from BTRFS to make this work?

So I thought I'd kick off a shopping list;-)

 (1) 64-bit data version numbers that increase monotonically with each write.

 Yes, this is likely to cause some performance degradation as it introduces
 an ordering over data writes and metadata writes to a file.  Maybe writes
 can be batched to improve performance?

 (2) Storage for ACLs and AFS UIDs.  Having shareable ACLs might also be useful.

 Xattrs would likely do for this.

 (3) The ability to snapshot a filesystem to make backups and for pushing to
 read-only volume servers.

 (4) A 32-bit vnode number and 32-bit vnode uniquifier/generation number.

 These don't necessarily have to be stored by BTRFS directly but could
 instead be in a separate database file that gets snapshotted also.

 (5) The ability to set the vnode number, vnode uniquifier and data version
 number to specific values.  Necessary to clone volumes and restore
 volume dumps.

David


Re: What is needed to build an AFS fileserver on top of BTRFS?

2013-12-17 Thread Chris Mason
On Tue, 2013-12-17 at 16:53 +, David Howells wrote:
 It has occurred to me and others that something like BTRFS could be a good fit
 to build an AFS fileserver directly on top of.  The question is what facilities
 would be needed from BTRFS to make this work?
 
 So I thought I'd kick off a shopping list;-)
 
  (1) 64-bit data version numbers that increase monotonically with each write.
 
  Yes, this is likely to cause some performance degradation as it introduces
  an ordering over data writes and metadata writes to a file.  Maybe writes
  can be batched to improve performance?

  (2) Storage for ACLs and AFS UIDs.  Having shareable ACLs might also be useful.
 
  Xattrs would likely do for this.
 
  (3) The ability to snapshot a filesystem to make backups and for pushing to
  read-only volume servers.
 
  (4) A 32-bit vnode number and 32-bit vnode uniquifier/generation number.
 
  These don't necessarily have to be stored by BTRFS directly but could
  instead be in a separate database file that gets snapshotted also.
 
  (5) The ability to set the vnode number, vnode uniquifier and data version
  number to specific values.  Necessary to clone volumes and restore
  volume dumps.

Hmmm, what exactly are vnodes?  Could we put them in xattrs?

-chris




Re: What is needed to build an AFS fileserver on top of BTRFS?

2013-12-17 Thread Hugo Mills
On Tue, Dec 17, 2013 at 04:53:16PM +, David Howells wrote:
 It has occurred to me and others that something like BTRFS could be
 a good fit to build an AFS fileserver directly on top of. The
 question is what facilities would be needed from BTRFS to make this
 work? So I thought I'd kick off a shopping list;-)

  (1) 64-bit data version numbers that increase monotonically with
 each write. Yes, this is likely to cause some performance
 degradation as it introduces an ordering over data writes and
 metadata writes to a file. Maybe writes can be batched to improve
 performance?

   Do these have to be per-file? If not, then you might be able to get
away with using the transid, which is a filesystem-global
monotonically-increasing number.

   btrfs batches disk writes already, and uses the transid to
differentiate these -- the writes come at 30 second intervals (by
default, although there's an option to change the period). There may
be multiple distinct changes to a single file within that transaction
(although obviously, only the state of the file after the last one
gets written to disk). I don't know exactly what you need it for, so
this may or may not be appropriate here.

   Ceph uses transids for [something, mumble, wavy-hand] -- I don't
know if the use-case for Ceph is equivalent to the use-case for AFS.

  (2) Storage for ACLs and AFS UIDs. Having shareable ACLs might also
 be useful. Xattrs would likely do for this.

   This would seem like a reasonable place to put them, given that
that's what POSIX ACLs do, and we have POSIX ACL support already.

  (3) The ability to snapshot a filesystem to make backups and for
  pushing to read-only volume servers.

   We have snapshots of subvolumes, but not the filesystem as a whole.

  (4) A 32-bit vnode number and 32-bit vnode uniquifier/generation
 number. These don't necessarily have to be stored by BTRFS directly
 but could instead be in a separate database file that gets
 snapshotted also.
 
  (5) The ability to set the vnode number, vnode uniquifier and data
  version number to specific values. Necessary to clone volumes
  and restore volume dumps.

   What's a vnode meant to represent? I'm not familiar with the
terminology.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Are you the man who rules the Universe? Well,  I ---   
  try not to.   




Re: What is needed to build an AFS fileserver on top of BTRFS?

2013-12-17 Thread David Howells
Chris Mason c...@fb.com wrote:

 Hmmm, what exactly are vnodes?  Could we put them in xattrs?

vnode numbers are AFS's equivalent of inode numbers.  Since they're one per
file, they could be the object filename.

Probably there would have to be a table of {vnode,latest_uniquifier} as the
uniquifier must still go up even if the vnode is unused for a while, so there
could also be a table of {vnode,btrfs_file}.

David


Re: What is needed to build an AFS fileserver on top of BTRFS?

2013-12-17 Thread David Howells
Hugo Mills h...@carfax.org.uk wrote:

   (1) 64-bit data version numbers that increase monotonically with
  each write. Yes, this is likely to cause some performance
  degradation as it introduces an ordering over data writes and
  metadata writes to a file. Maybe writes can be batched to improve
  performance?
 
Do these have to be per-file? If not, then you might be able to get
 away with using the transid, which is a filesystem-global
 monotonically-increasing number.

Yes.  If you send a write RPC op to the server, you get back the new version
number.  If the new version number is not the old version number + 1 you know
there was a collision with a write from another client and you have to flush
your cache for that file and request a new callback (ie. a promise to notify
you if someone else changes the file).

   (3) The ability to snapshot a filesystem to make backups and for
   pushing to read-only volume servers.
 
We have snapshots of subvolumes, but not the filesystem as a whole.

By filesystem I meant the current state of an AFS volume.  Very likely this
would be represented by a BTRFS subvolume, if I understand it correctly.  You
might have several AFS volumes represented within a BTRFS filesystem.  They
would be manipulated independently.

   (5) The ability to set the vnode number, vnode uniquifier and data
   version number to specific values. Necessary to clone volumes
   and restore volume dumps.
 
What's a vnode meant to represent? I'm not familiar with the
 terminology.

AFS's equivalent of an inode with a 32-bit number representing it.  See my
reply to Chris's question about the same thing.

David


Re: Feature Req: mkfs.btrfs -d dup option on single device

2013-12-17 Thread Imran Geriskovan
On 12/12/13, Chris Mason c...@fb.com wrote:
 For me anyway, data=dup in mixed mode is definitely an accident ;)
 I personally think data dup is a false sense of security, but drives
 have gotten so huge that it may actually make sense in a few
 configurations.

Sure, it's not about any security regarding the device.

It's about the capability of recovering from any
bit-rot which can creep into your backups and can be
detected when you need the file after 20-30 generations
of backups which is too late. (Who keeps that much
incremental archive and reads backup logs of millions of
files, regularly?)

 Someone asks for it roughly once a year, so it probably isn't a horrible
 idea.
 -chris

Today, I brought up an old 2 GB Seagate from the basement.
It has literally rusted, so it deserves the title of
"Spinning Rust" for real. I had no real hope that it would work,
but out of curiosity I plugged it into a USB-IDE box.

It spun up and, wow, it showed up among the devices.
It had two swap partitions and an ext2 partition. I remembered that it was
one of the disks used for Linux installations more than
10 years ago. I mounted it. Most of the files date back to 2001-07.

They are more than 12 years old and they seem to be intact,
with just one inode size mismatch (see the fsck output below).

If there were BTRFS (and -d dup :) ) at the time, now I would
perform a scrub and report the outcome here. Hence,
'Digital Archeology' can surely benefit from Btrfs. :)

PS: Regarding the SSD data retention debate, this can be an
interesting benchmark for a device which was kept in an unfavorable
environment.

Regards,
Imran


FSCK output:

fsck from util-linux 2.20.1
e2fsck 1.42.8 (20-Jun-2013)
/dev/sdb3 has gone 4209 days without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Special (device/socket/fifo) inode 82669 has non-zero size.  Fix<y>? yes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/sdb3: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdb3: 41930/226688 files (1.0% non-contiguous), 200558/453096 blocks


Re: [OpenAFS-devel] Re: What is needed to build an AFS fileserver on top of BTRFS?

2013-12-17 Thread Jeffrey Hutzelman
On Tue, 2013-12-17 at 17:47 +, David Howells wrote:
 Hugo Mills h...@carfax.org.uk wrote:
 
(1) 64-bit data version numbers that increase monotonically with
   each write. Yes, this is likely to cause some performance
   degradation as it introduces an ordering over data writes and
   metadata writes to a file. Maybe writes can be batched to improve
   performance?
  
 Do these have to be per-file? If not, then you might be able to get
  away with using the transid, which is a filesystem-global
  monotonically-increasing number.
 
 Yes.  If you send a write RPC op to the server, you get back the new version
 number.  If the new version number is not the old version number + 1 you know
 there was a collision with a write from another client and you have to flush
 your cache for that file and request a new callback (ie. a promise to notify
 you if someone else changes the file).

Right.  So, the DV must increment by exactly one for each successful
StoreData (and not for other changes).  This is important because
clients cache data and metadata independently, and cached data is
labeled with the file's DV.  This means that even if metadata for a file
has to be refetched for some reason (for example, an expired callback),
the _data_ doesn't have to be refetched unless it has actually changed,
or been evicted from the client's cache due to cache pressure.
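
The client-side rule described above can be sketched as follows. The struct and function names are hypothetical and do not come from any AFS implementation; the sketch only assumes that a successful store returns the file's new data version (DV).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical client-side cache entry; field names are illustrative. */
struct demo_cached_file {
    uint64_t data_version;  /* DV the cached data is labeled with */
    bool data_valid;        /* whether the cached data may still be used */
};

/*
 * Apply the result of a successful store. If the server's new DV is
 * exactly old DV + 1, only this client wrote, so the cached data stays
 * valid under the new label. Any other value means another writer got
 * in between and the cached data must be dropped and refetched.
 * Returns true when the cached data was kept.
 */
static bool demo_store_completed(struct demo_cached_file *f, uint64_t new_dv)
{
    bool sole_writer = (new_dv == f->data_version + 1);

    f->data_valid = f->data_valid && sole_writer;
    f->data_version = new_dv;
    return f->data_valid;
}
```

This is also why a filesystem-global counter would not fit here: the check relies on the per-file value advancing by exactly one per store.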

-- Jeff



Re: [OpenAFS-devel] Re: What is needed to build an AFS fileserver on top of BTRFS?

2013-12-17 Thread Jeffrey Hutzelman
On Tue, 2013-12-17 at 17:40 +, David Howells wrote:
 Chris Mason c...@fb.com wrote:
 
  Hmmm, what exactly are vnodes?  Could we put them in xattrs?
 
 vnode numbers are AFS's equivalent of inode numbers.  Since they're one per
 file, they could be the object filename.

Yes, in fact, the volume, vnode number, uniqifier, and DV are
effectively the name the fileserver uses for the underlying inode.
Note that if the fileserver is maintaining the vnode indices, then you
don't actually _need_ to store a uniqifier for normal operation, because
at any given time, a volume can contain at most one vnode with a
particular vnode number, and that vnode's uniqifier is stored in the
index.  The uniqifier is used on-the-wire to distinguish different files
that existed at different points in time with the same vnode number.

 Probably there would have to be a table of {vnode,latest_uniquifier} as the
 uniquifier must still go up even if the vnode is unused for a while, so there
 could also be a table of {vnode,btrfs_file}.

No, you don't actually  have to do this.  The OpenAFS fileserver
maintains a single uniqifier for an entire volume, and simply increments
it every time a vnode is created.
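
A minimal sketch of that volume-wide counter is shown below. The type and function names are made up for illustration, and the real OpenAFS data structures differ; the only assumption is one counter per volume that is bumped on every vnode creation.

```c
#include <stdint.h>

/* Hypothetical per-volume state: one counter for the whole volume. */
struct demo_volume {
    uint32_t next_uniquifier;
};

/* Hypothetical file identifier within a volume. */
struct demo_fid {
    uint32_t vnode;
    uint32_t uniquifier;
};

/* Create a file in vnode slot `vnode`; the uniquifier is taken from the
 * volume-wide counter, so no per-vnode history table is needed. */
static struct demo_fid demo_create_vnode(struct demo_volume *vol, uint32_t vnode)
{
    struct demo_fid fid = { vnode, vol->next_uniquifier++ };
    return fid;
}

/* Two identifiers name the same file only if both parts match. */
static int demo_same_file(struct demo_fid a, struct demo_fid b)
{
    return a.vnode == b.vnode && a.uniquifier == b.uniquifier;
}
```

Reusing a vnode number later yields a different uniquifier, so an identifier held for the earlier file no longer matches the new one.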

-- Jeff




Btrfs RAID1 File System Grew Something Extra

2013-12-17 Thread Garry T. Williams
I have been using btrfs for my /home partition on my home machine for
a few years now.  I created the file system RAID1 using two disk
partitions.  Recently I noticed btrfs fi df shows extra Data, System,
and Metadata allocations.  And btrfs fi show indicates extra
allocations on one of my disk drives accounting for the 20 MiB
allocation in the df display.

I'm confused.  What does this mean?

garry@vfr$ sudo btrfs subvolume list /home
garry@vfr$ sudo btrfs filesystem df /home
Data, RAID1: total=32.00GiB, used=21.01GiB
-- Data, single: total=8.00MiB, used=0.00
System, RAID1: total=8.00MiB, used=12.00KiB
-- System, single: total=4.00MiB, used=0.00
Metadata, RAID1: total=15.00GiB, used=424.60MiB
-- Metadata, single: total=8.00MiB, used=0.00
garry@vfr$ sudo btrfs filesystem show /home
Label: none  uuid: 6c3aeff6-9a50-4481-a175-7b98980eb638
Total devices 2 FS bytes used 21.43GiB
-- devid1 size 373.76GiB used 47.03GiB path /dev/sda4
devid2 size 373.76GiB used 47.01GiB path /dev/sdb4

Btrfs v3.12
garry@vfr$

If it matters, I create a snapshot each night and run a rsync backup
to another drive and then delete the snapshot.

garry@vfr$ uname -r
3.11.10-200.fc19.x86_64
garry@vfr$ rpm -q btrfs-progs
btrfs-progs-3.12-1.fc19.x86_64

-- 
Garry T. Williams



Re: Btrfs RAID1 File System Grew Something Extra

2013-12-17 Thread Anand Jain


Garry,

 This is a known bug in mkfs.btrfs. The workaround for now is
 to run a balance once the FS holds some data, so that the unused
 group profiles go away.

HTH, Anand

On 12/18/2013 10:03 AM, Garry T. Williams wrote:

I have been using btrfs for my /home partition on my home machine for
a few years now.  I created the file system RAID1 using two disk
partitions.  Recently I noticed btrfs fi df shows extra Data, System,
and Metadata allocations.  And btrfs fi show indicates extra
allocations on one of my disk drives accounting for the 20 MiB
allocation in the df display.

I'm confused.  What does this mean?

 garry@vfr$ sudo btrfs subvolume list /home
 garry@vfr$ sudo btrfs filesystem df /home
 Data, RAID1: total=32.00GiB, used=21.01GiB
-- Data, single: total=8.00MiB, used=0.00
 System, RAID1: total=8.00MiB, used=12.00KiB
-- System, single: total=4.00MiB, used=0.00
 Metadata, RAID1: total=15.00GiB, used=424.60MiB
-- Metadata, single: total=8.00MiB, used=0.00
 garry@vfr$ sudo btrfs filesystem show /home
 Label: none  uuid: 6c3aeff6-9a50-4481-a175-7b98980eb638
Total devices 2 FS bytes used 21.43GiB
-- devid1 size 373.76GiB used 47.03GiB path /dev/sda4
devid2 size 373.76GiB used 47.01GiB path /dev/sdb4

 Btrfs v3.12
 garry@vfr$

If it matters, I create a snapshot each night and run a rsync backup
to another drive and then delete the snapshot.

garry@vfr$ uname -r
3.11.10-200.fc19.x86_64
garry@vfr$ rpm -q btrfs-progs
btrfs-progs-3.12-1.fc19.x86_64




Re: [PATCH] Btrfs-progs: receive: fix the case that we can not find subvolume

2013-12-17 Thread Miao Xie
On Tue, 17 Dec 2013 10:40:41 -0500, Michael Welsh Duggan wrote:
 David Sterba dste...@suse.cz writes:
 
 On Tue, Dec 17, 2013 at 05:13:49PM +0800, Wang Shilong wrote:
 If we change our default subvolume, btrfs receive will fail to find
 subvolume. To fix this problem, i have two ideas.

 1.make btrfs snapshot ioctl support passing source subvolume's objectid

 2.when we want to using interval subvolume path, we mount it other place
 that use subvolume 5 as its default subvolume.

 3. Tell the user to mount the toplevel subvol by himself and run receive
again
 
 Ugh.  I hope that would be considered a short-term hack waiting for a
 better solution, perhaps requiring a kernel upgrade.  From a user's
 perspective there is no reason this should be necessary, and requiring
 this would be extraordinarily surprising.  Why is btrfs unable to find
 my snapshot?  It's right there!  Moreover, this used to work just fine
 in previous versions of btrfs-progs.

Though the snapshot is still in the fs, it is inaccessible because you mount
some subvolume as the root, and you can not find the path to the snapshot.

For example:
There are two subvolumes in the fs, and they are in the root directory of the
fs, just like
real root directory
 |-subv0
 |-subv1

Then if you mount the subv1 as the root directory, the real root directory of
the fs and subv0 will be shielded,
+---+
|real root directory|
| |-subv0  |
+---+
  |-subv1
you can only access the files, directories, subvolumes... in the subv1. So the 
tool
will report can not find 

BTW, it is impossible that the previous version of btrfs-progs can work well in
this case.

 We'd better use the second approach because it won't bother kernel change.

 I don't think that the silent mount is the right way to fix it, that way
 the btrfs tool takes responsibility not to break anything.  Like the
 unhandled umount failure below. I think admins and power users do not
 like to see some random tool mess with the system like this.

 @@ -199,6 +200,10 @@ static int process_snapshot(const char *path,
 const u8 *uuid, u64 ctransid,
 char uuid_str[BTRFS_UUID_UNPARSED_SIZE];
 struct btrfs_ioctl_vol_args_v2 args_v2;
 struct subvol_info *parent_subvol = NULL;
 +   char *dev = NULL;
 +   char tmp_name[15] = "btrfs-XX";
 +   char tmp_dir[30] = "/tmp";

 Mounting valuable data under /tmp is dangerous, what if some /tmp
 cleaner starts to remove old files. I've seen that happen in practice.
 
 Agreed.  If you _were_ to continue to implement it like this, you should
 include code to respect the TMPDIR envvar at the very least.

Since the TMPDIR is not safe, I think the approach that David said is better.
Let's tell the users why we can not find the subvolume, and ask the users to
make the final decision.

Thanks
Miao


[PATCH] btrfs: Cleanup the unused btrfs_check_super_valid.

2013-12-17 Thread Qu Wenruo
Since David's commit 1104a8855, btrfs_check_super_valid no longer checks
anything in the super block, so the function can be removed if
no one else needs it.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
Cc: David Sterba dste...@suse.cz
---
 fs/btrfs/disk-io.c | 18 --
 1 file changed, 18 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8072cfa..3bda365 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -56,8 +56,6 @@
 static struct extent_io_ops btree_extent_io_ops;
 static void end_workqueue_fn(struct btrfs_work *work);
 static void free_fs_root(struct btrfs_root *root);
-static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info,
-   int read_only);
 static void btrfs_destroy_ordered_operations(struct btrfs_transaction *t,
 struct btrfs_root *root);
 static void btrfs_destroy_ordered_extents(struct btrfs_root *root);
@@ -2354,13 +2352,6 @@ int open_ctree(struct super_block *sb,
 
memcpy(fs_info->fsid, fs_info->super_copy->fsid, BTRFS_FSID_SIZE);
 
-   ret = btrfs_check_super_valid(fs_info, sb->s_flags & MS_RDONLY);
-   if (ret) {
-   printk(KERN_ERR "btrfs: superblock contains fatal errors\n");
-   err = -EINVAL;
-   goto fail_alloc;
-   }
-
disk_super = fs_info->super_copy;
if (!btrfs_super_root(disk_super))
goto fail_alloc;
@@ -3705,15 +3696,6 @@ int btrfs_read_buffer(struct extent_buffer *buf, u64 
parent_transid)
return btree_read_extent_buffer_pages(root, buf, 0, parent_transid);
 }
 
-static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info,
- int read_only)
-{
-   /*
-* Placeholder for checks
-*/
-   return 0;
-}
-
 static void btrfs_error_commit_super(struct btrfs_root *root)
 {
mutex_lock(&root->fs_info->cleaner_mutex);
-- 
1.8.5.1



Re: [PATCH] Btrfs-progs: receive: fix the case that we can not find subvolume

2013-12-17 Thread Michael Welsh Duggan
Miao Xie mi...@cn.fujitsu.com writes:

 On Tue, 17 Dec 2013 10:40:41 -0500, Michael Welsh Duggan wrote:
 David Sterba dste...@suse.cz writes:
 
 On Tue, Dec 17, 2013 at 05:13:49PM +0800, Wang Shilong wrote:
 If we change our default subvolume, btrfs receive will fail to find
 subvolume. To fix this problem, i have two ideas.

 1.make btrfs snapshot ioctl support passing source subvolume's
 objectid

 2.when we want to using interval subvolume path, we mount it other
 place
 that use subvolume 5 as its default subvolume.

 3. Tell the user to mount the toplevel subvol by himself and run
 receive
again
 
 Ugh.  I hope that would be considered a short-term hack waiting for a
 better solution, perhaps requiring a kernel upgrade.  From a user's
 perspective there is no reason this should be necessary, and requiring
 this would be extraordinarily surprising.  Why is btrfs unable to find
 my snapshot?  It's right there!  Moreover, this used to work just fine
 in previous versions of btrfs-progs.

 Though the snapshot is still in the fs, it is inaccessible because you
 mount
 some subvolume as the root, and you can not find the path to the snapshot.

 For example:
 There are two subvolumes in the fs, and they are in the root directory
 of the
 fs, just like
   real root directory
|-subv0
|-subv1

 Then if you mount the subv1 as the root directory, the real root
 directory of
 the fs and subv0 will be shielded,
   +---+
   |real root directory|
   | |-subv0  |
   +---+
 |-subv1
 you can only access the files, directories, subvolumes... in the subv1. So 
 the tool
 will report can not find 

 BTW, it is impossible that the previous version of btrfs-progs can work well 
 in
 this case.

In that case I either misunderstand completely, or my problem is almost
decidedly different.  To recap, this is the command that failed:

# ./btrfs send -p /snapshots/bo /snapshots/bp | ./btrfs receive 
/backup/snapshots/root/
At subvol /snapshots/bp
At snapshot bp
ioctl(BTRFS_IOC_TREE_SEARCH, uuid, key 48f0ebae83fd32f1, UUID_KEY, 
90139d8200afeaab) ret=-1, error: No such file or directory
ioctl(BTRFS_IOC_TREE_SEARCH, uuid, key 48f0ebae83fd32f1, UUID_KEY, 
90139d8200afeaab) ret=-1, error: No such file or directory
ERROR: could not find parent subvolume

Now, I believe you are saying that this means that it can't find the
bo snapshot in the backup volume.  But it is mounted in the expected
location:

# ls -ld /backup/snapshots/root/bo/
drwxr-xr-x 1 root root 280 Dec 13 17:54 /backup/snapshots/root/bo/

and 

# ./btrfs sub list -p /backup/ | grep root/bo
ID 1030 gen 1046 parent 5 top level 5 path snapshots/root/bo

# btrfs sub show /backup/snapshots/root/bo/
/backup/snapshots/root/bo
Name:   bo
uuid:   5e15ef24-f2d0-194f-886d-3f7afc7413a4
Parent uuid:9a226af3-8497-744b-90f7-d7e54d58946d
Creation time:  2013-12-13 17:51:57
Object ID:  1030
Generation (Gen):   1046
Gen at creation:1042
Parent: 5
Top Level:  5
Flags:  readonly
Snapshot(s):

Maybe I am missing some terminology here?  Is there some output I can
send to make the problem clearer?

-- 
Michael Welsh Duggan
(m...@md5i.com)



Re: [PATCH] Btrfs-progs: receive: fix the case that we can not find subvolume

2013-12-17 Thread Wang Shilong

Hello Michael,

On 12/18/2013 11:29 AM, Michael Welsh Duggan wrote:

Miao Xie mi...@cn.fujitsu.com writes:


On Tue, 17 Dec 2013 10:40:41 -0500, Michael Welsh Duggan wrote:

David Sterba dste...@suse.cz writes:


On Tue, Dec 17, 2013 at 05:13:49PM +0800, Wang Shilong wrote:

If we change our default subvolume, btrfs receive will fail to find
subvolume. To fix this problem, i have two ideas.

1.make btrfs snapshot ioctl support passing source subvolume's
objectid
2.when we want to using interval subvolume path, we mount it other
place
that use subvolume 5 as its default subvolume.

3. Tell the user to mount the toplevel subvol by himself and run
receive
again

Ugh.  I hope that would be considered a short-term hack waiting for a
better solution, perhaps requiring a kernel upgrade.  From a user's
perspective there is no reason this should be necessary, and requiring
this would be extraordinarily surprising.  Why is btrfs unable to find
my snapshot?  It's right there!  Moreover, this used to work just fine
in previous versions of btrfs-progs.

Though the snapshot is still in the fs, it is inaccessible because you
mounted some subvolume as the root, so the path to the snapshot can not be
found.

For example:
There are two subvolumes in the fs, and they are in the root directory of
the fs, like this:
real root directory
 |-subv0
 |-subv1

Then if you mount subv1 as the root directory, the real root directory of
the fs and subv0 will be hidden:
+-------------------+
|real root directory|
| |-subv0           |
+-------------------+
  |-subv1
You can only access the files, directories, subvolumes... in subv1. So the
tool will report can not find

BTW, it is impossible that a previous version of btrfs-progs worked in
this case.

In that case I either misunderstand completely, or my problem is almost
decidedly different.  To recap, this is the command that failed:

 # ./btrfs send -p /snapshots/bo /snapshots/bp | ./btrfs receive /backup/snapshots/root/
 At subvol /snapshots/bp
 At snapshot bp
 ioctl(BTRFS_IOC_TREE_SEARCH, uuid, key 48f0ebae83fd32f1, UUID_KEY, 90139d8200afeaab) ret=-1, error: No such file or directory
 ioctl(BTRFS_IOC_TREE_SEARCH, uuid, key 48f0ebae83fd32f1, UUID_KEY, 90139d8200afeaab) ret=-1, error: No such file or directory
 ERROR: could not find parent subvolume

It seems that you are running an older kernel with the latest btrfs-progs.
The new btrfs-progs uses the uuid tree to search, but this tree does not
exist yet.

Can you try to upgrade your kernel?

Thanks,
Wang


[PATCH V2] btrfs-progs: fix btrfstune silence on failure

2013-12-17 Thread Gui Hecheng
Originally, btrfstune fails silently when run without any options, like this:

# btrfstune /dev/sdb

An error message and the usage text should be shown in this case.

Signed-off-by: Gui Hecheng guihc.f...@cn.fujitsu.com
---
V1 - V2:
add optind assignment to make reviewers happy;
print error msg if no options provided
---
 btrfstune.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/btrfstune.c b/btrfstune.c
index 50724ba..da82f36 100644
--- a/btrfstune.c
+++ b/btrfstune.c
@@ -115,6 +115,7 @@ int main(int argc, char *argv[])
 	int skinny_flag = 0;
 	int ret;
 
+	optind = 1;
 	while(1) {
 		int c = getopt(argc, argv, "S:rx");
 		if (c < 0)
@@ -143,6 +144,13 @@ int main(int argc, char *argv[])
 		return 1;
 	}
 
+	if (!(seeding_flag + extrefs_flag + skinny_flag)) {
+		fprintf(stderr,
+			"ERROR: At least one option should be assigned.\n");
+		print_usage();
+		return 1;
+	}
+
 	if (check_mounted(device)) {
 		fprintf(stderr, "%s is mounted\n", device);
 		return 1;
@@ -176,6 +184,7 @@ int main(int argc, char *argv[])
 	} else {
 		root->fs_info->readonly = 1;
 		ret = 1;
+		fprintf(stderr, "btrfstune failed\n");
 	}
 	close_ctree(root);
 
-- 
1.8.0.1



[PATCH v4 3/3] btrfs-progs: handle error in the btrfs_prepare_device

2013-12-17 Thread Anand Jain
This patch reports errors with strerror() instead of printing the raw
errno, and replaces the BUG_ON with proper error handling.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 v4: replaced ? statement with proper if statement
 v3: fix per Stefan review, update error message
 v2: commit update
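
As an illustration of the reporting style discussed in the review, here is
a small sketch that uses a plain if statement instead of the ?: form. The
helper format_prepare_error() is hypothetical and not part of utils.c. It
assumes that a negative ret carries a negated errno value and that any
other non-zero ret is a plain status code; the message text follows the v2
posting.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a failure message for `file` into buf.
 * Assumption: ret < 0 is a negated errno value; any other non-zero
 * value is a plain status code that has no errno meaning.
 */
static int format_prepare_error(char *buf, size_t len, const char *file,
				int ret)
{
	if (ret < 0)
		return snprintf(buf, len,
				"ERROR: failed to zero device start '%s' - %s",
				file, strerror(-ret));
	return snprintf(buf, len,
			"ERROR: failed to zero device start '%s' - %d",
			file, ret);
}
```

With ret = -ENOENT the message ends in the strerror() text for ENOENT, and
with ret = 1 it ends in the number 1.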

 cmds-device.c  |  7 +++----
 cmds-replace.c |  9 ++++-----
 mkfs.c         |  9 ++++++++-
 utils.c        | 31 ++++++++++++++++++++-----------
 4 files changed, 35 insertions(+), 21 deletions(-)

diff --git a/cmds-device.c b/cmds-device.c
index bc4a8dc..ada0bcd 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -111,13 +111,11 @@ static int cmd_add_dev(int argc, char **argv)
 
 		res = btrfs_prepare_device(devfd, argv[i], 1, &dev_block_count,
 					   0, &mixed, discard);
+		close(devfd);
 		if (res) {
-			fprintf(stderr, "ERROR: Unable to init '%s'\n", argv[i]);
-			close(devfd);
 			ret++;
-			continue;
+			goto error_out;
 		}
-		close(devfd);
 
 		strncpy_null(ioctl_args.name, argv[i]);
 		res = ioctl(fdmnt, BTRFS_IOC_ADD_DEV, &ioctl_args);
@@ -130,6 +128,7 @@ static int cmd_add_dev(int argc, char **argv)
 
 	}
 
+error_out:
 	close_file_or_dir(fdmnt, dirstream);
 	return !!ret;
 }
diff --git a/cmds-replace.c b/cmds-replace.c
index d9b0940..c683d6c 100644
--- a/cmds-replace.c
+++ b/cmds-replace.c
@@ -276,12 +276,11 @@ static int cmd_start_replace(int argc, char **argv)
 	}
 	strncpy((char *)start_args.start.tgtdev_name, dstdev,
 		BTRFS_DEVICE_PATH_NAME_MAX);
-	if (btrfs_prepare_device(fddstdev, dstdev, 1, &dstdev_block_count, 0,
-				 &mixed, 0)) {
-		fprintf(stderr, "Error: Failed to prepare device '%s'\n",
-			dstdev);
+	ret = btrfs_prepare_device(fddstdev, dstdev, 1, &dstdev_block_count, 0,
+				 &mixed, 0);
+	if (ret)
 		goto leave_with_error;
-	}
+
 	close(fddstdev);
 	fddstdev = -1;
 
diff --git a/mkfs.c b/mkfs.c
index 33369f9..18df087 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1446,6 +1446,10 @@ int main(int ac, char **av)
 		first_file = file;
 		ret = btrfs_prepare_device(fd, file, zero_end, &dev_block_count,
 					   block_count, &mixed, discard);
+		if (ret) {
+			close(fd);
+			exit(1);
+		}
 		if (block_count && block_count > dev_block_count) {
 			fprintf(stderr, "%s is smaller than requested size\n", file);
 			exit(1);
@@ -1553,8 +1557,11 @@ int main(int ac, char **av)
 		}
 		ret = btrfs_prepare_device(fd, file, zero_end, &dev_block_count,
 					   block_count, &mixed, discard);
+		if (ret) {
+			close(fd);
+			exit(1);
+		}
 		mixed = old_mixed;
-		BUG_ON(ret);
 
 		ret = btrfs_add_to_fsid(trans, root, fd, file, dev_block_count,
 					sectorsize, sectorsize, sectorsize);
diff --git a/utils.c b/utils.c
index f499023..f37083a 100644
--- a/utils.c
+++ b/utils.c
@@ -581,13 +581,13 @@ int btrfs_prepare_device(int fd, char *file, int zero_end, u64 *block_count_ret,
 	ret = fstat(fd, &st);
 	if (ret < 0) {
 		fprintf(stderr, "unable to stat %s\n", file);
-		exit(1);
+		return 1;
 	}
 
 	block_count = btrfs_device_size(fd, &st);
 	if (block_count == 0) {
 		fprintf(stderr, "unable to find %s size\n", file);
-		exit(1);
+		return 1;
 	}
 	if (max_block_count)
 		block_count = min(block_count, max_block_count);
@@ -612,26 +612,35 @@ int btrfs_prepare_device(int fd, char *file, int zero_end, u64 *block_count_ret,
 	}
 
 	ret = zero_dev_start(fd);
-	if (ret) {
-		fprintf(stderr, "failed to zero device start %d\n", ret);
-		exit(1);
-	}
+	if (ret)
+		goto zero_dev_error;
 
 	for (i = 0 ; i < BTRFS_SUPER_MIRROR_MAX; i++) {
 		bytenr = btrfs_sb_offset(i);
 		if (bytenr >= block_count)
 			break;
-		zero_blocks(fd, bytenr, BTRFS_SUPER_INFO_SIZE);
+		ret = zero_blocks(fd, bytenr, BTRFS_SUPER_INFO_SIZE);
+		if (ret)
+			goto zero_dev_error;
 	}
 
 	if (zero_end) {
 		ret = zero_dev_end(fd, block_count);
-		if (ret) {
-			fprintf(stderr, "failed to zero device end %d\n", ret);
- 

Re: Btrfs RAID1 File System Grew Something Extra

2013-12-17 Thread Garry T. Williams
On 12-18-13 10:46:29 Anand Jain wrote:
 On 12/18/2013 10:03 AM, Garry T. Williams wrote:
  I have been using btrfs for my /home partition on my home machine for
  a few years now.  I created the file system RAID1 using two disk
  partitions.  Recently I noticed btrfs fi df shows extra Data, System,
  and Metadata allocations.  And btrfs fi show indicates extra
  allocations on one of my disk drives accounting for the 20 MiB
  allocation in the df display.

   this is a known bug in mkfs.btrfs, the workaround for now is
   to run balance on FS having some data. so that unused group-
   profile will go away.

Thanks.

garry@vfr$ sudo btrfs balance start /home
Done, had to relocate 50 out of 50 chunks
garry@vfr$ sudo btrfs filesystem df /home
Data, RAID1: total=22.00GiB, used=21.02GiB
System, RAID1: total=32.00MiB, used=12.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, RAID1: total=1.00GiB, used=419.60MiB

Hmmm.

Well, it's better, but the extra allocation for System is baffling.  I
believe that this happened sometime after creating the file system.

Also balance on a RAID1 file system with exactly two drives doesn't
make much sense to me.  Why would any chunks have to be relocated?
I'm clearly missing something here.

-- 
Garry T. Williams



Re: [PATCH] Btrfs-progs: receive: fix the case that we can not find subvolume

2013-12-17 Thread Michael Welsh Duggan
Wang Shilong wangsl.f...@cn.fujitsu.com writes:

 It seems that you use older kernel version but use the latest
 btrfs-progs, new btrfs-progs use uuid tree to search but
 this tree did not exist yet.

 Can you try to upgrade your kernel?

What version is necessary?  (I am currently on 3.11.10.)

-- 
Michael Welsh Duggan
(m...@md5i.com)


Re: [PATCH] Btrfs-progs: receive: fix the case that we can not find subvolume

2013-12-17 Thread Wang Shilong

On 12/18/2013 12:06 PM, Michael Welsh Duggan wrote:

Wang Shilong wangsl.f...@cn.fujitsu.com writes:


It seems that you use older kernel version but use the latest
btrfs-progs, new btrfs-progs use uuid tree to search but
this tree did not exist yet.

Can you try to upgrade your kernel?

What version is necessary?  (I am currently on 3.11.10.)

3.12 is OK. BTW, can you run this on 3.11.10:

# dmesg

Let's see if it outputs something like:

btrfs: can not found root: 9

Thanks,
Wang
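
Going by the "3.12 is OK" answer above, a tool could compare the running
kernel release against 3.12 before relying on the uuid tree. The sketch
below is only an illustration. Both helper names are hypothetical and
nothing like this is claimed to exist in btrfs-progs. uname(2) and sscanf()
are the only library calls used.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: return 1 if a release string such as "3.11.10"
 * is at least major.minor, 0 if it is older, -1 if it cannot be parsed.
 */
static int release_at_least(const char *release, int major, int minor)
{
	int maj, min;

	if (sscanf(release, "%d.%d", &maj, &min) != 2)
		return -1;
	return maj > major || (maj == major && min >= minor);
}

/* Same check against the running kernel; -1 if uname(2) fails. */
static int running_kernel_at_least(int major, int minor)
{
	struct utsname u;

	if (uname(&u) != 0)
		return -1;
	return release_at_least(u.release, major, minor);
}
```

For the 3.11.10 kernel mentioned in this thread, release_at_least() with
major 3 and minor 12 returns 0, so the caller would know the uuid tree
lookup is not expected to work.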



