date:20130901

Re: btrfs-progs: v3, move out print in cmd_df to another function

2013-09-01 Thread Wang Shilong


On 09/02/2013 10:58 AM, Anand Jain wrote:


 Wang, kindly note that this was fixed in v4.


Yeah...This is sent by my shell script by accident, I have fixed it and
it won't send the repeated message again.

Thanks for reminding .. ^_^



Anand


On 09/01/2013 02:15 PM, Wang Shilong wrote:

Hello, Using checkpatch.pl, i get the following warnings(errors):
ERROR: "foo * bar" should be "foo *bar"
#37: FILE: cmds-filesystem.c:47:
+static char * group_type_str(u64 flag)

ERROR: "foo * bar" should be "foo *bar"
#69: FILE: cmds-filesystem.c:63:
+static char * group_profile_str(u64 flag)

total: 2 errors, 0 warnings, 229 lines checked

patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Thanks, Wang
--
To unsubscribe from this list: send the line "unsubscribe 
linux-btrfs" in

the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: btrfs-progs: v3, move out print in cmd_df to another function

2013-09-01 Thread Anand Jain



 Wang, kindly note that this was fixed in v4.

Anand


On 09/01/2013 02:15 PM, Wang Shilong wrote:

Hello, Using checkpatch.pl, i get the following warnings(errors):
ERROR: "foo * bar" should be "foo *bar"
#37: FILE: cmds-filesystem.c:47:
+static char * group_type_str(u64 flag)

ERROR: "foo * bar" should be "foo *bar"
#69: FILE: cmds-filesystem.c:63:
+static char * group_profile_str(u64 flag)

total: 2 errors, 0 warnings, 229 lines checked

patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Thanks, Wang
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] btrfs-progs: replace fails start but in the background v2

2013-09-01 Thread Anand Jain

when the balance is running, the replace start ioctl
fails (for the right reasons). but since the cli has
put ioctl thread to background (for right reasons)
the user won't know that cli failed to start.

so before cli goes to the background, it should check
if mutually_exclusive_operation_running is not held.

this is done by newly introduced ioctl
 BTRFS_IOC_CHECK_DEV_EXCL_OPS by the following kernel patch:

btrfs: introduce BTRFS_IOC_CHECK_DEV_EXCL_OPS ioctl to check dev excl op

v2:
 fix checkpatch.pl warnings as spotted by Wang

Signed-off-by: Anand Jain 
---
 cmds-device.c  |4 ++--
 cmds-replace.c |   14 ++
 ioctl.h|1 +
 man/btrfs.8.in |4 ++--
 utils.c|   15 +++
 utils.h|1 +
 6 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/cmds-device.c b/cmds-device.c
index 8446502..27f7f84 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -45,7 +45,7 @@ static const char * const device_cmd_group_usage[] = {
 };
 
 static const char * const cmd_add_dev_usage[] = {
-   "btrfs device add  [...] ",
+   "btrfs device add [-f]  [...] ",
"Add a device to a filesystem",
NULL
 };
@@ -85,7 +85,7 @@ static int cmd_add_dev(int argc, char **argv)
int devfd, res;
u64 dev_block_count = 0;
int mixed = 0;
-
+printf("asj: add %s\n", argv[i]);
res = test_dev_for_mkfs(argv[i], force, estr);
if (res) {
fprintf(stderr, "%s", estr);
diff --git a/cmds-replace.c b/cmds-replace.c
index e3ff695..0b2cbc8 100644
--- a/cmds-replace.c
+++ b/cmds-replace.c
@@ -203,6 +203,20 @@ static int cmd_start_replace(int argc, char **argv)
goto leave_with_error;
}
 
+   /* check if there is some other device exclusive
+* operation running in the FS which won't let this replace
+* to run. And ENOTTY is when older kernel doesn't support
+* lock checking ioctl
+*/
+   ret = is_dev_excl_op_free(fdmnt);
+   if (ret && ret != -ENOTTY) {
+   fprintf(stderr,
+   "ERROR: replace start failed on \"%s\" - %s\n",
+   path,
+   ret > 0 ? btrfs_err_str(ret) : strerror(-ret));
+   goto leave_with_error;
+   }
+
srcdev = argv[optind];
dstdev = argv[optind + 1];
 
diff --git a/ioctl.h b/ioctl.h
index e959720..afadcbc 100644
--- a/ioctl.h
+++ b/ioctl.h
@@ -598,6 +598,7 @@ struct btrfs_ioctl_clone_range_args {
 #define BTRFS_IOC_DEV_REPLACE _IOWR(BTRFS_IOCTL_MAGIC, 53, \
struct btrfs_ioctl_dev_replace_args)
 #define BTRFS_IOC_DEDUP_CTL _IOW(BTRFS_IOCTL_MAGIC, 55, int)
+#define BTRFS_IOC_CHECK_DEV_EXCL_OPS _IO(BTRFS_IOCTL_MAGIC, 56)
 
 #ifdef __cplusplus
 }
diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index d493d7e..0cca5c3 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -47,7 +47,7 @@ btrfs \- control a btrfs filesystem
 \fBbtrfs\fP \fB[filesystem] balance status\fP [-v] \fI\fP
 .PP
 .PP
-\fBbtrfs\fP \fBdevice add\fP \fI\fP [\fI...\fP] \fI\fP
+\fBbtrfs\fP \fBdevice add\fP [-f] \fI\fP [\fI...\fP] 
\fI\fP
 .PP
 \fBbtrfs\fP \fBdevice delete\fP \fI\fP [\fI...\fP] \fI\fP
 .PP
@@ -386,7 +386,7 @@ be verbose
 .RE
 .TP
 
-\fBdevice add\fR\fI  \fP[\fI...\fP] \fI\fR
+\fBdevice add\fR [-f] \fI  \fP[\fI...\fP] \fI\fR
 Add device(s) to the filesystem identified by \fI\fR.
 .TP
 
diff --git a/utils.c b/utils.c
index b039a03..c53b7f1 100644
--- a/utils.c
+++ b/utils.c
@@ -1993,3 +1993,18 @@ int is_vol_small(char *file)
return 0;
}
 }
+
+/* Returns:
+ * BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS:
+ * If the device is locked to prevent other device operations
+ * from the user cli like device add remove replace balance etc..
+ * < 0:
+ * For any other error including if kernel don't support the
+ * ioctl (-ENOTTY)
+ */
+int is_dev_excl_op_free(int fd)
+{
+   int ret;
+   ret = ioctl(fd, BTRFS_IOC_CHECK_DEV_EXCL_OPS, NULL);
+   return ret > 0 ? ret : -errno;
+}
diff --git a/utils.h b/utils.h
index eb6fba3..a1e5d67 100644
--- a/utils.h
+++ b/utils.h
@@ -83,4 +83,5 @@ int test_num_disk_vs_raid(u64 metadata_profile, u64 
data_profile,
u64 dev_cnt, int mixed, char *estr);
 int get_btrfs_mount(const char *dev, char *mp, size_t mp_size);
 int is_vol_small(char *file);
+int is_dev_excl_op_free(int fd);
 #endif
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Btrfs: separate out tests into their own directory V2

2013-09-01 Thread Wang Shilong


Sorry, please ignore this thread...

Thanks,
wang
On 09/01/2013 02:15 PM, Wang Shilong wrote:

Hello, Using checkpatch.pl, i get the following warnings(errors):
WARNING: kfree(NULL) is safe this check is probably not required
#132: FILE: fs/btrfs/free-space-cache.c:3035:
+   if (map)
+   kfree(map);

WARNING: line over 80 characters
#882: FILE: fs/btrfs/tests/free-space-tests.c:211:
+   ret = test_add_free_space_entry(cache, 4 * 1024 * 1024, 1 * 1024 * 
1024, 1);

WARNING: line over 80 characters
#927: FILE: fs/btrfs/tests/free-space-tests.c:256:
+   ret = test_add_free_space_entry(cache, 1 * 1024 * 1024, 4 * 1024 * 
1024, 1);

WARNING: line over 80 characters
#947: FILE: fs/btrfs/tests/free-space-tests.c:276:
+   ret = test_add_free_space_entry(cache, 4 * 1024 * 1024, 4 * 1024 * 
1024, 1);

WARNING: line over 80 characters
#953: FILE: fs/btrfs/tests/free-space-tests.c:282:
+   ret = test_add_free_space_entry(cache, 2 * 1024 * 1024, 2 * 1024 * 
1024, 0);

WARNING: line over 80 characters
#1016: FILE: fs/btrfs/tests/free-space-tests.c:345:
+   ret = test_add_free_space_entry(cache, 1 * 1024 * 1024, 2 * 1024 * 
1024, 1);

WARNING: line over 80 characters
#1022: FILE: fs/btrfs/tests/free-space-tests.c:351:
+   ret = test_add_free_space_entry(cache, 3 * 1024 * 1024, 1 * 1024 * 
1024, 0);

WARNING: line over 80 characters
#1030: FILE: fs/btrfs/tests/free-space-tests.c:359:
+   test_msg("Error removing bitmap and extent overlapping %d\n", 
ret);

total: 0 errors, 8 warnings, 997 lines checked

patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Thanks, Wang
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] btrfs-progs: Update the man page of btrfs

2013-09-01 Thread Qu Wenruo


Sorry for the late reply.

I checked the new man page, which seems OK for me.

Thank you.

Qu Wenruo

于 2013年08月06日 07:28, David Sterba 写道:

On Tue, Jul 23, 2013 at 10:43:22AM +0800, Qu Wenruo wrote:

Update the man page of "btrfs" command to keep up with new commands.

Thanks. Please check if I haven't missed anything, I've used 'wiggle' to
guess-apply the patch and fixed a few conflicts myself.

david




--
-
Qu Wenruo
Development Dept.I
Nanjing Fujitsu Nanda Software Tech. Co., Ltd.(FNST)
No. 6 Wenzhu Road, Nanjing, 210012, China
TEL: +86+25-86630566-8526
COINS: 7998-8526
FAX: +86+25-83317685
MAIL: quwen...@cn.fujitsu.com
-

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Device delete returns "unable to go below four devices on raid10" on 5 drive setup

2013-09-01 Thread Steven Post

On Sun, 2013-09-01 at 14:08 +0200, Steven Post wrote:
> On Sat, 2013-08-31 at 18:03 -0600, Chris Murphy wrote:
> > On Aug 31, 2013, at 5:55 PM, Steven Post  
> > wrote:
> > 
> > > On Sat, 2013-08-31 at 11:42 -0600, Chris Murphy wrote:
> > >> 
> > >> Yes. It might take a few minutes after the chunks are reallocated for 
> > >> the device to be removed from the volume. I've had some cases where even 
> > >> a reboot was needed for the information in fi sh to refresh.
> > > 
> > > I see, so that might be normal behaviour.
> > 
> > No, I think it's expected for deleted devices to not appear in the volume 
> > listing anymore, but with older kernels I had that experience. I haven't 
> > tried it recently with newer kernels.
> 
> I might have phrased that a bit incorrectly, with 'normal behaviour' I
> meant it was known to do that and not cause major problems. Of course I
> would expect the entry to just disappear.
> I'll let you know the result of the 'device delete' operation on the
> second machine (3.10.7 kernel).

The 3.10.7 kernel doesn't seem to have this issue, the removed device is
not listed anymore immediately after the 'device delete' command
returns. Although in that instance the array still had 6 drives, not 5
as with the old one.

[...]

Best regards,
Steven


signature.asc
Description: This is a digitally signed message part

[PATCH] Btrfs: do not add replace target to the alloc_list

2013-09-01 Thread Ilya Dryomov

If replace was suspended by the umount, replace target device is added
to the fs_devices->alloc_list during a later mount.  This is obviously
wrong.  ->is_tgtdev_for_dev_replace is supposed to guard against that,
but ->is_tgtdev_for_dev_replace is (and can only ever be) initialized
*after* everything is opened and fs_devices lists are populated.  Fix
this by checking the devid instead: for replace targets it's always
equal to BTRFS_DEV_REPLACE_DEVID.

Cc: Stefan Behrens 
Signed-off-by: Ilya Dryomov 
---
At first I thought this was caused by my btrfs_device rollback patch,
but no, it just made it easier to spot -- previously one had to reboot
or rmmod/insmod before mounting a suspended replace.

 fs/btrfs/volumes.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c9a0977..5b99f19 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -793,7 +793,8 @@ static int __btrfs_open_devices(struct btrfs_fs_devices 
*fs_devices,
fs_devices->rotating = 1;
 
fs_devices->open_devices++;
-   if (device->writeable && !device->is_tgtdev_for_dev_replace) {
+   if (device->writeable &&
+   device->devid != BTRFS_DEV_REPLACE_DEVID) {
fs_devices->rw_devices++;
list_add(&device->dev_alloc_list,
 &fs_devices->alloc_list);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Mixed blocks, he can avoid ENOSPACE error, when he can't allocated metadata blocks?

2013-09-01 Thread Тимофей Титовец

Hello list, sorry for my bad english anyway.
if my message is  delirium, just ignore this message.
My question:
When using mixed blocks, metadata and data chunks has be merge, but we
have (when using mixed) speed penalty.
how many penalty will be have if we using mixed?

Kernel 3.11-rc7, Ubuntu 13.10 x64
my simple test:
#Mixed
sudo mkfs.btrfs -f -M /dev/sdb #sdb old seagate hdd 80G
sudo mount /dev/sdb /mnt
time sudo dd if=/dev/zero of=/mnt/zero bs=1M count=4096
~48.5s
time sudo rm /mnt/zero
~1.5s
sudo umount /mnt
---
#default setting
sudo mkfs.btrfs -f /dev/sdb
sudo mount /dev/sdb /mnt
time sudo dd if=/dev/zero of=/mnt/zero bs=1M count=4096
~48.5s
time sudo rm /mnt/zero
~1.5s
sudo umount /mnt
---
#raw write
time sudo dd if=/dev/zero of=/dev/sdb bs=1M count=4096
~59s

I don't see any different between mixed and defaults blocks.
So using mixed blocks have sense? Or this is outdated function?
Can i change in future type of chunk allocation between mixed and
default (with using balance function)?

i believe in mixed blocks because this function can awoid problems
with ENOSPACE error (when system can't allocated metadata chunk) and
can up space utilization on the disk.
On slow hdd, we no see any different between profiles, and on speed
ssd perfomance penalty should not be significant.

(i just using btrfs on my PC (1 TB hdd) and laptop (ssd 128 GB), i
like this fs and i haven't any problem
(Only enospace, but i just add temp drive and using balance to fix this)
and i just try optimizing fs).
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Unmountable filesystem parent transid verify failed

2013-09-01 Thread ronnie sahlberg

Hi again.
Sorry for top posting.


I have a 9 disk filesystem that does not mount anymore and need some
help/advice so I can recover the data.

What happened was that I was running a btrfs delete device
under Ubuntu 13.04   Kernel 3.8
and after a long time of moving data around it crashed with a SEGV.

Now the filesystem does not mount and none of the recovery options I
have tried work.

I have upgraded to Debian testing and are now using kerne3.10-2-amd64



When I try btrfsck I get heaps of these :
Ignoring transid failure
parent transid verify failed on 24419581267968 wanted 301480 found 301495
parent transid verify failed on 24419581267968 wanted 301480 found 301495
parent transid verify failed on 24419581267968 wanted 301480 found 301495
parent transid verify failed on 24419581267968 wanted 301480 found 301495


I have tried using btrfs-image :
but it too crashes eventually with :

btrfs-image -c9 -t4 /dev/sde btrfs-image
...
btrfs-image: ctree.c:787: read_node_slot: Assertion `!(level == 0)' failed.
Aborted


mount -o ro,recovery fails
# mount -o ro,recovery /dev/sde /DATA
mount: wrong fs type, bad option, bad superblock on /dev/sde,
...


# btrfs-zero-log /dev/sde
eventually fails with :
btrfs-zero-log: ctree.c:342: __btrfs_cow_block: Assertion
`!(btrfs_header_generation(buf) > trans->transid)' failed.
Aborted


What should I try next?


regards
ronnie sahlberg
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Device delete returns "unable to go below four devices on raid10" on 5 drive setup

2013-09-01 Thread Steven Post

On Sat, 2013-08-31 at 18:03 -0600, Chris Murphy wrote:
> On Aug 31, 2013, at 5:55 PM, Steven Post  wrote:
> 
> > On Sat, 2013-08-31 at 11:42 -0600, Chris Murphy wrote:
> >> 
> >> Yes. It might take a few minutes after the chunks are reallocated for the 
> >> device to be removed from the volume. I've had some cases where even a 
> >> reboot was needed for the information in fi sh to refresh.
> > 
> > I see, so that might be normal behaviour.
> 
> No, I think it's expected for deleted devices to not appear in the volume 
> listing anymore, but with older kernels I had that experience. I haven't 
> tried it recently with newer kernels.

I might have phrased that a bit incorrectly, with 'normal behaviour' I
meant it was known to do that and not cause major problems. Of course I
would expect the entry to just disappear.
I'll let you know the result of the 'device delete' operation on the
second machine (3.10.7 kernel).

Back on the 3.2 kernel, the "filesystem show" command still showed the
removed device, but still with less used space that the others after the
balance. Shutdown + physical removal + boot didn't produce any error,
since this array is used as the root filesystem I think I would have
noticed a serious problem by now. I successfully added the 1 TB drive to
the array (after partitioning).
So it seems it's just the output of the 'filesystem show' command that
is lagging behind, even after a reboot and a 20 hour idle period.

> 
> > 
> > As an aside, I'd rather not recreate the arrays if it can be done
> > without recreating.
> 
> It should work. But it's an experimental file system. I'd at least make a 
> backup if you're going to do this with the device add/remove method.

Naturally, all important files have a backup, but for the rest of the
volume I can live with the fact that it would be lost. Also if every one
created a new filesystem instead of using device add/delete, this code
wouldn't get much testing ;)

Best regards,
Steven

signature.asc
Description: This is a digitally signed message part

[PATCH v7] Btrfs: optimize key searches in btrfs_search_slot

2013-09-01 Thread Filipe David Borba Manana

When the binary search returns 0 (exact match), the target key
will necessarily be at slot 0 of all nodes below the current one,
so in this case the binary search is not needed because it will
always return 0, and we waste time doing it, holding node locks
for longer than necessary, etc.

Below follow histograms with the times spent on the current approach of
doing a binary search when the previous binary search returned 0, and
times for the new approach, which directly picks the first item/child
node in the leaf/node.

Current approach:

Count: 6682
Range: 35.000 - 8370.000; Mean: 85.837; Median: 75.000; Stddev: 106.429
Percentiles:  90th: 124.000; 95th: 145.000; 99th: 206.000
  35.000 -   61.080:  1235 
  61.080 -  106.053:  4207 #
 106.053 -  183.606:  1122 ##
 183.606 -  317.341:   111 #
 317.341 -  547.959: 6 |
 547.959 - 8370.000: 1 |

Approach proposed by this patch:

Count: 6682
Range:  6.000 - 135.000; Mean: 16.690; Median: 16.000; Stddev:  7.160
Percentiles:  90th: 23.000; 95th: 27.000; 99th: 40.000
   6.000 -8.418:58 #
   8.418 -   11.670:  1149 #
  11.670 -   16.046:  2418 #
  16.046 -   21.934:  2098 ##
  21.934 -   29.854:   744 
  29.854 -   40.511:   154 ###
  40.511 -   54.848:41 #
  54.848 -   74.136: 5 |
  74.136 -  100.087: 9 |
 100.087 -  135.000: 6 |

These samples were captured during a run of the btrfs tests 001, 002 and
004 in the xfstests, with a leaf/node size of 4Kb.

Signed-off-by: Filipe David Borba Manana 
Signed-off-by: Josef Bacik 
---

V2: Simplified code, removed unnecessary code.
V3: Replaced BUG_ON() with the new ASSERT() from Josef.
V4: Addressed latest comments from Zach Brown and Josef Bacik.
Surrounded all code that is used for the assertion with a
#ifdef CONFIG_BTRFS_ASSERT ... #endif block. Also changed
offset arguments to be more strictly correct.
V5: Updated histograms to reflect latest version of the code.
V6: Use single assert macro and no more #ifdef CONFIG_BTRFS_ASSERT
... #endif logic, as suggested by Miao Xie.
V7: Added back the #ifdef ... #endif logic, to avoid compiler
warning about unused function when CONFIG_BTRFS_ASSERT is
not enabled.

 fs/btrfs/ctree.c |   41 +++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 5fa521b..4d602f7 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -2426,6 +2426,39 @@ done:
return ret;
 }
 
+#ifdef CONFIG_BTRFS_ASSERT
+static int key_search_validate(struct extent_buffer *b,
+  struct btrfs_key *key,
+  int level)
+{
+   struct btrfs_disk_key disk_key;
+   unsigned long offset;
+
+   btrfs_cpu_key_to_disk(&disk_key, key);
+
+   if (level == 0)
+   offset = offsetof(struct btrfs_leaf, items[0].key);
+   else
+   offset = offsetof(struct btrfs_node, ptrs[0].key);
+
+   return !memcmp_extent_buffer(b, &disk_key, offset, sizeof(disk_key));
+}
+#endif
+
+static int key_search(struct extent_buffer *b, struct btrfs_key *key,
+ int level, int *prev_cmp, int *slot)
+{
+   if (*prev_cmp != 0) {
+   *prev_cmp = bin_search(b, key, level, slot);
+   return *prev_cmp;
+   }
+
+   ASSERT(key_search_validate(b, key, level));
+   *slot = 0;
+
+   return 0;
+}
+
 /*
  * look for key in the tree.  path is filled in with nodes along the way
  * if key is found, we return zero and you can find the item in the leaf
@@ -2454,6 +2487,7 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, 
struct btrfs_root
int write_lock_level = 0;
u8 lowest_level = 0;
int min_write_lock_level;
+   int prev_cmp;
 
lowest_level = p->lowest_level;
WARN_ON(lowest_level && ins_len > 0);
@@ -2484,6 +2518,7 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, 
struct btrfs_root
min_write_lock_level = write_lock_level;
 
 again:
+   prev_cmp = -1;
/*
 * we try very hard to do read locks on the root
 */
@@ -2584,7 +2619,7 @@ cow_done:
if (!cow)
btrfs_unlock_up_safe(p, level + 1);
 
-   ret = bin_search(b, key, level, &slot);
+   ret = key_search(b, key, level, &prev_cmp, &slot);
 
if (level != 0) {
int dec = 0;
@@ -2719,6 +2754,7 @@ int btrfs_search_old_slot(struct btrfs_root *root, struct 
btrfs_key *key,
int level;
int lowest_unlock = 1;
u8 lowest_level = 0;
+   int prev_cmp;
 
lowest_level = p->lowest_level;
WARN_ON(p->nodes[0] != NULL);
@@ -2729,6 +2765,7 @@ int btrfs_search_old_slot(struct b

Re: [PATCH v6] Btrfs: optimize key searches in btrfs_search_slot

2013-09-01 Thread Filipe David Manana

On Sun, Sep 1, 2013 at 8:21 AM, Miao Xie  wrote:
> On  sat, 31 Aug 2013 13:54:56 +0100, Filipe David Borba Manana wrote:
>> When the binary search returns 0 (exact match), the target key
>> will necessarily be at slot 0 of all nodes below the current one,
>> so in this case the binary search is not needed because it will
>> always return 0, and we waste time doing it, holding node locks
>> for longer than necessary, etc.
>>
>> Below follow histograms with the times spent on the current approach of
>> doing a binary search when the previous binary search returned 0, and
>> times for the new approach, which directly picks the first item/child
>> node in the leaf/node.
>>
>> Current approach:
>>
>> Count: 6682
>> Range: 35.000 - 8370.000; Mean: 85.837; Median: 75.000; Stddev: 106.429
>> Percentiles:  90th: 124.000; 95th: 145.000; 99th: 206.000
>>   35.000 -   61.080:  1235 
>>   61.080 -  106.053:  4207 
>> #
>>  106.053 -  183.606:  1122 ##
>>  183.606 -  317.341:   111 #
>>  317.341 -  547.959: 6 |
>>  547.959 - 8370.000: 1 |
>>
>> Approach proposed by this patch:
>>
>> Count: 6682
>> Range:  6.000 - 135.000; Mean: 16.690; Median: 16.000; Stddev:  7.160
>> Percentiles:  90th: 23.000; 95th: 27.000; 99th: 40.000
>>6.000 -8.418:58 #
>>8.418 -   11.670:  1149 #
>>   11.670 -   16.046:  2418 
>> #
>>   16.046 -   21.934:  2098 ##
>>   21.934 -   29.854:   744 
>>   29.854 -   40.511:   154 ###
>>   40.511 -   54.848:41 #
>>   54.848 -   74.136: 5 |
>>   74.136 -  100.087: 9 |
>>  100.087 -  135.000: 6 |
>>
>> These samples were captured during a run of the btrfs tests 001, 002 and
>> 004 in the xfstests, with a leaf/node size of 4Kb.
>>
>> Signed-off-by: Filipe David Borba Manana 
>> Signed-off-by: Josef Bacik 
>> ---
>>
>> V2: Simplified code, removed unnecessary code.
>> V3: Replaced BUG_ON() with the new ASSERT() from Josef.
>> V4: Addressed latest comments from Zach Brown and Josef Bacik.
>> Surrounded all code that is used for the assertion with a
>> #ifdef CONFIG_BTRFS_ASSERT ... #endif block. Also changed
>> offset arguments to be more strictly correct.
>> V5: Updated histograms to reflect latest version of the code.
>> V6: Use single assert macro and no more #ifdef CONFIG_BTRFS_ASSERT
>> ... #endif logic, as suggested by Miao Xie.
>>
>>  fs/btrfs/ctree.c |   39 +--
>>  1 file changed, 37 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>> index 5fa521b..5f38157 100644
>> --- a/fs/btrfs/ctree.c
>> +++ b/fs/btrfs/ctree.c
>> @@ -2426,6 +2426,37 @@ done:
>>   return ret;
>>  }
>>
>> +static int key_search_validate(struct extent_buffer *b,
>> +struct btrfs_key *key,
>> +int level)
>> +{
>> + struct btrfs_disk_key disk_key;
>> + unsigned long offset;
>> +
>> + btrfs_cpu_key_to_disk(&disk_key, key);
>> +
>> + if (level == 0)
>> + offset = offsetof(struct btrfs_leaf, items[0].key);
>> + else
>> + offset = offsetof(struct btrfs_node, ptrs[0].key);
>> +
>> + return !memcmp_extent_buffer(b, &disk_key, offset, sizeof(disk_key));
>> +}
>
> Maybe I didn't explain clearly in the previous mail, what I suggested was to
> move "#ifdef CONFIG_BTRFS_ASSERT" out of the function, not to remove it. The 
> final
> code is:
>
> #ifdef CONFIG_BTRFS_ASSERT
> static int key_search_validate()
> {
> }
> #endif
>
> static int key_search()
> {
> ...
> ASSERT(key_search_validate(b, key, level));
> ...
> }
>
> If there is no "#ifdef CONFIG_BTRFS_ASSERT" wrapper around 
> key_search_validate(),
> the compiler will output the unused function warning if CONFIG_BTRFS_ASSERT 
> is not set.

Ok. I misunderstood what you meant before. If the goal is not to
remove the #ifdef #endif, then honestly I'm not seeing what value the
suggestion brings in compared to patch v5, as it seems purely a style
preference (and highly subjective whether it's better or not).
Nevertheless I'm fine with it and hopefully everyone else will be too.

thanks

>
> Thanks
> Miao
>
>> +
>> +static int key_search(struct extent_buffer *b, struct btrfs_key *key,
>> +   int level, int *prev_cmp, int *slot)
>> +{
>> + if (*prev_cmp != 0) {
>> + *prev_cmp = bin_search(b, key, level, slot);
>> + return *prev_cmp;
>> + }
>> +
>> + ASSERT(key_search_validate(b, key, level));
>> + *slot = 0;
>> +
>> + return 0;
>> +}
>> +
>>  /*
>>   * look for key in the tree.  path is filled in with nodes along the way
>>   * if key is found, we return zero and you can find the item in the leaf
>> @@ -2454,6 +2485,7 @@ int btrfs_search_slot(struct btrfs_trans_handle

[PATCH] btrfs: use list_for_each_entry_safe() when delete items

2013-09-01 Thread Azat Khuzhin

Replace list_for_each_entry() by list_for_each_entry_safe() in
__btrfs_close_devices()

list_for_each_entry() {
list_replace_rcu();
call_rcu(); <--We may free the device, if we get next
   device by the current one, the page fault
   may happen.
}

Signed-off-by: Azat Khuzhin 
Reviewed-by: Miao Xie 
---
V2: Add some comments from Miao into commit message

 fs/btrfs/volumes.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 78b8717..1d1b595 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -616,13 +616,13 @@ static void free_device(struct rcu_head *head)
 
 static int __btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
 {
-   struct btrfs_device *device;
+   struct btrfs_device *device, *next;
 
if (--fs_devices->opened > 0)
return 0;
 
mutex_lock(&fs_devices->device_list_mutex);
-   list_for_each_entry(device, &fs_devices->devices, dev_list) {
+   list_for_each_entry_safe(device, next, &fs_devices->devices, dev_list) {
struct btrfs_device *new_device;
struct rcu_string *name;
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] rwsem: add rwsem_is_contended

2013-09-01 Thread Michel Lespinasse

Hi Josef,

On Fri, Aug 30, 2013 at 7:14 AM, Josef Bacik  wrote:
> Btrfs uses an rwsem to control access to its extent tree.  Threads will hold a
> read lock on this rwsem while they scan the extent tree, and if need_resched()
> they will drop the lock and schedule.  The transaction commit needs to take a
> write lock for this rwsem for a very short period to switch out the commit
> roots.  If there are a lot of threads doing this caching operation we can 
> starve
> out the committers which slows everybody out.  To address this we want to add
> this functionality to see if our rwsem has anybody waiting to take a write 
> lock
> so we can drop it and schedule for a bit to allow the commit to continue.
> Thanks,
>
> Signed-off-by: Josef Bacik 

FYI, I once tried to introduce something like this before, but my use
case was pretty weak so it was not accepted at the time. I don't think
there were any objections to the API itself though, and I think it's
potentially a good idea if you use case justifies it.

Two comments:

- Note that there are two rwsem implementations - if you are going to
add functionality to rwsem.h you want to add the same functionality in
rwsem-spinlock.h as well.

- I would prefer if you could avoid taking the wait_lock in your
rwsem.h implementation. In your use case (read lock is known to be
held), checking for sem->count < 0 would be sufficient to indicate a
writer is queued (or getting onto the queue). In the general case,
some architectures have the various values set up so that
RWSEM_WAITING_BIAS != RWSEM_ACTIVE_WRITE_BIAS - for these
architectures at least, you can check for waiters by looking if the
lowest bit of RWSEM_WAITING_BIAS is set in sem->count.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6] Btrfs: optimize key searches in btrfs_search_slot

2013-09-01 Thread Miao Xie

On  sat, 31 Aug 2013 13:54:56 +0100, Filipe David Borba Manana wrote:
> When the binary search returns 0 (exact match), the target key
> will necessarily be at slot 0 of all nodes below the current one,
> so in this case the binary search is not needed because it will
> always return 0, and we waste time doing it, holding node locks
> for longer than necessary, etc.
> 
> Below follow histograms with the times spent on the current approach of
> doing a binary search when the previous binary search returned 0, and
> times for the new approach, which directly picks the first item/child
> node in the leaf/node.
> 
> Current approach:
> 
> Count: 6682
> Range: 35.000 - 8370.000; Mean: 85.837; Median: 75.000; Stddev: 106.429
> Percentiles:  90th: 124.000; 95th: 145.000; 99th: 206.000
>   35.000 -   61.080:  1235 
>   61.080 -  106.053:  4207 
> #
>  106.053 -  183.606:  1122 ##
>  183.606 -  317.341:   111 #
>  317.341 -  547.959: 6 |
>  547.959 - 8370.000: 1 |
> 
> Approach proposed by this patch:
> 
> Count: 6682
> Range:  6.000 - 135.000; Mean: 16.690; Median: 16.000; Stddev:  7.160
> Percentiles:  90th: 23.000; 95th: 27.000; 99th: 40.000
>6.000 -8.418:58 #
>8.418 -   11.670:  1149 #
>   11.670 -   16.046:  2418 
> #
>   16.046 -   21.934:  2098 ##
>   21.934 -   29.854:   744 
>   29.854 -   40.511:   154 ###
>   40.511 -   54.848:41 #
>   54.848 -   74.136: 5 |
>   74.136 -  100.087: 9 |
>  100.087 -  135.000: 6 |
> 
> These samples were captured during a run of the btrfs tests 001, 002 and
> 004 in the xfstests, with a leaf/node size of 4Kb.
> 
> Signed-off-by: Filipe David Borba Manana 
> Signed-off-by: Josef Bacik 
> ---
> 
> V2: Simplified code, removed unnecessary code.
> V3: Replaced BUG_ON() with the new ASSERT() from Josef.
> V4: Addressed latest comments from Zach Brown and Josef Bacik.
> Surrounded all code that is used for the assertion with a
> #ifdef CONFIG_BTRFS_ASSERT ... #endif block. Also changed
> offset arguments to be more strictly correct.
> V5: Updated histograms to reflect latest version of the code.
> V6: Use single assert macro and no more #ifdef CONFIG_BTRFS_ASSERT
> ... #endif logic, as suggested by Miao Xie.
> 
>  fs/btrfs/ctree.c |   39 +--
>  1 file changed, 37 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index 5fa521b..5f38157 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -2426,6 +2426,37 @@ done:
>   return ret;
>  }
>  
> +static int key_search_validate(struct extent_buffer *b,
> +struct btrfs_key *key,
> +int level)
> +{
> + struct btrfs_disk_key disk_key;
> + unsigned long offset;
> +
> + btrfs_cpu_key_to_disk(&disk_key, key);
> +
> + if (level == 0)
> + offset = offsetof(struct btrfs_leaf, items[0].key);
> + else
> + offset = offsetof(struct btrfs_node, ptrs[0].key);
> +
> + return !memcmp_extent_buffer(b, &disk_key, offset, sizeof(disk_key));
> +}

Maybe I didn't explain clearly in the previous mail, what I suggested was to
move "#ifdef CONFIG_BTRFS_ASSERT" out of the function, not to remove it. The 
final
code is:

#ifdef CONFIG_BTRFS_ASSERT
static int key_search_validate()
{
}
#endif

static int key_search()
{
...
ASSERT(key_search_validate(b, key, level));
...
}

If there is no "#ifdef CONFIG_BTRFS_ASSERT" wrapper around 
key_search_validate(),
the compiler will output the unused function warning if CONFIG_BTRFS_ASSERT is 
not set.

Thanks
Miao

> +
> +static int key_search(struct extent_buffer *b, struct btrfs_key *key,
> +   int level, int *prev_cmp, int *slot)
> +{
> + if (*prev_cmp != 0) {
> + *prev_cmp = bin_search(b, key, level, slot);
> + return *prev_cmp;
> + }
> +
> + ASSERT(key_search_validate(b, key, level));
> + *slot = 0;
> +
> + return 0;
> +}
> +
>  /*
>   * look for key in the tree.  path is filled in with nodes along the way
>   * if key is found, we return zero and you can find the item in the leaf
> @@ -2454,6 +2485,7 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, 
> struct btrfs_root
>   int write_lock_level = 0;
>   u8 lowest_level = 0;
>   int min_write_lock_level;
> + int prev_cmp;
>  
>   lowest_level = p->lowest_level;
>   WARN_ON(lowest_level && ins_len > 0);
> @@ -2484,6 +2516,7 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, 
> struct btrfs_root
>   min_write_lock_level = write_lock_level;
>  
>  again:
> + prev_cmp = -1;
>   /*
>* we try very hard to do read locks on the root
>*/
> @@ -2584,7 +2617,7 @@ cow_done:
>

Re: btrfs-progs: v3, move out print in cmd_df to another function

Re: btrfs-progs: v3, move out print in cmd_df to another function

[PATCH] btrfs-progs: replace fails start but in the background v2

Re: Btrfs: separate out tests into their own directory V2

Re: [PATCH 2/2] btrfs-progs: Update the man page of btrfs

Re: Device delete returns "unable to go below four devices on raid10" on 5 drive setup

[PATCH] Btrfs: do not add replace target to the alloc_list

Re: Mixed blocks, he can avoid ENOSPACE error, when he can't allocated metadata blocks?

Unmountable filesystem parent transid verify failed

Re: Device delete returns "unable to go below four devices on raid10" on 5 drive setup

[PATCH v7] Btrfs: optimize key searches in btrfs_search_slot

Re: [PATCH v6] Btrfs: optimize key searches in btrfs_search_slot

[PATCH] btrfs: use list_for_each_entry_safe() when delete items

Re: [PATCH] rwsem: add rwsem_is_contended

Re: [PATCH v6] Btrfs: optimize key searches in btrfs_search_slot

15 matches

Site Navigation

Mail list logo

Footer information