Re: btrfs-progs: v3, move out print in cmd_df to another function
On 09/02/2013 10:58 AM, Anand Jain wrote: Wang, kindly note that this was fixed in v4. Yeah...This is sent by my shell script by accident, I have fixed it and it won't send the repeated message again. Thanks for reminding .. ^_^ Anand On 09/01/2013 02:15 PM, Wang Shilong wrote: Hello, Using checkpatch.pl, i get the following warnings(errors): ERROR: "foo * bar" should be "foo *bar" #37: FILE: cmds-filesystem.c:47: +static char * group_type_str(u64 flag) ERROR: "foo * bar" should be "foo *bar" #69: FILE: cmds-filesystem.c:63: +static char * group_profile_str(u64 flag) total: 2 errors, 0 warnings, 229 lines checked patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. Thanks, Wang -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs-progs: v3, move out print in cmd_df to another function
Wang, kindly note that this was fixed in v4. Anand On 09/01/2013 02:15 PM, Wang Shilong wrote: Hello, Using checkpatch.pl, i get the following warnings(errors): ERROR: "foo * bar" should be "foo *bar" #37: FILE: cmds-filesystem.c:47: +static char * group_type_str(u64 flag) ERROR: "foo * bar" should be "foo *bar" #69: FILE: cmds-filesystem.c:63: +static char * group_profile_str(u64 flag) total: 2 errors, 0 warnings, 229 lines checked patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. Thanks, Wang -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs-progs: replace fails start but in the background v2
when the balance is running, the replace start ioctl fails (for the right reasons). but since the cli has put ioctl thread to background (for right reasons) the user won't know that cli failed to start. so before cli goes to the background, it should check if mutually_exclusive_operation_running is not held. this is done by newly introduced ioctl BTRFS_IOC_CHECK_DEV_EXCL_OPS by the following kernel patch: btrfs: introduce BTRFS_IOC_CHECK_DEV_EXCL_OPS ioctl to check dev excl op v2: fix checkpatch.pl warnings as spotted by Wang Signed-off-by: Anand Jain --- cmds-device.c |4 ++-- cmds-replace.c | 14 ++ ioctl.h|1 + man/btrfs.8.in |4 ++-- utils.c| 15 +++ utils.h|1 + 6 files changed, 35 insertions(+), 4 deletions(-) diff --git a/cmds-device.c b/cmds-device.c index 8446502..27f7f84 100644 --- a/cmds-device.c +++ b/cmds-device.c @@ -45,7 +45,7 @@ static const char * const device_cmd_group_usage[] = { }; static const char * const cmd_add_dev_usage[] = { - "btrfs device add [...] ", + "btrfs device add [-f] [...] ", "Add a device to a filesystem", NULL }; @@ -85,7 +85,7 @@ static int cmd_add_dev(int argc, char **argv) int devfd, res; u64 dev_block_count = 0; int mixed = 0; - +printf("asj: add %s\n", argv[i]); res = test_dev_for_mkfs(argv[i], force, estr); if (res) { fprintf(stderr, "%s", estr); diff --git a/cmds-replace.c b/cmds-replace.c index e3ff695..0b2cbc8 100644 --- a/cmds-replace.c +++ b/cmds-replace.c @@ -203,6 +203,20 @@ static int cmd_start_replace(int argc, char **argv) goto leave_with_error; } + /* check if there is some other device exclusive +* operation running in the FS which won't let this replace +* to run. And ENOTTY is when older kernel doesn't support +* lock checking ioctl +*/ + ret = is_dev_excl_op_free(fdmnt); + if (ret && ret != -ENOTTY) { + fprintf(stderr, + "ERROR: replace start failed on \"%s\" - %s\n", + path, + ret > 0 ? btrfs_err_str(ret) : strerror(-ret)); + goto leave_with_error; + } + srcdev = argv[optind]; dstdev = argv[optind + 1]; diff --git a/ioctl.h b/ioctl.h index e959720..afadcbc 100644 --- a/ioctl.h +++ b/ioctl.h @@ -598,6 +598,7 @@ struct btrfs_ioctl_clone_range_args { #define BTRFS_IOC_DEV_REPLACE _IOWR(BTRFS_IOCTL_MAGIC, 53, \ struct btrfs_ioctl_dev_replace_args) #define BTRFS_IOC_DEDUP_CTL _IOW(BTRFS_IOCTL_MAGIC, 55, int) +#define BTRFS_IOC_CHECK_DEV_EXCL_OPS _IO(BTRFS_IOCTL_MAGIC, 56) #ifdef __cplusplus } diff --git a/man/btrfs.8.in b/man/btrfs.8.in index d493d7e..0cca5c3 100644 --- a/man/btrfs.8.in +++ b/man/btrfs.8.in @@ -47,7 +47,7 @@ btrfs \- control a btrfs filesystem \fBbtrfs\fP \fB[filesystem] balance status\fP [-v] \fI\fP .PP .PP -\fBbtrfs\fP \fBdevice add\fP \fI\fP [\fI...\fP] \fI\fP +\fBbtrfs\fP \fBdevice add\fP [-f] \fI\fP [\fI...\fP] \fI\fP .PP \fBbtrfs\fP \fBdevice delete\fP \fI\fP [\fI...\fP] \fI\fP .PP @@ -386,7 +386,7 @@ be verbose .RE .TP -\fBdevice add\fR\fI \fP[\fI...\fP] \fI\fR +\fBdevice add\fR [-f] \fI \fP[\fI...\fP] \fI\fR Add device(s) to the filesystem identified by \fI\fR. .TP diff --git a/utils.c b/utils.c index b039a03..c53b7f1 100644 --- a/utils.c +++ b/utils.c @@ -1993,3 +1993,18 @@ int is_vol_small(char *file) return 0; } } + +/* Returns: + * BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS: + * If the device is locked to prevent other device operations + * from the user cli like device add remove replace balance etc.. + * < 0: + * For any other error including if kernel don't support the + * ioctl (-ENOTTY) + */ +int is_dev_excl_op_free(int fd) +{ + int ret; + ret = ioctl(fd, BTRFS_IOC_CHECK_DEV_EXCL_OPS, NULL); + return ret > 0 ? ret : -errno; +} diff --git a/utils.h b/utils.h index eb6fba3..a1e5d67 100644 --- a/utils.h +++ b/utils.h @@ -83,4 +83,5 @@ int test_num_disk_vs_raid(u64 metadata_profile, u64 data_profile, u64 dev_cnt, int mixed, char *estr); int get_btrfs_mount(const char *dev, char *mp, size_t mp_size); int is_vol_small(char *file); +int is_dev_excl_op_free(int fd); #endif -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Btrfs: separate out tests into their own directory V2
Sorry, please ignore this thread... Thanks, wang On 09/01/2013 02:15 PM, Wang Shilong wrote: Hello, Using checkpatch.pl, i get the following warnings(errors): WARNING: kfree(NULL) is safe this check is probably not required #132: FILE: fs/btrfs/free-space-cache.c:3035: + if (map) + kfree(map); WARNING: line over 80 characters #882: FILE: fs/btrfs/tests/free-space-tests.c:211: + ret = test_add_free_space_entry(cache, 4 * 1024 * 1024, 1 * 1024 * 1024, 1); WARNING: line over 80 characters #927: FILE: fs/btrfs/tests/free-space-tests.c:256: + ret = test_add_free_space_entry(cache, 1 * 1024 * 1024, 4 * 1024 * 1024, 1); WARNING: line over 80 characters #947: FILE: fs/btrfs/tests/free-space-tests.c:276: + ret = test_add_free_space_entry(cache, 4 * 1024 * 1024, 4 * 1024 * 1024, 1); WARNING: line over 80 characters #953: FILE: fs/btrfs/tests/free-space-tests.c:282: + ret = test_add_free_space_entry(cache, 2 * 1024 * 1024, 2 * 1024 * 1024, 0); WARNING: line over 80 characters #1016: FILE: fs/btrfs/tests/free-space-tests.c:345: + ret = test_add_free_space_entry(cache, 1 * 1024 * 1024, 2 * 1024 * 1024, 1); WARNING: line over 80 characters #1022: FILE: fs/btrfs/tests/free-space-tests.c:351: + ret = test_add_free_space_entry(cache, 3 * 1024 * 1024, 1 * 1024 * 1024, 0); WARNING: line over 80 characters #1030: FILE: fs/btrfs/tests/free-space-tests.c:359: + test_msg("Error removing bitmap and extent overlapping %d\n", ret); total: 0 errors, 8 warnings, 997 lines checked patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. Thanks, Wang -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] btrfs-progs: Update the man page of btrfs
Sorry for the late reply. I checked the new man page, which seems OK for me. Thank you. Qu Wenruo 于 2013年08月06日 07:28, David Sterba 写道: On Tue, Jul 23, 2013 at 10:43:22AM +0800, Qu Wenruo wrote: Update the man page of "btrfs" command to keep up with new commands. Thanks. Please check if I haven't missed anything, I've used 'wiggle' to guess-apply the patch and fixed a few conflicts myself. david -- - Qu Wenruo Development Dept.I Nanjing Fujitsu Nanda Software Tech. Co., Ltd.(FNST) No. 6 Wenzhu Road, Nanjing, 210012, China TEL: +86+25-86630566-8526 COINS: 7998-8526 FAX: +86+25-83317685 MAIL: quwen...@cn.fujitsu.com - -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Device delete returns "unable to go below four devices on raid10" on 5 drive setup
On Sun, 2013-09-01 at 14:08 +0200, Steven Post wrote: > On Sat, 2013-08-31 at 18:03 -0600, Chris Murphy wrote: > > On Aug 31, 2013, at 5:55 PM, Steven Post > > wrote: > > > > > On Sat, 2013-08-31 at 11:42 -0600, Chris Murphy wrote: > > >> > > >> Yes. It might take a few minutes after the chunks are reallocated for > > >> the device to be removed from the volume. I've had some cases where even > > >> a reboot was needed for the information in fi sh to refresh. > > > > > > I see, so that might be normal behaviour. > > > > No, I think it's expected for deleted devices to not appear in the volume > > listing anymore, but with older kernels I had that experience. I haven't > > tried it recently with newer kernels. > > I might have phrased that a bit incorrectly, with 'normal behaviour' I > meant it was known to do that and not cause major problems. Of course I > would expect the entry to just disappear. > I'll let you know the result of the 'device delete' operation on the > second machine (3.10.7 kernel). The 3.10.7 kernel doesn't seem to have this issue, the removed device is not listed anymore immediately after the 'device delete' command returns. Although in that instance the array still had 6 drives, not 5 as with the old one. [...] Best regards, Steven signature.asc Description: This is a digitally signed message part
[PATCH] Btrfs: do not add replace target to the alloc_list
If replace was suspended by the umount, replace target device is added to the fs_devices->alloc_list during a later mount. This is obviously wrong. ->is_tgtdev_for_dev_replace is supposed to guard against that, but ->is_tgtdev_for_dev_replace is (and can only ever be) initialized *after* everything is opened and fs_devices lists are populated. Fix this by checking the devid instead: for replace targets it's always equal to BTRFS_DEV_REPLACE_DEVID. Cc: Stefan Behrens Signed-off-by: Ilya Dryomov --- At first I thought this was caused by my btrfs_device rollback patch, but no, it just made it easier to spot -- previously one had to reboot or rmmod/insmod before mounting a suspended replace. fs/btrfs/volumes.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index c9a0977..5b99f19 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -793,7 +793,8 @@ static int __btrfs_open_devices(struct btrfs_fs_devices *fs_devices, fs_devices->rotating = 1; fs_devices->open_devices++; - if (device->writeable && !device->is_tgtdev_for_dev_replace) { + if (device->writeable && + device->devid != BTRFS_DEV_REPLACE_DEVID) { fs_devices->rw_devices++; list_add(&device->dev_alloc_list, &fs_devices->alloc_list); -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Mixed blocks, he can avoid ENOSPACE error, when he can't allocated metadata blocks?
Hello list, sorry for my bad english anyway. if my message is delirium, just ignore this message. My question: When using mixed blocks, metadata and data chunks has be merge, but we have (when using mixed) speed penalty. how many penalty will be have if we using mixed? Kernel 3.11-rc7, Ubuntu 13.10 x64 my simple test: #Mixed sudo mkfs.btrfs -f -M /dev/sdb #sdb old seagate hdd 80G sudo mount /dev/sdb /mnt time sudo dd if=/dev/zero of=/mnt/zero bs=1M count=4096 ~48.5s time sudo rm /mnt/zero ~1.5s sudo umount /mnt --- #default setting sudo mkfs.btrfs -f /dev/sdb sudo mount /dev/sdb /mnt time sudo dd if=/dev/zero of=/mnt/zero bs=1M count=4096 ~48.5s time sudo rm /mnt/zero ~1.5s sudo umount /mnt --- #raw write time sudo dd if=/dev/zero of=/dev/sdb bs=1M count=4096 ~59s I don't see any different between mixed and defaults blocks. So using mixed blocks have sense? Or this is outdated function? Can i change in future type of chunk allocation between mixed and default (with using balance function)? i believe in mixed blocks because this function can awoid problems with ENOSPACE error (when system can't allocated metadata chunk) and can up space utilization on the disk. On slow hdd, we no see any different between profiles, and on speed ssd perfomance penalty should not be significant. (i just using btrfs on my PC (1 TB hdd) and laptop (ssd 128 GB), i like this fs and i haven't any problem (Only enospace, but i just add temp drive and using balance to fix this) and i just try optimizing fs). -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Unmountable filesystem parent transid verify failed
Hi again. Sorry for top posting. I have a 9 disk filesystem that does not mount anymore and need some help/advice so I can recover the data. What happened was that I was running a btrfs delete device under Ubuntu 13.04 Kernel 3.8 and after a long time of moving data around it crashed with a SEGV. Now the filesystem does not mount and none of the recovery options I have tried work. I have upgraded to Debian testing and are now using kerne3.10-2-amd64 When I try btrfsck I get heaps of these : Ignoring transid failure parent transid verify failed on 24419581267968 wanted 301480 found 301495 parent transid verify failed on 24419581267968 wanted 301480 found 301495 parent transid verify failed on 24419581267968 wanted 301480 found 301495 parent transid verify failed on 24419581267968 wanted 301480 found 301495 I have tried using btrfs-image : but it too crashes eventually with : btrfs-image -c9 -t4 /dev/sde btrfs-image ... btrfs-image: ctree.c:787: read_node_slot: Assertion `!(level == 0)' failed. Aborted mount -o ro,recovery fails # mount -o ro,recovery /dev/sde /DATA mount: wrong fs type, bad option, bad superblock on /dev/sde, ... # btrfs-zero-log /dev/sde eventually fails with : btrfs-zero-log: ctree.c:342: __btrfs_cow_block: Assertion `!(btrfs_header_generation(buf) > trans->transid)' failed. Aborted What should I try next? regards ronnie sahlberg -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Device delete returns "unable to go below four devices on raid10" on 5 drive setup
On Sat, 2013-08-31 at 18:03 -0600, Chris Murphy wrote: > On Aug 31, 2013, at 5:55 PM, Steven Post wrote: > > > On Sat, 2013-08-31 at 11:42 -0600, Chris Murphy wrote: > >> > >> Yes. It might take a few minutes after the chunks are reallocated for the > >> device to be removed from the volume. I've had some cases where even a > >> reboot was needed for the information in fi sh to refresh. > > > > I see, so that might be normal behaviour. > > No, I think it's expected for deleted devices to not appear in the volume > listing anymore, but with older kernels I had that experience. I haven't > tried it recently with newer kernels. I might have phrased that a bit incorrectly, with 'normal behaviour' I meant it was known to do that and not cause major problems. Of course I would expect the entry to just disappear. I'll let you know the result of the 'device delete' operation on the second machine (3.10.7 kernel). Back on the 3.2 kernel, the "filesystem show" command still showed the removed device, but still with less used space that the others after the balance. Shutdown + physical removal + boot didn't produce any error, since this array is used as the root filesystem I think I would have noticed a serious problem by now. I successfully added the 1 TB drive to the array (after partitioning). So it seems it's just the output of the 'filesystem show' command that is lagging behind, even after a reboot and a 20 hour idle period. > > > > > As an aside, I'd rather not recreate the arrays if it can be done > > without recreating. > > It should work. But it's an experimental file system. I'd at least make a > backup if you're going to do this with the device add/remove method. Naturally, all important files have a backup, but for the rest of the volume I can live with the fact that it would be lost. Also if every one created a new filesystem instead of using device add/delete, this code wouldn't get much testing ;) Best regards, Steven signature.asc Description: This is a digitally signed message part
[PATCH v7] Btrfs: optimize key searches in btrfs_search_slot
When the binary search returns 0 (exact match), the target key will necessarily be at slot 0 of all nodes below the current one, so in this case the binary search is not needed because it will always return 0, and we waste time doing it, holding node locks for longer than necessary, etc. Below follow histograms with the times spent on the current approach of doing a binary search when the previous binary search returned 0, and times for the new approach, which directly picks the first item/child node in the leaf/node. Current approach: Count: 6682 Range: 35.000 - 8370.000; Mean: 85.837; Median: 75.000; Stddev: 106.429 Percentiles: 90th: 124.000; 95th: 145.000; 99th: 206.000 35.000 - 61.080: 1235 61.080 - 106.053: 4207 # 106.053 - 183.606: 1122 ## 183.606 - 317.341: 111 # 317.341 - 547.959: 6 | 547.959 - 8370.000: 1 | Approach proposed by this patch: Count: 6682 Range: 6.000 - 135.000; Mean: 16.690; Median: 16.000; Stddev: 7.160 Percentiles: 90th: 23.000; 95th: 27.000; 99th: 40.000 6.000 -8.418:58 # 8.418 - 11.670: 1149 # 11.670 - 16.046: 2418 # 16.046 - 21.934: 2098 ## 21.934 - 29.854: 744 29.854 - 40.511: 154 ### 40.511 - 54.848:41 # 54.848 - 74.136: 5 | 74.136 - 100.087: 9 | 100.087 - 135.000: 6 | These samples were captured during a run of the btrfs tests 001, 002 and 004 in the xfstests, with a leaf/node size of 4Kb. Signed-off-by: Filipe David Borba Manana Signed-off-by: Josef Bacik --- V2: Simplified code, removed unnecessary code. V3: Replaced BUG_ON() with the new ASSERT() from Josef. V4: Addressed latest comments from Zach Brown and Josef Bacik. Surrounded all code that is used for the assertion with a #ifdef CONFIG_BTRFS_ASSERT ... #endif block. Also changed offset arguments to be more strictly correct. V5: Updated histograms to reflect latest version of the code. V6: Use single assert macro and no more #ifdef CONFIG_BTRFS_ASSERT ... #endif logic, as suggested by Miao Xie. V7: Added back the #ifdef ... #endif logic, to avoid compiler warning about unused function when CONFIG_BTRFS_ASSERT is not enabled. fs/btrfs/ctree.c | 41 +++-- 1 file changed, 39 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 5fa521b..4d602f7 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -2426,6 +2426,39 @@ done: return ret; } +#ifdef CONFIG_BTRFS_ASSERT +static int key_search_validate(struct extent_buffer *b, + struct btrfs_key *key, + int level) +{ + struct btrfs_disk_key disk_key; + unsigned long offset; + + btrfs_cpu_key_to_disk(&disk_key, key); + + if (level == 0) + offset = offsetof(struct btrfs_leaf, items[0].key); + else + offset = offsetof(struct btrfs_node, ptrs[0].key); + + return !memcmp_extent_buffer(b, &disk_key, offset, sizeof(disk_key)); +} +#endif + +static int key_search(struct extent_buffer *b, struct btrfs_key *key, + int level, int *prev_cmp, int *slot) +{ + if (*prev_cmp != 0) { + *prev_cmp = bin_search(b, key, level, slot); + return *prev_cmp; + } + + ASSERT(key_search_validate(b, key, level)); + *slot = 0; + + return 0; +} + /* * look for key in the tree. path is filled in with nodes along the way * if key is found, we return zero and you can find the item in the leaf @@ -2454,6 +2487,7 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root int write_lock_level = 0; u8 lowest_level = 0; int min_write_lock_level; + int prev_cmp; lowest_level = p->lowest_level; WARN_ON(lowest_level && ins_len > 0); @@ -2484,6 +2518,7 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root min_write_lock_level = write_lock_level; again: + prev_cmp = -1; /* * we try very hard to do read locks on the root */ @@ -2584,7 +2619,7 @@ cow_done: if (!cow) btrfs_unlock_up_safe(p, level + 1); - ret = bin_search(b, key, level, &slot); + ret = key_search(b, key, level, &prev_cmp, &slot); if (level != 0) { int dec = 0; @@ -2719,6 +2754,7 @@ int btrfs_search_old_slot(struct btrfs_root *root, struct btrfs_key *key, int level; int lowest_unlock = 1; u8 lowest_level = 0; + int prev_cmp; lowest_level = p->lowest_level; WARN_ON(p->nodes[0] != NULL); @@ -2729,6 +2765,7 @@ int btrfs_search_old_slot(struct b
Re: [PATCH v6] Btrfs: optimize key searches in btrfs_search_slot
On Sun, Sep 1, 2013 at 8:21 AM, Miao Xie wrote: > On sat, 31 Aug 2013 13:54:56 +0100, Filipe David Borba Manana wrote: >> When the binary search returns 0 (exact match), the target key >> will necessarily be at slot 0 of all nodes below the current one, >> so in this case the binary search is not needed because it will >> always return 0, and we waste time doing it, holding node locks >> for longer than necessary, etc. >> >> Below follow histograms with the times spent on the current approach of >> doing a binary search when the previous binary search returned 0, and >> times for the new approach, which directly picks the first item/child >> node in the leaf/node. >> >> Current approach: >> >> Count: 6682 >> Range: 35.000 - 8370.000; Mean: 85.837; Median: 75.000; Stddev: 106.429 >> Percentiles: 90th: 124.000; 95th: 145.000; 99th: 206.000 >> 35.000 - 61.080: 1235 >> 61.080 - 106.053: 4207 >> # >> 106.053 - 183.606: 1122 ## >> 183.606 - 317.341: 111 # >> 317.341 - 547.959: 6 | >> 547.959 - 8370.000: 1 | >> >> Approach proposed by this patch: >> >> Count: 6682 >> Range: 6.000 - 135.000; Mean: 16.690; Median: 16.000; Stddev: 7.160 >> Percentiles: 90th: 23.000; 95th: 27.000; 99th: 40.000 >>6.000 -8.418:58 # >>8.418 - 11.670: 1149 # >> 11.670 - 16.046: 2418 >> # >> 16.046 - 21.934: 2098 ## >> 21.934 - 29.854: 744 >> 29.854 - 40.511: 154 ### >> 40.511 - 54.848:41 # >> 54.848 - 74.136: 5 | >> 74.136 - 100.087: 9 | >> 100.087 - 135.000: 6 | >> >> These samples were captured during a run of the btrfs tests 001, 002 and >> 004 in the xfstests, with a leaf/node size of 4Kb. >> >> Signed-off-by: Filipe David Borba Manana >> Signed-off-by: Josef Bacik >> --- >> >> V2: Simplified code, removed unnecessary code. >> V3: Replaced BUG_ON() with the new ASSERT() from Josef. >> V4: Addressed latest comments from Zach Brown and Josef Bacik. >> Surrounded all code that is used for the assertion with a >> #ifdef CONFIG_BTRFS_ASSERT ... #endif block. Also changed >> offset arguments to be more strictly correct. >> V5: Updated histograms to reflect latest version of the code. >> V6: Use single assert macro and no more #ifdef CONFIG_BTRFS_ASSERT >> ... #endif logic, as suggested by Miao Xie. >> >> fs/btrfs/ctree.c | 39 +-- >> 1 file changed, 37 insertions(+), 2 deletions(-) >> >> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c >> index 5fa521b..5f38157 100644 >> --- a/fs/btrfs/ctree.c >> +++ b/fs/btrfs/ctree.c >> @@ -2426,6 +2426,37 @@ done: >> return ret; >> } >> >> +static int key_search_validate(struct extent_buffer *b, >> +struct btrfs_key *key, >> +int level) >> +{ >> + struct btrfs_disk_key disk_key; >> + unsigned long offset; >> + >> + btrfs_cpu_key_to_disk(&disk_key, key); >> + >> + if (level == 0) >> + offset = offsetof(struct btrfs_leaf, items[0].key); >> + else >> + offset = offsetof(struct btrfs_node, ptrs[0].key); >> + >> + return !memcmp_extent_buffer(b, &disk_key, offset, sizeof(disk_key)); >> +} > > Maybe I didn't explain clearly in the previous mail, what I suggested was to > move "#ifdef CONFIG_BTRFS_ASSERT" out of the function, not to remove it. The > final > code is: > > #ifdef CONFIG_BTRFS_ASSERT > static int key_search_validate() > { > } > #endif > > static int key_search() > { > ... > ASSERT(key_search_validate(b, key, level)); > ... > } > > If there is no "#ifdef CONFIG_BTRFS_ASSERT" wrapper around > key_search_validate(), > the compiler will output the unused function warning if CONFIG_BTRFS_ASSERT > is not set. Ok. I misunderstood what you meant before. If the goal is not to remove the #ifdef #endif, then honestly I'm not seeing what value the suggestion brings in compared to patch v5, as it seems purely a style preference (and highly subjective whether it's better or not). Nevertheless I'm fine with it and hopefully everyone else will be too. thanks > > Thanks > Miao > >> + >> +static int key_search(struct extent_buffer *b, struct btrfs_key *key, >> + int level, int *prev_cmp, int *slot) >> +{ >> + if (*prev_cmp != 0) { >> + *prev_cmp = bin_search(b, key, level, slot); >> + return *prev_cmp; >> + } >> + >> + ASSERT(key_search_validate(b, key, level)); >> + *slot = 0; >> + >> + return 0; >> +} >> + >> /* >> * look for key in the tree. path is filled in with nodes along the way >> * if key is found, we return zero and you can find the item in the leaf >> @@ -2454,6 +2485,7 @@ int btrfs_search_slot(struct btrfs_trans_handle
[PATCH] btrfs: use list_for_each_entry_safe() when delete items
Replace list_for_each_entry() by list_for_each_entry_safe() in __btrfs_close_devices() list_for_each_entry() { list_replace_rcu(); call_rcu(); <--We may free the device, if we get next device by the current one, the page fault may happen. } Signed-off-by: Azat Khuzhin Reviewed-by: Miao Xie --- V2: Add some comments from Miao into commit message fs/btrfs/volumes.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 78b8717..1d1b595 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -616,13 +616,13 @@ static void free_device(struct rcu_head *head) static int __btrfs_close_devices(struct btrfs_fs_devices *fs_devices) { - struct btrfs_device *device; + struct btrfs_device *device, *next; if (--fs_devices->opened > 0) return 0; mutex_lock(&fs_devices->device_list_mutex); - list_for_each_entry(device, &fs_devices->devices, dev_list) { + list_for_each_entry_safe(device, next, &fs_devices->devices, dev_list) { struct btrfs_device *new_device; struct rcu_string *name; -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] rwsem: add rwsem_is_contended
Hi Josef, On Fri, Aug 30, 2013 at 7:14 AM, Josef Bacik wrote: > Btrfs uses an rwsem to control access to its extent tree. Threads will hold a > read lock on this rwsem while they scan the extent tree, and if need_resched() > they will drop the lock and schedule. The transaction commit needs to take a > write lock for this rwsem for a very short period to switch out the commit > roots. If there are a lot of threads doing this caching operation we can > starve > out the committers which slows everybody out. To address this we want to add > this functionality to see if our rwsem has anybody waiting to take a write > lock > so we can drop it and schedule for a bit to allow the commit to continue. > Thanks, > > Signed-off-by: Josef Bacik FYI, I once tried to introduce something like this before, but my use case was pretty weak so it was not accepted at the time. I don't think there were any objections to the API itself though, and I think it's potentially a good idea if you use case justifies it. Two comments: - Note that there are two rwsem implementations - if you are going to add functionality to rwsem.h you want to add the same functionality in rwsem-spinlock.h as well. - I would prefer if you could avoid taking the wait_lock in your rwsem.h implementation. In your use case (read lock is known to be held), checking for sem->count < 0 would be sufficient to indicate a writer is queued (or getting onto the queue). In the general case, some architectures have the various values set up so that RWSEM_WAITING_BIAS != RWSEM_ACTIVE_WRITE_BIAS - for these architectures at least, you can check for waiters by looking if the lowest bit of RWSEM_WAITING_BIAS is set in sem->count. -- Michel "Walken" Lespinasse A program is never fully debugged until the last user dies. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6] Btrfs: optimize key searches in btrfs_search_slot
On sat, 31 Aug 2013 13:54:56 +0100, Filipe David Borba Manana wrote: > When the binary search returns 0 (exact match), the target key > will necessarily be at slot 0 of all nodes below the current one, > so in this case the binary search is not needed because it will > always return 0, and we waste time doing it, holding node locks > for longer than necessary, etc. > > Below follow histograms with the times spent on the current approach of > doing a binary search when the previous binary search returned 0, and > times for the new approach, which directly picks the first item/child > node in the leaf/node. > > Current approach: > > Count: 6682 > Range: 35.000 - 8370.000; Mean: 85.837; Median: 75.000; Stddev: 106.429 > Percentiles: 90th: 124.000; 95th: 145.000; 99th: 206.000 > 35.000 - 61.080: 1235 > 61.080 - 106.053: 4207 > # > 106.053 - 183.606: 1122 ## > 183.606 - 317.341: 111 # > 317.341 - 547.959: 6 | > 547.959 - 8370.000: 1 | > > Approach proposed by this patch: > > Count: 6682 > Range: 6.000 - 135.000; Mean: 16.690; Median: 16.000; Stddev: 7.160 > Percentiles: 90th: 23.000; 95th: 27.000; 99th: 40.000 >6.000 -8.418:58 # >8.418 - 11.670: 1149 # > 11.670 - 16.046: 2418 > # > 16.046 - 21.934: 2098 ## > 21.934 - 29.854: 744 > 29.854 - 40.511: 154 ### > 40.511 - 54.848:41 # > 54.848 - 74.136: 5 | > 74.136 - 100.087: 9 | > 100.087 - 135.000: 6 | > > These samples were captured during a run of the btrfs tests 001, 002 and > 004 in the xfstests, with a leaf/node size of 4Kb. > > Signed-off-by: Filipe David Borba Manana > Signed-off-by: Josef Bacik > --- > > V2: Simplified code, removed unnecessary code. > V3: Replaced BUG_ON() with the new ASSERT() from Josef. > V4: Addressed latest comments from Zach Brown and Josef Bacik. > Surrounded all code that is used for the assertion with a > #ifdef CONFIG_BTRFS_ASSERT ... #endif block. Also changed > offset arguments to be more strictly correct. > V5: Updated histograms to reflect latest version of the code. > V6: Use single assert macro and no more #ifdef CONFIG_BTRFS_ASSERT > ... #endif logic, as suggested by Miao Xie. > > fs/btrfs/ctree.c | 39 +-- > 1 file changed, 37 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index 5fa521b..5f38157 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -2426,6 +2426,37 @@ done: > return ret; > } > > +static int key_search_validate(struct extent_buffer *b, > +struct btrfs_key *key, > +int level) > +{ > + struct btrfs_disk_key disk_key; > + unsigned long offset; > + > + btrfs_cpu_key_to_disk(&disk_key, key); > + > + if (level == 0) > + offset = offsetof(struct btrfs_leaf, items[0].key); > + else > + offset = offsetof(struct btrfs_node, ptrs[0].key); > + > + return !memcmp_extent_buffer(b, &disk_key, offset, sizeof(disk_key)); > +} Maybe I didn't explain clearly in the previous mail, what I suggested was to move "#ifdef CONFIG_BTRFS_ASSERT" out of the function, not to remove it. The final code is: #ifdef CONFIG_BTRFS_ASSERT static int key_search_validate() { } #endif static int key_search() { ... ASSERT(key_search_validate(b, key, level)); ... } If there is no "#ifdef CONFIG_BTRFS_ASSERT" wrapper around key_search_validate(), the compiler will output the unused function warning if CONFIG_BTRFS_ASSERT is not set. Thanks Miao > + > +static int key_search(struct extent_buffer *b, struct btrfs_key *key, > + int level, int *prev_cmp, int *slot) > +{ > + if (*prev_cmp != 0) { > + *prev_cmp = bin_search(b, key, level, slot); > + return *prev_cmp; > + } > + > + ASSERT(key_search_validate(b, key, level)); > + *slot = 0; > + > + return 0; > +} > + > /* > * look for key in the tree. path is filled in with nodes along the way > * if key is found, we return zero and you can find the item in the leaf > @@ -2454,6 +2485,7 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, > struct btrfs_root > int write_lock_level = 0; > u8 lowest_level = 0; > int min_write_lock_level; > + int prev_cmp; > > lowest_level = p->lowest_level; > WARN_ON(lowest_level && ins_len > 0); > @@ -2484,6 +2516,7 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, > struct btrfs_root > min_write_lock_level = write_lock_level; > > again: > + prev_cmp = -1; > /* >* we try very hard to do read locks on the root >*/ > @@ -2584,7 +2617,7 @@ cow_done: >