[PATCH v2 0/9] enhance btrfs qgroup show command
This patchset enhances the btrfs qgroup show command.

First, we restructure show_qgroups() to make it easy to add new features.
Then we add the '-p', '-c', '-l' and '-e' options to print the parent
qgroup id, child qgroup id, max referenced size and max exclusive size of
a qgroup respectively, and the '-F' and '-f' options to list the qgroups
that impact the given path.

Besides that, users may want to sort qgroups by some items. For this case
we introduce the '--sort' option, which can sort qgroups by qgroupid,
rfer, excl, max_rfer and max_excl.

Finally, since so many columns can be output that users may be confused by
the result, I add the '-t' option to print the result as a table.

You can pull this patchset from the URL:
	git://github.com/wangshilong/Btrfs-progs.git qgroup

Changelog v1->v2:
	rebase the patchset on David's integration-20130920

Wang Shilong (9):
  Btrfs-progs: restructure show_qgroups
  Btrfs-progs: introduces '-p' option to print the ID of the parent qgroups
  Btrfs-progs: introduces '-c' option to print the ID of the child qgroups
  Btrfs-progs: introduce '-l' option to print max referenced size of qgroups
  Btrfs-progs: introduce '-e' option to print max exclusive size of qgroups
  Btrfs-progs: list all qgroups impact given path(include ancestral qgroups)
  Btrfs-progs: list all qgroups impact given path(exclude ancestral qgroups)
  Btrfs-progs: enhance btrfs qgroup show to sort qgroups
  Btrfs-progs: enhance btrfs qgroup to print the result as a table

 cmds-qgroup.c |  190
 ctree.h       |   11
 qgroup.c      | 1218
 qgroup.h      |   71
 4 files changed, 1396 insertions(+), 94 deletions(-)

--
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/9] Btrfs-progs: introduces '-p' option to print the ID of the parent qgroups
From: Wang Shilong <wangsl-f...@cn.fujitsu.com>

This patch introduces the '-p' option to print the ID of the parent qgroups.
You may use it like:

	btrfs qgroup show -p <path>

For example:

	         qgroupid(2/0)
	          /       \
	         /         \
	        /           \
	qgroupid(1/0)   qgroupid(1/1)
	        \           /
	         \         /
	        qgroupid(0/1)

If we use the command:

	btrfs qgroup show -p <path>

The result will output

	0/1 -- -- 1/0,1/1
	1/0 -- -- 2/0
	1/1 -- -- 2/0
	2/0 -- -- --

Signed-off-by: Wang Shilong <wangsl-f...@cn.fujitsu.com>
Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 cmds-qgroup.c | 23 ++++++++++++++++++++---
 qgroup.c      | 22 ++++++++++++++++++++++
 qgroup.h      |  1 +
 3 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index d3c699f..96098c1 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -202,22 +202,39 @@ static int cmd_qgroup_destroy(int argc, char **argv)
 }
 
 static const char * const cmd_qgroup_show_usage[] = {
-	"btrfs qgroup show <path>",
+	"btrfs qgroup show -p <path>",
 	"Show all subvolume quota groups.",
+	"-p		print parent qgroup id",
 	NULL
 };
 
 static int cmd_qgroup_show(int argc, char **argv)
 {
+	char *path;
 	int ret = 0;
 	int fd;
 	int e;
-	char *path = argv[1];
 	DIR *dirstream = NULL;
+	int c;
 
-	if (check_argc_exact(argc, 2))
+	optind = 1;
+	while (1) {
+		c = getopt(argc, argv, "p");
+		if (c < 0)
+			break;
+		switch (c) {
+		case 'p':
+			btrfs_qgroup_setup_print_column(
+				BTRFS_QGROUP_PARENT);
+			break;
+		default:
+			usage(cmd_qgroup_show_usage);
+		}
+	}
+	if (check_argc_exact(argc - optind, 1))
 		usage(cmd_qgroup_show_usage);
 
+	path = argv[optind];
 	fd = open_file_or_dir(path, &dirstream);
 	if (fd < 0) {
 		fprintf(stderr, "ERROR: can't access '%s'\n", path);
diff --git a/qgroup.c b/qgroup.c
index bd9658e..0dbf28c 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -88,6 +88,11 @@ struct {
 		.need_print	= 1,
 	},
 	{
+		.name		= "parent",
+		.column_name	= "Parent",
+		.need_print	= 0,
+	},
+	{
 		.name		= NULL,
 		.column_name	= NULL,
 		.need_print	= 0,
@@ -108,6 +113,20 @@ void btrfs_qgroup_setup_print_column(enum btrfs_qgroup_column_enum column)
 		btrfs_qgroup_columns[i].need_print = 1;
 }
 
+static void print_parent_column(struct btrfs_qgroup *qgroup)
+{
+	struct btrfs_qgroup_list *list = NULL;
+
+	list_for_each_entry(list, &qgroup->qgroups, next_qgroup) {
+		printf("%llu/%llu", (list->qgroup)->qgroupid >> 48,
+		      ((1ll << 48) - 1) & (list->qgroup)->qgroupid);
+		if (!list_is_last(&list->next_qgroup, &qgroup->qgroups))
+			printf(",");
+	}
+	if (list_empty(&qgroup->qgroups))
+		printf("---");
+}
+
 static void print_qgroup_column(struct btrfs_qgroup *qgroup,
 				enum btrfs_qgroup_column_enum column)
 {
@@ -125,6 +144,9 @@ static void print_qgroup_column(struct btrfs_qgroup *qgroup,
 	case BTRFS_QGROUP_EXCL:
 		printf("%lld", qgroup->excl);
 		break;
+	case BTRFS_QGROUP_PARENT:
+		print_parent_column(qgroup);
+		break;
 	default:
 		break;
 	}
diff --git a/qgroup.h b/qgroup.h
index 8b34cd7..cefdfe1 100644
--- a/qgroup.h
+++ b/qgroup.h
@@ -26,6 +26,7 @@ enum btrfs_qgroup_column_enum {
 	BTRFS_QGROUP_QGROUPID,
 	BTRFS_QGROUP_RFER,
 	BTRFS_QGROUP_EXCL,
+	BTRFS_QGROUP_PARENT,
 	BTRFS_QGROUP_ALL,
 };
 
--
1.8.3.1
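The parent column is printed as "level/id", which the patch derives from the
64-bit qgroupid with a shift and a mask. As a minimal standalone sketch of
that split (the helper names qgroup_level(), qgroup_id() and
format_qgroupid() are made up for illustration and are not part of
btrfs-progs):

```c
#include <stdint.h>
#include <stdio.h>

/* As in the patch: the bits above bit 47 are the level, the low 48 the id. */
#define QGROUP_LEVEL_SHIFT 48

/* Hypothetical helper: extract the level part of a qgroupid. */
static uint64_t qgroup_level(uint64_t qgroupid)
{
	return qgroupid >> QGROUP_LEVEL_SHIFT;
}

/* Hypothetical helper: extract the id part of a qgroupid. */
static uint64_t qgroup_id(uint64_t qgroupid)
{
	return qgroupid & ((1ULL << QGROUP_LEVEL_SHIFT) - 1);
}

/* Hypothetical helper: format a qgroupid as "level/id" into buf. */
static int format_qgroupid(char *buf, size_t len, uint64_t qgroupid)
{
	return snprintf(buf, len, "%llu/%llu",
			(unsigned long long)qgroup_level(qgroupid),
			(unsigned long long)qgroup_id(qgroupid));
}
```

With these helpers, a qgroupid of (1ULL << 48) | 5 would be formatted as
"1/5", the same notation used in the example output above.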
[PATCH v2 5/9] Btrfs-progs: introduce '-e' option to print max exclusive size of qgroups
From: Wang Shilong <wangsl-f...@cn.fujitsu.com>

This patch introduces the '-e' option to print the max exclusive size of
qgroups. You may use it like this:

	btrfs qgroup show -e <path>

Signed-off-by: Wang Shilong <wangsl-f...@cn.fujitsu.com>
Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 cmds-qgroup.c | 9 +++++++--
 qgroup.c      | 8 ++++++++
 qgroup.h      | 1 +
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index 32537f9..0bfca33 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -202,11 +202,12 @@ static int cmd_qgroup_destroy(int argc, char **argv)
 }
 
 static const char * const cmd_qgroup_show_usage[] = {
-	"btrfs qgroup show -pcl <path>",
+	"btrfs qgroup show -pcle <path>",
 	"Show all subvolume quota groups.",
 	"-p		print parent qgroup id",
 	"-c		print child qgroup id",
 	"-l		print max referenced size of qgroup",
+	"-e		print max exclusive size of qgroup",
 	NULL
 };
 
@@ -221,7 +222,7 @@ static int cmd_qgroup_show(int argc, char **argv)
 
 	optind = 1;
 	while (1) {
-		c = getopt(argc, argv, "pcl");
+		c = getopt(argc, argv, "pcle");
 		if (c < 0)
 			break;
 		switch (c) {
@@ -237,6 +238,10 @@ static int cmd_qgroup_show(int argc, char **argv)
 			btrfs_qgroup_setup_print_column(
 				BTRFS_QGROUP_MAX_RFER);
 			break;
+		case 'e':
+			btrfs_qgroup_setup_print_column(
+				BTRFS_QGROUP_MAX_EXCL);
+			break;
 		default:
 			usage(cmd_qgroup_show_usage);
 		}
diff --git a/qgroup.c b/qgroup.c
index f9eb52d..2cd37b1 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -92,6 +92,11 @@ struct {
 		.need_print	= 0,
 	},
 	{
+		.name		= "max_excl",
+		.column_name	= "Max_excl",
+		.need_print	= 0,
+	},
+	{
 		.name		= "parent",
 		.column_name	= "Parent",
 		.need_print	= 0,
@@ -173,6 +178,9 @@ static void print_qgroup_column(struct btrfs_qgroup *qgroup,
 	case BTRFS_QGROUP_MAX_RFER:
 		printf("%llu", qgroup->max_rfer);
 		break;
+	case BTRFS_QGROUP_MAX_EXCL:
+		printf("%llu", qgroup->max_excl);
+		break;
 	case BTRFS_QGROUP_CHILD:
 		print_child_column(qgroup);
 		break;
diff --git a/qgroup.h b/qgroup.h
index 168fac0..e7a65ba 100644
--- a/qgroup.h
+++ b/qgroup.h
@@ -27,6 +27,7 @@ enum btrfs_qgroup_column_enum {
 	BTRFS_QGROUP_RFER,
 	BTRFS_QGROUP_EXCL,
 	BTRFS_QGROUP_MAX_RFER,
+	BTRFS_QGROUP_MAX_EXCL,
 	BTRFS_QGROUP_PARENT,
 	BTRFS_QGROUP_CHILD,
 	BTRFS_QGROUP_ALL,
--
1.8.3.1
[PATCH v2 7/9] Btrfs-progs: list all qgroups impact given path(exclude ancestral qgroups)
From: Wang Shilong <wangsl-f...@cn.fujitsu.com>

This patch introduces the '-f' option, which filters the qgroups by path
name. You may use it like:

	btrfs qgroup show -f <path>

For example:

	              qgroupid(2/0)
	               /     \
	              /       \
	          qgroupid(1/0)
	           /         \
	          /           \
	         /             \
	qgroupid(0/1)     qgroupid(0/2)
	    sub1              sub2
	   /    \
	  /      \
	dir1    file1

If we use the command:

	btrfs qgroup show -f sub1/dir1

The result will output

	0/1 -- --

The '-f' option lists all qgroups that impact the given path
(excluding ancestral qgroups).

Signed-off-by: Wang Shilong <wangsl-f...@cn.fujitsu.com>
Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 cmds-qgroup.c | 18 ++++++++++++++----
 qgroup.c      | 16 +++++++++++++++-
 qgroup.h      |  1 +
 3 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index 5480d2a..bcf0487 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -210,6 +210,8 @@ static const char * const cmd_qgroup_show_usage[] = {
 	"-e		print max exclusive size of qgroup",
 	"-F		list all qgroups which impact the given path"
 	"(include ancestral qgroups)",
+	"-f		list all qgroups which impact the given path"
+	"(exclude ancestral qgroups)",
 	NULL
 };
 
@@ -229,7 +231,7 @@ static int cmd_qgroup_show(int argc, char **argv)
 
 	optind = 1;
 	while (1) {
-		c = getopt(argc, argv, "pcleF");
+		c = getopt(argc, argv, "pcleFf");
 		if (c < 0)
 			break;
 		switch (c) {
@@ -252,6 +254,9 @@ static int cmd_qgroup_show(int argc, char **argv)
 		case 'F':
 			filter_flag |= 0x1;
 			break;
+		case 'f':
+			filter_flag |= 0x2;
+			break;
 		default:
 			usage(cmd_qgroup_show_usage);
 		}
@@ -268,9 +273,14 @@ static int cmd_qgroup_show(int argc, char **argv)
 
 	if (filter_flag) {
 		qgroupid = btrfs_get_path_rootid(fd);
-		btrfs_qgroup_setup_filter(&filter_set,
-				BTRFS_QGROUP_FILTER_ALL_PARENT,
-				qgroupid);
+		if (filter_flag & 0x1)
+			btrfs_qgroup_setup_filter(&filter_set,
+					BTRFS_QGROUP_FILTER_ALL_PARENT,
+					qgroupid);
+		if (filter_flag & 0x2)
+			btrfs_qgroup_setup_filter(&filter_set,
+					BTRFS_QGROUP_FILTER_PARENT,
+					qgroupid);
 	}
 	ret = btrfs_show_qgroups(fd, filter_set);
 	e = errno;
diff --git a/qgroup.c b/qgroup.c
index 306b638..28772d6 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -468,6 +468,18 @@ static int filter_all_parent_insert(struct qgroup_lookup *sort_tree,
 	return 0;
 }
 
+static int filter_by_parent(struct btrfs_qgroup *bq, u64 data)
+{
+	struct btrfs_qgroup *qgroup =
+		(struct btrfs_qgroup *)(unsigned long)data;
+
+	if (data == 0)
+		return 0;
+	if (qgroup->qgroupid == bq->qgroupid)
+		return 1;
+	return 0;
+}
+
 static int filter_by_all_parent(struct btrfs_qgroup *bq, u64 data)
 {
 	struct qgroup_lookup lookup;
@@ -502,6 +514,7 @@ static int filter_by_all_parent(struct btrfs_qgroup *bq, u64 data)
 }
 
 static btrfs_qgroup_filter_func all_filter_funcs[] = {
+	[BTRFS_QGROUP_FILTER_PARENT]		= filter_by_parent,
 	[BTRFS_QGROUP_FILTER_ALL_PARENT]	= filter_by_all_parent,
 };
 
@@ -586,7 +599,8 @@ static void pre_process_filter_set(struct qgroup_lookup *lookup,
 
 	for (i = 0; i < set->nfilters; i++) {
 
-		if (set->filters[i].filter_func == filter_by_all_parent) {
+		if (set->filters[i].filter_func == filter_by_all_parent
+		    || set->filters[i].filter_func == filter_by_parent) {
 			qgroup_for_filter = qgroup_tree_search(lookup,
 					    set->filters[i].data);
 			set->filters[i].data =
diff --git a/qgroup.h b/qgroup.h
index bcc2b4b..5fcdd8a 100644
--- a/qgroup.h
+++ b/qgroup.h
@@ -49,6 +49,7 @@ enum btrfs_qgroup_column_enum {
 };
 
 enum btrfs_qgroup_filter_enum {
+	BTRFS_QGROUP_FILTER_PARENT,
 	BTRFS_QGROUP_FILTER_ALL_PARENT,
 	BTRFS_QGROUP_FILTER_MAX,
 };
--
1.8.3.1
[PATCH v2 1/9] Btrfs-progs: restructure show_qgroups
From: Wang Shilong <wangsl-f...@cn.fujitsu.com>

The current show_qgroups() shows only a little information, and it is hard
to add the functions that users will need in the future, so I restructure
it to make adding new functions easy.

In order to improve the extensibility of show_qgroups(), I add some
important structures:

	struct qgroup_lookup {
		struct rb_root root;
	}

	/*
	 *store qgroup's information
	 */
	struct btrfs_qgroup {
		struct rb_node rb_node;
		u64 qgroupid;

		u64 generation;
		u64 rfer;
		u64 rfer_cmpr;
		u64 excl_cmpr;

		u64 flags;
		u64 max_rfer;
		u64 max_excl;
		u64 rsv_rfer;
		u64 rsv_excl;

		struct list_head qgroups;
		struct list_head members;
	}

	/*
	 *glue structure to represent the relations
	 *between qgroups
	 */
	struct btrfs_qgroup_list {
		struct list_head next_qgroups;
		struct list_head next_member;
		struct btrfs_qgroup *qgroup;
		struct btrfs_qgroup *member;
	}

The above 3 structures are used to manage all the information of qgroups.

	struct {
		char *name;
		char *column_name;
		int need_print;
	} btrfs_qgroup_columns[]

We define an array to manage all the columns that can be output, and use a
member variable (->need_print) to control the output of the corresponding
column. Some columns are output by default, but we can change that
according to the users' requirements.

For example:
	If outputting the max referenced size of a qgroup is needed, the
	function 'btrfs_qgroup_setup_print_column()' will be called with
	the parameter 'BTRFS_QGROUP_MAX_RFER' (to be added in the future).
	After that, the max referenced size of each qgroup is output
	whenever qgroups are shown.

Signed-off-by: Wang Shilong <wangsl-f...@cn.fujitsu.com>
Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 cmds-qgroup.c |  91
 ctree.h       |  11
 qgroup.c      | 509
 qgroup.h      |  10
 4 files changed, 531 insertions(+), 90 deletions(-)

diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index ff2a1fa..d3c699f 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -106,95 +106,6 @@ static int qgroup_create(int create, int argc, char **argv)
 	return 0;
 }
 
-static void print_qgroup_info(u64 objectid, struct btrfs_qgroup_info_item *info)
-{
-	printf("%llu/%llu %lld %lld\n", objectid >> 48,
-		objectid & ((1ll << 48) - 1),
-		btrfs_stack_qgroup_info_referenced(info),
-		btrfs_stack_qgroup_info_exclusive(info));
-}
-
-static int list_qgroups(int fd)
-{
-	int ret;
-	struct btrfs_ioctl_search_args args;
-	struct btrfs_ioctl_search_key *sk = &args.key;
-	struct btrfs_ioctl_search_header *sh;
-	unsigned long off = 0;
-	unsigned int i;
-	struct btrfs_qgroup_info_item *info;
-
-	memset(&args, 0, sizeof(args));
-
-	/* search in the quota tree */
-	sk->tree_id = BTRFS_QUOTA_TREE_OBJECTID;
-
-	/*
-	 * set the min and max to backref keys.  The search will
-	 * only send back this type of key now.
-	 */
-	sk->max_type = BTRFS_QGROUP_INFO_KEY;
-	sk->min_type = BTRFS_QGROUP_INFO_KEY;
-	sk->max_objectid = 0;
-	sk->max_offset = (u64)-1;
-	sk->max_transid = (u64)-1;
-
-	/* just a big number, doesn't matter much */
-	sk->nr_items = 4096;
-
-	while (1) {
-		ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
-		if (ret < 0)
-			return ret;
-
-		/* the ioctl returns the number of item it found in nr_items */
-		if (sk->nr_items == 0)
-			break;
-
-		off = 0;
-
-		/*
-		 * for each item, pull the key out of the header and then
-		 * read the root_ref item it contains
-		 */
-		for (i = 0; i < sk->nr_items; i++) {
-			sh = (struct btrfs_ioctl_search_header *)(args.buf +
-								  off);
-			off += sizeof(*sh);
-
-			if (sh->objectid != 0)
-				goto done;
-
-			if (sh->type != BTRFS_QGROUP_INFO_KEY)
-				goto done;
-
-			info = (struct btrfs_qgroup_info_item *)
-					(args.buf + off);
-			print_qgroup_info(sh->offset, info);
-
-			off += sh->len;
-
-			/*
-			 * record the mins in sk so we can make sure the
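The column table described in the commit message can be pictured with a
small standalone sketch. It only mirrors the idea of a per-column
need_print flag that is switched on by a setup call; the names
(column_id, columns, setup_print_column, build_header) are hypothetical
and this is not the code from the patch:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical column identifiers; COL_ALL doubles as the column count. */
enum column_id { COL_QGROUPID, COL_RFER, COL_EXCL, COL_MAX_RFER, COL_ALL };

struct column {
	const char *name;
	int need_print;		/* 1: column is shown, 0: column is hidden */
};

/* The first three columns are shown by default; max_rfer is optional. */
static struct column columns[COL_ALL] = {
	[COL_QGROUPID]	= { "qgroupid", 1 },
	[COL_RFER]	= { "rfer", 1 },
	[COL_EXCL]	= { "excl", 1 },
	[COL_MAX_RFER]	= { "max_rfer", 0 },
};

/* Enable one column, or every column when COL_ALL is passed (assumed). */
static void setup_print_column(enum column_id col)
{
	int i;

	if (col < COL_ALL) {
		columns[col].need_print = 1;
		return;
	}
	for (i = 0; i < COL_ALL; i++)
		columns[i].need_print = 1;
}

/* Write the names of all enabled columns, space separated, into buf. */
static void build_header(char *buf, size_t len)
{
	size_t used = 0;
	int i;

	if (len == 0)
		return;
	buf[0] = '\0';
	for (i = 0; i < COL_ALL; i++) {
		int n;

		if (!columns[i].need_print)
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " " : "", columns[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* buffer too small, stop here */
		used += (size_t)n;
	}
}
```

The point of the design is that an option handler only flips a flag; the
printing loop itself never needs to know which options exist.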
[PATCH v2 8/9] Btrfs-progs: enhance btrfs qgroup show to sort qgroups
From: Wang Shilong <wangsl-f...@cn.fujitsu.com>

You might want to list qgroups ordered by some items, such as 'qgroupid',
'rfer' and so on. For that you can use '--sort'. The qgroups can now be
sorted by 'qgroupid', 'rfer', 'excl', 'max_rfer' and 'max_excl'.

For example, if you want to list qgroups in order of 'qgroupid', you can
use the option like this:

	btrfs qgroup show --sort=+/-qgroupid <path>

Here, '+' means the result is sorted in ascending order and '-' means
descending order. If you specify neither '+' nor '-', the result is sorted
in the default order, which is ascending.

If you want to combine sort items, you can do it like this:

	btrfs qgroup show --sort=-qgroupid,+rfer,max_rfer,excl <path>

Signed-off-by: Wang Shilong <wangsl-f...@cn.fujitsu.com>
Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 cmds-qgroup.c |  25
 qgroup.c      | 256
 qgroup.h      |  33
 3 files changed, 302 insertions(+), 12 deletions(-)

diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index bcf0487..780fb21 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -202,7 +202,8 @@ static int cmd_qgroup_destroy(int argc, char **argv)
 }
 
 static const char * const cmd_qgroup_show_usage[] = {
-	"btrfs qgroup show -pcleF <path>",
+	"btrfs qgroup show -pcleF "
+	"[--sort=qgroupid,rfer,excl,max_rfer,max_excl] <path>",
 	"Show subvolume quota groups.",
 	"-p		print parent qgroup id",
 	"-c		print child qgroup id",
@@ -212,6 +213,11 @@ static const char * const cmd_qgroup_show_usage[] = {
 	"(include ancestral qgroups)",
 	"-f		list all qgroups which impact the given path"
 	"(exclude ancestral qgroups)",
+	"--sort=qgroupid,rfer,excl,max_rfer,max_excl",
+	"		list qgroups in order of qgroupid,"
+	"rfer,max_rfer or max_excl",
+	"		you can use '+' or '-' in front of each item.",
+	"		(+:ascending, -:descending, ascending default",
 	NULL
 };
 
@@ -226,12 +232,19 @@ static int cmd_qgroup_show(int argc, char **argv)
 	u64 qgroupid;
 	int filter_flag = 0;
 
+	struct btrfs_qgroup_comparer_set *comparer_set;
 	struct btrfs_qgroup_filter_set *filter_set;
 	filter_set = btrfs_qgroup_alloc_filter_set();
+	comparer_set = btrfs_qgroup_alloc_comparer_set();
+	struct option long_options[] = {
+		{"sort", 1, NULL, 'S'},
+		{0, 0, 0, 0}
+	};
 
 	optind = 1;
 	while (1) {
-		c = getopt(argc, argv, "pcleFf");
+		c = getopt_long(argc, argv, "pcleFf",
+				long_options, NULL);
 		if (c < 0)
 			break;
 		switch (c) {
@@ -257,6 +270,12 @@ static int cmd_qgroup_show(int argc, char **argv)
 		case 'f':
 			filter_flag |= 0x2;
 			break;
+		case 'S':
+			ret = btrfs_qgroup_parse_sort_string(optarg,
+							     &comparer_set);
+			if (ret)
+				usage(cmd_qgroup_show_usage);
+			break;
 		default:
 			usage(cmd_qgroup_show_usage);
 		}
@@ -282,7 +301,7 @@ static int cmd_qgroup_show(int argc, char **argv)
 					BTRFS_QGROUP_FILTER_PARENT,
 					qgroupid);
 	}
-	ret = btrfs_show_qgroups(fd, filter_set);
+	ret = btrfs_show_qgroups(fd, filter_set, comparer_set);
 	e = errno;
 	close_file_or_dir(fd, dirstream);
 	if (ret < 0)
diff --git a/qgroup.c b/qgroup.c
index 28772d6..84f5fc1 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -22,6 +22,7 @@
 #include "ioctl.h"
 
 #define BTRFS_QGROUP_NFILTERS_INCREASE (2 * BTRFS_QGROUP_FILTER_MAX)
+#define BTRFS_QGROUP_NCOMPS_INCREASE (2 * BTRFS_QGROUP_COMP_MAX)
 
 struct qgroup_lookup {
 	struct rb_root root;
@@ -122,6 +123,7 @@ struct {
 };
 
 static btrfs_qgroup_filter_func all_filter_funcs[];
+static btrfs_qgroup_comp_func all_comp_funcs[];
 
 void btrfs_qgroup_setup_print_column(enum btrfs_qgroup_column_enum column)
 {
@@ -236,6 +238,188 @@ static int comp_entry_with_qgroupid(struct btrfs_qgroup *entry1,
 	return is_descending ? -ret : ret;
 }
 
+static int comp_entry_with_rfer(struct btrfs_qgroup *entry1,
+				struct btrfs_qgroup *entry2,
+				int is_descending)
+{
+	int ret;
+
+	if (entry1->rfer > entry2->rfer)
+		ret = 1;
+	else if (entry1->rfer < entry2->rfer)
+		ret = -1;
+	else
+		ret = 0;
+
+	return is_descending ? -ret : ret;
+}
+
+static int comp_entry_with_excl(struct btrfs_qgroup *entry1,
+				struct btrfs_qgroup *entry2,
+				int
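To make the '--sort' syntax concrete, here is a minimal standalone sketch
of how a string such as "-qgroupid,+rfer,max_rfer,excl" could be split into
(key, direction) pairs and then used as a chained comparison. It is an
illustration only, not the btrfs_qgroup_parse_sort_string() from the patch;
all names in it (sort_key, sort_item, parse_sort_string, compare_values)
are assumptions:

```c
#include <string.h>

/* Hypothetical sort keys, in the same order as the names below. */
enum sort_key { SORT_QGROUPID, SORT_RFER, SORT_EXCL,
		SORT_MAX_RFER, SORT_MAX_EXCL, SORT_KEY_MAX };

static const char *const sort_key_names[SORT_KEY_MAX] = {
	"qgroupid", "rfer", "excl", "max_rfer", "max_excl",
};

struct sort_item {
	enum sort_key key;
	int is_descending;	/* set by a leading '-', cleared otherwise */
};

/*
 * Parse a comma separated sort specification. Returns the number of
 * items stored, or -1 for an unknown key or too many items.
 */
static int parse_sort_string(const char *spec, struct sort_item *items,
			     int max_items)
{
	int count = 0;

	while (*spec) {
		size_t len = strcspn(spec, ",");
		int descending = 0;
		int k;

		/* An optional sign selects the direction; '+' is default. */
		if (*spec == '+' || *spec == '-') {
			descending = (*spec == '-');
			spec++;
			len--;
		}
		for (k = 0; k < SORT_KEY_MAX; k++)
			if (strlen(sort_key_names[k]) == len &&
			    strncmp(spec, sort_key_names[k], len) == 0)
				break;
		if (k == SORT_KEY_MAX || count == max_items)
			return -1;
		items[count].key = (enum sort_key)k;
		items[count].is_descending = descending;
		count++;
		spec += len;
		if (*spec == ',')
			spec++;
	}
	return count;
}

/*
 * Compare two rows of values (indexed by sort_key) using the items in
 * order; the first key that differs decides the result.
 */
static int compare_values(const unsigned long long *a,
			  const unsigned long long *b,
			  const struct sort_item *items, int nitems)
{
	int i;

	for (i = 0; i < nitems; i++) {
		unsigned long long x = a[items[i].key];
		unsigned long long y = b[items[i].key];
		int ret;

		if (x == y)
			continue;
		ret = x > y ? 1 : -1;
		return items[i].is_descending ? -ret : ret;
	}
	return 0;
}
```

Chaining the comparers this way means later keys only matter as tie
breakers, which matches the combined example in the commit message.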
[PATCH v2 6/9] Btrfs-progs: list all qgroups impact given path(include ancestral qgroups)
From: Wang Shilong <wangsl-f...@cn.fujitsu.com>

This patch introduces the '-F' option, which filters the qgroups by path
name. You may use it like:

	btrfs qgroup show -F <path>

For example:

	              qgroupid(2/0)
	               /     \
	              /       \
	          qgroupid(1/0)
	           /         \
	          /           \
	         /             \
	qgroupid(0/1)     qgroupid(0/2)
	    sub1              sub2
	   /    \
	  /      \
	dir1    file1

If we use the command:

	btrfs qgroup show -F sub1/dir1

The result will output

	0/1 -- --
	1/0 -- --
	2/0 -- --

The '-F' option lists all qgroups that impact the given path
(including ancestral qgroups).

Signed-off-by: Wang Shilong <wangsl-f...@cn.fujitsu.com>
Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 cmds-qgroup.c |  24
 qgroup.c      | 239
 qgroup.h      |  28
 3 files changed, 281 insertions(+), 10 deletions(-)

diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index 0bfca33..5480d2a 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -202,12 +202,14 @@ static int cmd_qgroup_destroy(int argc, char **argv)
 }
 
 static const char * const cmd_qgroup_show_usage[] = {
-	"btrfs qgroup show -pcle <path>",
-	"Show all subvolume quota groups.",
+	"btrfs qgroup show -pcleF <path>",
+	"Show subvolume quota groups.",
 	"-p		print parent qgroup id",
 	"-c		print child qgroup id",
 	"-l		print max referenced size of qgroup",
 	"-e		print max exclusive size of qgroup",
+	"-F		list all qgroups which impact the given path"
+	"(include ancestral qgroups)",
 	NULL
 };
 
@@ -219,10 +221,15 @@ static int cmd_qgroup_show(int argc, char **argv)
 	int e;
 	DIR *dirstream = NULL;
 	int c;
+	u64 qgroupid;
+	int filter_flag = 0;
+
+	struct btrfs_qgroup_filter_set *filter_set;
+	filter_set = btrfs_qgroup_alloc_filter_set();
 
 	optind = 1;
 	while (1) {
-		c = getopt(argc, argv, "pcle");
+		c = getopt(argc, argv, "pcleF");
 		if (c < 0)
 			break;
 		switch (c) {
@@ -242,6 +249,9 @@ static int cmd_qgroup_show(int argc, char **argv)
 			btrfs_qgroup_setup_print_column(
 				BTRFS_QGROUP_MAX_EXCL);
 			break;
+		case 'F':
+			filter_flag |= 0x1;
+			break;
 		default:
 			usage(cmd_qgroup_show_usage);
 		}
@@ -256,7 +266,13 @@ static int cmd_qgroup_show(int argc, char **argv)
 		return 1;
 	}
 
-	ret = btrfs_show_qgroups(fd);
+	if (filter_flag) {
+		qgroupid = btrfs_get_path_rootid(fd);
+		btrfs_qgroup_setup_filter(&filter_set,
+				BTRFS_QGROUP_FILTER_ALL_PARENT,
+				qgroupid);
+	}
+	ret = btrfs_show_qgroups(fd, filter_set);
 	e = errno;
 	close_file_or_dir(fd, dirstream);
 	if (ret < 0)
diff --git a/qgroup.c b/qgroup.c
index 2cd37b1..306b638 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -21,12 +21,20 @@
 #include "ctree.h"
 #include "ioctl.h"
 
+#define BTRFS_QGROUP_NFILTERS_INCREASE (2 * BTRFS_QGROUP_FILTER_MAX)
+
 struct qgroup_lookup {
 	struct rb_root root;
 };
 
 struct btrfs_qgroup {
 	struct rb_node rb_node;
+	struct rb_node sort_node;
+	/*
+	 *all_parent_node is used to
+	 *filter a qgroup's all parent
+	 */
+	struct rb_node all_parent_node;
 	u64 qgroupid;
 
 	/*
@@ -113,6 +121,8 @@ struct {
 	},
 };
 
+static btrfs_qgroup_filter_func all_filter_funcs[];
+
 void btrfs_qgroup_setup_print_column(enum btrfs_qgroup_column_enum column)
 {
 	int i;
@@ -433,6 +443,205 @@ void __free_all_qgroups(struct qgroup_lookup *root_tree)
 	}
 }
 
+static int filter_all_parent_insert(struct qgroup_lookup *sort_tree,
+				    struct btrfs_qgroup *bq)
+{
+	struct rb_node **p = &sort_tree->root.rb_node;
+	struct rb_node *parent = NULL;
+	struct btrfs_qgroup *curr;
+	int ret;
+
+	while (*p) {
+		parent = *p;
+		curr = rb_entry(parent, struct btrfs_qgroup, all_parent_node);
+
+		ret = comp_entry_with_qgroupid(bq, curr, 0);
+		if (ret < 0)
+			p = &(*p)->rb_left;
+		else if (ret > 0)
+			p = &(*p)->rb_right;
+		else
+			return -EEXIST;
+	}
+	rb_link_node(&bq->all_parent_node, parent, p);
+	rb_insert_color(&bq->all_parent_node,
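The effect of '-F' in the example above (0/1, 1/0 and 2/0 are listed, 0/2
is not) amounts to collecting a qgroup together with everything reachable
through its parent links. The patch does this with rb-trees; the sketch
below shows the same idea on a plain array, and every name in it (qg_node,
mark_with_ancestors) is hypothetical:

```c
#define MAX_PARENTS 4

/* Hypothetical, simplified qgroup: parents are indexes into an array. */
struct qg_node {
	int nparents;
	int parents[MAX_PARENTS];
};

/*
 * Mark 'start' and every qgroup reachable from it through parent links.
 * 'selected' must provide one zero-initialized byte per node.
 */
static void mark_with_ancestors(const struct qg_node *nodes, int start,
				unsigned char *selected)
{
	int i;

	if (selected[start])
		return;		/* already reached through another parent */
	selected[start] = 1;
	for (i = 0; i < nodes[start].nparents; i++)
		mark_with_ancestors(nodes, nodes[start].parents[i], selected);
}
```

The early return on an already selected node is what keeps a shared
ancestor from being processed twice, which is presumably also why the
patch returns -EEXIST when inserting a duplicate into its tree.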
[PATCH v2 4/9] Btrfs-progs: introduce '-l' option to print max referenced size of qgroups
From: Wang Shilong <wangsl-f...@cn.fujitsu.com>

This patch introduces the '-l' option to print the max referenced size of
qgroups. You may use it like:

	btrfs qgroup show -l <path>

Signed-off-by: Wang Shilong <wangsl-f...@cn.fujitsu.com>
Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 cmds-qgroup.c | 9 +++++++--
 qgroup.c      | 7 +++++++
 qgroup.h      | 1 +
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index 147bedc..32537f9 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -202,10 +202,11 @@ static int cmd_qgroup_destroy(int argc, char **argv)
 }
 
 static const char * const cmd_qgroup_show_usage[] = {
-	"btrfs qgroup show -pc <path>",
+	"btrfs qgroup show -pcl <path>",
 	"Show all subvolume quota groups.",
 	"-p		print parent qgroup id",
 	"-c		print child qgroup id",
+	"-l		print max referenced size of qgroup",
 	NULL
 };
 
@@ -220,7 +221,7 @@ static int cmd_qgroup_show(int argc, char **argv)
 
 	optind = 1;
 	while (1) {
-		c = getopt(argc, argv, "pc");
+		c = getopt(argc, argv, "pcl");
 		if (c < 0)
 			break;
 		switch (c) {
@@ -232,6 +233,10 @@ static int cmd_qgroup_show(int argc, char **argv)
 			btrfs_qgroup_setup_print_column(
 				BTRFS_QGROUP_CHILD);
 			break;
+		case 'l':
+			btrfs_qgroup_setup_print_column(
+				BTRFS_QGROUP_MAX_RFER);
+			break;
 		default:
 			usage(cmd_qgroup_show_usage);
 		}
diff --git a/qgroup.c b/qgroup.c
index 1592dd4..f9eb52d 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -87,6 +87,10 @@ struct {
 		.column_name	= "Excl",
 		.need_print	= 1,
 	},
+	{	.name		= "max_rfer",
+		.column_name	= "Max_rfer",
+		.need_print	= 0,
+	},
 	{
 		.name		= "parent",
 		.column_name	= "Parent",
@@ -166,6 +170,9 @@ static void print_qgroup_column(struct btrfs_qgroup *qgroup,
 	case BTRFS_QGROUP_PARENT:
 		print_parent_column(qgroup);
 		break;
+	case BTRFS_QGROUP_MAX_RFER:
+		printf("%llu", qgroup->max_rfer);
+		break;
 	case BTRFS_QGROUP_CHILD:
 		print_child_column(qgroup);
 		break;
diff --git a/qgroup.h b/qgroup.h
index 33682ae..168fac0 100644
--- a/qgroup.h
+++ b/qgroup.h
@@ -26,6 +26,7 @@ enum btrfs_qgroup_column_enum {
 	BTRFS_QGROUP_QGROUPID,
 	BTRFS_QGROUP_RFER,
 	BTRFS_QGROUP_EXCL,
+	BTRFS_QGROUP_MAX_RFER,
 	BTRFS_QGROUP_PARENT,
 	BTRFS_QGROUP_CHILD,
 	BTRFS_QGROUP_ALL,
--
1.8.3.1
[PATCH v2 3/9] Btrfs-progs: introduces '-c' option to print the ID of the child qgroups
From: Wang Shilong <wangsl-f...@cn.fujitsu.com>

This patch introduces the '-c' option to print the ID of the child qgroups.
You may use it like:

	btrfs qgroup show -c <path>

For example:

	         qgroupid(2/0)
	          /       \
	         /         \
	        /           \
	qgroupid(1/0)   qgroupid(1/1)
	        \           /
	         \         /
	        qgroupid(0/1)

If we use the command:

	btrfs qgroup show -c <path>

The result will output

	0/1 -- -- --
	1/0 -- -- 0/1
	1/1 -- -- 0/1
	2/0 -- -- 1/0,1/1

Signed-off-by: Wang Shilong <wangsl-f...@cn.fujitsu.com>
Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 cmds-qgroup.c |  9 +++++++--
 qgroup.c      | 22 ++++++++++++++++++++++
 qgroup.h      |  1 +
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index 96098c1..147bedc 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -202,9 +202,10 @@ static int cmd_qgroup_destroy(int argc, char **argv)
 }
 
 static const char * const cmd_qgroup_show_usage[] = {
-	"btrfs qgroup show -p <path>",
+	"btrfs qgroup show -pc <path>",
 	"Show all subvolume quota groups.",
 	"-p		print parent qgroup id",
+	"-c		print child qgroup id",
 	NULL
 };
 
@@ -219,7 +220,7 @@ static int cmd_qgroup_show(int argc, char **argv)
 
 	optind = 1;
 	while (1) {
-		c = getopt(argc, argv, "p");
+		c = getopt(argc, argv, "pc");
 		if (c < 0)
 			break;
 		switch (c) {
@@ -227,6 +228,10 @@ static int cmd_qgroup_show(int argc, char **argv)
 			btrfs_qgroup_setup_print_column(
 				BTRFS_QGROUP_PARENT);
 			break;
+		case 'c':
+			btrfs_qgroup_setup_print_column(
+				BTRFS_QGROUP_CHILD);
+			break;
 		default:
 			usage(cmd_qgroup_show_usage);
 		}
diff --git a/qgroup.c b/qgroup.c
index 0dbf28c..1592dd4 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -93,6 +93,11 @@ struct {
 		.need_print	= 0,
 	},
 	{
+		.name		= "child",
+		.column_name	= "Child",
+		.need_print	= 0,
+	},
+	{
 		.name		= NULL,
 		.column_name	= NULL,
 		.need_print	= 0,
@@ -127,6 +132,20 @@ static void print_parent_column(struct btrfs_qgroup *qgroup)
 		printf("---");
 }
 
+static void print_child_column(struct btrfs_qgroup *qgroup)
+{
+	struct btrfs_qgroup_list *list = NULL;
+
+	list_for_each_entry(list, &qgroup->members, next_member) {
+		printf("%llu/%llu", (list->member)->qgroupid >> 48,
+		      ((1ll << 48) - 1) & (list->member)->qgroupid);
+		if (!list_is_last(&list->next_member, &qgroup->members))
+			printf(",");
+	}
+	if (list_empty(&qgroup->members))
+		printf("---");
+}
+
 static void print_qgroup_column(struct btrfs_qgroup *qgroup,
 				enum btrfs_qgroup_column_enum column)
 {
@@ -147,6 +166,9 @@ static void print_qgroup_column(struct btrfs_qgroup *qgroup,
 	case BTRFS_QGROUP_PARENT:
 		print_parent_column(qgroup);
 		break;
+	case BTRFS_QGROUP_CHILD:
+		print_child_column(qgroup);
+		break;
 	default:
 		break;
 	}
diff --git a/qgroup.h b/qgroup.h
index cefdfe1..33682ae 100644
--- a/qgroup.h
+++ b/qgroup.h
@@ -27,6 +27,7 @@ enum btrfs_qgroup_column_enum {
 	BTRFS_QGROUP_RFER,
 	BTRFS_QGROUP_EXCL,
 	BTRFS_QGROUP_PARENT,
+	BTRFS_QGROUP_CHILD,
 	BTRFS_QGROUP_ALL,
 };
 
--
1.8.3.1
[PATCH v2 9/9] Btrfs-progs: enhance btrfs qgroup to print the result as a table
From: Wang Shilong <wangsl-f...@cn.fujitsu.com>

This patch introduces the '-t' option, which prints the result as a table.
You can use it like:

	btrfs qgroup show -t <path>

However, to keep the table readable, '-p' and '-c' must not be given at
the same time together with '-t'. If you still want to show both of them
at once, print the result without the '-t' option.

For example:

	btrfs qgroup show -tpl <path>

The result will be output in the following format:

qgroupid rfer       excl       max_excl       parent
-------- ----       ----       --------       ------
0/265    1289752576 1289752576 0              ---
1/0      0          0          10999511627776 2/0,3/0
2/0      0          0          0              ---
3/0      0          0          0              ---

Signed-off-by: Wang Shilong <wangsl-f...@cn.fujitsu.com>
Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 cmds-qgroup.c |  20
 qgroup.c      | 179
 qgroup.h      |   3
 3 files changed, 191 insertions(+), 11 deletions(-)

diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index 780fb21..1f71e3c 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -202,13 +202,14 @@ static int cmd_qgroup_destroy(int argc, char **argv)
 }
 
 static const char * const cmd_qgroup_show_usage[] = {
-	"btrfs qgroup show -pcleF "
+	"btrfs qgroup show -pcleFt "
 	"[--sort=qgroupid,rfer,excl,max_rfer,max_excl] <path>",
 	"Show subvolume quota groups.",
 	"-p		print parent qgroup id",
 	"-c		print child qgroup id",
 	"-l		print max referenced size of qgroup",
 	"-e		print max exclusive size of qgroup",
+	"-t		print the result as a table",
 	"-F		list all qgroups which impact the given path"
 	"(include ancestral qgroups)",
 	"-f		list all qgroups which impact the given path"
@@ -231,6 +232,8 @@ static int cmd_qgroup_show(int argc, char **argv)
 	int c;
 	u64 qgroupid;
 	int filter_flag = 0;
+	int is_table_result = 0;
+	int table_better = 0;
 
 	struct btrfs_qgroup_comparer_set *comparer_set;
 	struct btrfs_qgroup_filter_set *filter_set;
@@ -243,16 +246,18 @@ static int cmd_qgroup_show(int argc, char **argv)
 
 	optind = 1;
 	while (1) {
-		c = getopt_long(argc, argv, "pcleFf",
+		c = getopt_long(argc, argv, "pcleFft",
 				long_options, NULL);
 		if (c < 0)
 			break;
 		switch (c) {
 		case 'p':
+			table_better |= 0x1;
 			btrfs_qgroup_setup_print_column(
 				BTRFS_QGROUP_PARENT);
 			break;
 		case 'c':
+			table_better |= 0x2;
 			btrfs_qgroup_setup_print_column(
 				BTRFS_QGROUP_CHILD);
 			break;
@@ -264,6 +269,9 @@ static int cmd_qgroup_show(int argc, char **argv)
 			btrfs_qgroup_setup_print_column(
 				BTRFS_QGROUP_MAX_EXCL);
 			break;
+		case 't':
+			is_table_result = 1;
+			break;
 		case 'F':
 			filter_flag |= 0x1;
 			break;
@@ -301,7 +309,13 @@ static int cmd_qgroup_show(int argc, char **argv)
 					BTRFS_QGROUP_FILTER_PARENT,
 					qgroupid);
 	}
-	ret = btrfs_show_qgroups(fd, filter_set, comparer_set);
+	if (is_table_result && table_better == 3) {
+		fprintf(stderr,
+			"ERROR: '-p' and '-c' can't used at the same time\n");
+		exit(1);
+	}
+	ret = btrfs_show_qgroups(fd, filter_set, comparer_set,
+				 is_table_result);
 	e = errno;
 	close_file_or_dir(fd, dirstream);
 	if (ret < 0)
diff --git a/qgroup.c b/qgroup.c
index 84f5fc1..eafb1bf 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -80,40 +80,48 @@ struct {
 	char *name;
 	char *column_name;
 	int need_print;
+	int max_len;
 } btrfs_qgroup_columns[] = {
 	{
 		.name		= "qgroupid",
 		.column_name	= "Qgroupid",
 		.need_print	= 1,
+		.max_len	= 8,
 	},
 	{
 		.name		= "rfer",
 		.column_name	= "Rfer",
 		.need_print	= 1,
+		.max_len	= 4,
 	},
 	{
 		.name		= "excl",
 		.column_name	= "Excl",
 		.need_print	= 1,
+		.max_len	= 4,
 	},
 	{	.name		= "max_rfer",
 		.column_name	= "Max_rfer",
 		.need_print	= 0,
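The max_len field added above suggests that each column remembers the
widest value it has to hold, so that rows can be padded into aligned
columns. A minimal standalone sketch of that technique follows; it is an
assumption about the approach, not the code from the truncated diff, and
the names (NCOLS, update_widths, format_row) are hypothetical:

```c
#include <stdio.h>
#include <string.h>

#define NCOLS 3

/* Widen each column so that it can hold the cells of one more row. */
static void update_widths(int widths[NCOLS], const char *cells[NCOLS])
{
	int i;

	for (i = 0; i < NCOLS; i++) {
		int n = (int)strlen(cells[i]);

		if (n > widths[i])
			widths[i] = n;
	}
}

/* Format one row, left-aligned and padded to the column widths. */
static void format_row(char *buf, size_t len, const int widths[NCOLS],
		       const char *cells[NCOLS])
{
	size_t used = 0;
	int i;

	if (len == 0)
		return;
	buf[0] = '\0';
	for (i = 0; i < NCOLS; i++) {
		int n = snprintf(buf + used, len - used, "%s%-*s",
				 i ? " " : "", widths[i], cells[i]);

		if (n < 0 || (size_t)n >= len - used)
			break;	/* buffer too small, stop here */
		used += (size_t)n;
	}
}
```

A caller would first run update_widths() over the header and every row,
and only then format the rows, since the final width of a column is not
known until the last row has been seen.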
Re: [PATCH] btrfs-progs: Add dependencies explicitly to fix a parallel build issue
On 09/18/2013 09:51 AM, Eric Sandeen wrote:
> On 9/17/13 8:11 PM, rongqing...@windriver.com wrote:
>> From: Roy Li <rongqing...@windriver.com>
>>
>> The dependencies of "all: version.h" or other similar ones can not fix
>> the parallel build failure, only reduce the times; In fact, many *.o
>> files require version.h file.
>>
>> #grep '#include "version.h"' ./ -r
>> ./btrfs-corrupt-block.c:#include "version.h"
>> ./btrfs.c:#include "version.h"
>> ./btrfs-image.c:#include "version.h"
>> ./cmds-filesystem.c:#include "version.h"
>> ./btrfs-show-super.c:#include "version.h"
>> ./btrfs-select-super.c:#include "version.h"
>> ./cmds-restore.c:#include "version.h"
>> ./btrfs-find-root.c:#include "version.h"
>> ./mkfs.c:#include "version.h"
>> ./btrfs-zero-log.c:#include "version.h"
>> ./btrfs-defrag.c:#include "version.h"
>> ./cmds-chunk.c:#include "version.h"
>> ./btrfstune.c:#include "version.h"
>> ./btrfs-calc-size.c:#include "version.h"
>> ./btrfs-map-logical.c:#include "version.h"
>> ./cmds-check.c:#include "version.h"
>> ./btrfs-debug-tree.c:#include "version.h"
>>
>> Signed-off-by: Roy Li <rongqing...@windriver.com>
>> ---
>> Sorry, The patch [btrfs-progs: fix parallel build] sent by me on Sep 3
>> can not fix the build failure, when build enough times on a 16 core cpu,
>> the build failure happens again, so I refix it again.
>>
>>  Makefile |    6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/Makefile b/Makefile
>> index c43cb68..a7c259c 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -81,6 +81,12 @@ endif
>>  	@echo "    [CC]     $@"
>>  	$(Q)$(CC) $(DEPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c $<
>>  
>> +btrfs-corrupt-block.o btrfs.o btrfs-image.o cmds-filesystem.o:version.h
>> +btrfs-show-super.o btrfs-select-super.o cmds-restore.o:version.h
>> +btrfs-find-root.o mkfs.o btrfs-zero-log.o btrfs-defrag.o cmds-chunk.o:version.h
>> +btrfstune.o btrfs-calc-size.o btrfs-map-logical.o cmds-check.o:version.h
>> +btrfs-debug-tree.o:version.h
>> +
>>  %.static.o: %.c
>>  	@echo "    [CC]     $@"
>>  	$(Q)$(CC) $(DEPFLAGS) $(AM_CFLAGS) $(STATIC_CFLAGS) -c $< -o $@
>
> I think this can be done more cleanly, I'll send a patch.
>
> -Eric

If this one is not clean enough, how about the one below?

From a56ac083a789605904507b602d9ebf196ae6746d Mon Sep 17 00:00:00 2001
From: Roy Li <rongqing...@windriver.com>
Date: Mon, 23 Sep 2013 09:53:30 +0800
Subject: [PATCH] btrfs-progs: Add dependencies explicitly to fix a parallel
 build issue

The dependencies of "all: version.h" or other similar ones can not fix
the parallel build failure, only reduce the times; In fact, many *.o
files require version.h file.

#grep '#include "version.h"' ./ -r
./btrfs-corrupt-block.c:#include "version.h"
./btrfs.c:#include "version.h"
./btrfs-image.c:#include "version.h"
./cmds-filesystem.c:#include "version.h"
./btrfs-show-super.c:#include "version.h"
./btrfs-select-super.c:#include "version.h"
./cmds-restore.c:#include "version.h"
./btrfs-find-root.c:#include "version.h"
./mkfs.c:#include "version.h"
./btrfs-zero-log.c:#include "version.h"
./btrfs-defrag.c:#include "version.h"
./cmds-chunk.c:#include "version.h"
./btrfstune.c:#include "version.h"
./btrfs-calc-size.c:#include "version.h"
./btrfs-map-logical.c:#include "version.h"
./cmds-check.c:#include "version.h"
./btrfs-debug-tree.c:#include "version.h"

Signed-off-by: Roy Li <rongqing...@windriver.com>
---
 Makefile |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/Makefile b/Makefile
index c43cb68..901ffa2 100644
--- a/Makefile
+++ b/Makefile
@@ -81,6 +81,8 @@ endif
 	@echo "    [CC]     $@"
 	$(Q)$(CC) $(DEPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c $<
 
+%.o: version.h
+
 %.static.o: %.c
 	@echo "    [CC]     $@"
 	$(Q)$(CC) $(DEPFLAGS) $(AM_CFLAGS) $(STATIC_CFLAGS) -c $< -o $@
--
1.7.10.4

--
Best Regards,
Roy | RongQing Li
Attention...
Dear Email user, Your mailbox has exceeded the storage limit of 20.00 GB set by your administrator; you are currently at 19.99 GB, and you may not be able to send or receive new mail until you verify your mailbox. Please click the link below to confirm your email account. If the page does not appear in your browser, you can copy and paste the link into your browser, fill in your account information, and click VERIFY to update your account. http://tiny.cc/hunx2w Thank you! Vietnam WebMail administration! Case number: 894162/2013
[PATCH] xfstest/btrfs/001: fix the misuse of subvolume set-default
The command is "btrfs subvolume set-default <subvolid> <path>". It uses
@subvolid to select the default subvolume, and @subvolid=0 has always been
parsed as FS_TREE no matter which subvolume @path points to.

So in order to make a subvolume the default one, we need to get the ID of
that subvolume first.

Also fix a typo: s/sbuvolid/subvolid/g

Signed-off-by: Liu Bo bo.li@oracle.com
---
 tests/btrfs/001     |    5 +++--
 tests/btrfs/001.out |    2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/tests/btrfs/001 b/tests/btrfs/001
index 9aa2431..1864e01 100755
--- a/tests/btrfs/001
+++ b/tests/btrfs/001
@@ -77,12 +77,13 @@ ls $SCRATCH_MNT/subvol
 echo "Creating file bar in subvol"
 dd if=/dev/zero of=$SCRATCH_MNT/subvol/bar bs=1M count=1 &> /dev/null
 echo "Setting subvol to the default"
-$BTRFS_UTIL_PROG subvolume set-default 0 $SCRATCH_MNT/subvol | _filter_scratch
+subid=`$BTRFS_UTIL_PROG subvolume list $SCRATCH_MNT | grep subvol | awk '{print $2}'`
+$BTRFS_UTIL_PROG subvolume set-default $subid $SCRATCH_MNT | _filter_scratch
 _scratch_remount
 echo "List root dir which is now subvol"
 ls $SCRATCH_MNT
 _scratch_unmount
-echo "Mounting sbuvolid=0 for the root dir"
+echo "Mounting subvolid=0 for the root dir"
 _scratch_mount -o subvolid=0
 echo "List root dir"
 ls $SCRATCH_MNT
diff --git a/tests/btrfs/001.out b/tests/btrfs/001.out
index c782bde..7810c27 100644
--- a/tests/btrfs/001.out
+++ b/tests/btrfs/001.out
@@ -22,7 +22,7 @@ Creating file bar in subvol
 Setting subvol to the default
 List root dir which is now subvol
 bar
-Mounting sbuvolid=0 for the root dir
+Mounting subvolid=0 for the root dir
 List root dir
 snap
 subvol
--
1.7.7
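The ID lookup that the patch performs with grep and awk can be sketched in C as follows. This is only an illustration, not part of the patch: the helper name subvol_id_for_path() is hypothetical, and the line shape "ID 256 gen 7 top level 5 path subvol" is an assumption about what "btrfs subvolume list" prints. Unlike the grep in the test, the sketch compares the path exactly rather than by substring.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: pull the subvolume ID out of one line of
 * "btrfs subvolume list" output.
 *
 * Assumed line shape (not taken from the patch):
 *     ID 256 gen 7 top level 5 path subvol
 *
 * Returns 0 and stores the ID in *id when the line parses and its path
 * equals want_path; returns -1 otherwise.
 */
int subvol_id_for_path(const char *line, const char *want_path,
                       unsigned long long *id)
{
    unsigned long long parsed_id, gen, top_level;
    char path[256];

    /* All four conversions must succeed for the line to count. */
    if (sscanf(line, "ID %llu gen %llu top level %llu path %255s",
               &parsed_id, &gen, &top_level, path) != 4)
        return -1;

    if (strcmp(path, want_path) != 0)
        return -1;

    *id = parsed_id;
    return 0;
}
```

A caller would feed it each output line in turn and pass the first ID that matches to "btrfs subvolume set-default", together with the mount point rather than the subvolume path.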
Re: [PATCH] Btrfs: fix sync fs to actually wait for all data to be persisted
On Mon, Sep 23, 2013 at 2:30 AM, Miao Xie mi...@cn.fujitsu.com wrote:
> On Sun, 22 Sep 2013 21:55:53 +0100, Filipe David Borba Manana wrote:
>> Currently the fs sync function (super.c:btrfs_sync_fs()) doesn't wait
>> for delayed work to finish before returning success to the caller.
>> This change fixes this, ensuring that there's no data loss if a power
>> failure happens right after fs sync returns success to the caller and
>> before the next commit happens.
>>
>> Steps to reproduce the data loss issue:
>>
>> $ mkfs.btrfs -f /dev/sdb3
>> $ mount /dev/sdb3 /mnt/btrfs
>> $ perl -e '$d = ("\x41" x 6001); open($f,">","/mnt/btrfs/foobar"); print $f $d; close($f);' && btrfs fi sync /mnt/btrfs
>>
>> Right after the btrfs fi sync command (a second or 2 for example),
>> power off the machine and reboot it. The file will be empty, as can be
>> verified after mounting the filesystem and through btrfs-debug-tree:
>>
>> $ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8
>> 	item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36
>> 		location key (257 INODE_ITEM 0) type FILE
>> 		namelen 6 datalen 0 name: foobar
>> 	item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
>> 		inode generation 7 transid 7 size 0 block group 0 mode 100644 links 1
>> 	item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16
>> 		inode ref index 2 namelen 6 name: foobar
>> checksum tree key (CSUM_TREE ROOT_ITEM 0)
>> leaf 29429760 items 0 free space 3995 generation 7 owner 7
>> fs uuid 6192815c-af2a-4b75-b3db-a959ffb6166e
>> chunk uuid b529c44b-938c-4d3d-910a-013b4700bcae
>> uuid tree key (UUID_TREE ROOT_ITEM 0)
>>
>> After this patch, the data loss no longer happens after a power
>> failure and btrfs-debug-tree shows:
>>
>> $ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8
>> 	item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36
>> 		location key (257 INODE_ITEM 0) type FILE
>> 		namelen 6 datalen 0 name: foobar
>> 	item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
>> 		inode generation 6 transid 6 size 6001 block group 0 mode 100644 links 1
>> 	item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16
>> 		inode ref index 2 namelen 6 name: foobar
>> 	item 6 key (257 EXTENT_DATA 0) itemoff 3522 itemsize 53
>> 		extent data disk byte 12845056 nr 8192
>> 		extent data offset 0 nr 8192 ram 8192
>> 		extent compression 0
>> checksum tree key (CSUM_TREE ROOT_ITEM 0)
>>
>> Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
>> ---
>>  fs/btrfs/super.c |    5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
>> index 6ab0df5..557e38f 100644
>> --- a/fs/btrfs/super.c
>> +++ b/fs/btrfs/super.c
>> @@ -913,6 +913,7 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>  	struct btrfs_trans_handle *trans;
>>  	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
>>  	struct btrfs_root *root = fs_info->tree_root;
>> +	int ret;
>>
>>  	trace_btrfs_sync_fs(wait);
>>
>> @@ -921,6 +922,10 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>  		return 0;
>>  	}
>>
>> +	ret = btrfs_start_all_delalloc_inodes(fs_info, 0);
>> +	if (ret)
>> +		return ret;
>> +
>
> I don't think we should call btrfs_start_all_delalloc_inodes(), because
> this function is also called by do_sync(), but do_sync() syncs the whole
> fs before calling it, so if we add btrfs_start_all_delalloc_inodes()
> here, we will sync the fs twice, and the second one is unnecessary.

Where is that do_sync() function exactly? I'm not finding any with that
exact name in fs/btrfs/* nor fs/*

I used this approach because (besides working) it's what is done in
btrfs_commit_transaction() (via btrfs_start_delalloc_flush and
btrfs_wait_delalloc_flush). Why can it be like that in the transaction
commit and not in btrfs_sync_fs()?

> Calling writeback_inodes_sb() before btrfs_sync_fs() is a better way to
> fix this problem.

Just tested it, and it works that way too. Uploading a new patch.
Thanks for the feedback/review Miao.

> Thanks
> Miao
>
>>  	btrfs_wait_all_ordered_extents(fs_info);
>>
>>  	trans = btrfs_attach_transaction_barrier(root);

--
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
[PATCH v2] Btrfs: fix sync fs to actually wait for all data to be persisted
Currently the fs sync function (super.c:btrfs_sync_fs()) doesn't wait
for delayed work to finish before returning success to the caller. This
change fixes this, ensuring that there's no data loss if a power failure
happens right after fs sync returns success to the caller and before the
next commit happens.

Steps to reproduce the data loss issue:

$ mkfs.btrfs -f /dev/sdb3
$ mount /dev/sdb3 /mnt/btrfs
$ perl -e '$d = ("\x41" x 6001); open($f,">","/mnt/btrfs/foobar"); print $f $d; close($f);' && btrfs fi sync /mnt/btrfs

Right after the btrfs fi sync command (a second or 2 for example), power
off the machine and reboot it. The file will be empty, as can be
verified after mounting the filesystem and through btrfs-debug-tree:

$ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8
	item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36
		location key (257 INODE_ITEM 0) type FILE
		namelen 6 datalen 0 name: foobar
	item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
		inode generation 7 transid 7 size 0 block group 0 mode 100644 links 1
	item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16
		inode ref index 2 namelen 6 name: foobar
checksum tree key (CSUM_TREE ROOT_ITEM 0)
leaf 29429760 items 0 free space 3995 generation 7 owner 7
fs uuid 6192815c-af2a-4b75-b3db-a959ffb6166e
chunk uuid b529c44b-938c-4d3d-910a-013b4700bcae
uuid tree key (UUID_TREE ROOT_ITEM 0)

After this patch, the data loss no longer happens after a power failure
and btrfs-debug-tree shows:

$ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8
	item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36
		location key (257 INODE_ITEM 0) type FILE
		namelen 6 datalen 0 name: foobar
	item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
		inode generation 6 transid 6 size 6001 block group 0 mode 100644 links 1
	item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16
		inode ref index 2 namelen 6 name: foobar
	item 6 key (257 EXTENT_DATA 0) itemoff 3522 itemsize 53
		extent data disk byte 12845056 nr 8192
		extent data offset 0 nr 8192 ram 8192
		extent compression 0
checksum tree key (CSUM_TREE ROOT_ITEM 0)

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: Use writeback_inodes_sb() instead of btrfs_start_all_delalloc_inodes(),
    as suggested by Miao Xie.

 fs/btrfs/super.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 6ab0df5..38b4392 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -921,6 +921,7 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
 		return 0;
 	}

+	writeback_inodes_sb(sb, WB_REASON_SYNC);
 	btrfs_wait_all_ordered_extents(fs_info);

 	trans = btrfs_attach_transaction_barrier(root);
--
1.7.9.5
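For readers who would rather script the reproduction steps in C than in perl, a minimal sketch follows. It is not part of the patch. The helper names write_payload() and btrfs_fi_sync() are hypothetical, and the fallback definition of BTRFS_IOC_SYNC is an assumption meant to mirror <linux/btrfs.h>; prefer the header's definition when it is available.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumption: mirrors the definition in <linux/btrfs.h>. */
#ifndef BTRFS_IOC_SYNC
#define BTRFS_IOC_SYNC _IO(0x94, 8)
#endif

#define PAYLOAD_LEN 6001

/*
 * Hypothetical helper: write PAYLOAD_LEN bytes of 0x41 ('A') to path,
 * like the perl one-liner in the reproduction steps.
 * Returns 0 on success, -1 on error.
 */
int write_payload(const char *path)
{
    char buf[PAYLOAD_LEN];
    size_t done = 0;
    int fd;

    memset(buf, 0x41, sizeof(buf));
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may be partial, so loop until the whole buffer is out. */
    while (done < sizeof(buf)) {
        ssize_t n = write(fd, buf + done, sizeof(buf) - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }
    return close(fd);
}

/*
 * Hypothetical helper: issue BTRFS_IOC_SYNC on the mount point, which is
 * the ioctl discussed in this thread. Returns 0 on success, -1 with errno
 * set otherwise (for example ENOTTY on a non-btrfs filesystem).
 */
int btrfs_fi_sync(const char *mnt)
{
    int fd = open(mnt, O_RDONLY);
    int ret, saved;

    if (fd < 0)
        return -1;
    ret = ioctl(fd, BTRFS_IOC_SYNC);
    saved = errno;      /* keep the ioctl's errno across close() */
    close(fd);
    errno = saved;
    return ret;
}
```

Calling write_payload("/mnt/btrfs/foobar") and then btrfs_fi_sync("/mnt/btrfs") corresponds to the perl and "btrfs fi sync" steps; the power cut still has to be done by hand.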
Re: [PATCH 0/9] enhance btrfs qgroup show command
Wang Shilong posted on Mon, 23 Sep 2013 10:18:19 +0800 as excerpted:

> On 09/23/2013 09:53 AM, Dusty Mabe wrote:
>> There is one other thing I have noticed while playing around with
>> quota and qgroups. If I delete subvolumes I can manage to get some of
>> the qgroup information to be reported as a negative number. If you are
>> interested check out my steps at
>> http://dustymabe.com/2013/09/22/btrfs-how-big-are-my-snapshots/ . The
>> system I was using was Fedora 19 (so not the latest stuff so this may
>> be a known issue already).
>
> I will take a look at this issue.

FWIW, I remember seeing discussion of the negative qgroup numbers
on-list, but as I don't use quotas I didn't track resolution. So it
should indeed be a known issue, but I'm not sure if it's fixed or
someone's at least working on it yet or not. But that might give you a
duplicating instance if you need one, anyway. (Unless of course Dusty
reported it earlier too and that's what I'm remembering.)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
Re: [PATCH] Btrfs: fix sync fs to actually wait for all data to be persisted
On Mon, Sep 23, 2013 at 10:11:42AM +0100, Filipe David Manana wrote:
> On Mon, Sep 23, 2013 at 2:30 AM, Miao Xie mi...@cn.fujitsu.com wrote:
>> On Sun, 22 Sep 2013 21:55:53 +0100, Filipe David Borba Manana wrote:
>>> Currently the fs sync function (super.c:btrfs_sync_fs()) doesn't
>>> wait for delayed work to finish before returning success to the
>>> caller.
>>> [...]
>>>
>>> Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
>>> ---
>>>  fs/btrfs/super.c |    5 +++++
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
>>> index 6ab0df5..557e38f 100644
>>> --- a/fs/btrfs/super.c
>>> +++ b/fs/btrfs/super.c
>>> @@ -913,6 +913,7 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>>  	struct btrfs_trans_handle *trans;
>>>  	struct btrfs_fs_info *fs_info = btrfs_sb(sb);
>>>  	struct btrfs_root *root = fs_info->tree_root;
>>> +	int ret;
>>>
>>>  	trace_btrfs_sync_fs(wait);
>>>
>>> @@ -921,6 +922,10 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>>  		return 0;
>>>  	}
>>>
>>> +	ret = btrfs_start_all_delalloc_inodes(fs_info, 0);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>
>> I don't think we should call btrfs_start_all_delalloc_inodes(),
>> because this function is also called by do_sync(), but do_sync() syncs
>> the whole fs before calling it, so if we add
>> btrfs_start_all_delalloc_inodes() here, we will sync the fs twice, and
>> the second one is unnecessary.
>
> Where is that do_sync() function exactly? I'm not finding any with that
> exact name in fs/btrfs/* nor fs/*

I think it should refer to sync_filesystem() :)

-liubo
Re: [PATCH v2] Btrfs: fix sync fs to actually wait for all data to be persisted
On Mon, Sep 23, 2013 at 10:23 AM, Filipe David Borba Manana fdman...@gmail.com wrote:
> Currently the fs sync function (super.c:btrfs_sync_fs()) doesn't wait
> for delayed work to finish before returning success to the caller.
> [...]
>
> Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
> ---
>
> V2: Use writeback_inodes_sb() instead of
>     btrfs_start_all_delalloc_inodes(), as suggested by Miao Xie.
>
>  fs/btrfs/super.c |    1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 6ab0df5..38b4392 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -921,6 +921,7 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>  		return 0;
>  	}
>
> +	writeback_inodes_sb(sb, WB_REASON_SYNC);
>  	btrfs_wait_all_ordered_extents(fs_info);

Ignore this 2nd patch version please, for 2 reasons:

1) It triggers a WARN_ON because writeback_inodes_sb() requires the
   sb->s_umount semaphore to be acquired beforehand, which is not always
   the case (it is when called through btrfs_kill_super, otherwise it
   isn't);

2) It doesn't guarantee that inodes are actually written (see the
   comment of writeback_inodes_sb()), so we can return 0 (success) when
   the writes didn't actually happen/succeed. Because of this,
   btrfs_start_all_delalloc_inodes() is more honest.

>
>  	trans = btrfs_attach_transaction_barrier(root);
> --
> 1.7.9.5

--
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
Re: [PATCH v2] Btrfs: fix sync fs to actually wait for all data to be persisted
On Mon, Sep 23, 2013 at 10:53:20AM +0100, Filipe David Manana wrote:
> On Mon, Sep 23, 2013 at 10:23 AM, Filipe David Borba Manana fdman...@gmail.com wrote:
>> Currently the fs sync function (super.c:btrfs_sync_fs()) doesn't wait
>> for delayed work to finish before returning success to the caller.
>> [...]
>>
>> V2: Use writeback_inodes_sb() instead of
>>     btrfs_start_all_delalloc_inodes(), as suggested by Miao Xie.
>>
>> @@ -921,6 +921,7 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
>>  		return 0;
>>  	}
>>
>> +	writeback_inodes_sb(sb, WB_REASON_SYNC);
>>  	btrfs_wait_all_ordered_extents(fs_info);
>
> Ignore this 2nd patch version please, for 2 reasons:
>
> 1) It triggers a WARN_ON because writeback_inodes_sb() requires the
>    sb->s_umount semaphore to be acquired beforehand, which is not
>    always the case (it is when called through btrfs_kill_super,
>    otherwise it isn't);
>
> 2) It doesn't guarantee that inodes are actually written (see the
>    comment of writeback_inodes_sb()), so we can return 0 (success) when
>    the writes didn't actually happen/succeed. Because of this,
>    btrfs_start_all_delalloc_inodes() is more honest.

What about

	case BTRFS_IOC_SYNC:
		btrfs_start_all_delalloc_inodes();
		btrfs_sync_fs(file->f_dentry->d_sb, 1);
		return 0;

This way, there is no impact on calling sync(1).

-liubo
Re: [PATCH v2] Btrfs: fix sync fs to actually wait for all data to be persisted
On Mon, Sep 23, 2013 at 10:59 AM, Liu Bo bo.li@oracle.com wrote:
> On Mon, Sep 23, 2013 at 10:53:20AM +0100, Filipe David Manana wrote:
>> On Mon, Sep 23, 2013 at 10:23 AM, Filipe David Borba Manana fdman...@gmail.com wrote:
>>> Currently the fs sync function (super.c:btrfs_sync_fs()) doesn't
>>> wait for delayed work to finish before returning success to the
>>> caller.
>>> [...]
>>>
>>> +	writeback_inodes_sb(sb, WB_REASON_SYNC);
>>>  	btrfs_wait_all_ordered_extents(fs_info);
>>
>> Ignore this 2nd patch version please, for 2 reasons:
>>
>> 1) It triggers a WARN_ON because writeback_inodes_sb() requires the
>>    sb->s_umount semaphore to be acquired beforehand, which is not
>>    always the case (it is when called through btrfs_kill_super,
>>    otherwise it isn't);
>>
>> 2) It doesn't guarantee that inodes are actually written (see the
>>    comment of writeback_inodes_sb()), so we can return 0 (success)
>>    when the writes didn't actually happen/succeed. Because of this,
>>    btrfs_start_all_delalloc_inodes() is more honest.
>
> What about
>
> 	case BTRFS_IOC_SYNC:
> 		btrfs_start_all_delalloc_inodes();
> 		btrfs_sync_fs(file->f_dentry->d_sb, 1);
> 		return 0;
>
> This way, there is no impact on calling sync(1).

Sounds ok. Will try it, returning an error if
btrfs_start_all_delalloc_inodes() returns an error.

Thanks for the suggestion and for pointing me to sync_filesystem() :)

> -liubo

--
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
[PATCH v3] Btrfs: fix sync fs to actually wait for all data to be persisted
Currently the fs sync function (super.c:btrfs_sync_fs()) doesn't wait
for delayed work to finish before returning success to the caller. This
change fixes this, ensuring that there's no data loss if a power failure
happens right after fs sync returns success to the caller and before the
next commit happens.

Steps to reproduce the data loss issue:

$ mkfs.btrfs -f /dev/sdb3
$ mount /dev/sdb3 /mnt/btrfs
$ perl -e '$d = ("\x41" x 6001); open($f,">","/mnt/btrfs/foobar"); print $f $d; close($f);' && btrfs fi sync /mnt/btrfs

Right after the btrfs fi sync command (a second or 2 for example), power
off the machine and reboot it. The file will be empty, as can be
verified after mounting the filesystem and through btrfs-debug-tree:

$ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8
	item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36
		location key (257 INODE_ITEM 0) type FILE
		namelen 6 datalen 0 name: foobar
	item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
		inode generation 7 transid 7 size 0 block group 0 mode 100644 links 1
	item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16
		inode ref index 2 namelen 6 name: foobar
checksum tree key (CSUM_TREE ROOT_ITEM 0)
leaf 29429760 items 0 free space 3995 generation 7 owner 7
fs uuid 6192815c-af2a-4b75-b3db-a959ffb6166e
chunk uuid b529c44b-938c-4d3d-910a-013b4700bcae
uuid tree key (UUID_TREE ROOT_ITEM 0)

After this patch, the data loss no longer happens after a power failure
and btrfs-debug-tree shows:

$ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8
	item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36
		location key (257 INODE_ITEM 0) type FILE
		namelen 6 datalen 0 name: foobar
	item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
		inode generation 6 transid 6 size 6001 block group 0 mode 100644 links 1
	item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16
		inode ref index 2 namelen 6 name: foobar
	item 6 key (257 EXTENT_DATA 0) itemoff 3522 itemsize 53
		extent data disk byte 12845056 nr 8192
		extent data offset 0 nr 8192 ram 8192
		extent compression 0
checksum tree key (CSUM_TREE ROOT_ITEM 0)

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: Use writeback_inodes_sb() instead of btrfs_start_all_delalloc_inodes(),
    as suggested by Miao Xie.
V3: Use btrfs_start_all_delalloc_inodes() instead, but outside
    btrfs_sync_fs(), in the sync IOCTL handler. Using
    writeback_inodes_sb() is not very honest because it doesn't
    guarantee inode data is persisted and we have no way to know if
    persistence really happened or not, returning 0 (success) always.
    Thanks Liu Bo for the suggestion.

 fs/btrfs/ioctl.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 9d46f60..8792fc8 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4557,9 +4557,15 @@ long btrfs_ioctl(struct file *file, unsigned int
 		return btrfs_ioctl_logical_to_ino(root, argp);
 	case BTRFS_IOC_SPACE_INFO:
 		return btrfs_ioctl_space_info(root, argp);
-	case BTRFS_IOC_SYNC:
+	case BTRFS_IOC_SYNC: {
+		int ret;
+
+		ret = btrfs_start_all_delalloc_inodes(root->fs_info, 0);
+		if (ret)
+			return ret;
 		btrfs_sync_fs(file->f_dentry->d_sb, 1);
 		return 0;
+	}
 	case BTRFS_IOC_START_SYNC:
 		return btrfs_ioctl_start_sync(root, argp);
 	case BTRFS_IOC_WAIT_SYNC:
--
1.7.9.5
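The control flow that V3 gives the sync ioctl can be modelled in userspace as below. This is a sketch only: the two stub functions are hypothetical stand-ins for btrfs_start_all_delalloc_inodes() and btrfs_sync_fs(), and ioc_sync_model() is an invented name. What it illustrates is the point made in the changelog: a flush step that returns int lets the handler report failure, which a void call cannot do.

```c
#include <errno.h>

/* Hypothetical stand-ins for the kernel functions named in the patch. */
int fake_delalloc_result;  /* value the stubbed flush step will return */
int sync_fs_calls;         /* how many times the stubbed sync step ran */

static int start_all_delalloc_inodes_stub(void)
{
    return fake_delalloc_result;
}

static void sync_fs_stub(void)
{
    sync_fs_calls++;
}

/*
 * Model of the V3 BTRFS_IOC_SYNC case: flush delalloc first, return the
 * error to the caller if that fails, and only then run the sync.
 */
int ioc_sync_model(int delalloc_result)
{
    int ret;

    fake_delalloc_result = delalloc_result;

    ret = start_all_delalloc_inodes_stub();
    if (ret)
        return ret;     /* failure is visible to the ioctl caller */

    sync_fs_stub();
    return 0;
}
```

With a successful flush the model runs the sync step and returns 0; with a failing flush, for example -EIO, it returns that error and the sync step is skipped.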
[PATCH v4] Btrfs: fix sync fs to actually wait for all data to be persisted
Currently the fs sync function (super.c:btrfs_sync_fs()) doesn't
wait for delayed work to finish before returning success to the
caller. This change fixes this, ensuring that there's no data loss
if a power failure happens right after fs sync returns success to
the caller and before the next commit happens.

Steps to reproduce the data loss issue:

$ mkfs.btrfs -f /dev/sdb3
$ mount /dev/sdb3 /mnt/btrfs
$ perl -e '$d = ("\x41" x 6001); open($f,">","/mnt/btrfs/foobar"); print $f $d; close($f);' && btrfs fi sync /mnt/btrfs

Right after the btrfs fi sync command (a second or 2 for example),
power off the machine and reboot it. The file will be empty, as it
can be verified after mounting the filesystem and through
btrfs-debug-tree:

$ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8
	item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36
		location key (257 INODE_ITEM 0) type FILE
		namelen 6 datalen 0 name: foobar
	item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
		inode generation 7 transid 7 size 0 block group 0 mode 100644 links 1
	item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16
		inode ref index 2 namelen 6 name: foobar
checksum tree key (CSUM_TREE ROOT_ITEM 0)
leaf 29429760 items 0 free space 3995 generation 7 owner 7
fs uuid 6192815c-af2a-4b75-b3db-a959ffb6166e
chunk uuid b529c44b-938c-4d3d-910a-013b4700bcae
uuid tree key (UUID_TREE ROOT_ITEM 0)

After this patch, the data loss no longer happens after a power failure
and btrfs-debug-tree shows:

$ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8
	item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36
		location key (257 INODE_ITEM 0) type FILE
		namelen 6 datalen 0 name: foobar
	item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
		inode generation 6 transid 6 size 6001 block group 0 mode 100644 links 1
	item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16
		inode ref index 2 namelen 6 name: foobar
	item 6 key (257 EXTENT_DATA 0) itemoff 3522 itemsize 53
		extent data disk byte 12845056 nr 8192
		extent data offset 0 nr 8192 ram 8192
		extent compression 0
checksum tree key (CSUM_TREE ROOT_ITEM 0)

Signed-off-by: Filipe David Borba Manana <fdman...@gmail.com>
---

V2: Use writeback_inodes_sb() instead of btrfs_start_all_delalloc_inodes(), as
    suggested by Miao Xie.

V3: Use btrfs_start_all_delalloc_inodes() instead but outside btrfs_sync_fs(), in
    the sync IOCTL handler. Using writeback_inodes_sb() is not very honest because
    it doesn't guarantee inode data is persisted and we have no way to know if
    persistence really happened or not, returning 0 (success) always.
    Thanks Liu Bo for the suggestion.

V4: Be even more honest in the sync IOCTL handler - don't always return success
    regardless of the result of the btrfs_sync_fs() call.

 fs/btrfs/ioctl.c |   12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 9d46f60..385c58f 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4557,9 +4557,15 @@ long btrfs_ioctl(struct file *file, unsigned int
 		return btrfs_ioctl_logical_to_ino(root, argp);
 	case BTRFS_IOC_SPACE_INFO:
 		return btrfs_ioctl_space_info(root, argp);
-	case BTRFS_IOC_SYNC:
-		btrfs_sync_fs(file->f_dentry->d_sb, 1);
-		return 0;
+	case BTRFS_IOC_SYNC: {
+		int ret;
+
+		ret = btrfs_start_all_delalloc_inodes(root->fs_info, 0);
+		if (ret)
+			return ret;
+		ret = btrfs_sync_fs(file->f_dentry->d_sb, 1);
+		return ret;
+	}
 	case BTRFS_IOC_START_SYNC:
 		return btrfs_ioctl_start_sync(root, argp);
 	case BTRFS_IOC_WAIT_SYNC:
-- 
1.7.9.5
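The V3 to V4 change is about error propagation: the handler should report the first failure instead of returning 0 unconditionally. A minimal user-space sketch of that control flow follows. It is not kernel code; `sync_handler_model()` and the `step_*` stubs are hypothetical names that stand in for btrfs_start_all_delalloc_inodes() and btrfs_sync_fs(), and each step is assumed to return 0 or a negative errno value.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for one kernel step; returns 0 or -errno. */
typedef int (*step_fn)(void);

/*
 * Model of the V4 handler: start writeback of delalloc data first, then
 * sync the filesystem, and hand the first failure back to the caller.
 */
int sync_handler_model(step_fn start_delalloc, step_fn sync_fs)
{
	int ret;

	assert(start_delalloc && sync_fs);

	ret = start_delalloc();
	if (ret)
		return ret;	/* flushing failed, do not claim success */

	return sync_fs();	/* V3 ignored this value and returned 0 */
}

/* Stubs used only to exercise the model. */
int step_ok(void)     { return 0; }
int step_enospc(void) { return -ENOSPC; }
int step_eio(void)    { return -EIO; }
```

With these stubs, a failing second step is visible to the caller, which is the behaviour the V4 note describes.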
Failure to remove snapshot with 3.12* and FS switches to read-only
Hello,

My system is configured to do FS snapshots when I upgrade packages. I
have a cron job which runs at night to delete these snapshots. Its goal
is to keep 10 snapshots maximum, one per day if possible.

This works perfectly with 3.11.1 but fails miserably with anything post
3.11. With a pre 3.12 (3.11.0-10050-g3711d86), this is what happens:

WARNING: CPU: 7 PID: 3991 at fs/btrfs/uuid-tree.c:171 btrfs_uuid_tree_rem+0x1ec/0x210 [btrfs]()
Modules linked in: tcp_lp fuse ebtable_nat xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat bridge iptable_mangle stp llc nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter rfcomm ip6_tables bnep arc4 iwldvm mac80211 vfat fat snd_hda_codec_realtek x86_pkg_temp_thermal coretemp kvm_intel snd_hda_intel snd_hda_codec kvm uvcvideo snd_hwdep snd_seq crc32_pclmul cdc_mbim iwlwifi crc32c_intel videobuf2_vmalloc snd_seq_device videobuf2_memops cdc_ncm ghash_clmulni_intel videobuf2_core snd_pcm usbnet iTCO_wdt iTCO_vendor_support btusb sdhci_pci videodev bluetooth media cfg80211 cdc_acm cdc_wdm mii e1000e sdhci snd_page_alloc thinkpad_acpi tpm_tis mmc_core snd_timer tpm microcode snd tpm_bios serio_raw mei_me lpc_ich i2c_i801 ptp rfkill wmi mfd_core mei shpchp pps_core soundcore pcspkr uinput binfmt_misc btrfs libcrc32c xor raid6_pq dm_crypt i915 i2c_algo_bit drm_kms_helper drm firewire_ohci firewire_core crct10dif_pclmul i2c_core crc_itu_t video
CPU: 7 PID: 3991 Comm: btrfs Tainted: G        W    3.11.0-10050-g3711d86 #2
Hardware name: LENOVO 2392CTO/2392CTO, BIOS G4ET94WW (2.54 ) 05/23/2013
 0009 8802bba0bc58 8163cdcf 8802bba0bc90 81065c9d 88007faad3c0 8800c517cb00 8802904a2618 88036cd91000 8802bba0bca0
Call Trace:
 [8163cdcf] dump_stack+0x45/0x56
 [81065c9d] warn_slowpath_common+0x7d/0xa0
 [81065d7a] warn_slowpath_null+0x1a/0x20
 [a02021cc] btrfs_uuid_tree_rem+0x1ec/0x210 [btrfs]
 [a0175016] ? btrfs_free_path+0x26/0x30 [btrfs]
 [a01cf573] ? btrfs_insert_orphan_item+0x63/0x80 [btrfs]
 [a01ca023] btrfs_ioctl_snap_destroy+0x523/0x630 [btrfs]
 [a01cd371] btrfs_ioctl+0xe61/0x2790 [btrfs]
 [8164711c] ? __do_page_fault+0x20c/0x540
 [8116a8d7] ? do_mmap_pgoff+0x357/0x3e0
 [811b4d5d] do_vfs_ioctl+0x2dd/0x4b0
 [811b4fb1] SyS_ioctl+0x81/0xa0
 [8164745e] ? do_page_fault+0xe/0x10
 [8164bad9] system_call_fastpath+0x16/0x1b
---[ end trace 99eb746a9ff7c964 ]---
------------[ cut here ]------------
WARNING: CPU: 7 PID: 3991 at fs/btrfs/super.c:255 __btrfs_abort_transaction+0x11d/0x130 [btrfs]()
btrfs: Transaction aborted (error -22)
Modules linked in: tcp_lp fuse ebtable_nat xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat bridge iptable_mangle stp llc nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter rfcomm ip6_tables bnep arc4 iwldvm mac80211 vfat fat snd_hda_codec_realtek x86_pkg_temp_thermal coretemp kvm_intel snd_hda_intel snd_hda_codec kvm uvcvideo snd_hwdep snd_seq crc32_pclmul cdc_mbim iwlwifi crc32c_intel videobuf2_vmalloc snd_seq_device videobuf2_memops cdc_ncm ghash_clmulni_intel videobuf2_core snd_pcm usbnet iTCO_wdt iTCO_vendor_support btusb sdhci_pci videodev bluetooth media cfg80211 cdc_acm cdc_wdm mii e1000e sdhci snd_page_alloc thinkpad_acpi tpm_tis mmc_core snd_timer tpm microcode snd tpm_bios serio_raw mei_me lpc_ich i2c_i801 ptp rfkill wmi mfd_core mei shpchp pps_core soundcore pcspkr uinput binfmt_misc btrfs libcrc32c xor raid6_pq dm_crypt i915 i2c_algo_bit drm_kms_helper drm firewire_ohci firewire_core crct10dif_pclmul i2c_core crc_itu_t video
CPU: 7 PID: 3991 Comm: btrfs Tainted: G        W    3.11.0-10050-g3711d86 #2
Hardware name: LENOVO 2392CTO/2392CTO, BIOS G4ET94WW (2.54 ) 05/23/2013
 0009 8802bba0bc48 8163cdcf 8802bba0bc90 8802bba0bc80 81065c9d ffea 880401bc2000 88003fe7d280 a020b830 08cc 8802bba0bce0
Call Trace:
 [8163cdcf] dump_stack+0x45/0x56
 [81065c9d] warn_slowpath_common+0x7d/0xa0
 [81065d0c] warn_slowpath_fmt+0x4c/0x50
 [a0171b3d] __btrfs_abort_transaction+0x11d/0x130 [btrfs]
 [a01ca10a] btrfs_ioctl_snap_destroy+0x60a/0x630 [btrfs]
 [a01cd371] btrfs_ioctl+0xe61/0x2790 [btrfs]
 [8164711c] ? __do_page_fault+0x20c/0x540
 [8116a8d7] ? do_mmap_pgoff+0x357/0x3e0
 [811b4d5d] do_vfs_ioctl+0x2dd/0x4b0
 [811b4fb1] SyS_ioctl+0x81/0xa0
 [8164745e] ? do_page_fault+0xe/0x10
[offlist] Re: [PATCH 4/4] Btrfs-progs: add super-recover to recover bad supers
On Sat, Sep 21, 2013 at 09:16:25AM +0800, Wang Shilong wrote:
> > -	open_ctree_fs_info_restore(target, 0, 0, 0, 1);
> > +	open_ctree_fs_info_restore(target, 0, 0, OPEN_CTREE_PARTIAL);
>
> Good idea, i think this should be another patch.

yes
Re: [PATCH] btrfs-progs: Add dependencies explicitly to fix a parallel build issue
On Sun, Sep 22, 2013 at 09:06:39AM +0800, Rongqing Li wrote:
> I want to know how many cores your cpu has? I can not reproduce it on my
> 2 cores cpu, but it always happens when run on a server which is a 16
> cores cpu and make -j20

Depends on what you call a core, I've tested this on a box with 8 cpu
cores and 64 logical cpus. I can clearly see that the build is stalled
for a few moments at '[SH] version.h' before it proceeds to '[CC] ...'.

> > The dependency files are generated by an implicit rule .c -> .o.d, so
> > there should be no problem for any of the files listed above.
>
> Do you means the below:
>
> .c.o:
> 	$(Q)$(check) $<
> 	@echo "    [CC]     $@"
> 	$(Q)$(CC) $(DEPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c $<

Rather

%.o.d: %.c
	$(Q)$(CC) -MM -MG -MF $@ -MT $(@:.o.d=.o) -MT $(@:.o.d=.static.o) -MT $@ $(AM_CFLAGS) $(CFLAGS) $<

the .o.d files are included at the end of Makefile, I'm not completely
sure that they get included before the .c.o rule is processed. That way
the .o.d files would be empty and the dependency on version.h missing.

david
Re: [PATCH 0/9] enhance btrfs qgroup show command
Hello,

> Wang Shilong posted on Mon, 23 Sep 2013 10:18:19 +0800 as excerpted:
>
> > On 09/23/2013 09:53 AM, Dusty Mabe wrote:
> > > There is one other thing I have noticed while playing around with
> > > quota and qgroups. If I delete subvolumes I can manage to get some of
> > > the qgroup information to be reported as a negative number. If you
> > > are interested check out my steps at
> > > http://dustymabe.com/2013/09/22/btrfs-how-big-are-my-snapshots/ .
> > > The system I was using was Fedora 19 (so not the latest stuff so this
> > > may be a known issue already).
> >
> > I will take a look at this issue.
>
> FWIW, I remember seeing discussion of the negative qgroup numbers
> on-list, but as I don't use quotas I didn't track resolution. So it
> should indeed be a known issue, but I'm not sure if it's fixed or
> someone's at least working on it yet or not.

I think this problem reported by Dusty is different from before.
Previously, we discussed about negative numbers because quota rescan has
not been implemented.

In Dusty's reproduced steps, there are no parent qgroups. So i think
every qgroup should works as expected, if negative numbers come out, it
must be a *bug*.

Thanks,
Wang

> But that might give you a duplicating instance if you need one, anyway.
> (Unless of course Dusty reported it earlier too and that's what I'm
> remembering.)
>
> --
> Duncan - List replies preferred. No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master." Richard Stallman
Re: btrfs: qgroup scan failed with -12
Hello,

I think this problem may be related to the qgroup memory leak that you
also reported before; however, I have not reproduced it on my test box.

By the way, does your machine still show high memory usage with quota
enabled?

Thanks,
Wang

> Not sure if it's anything interesting - I had the following entry in
> dmesg a few days ago, on a server with 32 GB RAM. The system is still
> working fine.
>
> [1878432.675210] btrfs-qgroup-re: page allocation failure: order:5, mode:0x104050
> [1878432.675319] CPU: 5 PID: 22251 Comm: btrfs-qgroup-re Not tainted 3.11.0-rc7 #2
> [1878432.675417] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 1106 10/17/2011
> [1878432.675517] 00104050 88062a981948 81378c52 88083fb4d958
> [1878432.675618] 0001 88062a9819d8 810af9a4
> [1878432.675721] 817c8100 88062a981978 88083fddea00 88083fddea38
> [1878432.675821] Call Trace:
> [1878432.675874] [81378c52] dump_stack+0x46/0x58
> [1878432.675927] [810af9a4] warn_alloc_failed+0x110/0x124
> [1878432.675981] [810b1fd8] __alloc_pages_nodemask+0x6a4/0x793
> [1878432.676036] [810db7e8] alloc_pages_current+0xc8/0xe5
> [1878432.676098] [810af06c] __get_free_pages+0x9/0x36
> [1878432.676150] [810e27b9] __kmalloc_track_caller+0x35/0x163
> [1878432.676204] [810bde12] krealloc+0x52/0x8c
> [1878432.676265] [a036cdcb] ulist_add_merge+0xe1/0x14e [btrfs]
> [1878432.676324] [a036bcf0] find_parent_nodes+0x49c/0x5a5 [btrfs]
> [1878432.676383] [a036be75] btrfs_find_all_roots+0x7c/0xd7 [btrfs]
> [1878432.676441] [a036d6e1] ? qgroup_account_ref_step1+0xea/0x102 [btrfs]
> [1878432.676542] [a036d915] btrfs_qgroup_rescan_worker+0x21c/0x516 [btrfs]
> [1878432.676645] [a03482cc] worker_loop+0x15e/0x48e [btrfs]
> [1878432.676702] [a034816e] ? btrfs_queue_worker+0x267/0x267 [btrfs]
> [1878432.676757] [8104e51a] kthread+0xb5/0xbd
> [1878432.676809] [8104e465] ? kthread_freezable_should_stop+0x43/0x43
> [1878432.676881] [8137da2c] ret_from_fork+0x7c/0xb0
> [1878432.676950] [8104e465] ? kthread_freezable_should_stop+0x43/0x43
> [1878432.677004] Mem-Info:
> [1878432.678293] Node 0 DMA per-cpu:
> [1878432.678341] CPU0: hi:0, btch: 1 usd: 0
> [1878432.678392] CPU1: hi:0, btch: 1 usd: 0
> [1878432.678443] CPU2: hi:0, btch: 1 usd: 0
> [1878432.678494] CPU3: hi:0, btch: 1 usd: 0
> [1878432.678544] CPU4: hi:0, btch: 1 usd: 0
> [1878432.678595] CPU5: hi:0, btch: 1 usd: 0
> [1878432.678646] CPU6: hi:0, btch: 1 usd: 0
> [1878432.678697] CPU7: hi:0, btch: 1 usd: 0
> [1878432.678747] Node 0 DMA32 per-cpu:
> [1878432.678797] CPU0: hi: 186, btch: 31 usd: 0
> [1878432.678847] CPU1: hi: 186, btch: 31 usd: 0
> [1878432.678897] CPU2: hi: 186, btch: 31 usd: 0
> [1878432.678948] CPU3: hi: 186, btch: 31 usd: 0
> [1878432.678998] CPU4: hi: 186, btch: 31 usd: 0
> [1878432.679049] CPU5: hi: 186, btch: 31 usd: 0
> [1878432.679111] CPU6: hi: 186, btch: 31 usd: 0
> [1878432.679162] CPU7: hi: 186, btch: 31 usd: 2
> [1878432.679214] Node 0 Normal per-cpu:
> [1878432.679270] CPU0: hi: 186, btch: 31 usd: 31
> [1878432.679321] CPU1: hi: 186, btch: 31 usd: 0
> [1878432.679372] CPU2: hi: 186, btch: 31 usd: 0
> [1878432.679444] CPU3: hi: 186, btch: 31 usd: 0
> [1878432.679495] CPU4: hi: 186, btch: 31 usd: 0
> [1878432.679546] CPU5: hi: 186, btch: 31 usd: 0
> [1878432.679596] CPU6: hi: 186, btch: 31 usd: 30
> [1878432.679647] CPU7: hi: 186, btch: 31 usd: 169
> [1878432.679700] active_anon:1062992 inactive_anon:522620 isolated_anon:0
> [1878432.679700] active_file:1050823 inactive_file:1052143 isolated_file:0
> [1878432.679700] unevictable:5892 dirty:1004 writeback:0 unstable:0
> [1878432.679700] free:2184567 slab_reclaimable:1142980 slab_unreclaimable:86405
> [1878432.679700] mapped:754301 shmem:9119 pagetables:21518 bounce:0
> [1878432.679700] free_cma:0
> [1878432.680007] Node 0 DMA free:15360kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15360kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
> [1878432.680322] lowmem_reserve[]: 0 2897 32077 32077
> [1878432.680375] Node 0 DMA32 free:417660kB min:2068kB low:2584kB high:3100kB active_anon:4384kB inactive_anon:59696kB active_file:45640kB inactive_file:45652kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3040872kB managed:2968600kB mlocked:0kB dirty:4kB
Re: [PATCH] xfstest/btrfs/001: fix the misuse of subvolume set-default
On Mon, Sep 23, 2013 at 04:47:29PM +0800, Liu Bo wrote:
> The command is
> 	btrfs subvolume set-default <subvolid> <path>
>
> It uses @subvolid to control the default subvolume and @subvolid=0 has
> always been parsed into FS_TREE no matter what subvolume @path points
> to.
>
> So in order to set a subvolume to the default one, we need to get the
> id of this subvolume first.
>
> Also fix a typo:
> s/sbuvolid/subvolid/g
>
> Signed-off-by: Liu Bo <bo.li@oracle.com>

Sent a patch for this already. Thanks,

Josef
Re: [PATCH] xfstest/btrfs/001: fix the misuse of subvolume set-default
On Mon, Sep 23, 2013 at 09:17:46AM -0400, Josef Bacik wrote:
> On Mon, Sep 23, 2013 at 04:47:29PM +0800, Liu Bo wrote:
> > The command is
> > 	btrfs subvolume set-default <subvolid> <path>
> >
> > It uses @subvolid to control the default subvolume and @subvolid=0
> > has always been parsed into FS_TREE no matter what subvolume @path
> > points to.
> >
> > So in order to set a subvolume to the default one, we need to get the
> > id of this subvolume first.
> >
> > Also fix a typo:
> > s/sbuvolid/subvolid/g
> >
> > Signed-off-by: Liu Bo <bo.li@oracle.com>
>
> Sent a patch for this already. Thanks,

oops, sorry I should have noticed that, any chance to fold the typo fix
into your patch?

-liubo
Re: btrfs: qgroup scan failed with -12
On Mon, Sep 23, 2013 at 07:43:44AM +0700, Tomasz Chmielewski wrote:
> Not sure if it's anything interesting - I had the following entry in
> dmesg a few days ago, on a server with 32 GB RAM. The system is still
> working fine.

Can you try the patch here

https://bugzilla.kernel.org/attachment.cgi?id=107408&action=diff

and see if that helps? Thanks,

Josef
Re: [PATCH] btrfs-progs: check if device supports trim
On 9/20/13 11:42 AM, David Sterba wrote:
> The message about trim was printed unconditionally, we should check if
> trim is supported at all.

Good idea, but I wonder if there's any risk that discard(0,0) will ever
be optimized away on the kernel side and pass unconditionally?

I was thinking we could get this from blkid, but maybe not.

In the meantime it does do the right thing, so:

Reviewed-by: Eric Sandeen <sand...@redhat.com>

> Signed-off-by: David Sterba <dste...@suse.cz>
> ---
>  utils.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/utils.c b/utils.c
> index 5fa193b..6c74654 100644
> --- a/utils.c
> +++ b/utils.c
> @@ -597,13 +597,16 @@ int btrfs_prepare_device(int fd, char *file, int zero_end, u64 *block_count_ret,
>  	}
>  
>  	if (discard) {
> -		fprintf(stderr, "Performing full device TRIM (%s) ...\n",
> -				pretty_size(block_count));
>  		/*
> -		 * We intentionally ignore errors from the discard ioctl. It is
> -		 * not necessary for the mkfs functionality but just an optimization.
> +		 * We intentionally ignore errors from the discard ioctl. It
> +		 * is not necessary for the mkfs functionality but just an
> +		 * optimization.
>  		 */
> -		discard_blocks(fd, 0, block_count);
> +		if (discard_blocks(fd, 0, 0) == 0) {
> +			fprintf(stderr, "Performing full device TRIM (%s) ...\n",
> +				pretty_size(block_count));
> +			discard_blocks(fd, 0, block_count);
> +		}
>  	}
>  
>  	ret = zero_dev_start(fd);
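The probe in the patch boils down to issuing a zero-length discard and checking whether the ioctl is accepted. A minimal user-space sketch of that idea follows, assuming Linux and the BLKDISCARD ioctl from <linux/fs.h>. `discard_range()` and `device_accepts_discard()` are hypothetical names, not the btrfs-progs functions, and whether a zero-length request really succeeds on a discard-capable device depends on the kernel version, which is the concern raised in this review.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>	/* BLKDISCARD */

/*
 * Issue BLKDISCARD for the byte range [start, start + len) on an open
 * block device. Returns 0 on success or a positive errno value.
 */
int discard_range(int fd, uint64_t start, uint64_t len)
{
	uint64_t range[2] = { start, len };

	if (ioctl(fd, BLKDISCARD, &range) < 0)
		return errno;
	return 0;
}

/*
 * Probe with a zero-length range so that nothing is discarded.
 * Returns 1 if the ioctl was accepted, 0 otherwise. Assumption: the
 * kernel rejects the request before looking at the range when the
 * device has no discard support.
 */
int device_accepts_discard(int fd)
{
	return discard_range(fd, 0, 0) == 0;
}
```

A caller would print the "Performing full device TRIM" message and issue the full-range discard only when the probe returns 1.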
Re: Failure to remove snapshot with 3.12* and FS switches to read-only
Mathieu Chouquet-Stringer posted on Mon, 23 Sep 2013 14:54:17 +0200 as
excerpted:

> My system is configured to do FS snapshots when I upgrade packages. I
> have a cron job which runs at night to delete these snapshots. Its goal
> is to keep 10 snapshots maximum, one per day if possible.
>
> This works perfectly with 3.11.1 but fails miserably with anything post
> 3.11. With a pre 3.12 (3.11.0-10050-g3711d86), this is what happens:

There's a known regression. Check Chris's for-Linus pull request posted
within the last 48 hours, there's a patch in it that should fix it. Or
wait for -rc2 or 3 depending on when that pull gets processed.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
Re: [PATCH] btrfs-progs: check if device supports trim
On 9/23/13 10:44 AM, David Sterba wrote:
> On Mon, Sep 23, 2013 at 10:08:08AM -0500, Eric Sandeen wrote:
> > On 9/20/13 11:42 AM, David Sterba wrote:
> > > The message about trim was printed unconditionally, we should check
> > > if trim is supported at all.
> >
> > Good idea, but I wonder if there's any risk that discard(0,0) will
> > ever be optimized away on the kernel side and pass unconditionally?
>
> I hope the checks in blkdev_issue_discard() stay in the order as of now:
>
> int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> 		sector_t nr_sects, gfp_t gfp_mask, unsigned long flags)
> {
> ...
>
> 	if (!q)
> 		return -ENXIO;
>
> 	if (!blk_queue_discard(q))
> 		return -EOPNOTSUPP;
>
> here it returns no matter what the arguments are, setting length to 0
> is just cautious.
>
> 	/* Zero-sector (unknown) and one-sector granularities are the same. */
> 	granularity = max(q->limits.discard_granularity >> 9, 1U);
> 	alignment = bdev_discard_alignment(bdev) >> 9;
> 	alignment = sector_div(alignment, granularity);
>
> > I was thinking we could get this from blkid, but maybe not.
>
> Possibly yes, with other information like rotational etc. Alternatively,
> /sys/block/sdx/queue/discard_granularity > 0 means that the device
> supports discard, but that's imo even more fragile than the direct call
> to discard.

Perhaps; and I don't think libblkid gives us easy access to that anyway,
at least I didn't see it on a quick look.

So yeah, I think it's fine as you sent it; it doesn't actually change
behavior anyway other than the printf.

Thanks,
-Eric
Re: [PATCH] btrfs-progs: check if device supports trim
On Mon, Sep 23, 2013 at 10:08:08AM -0500, Eric Sandeen wrote:
> On 9/20/13 11:42 AM, David Sterba wrote:
> > The message about trim was printed unconditionally, we should check if
> > trim is supported at all.
>
> Good idea, but I wonder if there's any risk that discard(0,0) will ever
> be optimized away on the kernel side and pass unconditionally?

I hope the checks in blkdev_issue_discard() stay in the order as of now:

int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
		sector_t nr_sects, gfp_t gfp_mask, unsigned long flags)
{
...

	if (!q)
		return -ENXIO;

	if (!blk_queue_discard(q))
		return -EOPNOTSUPP;

here it returns no matter what the arguments are, setting length to 0 is
just cautious.

	/* Zero-sector (unknown) and one-sector granularities are the same. */
	granularity = max(q->limits.discard_granularity >> 9, 1U);
	alignment = bdev_discard_alignment(bdev) >> 9;
	alignment = sector_div(alignment, granularity);

> I was thinking we could get this from blkid, but maybe not.

Possibly yes, with other information like rotational etc. Alternatively,
/sys/block/sdx/queue/discard_granularity > 0 means that the device
supports discard, but that's imo even more fragile than the direct call
to discard.
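For comparison, the sysfs alternative mentioned above could look roughly like the sketch below. It only reads the queue attribute and parses it; `read_discard_granularity()` is a hypothetical helper, and treating a value greater than 0 as "supports discard" is the heuristic from this thread rather than a guarantee.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read /sys/block/<dev>/queue/discard_granularity for a whole-disk
 * name such as "sda". Returns the value in bytes, or -1 if the
 * attribute cannot be read or parsed.
 */
long long read_discard_granularity(const char *dev)
{
	char path[256];
	long long value = -1;
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path),
		     "/sys/block/%s/queue/discard_granularity", dev);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;	/* device name too long for the buffer */

	f = fopen(path, "r");
	if (!f)
		return -1;	/* no such device or no sysfs */
	if (fscanf(f, "%lld", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}
```

This needs the whole-disk name rather than a file descriptor, so a tool that only holds an open fd would first have to map it back to a device name, which is part of why the direct ioctl probe is simpler here.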
Re: [PATCH 0/9] enhance btrfs qgroup show command
Wang Shilong <wangshilong1991 at gmail.com> writes:

> In Dusty's reproduced steps, there are no parent qgroups. So i think
> every qgroup should works as expected, if negative numbers come out, it
> must be a *bug*.

In that case would you like for me to open a new bug report with the
details?

Dusty
Re: Handful of btrfs fixes for 3.11.x stable
On Fri, Sep 20, 2013 at 09:53:02AM -0700, Greg KH wrote:
> On Fri, Sep 20, 2013 at 06:34:39PM +0200, David Sterba wrote:
> > 3d05ca371200b3366530621abf73769024581b79
> > b13a004528c3e5eb060a26eee795f5a0da7bfe9f
> > 7ef67ffda91cc0c56f33937bfdf1d057b9ee96ca
> > ca6d07c1d74bf7ba3083bc31a9aeeaa1d0ad86aa
> >
> > Josef Bacik (4):
> >       Btrfs: reset ret in record_one_backref
> >       Btrfs: change how we queue blocks for backref checking
> >       Btrfs: skip subvol entries when checking if we've created a dir already
> >       Btrfs: remove ourselves from the cluster list under lock
>
> Thanks for this, I'll queue them up soon.
>
> Are any of these applicable for 3.10 as well?

The first one is not applicable.

These two apply directly (3rd, 4th):

Btrfs: skip subvol entries when checking if we've created a dir already
7ef67ffda91cc0c56f33937bfdf1d057b9ee96ca

Btrfs: remove ourselves from the cluster list under lock
ca6d07c1d74bf7ba3083bc31a9aeeaa1d0ad86aa

I'm not sure about the 2nd, this one leads to crash in 3.11 but I'm not
aware of any 3.10.x report, so it could be affected by later changes.
Josef knows better.

Btrfs: change how we queue blocks for backref checking
b13a004528c3e5eb060a26eee795f5a0da7bfe9f

thanks,
david
Re: btrfs: qgroup scan failed with -12
On Mon, Sep 23, 2013 at 07:43:44AM +0700, Tomasz Chmielewski wrote:
> Not sure if it's anything interesting - I had the following entry in
> dmesg a few days ago, on a server with 32 GB RAM. The system is still
> working fine.

Yes this is interesting of course.

> [1878432.675210] btrfs-qgroup-re: page allocation failure: order:5, mode:0x104050

Order 5 allocation, not guaranteed to succeed.

> [1878432.675319] CPU: 5 PID: 22251 Comm: btrfs-qgroup-re Not tainted 3.11.0-rc7 #2
> [1878432.676204] [810bde12] krealloc+0x52/0x8c
> [1878432.676324] [a036bcf0] find_parent_nodes+0x49c/0x5a5 [btrfs]
> [1878432.676383] [a036be75] btrfs_find_all_roots+0x7c/0xd7 [btrfs]
> [1878432.676441] [a036d6e1] ? qgroup_account_ref_step1+0xea/0x102 [btrfs]
> [1878432.676542] [a036d915] btrfs_qgroup_rescan_worker+0x21c/0x516 [btrfs]

	new_nodes = krealloc(old, sizeof(*new_nodes) * new_alloced,
			     gfp_mask);
	if (!new_nodes)
		return -ENOMEM;

The requested size is between 64k and 128k, with 40 bytes of ulist_node
it's 1638 to 3276 elements. So, lots of things going on during the
rescan, quite expectable.

I don't know if krealloc can be replaced with something more friendly to
allocator, eg. a list of page-sized blocks instead of one contiguous
array.

david
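The arithmetic behind "order 5" and "1638 to 3276 elements" can be checked with a small user-space sketch. It assumes 4 KiB pages and the 40-byte ulist_node size quoted above; `alloc_order()` and `elements_that_fit()` are hypothetical helpers, loosely analogous to what the kernel computes when it rounds an allocation up to a power-of-two number of pages.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest order such that (page_size << order) >= bytes. */
unsigned int alloc_order(size_t bytes, size_t page_size)
{
	unsigned int order = 0;

	assert(page_size > 0);
	while ((page_size << order) < bytes)
		order++;
	return order;
}

/* Number of fixed-size elements that fit into a buffer. */
size_t elements_that_fit(size_t buffer_bytes, size_t element_bytes)
{
	assert(element_bytes > 0);
	return buffer_bytes / element_bytes;
}
```

Under those assumptions, any request larger than 64 KiB and up to 128 KiB needs 32 physically contiguous pages (order 5), and 65536 / 40 and 131072 / 40 give the 1638 and 3276 element bounds from the message.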
Re: [RFC v2] btrfs-progs: Add recursive defrag using -r option
On Tue, Sep 17, 2013 at 11:30:52AM -0400, Frank Holton wrote:
> Thanks for that hint to use ftw. I've updated the code to use it and tried to make sure that I caught all of the styling errors.

Dunno what caused that, but the whitespace is completely messed up and squashed into a single space everywhere.

> Since the ftw callback doesn't take any additional options I had to add several global variables to pass the fancy_ioctl and range parameters. Should I replace all of the uses of those variables with the globals or just copy into the globals like I did in the code below.

Eww, a callback without user data, that's kind of underdesigned. I don't see a better workaround than the globals.

> It does not attempt to defrag directories anymore in the recursive mode, however, the non recursive mode will still attempt to defrag directories. I figured since that only works when you run as root that it is acceptable for now.

Agreed.

> +static int global_fancy_ioctl;
> +static struct btrfs_ioctl_defrag_range_args global_range;
> +static int global_verbose;
> +static int global_errors;

For now, please prefix all of them as defrag_ so they do not get reused like generic variables.

> @@ -349,7 +403,8 @@ static int cmd_defrag(int argc, char **argv)
>  	u64 len = (u64)-1;
>  	u32 thresh = 0;
>  	int i;
> -	int errors = 0;
> +	int recursive = 0;
> +	global_errors = 0;

Move this out of the declaration block.

>  	int ret = 0;
>  	int verbose = 0;
>  	int fancy_ioctl = 0;

The rest looks ok, but I'll take another look when you send the next version where the indentation is fixed.

thanks
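The workaround discussed above, passing state to an ftw() callback through file-scope variables because the callback has no user-data argument, can be sketched as follows. This is an illustration only and not the defrag code: the names `walk_verbose`, `walk_regular_files` and `count_regular_files()` are made up for this example, and the callback merely counts regular files where the real one would open and defragment them.

```c
#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* File-scope state: the only way to hand parameters to an ftw() callback. */
static int walk_verbose;
static int walk_regular_files;

/* Called by ftw() for every entry; only regular files are counted. */
static int walk_callback(const char *fpath, const struct stat *sb, int typeflag)
{
	(void)sb;

	if (typeflag == FTW_F) {
		if (walk_verbose)
			printf("%s\n", fpath);
		walk_regular_files++;
	}
	/* Returning non-zero would stop the walk early. */
	return 0;
}

/* Walk dir recursively; returns the number of regular files or -1. */
static int count_regular_files(const char *dir, int verbose)
{
	assert(dir != NULL);

	/* Copy the parameters into the globals before starting the walk. */
	walk_verbose = verbose;
	walk_regular_files = 0;

	if (ftw(dir, walk_callback, 16) != 0)
		return -1;
	return walk_regular_files;
}
```

Resetting the globals at the start of every walk keeps repeated calls independent, but the function is still not reentrant or thread-safe, which is the price of this approach.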
Re: btrfs: qgroup scan failed with -12
On Mon, Sep 23, 2013 at 07:19:06PM +0200, David Sterba wrote:
> On Mon, Sep 23, 2013 at 07:43:44AM +0700, Tomasz Chmielewski wrote:
> > Not sure if it's anything interesting - I had the following entry in dmesg a few days ago, on a server with 32 GB RAM. The system is still working fine.
>
> Yes this is interesting of course.
>
> > [1878432.675210] btrfs-qgroup-re: page allocation failure: order:5, mode:0x104050
>
> Order 5 allocation, not guaranteed to succeed.
>
> > [1878432.675319] CPU: 5 PID: 22251 Comm: btrfs-qgroup-re Not tainted 3.11.0-rc7 #2
> > [1878432.676204] [810bde12] krealloc+0x52/0x8c
> > [1878432.676324] [a036bcf0] find_parent_nodes+0x49c/0x5a5 [btrfs]
> > [1878432.676383] [a036be75] btrfs_find_all_roots+0x7c/0xd7 [btrfs]
> > [1878432.676441] [a036d6e1] ? qgroup_account_ref_step1+0xea/0x102 [btrfs]
> > [1878432.676542] [a036d915] btrfs_qgroup_rescan_worker+0x21c/0x516 [btrfs]
>
> 	new_nodes = krealloc(old, sizeof(*new_nodes) * new_alloced,
> 			     gfp_mask);
> 	if (!new_nodes)
> 		return -ENOMEM;
>
> The requested size is between 64k and 128k, with 40 bytes of ulist_node it's 1638 to 3276 elements. So, lots of things going on during the rescan, quite expectable.
>
> I don't know if krealloc can be replaced with something more friendly to allocator, eg. a list of page-sized blocks instead of one contiguous array.

I've done that with a patch in bugzilla; hopefully that will fix it. I've not had time to try to reproduce it myself, but I assume that if you create a random file, then create 10 snapshots and then defrag, it will probably hit the same problem.

Thanks,
Josef
Re: [PATCH] btrfs-progs: cmd_find_new: Sync fs before searching for modified files.
This patch was tested on a kernel with [PATCH v4] Btrfs: fix sync fs to actually wait for all data to be persisted applied.

Thanks,
chandan
[PATCH] btrfs-progs: cmd_find_new: Sync fs before searching for modified files.
The sync makes sure that 'very recently' introduced delayed work is accounted for in the output of the 'btrfs subvolume find-new' command.

Signed-off-by: chandan chan...@linux.vnet.ibm.com
---
 cmds-subvolume.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index de246ab..8832303 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -786,6 +786,15 @@ static int cmd_find_new(int argc, char **argv)
 		fprintf(stderr, "ERROR: can't access '%s'\n", subvol);
 		return 1;
 	}
+
+	ret = ioctl(fd, BTRFS_IOC_SYNC);
+	if (ret < 0) {
+		fprintf(stderr, "ERROR: unable to fs-syncing '%s' - %s\n",
+			subvol, strerror(errno));
+		close_file_or_dir(fd, dirstream);
+		return 1;
+	}
+
 	ret = btrfs_list_find_updated_files(fd, 0, last_gen);
 	close_file_or_dir(fd, dirstream);
 	return !!ret;
-- 
1.8.3.1
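For readers who want to try the sync step on its own, the following is a small standalone sketch of issuing BTRFS_IOC_SYNC on a path from userspace. It assumes the kernel UAPI header linux/btrfs.h is installed; the helper name `btrfs_sync_path()` is invented for this example and is not part of btrfs-progs, which uses its own open_file_or_dir() helper instead of a plain open().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/btrfs.h> /* BTRFS_IOC_SYNC; assumed to be installed */

/*
 * Ask the filesystem containing path to sync.
 * Returns 0 on success or a negative errno value on failure.
 */
static int btrfs_sync_path(const char *path)
{
	int fd;
	int ret;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	ret = ioctl(fd, BTRFS_IOC_SYNC);
	if (ret < 0)
		ret = -errno; /* save errno before close() can change it */

	close(fd);
	return ret;
}
```

On a filesystem that does not implement this ioctl the call is expected to fail, typically with ENOTTY, so a caller should report the error rather than assume the data was persisted.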
[PATCH v3] btrfs-progs: Add recursive defrag using -r option
Add an option to defrag all files in a directory recursively.

Signed-off-by: Frank Holton fhol...@gmail.com
---
v3: prefix globals with defrag
v2: switch to ftw and callback

 cmds-filesystem.c | 156 ++---
 1 file changed, 113 insertions(+), 43 deletions(-)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index f41a72a..44a224f 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -22,6 +22,8 @@
 #include <errno.h>
 #include <uuid/uuid.h>
 #include <ctype.h>
+#include <fcntl.h>
+#include <ftw.h>
 
 #include "kerncompat.h"
 #include "ctree.h"
@@ -265,7 +267,7 @@ static int cmd_show(int argc, char **argv)
 		fprintf(stderr, "ERROR: error %d while scanning\n", ret);
 		return 18;
 	}
-	
+
 	if(searchstart < argc)
 		search = argv[searchstart];
 
@@ -308,7 +310,7 @@ static int cmd_sync(int argc, char **argv)
 	e = errno;
 	close(fd);
 	if( res < 0 ){
-		fprintf(stderr, "ERROR: unable to fs-syncing '%s' - %s\n", 
+		fprintf(stderr, "ERROR: unable to fs-syncing '%s' - %s\n",
 			path, strerror(e));
 		return 16;
 	}
@@ -333,6 +335,7 @@ static const char * const cmd_defrag_usage[] = {
 	"Defragment a file or a directory",
 	"",
 	"-v             be verbose",
+	"-r             defragment files recursively",
 	"-c[zlib,lzo]   compress the file while defragmenting",
 	"-f             flush data to disk immediately after defragmenting",
 	"-s start       defragment only from byte onward",
@@ -341,6 +344,57 @@ static const char * const cmd_defrag_usage[] = {
 	NULL
 };
 
+static int do_defrag(int fd, int fancy_ioctl,
+		struct btrfs_ioctl_defrag_range_args *range)
+{
+	int ret;
+
+	if (!fancy_ioctl)
+		ret = ioctl(fd, BTRFS_IOC_DEFRAG, NULL);
+	else
+		ret = ioctl(fd, BTRFS_IOC_DEFRAG_RANGE, range);
+
+	return ret;
+}
+
+static int defrag_global_fancy_ioctl;
+static struct btrfs_ioctl_defrag_range_args defrag_global_range;
+static int defrag_global_verbose;
+static int defrag_global_errors;
+static int defrag_callback(const char *fpath, const struct stat *sb, int typeflag)
+{
+	int ret = 0;
+	int e = 0;
+	int fd = 0;
+
+	if (typeflag == FTW_F) {
+		if (defrag_global_verbose)
+			printf("%s\n", fpath);
+		fd = open(fpath, O_RDWR);
+		e = errno;
+		if (fd < 0)
+			goto error;
+		ret = do_defrag(fd, defrag_global_fancy_ioctl, &defrag_global_range);
+		e = errno;
+		close(fd);
+		if (ret && e == ENOTTY) {
+			fprintf(stderr, "ERROR: defrag range ioctl not "
+				"supported in this kernel, please try "
+				"without any options.\n");
+			defrag_global_errors++;
+			return ENOTTY;
+		}
+		if (ret)
+			goto error;
+	}
+	return 0;
+
+error:
+	fprintf(stderr, "ERROR: defrag failed on %s - %s\n", fpath, strerror(e));
+	defrag_global_errors++;
+	return 0;
+}
+
 static int cmd_defrag(int argc, char **argv)
 {
 	int fd;
@@ -349,17 +403,19 @@ static int cmd_defrag(int argc, char **argv)
 	u64 len = (u64)-1;
 	u32 thresh = 0;
 	int i;
-	int errors = 0;
+	int recursive = 0;
 	int ret = 0;
-	int verbose = 0;
-	int fancy_ioctl = 0;
 	struct btrfs_ioctl_defrag_range_args range;
-	int e=0;
+	int e = 0;
 	int compress_type = BTRFS_COMPRESS_NONE;
 
+	defrag_global_errors = 0;
+	defrag_global_verbose = 0;
+	defrag_global_errors = 0;
+	defrag_global_fancy_ioctl = 0;
 	optind = 1;
 	while(1) {
-		int c = getopt(argc, argv, "vc::fs:l:t:");
+		int c = getopt(argc, argv, "vrc::fs:l:t:");
 		if (c < 0)
 			break;
 
@@ -368,26 +424,29 @@ static int cmd_defrag(int argc, char **argv)
 			compress_type = BTRFS_COMPRESS_ZLIB;
 			if (optarg)
 				compress_type = parse_compress_type(optarg);
-			fancy_ioctl = 1;
+			defrag_global_fancy_ioctl = 1;
 			break;
 		case 'f':
 			flush = 1;
-			fancy_ioctl = 1;
+			defrag_global_fancy_ioctl = 1;
 			break;
 		case 'v':
-			verbose = 1;
+			defrag_global_verbose = 1;
 			break;
 		case 's':
 			start = parse_size(optarg);
-			fancy_ioctl = 1;
+
balance induced csum errors
SAMSUNG SSD 830 Series
CPU0: Intel® Core(TM) i7-2820QM CPU @ 2.30GHz (fam: 06, model: 2a, stepping: 07)
8GB RAM (quite heavily tested, not recently, with several days of memtest)
kernel 3.11.1-200.fc19.x86_64 running on baremetal
btrfs-progs-0.20.rc1.20130308git704a08c-1.fc19.x86_64

Today I did a scrub on a btrfs volume, with no message or errors in console or dmesg or journal. Immediately after the scrub I did a balance on the volume which resulted in:

ERROR: error during balancing '/' - Input/output error

In dmesg for the time of that error, this is reported:

[ 567.921661] btrfs: relocating block group 6463422464 flags 1
[ 568.282371] btrfs: found 200 extents
[ 568.800974] btrfs: found 200 extents
[ 568.868567] btrfs: relocating block group 5389680640 flags 1
[ 571.929662] btrfs: found 4410 extents
[ 572.896410] btrfs: found 4410 extents
[ 572.962479] btrfs: relocating block group 4315938816 flags 1
[ 574.681576] BTRFS info (device sda6): csum failed ino 259 off 428470272 csum 2566472073 private 2181120065
[ 574.692047] BTRFS info (device sda6): csum failed ino 259 off 428470272 csum 2566472073 private 2181120065

Upon reboot with kernel 3.11.1-200.fc19.x86_64 and also kernel-3.10.4-300.fc19.x86_64 the following is reported in dmesg:

[6.053511] btrfs no csum found for inode 37693 start 25538560
[6.054463] BTRFS info (device sda6): csum failed ino 37693 off 25538560 csum 3474434693 private 0
[6.055299] btrfs no csum found for inode 37693 start 26218496
[6.056086] BTRFS info (device sda6): csum failed ino 37693 off 26218496 csum 2772176352 private 0
[6.085993] btrfs no csum found for inode 37693 start 22286336
[6.086093] btrfs no csum found for inode 37693 start 22368256
[6.087636] BTRFS info (device sda6): csum failed ino 37693 off 22286336 csum 396494483 private 0
[6.087741] BTRFS info (device sda6): csum failed ino 37693 off 22368256 csum 2249156591 private 0

[root@f19l chris]# btrfs fi show
failed to open /dev/sr0: No medium found
Label: 'fedora'  uuid: d505bdee-ba7c-4a64-9481-d5cd76ab8b3e
	Total devices 1 FS bytes used 3.64GB
	devid 1 size 12.99GB used 6.51GB path /dev/sda6

The file system is on an SSD, so single profile for both data and metadata:

[root@f19l chris]# btrfs fi df /
Data: total=6.01GB, used=3.39GB
System: total=4.00MB, used=4.00KB
Metadata: total=512.00MB, used=258.93MB

If this is not the result of a known bug, let me know if there's more information I should provide. I do have a ~22MB btrfs-image -c9 -t4 of the file system. This fs is disposable, but I might try btrfsck --repair --init-csum-tree with a slightly newer btrfs-progs.

Chris Murphy
Re: balance induced csum errors
Result of btrfsck (without --repair) on the fs:

Checking filesystem on /dev/sda6
UUID: d505bdee-ba7c-4a64-9481-d5cd76ab8b3e
checking extents
checking fs roots
root 257 inode 37693 errors 1800
found 3938304000 bytes used err is 1
total csum bytes: 3557972
total tree bytes: 271794176
total fs tree bytes: 253009920
btree space waste bytes: 79371605
file data blocks allocated: 4546076672
 referenced 3631865856
Btrfs v0.20-rc1

Console result of a subsequent scrub on the mounted fs:

scrub status for d505bdee-ba7c-4a64-9481-d5cd76ab8b3e
	scrub started at Mon Sep 23 16:23:33 2013 and finished after 8 seconds
	total bytes scrubbed: 3.67GB with 10 errors
	error details: csum=10
	corrected errors: 0, uncorrectable errors: 10, unverified errors: 0

dmesg result of a subsequent scrub on the mounted file system:

[ 30.682058] btrfs: bdev /dev/sda6 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[ 30.682095] btrfs: unable to fixup (regular) error at logical 461914112 on dev /dev/sda6
[ 30.682141] btrfs: bdev /dev/sda6 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
[ 30.682174] btrfs: unable to fixup (regular) error at logical 460079104 on dev /dev/sda6
[ 30.689792] btrfs: bdev /dev/sda6 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
[ 30.689823] btrfs: bdev /dev/sda6 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
[ 30.689824] btrfs: unable to fixup (regular) error at logical 456085504 on dev /dev/sda6
[ 30.689896] btrfs: unable to fixup (regular) error at logical 457531392 on dev /dev/sda6
[ 30.743222] btrfs: bdev /dev/sda6 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
[ 30.743260] btrfs: unable to fixup (regular) error at logical 460230656 on dev /dev/sda6
[ 30.970989] btrfs: checksum error at logical 462082048 on dev /dev/sda6, sector 902504, root 257, inode 37693, offset 22282240, length 4096, links 1 (path: var/log/journal/180d14c18233452d9918c3aec1c6c68b/system.journal)
[ 30.970993] btrfs: checksum error at logical 464195584 on dev /dev/sda6, sector 906632, root 257, inode 37693, offset 22638592, length 4096, links 1 (path: var/log/journal/180d14c18233452d9918c3aec1c6c68b/system.journal)
[ 30.970997] btrfs: bdev /dev/sda6 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
[ 30.970998] btrfs: unable to fixup (regular) error at logical 464195584 on dev /dev/sda6
[ 30.971270] btrfs: bdev /dev/sda6 errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
[ 30.971300] btrfs: unable to fixup (regular) error at logical 462082048 on dev /dev/sda6
[ 31.047120] btrfs: checksum error at logical 462123008 on dev /dev/sda6, sector 902584, root 257, inode 37693, offset 22360064, length 4096, links 1 (path: var/log/journal/180d14c18233452d9918c3aec1c6c68b/system.journal)
[ 31.047206] btrfs: bdev /dev/sda6 errs: wr 0, rd 0, flush 0, corrupt 8, gen 0
[ 31.047235] btrfs: unable to fixup (regular) error at logical 462123008 on dev /dev/sda6
[ 36.290269] btrfs: bdev /dev/sda6 errs: wr 0, rd 0, flush 0, corrupt 9, gen 0
[ 36.290305] btrfs: unable to fixup (regular) error at logical 4744409088 on dev /dev/sda6
[ 37.882830] btrfs: bdev /dev/sda6 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
[ 37.882867] btrfs: unable to fixup (regular) error at logical 6730518528 on dev /dev/sda6

Also, there have been no crashes, panics, or power cuts to this system. Thus far it seems like the balance itself is what has caused the csum corruption. Prior to balance, scrub finds no problems. After balance there is some corruption. But isn't it ambiguous whether the data or the metadata have been corrupted, since there is only a single copy of each? In which case, is it wise to init-csum-tree?

Chris Murphy
Re: [PATCH v4] Btrfs: fix sync fs to actually wait for all data to be persisted
On Mon, 23 Sep 2013 11:35:11 +0100, Filipe David Borba Manana wrote:
> Currently the fs sync function (super.c:btrfs_sync_fs()) doesn't wait for delayed work to finish before returning success to the caller. This change fixes this, ensuring that there's no data loss if a power failure happens right after fs sync returns success to the caller and before the next commit happens.
>
> Steps to reproduce the data loss issue:
>
> $ mkfs.btrfs -f /dev/sdb3
> $ mount /dev/sdb3 /mnt/btrfs
> $ perl -e '$d = ("\x41" x 6001); open($f,">","/mnt/btrfs/foobar"); print $f $d; close($f);' && btrfs fi sync /mnt/btrfs
>
> Right after the btrfs fi sync command (a second or 2 for example), power off the machine and reboot it. The file will be empty, as it can be verified after mounting the filesystem and through btrfs-debug-tree:
>
> $ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8
> item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36
> 	location key (257 INODE_ITEM 0) type FILE
> 	namelen 6 datalen 0 name: foobar
> item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
> 	inode generation 7 transid 7 size 0 block group 0 mode 100644 links 1
> item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16
> 	inode ref index 2 namelen 6 name: foobar
> checksum tree key (CSUM_TREE ROOT_ITEM 0)
> leaf 29429760 items 0 free space 3995 generation 7 owner 7
> fs uuid 6192815c-af2a-4b75-b3db-a959ffb6166e
> chunk uuid b529c44b-938c-4d3d-910a-013b4700bcae
> uuid tree key (UUID_TREE ROOT_ITEM 0)
>
> After this patch, the data loss no longer happens after a power failure and btrfs-debug-tree shows:
>
> $ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8
> item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36
> 	location key (257 INODE_ITEM 0) type FILE
> 	namelen 6 datalen 0 name: foobar
> item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
> 	inode generation 6 transid 6 size 6001 block group 0 mode 100644 links 1
> item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16
> 	inode ref index 2 namelen 6 name: foobar
> item 6 key (257 EXTENT_DATA 0) itemoff 3522 itemsize 53
> 	extent data disk byte 12845056 nr 8192
> 	extent data offset 0 nr 8192 ram 8192
> 	extent compression 0
> checksum tree key (CSUM_TREE ROOT_ITEM 0)
>
> Signed-off-by: Filipe David Borba Manana fdman...@gmail.com

Reviewed-by: Miao Xie mi...@cn.fujitsu.com

> ---
>
> V2: Use writeback_inodes_sb() instead of btrfs_start_all_delalloc_inodes(), as suggested by Miao Xie.
>
> V3: Use btrfs_start_all_delalloc_inodes() instead but outside btrfs_sync_fs(), in the sync IOCTL handler. Using writeback_inodes_sb() is not very honest because it doesn't guarantee inode data is persisted and we have no way to know if persistence really happened or not, returning 0 (success) always. Thanks Liu Bo for the suggestion.
>
> V4: Be even more honest in the sync IOCTL handler - don't always return success regardless of the result of the btrfs_sync_fs() call.
>
>  fs/btrfs/ioctl.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 9d46f60..385c58f 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -4557,9 +4557,15 @@ long btrfs_ioctl(struct file *file, unsigned int
>  		return btrfs_ioctl_logical_to_ino(root, argp);
>  	case BTRFS_IOC_SPACE_INFO:
>  		return btrfs_ioctl_space_info(root, argp);
> -	case BTRFS_IOC_SYNC:
> -		btrfs_sync_fs(file->f_dentry->d_sb, 1);
> -		return 0;
> +	case BTRFS_IOC_SYNC: {
> +		int ret;
> +
> +		ret = btrfs_start_all_delalloc_inodes(root->fs_info, 0);
> +		if (ret)
> +			return ret;
> +		ret = btrfs_sync_fs(file->f_dentry->d_sb, 1);
> +		return ret;
> +	}
>  	case BTRFS_IOC_START_SYNC:
>  		return btrfs_ioctl_start_sync(root, argp);
>  	case BTRFS_IOC_WAIT_SYNC:
Re: [PATCH 0/9] enhance btrfs qgroup show command
On Mon, Sep 23, 2013 at 9:35 PM, Wang Shilong wangsl.f...@cn.fujitsu.com wrote:
> On 09/24/2013 12:55 AM, Dusty Mabe wrote:
> > In that case would you like for me to open a new bug report with the details?
>
> Yes.

https://bugzilla.kernel.org/show_bug.cgi?id=61951