On 2018/03/07 5:29, Goffredo Baroncelli wrote: > On 03/06/2018 09:30 AM, Misono, Tomohiro wrote: >> Add new unprivileged ioctl (BTRFS_IOC_GET_SUBVOL_INFO) which searches >> and returns only subvolume related item (ROOT_ITEM/ROOT_BACKREF/ROOT_REF) >> from root tree. The arguments of this ioctl are the same as treesearch >> ioctl and can be used like treesearch ioctl. > > Is it a pro ? The treesearch ioctl is tightly coupled to the btrfs internal > structure, this means that if we would change the btrfs internal structure, > we have to update all the clients of this api. For the treesearch it is an > acceptable compromise between flexibility and speed of developing. But for a > more specialized API, I think that it would make sense to provide a less > coupled api to the internal structure.
Thanks for the comments. The reason I choose the same api is just to minimize the code change in btrfs-progs. For tree search part, it works just switching the ioctl number from BTRFS_IOC_TREE_SEARCH to BTRFS_IOC_GET_SUBVOL_INFO in list_subvol_search()[1]. [1] https://marc.info/?l=linux-btrfs&m=152032537911218&w=2 > > Below some comments > > >> >> Since treesearch ioctl requires root privilege, this ioctl is needed >> to allow normal users to call "btrfs subvolume list/show" etc. >> >> Note that the subvolume name in ROOT_BACKREF/ROOT_REF will not be copied to >> prevent potential name leak. The name can be obtained by calling >> user version's ino_lookup ioctl (BTRFS_IOC_INO_LOOKUP_USER). >> >> Signed-off-by: Tomohiro Misono <misono.tomoh...@jp.fujitsu.com> >> --- >> fs/btrfs/ioctl.c | 254 >> +++++++++++++++++++++++++++++++++++++++++++++ >> include/uapi/linux/btrfs.h | 2 + >> 2 files changed, 256 insertions(+) >> >> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c >> index 111ee282b777..1dba309dce31 100644 >> --- a/fs/btrfs/ioctl.c >> +++ b/fs/btrfs/ioctl.c >> @@ -1921,6 +1921,28 @@ static noinline int key_in_sk(struct btrfs_key *key, >> return 1; >> } >> >> +/* >> + * check if key is in sk and subvolume related range >> + */ >> +static noinline int key_in_sk_and_subvol(struct btrfs_key *key, >> + struct btrfs_ioctl_search_key *sk) >> +{ >> + int ret; >> + >> + ret = key_in_sk(key, sk); >> + if (!ret) >> + return 0; >> + >> + if ((key->objectid == BTRFS_FS_TREE_OBJECTID || >> + (key->objectid >= BTRFS_FIRST_FREE_OBJECTID && >> + key->objectid <= BTRFS_LAST_FREE_OBJECTID)) && >> + key->type >= BTRFS_ROOT_ITEM_KEY && >> + key->type <= BTRFS_ROOT_BACKREF_KEY) > > Why returning all the range [BTRFS_ROOT_ITEM_KEY...BTRFS_ROOT_BACKREF_KEY]. > It would be sufficient to return only > > + (key->type == BTRFS_ROOT_ITEM_KEY || > + key->type == BTRFS_ROOT_BACKREF_KEY)) Sorry, it is a mistake, I mean "key->type <= BTRFS_ROOTREF_KEY". Although btrfs-progs only uses BTRFS_ROOT_BACKREF_KEY, I notice libbtrfs uses BTRFS_ROOT_REF_KEY instead. So, I think it is better to return both ROOT_BACKREF_KEY and ROOT_REF_KEY. I will fix this in v2. > > >> + return 1; >> + >> + return 0; >> +} >> + >> static noinline int copy_to_sk(struct btrfs_path *path, >> struct btrfs_key *key, >> struct btrfs_ioctl_search_key *sk, >> @@ -2045,6 +2067,142 @@ static noinline int copy_to_sk(struct btrfs_path >> *path, >> return ret; >> } >> >> +/* >> + * Basically the same as copy_to_sk() but restricts the copied item >> + * within subvolume related one using key_in_sk_and_subvol(). >> + * Also, the name of subvolume will be omitted from >> + * ROOT_BACKREF/ROOT_REF item. >> + */ >> +static noinline int copy_subvol_item(struct btrfs_path *path, >> + struct btrfs_key *key, >> + struct btrfs_ioctl_search_key *sk, >> + size_t *buf_size, >> + char __user *ubuf, >> + unsigned long *sk_offset, >> + int *num_found) >> +{ >> + u64 found_transid; >> + struct extent_buffer *leaf; >> + struct btrfs_ioctl_search_header sh; >> + struct btrfs_key test; >> + unsigned long item_off; >> + unsigned long item_len; >> + int nritems; >> + int i; >> + int slot; >> + int ret = 0; >> + >> + leaf = path->nodes[0]; >> + slot = path->slots[0]; >> + nritems = btrfs_header_nritems(leaf); >> + >> + if (btrfs_header_generation(leaf) > sk->max_transid) { >> + i = nritems; >> + goto advance_key; >> + } >> + found_transid = btrfs_header_generation(leaf); >> + >> + for (i = slot; i < nritems; i++) { >> + item_off = btrfs_item_ptr_offset(leaf, i); >> + item_len = btrfs_item_size_nr(leaf, i); >> + >> + btrfs_item_key_to_cpu(leaf, key, i); >> + if (!key_in_sk_and_subvol(key, sk)) >> + continue; >> + >> + /* will not copy the name of subvolume */ >> + if (key->type == BTRFS_ROOT_BACKREF_KEY || >> + key->type == BTRFS_ROOT_REF_KEY) >> + item_len = sizeof(struct btrfs_root_ref); >> + >> + if (sizeof(sh) + item_len > *buf_size) { >> + if (*num_found) { >> + ret = 1; >> + goto out; >> + } >> + >> + /* >> + * return one empty item back for v1, which does not >> + * handle -EOVERFLOW >> + */ > It is still applicable ? In order to just replace TREE_SEARCH (v1) in user space, I think it still needs the same EOVERFLOW handling like TREE_SEARCH does (in copy_to_sk() and btrfs_ioctl_tree_search()). >> + >> + *buf_size = sizeof(sh) + item_len; >> + item_len = 0; >> + ret = -EOVERFLOW;> + } >> + >> + if (sizeof(sh) + item_len + *sk_offset > *buf_size) { >> + ret = 1; >> + goto out; >> + } >> + >> + sh.objectid = key->objectid; >> + sh.offset = key->offset; >> + sh.type = key->type; >> + sh.len = item_len; >> + sh.transid = found_transid; >> + >> + /* copy search result header */ >> + if (copy_to_user(ubuf + *sk_offset, &sh, sizeof(sh))) { >> + ret = -EFAULT; >> + goto out; >> + } >> + >> + *sk_offset += sizeof(sh); >> + >> + if (item_len) { >> + char __user *up = ubuf + *sk_offset; >> + >> + /* copy the item */ >> + if (read_extent_buffer_to_user(leaf, up, >> + item_off, item_len)) { >> + ret = -EFAULT; >> + goto out; >> + } >> + >> + *sk_offset += item_len; >> + } >> + (*num_found)++; >> + >> + if (ret) /* -EOVERFLOW from above */ >> + goto out; >> + >> + if (*num_found >= sk->nr_items) { >> + ret = 1; >> + goto out; >> + } >> + } >> +advance_key: >> + ret = 0; >> + test.objectid = sk->max_objectid; >> + test.type = sk->max_type; >> + test.offset = sk->max_offset; >> + if (btrfs_comp_cpu_keys(key, &test) >= 0) >> + ret = 1; >> + else if (key->offset < (u64)-1) >> + key->offset++; >> + else if (key->type < (u8)-1) { >> + key->offset = 0; >> + key->type++; >> + } else if (key->objectid < (u64)-1) { >> + key->offset = 0; >> + key->type = 0; >> + key->objectid++; > > It would be doable to check that type is in the range > [BTRFS_ROOT_ITEM_KEY...BTRFS_ROOT_BACKREF_KEY] > > I.e. something like > > > + if (btrfs_comp_cpu_keys(key, &test) >= 0) > + ret = 1; > + else if (key->offset < (u64)-1) > + key->offset++; > + else if (key->type < BTRFS_ROOT_BACKREF_KEY) { > + key->offset = 0; > + key->type++; > + } else if (key->objectid < (u64)-1) { > + key->offset = 0; > + key->type = BTRFS_ROOT_ITEM_KEY; > + key->objectid++; Thanks, I will fix this in v2. > >> + } else >> + ret = 1; >> +out: >> + /* >> + * 0: all items from this leaf copied, continue with next >> + * 1: * more items can be copied, but unused buffer is too small >> + * * all items were found >> + * Either way, it will stops the loop which iterates to the next >> + * leaf >> + * -EOVERFLOW: item was to large for buffer >> + * -EFAULT: could not copy extent buffer back to userspace >> + */ >> + return ret; >> +} >> + >> static noinline int search_ioctl(struct inode *inode, >> struct btrfs_ioctl_search_key *sk, >> size_t *buf_size, >> @@ -2309,6 +2467,100 @@ static noinline int btrfs_ioctl_ino_lookup(struct >> file *file, >> return ret; >> } >> >> +static noinline int search_subvol(struct inode *inode, >> + struct btrfs_ioctl_search_key *sk, >> + size_t *buf_size, >> + char __user *ubuf) >> +{ >> + struct btrfs_fs_info *info = btrfs_sb(inode->i_sb); >> + struct btrfs_root *root; >> + struct btrfs_key key; >> + struct btrfs_path *path; >> + unsigned long sk_offset = 0; >> + int num_found = 0; >> + int ret; >> + >> + path = btrfs_alloc_path(); >> + if (!path) >> + return -ENOMEM; >> + >> + /* search ROOT TREEE */ >> + key.objectid = BTRFS_ROOT_TREE_OBJECTID; >> + key.type = BTRFS_ROOT_ITEM_KEY; >> + key.offset = (u64)-1; >> + root = btrfs_read_fs_root_no_name(info, &key); >> + if (IS_ERR(root)) { >> + btrfs_free_path(path); >> + return -ENOENT; >> + } >> + >> + key.objectid = sk->min_objectid; >> + key.type = sk->min_type; >> + key.offset = sk->min_offset; >> + >> + while (1) { >> + ret = btrfs_search_forward(root, &key, path, sk->min_transid); >> + if (ret != 0) { >> + if (ret > 0) >> + ret = 0; >> + goto err; >> + } >> + >> + ret = copy_subvol_item(path, &key, sk, buf_size, ubuf, >> + &sk_offset, &num_found); >> + btrfs_release_path(path); >> + if (ret) >> + break; >> + } >> + >> + if (ret > 0) >> + ret = 0; >> +err: >> + sk->nr_items = num_found; >> + btrfs_free_path(path); >> + return ret; >> +} >> + >> +/* >> + * Unprivileged ioctl for getting subvolume related item. >> + * The arguments are the same as btrfs_ioctl_tree_search() >> + * and can be used like tree search ioctl. >> + * >> + * Only returns ROOT_ITEM/ROOT_BACKREF/ROOT_REF key of subvolumes >> + * from root tree. Also, the name of subvolume will be omitted from >> + * ROOT_BACKREF/ROOT_REF item to prevent name leak. >> + */ >> +static noinline int btrfs_ioctl_get_subvol_info(struct file *file, >> + void __user *argp) >> +{ >> + struct btrfs_ioctl_search_args __user *uargs; >> + struct btrfs_ioctl_search_key sk; >> + struct inode *inode; >> + size_t buf_size; >> + int ret; >> + >> + uargs = (struct btrfs_ioctl_search_args __user *)argp; >> + >> + if (copy_from_user(&sk, &uargs->key, sizeof(sk))) >> + return -EFAULT; >> + >> + buf_size = sizeof(uargs->buf); >> + >> + inode = file_inode(file); >> + ret = search_subvol(inode, &sk, &buf_size, uargs->buf); >> + >> + /* >> + * In the original implementation an overflow is handled by returning a >> + * search header with a len of zero, so reset ret. >> + */ >> + if (ret == -EOVERFLOW) >> + ret = 0; >> + >> + if (ret == 0 && copy_to_user(&uargs->key, &sk, sizeof(sk))) >> + ret = -EFAULT; >> + return ret; >> +} >> + >> static noinline int btrfs_ioctl_snap_destroy(struct file *file, >> void __user *arg) >> { >> @@ -5663,6 +5915,8 @@ long btrfs_ioctl(struct file *file, unsigned int >> return btrfs_ioctl_get_features(file, argp); >> case BTRFS_IOC_SET_FEATURES: >> return btrfs_ioctl_set_features(file, argp); >> + case BTRFS_IOC_GET_SUBVOL_INFO: >> + return btrfs_ioctl_get_subvol_info(file, argp); >> } >> >> return -ENOTTY; >> diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h >> index c8d99b9ca550..1e9361cf66d5 100644 >> --- a/include/uapi/linux/btrfs.h >> +++ b/include/uapi/linux/btrfs.h >> @@ -843,5 +843,7 @@ enum btrfs_err_code { >> struct btrfs_ioctl_vol_args_v2) >> #define BTRFS_IOC_LOGICAL_INO_V2 _IOWR(BTRFS_IOCTL_MAGIC, 59, \ >> struct btrfs_ioctl_logical_ino_args) >> +#define BTRFS_IOC_GET_SUBVOL_INFO _IOWR(BTRFS_IOCTL_MAGIC, 60, \ >> + struct btrfs_ioctl_search_args) >> >> #endif /* _UAPI_LINUX_BTRFS_H */ >> > > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html