[PATCH 0/6] random bugfixes of the space management
Hello,

I have a bunch of random fixes for the space management in

	git://repo.or.cz/linux-btrfs-devel.git space-manage

They are ENOSPC fixes, as well as fixes for the df command.

The first and the last patches fix the wrong free space information
reported by the df command. The second one fixes ENOSPC when there is only
a tiny amount of space left in the filesystem. The third fixes the wrong
calculation of the stripe size. The 4th and 5th patches fix the chunk
allocation problem when the block devices do not have enough space to
allocate a default-size chunk.

---
 fs/btrfs/ctree.h       |    2 +
 fs/btrfs/extent-tree.c |   71 ++-
 fs/btrfs/super.c       |  147 +++-
 fs/btrfs/volumes.c     |  606 +++-
 fs/btrfs/volumes.h     |   27 +++
 5 files changed, 682 insertions(+), 171 deletions(-)

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/6] btrfs: fix wrong data space statistics
Josef has implemented mixed data/metadata chunks, so we must add the space
of those chunks just as we do for data chunks.

Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 fs/btrfs/super.c |    7 +++
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 61bd79a..1d21208 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -768,11 +768,10 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
 	rcu_read_lock();
 	list_for_each_entry_rcu(found, head, list) {
-		if (found->flags & (BTRFS_BLOCK_GROUP_METADATA |
-				    BTRFS_BLOCK_GROUP_SYSTEM))
-			total_used_data += found->disk_total;
-		else
+		if (found->flags & BTRFS_BLOCK_GROUP_DATA)
 			total_used_data += found->disk_used;
+		else
+			total_used_data += found->disk_total;
 		total_used += found->disk_used;
 	}
 	rcu_read_unlock();
--
1.7.0.1
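The effect of the accounting change can be tried outside the kernel. The following is a minimal user-space sketch, not the kernel code: `struct space_info_sketch` and both helper functions are hypothetical stand-ins, and only the flag bit values are taken from fs/btrfs/ctree.h. It compares the branch condition before and after the patch for a mixed data/metadata space.

```c
#include <stddef.h>
#include <stdint.h>

/* Block group type bits, as defined in fs/btrfs/ctree.h. */
#define BTRFS_BLOCK_GROUP_DATA     (1ULL << 0)
#define BTRFS_BLOCK_GROUP_SYSTEM   (1ULL << 1)
#define BTRFS_BLOCK_GROUP_METADATA (1ULL << 2)

/* Hypothetical, reduced stand-in for struct btrfs_space_info. */
struct space_info_sketch {
	uint64_t flags;
	uint64_t disk_used;
	uint64_t disk_total;
};

/*
 * Condition before the patch: anything flagged METADATA or SYSTEM,
 * which includes mixed data/metadata chunks, is charged in full.
 */
static uint64_t used_data_old(const struct space_info_sketch *si, size_t n)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (si[i].flags & (BTRFS_BLOCK_GROUP_METADATA |
				   BTRFS_BLOCK_GROUP_SYSTEM))
			total += si[i].disk_total;
		else
			total += si[i].disk_used;
	}
	return total;
}

/*
 * Condition after the patch: anything that can hold data, which
 * includes mixed chunks, is charged only with what is really used.
 */
static uint64_t used_data_new(const struct space_info_sketch *si, size_t n)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (si[i].flags & BTRFS_BLOCK_GROUP_DATA)
			total += si[i].disk_used;
		else
			total += si[i].disk_total;
	}
	return total;
}
```

With one pure data space, one mixed space and one pure metadata space, the two functions differ only in how they charge the mixed space.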
[PATCH 2/6] btrfs: try to reclaim some space when chunk allocation fails
We cannot write data into files when there is only a tiny amount of space
left in the filesystem.

Reproduce steps:
 # mkfs.btrfs /dev/sda1
 # mount /dev/sda1 /mnt
 # dd if=/dev/zero of=/mnt/tmpfile0 bs=4K count=1
 # dd if=/dev/zero of=/mnt/tmpfile1 bs=4K count=99
   (fill the filesystem)
 # umount /mnt
 # mount /dev/sda1 /mnt
 # rm -f /mnt/tmpfile0
 # dd if=/dev/zero of=/mnt/tmpfile0 bs=4K count=1
   (failed with ENOSPC)

But if we do the last step again, we can write data successfully.

The cause of the problem is that btrfs didn't try to commit the current
transaction and reclaim some space when chunk allocation failed.

This patch fixes it by committing the current transaction to reclaim some
space when chunk allocation fails.

Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c |    9 +++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 7e5162e..4bcd875 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3162,8 +3162,12 @@ alloc:
 					     bytes + 2 * 1024 * 1024,
 					     alloc_target, 0);
 			btrfs_end_transaction(trans, root);
-			if (ret < 0)
-				return ret;
+			if (ret < 0) {
+				if (ret != -ENOSPC)
+					return ret;
+				else
+					goto commit_trans;
+			}
 
 			if (!data_sinfo) {
 				btrfs_set_inode_space_info(root, inode);
@@ -3174,6 +3178,7 @@ alloc:
 		spin_unlock(&data_sinfo->lock);
 
 		/* commit the current transaction and try again */
+commit_trans:
 		if (!committed && !root->fs_info->open_ioctl_trans) {
 			committed = 1;
 			trans = btrfs_join_transaction(root, 1);
--
1.7.0.1
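The control flow that the patch introduces (retry once after a commit when the chunk allocation returns -ENOSPC) can be modelled in a few lines. This is a simplified user-space sketch under the assumption that space released by a delete only becomes allocatable after a transaction commit; `struct fake_fs` and all three functions are hypothetical and do not exist in btrfs.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy model of a filesystem's space state. */
struct fake_fs {
	long free_bytes;	/* space that can be allocated right now */
	long pinned_bytes;	/* space freed by deletes, reusable after commit */
	int commits;		/* number of commits performed */
};

/* Stand-in for chunk allocation: fails with -ENOSPC when space is short. */
static int try_alloc_chunk(struct fake_fs *fs, long bytes)
{
	if (fs->free_bytes < bytes)
		return -ENOSPC;
	fs->free_bytes -= bytes;
	return 0;
}

/* Stand-in for a transaction commit: pinned space becomes free. */
static void commit_transaction(struct fake_fs *fs)
{
	fs->free_bytes += fs->pinned_bytes;
	fs->pinned_bytes = 0;
	fs->commits++;
}

/*
 * Reserve space. On -ENOSPC, commit once and retry instead of
 * returning the error immediately; any other error is returned as is.
 */
static int reserve_data_space(struct fake_fs *fs, long bytes)
{
	bool committed = false;
	int ret;

again:
	ret = try_alloc_chunk(fs, bytes);
	if (ret == 0)
		return 0;
	if (ret != -ENOSPC)
		return ret;
	if (!committed) {
		committed = true;
		commit_transaction(fs);
		goto again;
	}
	return -ENOSPC;
}
```

In this model, a reservation that fails only because freed space is still pinned succeeds on the retry, which mirrors the "do the last step again" observation in the reproduce steps.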
[PATCH 4/6] btrfs: restructure find_free_dev_extent()
- make it return the start position and length of the max free space when
  it cannot find a suitable free space.
- make it more readable

Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c |    4 +-
 fs/btrfs/volumes.c     |  155 +++
 2 files changed, 91 insertions(+), 68 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 4bcd875..7c1a053 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -8098,7 +8098,7 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 bytenr)
 	mutex_lock(&root->fs_info->chunk_mutex);
 	list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
 		u64 min_free = btrfs_block_group_used(&block_group->item);
-		u64 dev_offset, max_avail;
+		u64 dev_offset;
 
 		/*
 		 * check to make sure we can actually find a chunk with enough
@@ -8106,7 +8106,7 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 bytenr)
 		 */
 		if (device->total_bytes > device->bytes_used + min_free) {
 			ret = find_free_dev_extent(NULL, device, min_free,
-						   &dev_offset, &max_avail);
+						   &dev_offset, NULL);
 			if (!ret)
 				break;
 			ret = -1;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index e1028f4..15e8c3f 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -729,58 +729,82 @@ error:
 }
 
 /*
+ * find_free_dev_extent - find free space in the specified device
+ * @trans:	transaction handler
+ * @device:	the device which we search the free space in
+ * @num_bytes:	the size of the free space that we need
+ * @start:	store the start of the free space.
+ * @len:	the size of the free space. that we find, or the size of the max
+ * 		free space if we don't find suitable free space
+ *
  * this uses a pretty simple search, the expectation is that it is
  * called very infrequently and that a given device has a small number
  * of extents
+ *
+ * @start is used to store the start of the free space if we find. But if we
+ * don't find suitable free space, it will be used to store the start position
+ * of the max free space.
+ *
+ * @len is used to store the size of the free space that we find.
+ * But if we don't find suitable free space, it is used to store the size of
+ * the max free space.
  */
 int find_free_dev_extent(struct btrfs_trans_handle *trans,
 			 struct btrfs_device *device, u64 num_bytes,
-			 u64 *start, u64 *max_avail)
+			 u64 *start, u64 *len)
 {
 	struct btrfs_key key;
 	struct btrfs_root *root = device->dev_root;
-	struct btrfs_dev_extent *dev_extent = NULL;
+	struct btrfs_dev_extent *dev_extent;
 	struct btrfs_path *path;
-	u64 hole_size = 0;
-	u64 last_byte = 0;
-	u64 search_start = 0;
+	u64 hole_size;
+	u64 max_hole_start;
+	u64 max_hole_size;
+	u64 extent_end;
+	u64 search_start;
 	u64 search_end = device->total_bytes;
 	int ret;
-	int slot = 0;
-	int start_found;
+	int slot;
 	struct extent_buffer *l;
 
-	path = btrfs_alloc_path();
-	if (!path)
-		return -ENOMEM;
-	path->reada = 2;
-	start_found = 0;
-
 	/* FIXME use last free of some kind */
 
 	/* we don't want to overwrite the superblock on the drive,
 	 * so we make sure to start at an offset of at least 1MB
 	 */
-	search_start = max((u64)1024 * 1024, search_start);
+	search_start = 1024 * 1024;
 
-	if (root->fs_info->alloc_start + num_bytes <= device->total_bytes)
+	if (root->fs_info->alloc_start + num_bytes <= search_end)
 		search_start = max(root->fs_info->alloc_start, search_start);
 
+	max_hole_start = search_start;
+	max_hole_size = 0;
+
+	if (search_start >= search_end) {
+		ret = -ENOSPC;
+		goto error;
+	}
+
+	path = btrfs_alloc_path();
+	if (!path) {
+		ret = -ENOMEM;
+		goto error;
+	}
+	path->reada = 2;
+
 	key.objectid = device->devid;
 	key.offset = search_start;
 	key.type = BTRFS_DEV_EXTENT_KEY;
+
 	ret = btrfs_search_slot(trans, root, &key, path, 0, 0);
 	if (ret < 0)
-		goto error;
+		goto out;
 	if (ret > 0) {
 		ret = btrfs_previous_item(root, path, key.objectid, key.type);
 		if (ret < 0)
-			goto error;
-		if (ret > 0)
-			start_found = 1;
+			goto out;
 	}
-	l = path->nodes[0];
-	btrfs_item_key_to_cpu(l, &key, path->slots[0]);
+
 	while (1) {
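The contract described in the new comment (return the first hole that fits, otherwise report the largest hole seen) can be illustrated without the btree. The following is a user-space sketch only, assuming the allocated extents are given as a sorted, non-overlapping array; `struct dev_extent_sketch`, `find_free_hole()` and the `MIB()` macro are hypothetical names, not part of the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MIB(x) ((uint64_t)(x) << 20)

/* Hypothetical stand-in for an allocated device extent. */
struct dev_extent_sketch {
	uint64_t start;
	uint64_t len;
};

/*
 * Scan the gaps between sorted, non-overlapping extents inside
 * [search_start, search_end). Returns 0 and the first hole of at least
 * num_bytes, or -ENOSPC and the largest hole that was seen.
 * len may be NULL when the caller does not need the size.
 */
static int find_free_hole(const struct dev_extent_sketch *ext, size_t n,
			  uint64_t search_start, uint64_t search_end,
			  uint64_t num_bytes, uint64_t *start, uint64_t *len)
{
	uint64_t max_hole_start = search_start;
	uint64_t max_hole_size = 0;
	uint64_t pos = search_start;
	int ret = -ENOSPC;
	size_t i;

	/* The pass with i == n handles the gap after the last extent. */
	for (i = 0; i <= n && pos < search_end; i++) {
		uint64_t gap_end = search_end;
		uint64_t ext_end = search_end;

		if (i < n) {
			ext_end = ext[i].start + ext[i].len;
			if (ext_end <= pos)
				continue;	/* extent lies before the window */
			if (ext[i].start < search_end)
				gap_end = ext[i].start;
		}

		if (gap_end > pos) {
			uint64_t hole_size = gap_end - pos;

			if (hole_size > max_hole_size) {
				max_hole_start = pos;
				max_hole_size = hole_size;
			}
			if (hole_size >= num_bytes) {
				ret = 0;
				break;
			}
		}
		pos = ext_end;
	}

	*start = max_hole_start;
	if (len)
		*len = max_hole_size;
	return ret;
}
```

A caller that gets -ENOSPC can still read `*start` and `*len` and decide to allocate a smaller chunk from the largest hole, which is the behaviour the first bullet of the patch description asks for.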
[PATCH 3/6] btrfs: fix wrong calculation of stripe size
There are two tiny problems:
- One is that when we check whether the chunk size is greater than the max
  chunk size, we should take mirrors into account, but the original code
  didn't.
- The other is that btrfs shouldn't use the size of the residual free space
  as the length of a dup chunk when doing chunk allocation. The device
  space that a dup chunk needs is twice as large as the chunk size, so if
  we use the size of the residual free space as the length of a dup chunk,
  we cannot get enough free space.

Fix it.

Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 fs/btrfs/volumes.c |   10 --
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 177b731..e1028f4 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2177,6 +2177,7 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	int num_stripes = 1;
 	int min_stripes = 1;
 	int sub_stripes = 0;
+	int ncopies = 1;
 	int looped = 0;
 	int ret;
 	int index;
@@ -2197,12 +2198,14 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	if (type & (BTRFS_BLOCK_GROUP_DUP)) {
 		num_stripes = 2;
 		min_stripes = 2;
+		ncopies = 2;
 	}
 	if (type & (BTRFS_BLOCK_GROUP_RAID1)) {
 		if (fs_devices->rw_devices < 2)
 			return -ENOSPC;
 		num_stripes = 2;
 		min_stripes = 2;
+		ncopies = 2;
 	}
 	if (type & (BTRFS_BLOCK_GROUP_RAID10)) {
 		num_stripes = fs_devices->rw_devices;
@@ -2210,6 +2213,7 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 			return -ENOSPC;
 		num_stripes &= ~(u32)1;
 		sub_stripes = 2;
+		ncopies = 2;
 		min_stripes = 4;
 	}
 
@@ -2239,8 +2243,8 @@ again:
 		map->num_stripes = num_stripes;
 	}
 
-	if (calc_size * num_stripes > max_chunk_size) {
-		calc_size = max_chunk_size;
+	if (calc_size * num_stripes / ncopies > max_chunk_size) {
+		calc_size = max_chunk_size * ncopies;
 		do_div(calc_size, num_stripes);
 		do_div(calc_size, stripe_len);
 		calc_size *= stripe_len;
@@ -2321,6 +2325,8 @@ again:
 		if (!looped && max_avail > 0) {
 			looped = 1;
 			calc_size = max_avail;
+			if (type & BTRFS_BLOCK_GROUP_DUP)
+				calc_size /= 2;
 			goto again;
 		}
 		kfree(map);
--
1.7.0.1
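The arithmetic in the two fixes can be checked with plain integers. The sketch below copies the clamping expression from the hunk into a user-space function, with ordinary division in place of do_div(); the function names are hypothetical. Passing `ncopies = 1` reproduces the old behaviour, which makes the difference for mirrored profiles visible.

```c
#include <stdint.h>

/*
 * Clamp the per-device stripe size so that the logical chunk size
 * (calc_size * num_stripes / ncopies) does not exceed max_chunk_size,
 * then round down to a multiple of stripe_len.
 */
static uint64_t clamp_stripe_size(uint64_t calc_size, uint64_t num_stripes,
				  uint64_t ncopies, uint64_t max_chunk_size,
				  uint64_t stripe_len)
{
	if (calc_size * num_stripes / ncopies > max_chunk_size) {
		calc_size = max_chunk_size * ncopies;
		calc_size /= num_stripes;
		calc_size /= stripe_len;
		calc_size *= stripe_len;
	}
	return calc_size;
}

/*
 * A dup chunk keeps both copies on the same device, so only half of
 * the residual free space can be used as the stripe size.
 */
static uint64_t dup_stripe_from_residual(uint64_t max_avail)
{
	return max_avail / 2;
}
```

For a two-stripe mirrored chunk with a 1 GiB stripe size and a 1 GiB maximum chunk size, the old check (`ncopies = 1`) halves the stripe to 512 MiB, while the new check leaves it at 1 GiB because the logical size is only 1 GiB.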
[PATCH 6/6] btrfs: fix wrong free space information of btrfs
When we store data with a RAID profile in btrfs on two or more disks of
different sizes, the df command shows that there is some free space in the
filesystem, but in fact the user cannot write any data. The df command
shows wrong free space information for btrfs.

 # mkfs.btrfs -d raid1 /dev/sda9 /dev/sda10
 # btrfs-show
 Label: none  uuid: a95cd49e-6e33-45b8-8741-a36153ce4b64
 	Total devices 2 FS bytes used 28.00KB
 	devid    1 size 5.01GB used 2.03GB path /dev/sda9
 	devid    2 size 10.00GB used 2.01GB path /dev/sda10
 # btrfs device scan /dev/sda9 /dev/sda10
 # mount /dev/sda9 /mnt
 # dd if=/dev/zero of=tmpfile0 bs=4K count=99
   (fill the filesystem)
 # sync
 # df -TH
 Filesystem    Type     Size   Used  Avail Use% Mounted on
 /dev/sda9    btrfs      17G   8.6G   5.4G  62% /mnt
 # btrfs-show
 Label: none  uuid: a95cd49e-6e33-45b8-8741-a36153ce4b64
 	Total devices 2 FS bytes used 3.99GB
 	devid    1 size 5.01GB used 5.01GB path /dev/sda9
 	devid    2 size 10.00GB used 4.99GB path /dev/sda10

This is because btrfs cannot allocate chunks when one of the paired disks
has no space. The free space on the other disks can never be used and
should be subtracted from the total space, but btrfs doesn't subtract it.
This is confusing to the user.

This patch fixes it by calculating the free space that can be used to
allocate chunks.

Implementation:
1. get the free space of all the devices, and align it by stripe length.
2. sort the devices by their free space.
3. check the free space of the devices,
   3.1. if it is not zero, check the number of devices that have more free
        space than this device:
        if the number of devices is beyond the min stripe number, the free
        space can be used, and is added to the total free space.
        if the number of devices is below the min stripe number, we cannot
        use the free space, and the check ends.
   3.2. if the free space is zero, check the next device, goto 3.1

This implementation is much like a fake chunk allocation.

After applying this patch, df shows correct space information:
 # df -TH
 Filesystem    Type     Size   Used  Avail Use% Mounted on
 /dev/sda9    btrfs      17G   8.6G      0 100% /mnt

Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
---
 fs/btrfs/ctree.h       |    2 +
 fs/btrfs/extent-tree.c |   58 +++-
 fs/btrfs/super.c       |  146 ++--
 fs/btrfs/volumes.c     |   84 +++
 fs/btrfs/volumes.h     |    3 +
 5 files changed, 286 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index af52f6d..a068a5d 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2145,6 +2145,7 @@ int btrfs_make_block_group(struct btrfs_trans_handle *trans,
 int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
 			     struct btrfs_root *root, u64 group_start);
 u64 btrfs_reduce_alloc_profile(struct btrfs_root *root, u64 flags);
+u64 btrfs_get_alloc_profile(struct btrfs_root *root, int data);
 void btrfs_set_inode_space_info(struct btrfs_root *root, struct inode *ionde);
 void btrfs_clear_space_info_full(struct btrfs_fs_info *info);
 int btrfs_check_data_free_space(struct inode *inode, u64 bytes);
@@ -2188,6 +2189,7 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
 int btrfs_set_block_group_rw(struct btrfs_root *root,
 			     struct btrfs_block_group_cache *cache);
 void btrfs_put_block_group_cache(struct btrfs_fs_info *info);
+u64 btrfs_account_ro_block_groups_free_space(struct btrfs_space_info *sinfo);
 /* ctree.c */
 int btrfs_bin_search(struct extent_buffer *eb, struct btrfs_key *key,
 		     int level, int *slot);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 7c1a053..fd465e1 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3090,7 +3090,7 @@ static u64 get_alloc_profile(struct btrfs_root *root, u64 flags)
 	return btrfs_reduce_alloc_profile(root, flags);
 }
 
-static u64 btrfs_get_alloc_profile(struct btrfs_root *root, int data)
+u64 btrfs_get_alloc_profile(struct btrfs_root *root, int data)
 {
 	u64 flags;
 
@@ -8018,6 +8018,62 @@ out:
 	return ret;
 }
 
+/*
+ * helper to account the unused space of all the readonly block group in the
+ * list. takes mirrors into account.
+ */
+static u64 __btrfs_get_ro_block_group_free_space(struct list_head *groups_list)
+{
+	struct btrfs_block_group_cache *block_group;
+	u64 free_bytes = 0;
+	int factor;
+
+	list_for_each_entry(block_group, groups_list, list) {
+		spin_lock(&block_group->lock);
+
+		if (!block_group->ro) {
+			spin_unlock(&block_group->lock);
+			continue;
+		}
+
+		if (block_group->flags & (BTRFS_BLOCK_GROUP_RAID1 |
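The "fake chunk allocation" idea from the implementation steps can be sketched in user space. The code below is one possible reading of those steps and not the code from the patch: it aligns and sorts the per-device free space, then repeatedly lets the smallest remaining device form stripes with its nearest larger neighbours. The function name and all parameters are hypothetical, and the pairing choice makes the result a conservative estimate rather than an optimum.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator: largest free space first. */
static int cmp_desc(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x < y) - (x > y);
}

/*
 * Estimate how many bytes of data can still be stored when every chunk
 * needs min_stripes devices and keeps ncopies copies of the data.
 * free_bytes[] is aligned, sorted and consumed in place.
 */
static uint64_t usable_data_space(uint64_t *free_bytes, size_t nr_devices,
				  size_t min_stripes, uint64_t ncopies,
				  uint64_t stripe_len)
{
	uint64_t raw = 0;
	size_t i;

	if (min_stripes == 0 || ncopies == 0 || stripe_len == 0)
		return 0;

	/* Step 1: align each device's free space down to the stripe length. */
	for (i = 0; i < nr_devices; i++)
		free_bytes[i] -= free_bytes[i] % stripe_len;

	/* Step 2: sort the devices by free space. */
	qsort(free_bytes, nr_devices, sizeof(*free_bytes), cmp_desc);

	/* Step 3: fake-allocate, starting from the smallest device. */
	while (nr_devices >= min_stripes) {
		uint64_t f = free_bytes[nr_devices - 1];

		if (f > 0) {
			/* Enough devices with at least f bytes: use them. */
			raw += f * min_stripes;
			for (i = nr_devices - min_stripes; i < nr_devices; i++)
				free_bytes[i] -= f;
		}
		/* The smallest device is now exhausted; drop it. */
		nr_devices--;
	}

	return raw / ncopies;
}
```

For the two-disk raid1 case in the description, where one device is full and the other still has about 5 GB free, this sketch reports zero usable bytes, which matches the corrected df output.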
Re: [PATCH 4/6] btrfs: restructure find_free_dev_extent()
Hi,

this patch seems to have the same intention as the patch I sent to the
list on Dec 11 Fixing the chunk allocator to allow it to better utilize
the devices. The result is quite similar, except that you left the line

	search_start = max(root->fs_info->alloc_start, search_start);

in place, which could lead to disregarding the configured alloc_start.

As both patches address the same problem, it might be good to compare them
in more detail.

-- Arne

On 22.12.2010 11:47, Miao Xie wrote:
> - make it return the start position and length of the max free space when
>   it can not find a suitable free space.
> - make it more readability
>
> Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
> ---
>  fs/btrfs/extent-tree.c |    4 +-
>  fs/btrfs/volumes.c     |  155 +++
>  2 files changed, 91 insertions(+), 68 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 4bcd875..7c1a053 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -8098,7 +8098,7 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 bytenr)
>  	mutex_lock(&root->fs_info->chunk_mutex);
>  	list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
>  		u64 min_free = btrfs_block_group_used(&block_group->item);
> -		u64 dev_offset, max_avail;
> +		u64 dev_offset;
>
>  		/*
>  		 * check to make sure we can actually find a chunk with enough
> @@ -8106,7 +8106,7 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 bytenr)
>  		 */
>  		if (device->total_bytes > device->bytes_used + min_free) {
>  			ret = find_free_dev_extent(NULL, device, min_free,
> -						   &dev_offset, &max_avail);
> +						   &dev_offset, NULL);
>  			if (!ret)
>  				break;
>  			ret = -1;
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index e1028f4..15e8c3f 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -729,58 +729,82 @@ error:
>  }
>
>  /*
> + * find_free_dev_extent - find free space in the specified device
> + * @trans:	transaction handler
> + * @device:	the device which we search the free space in
> + * @num_bytes:	the size of the free space that we need
> + * @start:	store the start of the free space.
> + * @len:	the size of the free space. that we find, or the size of the max
> + * 		free space if we don't find suitable free space
> + *
>   * this uses a pretty simple search, the expectation is that it is
>   * called very infrequently and that a given device has a small number
>   * of extents
> + *
> + * @start is used to store the start of the free space if we find. But if we
> + * don't find suitable free space, it will be used to store the start position
> + * of the max free space.
> + *
> + * @len is used to store the size of the free space that we find.
> + * But if we don't find suitable free space, it is used to store the size of
> + * the max free space.
>   */
>  int find_free_dev_extent(struct btrfs_trans_handle *trans,
>  			 struct btrfs_device *device, u64 num_bytes,
> -			 u64 *start, u64 *max_avail)
> +			 u64 *start, u64 *len)
>  {
>  	struct btrfs_key key;
>  	struct btrfs_root *root = device->dev_root;
> -	struct btrfs_dev_extent *dev_extent = NULL;
> +	struct btrfs_dev_extent *dev_extent;
>  	struct btrfs_path *path;
> -	u64 hole_size = 0;
> -	u64 last_byte = 0;
> -	u64 search_start = 0;
> +	u64 hole_size;
> +	u64 max_hole_start;
> +	u64 max_hole_size;
> +	u64 extent_end;
> +	u64 search_start;
>  	u64 search_end = device->total_bytes;
>  	int ret;
> -	int slot = 0;
> -	int start_found;
> +	int slot;
>  	struct extent_buffer *l;
>
> -	path = btrfs_alloc_path();
> -	if (!path)
> -		return -ENOMEM;
> -	path->reada = 2;
> -	start_found = 0;
> -
>  	/* FIXME use last free of some kind */
>
>  	/* we don't want to overwrite the superblock on the drive,
>  	 * so we make sure to start at an offset of at least 1MB
>  	 */
> -	search_start = max((u64)1024 * 1024, search_start);
> +	search_start = 1024 * 1024;
>
> -	if (root->fs_info->alloc_start + num_bytes <= device->total_bytes)
> +	if (root->fs_info->alloc_start + num_bytes <= search_end)
>  		search_start = max(root->fs_info->alloc_start, search_start);
>
> +	max_hole_start = search_start;
> +	max_hole_size = 0;
> +
> +	if (search_start >= search_end) {
> +		ret = -ENOSPC;
> +		goto error;
> +	}
> +
> +	path = btrfs_alloc_path();
> +	if (!path) {
> +		ret = -ENOMEM;
> +		goto error;
> +	}
> +	path->reada = 2;
> +
>  	key.objectid = device->devid;
>  	key.offset = search_start;
>  	key.type = BTRFS_DEV_EXTENT_KEY;
> +
>  	ret = btrfs_search_slot(trans,
Re: [PATCH 2/6] btrfs: try to reclaim some space when chunk allocation fails
On Wed, Dec 22, 2010 at 06:47:20PM +0800, Miao Xie wrote:
> We cannot write data into files when there is only a tiny amount of space
> left in the filesystem.
>
> Reproduce steps:
>  # mkfs.btrfs /dev/sda1
>  # mount /dev/sda1 /mnt
>  # dd if=/dev/zero of=/mnt/tmpfile0 bs=4K count=1
>  # dd if=/dev/zero of=/mnt/tmpfile1 bs=4K count=99
>    (fill the filesystem)
>  # umount /mnt
>  # mount /dev/sda1 /mnt
>  # rm -f /mnt/tmpfile0
>  # dd if=/dev/zero of=/mnt/tmpfile0 bs=4K count=1
>    (failed with ENOSPC)
>
> But if we do the last step again, we can write data successfully.
>
> The cause of the problem is that btrfs didn't try to commit the current
> transaction and reclaim some space when chunk allocation failed.
>
> This patch fixes it by committing the current transaction to reclaim some
> space when chunk allocation fails.
>
> Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>

Reviewed-by: Josef Bacik <jo...@redhat.com>

Thanks,

Josef
Re: [PATCH 0/6] random bugfixes of the space management
Excerpts from Josef Bacik's message of 2010-12-22 14:22:54 -0500:
> On Wed, Dec 22, 2010 at 06:47:08PM +0800, Miao Xie wrote:
> > Hello,
> >
> > I have a bunch of random fixes for the space management in
> >
> > 	git://repo.or.cz/linux-btrfs-devel.git space-manage
> >
> > They are ENOSPC fixes, as well as fixes for the df command.
> >
> > The first and the last patches fix the wrong free space information
> > reported by the df command. The second one fixes ENOSPC when there is
> > only a tiny amount of space left in the filesystem. The third fixes the
> > wrong calculation of the stripe size. The 4th and 5th patches fix the
> > chunk allocation problem when the block devices do not have enough
> > space to allocate a default-size chunk.
>
> I'll review the rest of them when I have more time, thanks for these Miao.

For now they are going with the new compression code into a new branch for
2.6.38 in my git tree. I might have to rebase as patches are added and
removed, but these will all go in.

-chris
Re: btrfs: 21 minutes to read 1.2M file directory
On Tue, Dec 21, 2010 at 03:07:33AM +0200, Felipe Contreras wrote:
> On Tue, Dec 21, 2010 at 12:24 AM, Andy Isaacson <a...@hexapodia.org> wrote:
> > I have a directory with 1.2M files in it, which makes readdir very slow
> > on btrfs with cold caches (although it's reasonably fast with hot caches
> > as in the first example below):
>
> Sounds like:
> Bug 21562 - btrfs is dead slow due to fragmentation
> https://bugzilla.kernel.org/show_bug.cgi?id=21562

Hmmm, how do I look at the btree layout for a given inode?

btrfs-image for this filesystem is 1.7GiB .bz2, so I'm afraid it's not
reasonable to publish it.

-andy
Re: btrfs: 21 minutes to read 1.2M file directory
On Wed, Dec 22, 2010 at 12:39:15PM -0800, Andy Isaacson wrote:
> On Tue, Dec 21, 2010 at 03:07:33AM +0200, Felipe Contreras wrote:
> > On Tue, Dec 21, 2010 at 12:24 AM, Andy Isaacson <a...@hexapodia.org> wrote:
> > > I have a directory with 1.2M files in it, which makes readdir very
> > > slow on btrfs with cold caches (although it's reasonably fast with
> > > hot caches as in the first example below):
> >
> > Sounds like:
> > Bug 21562 - btrfs is dead slow due to fragmentation
> > https://bugzilla.kernel.org/show_bug.cgi?id=21562
>
> Hmmm, how do I look at the btree layout for a given inode?

There's documentation on the tree structures at [1] and [2]. If you know
the inode number of the object you're interested in, you need to look in
the FS tree for the subvolume it's in and find the
(inode_number, EXTENT_DATA, ...) keys for the file. Each of those records
will reference an individual disk extent -- and you can get the disk start
position and length of the extent from the data stored under the key.

   Hugo.

[1] https://btrfs.wiki.kernel.org/index.php/Btree_Items
[2] https://btrfs.wiki.kernel.org/index.php/Data_Structures

--
=== Hugo Mills: h...@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Hail and greetings. We are a flat-pack invasion force from ---
                     Planet Ikea. We come in pieces.
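The key lookup described above relies on items being ordered by (objectid, type, offset), so all EXTENT_DATA items of one inode sit next to each other. The following is a user-space sketch of that ordering on a plain sorted array, not a walk of a real FS tree; `struct key_sketch` and the helper functions are hypothetical, and only the two item type values are taken from fs/btrfs/ctree.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Item type values as defined in fs/btrfs/ctree.h. */
#define BTRFS_INODE_ITEM_KEY   1
#define BTRFS_EXTENT_DATA_KEY  108

/* Hypothetical CPU-order stand-in for struct btrfs_key. */
struct key_sketch {
	uint64_t objectid;	/* inode number for items in an FS tree */
	uint8_t type;
	uint64_t offset;	/* file offset for EXTENT_DATA items */
};

/* Keys compare by objectid, then type, then offset. */
static int key_cmp(const struct key_sketch *a, const struct key_sketch *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Index of the first key that is not smaller than target. */
static size_t key_lower_bound(const struct key_sketch *keys, size_t n,
			      const struct key_sketch *target)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (key_cmp(&keys[mid], target) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/*
 * Count the EXTENT_DATA items of inode ino and store the index of the
 * first one in *first. Each such item describes one file extent.
 */
static size_t count_extent_data(const struct key_sketch *keys, size_t n,
				uint64_t ino, size_t *first)
{
	struct key_sketch target = { ino, BTRFS_EXTENT_DATA_KEY, 0 };
	size_t i = key_lower_bound(keys, n, &target);
	size_t count = 0;

	*first = i;
	while (i + count < n && keys[i + count].objectid == ino &&
	       keys[i + count].type == BTRFS_EXTENT_DATA_KEY)
		count++;
	return count;
}
```

A file that is stored in many small extents shows up here as a long run of EXTENT_DATA keys for one inode, which is one way to spot the kind of fragmentation mentioned in the bug report.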
Re: [PATCH 4/6] btrfs: restructure find_free_dev_extent()
On Wed, 22 Dec 2010 13:07:18 +0100, Arne Jansen wrote:
> this patch seems to have the same intention as the patch I sent to the
> list on Dec 11 Fixing the chunk allocator to allow it to better utilize
> the devices. The result is quite similar, except that you left the line

Ahhh, part of the code is similar indeed. But I think this patch is
different from yours. This function can now return the offset of the max
free space when it cannot find a suitable free space; that is the main
purpose of this patch. I think this is also the biggest difference between
this patch and yours. The original function is what I need, so I retained
it, and this is why the result is similar.

> 	search_start = max(root->fs_info->alloc_start, search_start);
>
> in place, which could lead to disregarding the configured alloc_start.

Judging from the original code, I think alloc_start is just a suggested
parameter: if there is not enough space on the device, we simply ignore
it. Sometimes we must retain the original semantics.

Thanks
Miao

> As both patches address the same problem, it might be good to compare them
> in more detail.
>
> -- Arne
>
> On 22.12.2010 11:47, Miao Xie wrote:
> > - make it return the start position and length of the max free space
> >   when it can not find a suitable free space.
> > - make it more readability
> >
> > Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
> > ---
> >  fs/btrfs/extent-tree.c |    4 +-
> >  fs/btrfs/volumes.c     |  155 +++
> >  2 files changed, 91 insertions(+), 68 deletions(-)
> >
> > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> > index 4bcd875..7c1a053 100644
> > --- a/fs/btrfs/extent-tree.c
> > +++ b/fs/btrfs/extent-tree.c
> > @@ -8098,7 +8098,7 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 bytenr)
> >  	mutex_lock(&root->fs_info->chunk_mutex);
> >  	list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
> >  		u64 min_free = btrfs_block_group_used(&block_group->item);
> > -		u64 dev_offset, max_avail;
> > +		u64 dev_offset;
> >
> >  		/*
> >  		 * check to make sure we can actually find a chunk with enough
> > @@ -8106,7 +8106,7 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 bytenr)
> >  		 */
> >  		if (device->total_bytes > device->bytes_used + min_free) {
> >  			ret = find_free_dev_extent(NULL, device, min_free,
> > -						   &dev_offset, &max_avail);
> > +						   &dev_offset, NULL);
> >  			if (!ret)
> >  				break;
> >  			ret = -1;
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index e1028f4..15e8c3f 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -729,58 +729,82 @@ error:
> >  }
> >
> >  /*
> > + * find_free_dev_extent - find free space in the specified device
> > + * @trans:	transaction handler
> > + * @device:	the device which we search the free space in
> > + * @num_bytes:	the size of the free space that we need
> > + * @start:	store the start of the free space.
> > + * @len:	the size of the free space. that we find, or the size of the max
> > + * 		free space if we don't find suitable free space
> > + *
> >   * this uses a pretty simple search, the expectation is that it is
> >   * called very infrequently and that a given device has a small number
> >   * of extents
> > + *
> > + * @start is used to store the start of the free space if we find. But if we
> > + * don't find suitable free space, it will be used to store the start position
> > + * of the max free space.
> > + *
> > + * @len is used to store the size of the free space that we find.
> > + * But if we don't find suitable free space, it is used to store the size of
> > + * the max free space.
> >   */
> >  int find_free_dev_extent(struct btrfs_trans_handle *trans,
> >  			 struct btrfs_device *device, u64 num_bytes,
> > -			 u64 *start, u64 *max_avail)
> > +			 u64 *start, u64 *len)
> >  {
> >  	struct btrfs_key key;
> >  	struct btrfs_root *root = device->dev_root;
> > -	struct btrfs_dev_extent *dev_extent = NULL;
> > +	struct btrfs_dev_extent *dev_extent;
> >  	struct btrfs_path *path;
> > -	u64 hole_size = 0;
> > -	u64 last_byte = 0;
> > -	u64 search_start = 0;
> > +	u64 hole_size;
> > +	u64 max_hole_start;
> > +	u64 max_hole_size;
> > +	u64 extent_end;
> > +	u64 search_start;
> >  	u64 search_end = device->total_bytes;
> >  	int ret;
> > -	int slot = 0;
> > -	int start_found;
> > +	int slot;
> >  	struct extent_buffer *l;
> >
> > -	path = btrfs_alloc_path();
> > -	if (!path)
> > -		return -ENOMEM;
> > -	path->reada = 2;
> > -	start_found = 0;
> > -
> >  	/* FIXME use last free of some kind */
> >
> >  	/* we don't want to overwrite the superblock on the drive,
> >  	 * so we make sure to start at an offset of at least 1MB
> >  	 */
> > -	search_start = max((u64)1024 * 1024, search_start);
> > +	search_start = 1024 * 1024;
> >
> > -	if (root->fs_info->alloc_start + num_bytes <=