On 06/03/2016 10:43 PM, Josef Bacik wrote:
On 04/01/2016 02:35 AM, Qu Wenruo wrote:
Core implementation of inband de-duplication.
It reuses the async_cow_start() facility to calculate the dedupe hash,
and uses that hash to do inband de-duplication at the extent level.
The workflow is as follows:
1) Run the delalloc range for an inode
2) Calculate the hash for the delalloc range in units of dedupe_bs
3) For the hash match (duplicated) case, just increase the source
   extent's ref count and insert a file extent.
   For the hash miss case, go through the normal cow_file_range()
   fallback, and add the hash into the dedupe tree.
   Compression for the hash miss case is not supported yet.
The current implementation stores all dedupe hashes in an in-memory
rb-tree, with LRU behavior to enforce the limit.
Signed-off-by: Wang Xiaoguang <wangxg.f...@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwen...@cn.fujitsu.com>
---
fs/btrfs/extent-tree.c | 18 ++++
fs/btrfs/inode.c | 235 ++++++++++++++++++++++++++++++++++++++++++-------
fs/btrfs/relocation.c | 16 ++++
3 files changed, 236 insertions(+), 33 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 53e1297..dabd721 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
<snip>
@@ -1076,6 +1135,68 @@ out_unlock:
goto out;
}
+static int hash_file_ranges(struct inode *inode, u64 start, u64 end,
+ struct async_cow *async_cow, int *num_added)
+{
+ struct btrfs_root *root = BTRFS_I(inode)->root;
+ struct btrfs_fs_info *fs_info = root->fs_info;
+ struct btrfs_dedupe_info *dedupe_info = fs_info->dedupe_info;
+ struct page *locked_page = async_cow->locked_page;
+ u16 hash_algo;
+ u64 actual_end;
+ u64 isize = i_size_read(inode);
+ u64 dedupe_bs;
+ u64 cur_offset = start;
+ int ret = 0;
+
+ actual_end = min_t(u64, isize, end + 1);
+ /* If dedupe is not enabled, don't split extent into dedupe_bs */
+ if (fs_info->dedupe_enabled && dedupe_info) {
+ dedupe_bs = dedupe_info->blocksize;
+ hash_algo = dedupe_info->hash_type;
+ } else {
+ dedupe_bs = SZ_128M;
+ /* Just a dummy value, to avoid a NULL pointer dereference */
+ hash_algo = BTRFS_DEDUPE_HASH_SHA256;
+ }
+
+ while (cur_offset < end) {
+ struct btrfs_dedupe_hash *hash = NULL;
+ u64 len;
+
+ len = min(end + 1 - cur_offset, dedupe_bs);
+ if (len < dedupe_bs)
+ goto next;
+
+ hash = btrfs_dedupe_alloc_hash(hash_algo);
+ if (!hash) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ ret = btrfs_dedupe_calc_hash(fs_info, inode, cur_offset, hash);
+ if (ret < 0)
+ goto out;
+
+ ret = btrfs_dedupe_search(fs_info, inode, cur_offset, hash);
+ if (ret < 0)
+ goto out;
You leak hash in both of these cases. Also if btrfs_dedup_search
<snip>
+ if (ret < 0)
+ goto out_qgroup;
+
+ /*
+ * Hash hit won't create a new data extent, so its reserved quota
+ * space won't be freed by new delayed_ref_head.
+ * Need to free it here.
+ */
+ if (btrfs_dedupe_hash_hit(hash))
+ btrfs_qgroup_free_data(inode, file_pos, ram_bytes);
+
+ /* Add missed hash into dedupe tree */
+ if (hash && hash->bytenr == 0) {
+ hash->bytenr = ins.objectid;
+ hash->num_bytes = ins.offset;
+ ret = btrfs_dedupe_add(trans, root->fs_info, hash);
I don't want to flip read-only if we fail this in the in-memory mode.
Thanks,
Josef
Right. Unlike the btrfs_dedupe_del() case, if we fail to insert the
hash, nothing bad happens; we would just slightly reduce the dedupe
rate.
I'm OK with skipping the dedupe_add() error.
Thanks,
Qu
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html