On Tue, Jul 28, 2015 at 04:30:36PM +0800, Qu Wenruo wrote: > Although Liu Bo has already submitted a V10 version of his deduplication > implement, here is another implement for it. > > [[CORE FEATURES]] > The main design concept is the following: > 1) Controllable memory usage > 2) No guarantee to dedup every duplication. > 3) No on-disk format change or new format > 4) Page size level deduplication > > [[IMPLEMENT]] > Implement details includes the following: > 1) LRU hash maps to limit the memory usage > The hash -> extent mapping is control by LRU (or unlimited), to > get a controllable memory usage (can be tuned by mount option) > alone with controllable read/write overhead used for hash searching. > > 2) Reuse existing ordered_extent infrastructure > For duplicated page, it will still submit a ordered_extent(only one > page long), to make the full use of all existing infrastructure. > But only not submit a bio. > This can reduce the number of code lines. > > 3) Mount option to control dedup behavior > Deduplication and its memory usage can be tuned by mount option. > No need to indicated ioctl interface. > And further more, it can easily support BTRFS_INODE flag like > compression, to allow further per file dedup fine tunning. > > [[TODO]] > 1. Add support for compressed extent > Shouldn't be quite hard. > 2. Try to merge dedup extent to reduce metadata size > Currently, dedup extent is always in 4K size, although its reference > source can be quite large. > 3. Add support for per file dedup flags > Much easier, just like compression flags. > > [[KNOWN BUG, NEED HELP!]] > On the other hand, since it's still a RFC patch, it must has one or more > problem:
You may have a look at my patchset, one of them is aimed to address the similar problem. Thanks, -liubo > 1) Race between __btrfs_free_extent() and dedup ordered_extent. > The hook in __btrfs_free_extent() will free the corresponding hashes > of a extent, even there is a dedup ordered_extent referring it. > > The problem will happen like the following case: > ====================================================================== > cow_file_range() > Submit dedup ordered_extent for extent A > > commit_transaction() > Extent A needs freeing. As the its ref is decreased to 0. > And dedup ordered_extent can increase only when it hit endio time. > > finish_ordered_io() > Add reference to Extent A for dedup ordered_extent. > But it is already freed in previous transaction. > Causing abort_transaction(). > ====================================================================== > I'd like to keep the current ordered_extent method, as it adds the > least number of code lines. > But I can't find a good idea to either delay transaction until dedup > ordered_extent is done or things like that. > > Trans->ordered seems to be a good idea, but it seems to cause list > corruption without extra protection in tree log infrastructure. > > That's the only problem spotted yet. > Any early review or advice/question on the design is welcomed. > > Thanks. > > Qu Wenruo (14): > btrfs: file-item: Introduce btrfs_setup_file_extent function. > btrfs: Use btrfs_fill_file_extent to reduce duplicated codes > btrfs: dedup: Add basic init/free functions for inband dedup. > btrfs: dedup: Add internal add/remove/search function for btrfs dedup. > btrfs: dedup: add ordered extent hook for inband dedup > btrfs: dedup: Apply dedup hook for write time dedup. > btrfs: extent_map: Add new dedup flag and corresponding hook. > btrfs: extent-map: Introduce orig_block_start member for extent-map. > btrfs: dedup: Add inband dedup hook for read extent. > btrfs: dedup: Introduce btrfs_dedup_free_extent_range function. > btrfs: dedup: Add hook to free dedup hash at extent free time. > btrfs: dedup: Add mount option support for btrfs inband deduplication. > Btrfs: dedup: Support dedup change at remount time. > btrfs: dedup: Add mount option output for inband dedup. > > fs/btrfs/Makefile | 2 +- > fs/btrfs/ctree.h | 16 ++ > fs/btrfs/dedup.c | 701 > ++++++++++++++++++++++++++++++++++++++++++++++++ > fs/btrfs/dedup.h | 132 +++++++++ > fs/btrfs/disk-io.c | 7 + > fs/btrfs/extent-tree.c | 10 + > fs/btrfs/extent_io.c | 6 +- > fs/btrfs/extent_map.h | 4 + > fs/btrfs/file-item.c | 61 +++-- > fs/btrfs/inode.c | 228 ++++++++++++---- > fs/btrfs/ordered-data.c | 32 ++- > fs/btrfs/ordered-data.h | 8 + > fs/btrfs/super.c | 39 ++- > 13 files changed, 1163 insertions(+), 83 deletions(-) > create mode 100644 fs/btrfs/dedup.c > create mode 100644 fs/btrfs/dedup.h > > -- > 2.4.6 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html