On Tue, May 14, 2013 at 11:36:52AM +0200, Stefan Behrens wrote:
> Mapping UUIDs to subvolume IDs is an operation with a high effort
> today. Today, the algorithm even has quadratic effort (based on the
> number of existing subvolumes), which means, that it takes minutes
> to send/receive a single subvolume if 10,000 subvolumes exist. But
> even linear effort would be too much since it is a waste. And these
> data structures to allow mapping UUIDs to subvolume IDs are created
> every time a btrfs send/receive instance is started.
> 
> So the issue to address is that Btrfs send / receive does not work
> as it is today when a high number of subvolumes exist.
> 
> It is much more efficient to maintain a searchable persistent data
> structure in the filesystem, one that is updated whenever a
> subvolume/snapshot is created and deleted, and when the received
> subvolume UUID is set by the btrfs-receive tool.
> 
> Therefore kernel code is added that is able to maintain data
> structures in the filesystem that allow to quickly search for a
> given UUID and to retrieve the subvol ID.
> 
> Now follows the lengthy justification, why a new tree was added
> instead of using the existing root tree:
> 
> The first approach was to not create another tree that holds UUID
> items. Instead, the items should just go into the top root tree.
> Unfortunately this confused the algorithm to assign the objectid
> of subvolumes and snapshots. The reason is that
> btrfs_find_free_objectid() calls btrfs_find_highest_objectid() for
> the first created subvol or snapshot after mounting a filesystem,
> and this function simply searches for the largest used objectid in
> the root tree keys to pick the next objectid to assign. Of course,
> the UUID keys have always been the ones with the highest offset
> value, and the next assigned subvol ID was wastefully huge.
> 
> To use any other existing tree did not look proper. To apply a
> workaround such as setting the objectid to zero in the UUID item
> key and to implement collision handling would either add
> limitations (in case of a btrfs_extend_item() approach to handle
> the collisions) or a lot of complexity and source code (in case a
> key would be looked up that is free of collisions). Adding new code
> that introduces limitations is not good, and adding code that is
> complex and lengthy for no good reason is also not good. That's the
> justification why a completely new tree was introduced.

I'd appreciate if some performance number appear here since it's a speedup.

thanks,
liubo


> 
> v1 -> v2:
> - All review comments from David Sterba, Josef Bacik and Jan Schmidt
>   are addressed.
>   The hugest change was to add a mechanism that handles the case that
>   the filesystem is mounted with an older kernel. Now that case is
>   detected when the filesystem is mounted with a newer kernel again,
>   and the UUID tree is updated in the background.
> 
> Stefan Behrens (8):
>   Btrfs: introduce a tree for items that map UUIDs to something
>   Btrfs: support printing UUID tree elements
>   Btrfs: create UUID tree if required
>   Btrfs: maintain subvolume items in the UUID tree
>   Btrfs: fill UUID tree initially
>   Btrfs: introduce uuid-tree-gen field
>   Btrfs: check UUID tree during mount if required
>   Btrfs: add mount option to force UUID tree checking
> 
>  fs/btrfs/Makefile      |   3 +-
>  fs/btrfs/ctree.h       |  65 ++++-
>  fs/btrfs/disk-io.c     |  62 ++++-
>  fs/btrfs/ioctl.c       |  74 +++++-
>  fs/btrfs/print-tree.c  |  82 +++++++
>  fs/btrfs/super.c       |   8 +-
>  fs/btrfs/transaction.c |  21 +-
>  fs/btrfs/uuid-tree.c   | 637 
> +++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/btrfs/volumes.c     | 258 ++++++++++++++++++++
>  fs/btrfs/volumes.h     |   2 +
>  10 files changed, 1197 insertions(+), 15 deletions(-)
>  create mode 100644 fs/btrfs/uuid-tree.c
> 
> -- 
> 1.8.2.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to