Quoting Nikolay Borisov <n.borisov.l...@gmail.com>:



You've walked right into the pitfall of the btrfs backref walk.

For btrfs, a backref walk is definitely not easy work.
btrfs uses hidden backrefs. That means that in most cases, for one extent
shared by 1000 snapshots, the extent tree (which shows the backrefs) may
well hold only one ref, for the initial subvolume.

For btrfs, you need to walk up the tree to find how it's shared.

It has to be done like that; that's why we call it backref-*walk*.

E.g.:
          A (subvol 257)     B (Subvol 258, snapshot of 257)
          |    \        /    |
          |        X         |
          |    /        \    |
          C                  D
         / \                / \
        E   F              G   H

In the extent tree, E is only referred to by subvol 257,
while C has two referencers, 257 and 258.

So in reality, you need to:
1) Do a tree search from subvol 257
   You get a path, E -> C -> A
2) Check each node to see if it's shared.
   E is only referred to by C, no extra referencer.
   C is referred to by two new tree blocks, A and B.
   A is referred to by subvol 257.
   B is referred to by subvol 258.
   So E is shared by 257 and 258.
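
The two steps above can be sketched in a few lines of Python. This is only
an illustration on a hypothetical in-memory model of the example diagram; the
names PARENTS, ROOT_OWNER and owning_subvols are made up here and are not
btrfs interfaces. It assumes we already know, for each block, which blocks
reference it directly.

```python
from collections import deque

# Hypothetical model of the example diagram (not a btrfs data structure):
# each tree block maps to the blocks that reference it directly.
PARENTS = {
    "E": ["C"], "F": ["C"], "G": ["D"], "H": ["D"],
    "C": ["A", "B"], "D": ["A", "B"],
    "A": [], "B": [],
}

# Tree roots and the subvolume that owns each of them.
ROOT_OWNER = {"A": 257, "B": 258}


def owning_subvols(block):
    """Walk upward from `block` and collect every subvolume that reaches it."""
    owners = set()
    seen = {block}
    queue = deque([block])
    while queue:
        node = queue.popleft()
        if node in ROOT_OWNER:
            owners.add(ROOT_OWNER[node])
        # Follow every referencer; a shared block has more than one.
        for parent in PARENTS[node]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return owners


print(sorted(owning_subvols("E")))
```

The walk has to visit every ancestor of the block, which is why the cost
grows with the number of snapshots sharing it.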

Now you see how things go mad: for each extent you must walk this way
to determine its real owners, not to mention that we can have up to
8 levels, and tree blocks at levels 0~7 can all be shared.

If it's shared by 1000 subvolumes, hope you had a good day then.

Ok, let's do just this issue for the time being. One issue at a time. It
will be easier.

The solution is to temporarily create a copy of the entire backref-tree
in memory. To create this copy, you just do a preorder depth-first
traversal following only forward references.

So this preorder depth-first traversal would visit the nodes in the
following order:
A,C,E,F,D,G,H,B
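
For reference, a rough Python sketch of that traversal on the example
diagram. The CHILDREN map and the preorder function are hypothetical names
used only for illustration; the sketch assumes the forward references are
available as a plain dictionary and that the roots are visited in the order
A, then B. Each block is visited once, even when it is reachable from both
roots.

```python
# Hypothetical forward references for the example diagram:
# each tree block maps to the blocks it points to.
CHILDREN = {
    "A": ["C", "D"], "B": ["C", "D"],
    "C": ["E", "F"], "D": ["G", "H"],
    "E": [], "F": [], "G": [], "H": [],
}


def preorder(roots, children):
    """Preorder depth-first traversal that visits each block only once."""
    order = []
    seen = set()

    def visit(node):
        if node in seen:
            return  # already reached through another parent
        seen.add(node)
        order.append(node)
        for child in children[node]:
            visit(child)

    for root in roots:
        visit(root)
    return order


print(",".join(preorder(["A", "B"], CHILDREN)))
```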

Oh, it is not a tree, it is a DAG in that example of yours. OK, preorder
is possible on a DAG, too. But how did you get a DAG? Shouldn't it be all
trees?

When you have the entire backref-tree (backref-DAG?) in memory, doing a
backref-walk is a piece of cake.
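
To make the idea concrete, here is a minimal Python sketch of building the
in-memory back references by inverting the forward references. The names
FORWARD and build_backrefs are hypothetical and the dictionary is just the
example diagram; this is a sketch of the approach proposed here, not of how
btrfs stores or resolves backrefs.

```python
# Hypothetical forward references for the example diagram.
FORWARD = {
    "A": ["C", "D"], "B": ["C", "D"],
    "C": ["E", "F"], "D": ["G", "H"],
    "E": [], "F": [], "G": [], "H": [],
}


def build_backrefs(forward):
    """Invert forward references: map each block to its direct referencers."""
    backrefs = {}
    for node, kids in forward.items():
        backrefs.setdefault(node, [])  # roots end up with an empty list
        for kid in kids:
            backrefs.setdefault(kid, []).append(node)
    return backrefs


backrefs = build_backrefs(FORWARD)
print(backrefs["C"], backrefs["E"])
```

With such a map in memory, finding who references a block is a dictionary
lookup instead of a search through the on-disk trees.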

Of course, this in-memory backref tree has to be kept in sync with the
filesystem; that is, it has to be updated whenever there is a write to
disk. That's not so hard.

Great, now that you have devised a solution and have plenty of
experience writing code, why not try and contribute to btrfs?

First, that is exactly what I'm doing. I'm contributing to the discussion on the most needed features of btrfs. I'm helping you to get on the right track and waste less time on unimportant stuff.

You might appreciate my help, or not, but I am trying to help.

What you probably wanted to say is that you would like me to contribute by writing code, pro bono. Unfortunately, I work for money, as does 99% of the population. Why not contribute for free? For the same reason the rest of the population doesn't work for free. And I'm not going from door to door bugging everyone with "why don't you work for free", "why don't you help this noble cause..." blah. Makes no sense to me.

