Quoting Qu Wenruo <quwenruo.bt...@gmx.com>:
So here what we could do is: (From easy to hard)
- Introduce an interface to allow defrag not to touch shared extents
it shouldn't be that difficult compared to other work we are going
to do.
At least the user has a choice.
That defrag wouldn't accomplish much. You can call it defrag, but it is
more like nothing happens.
If one subvolume is not shared by snapshots or reflinks at all, I'd say
that's exactly what the user wants.
If one subvolume is not shared by snapshots, the super-duper defrag
would produce the same result concerning that subvolume.
Therefore, it is a waste of time to consider this case separately and
to go writing the code to cover just this case.
- Introduce different levels for defrag
Allow btrfs to do some calculation and space usage policy to
determine if it's a good idea to defrag some shared extents.
E.g. in my extreme case, unsharing the extent would make it possible to
defrag the other subvolume to free a huge amount of space.
A compromise: let the user choose whether they want to sacrifice some space.
Meh. You can always defrag one chosen subvolume perfectly, without
unsharing any file extents.
If the subvolume is shared by another snapshot, you always need to face
the decision whether to unshare.
It's unavoidable.
In my opinion, unsharing is a very bad thing to do. If the user orders
it, then OK, but I think that it is rarely required.
Unsharing can be done manually by just copying the data to another
place (partition). So, if someone really wants to unshare, he can
always easily do it.
When you unshare, it is hard to go back. Unsharing is a one-way road.
When you unshare, you lose free space. Therefore, the defrag should
not unshare.
In my view, the only real decision that needs to be left to the user
is: what to defrag?
In terms of full or partial defrag:
* Everything
- rarely; waste of time and resources, and it wears out SSDs
- perhaps this shouldn't be allowed at all
* 2% of the most fragmented files (2% of total space used, by size in bytes)
- good idea for daily or weekly defrag
- good default
* Let the user choose between 0.01% and 10% (by size in bytes)
- the best
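
To make the "2% by size in bytes" idea concrete, here is a minimal
sketch of how such a selection could work. Everything in it is an
assumption for illustration: the function name, the input tuples, using
the fragment count as the measure of "most fragmented", and the greedy
skip of files that do not fit the budget. It is not how btrfs selects
anything today.

```python
def pick_defrag_candidates(files, fraction=0.02):
    """Pick the most fragmented files within a byte budget.

    files: list of (path, size_bytes, fragment_count) tuples (hypothetical input).
    fraction: share of total used bytes to defragment, 0.01% to 10%.
    """
    if not 0.0001 <= fraction <= 0.10:
        raise ValueError("fraction must be between 0.0001 (0.01%) and 0.10 (10%)")

    total_bytes = sum(size for _, size, _ in files)
    budget = total_bytes * fraction

    # Assumption: "most fragmented" means highest fragment count.
    ranked = sorted(files, key=lambda f: f[2], reverse=True)

    chosen, used = [], 0
    for path, size, _ in ranked:
        if used + size <= budget:  # skip files that would exceed the budget
            chosen.append(path)
            used += size
    return chosen


example = [
    ("a", 1000, 50),
    ("b", 500, 200),
    ("c", 100000, 3),
    ("d", 1500, 120),
]
picked = pick_defrag_candidates(example, fraction=0.02)
```

With the default of 2%, the large but barely fragmented file is never
touched, which is the point of limiting the work by bytes.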
Options by scope:
- One file (when necessary)
- One subvolume (when necessary)
- A list of subvolumes (with priority from first to last; the first
one on the list would be defragmented best)
- All subvolumes
- All subvolumes, with one exclusion list, and one priority list
- option to include or exclude RO subvolumes - as you said, this is
probably the hardest and implementation should be postponed
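
The scope options above could be resolved into a processing order along
these lines. This is only a sketch: the function, its parameters, and
the dict of subvolume names to read-only flags are all made up for
illustration, and unlisted subvolumes are assumed to come after the
priority list in name order.

```python
def order_subvolumes(subvols, priority=(), exclude=(), include_ro=False):
    """Return subvolume names in the order they would be defragmented.

    subvols: dict mapping subvolume name -> True if read-only (hypothetical).
    priority: names to handle first, in the given order.
    exclude: names to skip entirely.
    include_ro: whether read-only subvolumes are eligible at all.
    """
    excluded = set(exclude)
    eligible = [
        name for name, read_only in subvols.items()
        if name not in excluded and (include_ro or not read_only)
    ]
    rank = {name: i for i, name in enumerate(priority)}
    # Prioritised names first (by list position), then the rest by name.
    return sorted(eligible, key=lambda n: (rank.get(n, len(rank)), n))


subvols = {"home": False, "root": False, "snap-1": True, "tmp": False}
rw_order = order_subvolumes(subvols, priority=["root", "home"], exclude=["tmp"])
all_order = order_subvolumes(subvols, priority=["root", "home"],
                             exclude=["tmp"], include_ro=True)
```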
Therefore, making a super-duper defrag which can defrag one file
(without unsharing!!!) is a good starting point, instead of wasting
time on your proposal "Introduce different levels for defrag".
So, since it can be done perfectly without unsharing, why unshare at all?
No, you can't.
Go check my initial "red-herring" case.
I might check it, but I think that you can't be right. You are
thinking too low-level. If you can split extents and fuse extents and
create new extents that are shared by multiple files, then what you
are saying is simply not possible. The operations I listed are
sufficient to produce a perfect full defrag. Always.
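
A toy model of the narrower claim, that one chosen file can be made
contiguous without unsharing, might look like the sketch below. It
assumes an extent is just a (start, length) record that files refer to
by id, that relocating an extent updates it in place for every file
referencing it, and that the chosen file references each extent at most
once. None of these names correspond to real btrfs structures. It says
nothing about whether all files sharing those extents can be made
contiguous at the same time, which is the disputed point.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    start: int   # physical start block
    length: int  # length in blocks


def fragments(extent_ids, extents):
    """Count physically contiguous runs, walking the file in logical order."""
    count, prev_end = 0, None
    for eid in extent_ids:
        e = extents[eid]
        if e.start != prev_end:
            count += 1
        prev_end = e.start + e.length
    return count


def refcounts(files):
    """Number of references to each extent across all files."""
    counts = {}
    for extent_ids in files.values():
        for eid in extent_ids:
            counts[eid] = counts.get(eid, 0) + 1
    return counts


def defrag_file(name, files, extents, free_start):
    """Relocate one file's extents to consecutive blocks from free_start.

    Extents are moved, not copied, so every other file that references
    them keeps sharing them.
    """
    pos = free_start
    for eid in files[name]:
        extents[eid].start = pos
        pos += extents[eid].length
    return pos  # first block after the relocated data


extents = {
    "a": Extent(start=0, length=4),
    "b": Extent(start=40, length=2),
    "c": Extent(start=10, length=3),
    "d": Extent(start=20, length=1),
}
files = {
    "live": ["a", "b", "c"],  # shares "a" and "c" with the snapshot
    "snap": ["a", "d", "c"],
}

refs_before = refcounts(files)
frag_before = fragments(files["live"], extents)
end = defrag_file("live", files, extents, free_start=100)
frag_after = fragments(files["live"], extents)
refs_after = refcounts(files)
```

In this model "live" ends up in a single run and the reference counts
do not change, while "snap" stays fragmented around its private extent
"d".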