On 01/10/2018 10:37 PM, waxhead wrote:
> As just a regular user I would think that the first thing you would need
> is an analysis that can tell you if it is a good idea to balance or not
> in the first place.

Tooling to create that is available. Btrfs allows you to read a lot of
different data to analyze, and then you can experiment with your own
algorithms to find out which blockgroup you're going to feed to balance
next.

There are two language options:
* C -> in this case you're extending btrfs-progs to build new tools
* Python -> python-btrfs has everything in it to quickly throw together
things like this, and examples are available with the source.

For example:
- balance_least_used.py -> balances starting from the least used chunk
and works up towards a maximum of X% used.
- show_free_space_fragmentation.py -> find out which chunks have badly
fragmented free space. Remember: if you have a 1GiB chunk with usage
50%, that doesn't tell you if it has only a handful of extents, filling
up the first 500MiB, with the rest empty, or if it's thousands of
alternating pieces of 4KiB used space and 4KiB free space. ;-] [0]
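
To make that last point concrete, here's a small stand-alone sketch. It
does not touch a real filesystem: the extent lists are made up, and
free_space_stats is a hypothetical helper, not something from
python-btrfs. I use 512MiB instead of 500MiB so both chunks land on
exactly 50%. On a real filesystem you would read the extents with
python-btrfs instead.

```python
KiB = 1024
MiB = 1024 * KiB
GiB = 1024 * MiB


def free_space_stats(chunk_length, used_extents):
    """Summarise the free space left between used extents in one chunk.

    used_extents is a list of (offset, length) pairs in bytes, relative to
    the start of the chunk, and is assumed not to overlap.
    """
    free_pieces = []
    pos = 0
    for offset, length in sorted(used_extents):
        if offset > pos:
            free_pieces.append(offset - pos)  # gap before this extent
        pos = offset + length
    if pos < chunk_length:
        free_pieces.append(chunk_length - pos)  # free tail of the chunk
    free_total = sum(free_pieces)
    return {
        "used_pct": 100 * (chunk_length - free_total) / chunk_length,
        "free_pieces": len(free_pieces),
        "largest_free": max(free_pieces, default=0),
    }


# Chunk A: one extent filling the first half, the rest empty.
contiguous = free_space_stats(GiB, [(0, 512 * MiB)])

# Chunk B: alternating 4KiB used and 4KiB free over the whole chunk.
alternating = free_space_stats(
    GiB, [(i * 8 * KiB, 4 * KiB) for i in range(GiB // (8 * KiB))]
)

for name, stats in (("contiguous", contiguous), ("alternating", alternating)):
    print(name, stats)
```

Both chunks report the same usage percentage, but the number of free
pieces and the largest free piece are wildly different.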

In the same way you can program something new, like a balance algorithm
that cleans up blocks with high free space fragmentation first.
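
A rough sketch of what the selection part of such an algorithm could
look like. Everything here is an assumption for illustration: the
BlockGroup record, the fragmentation score (one minus the share of free
space that sits in the largest free piece) and the min_score threshold
are made up, and the sample data is invented. Feeding the resulting
addresses to balance is left out.

```python
from dataclasses import dataclass

MiB = 1024 * 1024
GiB = 1024 * MiB


@dataclass
class BlockGroup:
    # Hypothetical record, to be filled from whatever you read off the fs.
    vaddr: int         # virtual address of the block group
    free_pieces: list  # size in bytes of every free space extent


def fragmentation(bg):
    """0.0 = all free space in one piece, close to 1.0 = many small pieces."""
    total_free = sum(bg.free_pieces)
    if total_free == 0:
        return 0.0
    return 1.0 - max(bg.free_pieces) / total_free


def balance_order(block_groups, min_score=0.5):
    """Return vaddrs of block groups to rewrite, worst fragmentation first."""
    scored = sorted(
        ((fragmentation(bg), bg.vaddr) for bg in block_groups), reverse=True
    )
    return [vaddr for score, vaddr in scored if score >= min_score]


block_groups = [
    BlockGroup(vaddr=1 * GiB, free_pieces=[512 * MiB]),
    BlockGroup(vaddr=2 * GiB, free_pieces=[4096] * 1000),
    BlockGroup(vaddr=3 * GiB, free_pieces=[256 * MiB, 128 * MiB, 128 * MiB]),
]

order = balance_order(block_groups)
print([vaddr // GiB for vaddr in order])
```

The block group at 1GiB has all its free space in one piece, so it is
skipped; the one with a thousand 4KiB holes goes first.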

Or, another thing you could do is first count the number of extents in a
block group and add that to the algorithm. Balancing a block group with
only a few extents is much faster than balancing one with thousands of
extents and a lot of reflinks, like highly deduped data.
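
As a sketch of how the extent count could be folded in: treat it as a
crude cost estimate and do the cheap block groups first. The Candidate
record, the max_extents cutoff and the numbers below are all
hypothetical, and extent count is only a proxy, since it says nothing
about how many reflinks point at each extent.

```python
from dataclasses import dataclass

MiB = 1024 * 1024


@dataclass
class Candidate:
    # Hypothetical record; the extent count would be gathered beforehand.
    vaddr: int
    used_bytes: int
    extent_count: int


def cheapest_first(candidates, max_extents=None):
    """Order candidates by extent count, optionally skipping expensive ones."""
    if max_extents is not None:
        candidates = [c for c in candidates if c.extent_count <= max_extents]
    # Fewer extents first; on a tie, less data to move first.
    ordered = sorted(candidates, key=lambda c: (c.extent_count, c.used_bytes))
    return [c.vaddr for c in ordered]


candidates = [
    Candidate(vaddr=10, used_bytes=500 * MiB, extent_count=40000),
    Candidate(vaddr=20, used_bytes=100 * MiB, extent_count=12),
    Candidate(vaddr=30, used_bytes=300 * MiB, extent_count=900),
]

print(cheapest_first(candidates))
print(cheapest_first(candidates, max_extents=10000))
```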

Or... look at generation of metadata to find out which parts of data on
your disk have been touched recently, and which weren't... Too many fun
things to play around with. \:D/

As always, first thing to do is make sure you're on kernel 4.14 or newer,
or else mount with nossd; otherwise you might keep shoveling data around
forever.

And if your filesystem has been treated badly by <4.14 kernel in ssd
mode for a long time, then first get that cleaned up:

https://www.spinics.net/lists/linux-btrfs/msg70622.html

> Scrub seems like a great place to start - e.g. scrub could auto-analyze
> and report back need to balance. I also think that scrub should
> optionally autobalance if needed.
> 
> Balance may not be needed, but if one can determine that balancing would
> speed up things a bit I don't see why this as an option can't be
> scheduled automatically. Ideally there should be a "scrub and polish"
> option that would scrub, balance and perhaps even defragment in one go.
> 
> In fact, the way I see it btrfs should ideally by itself keep track of
> each data/metadata chunk and it should know when this chunk was last
> affected by a scrub, balance, defrag etc and perform the required
> operations by itself based on a configuration or similar. Some may
> disagree for good reasons , but for me this is my wishlist for a
> filesystem :) e.g. a pool that just works and only annoys you with the
> need of replacing a bad disk every now and then :)

I don't think these kinds of things will ever end up in kernel code.

[0] There's a version in the devel branch in git that also works without
the free space tree, taking a slower detour via the extent tree.

-- 
Hans van Kranenburg