On Wed, Jan 20, 2021 at 11:02:41AM -0500, Josef Bacik wrote:
> On 1/17/21 1:54 PM, Goffredo Baroncelli wrote:
> > 
> > Hi all,
> > 
> > This is an RFC; I wrote this patch because I find the idea interesting
> > even though it adds more complication to the chunk allocator.
> > 
> > The basic idea is to store the metadata chunks on the fastest disks.
> > The fastest disks are marked by the "preferred_metadata" flag.
> > 
> > When BTRFS allocates a new metadata/system chunk, it selects the
> > "preferred_metadata" disks; otherwise (for data chunks) it selects
> > the non-"preferred_metadata" disks. The initial patch allowed using
> > the other kind of disk in case a set is full.
> > 
> > This patch set is based on v5.11-rc2.
> > 
> > For now, the only user of this patch that I am aware of is Zygo.
> > However, he asked to further constrain the allocation, i.e. to avoid
> > allocating metadata on a non-"preferred_metadata" disk. So I extended
> > the patch, adding 4 operating modes.
> > 
> > This is enabled by passing the option "preferred_metadata=<mode>"
> > at mount time.
> > 
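
To make the quoted rule concrete, here is a rough sketch of the sort
key it describes (my names, not the patch's):

#include <stdbool.h>

/* 0 sorts first, 1 sorts last:  metadata/system chunks want the
 * flagged disks and data chunks want the unflagged ones; falling back
 * to the other group happens once the first group is full. */
static int chunk_sort_key(bool dev_is_preferred_metadata,
                          bool alloc_is_metadata)
{
        return (dev_is_preferred_metadata == alloc_is_metadata) ? 0 : 1;
}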
> 
> I'll echo Zygo's hatred for mount options.  The more complicated policy
> decisions belong in properties and sysfs knobs, not mount options.

Well, I hated it for two distinct reasons.  Only one of those was the
mount option.

The feature is really a per-_device_ policy:  whether we put data or
metadata or both (*) on the device, and in what order we fill devices
with each.

There's nothing filesystem-wide about the feature, other than that we
might want to ensure there are enough devices to do allocations with
our raid profiles (e.g. if there are 2 data-only disks and raid1 data,
don't allow one of those disks to become metadata-only).
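
Concretely, the per-device policy I have in mind looks something like
this sketch (illustrative names only, not code from the patch):

#include <stdbool.h>

/* Hypothetical per-device allocation policy. */
enum dev_alloc_policy {
        DEV_METADATA_ONLY,      /* never put data chunks here */
        DEV_METADATA_PREFERRED, /* fill with metadata first, data last */
        DEV_DATA_PREFERRED,     /* fill with data first, metadata last */
        DEV_DATA_ONLY,          /* never put metadata chunks here */
};

/*
 * The one filesystem-wide check mentioned above:  refuse to make a
 * device metadata-only if the remaining data-capable devices could no
 * longer satisfy the data raid profile (raid1 data needs two of them).
 */
static bool can_become_metadata_only(int data_capable_devs,
                                     int data_profile_min_devs)
{
        return data_capable_devs - 1 >= data_profile_min_devs;
}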

I had considered a couple of other ideas but dropped them:

        - keep a filesystem-wide "off" feature.  If that makes sense at
        all, it would be as a mount option, e.g. if you had painted your
        filesystem into a corner with data-only devices and didn't have
        enough metadata space to do a device property set to fix it.
        I dropped the idea because we don't have an existing feature
        for any other scenario where that happens (e.g. adding one disk
        to a full raid1 array) and it should be possible to recover
        immediately after the next mount using reserved metadata space.

        - for each device, specify a priority for metadata and data
        separately, with the lowest priority meaning "never use this
        device for that kind of chunk" (see the sketch after this
        list).  That is a more general form of the four device types
        (metadata only, metadata preferred, data preferred, data only),
        but most of the additional device orderings that arbitrary
        priorities permit will tend to make data and metadata fight
        with each other, to the point that the ordering isn't useful.
        On the other hand, maybe this idea does have a future as a
        kind of advanced "move the data where I tell you to" balancing
        tool for complicated RAID array reshapes.
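
Here is the sketch mentioned above of that priority scheme (again,
hypothetical names, not real btrfs code):

#include <stdbool.h>

#define PRIO_NEVER 0    /* lowest: never use this device for this type */

struct dev_prio {
        int metadata_prio;      /* higher value fills earlier */
        int data_prio;
};

static int prio_for(const struct dev_prio *p, bool alloc_is_metadata)
{
        return alloc_is_metadata ? p->metadata_prio : p->data_prio;
}

/* Comparator:  candidate devices sort highest-priority-first; devices
 * at PRIO_NEVER are filtered out before sorting. */
static int cmp_dev_prio(const struct dev_prio *a, const struct dev_prio *b,
                        bool alloc_is_metadata)
{
        return prio_for(b, alloc_is_metadata) - prio_for(a, alloc_is_metadata);
}

The four device types above are just special cases of this table; the
problem is all the other combinations the table allows.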

> And then for the properties themselves, presumably we'll want to add other
> FS-wide properties in the future.  I'm not against adding new actual keys
> and items to the tree itself, but is there a way we could use our existing
> property infrastructure that we use for compression, and simply store the
> xattrs in the tree root?  It looks like we're just toggling a policy
> decision, and we don't actually need the other properties in the item you've
> created, so why not just a btrfs.preferred_metadata property with the value
> stored in it, dropped into the tree_root so it can be read on mount?
> Thanks,
> 
> Josef
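
To make that concrete:  the existing property infrastructure is
xattr-based, so a filesystem-wide knob like the one you describe would
presumably be set from userspace the same way compression is today.
A sketch (the btrfs.preferred_metadata name is from your mail; the
value string is a placeholder, since the modes aren't settled):

#include <stdio.h>
#include <string.h>
#include <sys/xattr.h>

int main(void)
{
        const char *value = "enabled";  /* placeholder: modes TBD */

        /* Same mechanism as the existing btrfs.compression xattr;
         * btrfs.preferred_metadata itself doesn't exist yet. */
        if (setxattr("/mnt", "btrfs.preferred_metadata",
                     value, strlen(value), 0) != 0) {
                perror("setxattr");
                return 1;
        }
        return 0;
}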

(*) I suppose logically there could be a "neither" option, but I'm not
sure what it would be for.
