On 2018-08-10 14:21, Tomasz Pala wrote:
On Fri, Aug 10, 2018 at 07:39:30 -0400, Austin S. Hemmelgarn wrote:

I.e.: every shared segment should be accounted within quota (at least once).
I think what you mean to say here is that every shared extent should be
accounted to quotas for every location it is reflinked from.  IOW, that
if an extent is shared between two subvolumes, each with its own quota,
it should be accounted against both of their quotas.

Yes.
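
To make the rule concrete, here's a minimal sketch in Python of the
intended semantics (a toy model only, not how btrfs qgroups actually
work): every shared extent is charged in full to every subvolume that
references it, just like du would count a reflinked file in each tree
it appears in.

    from collections import defaultdict

    # extent id -> size in bytes
    extents = {"e1": 4096, "e2": 8192}

    # subvolume -> extents it references; e1 is reflinked into both
    refs = {
        "subvol_a": {"e1", "e2"},
        "subvol_b": {"e1"},
    }

    def quota_usage(refs, extents):
        usage = defaultdict(int)
        for subvol, extent_ids in refs.items():
            for eid in extent_ids:
                # charge the full extent size to every referencing subvolume
                usage[subvol] += extents[eid]
        return dict(usage)

    print(quota_usage(refs, extents))
    # {'subvol_a': 12288, 'subvol_b': 4096} - e1 counted against both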

Moreover - if there were per-subvolume RAID levels someday, the data
should be accounted relative to the "default" (filesystem) RAID level,
i.e. having a RAID0 subvolume on a RAID1 fs should account half of the
data, and twice the data in the opposite scenario (like the "dup"
profile on a single-drive filesystem).
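
For illustration, that accounting could be modeled like this (the cost
factors are the raw-space multipliers of the common profiles; the
function itself is purely hypothetical, since per-subvolume profiles
don't exist):

    # raw-space multipliers of the common data profiles
    COST = {"single": 1.0, "raid0": 1.0, "dup": 2.0, "raid1": 2.0}

    def accounted_bytes(logical_bytes, subvol_profile, fs_default_profile):
        # charge in proportion to raw space consumed, relative to what the
        # same data would consume at the filesystem's default profile
        return logical_bytes * COST[subvol_profile] / COST[fs_default_profile]

    print(accounted_bytes(1 << 30, "raid0", "raid1"))  # half: 536870912.0
    print(accounted_bytes(1 << 30, "dup", "single"))   # double: 2147483648.0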

This is irrelevant to your point here.  In fact, it goes against it,
you're arguing for quotas to report data like `du`, but all of
chunk-profile stuff is invisible to `du` (and everything else in
userspace that doesn't look through BTRFS ioctls).

My point is the user's point of view, not some system tool like du.
Consider this:
1. the user wants higher (than default) protection for some data,
2. the user wants more storage space with less protection.

Ad. 1 - requesting better redundancy is similar to cp --reflink=never -
there are functional differences, but the cost is similar: trading space
for security,

Ad. 2 - many would like to have .cache, .ccache, tmp or some build
system directory with faster writes and no redundancy at all. This
requires per-file/directory data profile attrs though.

Since we agreed that transparent data compression is the user's storage
bonus, gains from reduced redundancy should also profit the user.
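
If both effects are treated consistently, the sketch above simply
composes with the compression ratio (again, purely illustrative, not
actual qgroup behavior):

    def accounted_bytes(logical, compression_ratio, profile_cost, default_cost):
        # compression_ratio is stored/logical size, e.g. 0.5 for 2:1 zstd
        return logical * compression_ratio * (profile_cost / default_cost)

    # 1 GiB compressed 2:1, stored "single" on a RAID1 filesystem, is
    # accounted as 0.25 GiB - both savings profit the user
    print(accounted_bytes(1 << 30, 0.5, 1.0, 2.0))  # 268435456.0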
Do you actually know of any services that do this, though? I mean, Amazon S3 and similar services have the option of reduced redundancy (and other alternate storage tiers), but they charge per unit of data per unit of time with no hard limit on how much space you use, and they charge different rates for different storage tiers. In comparison, what you appear to be talking about is something more like Dropbox or Google Drive, where you pay up front for a fixed amount of storage for a fixed amount of time and can't use more than that, and all the services I know of like that offer exactly one option for storage redundancy.

That aside, you seem to be overthinking this. No sane provider is going to give their users the ability to create subvolumes themselves (there's too much opportunity for a tiny bug in your software to cost you a _lot_ of lost revenue, because creating subvolumes can let you escape qgroups). That means in turn that what you're trying to argue for is no different from the provider just selling units of storage at different redundancy levels separately, and charging different rates for each of them. In fact, that approach is better, because it works independently of the underlying storage technology (it will work with hardware RAID, LVM2, MD, ZFS, and even distributed storage platforms like Ceph and Gluster), _and_ it lets them charge differently than the trivial case of N copies costing N times as much as one copy (which is not quite accurate in terms of actual management costs).
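
A toy version of that per-tier pricing (all rates invented) shows why
it is more expressive than deriving cost from quota accounting:

    # invented per-tier rates; say "reduced" keeps one copy, "standard" two
    PRICE_PER_GIB_MONTH = {"reduced": 0.010, "standard": 0.018, "high": 0.025}

    def monthly_cost(gib_by_tier):
        return sum(PRICE_PER_GIB_MONTH[tier] * gib
                   for tier, gib in gib_by_tier.items())

    # two copies priced below twice one copy (0.018 < 2 * 0.010) - a
    # relationship plain N-copies-costs-N-times quota accounting can't express
    print(monthly_cost({"standard": 100, "reduced": 400}))  # ~5.8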

Now, if BTRFS were to have the ability to set profiles per file, then this might be useful, albeit with an option to tune how it gets accounted.

Disclaimer: all the above statements relate to the concept and
understanding of quotas, not to be confused with qgroups.

