On Oct 28, 2012, at 1:06 PM, Michael Kjörling <mich...@kjorling.se> wrote:
> 
> This makes a valid point. And there's another issue potentially at
> play: file system level compression, _which btrfs already supports_.
> Say you store a highly compressible file (a web server access log,
> perhaps) on a compressed volume that is set up with RAID 1 across a
> set of physical drives. You know the pool size, and you know the data
> allocation ratio. What you _don't_ know and can't predict is the
> _compression_ ratio for any new data, which means that you still can't
> calculate how much more payload data can be added to the file before
> you run out of storage space. You _can_ calculate a worst case
> scenario figure (no compression possible on the new payload data), but
> that's about it.

Btrfs definitely makes absolute prediction of remaining free space difficult. 
I'm fine with a verbose feature proposing various "free space scenarios" based 
on compression and redundancy, but I think a summary version can do better. 
Better being defined as: relatively easy to understand, not a lie but not the 
whole truth, and subject to change if your usage deviates from what the system 
is able to estimate from past behavior.
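
For example, a "scenarios" display could be sketched roughly like this; the 
raid1 copy count and the compression ratios below are assumptions for 
illustration, not anything btrfs reports today:

# Rough sketch: given raw unallocated bytes and a data profile, show a
# worst case (new data doesn't compress at all) plus a couple of assumed
# compression ratios. All ratios here are illustrative assumptions.

def free_space_scenarios(raw_unallocated_bytes, profile_copies=2):
    # raid1 keeps 2 copies, so worst-case payload capacity is raw / copies
    worst_case = raw_unallocated_bytes / profile_copies
    for label, ratio in [("no compression (worst case)", 1.0),
                         ("2:1 compression (assumed)  ", 2.0),
                         ("4:1 compression (assumed)  ", 4.0)]:
        gib = worst_case * ratio / 2**30
        print("%s  ~%.0f GiB of new payload" % (label, gib))

free_space_scenarios(500 * 2**30)   # e.g. 500 GiB raw unallocated, raid1 data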

As the file system is actually used, there is pretty good statistical 
information about how it's being used, and that information should get better 
the fuller it gets, which in turn gives an estimate of time to fullness. And I 
think it's fine for this to vary somewhat based on use. You could have 
something like this (a sketch of how the two numbers might be derived follows 
the examples):

Storage remaining
current usage: 3 hours
average usage: 3 weeks

So that's a file system with a lot being written to it currently, way more than 
average.

Storage remaining
current usage: 35 days
average usage: 40 days
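
Something like this back-of-the-envelope sketch is what I have in mind; the 
sampling of (timestamp, bytes allocated) pairs and the names are made up for 
illustration, not an existing btrfs interface:

# Sketch: estimate "time to full" two ways, from the recent write rate and
# from the long-term average rate, given hypothetical samples of
# (timestamp_seconds, bytes_allocated) collected over the file system's life.

def time_to_full(samples, total_bytes, recent_window=3600):
    t_now, used_now = samples[-1]
    t_first, used_first = samples[0]
    # long-term average allocation rate, bytes per second
    avg_rate = (used_now - used_first) / max(t_now - t_first, 1)
    # "current" rate from only the most recent window of samples
    recent = [s for s in samples if s[0] >= t_now - recent_window]
    dt = max(recent[-1][0] - recent[0][0], 1)
    cur_rate = (recent[-1][1] - recent[0][1]) / dt
    remaining = total_bytes - used_now

    def fmt(rate):
        return "n/a" if rate <= 0 else "%.1f hours" % (remaining / rate / 3600)

    print("Storage remaining")
    print("current usage: %s" % fmt(cur_rate))
    print("average usage: %s" % fmt(avg_rate))

The point is that both figures come from the same observed history, just over 
different windows, so the summary stays honest without exposing allocation 
ratios to the user.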

You could even include a switch that lets enterprise users assume regularly 
added storage. Computing an average data allocation rate allows a prediction of 
when the file system will be full, assuming that allocation trend continues. So 
an optional switch could include the assumption that storage will be grown at a 
certain rate, e.g. 3TB per month. Based on both the allocation and growth 
assumptions, storage remaining expressed as time becomes even more useful than 
a seemingly "raw" percentage that fluctuates.
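
Folding an assumed growth rate into the same estimate might look roughly like 
this; the 3TB/month figure and the helper name are just the example above, 
nothing more:

# Sketch: days until full if the pool is also grown at an assumed rate,
# e.g. 3 TB per month, as the optional switch above would allow.

def days_to_full_with_growth(remaining_bytes, alloc_rate_bps,
                             growth_bytes_per_month):
    growth_bps = growth_bytes_per_month / (30 * 24 * 3600)
    net_rate = alloc_rate_bps - growth_bps      # net fill rate, bytes per second
    if net_rate <= 0:
        return None                             # growth keeps pace; never fills
    return remaining_bytes / net_rate / 86400   # days until full

# e.g. 10 TB remaining, filling at 5 MB/s, pool grown 3 TB per month
days = days_to_full_with_growth(10e12, 5e6, 3e12)
print("full in ~%.0f days" % days if days else "growth keeps pace with allocation")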

I think humans are a lot more forgiving of time estimates being pliable than of 
percentages anyway, although I don't know why.

Anyway, eventually the file system could stop just short of issuing its own 
purchase orders for the drives it needs to maintain a reasonable 
current/average usage value.

Ergo, when the file system is young and mostly empty, the estimate may be 
really off. But once this statistic actually starts becoming important, on the 
other side of 50% full, I think the trend is fairly well established. So it's 
kind of a no-consequence estimate early on.


> _However_, while thinking in terms of storage pool sizes, allocation
> ratios and so forth might work for technically inclined people, I'm
> not so sure it'll work for ordinary users. But then again:

Agreed. I wouldn't expose that to them, at least not in the summary. But 
underneath, the ratios are still in play, and they shape the summary 
information users are given, rather than purely raw numbers, which I think are 
useless for ordinary users.


> * How many ordinary users are going to use multi-device file systems
> in the first place?

It's a very interesting question. What's the role of mobile in all of this, 
too, if Gartner is even remotely in the ballpark with its latest estimate of 
(just) Android devices equalling, then surpassing, Windows devices in 2015?

However, sticking to desktop for now: with SSDs, users can, form factor wise, 
much more easily have a raid1 setup; or, much more likely, use the -d single 
profile to add on more storage while nothing else about the file system 
changes. They just have more space. They don't have to move things around, 
establish new mount points, etc. Or even screw with LVM. It hardly gets easier 
than 'device add X Y', done.
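
For example (device and mount point names are hypothetical):

  # btrfs device add /dev/sdc /mnt/data
  # btrfs filesystem show /mnt/data

With the single data profile the new space is simply available for new 
allocations; a balance is only needed if you want existing data redistributed 
across the devices.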

> 
> * "btrfs" _is_ the file system administration tool. In a way, it makes
> sense that the data provided by it will be geared more toward
> technically minded people.

It is a fair point. But this is not LVM or md raid. It's already way easier to 
understand and manage, even with some trouble spots. But in the area of storage 
pool/volume management it's potentially more complex, due to subvolumes and 
actually usable snapshots.

So, I must reject part of the premise that sysadmins and storage experts aren't 
also ordinary people. They too can benefit from useful "at a glance" 
information that, again, doesn't mislead them.

Chris Murphy
