On 2017-03-30 11:55, Peter Grandi wrote:
My guess is that very complex risky slow operations like that are
provided by "clever" filesystem developers for "marketing" purposes,
to win box-ticking competitions. That applies to those system
developers who do know better; I suspect that even some filesystem
developers are "optimistic" as to what they can actually achieve.

There are cases where there really is no other sane option. Not
everyone has the kind of budget needed for proper HA setups,

Thanks for letting me know, that must have never occurred to me, just as
it must have never occurred to me that some people expect extremely
advanced features that imply big-budget high-IOPS high-reliability
storage to be fast and reliable on small-budget storage too :-)
You're missing my point (or intentionally ignoring it). Those types of operations are implemented because there are use cases that actually need them, not because some developer thought it would be cool. The one possible counter-example to this is XFS, which doesn't support shrinking the filesystem at all, but that was a conscious decision because their target use case (very large scale data storage) doesn't need that feature, and not implementing it allows them to make certain other parts of the filesystem faster.
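For what it's worth, here's a minimal sketch of what an online shrink looks like on btrfs, which is exactly the operation ext4 can only do unmounted and XFS can't do at all. The mount point and size are made up for the example, and it obviously needs root:

  #!/usr/bin/env python3
  # Sketch only: shrink a mounted btrfs filesystem in place.
  # MOUNTPOINT and SHRINK_BY are hypothetical examples.
  import subprocess

  MOUNTPOINT = "/srv"      # hypothetical btrfs mount point
  SHRINK_BY = "-10g"       # reduce the filesystem size by 10 GiB

  # btrfs relocates block groups out of the region being removed,
  # then reduces the recorded device size, all while mounted.
  subprocess.run(["btrfs", "filesystem", "resize", SHRINK_BY, MOUNTPOINT], check=True)
  subprocess.run(["btrfs", "filesystem", "usage", MOUNTPOINT], check=True)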

and if you need maximal uptime and as a result have to reprovision the
system online, then you pretty much need a filesystem that supports
online shrinking.

That's a bigger topic than we can address here. The topic used to be
known in one related domain as "Very Large Databases", which were
defined as databases so large and critical that the time needed for
maintenance and backup was too long to take them offline etc.;
that is a topic that has largely vanished from discussion, I guess
because most management just don't want to hear it :-).
No, it's mostly vanished because of changes in best current practice. That was a topic in an era when the only platform that could handle high availability was VMS, and software wasn't routinely written to handle things like load balancing. As a result, people ran a single system which hosted the database, and if that went down, everything went down. By contrast, it's rare these days outside of small companies to see singly-hosted databases that aren't specific to the local system, and once you start parallelizing at the system level, backup and maintenance times generally go down.

Also, it's not really all that slow on most filesystems, BTRFS is just
hurt by its comparatively poor performance, and the COW metadata
updates that are needed.

Btrfs in realistic situations has pretty good speed *and* performance,
and COW actually helps, as it often results in less head repositioning
than update-in-place. What makes it a bit slower with metadata is having
'dup' by default to recover from especially damaging bitflips in
metadata, but then that does not impact performance, only speed.
I and numerous other people have done benchmarks running single metadata and single data profiles on BTRFS, and it consistently performs worse than XFS and ext4 even under those circumstances. It's not horrible performance (it's better for example than trying the same workload on NTFS on Windows), but it's still not what most people would call 'high' performance or speed.
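For anyone wanting to reproduce that kind of comparison, this is roughly the setup I mean; the device name is a placeholder and the command is destructive to whatever is on it:

  # Sketch: make a btrfs filesystem with 'single' data and 'single'
  # metadata profiles so the comparison against ext4/XFS doesn't
  # include the cost of duplicated metadata.
  import subprocess

  DEV = "/dev/sdX1"   # placeholder benchmark device

  subprocess.run(["mkfs.btrfs", "-f", "-d", "single", "-m", "single", DEV], check=True)
  # (mkfs.ext4 or mkfs.xfs -f on the same device for the filesystems
  # being compared, then run the identical workload on each.)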

That feature set is arguably not appropriate for VM images, but
lots of people know better :-).

That depends on a lot of factors.  I have no issues personally running
small VM images on BTRFS, but I'm also running on decent SSD's
(>500MB/s read and write speeds), using sparse files, and keeping on
top of managing them. [ ... ]

Having (relatively) big-budget high-IOPS storage for high-IOPS workloads
helps, that must have never occurred to me either :-).
It's not big budget; the SSDs in question are at best mid-range consumer SSDs that cost only marginally more than a decent hard drive, and they really don't get all that great performance in terms of IOPS because they're all on the same cheap SATA controller. The point I was trying to make (which I should have been clearer about) is that they have good bulk throughput, which means that the OS can do much more aggressive writeback caching, which in turn means that COW and fragmentation have less impact.
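To make the "sparse files and keeping on top of managing them" part concrete, this is roughly the kind of image setup I mean; the paths and sizes are invented for the example:

  # Sketch of a btrfs-friendly VM image setup: a directory flagged
  # No_COW (chattr +C) so images written into it don't fragment as
  # badly under random writes, plus a sparse image file.
  import os, subprocess

  img_dir = "/var/lib/vmimages"            # example directory
  os.makedirs(img_dir, exist_ok=True)

  # +C has to be set before any data is written; new files created
  # inside the directory inherit it.
  subprocess.run(["chattr", "+C", img_dir], check=True)

  img = os.path.join(img_dir, "test.img")
  with open(img, "wb") as f:
      f.truncate(20 * 1024**3)             # 20 GiB sparse file, no blocks allocated yet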

XFS and 'ext4' are essentially equivalent, except for the fixed-size
inode table limitation of 'ext4' (and XFS reportedly has finer
grained locking). Btrfs is nearly as good as either on most workloads
in single-device mode [ ... ]

No, if you look at actual data, [ ... ]

Well, I have looked at actual data in many published but often poorly
made "benchmarks", and to me they seem they seem quite equivalent
indeed, within somewhat differently shaped performance envelopes, so the
results depend on the testing point within that envelope. I have been
done my own simplistic actual data gathering, most recently here:

  http://www.sabi.co.uk/blog/17-one.html?170302#170302
  http://www.sabi.co.uk/blog/17-one.html?170228#170228

and however simplistic, they are fairly informative (and for writes they
point a finger at a layer below the filesystem type).
In terms of performance, yes, they are roughly equivalent. Performance isn't all that matters though, and once you get past that point, ext4 and XFS are significantly different in what they offer.

[ ... ]

"Flexibility" in filesystems, especially on rotating disk
storage with extremely anisotropic performance envelopes, is
very expensive, but of course lots of people know better :-).

Time is not free,

Your time seems especially and uniquely precious as you "waste"
as little as possible editing your replies into readability.

and humans generally prefer to minimize the amount of time they have
to work on things. This is why ZFS is so popular, it handles most
errors correctly by itself and usually requires very little human
intervention for maintenance.

That seems to me a pretty illusion, as it does not contain any magical
AI, just pretty ordinary and limited error correction for trivial cases.
On average, trivial cases account for most errors on any computer. So, by definition, to handle most errors correctly, you can get by with just handling all 'trivial' cases correctly. By handling all trivial cases correctly, ZFS is doing far better than any other current filesystem or storage stack can even begin to claim. It's been doing this since before most modern Linux distributions made their first release too, so compared to just about anything else people are using these days, it's got a pretty solid track record. Anyone trying to claim it's the best option in every case is obviously either a zealot or being paid, but for many cases, it really is one of the top options.
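To be clear about what "very little human intervention" means in practice, the routine maintenance most people automate is little more than a periodic scrub plus a health check, roughly like this (the pool name is made up):

  # Sketch: the periodic maintenance a typical ZFS pool actually needs.
  # POOL is a placeholder; scrub runs in the background and repairs
  # anything it can from redundancy.
  import subprocess

  POOL = "tank"   # placeholder pool name

  subprocess.run(["zpool", "scrub", POOL], check=True)
  status = subprocess.run(["zpool", "status", "-x", POOL],
                          capture_output=True, text=True, check=True)
  print(status.stdout)   # "pool 'tank' is healthy" unless something needs attention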

'Flexibility' in a filesystem costs some time on a regular basis, but
can save a huge amount of time in the long run.

Like everything else. The difficulty is having flexibility at scale with
challenging workloads. "An engineer can do for a nickel what any damn
fool can do for a dollar" :-).

To look at it another way, I have a home server system running BTRFS
on top of LVM. [ ... ]

But usually home servers have "unchallenging" workloads, and it is
relatively easy to overbudget their storage, because the total absolute
cost is "affordable".
OK, so running
  * Almost a dozen statically allocated VMs with a variety of differing workloads, including web servers, a local mail server, DHCP and DNS for the network, a VPN server, and 3 different file sharing protocols (which see rather regular use), among other things
  * On average between 4 and 10 transient VMs running regression testing on kernel patches (including automation of almost everything but selecting patches)
  * A BOINC client
  * GlusterFS (both client and storage node)
  * Network security monitoring (Nagios plus a handful of custom scripts)
  * Cloud storage software
All on the same system is an 'unchallenging' workload. Given that it's only got 32G of RAM and a cheap quad-core Xeon, that's a pretty damn challenging workload by most people's standards. I call it a home server because I run it out of my house, not because it's some trivial dinky little file server that could run just fine on something like a Raspberry Pi.