On Tue, 2 May 2017 05:01:02 +0000 (UTC), Duncan <1i5t5.dun...@cox.net> wrote:
> Of course on-list I'm somewhat known for my arguments propounding the
> notion that any filesystem that's too big to be practically
> maintained (including time necessary to restore from backups, should
> that be necessary for whatever reason) is... too big... and should
> ideally be broken along logical and functional boundaries into a
> number of individual smaller filesystems until such point as each one
> is found to be practically maintainable within a reasonably practical
> time frame. Don't put all the eggs in one basket, and when the bottom
> of one of those baskets inevitably falls out, most of your eggs will
> be safe in other baskets. =:^)

Hehe... Yes, you're a fan of small filesystems. I'm more from the opposite camp, preferring one big filesystem so I don't have to mess around with the size constraints of small filesystems fighting for the same volume space. A single big filesystem also has better chances for data locality, since data that would otherwise end up in totally different parts of your fs mounts stays together, which can reduce head movement. Of course, much of this no longer applies if you use a separate device per filesystem, or SSDs, or a SAN where you have no real control over the physical placement of the image stripes anyway. But well...

In an ideal world, btrfs subvolumes would be totally independent of each other, merely sharing the same volume and dynamically allocating chunks of space from it. If one broke, it alone would become unusable and could simply be destroyed. A garbage collector would grab the leftover chunks of that subvolume and free them, and you could recreate the subvolume from backup. In reality, shared extents cross subvolume borders, so this is probably not how things could work in the near or far future. The idea is more like thinly provisioned LVM volumes, which allocate space as the filesystems on top need it, much like thinly provisioned images on a VM host system.
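To make the thin-image idea below more concrete, here is a minimal Python sketch of what the host side of a guest-discard could look like: punching a hole into a sparse image file with Linux's fallocate(2). This is my own illustration (the flag values are copied from <linux/falloc.h>), not what VirtualBox or any particular hypervisor actually does.

```python
import ctypes, ctypes.util, os, tempfile

# Flag values from <linux/falloc.h>; Linux-only.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

def punch_hole(fd, offset, length):
    # Deallocate the byte range on disk; it reads back as zeros afterwards.
    ret = libc.fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                         ctypes.c_long(offset), ctypes.c_long(length))
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

# Build a 1 MiB "guest image" full of data, then discard the first half,
# as a host could do when the guest reports those blocks as unused.
with tempfile.NamedTemporaryFile() as img:
    img.write(b"x" * (1 << 20))
    img.flush()
    punch_hole(img.fileno(), 0, 1 << 19)
    img.seek(0)
    data = img.read()
    size_kept = os.fstat(img.fileno()).st_size

punched_is_zero = data[: 1 << 19] == b"\0" * (1 << 19)
tail_intact = data[1 << 19 :] == b"x" * (1 << 19)
```

Note that FALLOC_FL_KEEP_SIZE keeps the apparent file size unchanged, so the image stays the same size from the guest's point of view while the host's block allocation (st_blocks) shrinks; thin LVM does the analogous thing one layer down, at the block level.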
The problem there is that, unlike with subvolumes, those chunks of space can never be given back to the host, because the host doesn't know whether they are still in use. Of course, there are implementations that allow thinning the images by passing TRIM through from the guest to the host (or by other communication channels between host and guest), but that usually doesn't perform well, if it is supported at all. I once tried to exploit this in VirtualBox, hoping it would translate guest discards into hole-punching requests on the host, and it's even documented to work that way... But (a) it was horribly slow, and (b) it was incredibly unstable, to the point of being useless. OTOH, it's not announced as a stable feature and has to be enabled by manually editing the XML config files.

But I still like the idea: Would it be possible to make btrfs keep working when one subvolume gets corrupted? It would of course need a way of telling the user which other subvolumes are interconnected through shared extents, so those could also be discarded during corruption cleanup - at least if no sense could be made of those extents any longer. Since corruption is mostly an issue for subvolumes being actively written to, snapshots should be mostly safe.

Such a feature would also only make sense if btrfs had an online repair tool. BTW, are there plans for an online repair tool in the future? Maybe one that only scans and fixes parts of the filesystem (for obvious performance reasons, wrt Duncan's idea of handling filesystems), i.e. those parts where the kernel discovered corruption? If I could then just delete and restore the affected files, that would be even better than having independent subvolumes as outlined above.

-- 
Regards,
Kai

Replies to list-only preferred.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html