On 2015-09-22 13:32, Austin S Hemmelgarn wrote:
On that note, it might be nice to have the ability to say 'store at least n copies of this data' in addition to being able to say 'store exactly this many copies of this data' (this could be really helpful for filesystems with differing device sizes).

On 2015-09-22 11:39, Hugo Mills wrote:
On Tue, Sep 22, 2015 at 10:54:45AM -0400, Austin S Hemmelgarn wrote:
On 2015-09-22 10:36, Hugo Mills wrote:
On Tue, Sep 22, 2015 at 04:23:33PM +0200, David Sterba wrote:
On Tue, Sep 22, 2015 at 01:41:31PM +0000, Hugo Mills wrote:
On Tue, Sep 22, 2015 at 03:36:43PM +0200, Holger Hoffstätte wrote:
On 09/22/15 14:59, Jeff Mahoney wrote:
(snip)

So if the way we want to prevent the loss of raid type info is by maintaining the last block group allocated with that raid type, fine, but that's a separate discussion. Personally, I think keeping 1GB

At this point I'm much more surprised to learn that the RAID type can apparently get "lost" in the first place, and is not persisted separately. I mean..wat?

It's always been like that, unfortunately. The code tries to use the RAID type that's already present to work out what the next allocation should be. If there aren't any chunks in the FS, the configuration is lost, because it's not stored anywhere else. It's one of the things that tripped me up badly when I was failing to rewrite the chunk allocator last year.

Yeah, right now there's no persistent default for the allocator. I'm still hoping that the object properties will magically solve that.

There's no obvious place that filesystem-wide properties can be stored, though. There's a userspace tool to manipulate the few current FS-wide properties, but that's all special-cased to use the "historical" ioctls for those properties, with no generalisation of a property store, or even (IIRC) any external API for them.
We're nominally using xattrs in the btrfs: namespace on directories and files, and presumably on the top directory of a subvolume for subvol-wide properties, but it's not clear where the FS-wide values should go: in the top directory of subvolid=5 would be confusing, because then you couldn't separate the properties for *that subvol* from the ones for the whole FS (say, the default replication policy, where you might want the top subvol to have different properties from everything else).

Possibly do special names for the defaults and store them there? In general, I personally see little value in having some special 'default' properties however.

That would work.

The way I would expect things to work is that a new subvolume inherits its properties from its parent (if it's a snapshot),

Definitely this.

or from the next higher subvolume it's nested in.

I don't think I like this. I'm not quite sure why, though, at the moment. It definitely makes the process at the start of allocating a new block group much more complex: you have to walk back up through an arbitrary depth of nested subvols to find the one that's actually got a replication policy record in it. (Because after this feature is brought in, there will be lots of filesystems without per-subvol replication policies in them, and we have to have some way of dealing with those as well).

ro-compat flag perhaps?

With an FS default policy, you only need to check the current subvol, and then fall back to the FS default if that's not found. These things are, I think, likely to be lightly used: I would be reasonably surprised to find more than two or possibly three storage policies in use on any given system with a sane sysadmin.

I'm actually not sure what the interactions of multiple storage policies are going to be like. It's entirely possible, particularly with some of the more exotic (but useful) suggestions I've thought of, that the behaviour of the FS is dependent on the order in which the block groups are allocated (i.e. "20 GiB to subvol-A, then 20 GiB to subvol-B" results in different behaviour than "1 GiB to subvol-A then 1 GiB to subvol-B and repeat"). I tried some simple Monte-Carlo simulations, but I didn't get any concrete results out of it before the end of the train journey. :)

Yeah, I could easily see that getting complicated when you add in the (hopefully soon) possibility of n-copy replication.
This would obviate the need for some special 'default' properties, and would be relatively intuitive behavior for a significant majority of people.

Of course, you shouldn't be nesting subvolumes anyway. It makes it much harder to manage them.

That depends, though. I only ever do single nesting (i.e., a subvolume in a subvolume), and I use it to exclude stuff from getting saved in snapshots (mostly stuff like clones of public git trees, or other stuff that's easy to reproduce without a backup). Beyond that though, there are other inherent issues of course.
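
For anyone trying to compare the two lookup schemes discussed above (inherit from the enclosing subvolume vs. check the current subvolume and fall back to an FS-wide default), here is a rough Python sketch. It is only a toy model: `Subvol`, `FALLBACK`, `policy_walk_up` and `policy_fs_default` are made-up names for illustration, not btrfs structures or APIs, and the "single" fallback value is an assumption. It just shows that the walk-up variant inspects an arbitrary number of subvolumes, while the FS-default variant always inspects exactly one.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Toy model only: these names are hypothetical and do not correspond
# to real btrfs data structures.
@dataclass
class Subvol:
    name: str
    parent: Optional["Subvol"] = None  # enclosing subvolume, if nested
    policy: Optional[str] = None       # e.g. "raid1"; None = no record stored

FALLBACK = "single"  # assumed value used when no policy record exists at all


def policy_walk_up(sv: Subvol) -> Tuple[str, int]:
    """Inherit from the nearest enclosing subvolume that has a policy.

    Returns (policy, number of subvolumes inspected).
    """
    inspected = 0
    cur: Optional[Subvol] = sv
    while cur is not None:
        inspected += 1
        if cur.policy is not None:
            return cur.policy, inspected
        cur = cur.parent
    # Filesystems with no per-subvol records still need some answer.
    return FALLBACK, inspected


def policy_fs_default(sv: Subvol) -> Tuple[str, int]:
    """Check only the current subvolume, then use the FS-wide default."""
    if sv.policy is not None:
        return sv.policy, 1
    return FALLBACK, 1


# Two levels of nesting below a top subvolume that carries a policy.
top = Subvol("top", policy="raid1")
a = Subvol("a", parent=top)
b = Subvol("b", parent=a)

print(policy_walk_up(b))     # walks b -> a -> top
print(policy_fs_default(b))  # looks at b only
```

Note that the two schemes can give different answers for the same subvolume: in the walk-up model `b` picks up the policy of `top`, while in the FS-default model it gets the filesystem-wide value unless a record is stored on `b` itself.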