On Tue, Jun 11, 2019 at 10:03:51PM -0600, Chris Murphy wrote:
> On Tue, Jun 11, 2019 at 12:31 PM Neal Gompa <ngomp...@gmail.com> wrote:
> >
> > Hey,
> >
> > So Apple held its WWDC event last week, and among other things, they
> > talked about improvements they've made to filesystems in macOS[1].
> >
> > Among other things, one of the things introduced was a concept of
> > "firm links", which is something like NTFS' directory junctions,
> > except they can cross (sub)volumes.
>
> My understanding is it's a work around for the lack of APFS supporting
> directory hardlinks. Btrfs does support directory hardlinks but a

Directory hardlinks are not supported in general on Linux and are
prohibited at the VFS level (check vfs_link in fs/namei.c, which
explicitly returns -EPERM for a directory).

> hardlink points to a particular inode within a particular subvolume
> (files tree) so it's not possible to have a hard link that crosses
> subvolumes. A reflink can already do this, but it's really just an
> efficient copy, the resulting directory is independent. A directory
> symlink can mirror a directory across subvolumes, but like any symlink
> it must have a fixed path available to always find the real deal.
>
> I think a firm link like thing on Btrfs would require a format change,
> but I'm not certain. My best guess of what it'd be, is a dir/file
> object that gets its own inode but contains a hard reference (not
> independent object) to a subvolid+inode.
>
> > This concept makes it easier to
> > handle uglier layouts. While bind mounts work kind of okay for this
> > with simpler configurations, it requires operating system awareness,
> > rather than being setup automatically as the volume is mounted. This
> > is less brittle and works better for recovery environments, and help
> > make easier to do read-only system volumes while supported read-write
> > sections in a more flexible way.
>
> There are a couple of things going on. One is something between VFS
> and Btrfs does this goofy assumption that bind mounts are subvolumes,
> which is definitely not true. I bring this up here:
> https://lore.kernel.org/linux-btrfs/CAJCQCtT=-YoFJgEo=bfqfipdtmojcyr3djpsekf+hq22gyg...@mail.gmail.com/

Subvolumes are built on top of the bind mount API internally, but a
subvolume is, or should be, a different kind of object.

> Near as I can tell, Btrfs kernel code just needs to be smarter about
> distinguishing between bind mounts of directories versus the behind
> the scene bind mount used for subvolumes mounted using -o subvol= or
> -o subvolid= ; I don't think that's difficult.
> It's just someone needs to work through the logic and set aside the
> resources to do it.

I tried to fix that and got halfway through, then hit the difficult
problems, mainly with nested subvolumes. For leaf subvolumes, the
difference between subvolume/dir/dir/dir (bind mounted) and subvolume
(mounted with -o) is resolved by traversing back up the path until a
subvolume is hit, which in both cases would be 'subvolume'.

However, with nested subvolumes it's not easy to see where to stop.
Consider

  subvol1/dir/dir/subvol2/dir/dir/subvol3/dir/dir

and take 3 cases:

  mount -o subvol=subvol1
  mount -o subvol=subvol2
  mount -o subvol=subvol3

The backward path traversal will always say it's subvol3 (that's wrong
from the user's POV). Keeping track of the exact subvolume that was
mounted is not trivial, because it partially has to duplicate the
internal VFS information, which makes it hard to keep consistent after
moves.

There was a concept proposal called 'fs view' that would add a proper
subvolume abstraction to VFS, but I don't know how far this got.
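
To make the nested case concrete, here is a toy userspace model of the
backward traversal described above. It is only a sketch and not the
kernel code: the function name nearest_subvolume and the SUBVOLS set
are made up for illustration, and path components are matched by name,
whereas real code would have to compare filesystem roots. The point it
shows is that the walk takes no input about which subvolume was
actually mounted, so it cannot give different answers for the 3 cases.

```python
from typing import Optional, Set


def nearest_subvolume(path: str, subvolumes: Set[str]) -> Optional[str]:
    """Walk from the deepest path component back toward the root and
    return the first component that is a subvolume (hypothetical model)."""
    for component in reversed(path.strip("/").split("/")):
        if component in subvolumes:
            return component
    return None


# Assumed set of subvolume names, taken from the examples in the text.
SUBVOLS = {"subvolume", "subvol1", "subvol2", "subvol3"}

# Leaf case: a bind-mounted directory and the subvolume itself both
# resolve to the same subvolume.
leaf_bind = nearest_subvolume("subvolume/dir/dir/dir", SUBVOLS)
leaf_mount = nearest_subvolume("subvolume", SUBVOLS)

# Nested case: the traversal never sees which subvolume was passed to
# mount -o subvol=, so every case resolves to the innermost one.
nested = "subvol1/dir/dir/subvol2/dir/dir/subvol3/dir/dir"
answers = {
    mounted: nearest_subvolume(nested, SUBVOLS)
    for mounted in ("subvol1", "subvol2", "subvol3")
}

for mounted, found in answers.items():
    print(f"mount -o subvol={mounted}: traversal reports {found}")
```

Getting the right answer for subvol1 or subvol2 needs extra state
recorded at mount time, which is exactly the part that overlaps with
what VFS already tracks.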