Re: [RFC v3 0/2] vfs / btrfs: add support for ustat()
On Thu, Apr 15, 2021 at 2:29 PM Luis Chamberlain wrote:
>
> On Thu, Apr 15, 2021 at 02:17:58PM -0400, Josef Bacik wrote:
> > There's a lot of larger things that need to
> > be addressed in general to support the volume approach inside file systems
> > that is going to require a lot of work inside of VFS. If you feel like
> > tackling that work and then wiring up btrfs by all means have at it, but I'm
> > not seeing an urgent need to address this. Thanks,
>
> That's precisely what I want to hear about. Things like this.
> Would btrfs be the only user of volumes inside a filesystem? Jeff had
> mentioned before this could also allow namespaces per volume, and this
> might be a desirable feature.
>
> What else?

Wouldn't this be useful for union filesystems like OverlayFS? Or other
filesystems that support nested filesystems like bcachefs?

--
真実はいつも一つ!/ Always, there's only one truth!
Re: [RFC v3 0/2] vfs / btrfs: add support for ustat()
On Thu, Apr 15, 2021 at 02:17:58PM -0400, Josef Bacik wrote:
> There's a lot of larger things that need to
> be addressed in general to support the volume approach inside file systems
> that is going to require a lot of work inside of VFS. If you feel like
> tackling that work and then wiring up btrfs by all means have at it, but I'm
> not seeing an urgent need to address this. Thanks,

That's precisely what I want to hear about. Things like this.
Would btrfs be the only user of volumes inside a filesystem? Jeff had
mentioned before this could also allow namespaces per volume, and this
might be a desirable feature.

What else?

  Luis
Re: [RFC v3 0/2] vfs / btrfs: add support for ustat()
On 4/15/21 1:53 PM, Luis Chamberlain wrote:
> On Wed, Aug 23, 2017 at 3:31 PM Jeff Mahoney wrote:
> > On 8/15/14 5:29 AM, Al Viro wrote:
> > > On Thu, Aug 14, 2014 at 07:58:56PM -0700, Luis R. Rodriguez wrote:
> > > > Christoph had noted that this seemed associated to the problem
> > > > that the btrfs uses different assignments for st_dev than s_dev,
> > > > but much as I'd like to see that changed based on discussions so
> > > > far it's unclear if this is going to be possible unless strong
> > > > commitment is reached.
> >
> > Resurrecting a dead thread since we've been carrying this patch anyway
> > since then.
> >
> > > Explain, please. Whose commitment and commitment to what, exactly?
> > > Having different ->st_dev values for different files on the same
> > > fs is a bloody bad idea; why does btrfs do that at all? If nothing else,
> > > it breaks the usual "are those two files on the same fs?" tests...
> >
> > It's because btrfs snapshots would have inode number collisions.
> > Changing the inode numbers for snapshots would negate a big benefit of
> > btrfs snapshots: the quick creation and lightweight on-disk
> > representation due to metadata sharing.
> >
> > The thing is that ustat() used to work. Your commit 0ee5dc676a5f8
> > (btrfs: kill magical embedded struct superblock) had a regression:
> > Since it replaced the superblock with a simple dev_t, it rendered the
> > device no longer discoverable by user_get_super. We need a list_head to
> > attach for searching.
> >
> > There's an argument that this is hacky. It's valid. The only other
> > feedback I've heard is to use a real superblock for subvolumes to do
> > this instead. That doesn't work either, due to things like freeze/thaw
> > and inode writeback. Ultimately, what we need is a single file system
> > with multiple namespaces. Years ago we just needed different inode
> > namespaces, but as people have started adopting btrfs for containers, we
> > need more than that. I've heard requests for per-subvolume security
> > contexts. I'd imagine user namespaces are on someone's wish list. A
> > working df can be done with ->d_automount, but the way btrfs handles
> > having a "canonical" subvolume location has always been a way to avoid
> > directory loops. I'd like to just automount subvolumes everywhere
> > they're referenced. One solution, for which I have no code yet, is to
> > have something like a superblock-light that we can hang things like a
> > security context, a user namespace, and an anonymous dev. Most file
> > systems would have just one. Btrfs would have one per subvolume.
> >
> > That's a big project with a bunch of discussion.
>
> 4 years have gone by and this patch is still being carried around for
> btrfs. Other than resolving this ustat() issue for btrfs are there new
> reasons to support this effort being done properly? Are there other
> filesystems that would benefit?
>
> I'd like to get an idea of the stakeholders here before considering
> taking this on or not.

Not really sure why this needs to be addressed, we have statfs(), and what we
have has worked forever now. There's a lot of larger things that need to be
addressed in general to support the volume approach inside file systems that is
going to require a lot of work inside of VFS. If you feel like tackling that
work and then wiring up btrfs by all means have at it, but I'm not seeing an
urgent need to address this. Thanks,

Josef
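For readers following along, the statfs() alternative Josef refers to takes a path rather than a dev_t, so it does not depend on the superblock being discoverable by device number. A minimal sketch, assuming Linux and glibc; the helper name free_bytes_for_path is made up for illustration and is not from any patch in this thread:

```c
#include <stdint.h>
#include <sys/vfs.h> /* statfs(2) on Linux */

/*
 * Hypothetical helper: report the bytes available to unprivileged users
 * on the filesystem that holds `path`.
 * Returns 0 on success, -1 on error (errno is set by statfs()).
 */
int free_bytes_for_path(const char *path, uint64_t *out)
{
    struct statfs sfs;

    if (statfs(path, &sfs) != 0)
        return -1;

    /* f_bavail counts blocks of f_bsize bytes each. */
    *out = (uint64_t)sfs.f_bavail * (uint64_t)sfs.f_bsize;
    return 0;
}
```

A caller would pass any path inside the mount or subvolume of interest, for example free_bytes_for_path("/", &n).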
Re: [RFC v3 0/2] vfs / btrfs: add support for ustat()
On Wed, Aug 23, 2017 at 3:31 PM Jeff Mahoney wrote:
>
> On 8/15/14 5:29 AM, Al Viro wrote:
> > On Thu, Aug 14, 2014 at 07:58:56PM -0700, Luis R. Rodriguez wrote:
> >
> >> Christoph had noted that this seemed associated to the problem
> >> that the btrfs uses different assignments for st_dev than s_dev,
> >> but much as I'd like to see that changed based on discussions so
> >> far it's unclear if this is going to be possible unless strong
> >> commitment is reached.
>
> Resurrecting a dead thread since we've been carrying this patch anyway
> since then.
>
> > Explain, please. Whose commitment and commitment to what, exactly?
> > Having different ->st_dev values for different files on the same
> > fs is a bloody bad idea; why does btrfs do that at all? If nothing else,
> > it breaks the usual "are those two files on the same fs?" tests...
>
> It's because btrfs snapshots would have inode number collisions.
> Changing the inode numbers for snapshots would negate a big benefit of
> btrfs snapshots: the quick creation and lightweight on-disk
> representation due to metadata sharing.
>
> The thing is that ustat() used to work. Your commit 0ee5dc676a5f8
> (btrfs: kill magical embedded struct superblock) had a regression:
> Since it replaced the superblock with a simple dev_t, it rendered the
> device no longer discoverable by user_get_super. We need a list_head to
> attach for searching.
>
> There's an argument that this is hacky. It's valid. The only other
> feedback I've heard is to use a real superblock for subvolumes to do
> this instead. That doesn't work either, due to things like freeze/thaw
> and inode writeback. Ultimately, what we need is a single file system
> with multiple namespaces. Years ago we just needed different inode
> namespaces, but as people have started adopting btrfs for containers, we
> need more than that. I've heard requests for per-subvolume security
> contexts. I'd imagine user namespaces are on someone's wish list. A
> working df can be done with ->d_automount, but the way btrfs handles
> having a "canonical" subvolume location has always been a way to avoid
> directory loops. I'd like to just automount subvolumes everywhere
> they're referenced. One solution, for which I have no code yet, is to
> have something like a superblock-light that we can hang things like a
> security context, a user namespace, and an anonymous dev. Most file
> systems would have just one. Btrfs would have one per subvolume.
>
> That's a big project with a bunch of discussion.

4 years have gone by and this patch is still being carried around for
btrfs. Other than resolving this ustat() issue for btrfs are there new
reasons to support this effort being done properly? Are there other
filesystems that would benefit?

I'd like to get an idea of the stakeholders here before considering
taking this on or not.

  Luis
Re: [RFC v3 0/2] vfs / btrfs: add support for ustat()
On 8/15/14 5:29 AM, Al Viro wrote:
> On Thu, Aug 14, 2014 at 07:58:56PM -0700, Luis R. Rodriguez wrote:
>
>> Christoph had noted that this seemed associated to the problem
>> that the btrfs uses different assignments for st_dev than s_dev,
>> but much as I'd like to see that changed based on discussions so
>> far it's unclear if this is going to be possible unless strong
>> commitment is reached.

Resurrecting a dead thread since we've been carrying this patch anyway
since then.

> Explain, please. Whose commitment and commitment to what, exactly?
> Having different ->st_dev values for different files on the same
> fs is a bloody bad idea; why does btrfs do that at all? If nothing else,
> it breaks the usual "are those two files on the same fs?" tests...

It's because btrfs snapshots would have inode number collisions.
Changing the inode numbers for snapshots would negate a big benefit of
btrfs snapshots: the quick creation and lightweight on-disk
representation due to metadata sharing.

The thing is that ustat() used to work. Your commit 0ee5dc676a5f8
(btrfs: kill magical embedded struct superblock) had a regression:
Since it replaced the superblock with a simple dev_t, it rendered the
device no longer discoverable by user_get_super. We need a list_head to
attach for searching.

There's an argument that this is hacky. It's valid. The only other
feedback I've heard is to use a real superblock for subvolumes to do
this instead. That doesn't work either, due to things like freeze/thaw
and inode writeback. Ultimately, what we need is a single file system
with multiple namespaces. Years ago we just needed different inode
namespaces, but as people have started adopting btrfs for containers, we
need more than that. I've heard requests for per-subvolume security
contexts. I'd imagine user namespaces are on someone's wish list. A
working df can be done with ->d_automount, but the way btrfs handles
having a "canonical" subvolume location has always been a way to avoid
directory loops. I'd like to just automount subvolumes everywhere
they're referenced. One solution, for which I have no code yet, is to
have something like a superblock-light that we can hang things like a
security context, a user namespace, and an anonymous dev. Most file
systems would have just one. Btrfs would have one per subvolume.

That's a big project with a bunch of discussion. So for now, I'd like
to move this patch forward while we (I) work on the bigger issue.

BTW, in this same thread, Christoph said:
> Again, NAK. Make btrfs report the proper anon dev_t in stat and
> everything will just work.

We do. We did then too. But what doesn't work is a user doing stat()
and then using the dev_t to call ustat().

-Jeff

--
Jeff Mahoney
SUSE Labs
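The failing sequence Jeff describes, stat() followed by ustat() on the returned dev_t, can be sketched from userspace roughly as below. This is only an illustration under several assumptions: glibc 2.28 and later no longer ship <ustat.h>, so the raw syscall is used; the struct layout is a local stand-in assumed to match x86-64; and the names ustat_buf and ustat_for_path are invented here. Going by the thread, on a btrfs subvolume the st_dev reported by stat() is an anonymous dev_t that user_get_super cannot find, so the call would be expected to fail there (presumably with EINVAL).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Local stand-in for the kernel's struct ustat (assumption: x86-64 field
 * layout). The union pads the buffer so that a slightly different kernel
 * layout cannot overrun it.
 */
union ustat_buf {
    struct {
        int f_tfree;            /* free blocks */
        unsigned long f_tinode; /* free inodes */
        char f_fname[6];
        char f_fpack[6];
    } u;
    char pad[64];
};

/*
 * Hypothetical helper: stat() a path, then hand its st_dev to ustat().
 * Returns 0 on success, -1 with errno set on failure.
 */
int ustat_for_path(const char *path, union ustat_buf *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;

#ifdef SYS_ustat
    /* The syscall takes a 32-bit device number. */
    if ((unsigned long long)st.st_dev > 0xffffffffULL) {
        errno = EINVAL;
        return -1;
    }
    return (int)syscall(SYS_ustat, (unsigned int)st.st_dev, out);
#else
    (void)out;
    errno = ENOSYS; /* this architecture has no ustat syscall */
    return -1;
#endif
}
```

Comparing the result for a path on a conventional filesystem with one inside a btrfs subvolume is one way to observe the difference discussed here.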
Re: [RFC v3 0/2] vfs / btrfs: add support for ustat()
On Fri, Aug 15, 2014 at 10:29:50AM +0100, Al Viro wrote:
> On Thu, Aug 14, 2014 at 07:58:56PM -0700, Luis R. Rodriguez wrote:
>
> > Christoph had noted that this seemed associated to the problem
> > that the btrfs uses different assignments for st_dev than s_dev,
> > but much as I'd like to see that changed based on discussions so
> > far it's unclear if this is going to be possible unless strong
> > commitment is reached.
>
> Explain, please. Whose commitment and commitment to what, exactly?

There are two groups: the btrfs developers, and the VFS maintainers,
who would need to provide proper guidance.

> Having different ->st_dev values for different files on the same
> fs is a bloody bad idea; why does btrfs do that at all?

With the disclaimer that I'm new to btrfs: as far as I can see it's been
done to help cope with the copy on write mechanism, but I welcome btrfs
folks to chime in if there are other reasons this was done from an
architectural point of view. Provided all the reasons why this was done
are clarified, what we'd need then is proper guidance on what *would* be
a much more reasonable strategy to do what was desired, and finally
commitment from btrfs folks to change btrfs to switch to this new agreed
upon strategy.

> If nothing else,
> it breaks the usual "are those two files on the same fs?" tests...

It would seem that those tests need more context now with copy on write.
Even the notion of disk space is all fucked up now; we need to think of
it in terms of the different ways the new filesystems allow us to share
data and the different outcomes that could be possible.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC v3 0/2] vfs / btrfs: add support for ustat()
On Thu, Aug 14, 2014 at 07:58:56PM -0700, Luis R. Rodriguez wrote:

> Christoph had noted that this seemed associated to the problem
> that the btrfs uses different assignments for st_dev than s_dev,
> but much as I'd like to see that changed based on discussions so
> far it's unclear if this is going to be possible unless strong
> commitment is reached.

Explain, please. Whose commitment and commitment to what, exactly?
Having different ->st_dev values for different files on the same
fs is a bloody bad idea; why does btrfs do that at all? If nothing else,
it breaks the usual "are those two files on the same fs?" tests...
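For context, the "are those two files on the same fs?" test mentioned above is commonly written as a comparison of st_dev from two stat() calls. A minimal sketch, with the function name same_fs chosen here purely for illustration:

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: the classic "same filesystem?" check.
 * Returns 1 if both paths report the same st_dev, 0 if they differ,
 * and -1 if either stat() call fails (errno is set by stat()).
 */
int same_fs(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;

    return sa.st_dev == sb.st_dev;
}
```

With per-subvolume st_dev values as described in this thread, two files in different subvolumes of one btrfs filesystem would make this check return 0 even though they share a single filesystem.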