On Wed, Feb 18, 2026 at 11:58 AM T.J. Mercier <[email protected]> wrote:
>
> On Wed, Feb 18, 2026 at 11:15 AM T.J. Mercier <[email protected]> wrote:
> >
> > On Wed, Feb 18, 2026 at 10:37 AM Jan Kara <[email protected]> wrote:
> > >
> > > On Wed 18-02-26 10:06:35, T.J. Mercier wrote:
> > > > On Wed, Feb 18, 2026 at 10:01 AM Jan Kara <[email protected]> wrote:
> > > > >
> > > > > On Tue 17-02-26 19:22:31, T.J. Mercier wrote:
> > > > > > Currently some kernfs files (e.g. cgroup.events, memory.events)
> > > > > > support inotify watches for IN_MODIFY, but unlike with regular
> > > > > > filesystems, they do not receive IN_DELETE_SELF or IN_IGNORED events
> > > > > > when they are removed.
> > > > >
> > > > > Please see my email:
> > > > > https://lore.kernel.org/all/lc2jgt3yrvuvtdj2kk7q3rloie2c5mzyhfdy4zvxylx732voet@ol3kl4ackrpb
> > > > >
> > > > > I think this is actually a bug in kernfs...
> > > > >
> > > > > Honza
> > > >
> > > > Thanks, I'm looking at this now. I've tried calling clear_nlink in
> > > > kernfs_iop_rmdir, but I've found that when we get back to vfs_rmdir
> > > > and shrink_dcache_parent is called, d_walk doesn't find any entries,
> > > > so shrink_kill->__dentry_kill is not called. I'm investigating why
> > > > that is...
> > >
> > > Strange because when I was experimenting with this in my VM I have seen
> > > __dentry_kill being called (if the dentries were created by someone
> > > looking up the names).
> >
> > Ahh yes, that's the difference. I was just doing mkdir
> > /sys/fs/cgroup/foo immediately followed by rmdir /sys/fs/cgroup/foo.
> > kernfs creates the dentries in kernfs_iop_lookup, so there were none
> > when I did the rmdir because I didn't cause any lookups.
> >
> > If I actually have a program watching
> > /sys/fs/cgroup/foo/memory.events, then I do see the __dentry_kill kill
> > calls, but despite the prior clear_nlink call i_nlink is 1 so
> > fsnotify_inoderemove is skipped. Something must be incrementing it.
>
> The issue was that kernfs_remove unlinks the kernfs nodes, but doesn't
> clear_nlink when it does so. Adding that seems to work to generate
> IN_DELETE_SELF and IN_IGNORED. I'll do some more testing and get a
> patch ready.

This works for the rmdir case, because vfs_rmdir->shrink_dcache_parent->shrink_kill->__dentry_kill is invoked when the user runs rmdir. However, it does not work when a kernfs file is removed because a cgroup subsys is deactivated, since that removal is triggered by the user writing to cgroup.subtree_control. That is a vfs_write, which calls fsnotify_modify for cgroup.subtree_control, but (very reasonably) VFS makes no attempt to clean up the dcache on writes.

So I think kernfs still needs to generate fsnotify events manually for the cgroup_subtree_control_write->cgroup_apply_control_disable case. Those removals happen via kernfs_remove_by_name->__kernfs_remove, so that would look a lot like what I sent in this v3 patch, even if we also add clear_nlink calls for the rmdir case.
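
For anyone who wants to reproduce either case from userspace, a minimal watcher sketch might look like the following. It is not part of the patch. The helper names open_watch and collect_events and the WATCH_DEMO_MAIN guard are made up for illustration, and it only uses the standard inotify(7) calls. It watches one file for IN_MODIFY and IN_DELETE_SELF and reports whether IN_DELETE_SELF and IN_IGNORED ever arrive.

```c
/* Sketch only: report which inotify events arrive when a watched file goes away. */
#include <poll.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: create an inotify instance watching `path`.
 * Returns the inotify fd, or -1 on error. */
static int open_watch(const char *path)
{
    int ifd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC);
    if (ifd < 0)
        return -1;
    if (inotify_add_watch(ifd, path, IN_MODIFY | IN_DELETE_SELF) < 0) {
        close(ifd);
        return -1;
    }
    return ifd;
}

/* Hypothetical helper: read events until IN_IGNORED shows up or poll()
 * times out (timeout_ms < 0 waits forever). Returns the OR of all masks. */
static uint32_t collect_events(int ifd, int timeout_ms)
{
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    struct pollfd pfd = { .fd = ifd, .events = POLLIN };
    uint32_t seen = 0;

    while (!(seen & IN_IGNORED) && poll(&pfd, 1, timeout_ms) > 0) {
        ssize_t len = read(ifd, buf, sizeof(buf));
        if (len <= 0)
            break;
        const struct inotify_event *ev;
        for (char *p = buf; p < buf + len; p += sizeof(*ev) + ev->len) {
            ev = (const struct inotify_event *)p;
            seen |= ev->mask;
        }
    }
    return seen;
}

#ifdef WATCH_DEMO_MAIN
int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <path>\n", argv[0]);
        return EXIT_FAILURE;
    }
    int ifd = open_watch(argv[1]);
    if (ifd < 0) {
        perror("open_watch");
        return EXIT_FAILURE;
    }
    /* Blocks until the watch is torn down; IN_MODIFY events are just accumulated. */
    uint32_t seen = collect_events(ifd, -1);
    printf("IN_MODIFY=%d IN_DELETE_SELF=%d IN_IGNORED=%d\n",
           !!(seen & IN_MODIFY), !!(seen & IN_DELETE_SELF), !!(seen & IN_IGNORED));
    close(ifd);
    return EXIT_SUCCESS;
}
#endif
```

Built with -DWATCH_DEMO_MAIN and pointed at something like /sys/fs/cgroup/foo/memory.events, the program should return once the file is removed if the events are generated, and keep blocking if they are not. The same binary can be used for both the rmdir case and the cgroup.subtree_control case.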

