On Thu, Feb 09, 2017 at 10:36:15AM +0300, Konstantin Khlebnikov wrote:
> Ok, Thank you. I've expected that this fix isn't sane,
>
> Maybe we could minimize changes for now. For example: keep these
> stale dentries in memory but silently unhash them in ->d_compare().
> Memory pressure and reclaimer will kill them later.

->d_compare() is called by the code walking the hash chains.  What's
worse, in the most common case all we have is rcu_read_lock().
Modifying the chain in rcu reader is no-go.  Turning __d_lookup_rcu()
into a writer on the off-chance that we'll walk onto a visibly stale
sysctl dentry - even more so.

If you want to deal with that, do it right, please.  Have sysctl inodes
on a list of some kind anchored in struct ctl_table_header; insert them
there in proc_sys_make_inode(), remove - in proc_evict_inode() (or have
it pass the inode to sysctl_head_put() and do the removal there).  Use
sysctl_lock for serialization.

In start_unregistering(), just before the erase_header() call, check if
the list is non-empty and if it is -

	grab sysctl_lock
	last = NULL
	walk the list
		igrab(inode we are looking at)
		if succeeded
			drop sysctl_lock
			iput(last)
			last = that inode
			d_prune_aliases(last)
			retake sysctl_lock
			// inode is still not evicted, so it's still on the list
	drop sysctl_lock
	iput(last)

list would pass through struct proc_inode, and I would probably use
hlist rather than the normal one; might be more convenient to
initialize that way.  Getting from containing struct proc_inode to
inode - &ei->vfs_inode.

It's not that much work; if you have time - go for it, or remind me
after -rc1...
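
For illustration only, the locking pattern of the walk above can be modelled in plain userspace C.  This is a sketch under loud assumptions, not kernel code: every model_* name is hypothetical, an int flag stands in for sysctl_lock, a singly linked list stands in for the hlist through struct proc_inode, and each alias is assumed to hold exactly one reference on its inode.  What it tries to show is why the reference held on "last" keeps the current position on the list while the lock is dropped, and why iput(last) is only ever called without the lock.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: none of these are the kernel's types or helpers. */
struct model_header;

struct model_inode {
	struct model_inode *next;	/* link in the per-header list */
	struct model_header *head;	/* header whose list we are on */
	int refcount;			/* 0: eviction already under way */
	int aliases;			/* stand-in for dentries; one ref each */
};

struct model_header {
	struct model_inode *inodes;	/* list head, protected by 'locked' */
	int locked;			/* stand-in for sysctl_lock */
};

static void model_lock(struct model_header *h)   { assert(!h->locked); h->locked = 1; }
static void model_unlock(struct model_header *h) { assert(h->locked);  h->locked = 0; }

/* Eviction unlinks the inode from the list under the lock. */
static void model_evict(struct model_inode *inode)
{
	struct model_header *h = inode->head;
	struct model_inode **p;

	model_lock(h);
	for (p = &h->inodes; *p; p = &(*p)->next) {
		if (*p == inode) {
			*p = inode->next;
			break;
		}
	}
	model_unlock(h);
}

/* Modelled on igrab(): fails if the inode is already going away. */
static struct model_inode *model_igrab(struct model_inode *inode)
{
	if (inode->refcount == 0)
		return NULL;
	inode->refcount++;
	return inode;
}

/* Modelled on iput(): accepts NULL; the last reference triggers eviction. */
static void model_iput(struct model_inode *inode)
{
	if (inode && --inode->refcount == 0)
		model_evict(inode);
}

/* Modelled on d_prune_aliases(): must not be called with the lock held. */
static void model_prune_aliases(struct model_inode *inode)
{
	assert(!inode->head->locked);
	assert(inode->refcount > inode->aliases);	/* caller holds its own ref */
	inode->refcount -= inode->aliases;
	inode->aliases = 0;
}

/* The walk from the pseudocode above. */
static void model_prune_all(struct model_header *h)
{
	struct model_inode *inode, *last = NULL;

	model_lock(h);
	for (inode = h->inodes; inode; inode = inode->next) {
		if (!model_igrab(inode))
			continue;		/* being evicted; skip it */
		model_unlock(h);
		model_iput(last);		/* may evict the previous inode */
		last = inode;
		model_prune_aliases(last);
		model_lock(h);
		/* 'last' is pinned, so it is still on the list here */
	}
	model_unlock(h);
	model_iput(last);
}
```

In the model, an inode whose refcount is already zero stays on the list untouched, since its eviction is assumed to be in progress elsewhere; everything else loses its aliases and is unlinked once the walker drops the last reference.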