[PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
If DCACHE_REFERENCED is set, fast_dput() will return true, and then retain_dentry() have no chance to check DCACHE_DONTCACHE. As a result, the dentry won't be killed and the corresponding inode can't be evicted. In the following example, the DAX policy can't take effects unless we do a drop_caches manually. # DCACHE_LRU_LIST will be set echo abcdefg > test.txt # DCACHE_REFERENCED will be set and DCACHE_DONTCACHE can't do anything xfs_io -c 'chattr +x' test.txt # Drop caches to make DAX changing take effects echo 2 > /proc/sys/vm/drop_caches What this patch does is preventing fast_dput() from returning true if DCACHE_DONTCACHE is set. Then retain_dentry() will detect the DCACHE_DONTCACHE and will return false. As a result, the dentry will be killed and the inode will be evicted. In this way, if we change per-file DAX policy, it will take effects automatically after this file is closed by all processes. I also add some comments to make the code more clear. Signed-off-by: Hao Li Reviewed-by: Jan Kara Reviewed-by: Ira Weiny --- This patch may have been forgotten. Original patch: https://lore.kernel.org/linux-fsdevel/20200924055958.825515-1-lihao2018.f...@cn.fujitsu.com/ fs/dcache.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/fs/dcache.c b/fs/dcache.c index ea0485861d93..97e81a844a96 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -793,10 +793,17 @@ static inline bool fast_dput(struct dentry *dentry) * a reference to the dentry and change that, but * our work is done - we can leave the dentry * around with a zero refcount. +* +* Nevertheless, there are two cases that we should kill +* the dentry anyway. +* 1. free disconnected dentries as soon as their refcount +*reached zero. +* 2. free dentries if they should not be cached. */ smp_rmb(); d_flags = READ_ONCE(dentry->d_flags); - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | + DCACHE_DISCONNECTED | DCACHE_DONTCACHE; /* Nothing to do? Dropping the reference was all we needed? */ if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && !d_unhashed(dentry)) -- 2.28.0
[PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
If DCACHE_REFERENCED is set, fast_dput() will return true, and then retain_dentry() have no chance to check DCACHE_DONTCACHE. As a result, the dentry won't be killed and the corresponding inode can't be evicted. In the following example, the DAX policy can't take effects unless we do a drop_caches manually. # DCACHE_LRU_LIST will be set echo abcdefg > test.txt # DCACHE_REFERENCED will be set and DCACHE_DONTCACHE can't do anything xfs_io -c 'chattr +x' test.txt # Drop caches to make DAX changing take effects echo 2 > /proc/sys/vm/drop_caches What this patch does is preventing fast_dput() from returning true if DCACHE_DONTCACHE is set. Then retain_dentry() will detect the DCACHE_DONTCACHE and will return false. As a result, the dentry will be killed and the inode will be evicted. In this way, if we change per-file DAX policy, it will take effects automatically after this file is closed by all processes. I also add some comments to make the code more clear. Signed-off-by: Hao Li Reviewed-by: Jan Kara Reviewed-by: Ira Weiny --- This patch may have been forgotten. Original patch: https://lore.kernel.org/linux-fsdevel/20200924055958.825515-1-lihao2018.f...@cn.fujitsu.com/ fs/dcache.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/fs/dcache.c b/fs/dcache.c index ea0485861d93..97e81a844a96 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -793,10 +793,17 @@ static inline bool fast_dput(struct dentry *dentry) * a reference to the dentry and change that, but * our work is done - we can leave the dentry * around with a zero refcount. +* +* Nevertheless, there are two cases that we should kill +* the dentry anyway. +* 1. free disconnected dentries as soon as their refcount +*reached zero. +* 2. free dentries if they should not be cached. */ smp_rmb(); d_flags = READ_ONCE(dentry->d_flags); - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | + DCACHE_DISCONNECTED | DCACHE_DONTCACHE; /* Nothing to do? Dropping the reference was all we needed? */ if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && !d_unhashed(dentry)) -- 2.28.0
Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
On 2020/8/31 8:34, Dave Chinner wrote: > On Fri, Aug 28, 2020 at 05:04:14PM +0800, Li, Hao wrote: > > On 2020/8/28 8:35, Dave Chinner wrote: > > > On Thu, Aug 27, 2020 at 05:58:07PM +0800, Li, Hao wrote: > > > > On 2020/8/27 14:37, Dave Chinner wrote: > > > > > On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: > > > > > > Currently, DCACHE_REFERENCED prevents the dentry with > > > > > > DCACHE_DONTCACHE > > > > > > set from being killed, so the corresponding inode can't be evicted. > > > > > > If > > > > > > the DAX policy of an inode is changed, we can't make policy changing > > > > > > take effects unless dropping caches manually. > > > > > > > > > > > > This patch fixes this problem and flushes the inode to disk to > > > > > > prepare > > > > > > for evicting it. > > > > > > > > > > > > Signed-off-by: Hao Li > > > > > > --- > > > > > > fs/dcache.c | 3 ++- > > > > > > fs/inode.c | 2 +- > > > > > > 2 files changed, 3 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/fs/dcache.c b/fs/dcache.c > > > > > > index ea0485861d93..486c7409dc82 100644 > > > > > > --- a/fs/dcache.c > > > > > > +++ b/fs/dcache.c > > > > > > @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry > > > > > > *dentry) > > > > > > */ > > > > > > smp_rmb(); > > > > > > d_flags = READ_ONCE(dentry->d_flags); > > > > > > - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | > > > > > > DCACHE_DISCONNECTED; > > > > > > + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | > > > > > > DCACHE_DISCONNECTED > > > > > > + | DCACHE_DONTCACHE; > > > > > Seems reasonable, but you need to update the comment above as to > > > > > how this flag fits into this code > > > > Yes. I will change it. Thanks. > > > > > > > > > > /* Nothing to do? Dropping the reference was all we needed? */ > > > > > > if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && > > > > > > !d_unhashed(dentry)) > > > > > > diff --git a/fs/inode.c b/fs/inode.c > > > > > > index 72c4c347afb7..5218a8aebd7f 100644 > > > > > > --- a/fs/inode.c > > > > > > +++ b/fs/inode.c > > > > > > @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) > > > > > > } > > > > > > > > > > > > state = inode->i_state; > > > > > > - if (!drop) { > > > > > > + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { > > > > > > WRITE_ONCE(inode->i_state, state | I_WILL_FREE); > > > > > > spin_unlock(&inode->i_lock); > > > > > What's this supposed to do? We'll only get here with drop set if the > > > > > filesystem is mounting or unmounting. > > > > The variable drop will also be set to True if I_DONTCACHE is set on > > > > inode->i_state. > > > > Although mounting/unmounting will set the drop variable, it won't set > > > > I_DONTCACHE if I understand correctly. As a result, > > > > drop && (inode->i_state & I_DONTCACHE) will filter out > > > > mounting/unmounting. > > > So what does this have to do with changing the way the dcache > > > treats DCACHE_DONTCACHE? > > After changing the way the dcache treats DCACHE_DONTCACHE, the dentry with > > DCACHE_DONTCACHE set will be killed unconditionally even if > > DCACHE_REFERENCED is set, and its inode will be processed by iput_final(). > > This inode has I_DONTCACHE flag, so op->drop_inode() will return true, > > and the inode will be evicted _directly_ even though it has dirty pages. > > Yes. But this doesn't rely on DCACHE_DONTCACHE behaviour. Inodes > that are looked up and cached by the filesystem without going > through dentry cache pathwalks can have I_DONTCACHE set and then get > evicted... > > i.e. we can get I_DONTCACHE set on inodes that do not have dentries > attached to them. That's the original functionality that got pulled > up from XFS - internal iteration of inodes, either via quotacheck or > bulkstat Oh, I see! > > > I think the kernel will run into error state because it doesn't writeback > > dirty pages of this inode before evicting. This is why I write back this > > inode here. > > > > According to my test, if I revert the second hunk of this patch, kernel > > will hang after running the following command: > > echo 123 > test.txt && xfs_io -c "chattr +x" test.txt > > > > The backtrace is: > > > > xfs_fs_destroy_inode+0x204/0x24d > > destroy_inode+0x3b/0x65 > > evict+0x150/0x1aa > > iput+0x117/0x19a > > dentry_unlink_inode+0x12b/0x12f > > __dentry_kill+0xee/0x211 > > dentry_kill+0x112/0x22f > > dput+0x79/0x86 > > __fput+0x200/0x23f > > fput+0x25/0x28 > > task_work_run+0x144/0x177 > > do_exit+0x4fb/0xb94 > > > > This backtrace is printed with an ASSERT(0) statement in > > xfs_fs_destroy_inode(). > > Sure, that's your messenger. That doesn't mean the bug can't be > triggered by other means. > > > > Also, if I_DONTCACHE is set, but the inode has also been unlinked or > > > is unhashed, should we be writing it back? i.e. it might have been > > > dropped for a different reason to I_DONTCACHE > > This is a
Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
On Fri, Aug 28, 2020 at 05:04:14PM +0800, Li, Hao wrote: > On 2020/8/28 8:35, Dave Chinner wrote: > > On Thu, Aug 27, 2020 at 05:58:07PM +0800, Li, Hao wrote: > >> On 2020/8/27 14:37, Dave Chinner wrote: > >>> On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: > Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE > set from being killed, so the corresponding inode can't be evicted. If > the DAX policy of an inode is changed, we can't make policy changing > take effects unless dropping caches manually. > > This patch fixes this problem and flushes the inode to disk to prepare > for evicting it. > > Signed-off-by: Hao Li > --- > fs/dcache.c | 3 ++- > fs/inode.c | 2 +- > 2 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/fs/dcache.c b/fs/dcache.c > index ea0485861d93..486c7409dc82 100644 > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) > */ > smp_rmb(); > d_flags = READ_ONCE(dentry->d_flags); > -d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | > DCACHE_DISCONNECTED; > +d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | > DCACHE_DISCONNECTED > +| DCACHE_DONTCACHE; > >>> Seems reasonable, but you need to update the comment above as to > >>> how this flag fits into this code > >> Yes. I will change it. Thanks. > >> > /* Nothing to do? Dropping the reference was all we needed? */ > if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && > !d_unhashed(dentry)) > diff --git a/fs/inode.c b/fs/inode.c > index 72c4c347afb7..5218a8aebd7f 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) > } > > state = inode->i_state; > -if (!drop) { > +if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { > WRITE_ONCE(inode->i_state, state | I_WILL_FREE); > spin_unlock(&inode->i_lock); > >>> What's this supposed to do? We'll only get here with drop set if the > >>> filesystem is mounting or unmounting. > >> The variable drop will also be set to True if I_DONTCACHE is set on > >> inode->i_state. > >> Although mounting/unmounting will set the drop variable, it won't set > >> I_DONTCACHE if I understand correctly. As a result, > >> drop && (inode->i_state & I_DONTCACHE) will filter out mounting/unmounting. > > So what does this have to do with changing the way the dcache > > treats DCACHE_DONTCACHE? > After changing the way the dcache treats DCACHE_DONTCACHE, the dentry with > DCACHE_DONTCACHE set will be killed unconditionally even if > DCACHE_REFERENCED is set, and its inode will be processed by iput_final(). > This inode has I_DONTCACHE flag, so op->drop_inode() will return true, > and the inode will be evicted _directly_ even though it has dirty pages. Yes. But this doesn't rely on DCACHE_DONTCACHE behaviour. Inodes that are looked up and cached by the filesystem without going through dentry cache pathwalks can have I_DONTCACHE set and then get evicted... i.e. we can get I_DONTCACHE set on inodes that do not have dentries attached to them. That's the original functionality that got pulled up from XFS - internal iteration of inodes, either via quotacheck or bulkstat > I think the kernel will run into error state because it doesn't writeback > dirty pages of this inode before evicting. This is why I write back this > inode here. > > According to my test, if I revert the second hunk of this patch, kernel > will hang after running the following command: > echo 123 > test.txt && xfs_io -c "chattr +x" test.txt > > The backtrace is: > > xfs_fs_destroy_inode+0x204/0x24d > destroy_inode+0x3b/0x65 > evict+0x150/0x1aa > iput+0x117/0x19a > dentry_unlink_inode+0x12b/0x12f > __dentry_kill+0xee/0x211 > dentry_kill+0x112/0x22f > dput+0x79/0x86 > __fput+0x200/0x23f > fput+0x25/0x28 > task_work_run+0x144/0x177 > do_exit+0x4fb/0xb94 > > This backtrace is printed with an ASSERT(0) statement in > xfs_fs_destroy_inode(). Sure, that's your messenger. That doesn't mean the bug can't be triggered by other means. > > Also, if I_DONTCACHE is set, but the inode has also been unlinked or > > is unhashed, should we be writing it back? i.e. it might have been > > dropped for a different reason to I_DONTCACHE > This is a problem I didn't realize. You are right. If the inode has been > unlinked/unhashed and the I_DONTCACHE is also set, the appended condition > will lead to unnecessary writeback. > > I think I need to handle the inode writeback more carefully. Maybe I can > revert the second hunk and remove I_DONTCACHE from generic_drop_inode() > and then change > > if (!drop && (sb->s_flags & SB_ACTIVE)) >
Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
On 2020/8/28 8:35, Dave Chinner wrote: > On Thu, Aug 27, 2020 at 05:58:07PM +0800, Li, Hao wrote: >> On 2020/8/27 14:37, Dave Chinner wrote: >>> On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE set from being killed, so the corresponding inode can't be evicted. If the DAX policy of an inode is changed, we can't make policy changing take effects unless dropping caches manually. This patch fixes this problem and flushes the inode to disk to prepare for evicting it. Signed-off-by: Hao Li --- fs/dcache.c | 3 ++- fs/inode.c | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index ea0485861d93..486c7409dc82 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) */ smp_rmb(); d_flags = READ_ONCE(dentry->d_flags); - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED + | DCACHE_DONTCACHE; >>> Seems reasonable, but you need to update the comment above as to >>> how this flag fits into this code >> Yes. I will change it. Thanks. >> /* Nothing to do? Dropping the reference was all we needed? */ if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && !d_unhashed(dentry)) diff --git a/fs/inode.c b/fs/inode.c index 72c4c347afb7..5218a8aebd7f 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) } state = inode->i_state; - if (!drop) { + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { WRITE_ONCE(inode->i_state, state | I_WILL_FREE); spin_unlock(&inode->i_lock); >>> What's this supposed to do? We'll only get here with drop set if the >>> filesystem is mounting or unmounting. >> The variable drop will also be set to True if I_DONTCACHE is set on >> inode->i_state. >> Although mounting/unmounting will set the drop variable, it won't set >> I_DONTCACHE if I understand correctly. As a result, >> drop && (inode->i_state & I_DONTCACHE) will filter out mounting/unmounting. > So what does this have to do with changing the way the dcache > treats DCACHE_DONTCACHE? After changing the way the dcache treats DCACHE_DONTCACHE, the dentry with DCACHE_DONTCACHE set will be killed unconditionally even if DCACHE_REFERENCED is set, and its inode will be processed by iput_final(). This inode has I_DONTCACHE flag, so op->drop_inode() will return true, and the inode will be evicted _directly_ even though it has dirty pages. I think the kernel will run into error state because it doesn't writeback dirty pages of this inode before evicting. This is why I write back this inode here. According to my test, if I revert the second hunk of this patch, kernel will hang after running the following command: echo 123 > test.txt && xfs_io -c "chattr +x" test.txt The backtrace is: xfs_fs_destroy_inode+0x204/0x24d destroy_inode+0x3b/0x65 evict+0x150/0x1aa iput+0x117/0x19a dentry_unlink_inode+0x12b/0x12f __dentry_kill+0xee/0x211 dentry_kill+0x112/0x22f dput+0x79/0x86 __fput+0x200/0x23f fput+0x25/0x28 task_work_run+0x144/0x177 do_exit+0x4fb/0xb94 This backtrace is printed with an ASSERT(0) statement in xfs_fs_destroy_inode(). > > Also, if I_DONTCACHE is set, but the inode has also been unlinked or > is unhashed, should we be writing it back? i.e. it might have been > dropped for a different reason to I_DONTCACHE This is a problem I didn't realize. You are right. If the inode has been unlinked/unhashed and the I_DONTCACHE is also set, the appended condition will lead to unnecessary writeback. I think I need to handle the inode writeback more carefully. Maybe I can revert the second hunk and remove I_DONTCACHE from generic_drop_inode() and then change if (!drop && (sb->s_flags & SB_ACTIVE)) to if (!drop && !(inode->i_state & I_DONTCACHE) && (sb->s_flags & SB_ACTIVE)) This approach would be more suitable. > > IOWs, if there is a problem with how I_DONTCACHE is being handled, > then the problem must already exists regardless of the > DCACHE_DONTCACHE behaviour, right? So shouldn't this be a separate > bug fix with it's own explanation of the problem and the fix? I do think the way we treat I_DONTCACHE in current kernel is not suitable. generic_drop_inode() is used to determine if the inode should be evicted without writeback, so I_DONTCACHE shouldn't be handled in this function. The suitable approach is illustrated above. I can submit a patch to fix this problem if this new approach is acceptable. Thanks, Hao Li > Cheers, > > Dave.
Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
On Thu, Aug 27, 2020 at 05:58:07PM +0800, Li, Hao wrote: > On 2020/8/27 14:37, Dave Chinner wrote: > > On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: > >> Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE > >> set from being killed, so the corresponding inode can't be evicted. If > >> the DAX policy of an inode is changed, we can't make policy changing > >> take effects unless dropping caches manually. > >> > >> This patch fixes this problem and flushes the inode to disk to prepare > >> for evicting it. > >> > >> Signed-off-by: Hao Li > >> --- > >> fs/dcache.c | 3 ++- > >> fs/inode.c | 2 +- > >> 2 files changed, 3 insertions(+), 2 deletions(-) > >> > >> diff --git a/fs/dcache.c b/fs/dcache.c > >> index ea0485861d93..486c7409dc82 100644 > >> --- a/fs/dcache.c > >> +++ b/fs/dcache.c > >> @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) > >> */ > >>smp_rmb(); > >>d_flags = READ_ONCE(dentry->d_flags); > >> - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; > >> + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED > >> + | DCACHE_DONTCACHE; > > Seems reasonable, but you need to update the comment above as to > > how this flag fits into this code > > Yes. I will change it. Thanks. > > > > >>/* Nothing to do? Dropping the reference was all we needed? */ > >>if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && > >> !d_unhashed(dentry)) > >> diff --git a/fs/inode.c b/fs/inode.c > >> index 72c4c347afb7..5218a8aebd7f 100644 > >> --- a/fs/inode.c > >> +++ b/fs/inode.c > >> @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) > >>} > >> > >>state = inode->i_state; > >> - if (!drop) { > >> + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { > >>WRITE_ONCE(inode->i_state, state | I_WILL_FREE); > >>spin_unlock(&inode->i_lock); > > What's this supposed to do? We'll only get here with drop set if the > > filesystem is mounting or unmounting. > > The variable drop will also be set to True if I_DONTCACHE is set on > inode->i_state. > Although mounting/unmounting will set the drop variable, it won't set > I_DONTCACHE if I understand correctly. As a result, > drop && (inode->i_state & I_DONTCACHE) will filter out mounting/unmounting. So what does this have to do with changing the way the dcache treats DCACHE_DONTCACHE? Also, if I_DONTCACHE is set, but the inode has also been unlinked or is unhashed, should we be writing it back? i.e. it might have been dropped for a different reason to I_DONTCACHE IOWs, if there is a problem with how I_DONTCACHE is being handled, then the problem must already exists regardless of the DCACHE_DONTCACHE behaviour, right? So shouldn't this be a separate bug fix with it's own explanation of the problem and the fix? Cheers, Dave. -- Dave Chinner da...@fromorbit.com
Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
On 2020/8/27 14:37, Dave Chinner wrote: > On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: >> Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE >> set from being killed, so the corresponding inode can't be evicted. If >> the DAX policy of an inode is changed, we can't make policy changing >> take effects unless dropping caches manually. >> >> This patch fixes this problem and flushes the inode to disk to prepare >> for evicting it. >> >> Signed-off-by: Hao Li >> --- >> fs/dcache.c | 3 ++- >> fs/inode.c | 2 +- >> 2 files changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/fs/dcache.c b/fs/dcache.c >> index ea0485861d93..486c7409dc82 100644 >> --- a/fs/dcache.c >> +++ b/fs/dcache.c >> @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) >> */ >> smp_rmb(); >> d_flags = READ_ONCE(dentry->d_flags); >> -d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; >> +d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED >> +| DCACHE_DONTCACHE; > Seems reasonable, but you need to update the comment above as to > how this flag fits into this code Yes. I will change it. Thanks. > >> /* Nothing to do? Dropping the reference was all we needed? */ >> if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && >> !d_unhashed(dentry)) >> diff --git a/fs/inode.c b/fs/inode.c >> index 72c4c347afb7..5218a8aebd7f 100644 >> --- a/fs/inode.c >> +++ b/fs/inode.c >> @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) >> } >> >> state = inode->i_state; >> -if (!drop) { >> +if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { >> WRITE_ONCE(inode->i_state, state | I_WILL_FREE); >> spin_unlock(&inode->i_lock); > What's this supposed to do? We'll only get here with drop set if the > filesystem is mounting or unmounting. The variable drop will also be set to True if I_DONTCACHE is set on inode->i_state. Although mounting/unmounting will set the drop variable, it won't set I_DONTCACHE if I understand correctly. As a result, drop && (inode->i_state & I_DONTCACHE) will filter out mounting/unmounting. > In either case, why does > having I_DONTCACHE set require the inode to be written back here > before it is evicted from the cache? Mounting/unmounting won't execute the code snippet which is in that if statement, as I have explained above. However, If I_DONTCACHE is set, we have to execute this snippet to write back inode. I_DONTCACHE is set in d_mark_dontcache() which will be called in two situations: 1. DAX policy is changed. 2. The inode is read through bulkstat in XFS. See commit 5132ba8f2b77 ("xfs: don't cache inodes read through bulkstat") for more details. For the first case, we have to write back the inode together with its dirty pages before evicting. For the second case, I think it's also necessary to write back inode before evicting. Thanks, Hao Li > > Cheers, > > Dave.
Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: > Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE > set from being killed, so the corresponding inode can't be evicted. If > the DAX policy of an inode is changed, we can't make policy changing > take effects unless dropping caches manually. > > This patch fixes this problem and flushes the inode to disk to prepare > for evicting it. > > Signed-off-by: Hao Li > --- > fs/dcache.c | 3 ++- > fs/inode.c | 2 +- > 2 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/fs/dcache.c b/fs/dcache.c > index ea0485861d93..486c7409dc82 100644 > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) >*/ > smp_rmb(); > d_flags = READ_ONCE(dentry->d_flags); > - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; > + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED > + | DCACHE_DONTCACHE; Seems reasonable, but you need to update the comment above as to how this flag fits into this code > /* Nothing to do? Dropping the reference was all we needed? */ > if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && > !d_unhashed(dentry)) > diff --git a/fs/inode.c b/fs/inode.c > index 72c4c347afb7..5218a8aebd7f 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) > } > > state = inode->i_state; > - if (!drop) { > + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { > WRITE_ONCE(inode->i_state, state | I_WILL_FREE); > spin_unlock(&inode->i_lock); What's this supposed to do? We'll only get here with drop set if the filesystem is mounting or unmounting. In either case, why does having I_DONTCACHE set require the inode to be written back here before it is evicted from the cache? Cheers, Dave. -- Dave Chinner da...@fromorbit.com
Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
Hello, CC to Dave, darrick.wong and xfs/nvdimm list to get more discussions. Thanks. Hao Li On 2020/8/24 14:17, Li, Hao wrote: > On 2020/8/23 14:54, Ira Weiny wrote: >> On Fri, Aug 21, 2020 at 10:40:41AM -0700, 'Ira Weiny' wrote: >>> On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE set from being killed, so the corresponding inode can't be evicted. If the DAX policy of an inode is changed, we can't make policy changing take effects unless dropping caches manually. This patch fixes this problem and flushes the inode to disk to prepare for evicting it. >>> This looks intriguing and I really hope this helps but I don't think this >>> will >>> guarantee that the state changes immediately will it? >>> >>> Do you have a test case which fails before and passes after? Perhaps one of >>> the new xfstests submitted by Xiao? >> Ok I just went back and read your comment before.[1] Sorry for being a bit >> slow on the correlation between this patch and that email. (BTW, I think it >> would have been good to put those examples in the commit message and or >> reference that example.) > Thanks for your advice. I will add those examples in v2 after further > discussion of this patch. > >> I'm assuming that with this patch example 2 from [1] >> works without a drop_cache _if_ no other task has the file open? > Yes. If no other task is opening the file, the inode and page cache of this > file will be dropped during xfs_io exiting process. There is no need to run > echo 2 > drop_caches. > >> Anyway, with that explanation I think you are correct that this improves the >> situation _if_ the only references on the file is controlled by the user and >> they have indeed closed all of them. >> >> The code for DCACHE_DONTCACHE as I attempted to write it was that it should >> have prevented further caching of the inode such that the inode would evict >> sooner. But it seems you have found a bug/optimization? > Yes. This patch is an optimization and can also be treated as a bugfix. > On the other side, even though this patch can make DCACHE_DONTCACHE more > reasonable, I am not quite sure if my approach is safe and doesn't impact > the fs performance. I hope the community can give me more advice. > >> In the end, however, if another user (such as a backup running by the admin) >> has a reference the DAX change may still be delayed. > Yes. In this situation, neither drop_caches approach nor this patch can make > the DAX change take effects soon. > Moreover, things are different if the backup task exits, this patch > will make sure the inode and page cache of the file can be dropped > _automatically_ without manual intervention. By contrast, the original > approach needs a manual cache dropping. > >> So I'm thinking the >> documentation should remain largely as is? But perhaps I am wrong. Does >> this >> completely remove the need for a drop_caches or only in the example you gave? > I think the contents related to drop_caches in documentation can be removed > if this patch's approach is acceptable. > >> Since I'm not a FS expert I'm still not sure. > Frankly, I'm not an expert either, so I hope this patch can be discussed > further in case it has side effects. > > Thanks, > Hao Li > >> Regardless, thanks for the fixup! :-D >> Ira >> >> [1] >> https://lore.kernel.org/linux-xfs/ba98b77e-a806-048a-a0dc-ca585677d...@cn.fujitsu.com/ >> >>> Ira >>> Signed-off-by: Hao Li --- fs/dcache.c | 3 ++- fs/inode.c | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index ea0485861d93..486c7409dc82 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) */ smp_rmb(); d_flags = READ_ONCE(dentry->d_flags); - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED + | DCACHE_DONTCACHE; /* Nothing to do? Dropping the reference was all we needed? */ if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && !d_unhashed(dentry)) diff --git a/fs/inode.c b/fs/inode.c index 72c4c347afb7..5218a8aebd7f 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) } state = inode->i_state; - if (!drop) { + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { WRITE_ONCE(inode->i_state, state | I_WILL_FREE); spin_unlock(&inode->i_lock); -- 2.28.0 > >
Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
On 2020/8/23 14:54, Ira Weiny wrote: > On Fri, Aug 21, 2020 at 10:40:41AM -0700, 'Ira Weiny' wrote: >> On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: >>> Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE >>> set from being killed, so the corresponding inode can't be evicted. If >>> the DAX policy of an inode is changed, we can't make policy changing >>> take effects unless dropping caches manually. >>> >>> This patch fixes this problem and flushes the inode to disk to prepare >>> for evicting it. >> This looks intriguing and I really hope this helps but I don't think this >> will >> guarantee that the state changes immediately will it? >> >> Do you have a test case which fails before and passes after? Perhaps one of >> the new xfstests submitted by Xiao? > Ok I just went back and read your comment before.[1] Sorry for being a bit > slow on the correlation between this patch and that email. (BTW, I think it > would have been good to put those examples in the commit message and or > reference that example.) Thanks for your advice. I will add those examples in v2 after further discussion of this patch. > I'm assuming that with this patch example 2 from [1] > works without a drop_cache _if_ no other task has the file open? Yes. If no other task is opening the file, the inode and page cache of this file will be dropped during xfs_io exiting process. There is no need to run echo 2 > drop_caches. > Anyway, with that explanation I think you are correct that this improves the > situation _if_ the only references on the file is controlled by the user and > they have indeed closed all of them. > > The code for DCACHE_DONTCACHE as I attempted to write it was that it should > have prevented further caching of the inode such that the inode would evict > sooner. But it seems you have found a bug/optimization? Yes. This patch is an optimization and can also be treated as a bugfix. On the other side, even though this patch can make DCACHE_DONTCACHE more reasonable, I am not quite sure if my approach is safe and doesn't impact the fs performance. I hope the community can give me more advice. > In the end, however, if another user (such as a backup running by the admin) > has a reference the DAX change may still be delayed. Yes. In this situation, neither drop_caches approach nor this patch can make the DAX change take effects soon. Moreover, things are different if the backup task exits, this patch will make sure the inode and page cache of the file can be dropped _automatically_ without manual intervention. By contrast, the original approach needs a manual cache dropping. > So I'm thinking the > documentation should remain largely as is? But perhaps I am wrong. Does this > completely remove the need for a drop_caches or only in the example you gave? I think the contents related to drop_caches in documentation can be removed if this patch's approach is acceptable. > Since I'm not a FS expert I'm still not sure. Frankly, I'm not an expert either, so I hope this patch can be discussed further in case it has side effects. Thanks, Hao Li > > Regardless, thanks for the fixup! :-D > Ira > > [1] > https://lore.kernel.org/linux-xfs/ba98b77e-a806-048a-a0dc-ca585677d...@cn.fujitsu.com/ > >> Ira >> >>> Signed-off-by: Hao Li >>> --- >>> fs/dcache.c | 3 ++- >>> fs/inode.c | 2 +- >>> 2 files changed, 3 insertions(+), 2 deletions(-) >>> >>> diff --git a/fs/dcache.c b/fs/dcache.c >>> index ea0485861d93..486c7409dc82 100644 >>> --- a/fs/dcache.c >>> +++ b/fs/dcache.c >>> @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) >>> */ >>> smp_rmb(); >>> d_flags = READ_ONCE(dentry->d_flags); >>> - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; >>> + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED >>> + | DCACHE_DONTCACHE; >>> >>> /* Nothing to do? Dropping the reference was all we needed? */ >>> if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && >>> !d_unhashed(dentry)) >>> diff --git a/fs/inode.c b/fs/inode.c >>> index 72c4c347afb7..5218a8aebd7f 100644 >>> --- a/fs/inode.c >>> +++ b/fs/inode.c >>> @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) >>> } >>> >>> state = inode->i_state; >>> - if (!drop) { >>> + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { >>> WRITE_ONCE(inode->i_state, state | I_WILL_FREE); >>> spin_unlock(&inode->i_lock); >>> >>> -- >>> 2.28.0 >>> >>> >>> >
Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
On Fri, Aug 21, 2020 at 10:40:41AM -0700, 'Ira Weiny' wrote: > On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: > > Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE > > set from being killed, so the corresponding inode can't be evicted. If > > the DAX policy of an inode is changed, we can't make policy changing > > take effects unless dropping caches manually. > > > > This patch fixes this problem and flushes the inode to disk to prepare > > for evicting it. > > This looks intriguing and I really hope this helps but I don't think this will > guarantee that the state changes immediately will it? > > Do you have a test case which fails before and passes after? Perhaps one of > the new xfstests submitted by Xiao? Ok I just went back and read your comment before.[1] Sorry for being a bit slow on the correlation between this patch and that email. (BTW, I think it would have been good to put those examples in the commit message and or reference that example.) I'm assuming that with this patch example 2 from [1] works without a drop_cache _if_ no other task has the file open? Anyway, with that explanation I think you are correct that this improves the situation _if_ the only references on the file is controlled by the user and they have indeed closed all of them. The code for DCACHE_DONTCACHE as I attempted to write it was that it should have prevented further caching of the inode such that the inode would evict sooner. But it seems you have found a bug/optimization? In the end, however, if another user (such as a backup running by the admin) has a reference the DAX change may still be delayed. So I'm thinking the documentation should remain largely as is? But perhaps I am wrong. Does this completely remove the need for a drop_caches or only in the example you gave? Since I'm not a FS expert I'm still not sure. Regardless, thanks for the fixup! :-D Ira [1] https://lore.kernel.org/linux-xfs/ba98b77e-a806-048a-a0dc-ca585677d...@cn.fujitsu.com/ > > Ira > > > > > Signed-off-by: Hao Li > > --- > > fs/dcache.c | 3 ++- > > fs/inode.c | 2 +- > > 2 files changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/fs/dcache.c b/fs/dcache.c > > index ea0485861d93..486c7409dc82 100644 > > --- a/fs/dcache.c > > +++ b/fs/dcache.c > > @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) > > */ > > smp_rmb(); > > d_flags = READ_ONCE(dentry->d_flags); > > - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; > > + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED > > + | DCACHE_DONTCACHE; > > > > /* Nothing to do? Dropping the reference was all we needed? */ > > if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && > > !d_unhashed(dentry)) > > diff --git a/fs/inode.c b/fs/inode.c > > index 72c4c347afb7..5218a8aebd7f 100644 > > --- a/fs/inode.c > > +++ b/fs/inode.c > > @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) > > } > > > > state = inode->i_state; > > - if (!drop) { > > + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { > > WRITE_ONCE(inode->i_state, state | I_WILL_FREE); > > spin_unlock(&inode->i_lock); > > > > -- > > 2.28.0 > > > > > >
Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: > Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE > set from being killed, so the corresponding inode can't be evicted. If > the DAX policy of an inode is changed, we can't make policy changing > take effects unless dropping caches manually. > > This patch fixes this problem and flushes the inode to disk to prepare > for evicting it. This looks intriguing and I really hope this helps but I don't think this will guarantee that the state changes immediately will it? Do you have a test case which fails before and passes after? Perhaps one of the new xfstests submitted by Xiao? Ira > > Signed-off-by: Hao Li > --- > fs/dcache.c | 3 ++- > fs/inode.c | 2 +- > 2 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/fs/dcache.c b/fs/dcache.c > index ea0485861d93..486c7409dc82 100644 > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) >*/ > smp_rmb(); > d_flags = READ_ONCE(dentry->d_flags); > - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; > + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED > + | DCACHE_DONTCACHE; > > /* Nothing to do? Dropping the reference was all we needed? */ > if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && > !d_unhashed(dentry)) > diff --git a/fs/inode.c b/fs/inode.c > index 72c4c347afb7..5218a8aebd7f 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) > } > > state = inode->i_state; > - if (!drop) { > + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { > WRITE_ONCE(inode->i_state, state | I_WILL_FREE); > spin_unlock(&inode->i_lock); > > -- > 2.28.0 > > >
[PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE set from being killed, so the corresponding inode can't be evicted. If the DAX policy of an inode is changed, we can't make policy changing take effects unless dropping caches manually. This patch fixes this problem and flushes the inode to disk to prepare for evicting it. Signed-off-by: Hao Li --- fs/dcache.c | 3 ++- fs/inode.c | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index ea0485861d93..486c7409dc82 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) */ smp_rmb(); d_flags = READ_ONCE(dentry->d_flags); - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED + | DCACHE_DONTCACHE; /* Nothing to do? Dropping the reference was all we needed? */ if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && !d_unhashed(dentry)) diff --git a/fs/inode.c b/fs/inode.c index 72c4c347afb7..5218a8aebd7f 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) } state = inode->i_state; - if (!drop) { + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { WRITE_ONCE(inode->i_state, state | I_WILL_FREE); spin_unlock(&inode->i_lock); -- 2.28.0