Hi All,

There is a good chance that the inode on which the unref came has already been zero-refed and added to the purge list. This can happen when the inode table is being destroyed (glfs_fini, for example, destroys the inode table).
Consider a directory 'a' which has a file 'b'. As part of inode table destruction, zero-refing of inodes does not happen from the leaves up to the root. It happens in the order in which the inodes are present in the list. In this example, the dentry of 'b' has its parent set to the inode of 'a'. So if 'a' gets zero-refed first (as part of inode table cleanup) and 'b' is zero-refed after it, then dentry_unset is called on the dentry of 'b', which goes on to call inode_unref on b's parent, which is 'a'. In this situation GF_ASSERT fires, because the refcount of 'a' has already been set to zero.

Below is a snippet from the core file generated by such an ASSERT in one of the regression test runs.

14:39:49 No symbol table info available.
14:39:49 #1  0x00007f2a539fc8f8 in abort () from /lib64/libc.so.6
14:39:49 No symbol table info available.
14:39:49 #2  0x00007f2a539f4026 in __assert_fail_base () from /lib64/libc.so.6
14:39:49 No symbol table info available.
14:39:49 #3  0x00007f2a539f40d2 in __assert_fail () from /lib64/libc.so.6
14:39:49 No symbol table info available.
14:39:49 #4  0x00007f2a553e3208 in __inode_unref (inode=0x7f2a3c05edf8, clear=false) at /home/jenkins/root/workspace/centos7-regression/libglusterfs/src/inode.c:483
14:39:49         index = 0
14:39:49         this = 0x7f2a3c03e840
14:39:49         nlookup = 0
14:39:49         __PRETTY_FUNCTION__ = "__inode_unref"
14:39:49 #5  0x00007f2a553e2745 in __dentry_unset (dentry=0x7f2a3c064e48) at /home/jenkins/root/workspace/centos7-regression/libglusterfs/src/inode.c:212
14:39:49 No locals.
14:39:49 #6  0x00007f2a553e308a in __inode_retire (inode=0x7f2a3c05ebc8) at /home/jenkins/root/workspace/centos7-regression/libglusterfs/src/inode.c:442
14:39:49         dentry = 0x7f2a3c064e48
14:39:49         t = 0x7f2a3c064398
14:39:49 #7  0x00007f2a553e392f in __inode_ref_reduce_by_n (inode=0x7f2a3c05ebc8, nref=0) at /home/jenkins/root/workspace/centos7-regression/libglusterfs/src/inode.c:708
14:39:49         nlookup = 0
14:39:49         __PRETTY_FUNCTION__ = "__inode_ref_reduce_by_n"
14:39:49 #8  0x00007f2a553e61d5 in inode_table_destroy (inode_table=0x7f2a28007f90) at /home/jenkins/root/workspace/centos7-regression/libglusterfs/src/inode.c:1867
14:39:49         trav = 0x7f2a3c05ebc8
14:39:49         __FUNCTION__ = "inode_table_destroy"
14:39:49 #9  0x00007f2a553e600c in inode_table_destroy_all (ctx=0x7f2a3c001170) at /home/jenkins/root/workspace/centos7-regression/libglusterfs/src/inode.c:1791
14:39:49         trav_graph = 0x7f2a240041a0
14:39:49         tmp = 0x7f2a3c0013c8
14:39:49         tree = 0x7f2a24022af0
14:39:49         inode_table = 0x7f2a28007f90
14:39:49 #10 0x00007f2a46a83390 in pub_glfs_fini (fs=0x7f2a3c000ff0) at /home/jenkins/root/workspace/centos7-regression/api/src/glfs.c:1346
14:39:49         ret = 0
14:39:49         countdown = 98
14:39:49         subvol = 0x7f2a24022af0
14:39:49         ctx = 0x7f2a3c001170
14:39:49         graph = 0x7f2a240041a0
14:39:49         call_pool = 0x7f2a3c000df0
14:39:49         fs_init = 1
14:39:49         err = 0
14:39:49         old_THIS = 0x7f2a380084c0
14:39:49         __FUNCTION__ = "pub_glfs_fini"

IIUC the solution would be to add a flag to the inode table that tells whether cleanup of the inode table has started. GF_ASSERT (inode->ref) should then not be called if the inode table to which the inode being unrefed belongs is already being cleaned up.

A patch [1] has been submitted for review with the change mentioned above.

[1] https://review.gluster.org/#/c/glusterfs/+/22650/

Regards,
Raghavendra
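
P.S. To make the ordering problem and the proposed flag concrete, here is a minimal standalone sketch in C. It is only a toy model, not the code in patch [1]: every name in it (toy_table, toy_inode, cleanup_started, toy_inode_unref and so on) is hypothetical, and plain assert() stands in for GF_ASSERT. The parent pointer plays the role of the dentry's parent, and the single ref on 'a' is the one held through b's dentry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the real structures; all names are hypothetical. */
struct toy_table {
    bool cleanup_started; /* set once table destruction has begun */
};

struct toy_inode {
    struct toy_table *table;
    struct toy_inode *parent; /* stands in for dentry->parent */
    unsigned int ref;
};

/* Drop one reference. Outside of cleanup a zero refcount here is a bug, so
 * assert. During cleanup the inode may already have been forced to zero. */
void toy_inode_unref(struct toy_inode *inode)
{
    if (!inode->table->cleanup_started)
        assert(inode->ref > 0);
    if (inode->ref > 0)
        inode->ref--;
}

/* Force the refcount to zero, then drop the ref held on the parent. */
void toy_inode_retire(struct toy_inode *inode)
{
    inode->ref = 0;
    if (inode->parent != NULL) {
        toy_inode_unref(inode->parent);
        inode->parent = NULL;
    }
}

/* Retire inodes in list order, which need not be leaf-to-root. */
void toy_table_destroy(struct toy_table *table, struct toy_inode **list,
                       size_t count)
{
    table->cleanup_started = true;
    for (size_t i = 0; i < count; i++)
        toy_inode_retire(list[i]);
}

/* Directory 'a' with file 'b', where 'a' precedes 'b' in the list.
 * Returns the sum of the remaining refcounts. */
unsigned int toy_demo(void)
{
    struct toy_table table = { .cleanup_started = false };
    struct toy_inode a = { .table = &table, .parent = NULL, .ref = 1 };
    struct toy_inode b = { .table = &table, .parent = &a, .ref = 1 };
    struct toy_inode *list[] = { &a, &b };

    toy_table_destroy(&table, list, 2);
    return a.ref + b.ref;
}
```

Calling toy_demo() retires 'a' first and 'b' second. When 'b' is retired it unrefs 'a', whose count is already zero, and the assert is skipped because cleanup_started is set. If toy_table_destroy did not set the flag, the same call would trip the assert, which is the situation in the backtrace above.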
_______________________________________________ Gluster-devel mailing list Gluster-devel@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-devel