On Tue, May 17, 2016 at 2:52 PM, Raghavendra Gowdappa <rgowd...@redhat.com> wrote:
> > > ----- Original Message ----- > > From: "Raghavendra Gowdappa" <rgowd...@redhat.com> > > To: "Nithya Balachandran" <nbala...@redhat.com> > > Cc: gluster-devel@gluster.org > > Sent: Tuesday, May 17, 2016 2:40:32 PM > > Subject: Re: [Gluster-devel] 'mv' of ./tests/bugs/posix/bug-1113960.t > causes 100% CPU > > > > > > > > ----- Original Message ----- > > > From: "Nithya Balachandran" <nbala...@redhat.com> > > > To: "Niels de Vos" <nde...@redhat.com> > > > Cc: gluster-devel@gluster.org > > > Sent: Tuesday, May 17, 2016 2:25:20 PM > > > Subject: Re: [Gluster-devel] 'mv' of ./tests/bugs/posix/bug-1113960.t > > > causes 100% CPU > > > > > > Hi, > > > > > > I have looked into this on another system earlier and this is what I > have > > > so > > > far: > > > > > > 1. The test involves moving and renaming directories and files within > those > > > dirs. > > > 2. A rename dir operation failed on one subvol. So we have 3 subvols > where > > > the directory has the new name and one where it has the old name. > > > 3. Some operation - perhaps a revalidate - has added a dentry with the > old > > > name to the inode . So there are now 2 dentries for the same inode for > a > > > directory. > > > > I think the stale dentry is caused by a racing lookup and rename. > > Please refer to bz 1335373 [1] for more details. > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1335373 I have posted a "hacky" patch at [2] [2] http://review.gluster.org/#/c/14373/ > > > > Apart from > > that, I don't know of any other reasons for stale dentries in inode > table. > > "Dentry fop serializer" (DFS) [1], aims to solve these kind of races. > > > > [1] http://review.gluster.org/14286 > > > > > 4. Renaming a file inside that directory calls an inode_link which end > up > > > traversing the dentry list for each entry all the way up to the root > in the > > > __foreach_ancestor_dentry function. If there are multiple deep > directories > > > with the same problem in the path, this takes a very long time (hours) > > > because of the number of times the function is called. > > > > > > I do not know why the rename dir failed. However, is the following a > > > correct/acceptable fix for the traversal issue? > > > > > > 1. A directory should never have more than one dentry > > > 2. __foreach_ancestor_dentry uses the dentry list of the parent inode. > > > Parent > > > inode will always be a directory. > > > 3. Can we just take the first dentry in the list for the cycle check > as we > > > are really only comparing inodes? In the scenarios I have tried, all > the > > > dentries in the dentry_list always have the same inode. This would > prevent > > > the hang. If there is more than one dentry for a directory, flag an > error > > > somehow. > > > 4. Is there any chance that a dentry in the list can have a different > > > inode? > > > If yes, that is a different problem and 3 does not apply. > > > > > > It would work like this: > > > > > > > > > last_parent_inode = NULL; > > > > > > list_for_each_entry (each, &parent->dentry_list, inode_list) { > > > > > > //Since we are only using the each->parent to check, stop if we have > > > already > > > checked it > > > > > > if(each->parent != last_parent_inode) { > > > ret = __foreach_ancestor_dentry (each, per_dentry_fn, data); > > > if (ret) > > > goto out; > > > } > > > last_parent_inode = each->parent; > > > } > > > > > > > > > This would prevent the hang but leads to other issues which would > exist in > > > the current code anyway - mainly, which dentry is the correct one and > how > > > do > > > we recover? > > > > > > > > > Regards, > > > Nithya > > > > > > On Wed, May 11, 2016 at 7:09 PM, Niels de Vos < nde...@redhat.com > > wrote: > > > > > > > > > Could someone look into this busy loop? > > > https://paste.fedoraproject.org/365207/29732171/raw/ > > > > > > This was happening in a regression-test burn-in run, occupying a > Jenkins > > > slave for 2+ days: > > > https://build.gluster.org/job/regression-test-burn-in/936/ > > > (run with commit f0ade919006b2581ae192f997a8ae5bacc2892af from master) > > > > > > A coredump of the mount process is available from here: > > > http://slave20.cloud.gluster.org/archived_builds/crash.tar.gz > > > > > > Thanks misc for reporting and gathering the debugging info. > > > Niels > > > > > > _______________________________________________ > > > Gluster-devel mailing list > > > Gluster-devel@gluster.org > > > http://www.gluster.org/mailman/listinfo/gluster-devel > > > > > > > > > _______________________________________________ > > > Gluster-devel mailing list > > > Gluster-devel@gluster.org > > > http://www.gluster.org/mailman/listinfo/gluster-devel > > _______________________________________________ > > Gluster-devel mailing list > > Gluster-devel@gluster.org > > http://www.gluster.org/mailman/listinfo/gluster-devel > > >
_______________________________________________ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel