Hi Ross,

On Tue, 2 Apr 2019 at 00:59, Andreas Gruenbacher <agrue...@redhat.com> wrote:
> thanks for the great analysis.
>
> On Tue, 26 Mar 2019 at 20:14, Bob Peterson <rpete...@redhat.com> wrote:
> > ----- Original Message -----
> > > 6. gfs2_log_flush() continues and calls revoke_lo_after_commit() and
> > > uses the freed glock (stack trace above).
> > >
> > > Should evict_inode call gfs2_ail_flush() or something so that the revoke
> > > is written before it drops its reference to the glock?
> > >
> > > Or is there something else that is meant to keep the glock around if the
> > > inode is evicted while there is a linked gfs2_bufdata sitting on some
> > > AIL list?
> > >
> > > Let me know if any more specific information is needed since I now have
> > > a test setup that can usually reproduce this within 10 minutes.
> >
> > Very interesting.
> >
> > It's not unusual for glocks to outlive their inodes, but I'm not sure
> > what the right answer is in this case. Either the revoke should
> > take an additional reference to the glock, and not let go until the
> > revoke is written by some log flush, or else the evict needs to do the
> > log flush to make sure the revoke is committed. But we've had issues with
> > evict in the past, so we need to be careful about how we fix it.
> > Andreas and I will look into the best way to fix it. Thanks again for your 
> > help.
>
> Usually, glocks are attached to inodes or other objects. When the
> state of the glock changes, the go_lock, go_sync, and go_inval
> operations determine what happens to the attached object. In
> gfs2_evict_inode, we disassociate the inode from the glock, do the
> necessary cleanup work locally, and put the glock asynchronously when
> needed, though. We do that in PF_MEMALLOC context to avoid
> deadlocking; see commit 71c1b2136835 ("gfs2: gfs2_evict_inode: Put
> glocks asynchronously"). Even if we didn't do that, there could still
> be other glock holders like the glock state machine. There is no
> guarantee that the glock will go away anytime soon, or actually at
> all: it could get reused by another instance of the same inode. So
> waiting for the glock to go away first in gfs2_evict_inode definitely
> is not an option.
>
> This basically answers your above question: gfs2_evict_inode should
> definitely clean up before putting the glock. I'm just not sure how to
> best achieve that, yet.

after more discussions, Bob convinced me that it makes perfect sense
to not write out outstanding revokes in gfs2_evict_inode. We really
want to delay writing revokes as long as possible so that as many of
the revokes as possible will go away (either because the blocks are
re-added to the journal and the revokes are "unrevoked", or because
the journal wraps around). So the right fix here is indeed taking
additional glock references.

I've come up with a patch that takes one additional reference for each
glock instead of one reference for each bufdata object; that should
scale better. I'll post that and a follow-up patch separately.

Could you please verify?

Thanks,
Andreas

Reply via email to