On Tue, 9 Apr 2019 at 17:37, Ross Lagerwall <ross.lagerw...@citrix.com> wrote: > On 4/5/19 6:50 PM, Andreas Gruenbacher wrote: > > Hi Ross, > > > > On Tue, 2 Apr 2019 at 00:59, Andreas Gruenbacher <agrue...@redhat.com> > > wrote: > >> thanks for the great analysis. > >> > >> On Tue, 26 Mar 2019 at 20:14, Bob Peterson <rpete...@redhat.com> wrote: > >>> ----- Original Message ----- > >>>> 6. gfs2_log_flush() continues and calls revoke_lo_after_commit() and > >>>> uses the freed glock (stack trace above). > >>>> > >>>> Should evict_inode call gfs2_ail_flush() or something so that the revoke > >>>> is written before it drops its reference to the glock? > >>>> > >>>> Or is there something else that is meant to keep the glock around if the > >>>> inode is evicted while there is a linked gfs2_bufdata sitting on some > >>>> AIL list? > >>>> > >>>> Let me know if any more specific information is needed since I now have > >>>> a test setup that can usually reproduce this within 10 minutes. > >>> > >>> Very interesting. > >>> > >>> It's not unusual for glocks to outlive their inodes, but I'm not sure > >>> what the right answer is in this case. Either the revoke should > >>> take an additional reference to the glock, and not let go until the > >>> revoke is written by some log flush, or else the evict needs to do the > >>> log flush to make sure the revoke is committed. But we've had issues with > >>> evict in the past, so we need to be careful about how we fix it. > >>> Andreas and I will look into the best way to fix it. Thanks again for > >>> your help. > >> > >> Usually, glocks are attached to inodes or other objects. When the > >> state of the glock changes, the go_lock, go_sync, and go_inval > >> operations determine what happens to the attached object. In > >> gfs2_evict_inode, we disassociate the inode from the glock, do the > >> necessary cleanup work locally, and put the glock asynchronously when > >> needed, though. We do that in PF_MEMALLOC context to avoid > >> deadlocking; see commit 71c1b2136835 ("gfs2: gfs2_evict_inode: Put > >> glocks asynchronously"). Even if we didn't do that, there could still > >> be other glock holders like the glock state machine. There is no > >> guarantee that the glock will go away anytime soon, or actually at > >> all: it could get reused by another instance of the same inode. So > >> waiting for the glock to go away first in gfs2_evict_inode definitely > >> is not an option. > >> > >> This basically answers your above question: gfs2_evict_inode should > >> definitely clean up before putting the glock. I'm just not sure how to > >> best achieve that, yet. > > > > after more discussions, Bob convinced me that it makes perfect sense > > to not write out outstanding revokes in gfs2_evict_inode. We really > > want to delay writing revokes as long as possible so that as many of > > the revokes as possible will go away (either because the blocks are > > re-added to the journal and the revokes are "unrevoked", or because > > the journal wraps around). So the right fix here is indeed taking > > additional glock references. > > > > I've come up with a patch that takes one additional reference for each > > glock instead of one reference for each bufdata object; that should > > scale better. I'll post that and a follow-up patch separately. > > > > Could you please verify? > > > > I've tested the patches and no longer see the use-after-free.
Ok great, we'll line the fix up for the next merge window. Thanks, Andreas