In message <[EMAIL PROTECTED]>, Dave Miller writes:
> Erez Zadok wrote on 1/29/08 11:32 PM:
> > I was able to reproduce the bug in question: just umount -f an nfs partition
> > or umount -l any partition that's used as a lower branch, then try to umount
> > unionfs's mount; you get the exact oops above.  Turns out that grabbing a
> > vfsmount ref isn't enough: it prevents a casual umount on a lower branch
> > from succeeding, returning an EBUSY.  But we also needed to grab an s_active
> > reference on all lower superblocks, to prevent a forced/detached unmount
> > from destroying the lower super too early.  With the patch below, the lower
> > super will be detached from the namespace, but it won't be destroyed until
> > unionfs is unmounted: unionfs_put_super will decrement the (possibly last)
> > reference on the lower super, which'd then be properly destroyed.
> >
> > Try this patch.  I quickly tried it w/ branch management, umount -l, and my
> > basic regression suite.  It seems to work, but I'd like to hear from both of
> > you first before considering this bug fixed.
>
> Poking around at my logs, I see that the OOPS I've been getting under
> heavy usage (that I've been meaning to send you all our config so you
> could reproduce it) actually matches the one we get when trying to shut
> down with the unionfs still mounted (which is the one you're trying to
> fix here).  If this patch fixes this particular OOPS this may well solve
> our whole problem.  I've got it compiling now, I'll throw the load test
> script at it again and let you know. :)
>
> --
> Dave Miller                                   http://www.justdave.net/
> System Administrator, Mozilla Corporation      http://www.mozilla.com/
> Project Leader, Bugzilla Bug Tracking System  http://www.bugzilla.org/

Dave, please apply the very important patch below on top of unionfs-2.3, and
let me know.  The oopses you've seen and this fix seem to be a good match
(fingers crossed :-)

You've seen oopses that look like this:

Mar 10 15:53:02 dm-stage02 kernel: BUG: Dentry f652e6e8{i=63,n=archive.mozilla.org} still in use (1) [unmount of nfs 0:15]

This was 'strange' b/c you weren't really unmounting anything, just running
some branch-management commands.  The bug in question was an oops from the
VFS b/c it seems that a superblock's reference count reached zero while it
still had active dentries (should never happen).

The fix below seems fitting because:

- it fixes stuff in unionfs_remount_fs, as per your stack trace

- if you added a branch, we had a bug which incorrectly decremented the
  refcnt of some branches (very clear from the patch itself): the loop that
  releases the old lower supers was bounded by the new branch count instead
  of the old one.

- over-decrementing the sb refcnt will result in the behavior you've seen:
  the sb refcnt will reach zero too early.

- all of the other recent fixes were more related to races, whereas this bug
  was a buffer overflow that went beyond the valid array range of
  UNIONFS_SB(sb)->data[i].sb, and tried to decrement stuff there (i.e., the
  bug didn't tickle all the time for all users b/c it depended on the memory
  contents beyond that data[i] array, which vary from system to system).

Cheers,
Erez.


diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index e5cb235..4cddc83 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -755,7 +755,7 @@ out_no_change:
 	/* grab new lower super references; release old ones */
 	for (i = 0; i < new_branches; i++)
 		atomic_inc(&new_data[i].sb->s_active);
-	for (i = 0; i < new_branches; i++)
+	for (i = 0; i < sbmax(sb); i++)
 		atomic_dec(&UNIONFS_SB(sb)->data[i].sb->s_active);
 
 	/* copy new vectors into their correct place */
_______________________________________________
unionfs mailing list: http://unionfs.filesystems.org/
unionfs@mail.fsl.cs.sunysb.edu
http://www.fsl.cs.sunysb.edu/mailman/listinfo/unionfs