On Fri, Feb 05, 2010 at 04:43:59PM -0800, Matthew Dillon wrote:
> Please try this patch:
>
> fetch http://apollo.backplane.com/DFlyMisc/lock01.patch
>
> I don't know if this will fix it or not. There is an issue in
> allocfreevnode() where a vnode whose v_lock.lk_flags sets
> LK_CANRECURSE can be improperly reallocated while in the middle
> of being freed, but only if the filesystem's VOP_RECLAIM code
> recurses.
This didn't fix it. There was a new crash last night, possibly during
the daily maintenance window at 3am.

> So the only way I can think of for this crash to occur is if UFS
> recurses in softupdates and allocates new vnodes while reclaiming
> a vnode, the allocate code then reusing a HAMMER vnode and reclaiming
> IT, and HAMMER then recursing and trying to allocate a new vnode
> itself and winding up reusing the vnode UFS was originally trying to
> reclaim. A difficult path to say the least.

Only /boot is UFS on this machine, and it doesn't use softupdates.

> Both your crash dump and the one I got from leaf today crashed on
> a HAMMER vnode being reallocated with a seemingly impossible state.
> Clearly a MP race, but I couldn't find a smoking gun related to
> HAMMER itself. Basically vp->v_mount was NULL, the vnode was in
> a reclaimed state, but vp->v_data was still pointing at the
> HAMMER inode and the HAMMER inode was still pointing back at the
> vp. That implies the vnode was reallocated back to the same
> HAMMER inode recursively from within the VOP_RECLAIM itself,
> which shouldn't be possible.

Most of the crashes I could see occurred during a pkgsrc distfile
extraction, just after I did a pkgsrc cvs update.

I've put the new core dump online.

-- 
Francois Tigeot
