For the devfs_root->devfs_allocv issue, I've decided to check inside allocv whether devfs_node->v_node is NULL and only then allocate a new vnode, since it should be nulled out every time a vnode is reclaimed.
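To illustrate, here is a small userland model of that allocate-once logic. The struct fields and the vref/vn_lock helpers are stubs of my own, not the real kernel API, and getnewvnode() is replaced by calloc(); this only demonstrates the reuse pattern, not the actual devfs code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel structures and helpers. */
struct vnode {
	int refs;
	bool locked;
};

struct devfs_node {
	struct vnode *v_node;	/* NULL until a vnode is attached, and
				 * nulled again when it is reclaimed */
};

static void vref(struct vnode *vp)    { vp->refs++; }
static void vn_lock(struct vnode *vp) { vp->locked = true; }

/*
 * Model of the fixed devfs_allocv(): reuse the existing vnode when the
 * node already has one, otherwise allocate a fresh one (getnewvnode()
 * in the kernel).  Either way the vnode comes back referenced and
 * locked, so repeated devfs_root() calls can no longer overwrite a
 * previously acquired vnode pointer.
 */
static struct vnode *
devfs_allocv(struct devfs_node *node)
{
	if (node->v_node != NULL) {
		/* Vnode already attached: ref + lock and hand it back. */
		vref(node->v_node);
		vn_lock(node->v_node);
		return (node->v_node);
	}
	/* No vnode yet: allocate one and attach it to the node. */
	struct vnode *vp = calloc(1, sizeof(*vp));
	vp->refs = 1;
	vp->locked = true;
	node->v_node = vp;
	return (vp);
}
```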
Also, I've taken your suggestions on locking into account and yesterday's problem is gone (the hang when reading the contents of a subdirectory), but I've found a new one. I now get a panic in lockmgr due to the vn_unlock in nresolve when creating, or trying to create, a symlink. I suspect I'm trying to unlock a vnode that isn't locked, and that is what causes the panic. Is it actually correct to unlock the vnode in nresolve? I don't think nsymlink has even been called yet at the time of the panic.

Cheers,
Alex Hornung

The new code is at:
http://gitweb.dragonflybsd.org/~alexh/dragonfly.git/tree/5f72c8d7bd3081a5aeaec1707cf55de110a3385d:/sys/vfs/devfs

This is the panic:

kernel: type 12 trap, code=0
Stopped at	lockmgr+0x8b:	movl	0x4(%ebx),%ecx
db> trace
lockmgr(c0,6) at lockmgr+0x8b
vn_unlock(0,0,d4484328,cf79c518,d4453af8) at vn_unlock+0x13
devfs_nresolve(d1f54b64,c0616760,c1e899f0,d4484328,cf79c518) at devfs_nresolve+0xa5
vop_nresolve(c1e899f0,d1f54ba4,d4484328,c1d45f58,cf79c518) at vop_nresolve+0x2f
cache_resolve(d1f54bd4,c1d45f58) at cache_resolve+0x30f
nlookup(d1f54c80,0,d1f54c80,d1f54cf0,d1f54cc4) at nlookup+0x25e
kern_stat(d1f54c80,d1f54c18,d1f54c34,c048f383,d1f067f4) at kern_stat+0xf
sys_lstat(d1f54cf0,6,0,0,cf79c2d8) at sys_lstat+0x35
syscall2(d1f54d40) at syscall2+0x1ef
Xint0x80_syscall() at Xint0x80_syscall+0x36

On Tue, 2009-05-26 at 16:02 -0700, Matthew Dillon wrote:
> :Hi,
> :
> :I'm running into quite a lot of trouble with vnode locking and
> :unlocking, releasing, putting, getting, ... I'm 100% sure I got most if
> :not all of the vnode locking wrong and I really need some help
> :understanding it. It is still unclear to me which vnops need to do what
> :with regard to locking/unlocking.
> :
> :My current code is here:
> :http://gitweb.dragonflybsd.org/~alexh/dragonfly.git/tree/9de1a66518d104077521a13d7b13ae958fd79d98:/sys/vfs/devfs
> :
> :Right now my concerns are mainly in devfs_vnops.c, where all the
> :locking/unlocking/... occurs or should occur.
> :
> :I'd appreciate some insight/comments/corrections/... on this issue!
> :
> :Alex Hornung
>
> One thing I noticed is that the devfs_root() code path does not
> look right. This routine can be called any number of times by
> the kernel; you don't want to allocate a new root vnode every
> time!
>
> devfs_root()->devfs_allocv()->getnewvnode()-> ...
>
> If the root node already has a vnode associated with it, you have to
> ref and vn_lock that vnode and return it, not allocate a new vnode.
> I think you might be overwriting previously acquired vnode pointers
> in that root node, and that will certainly mess things up.
>
> --
>
> Another thing to remember is that when you return a vnode in *vpp,
> the caller expects that vnode to be referenced (and possibly also
> locked), which means you don't release the vnode you are returning
> unless you have extra references (2 or more) and you need to get
> them down to one reference for the return.
>
> So, for example, devfs_nsymlink() is calling devfs_allocvp() but it
> is also returning the vp in *ap->a_vpp. In the case of
> devfs_nsymlink() I believe it is expected to return a referenced
> but NOT locked vnode, so you would unlock it but not dereference it
> before returning.
>
> You will want to check all the VNOPS that return a vnode in
> *ap->a_vpp for similar issues.
>
> In the case of nresolve I believe you are doing it correctly...
> VOP_NRESOLVE() does not return a vnode (there is no ap->a_vpp);
> it just expects the namecache entry to be resolved, and the
> cache_setvp() call doesn't inherit any refs or anything, so you
> unlock and release the vnode like you are doing.
>
>					-Matt
>					Matthew Dillon
>					<[email protected]>
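To make the two conventions Matt describes concrete, here is a small userland model of the ref/lock bookkeeping. The vref/vn_lock/vn_unlock/vput stubs and both model_* functions are my own stand-ins, not the kernel routines; cache_setvp() is modeled as taking no reference, which matches the behavior described above. The assert in vn_unlock also mimics the failure mode of my panic (unlocking a vnode that isn't locked):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal model of a vnode's reference/lock state. */
struct vnode {
	int refs;
	bool locked;
};

static void vref(struct vnode *vp)      { vp->refs++; }
static void vn_lock(struct vnode *vp)   { assert(!vp->locked); vp->locked = true; }
static void vn_unlock(struct vnode *vp) { assert(vp->locked); vp->locked = false; }
static void vrele(struct vnode *vp)     { vp->refs--; }
static void vput(struct vnode *vp)      { vn_unlock(vp); vrele(vp); }

/* cache_setvp() does not inherit any refs, so it is a no-op here. */
static void cache_setvp(struct vnode *vp) { (void)vp; }

/*
 * nresolve convention: there is no *ap->a_vpp, so after resolving the
 * namecache entry we drop both the lock and our reference.  Calling
 * vn_unlock on a vnode we never locked would trip the assertion --
 * the same class of bug as the lockmgr panic.
 */
static void
model_nresolve(struct vnode *vp)
{
	vref(vp);		/* get the vnode ref'd + locked */
	vn_lock(vp);
	cache_setvp(vp);	/* inherits nothing */
	vput(vp);		/* unlock + release: we keep nothing */
}

/*
 * nsymlink convention: return the new vnode in *vpp referenced but
 * NOT locked, so unlock it without dropping the reference.
 */
static void
model_nsymlink(struct vnode *vp, struct vnode **vpp)
{
	vref(vp);		/* allocv would return it ref'd + locked */
	vn_lock(vp);
	vn_unlock(vp);		/* unlock only; the caller keeps the ref */
	*vpp = vp;
}
```

The difference is just vput() versus vn_unlock(): nresolve keeps nothing, nsymlink hands one reference back through *vpp.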
