hi all,

the following is on openbsd 4.5, on i386.

i recently had a problem with a program that seemed to be hanging. top showed many procs in the "fdlock" state, which i hadn't seen before. the program that was blocking was inferno. inferno uses rfork (inferno is closely related to plan 9) to create multiple procs to handle blocking i/o, with shared file descriptor groups. after inspection with ddb(4), it seemed the proc that was holding the fdlock was busy closing a socket with SO_LINGER set, sleeping (with fd_lock held) in /sys/kern/uipc_socket.c:/^soclose.
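for reference, this is roughly how SO_LINGER gets switched on from userland. it is only a minimal sketch, not inferno's actual code; the helper names set_linger and get_linger are made up for illustration. with the option on, close() on a connected socket that still has unsent data may block for up to the given timeout.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * hypothetical helper (not from inferno): enable SO_LINGER on socket s
 * with the given timeout. returns 0 on success, -1 on error.
 */
int
set_linger(int s, int seconds)
{
	struct linger l;

	l.l_onoff = 1;		/* linger on close */
	l.l_linger = seconds;	/* how long close() may wait for unsent data */
	return setsockopt(s, SOL_SOCKET, SO_LINGER, &l, sizeof l);
}

/*
 * hypothetical helper: read the setting back. returns l_linger if
 * lingering is on, 0 if it is off, -1 on error.
 */
int
get_linger(int s)
{
	struct linger l;
	socklen_t len = sizeof l;

	if (getsockopt(s, SOL_SOCKET, SO_LINGER, &l, &len) < 0)
		return -1;
	return l.l_onoff ? l.l_linger : 0;
}
```

calling set_linger(s, 5) on a tcp socket before use is enough to get the lingering close path for that socket.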
i'm not very familiar with openbsd code, so my analysis is a bit fuzzy, but this is what i came to:

/sys/kern/kern_descrip.c:/^sys_close does the following:

	fdplock(fdp);
	error = fdrelease(p, fd);
	fdpunlock(fdp);

with fdplock defined as:

	/usr/include/sys/filedesc.h:#define fdplock(fdp) rw_enter_write(&(fdp)->fd_lock)

/sys/kern/kern_descrip.c:/^fdrelease calls /sys/kern/kern_descrip.c:/^closef. closef() calls the close() method in the struct file's f_ops, which for a socket is /sys/kern/sys_socket.c:/^soo_close, which calls /sys/kern/uipc_socket.c:/^soclose, which seems to be able to sleep when option SO_LINGER is set, still with fd_lock held. other procs in the same fd group will now block when they try to take fd_lock, e.g. in sys_open (or any other fd slot operation).

i had a quick look at the code of other bsds and linux; they seem to do the closing separately from the fd slot operations.

i no longer use SO_LINGER, so this particular problem is not bothering me any more. if the above is really what is happening, perhaps rthreads will run into the same problem sooner or later? perhaps other close() calls can also sleep? or other operations that hold fd_lock?

best regards,
mjl