On Thu, 8 May 2008 06:37:00 pm Doug Rabson wrote:
> On 8 May 2008, at 09:12, Paul Koch wrote:
> > Hi,
> >
> > We have been trying to track down a problem with one of our apps
> > which does a lot of flock(2) calls. flock returns errno 11
> > (Resource deadlock avoided) under certain scenarios. Our app works
> > fine on 7-Release, but fails on 7-stable and -current.
> >
> > The problem appears to be when we have at least three processes
> > doing flock() on a file, and one is trying to upgrade a shared lock
> > to an exclusive lock but fails with a deadlock avoided.
> >
> > Attached is a simple flock() test program.
> >
> > a. Process 1 requests and gets a shared lock
> > b. Process 2 requests and blocks for an exclusive lock
> > c. Process 3 requests and gets a shared lock
> > d. Process 3 requests an upgrade to an exclusive lock but fails
> >    (errno 11)
> >
> > If we change 'd' to
> >    Process 3 requests unlock, then requests exclusive lock, it
> >    works.
>
> Could you possibly try this patch and tell me if it helps:
>
> ==== //depot/user/dfr/lockd/sys/kern/kern_lockf.c#57 - /tank/projects/lockd/src/sys/kern/kern_lockf.c ====
> @@ -1370,6 +1370,18 @@
>  	}
>
>  	/*
> +	 * For flock type locks, we must first remove
> +	 * any shared locks that we hold before we sleep
> +	 * waiting for an exclusive lock.
> +	 */
> +	if ((lock->lf_flags & F_FLOCK) &&
> +	    lock->lf_type == F_WRLCK) {
> +		lock->lf_type = F_UNLCK;
> +		lf_activate_lock(state, lock);
> +		lock->lf_type = F_WRLCK;
> +	}
> +
> +	/*
>  	 * We are blocked. Create edges to each blocking lock,
>  	 * checking for deadlock using the owner graph. For
>  	 * simplicity, we run deadlock detection for all
> @@ -1389,17 +1401,6 @@
>  	}
>
>  	/*
> -	 * For flock type locks, we must first remove
> -	 * any shared locks that we hold before we sleep
> -	 * waiting for an exclusive lock.
> -	 */
> -	if ((lock->lf_flags & F_FLOCK) &&
> -	    lock->lf_type == F_WRLCK) {
> -		lock->lf_type = F_UNLCK;
> -		lf_activate_lock(state, lock);
> -		lock->lf_type = F_WRLCK;
> -	}
> -	/*
>  	 * We have added edges to everything that blocks
>  	 * us. Sleep until they all go away.
>  	 */
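
The attached test program is not reproduced in this message, so for reference the a-d sequence quoted above can be sketched roughly as follows. This is not the original attachment: the names flock_upgrade_scenario() and open_and_lock() are made up for illustration, the 200 ms sleeps are a crude assumption standing in for real synchronisation, and the caller plays the part of process 1. It returns 0 if the upgrade in step d succeeds, the errno from the upgrade otherwise (EDEADLK on an affected kernel), or -1 if the setup itself fails.

```c
#define _DEFAULT_SOURCE	/* for usleep() on glibc; assumed harmless elsewhere */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the lock file with a fresh descriptor and
 * take a flock of the given mode. Returns the fd, or -1 with errno set.
 */
static int
open_and_lock(const char *path, int mode)
{
	int fd = open(path, O_RDWR | O_CREAT, 0644);

	if (fd < 0)
		return (-1);
	if (flock(fd, mode) < 0) {
		int saved = errno;

		close(fd);
		errno = saved;
		return (-1);
	}
	return (fd);
}

/* Hypothetical driver for steps a-d; see the text above for return values. */
int
flock_upgrade_scenario(const char *path)
{
	int fd1, status = 0;
	pid_t p2, p3;

	assert(path != NULL);

	/* a. Process 1 (the caller) requests and gets a shared lock. */
	fd1 = open_and_lock(path, LOCK_SH);
	if (fd1 < 0)
		return (-1);

	/* b. Process 2 requests an exclusive lock and blocks. */
	p2 = fork();
	if (p2 < 0) {
		close(fd1);
		return (-1);
	}
	if (p2 == 0) {
		close(fd1);	/* inherited copy; process 1 still holds its lock */
		_exit(open_and_lock(path, LOCK_EX) < 0 ? 1 : 0);
	}
	usleep(200000);		/* crude: give process 2 time to block */

	/* c. and d. Process 3 gets a shared lock, then asks to upgrade it. */
	p3 = fork();
	if (p3 == 0) {
		int fd;

		close(fd1);
		fd = open_and_lock(path, LOCK_SH);
		if (fd < 0)
			_exit(255);	/* setup failure, not an upgrade result */
		_exit(flock(fd, LOCK_EX) < 0 ? errno : 0);
	}
	usleep(200000);		/* crude: let process 3 reach the upgrade */

	/* Release process 1's lock so any waiters can proceed in turn. */
	flock(fd1, LOCK_UN);
	close(fd1);

	waitpid(p2, NULL, 0);
	if (p3 < 0 || waitpid(p3, &status, 0) < 0 || !WIFEXITED(status))
		return (-1);
	return (WEXITSTATUS(status) == 255 ? -1 : WEXITSTATUS(status));
}
```

Calling flock_upgrade_scenario() with a scratch file path from main() and printing the result with strerror() should be enough to tell the two behaviours apart.
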

Manually applied the patch to stable kern_lockf.c 1.57.2.1. Ran the
flock_test program on many of our architectures and it works fine.

Have also been testing our app on a single core i386 machine today
with no locking problems. Just set up a quad core -stable amd64
build and it also appears to be running fine now.

Thanks

	Paul.
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable