On Thu, 8 May 2008 06:37:00 pm Doug Rabson wrote:
> On 8 May 2008, at 09:12, Paul Koch wrote:
> > Hi,
> >
> > We have been trying to track down a problem with one of our apps
> > which does a lot of flock(2) calls.  flock returns errno 11
> > (Resource deadlock avoided) under certain scenarios.  Our app works
> > fine on 7-Release, but fails on 7-stable and -current.
> >
> > The problem appears to be when we have at least three processes
> > doing flock() on a file, and one is trying to upgrade a shared lock
> > to an exclusive lock but fails with "Resource deadlock avoided".
> >
> > Attached is a simple flock() test program.
> >
> > a. Process 1 requests and gets a shared lock
> > b. Process 2 requests and blocks for an exclusive lock
> > c. Process 3 requests and gets a shared lock
> > d. Process 3 requests an upgrade to an exclusive lock but fails
> > (errno 11)
> >
> > If we change 'd' to
> >   Process 3 requests unlock, then requests exclusive lock, it
> > works.
>
> Could you possibly try this patch and tell me if it helps:
>
> ==== //depot/user/dfr/lockd/sys/kern/kern_lockf.c#57 -
> /tank/projects/lockd/src/sys/kern/kern_lockf.c ====
> @@ -1370,6 +1370,18 @@
>               }
>
>               /*
> +              * For flock type locks, we must first remove
> +              * any shared locks that we hold before we sleep
> +              * waiting for an exclusive lock.
> +              */
> +             if ((lock->lf_flags & F_FLOCK) &&
> +                 lock->lf_type == F_WRLCK) {
> +                     lock->lf_type = F_UNLCK;
> +                     lf_activate_lock(state, lock);
> +                     lock->lf_type = F_WRLCK;
> +             }
> +
> +             /*
>                * We are blocked. Create edges to each blocking lock,
>                * checking for deadlock using the owner graph. For
>                * simplicity, we run deadlock detection for all
> @@ -1389,17 +1401,6 @@
>               }
>
>               /*
> -              * For flock type locks, we must first remove
> -              * any shared locks that we hold before we sleep
> -              * waiting for an exclusive lock.
> -              */
> -             if ((lock->lf_flags & F_FLOCK) &&
> -                 lock->lf_type == F_WRLCK) {
> -                     lock->lf_type = F_UNLCK;
> -                     lf_activate_lock(state, lock);
> -                     lock->lf_type = F_WRLCK;
> -             }
> -             /*
>                * We have added edges to everything that blocks
>                * us. Sleep until they all go away.
>                */

Manually applied the patch to stable kern_lockf.c  1.57.2.1.  Ran the 
flock_test program on many of our architectures and it works fine.
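
For readers without the attachment, the a-d scenario above can be
sketched roughly as below.  This is a reconstruction, not the flock_test
program itself: the names spawn_locker() and flock_upgrade_scenario(),
the caller-supplied lock file path, and the sleep()-based sequencing are
all assumptions made for illustration.

```c
#include <assert.h>   /* only needed if the call is wrapped in assert() */
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that opens the file itself, takes
 * the requested flock() lock, holds it for hold_secs, then releases it.
 * The child opens its own descriptor because flock() locks belong to
 * the open file description; a descriptor inherited over fork() would
 * share the parent's lock instead of competing with it.
 */
static pid_t spawn_locker(const char *path, int op, unsigned hold_secs)
{
    pid_t pid = fork();
    if (pid == 0) {
        int fd = open(path, O_RDWR | O_CREAT, 0600);
        if (fd < 0 || flock(fd, op) != 0)
            _exit(1);
        sleep(hold_secs);
        flock(fd, LOCK_UN);
        _exit(0);
    }
    return pid;   /* -1 if fork() failed */
}

/*
 * Runs steps a-d with the calling process playing "process 3".
 * Returns 0 if the shared-to-exclusive upgrade succeeds, otherwise the
 * errno of the failing call (EDEADLK in the failure reported above).
 */
int flock_upgrade_scenario(const char *path)
{
    int result = 0;
    int fd;
    pid_t p1, p2;

    p1 = spawn_locker(path, LOCK_SH, 3);      /* a. shared lock, held 3s */
    if (p1 < 0)
        return errno;
    sleep(1);                                 /* crude ordering only */
    p2 = spawn_locker(path, LOCK_EX, 0);      /* b. blocks behind p1 */
    if (p2 < 0) {
        result = errno;
        waitpid(p1, NULL, 0);
        return result;
    }
    sleep(1);

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) {
        result = errno;
    } else {
        if (flock(fd, LOCK_SH) != 0)          /* c. shared lock */
            result = errno;
        else if (flock(fd, LOCK_EX) != 0)     /* d. upgrade to exclusive */
            result = errno;
        flock(fd, LOCK_UN);
        close(fd);
    }

    waitpid(p1, NULL, 0);
    waitpid(p2, NULL, 0);
    return result;
}
```

Calling flock_upgrade_scenario() on a scratch file from main() and
printing strerror() of a non-zero result is enough to see whether step
'd' fails.  The sleeps only approximate the ordering of the steps; a
more careful test would synchronise the processes explicitly.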

Have also been testing our app on a single core i386 machine today with
no locking problems.  Just set up a quad core -stable amd64 build and it
also appears to be running fine now.

Thanks

        Paul.
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
