Re: flock incorrectly detects deadlock on 7-stable and current

2008-05-09 Thread Doug Rabson


On 9 May 2008, at 07:07, Paul Koch wrote:


> On Thu, 8 May 2008 06:37:00 pm Doug Rabson wrote:
>
> > Could you possibly try this patch and tell me if it helps:
> > ...
>
> Manually applied the patch to stable kern_lockf.c 1.57.2.1.  Ran the
> flock_test program on many of our architectures and it works fine.
>
> Have also been testing our app on a single core i386 machine today with
> no locking problems.  Just setup a quad core -stable amd64 build and it
> also appears to be running fine now.


Thanks for that. I'll get the patch committed to current today - it
will turn up in 7-stable in a couple of days.


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: flock incorrectly detects deadlock on 7-stable and current

2008-05-08 Thread Paul Koch
On Thu, 8 May 2008 06:37:00 pm Doug Rabson wrote:
> On 8 May 2008, at 09:12, Paul Koch wrote:
> > Hi,
> >
> > We have been trying to track down a problem with one of our apps
> > which does a lot of flock(2) calls.  flock returns errno 11
> > (Resource deadlock avoided) under certain scenarios.  Our app works
> > fine on 7-Release, but fails on 7-stable and -current.
> >
> > The problem appears to be when we have at least three processes
> > doing flock() on a file, and one is trying to upgrade a shared lock
> > to an exclusive lock but fails with a deadlock avoided.
> >
> > Attached is a simple flock() test program.
> >
> > a. Process 1 requests and gets a shared lock
> > b. Process 2 requests and blocks for an exclusive lock
> > c. Process 3 requests and gets a shared lock
> > d. Process 3 requests an upgrade to an exclusive lock but fails
> > (errno 11)
> >
> > If we change 'd' to
> >   Process 3 requests unlock, then requests exclusive lock, it
> > works.
>
> Could you possibly try this patch and tell me if it helps:
>
> //depot/user/dfr/lockd/sys/kern/kern_lockf.c#57 - /tank/projects/lockd/src/sys/kern/kern_lockf.c
> @@ -1370,6 +1370,18 @@
>  	}
>  
>  	/*
> +	 * For flock type locks, we must first remove
> +	 * any shared locks that we hold before we sleep
> +	 * waiting for an exclusive lock.
> +	 */
> +	if ((lock->lf_flags & F_FLOCK) &&
> +	    lock->lf_type == F_WRLCK) {
> +		lock->lf_type = F_UNLCK;
> +		lf_activate_lock(state, lock);
> +		lock->lf_type = F_WRLCK;
> +	}
> +
> +	/*
>  	 * We are blocked. Create edges to each blocking lock,
>  	 * checking for deadlock using the owner graph. For
>  	 * simplicity, we run deadlock detection for all
> @@ -1389,17 +1401,6 @@
>  	}
>  
>  	/*
> -	 * For flock type locks, we must first remove
> -	 * any shared locks that we hold before we sleep
> -	 * waiting for an exclusive lock.
> -	 */
> -	if ((lock->lf_flags & F_FLOCK) &&
> -	    lock->lf_type == F_WRLCK) {
> -		lock->lf_type = F_UNLCK;
> -		lf_activate_lock(state, lock);
> -		lock->lf_type = F_WRLCK;
> -	}
> -	/*
>  	 * We have added edges to everything that blocks
>  	 * us. Sleep until they all go away.
>  	 */

Manually applied the patch to stable kern_lockf.c  1.57.2.1.  Ran the 
flock_test program on many of our architectures and it works fine.

Have also been testing our app on a single core i386 machine today with 
no locking problems.  Just setup a quad core -stable amd64 build and it 
also appears to be running fine now.

Thanks

Paul.


Re: flock incorrectly detects deadlock on 7-stable and current

2008-05-08 Thread Doug Rabson


On 8 May 2008, at 09:12, Paul Koch wrote:


> Hi,
>
> We have been trying to track down a problem with one of our apps which
> does a lot of flock(2) calls.  flock returns errno 11 (Resource
> deadlock avoided) under certain scenarios.  Our app works fine on
> 7-Release, but fails on 7-stable and -current.
>
> The problem appears to be when we have at least three processes doing
> flock() on a file, and one is trying to upgrade a shared lock to an
> exclusive lock but fails with a deadlock avoided.
>
> Attached is a simple flock() test program.
>
> a. Process 1 requests and gets a shared lock
> b. Process 2 requests and blocks for an exclusive lock
> c. Process 3 requests and gets a shared lock
> d. Process 3 requests an upgrade to an exclusive lock but fails
>    (errno 11)
>
> If we change 'd' to
>   Process 3 requests unlock, then requests exclusive lock, it works.


Could you possibly try this patch and tell me if it helps:

//depot/user/dfr/lockd/sys/kern/kern_lockf.c#57 - /tank/projects/lockd/src/sys/kern/kern_lockf.c

@@ -1370,6 +1370,18 @@
 	}
 
 	/*
+	 * For flock type locks, we must first remove
+	 * any shared locks that we hold before we sleep
+	 * waiting for an exclusive lock.
+	 */
+	if ((lock->lf_flags & F_FLOCK) &&
+	    lock->lf_type == F_WRLCK) {
+		lock->lf_type = F_UNLCK;
+		lf_activate_lock(state, lock);
+		lock->lf_type = F_WRLCK;
+	}
+
+	/*
 	 * We are blocked. Create edges to each blocking lock,
 	 * checking for deadlock using the owner graph. For
 	 * simplicity, we run deadlock detection for all
@@ -1389,17 +1401,6 @@
 	}
 
 	/*
-	 * For flock type locks, we must first remove
-	 * any shared locks that we hold before we sleep
-	 * waiting for an exclusive lock.
-	 */
-	if ((lock->lf_flags & F_FLOCK) &&
-	    lock->lf_type == F_WRLCK) {
-		lock->lf_type = F_UNLCK;
-		lf_activate_lock(state, lock);
-		lock->lf_type = F_WRLCK;
-	}
-	/*
 	 * We have added edges to everything that blocks
 	 * us. Sleep until they all go away.
 	 */



flock incorrectly detects deadlock on 7-stable and current

2008-05-08 Thread Paul Koch
Hi,

We have been trying to track down a problem with one of our apps which 
does a lot of flock(2) calls.  flock returns errno 11 (Resource 
deadlock avoided) under certain scenarios.  Our app works fine on 
7-Release, but fails on 7-stable and -current.

The problem appears to be when we have at least three processes doing 
flock() on a file, and one is trying to upgrade a shared lock to an 
exclusive lock but fails with a deadlock avoided.

Attached is a simple flock() test program.

a. Process 1 requests and gets a shared lock
b. Process 2 requests and blocks for an exclusive lock
c. Process 3 requests and gets a shared lock
d. Process 3 requests an upgrade to an exclusive lock but fails (errno 
11)

If we change 'd' to
   Process 3 requests unlock, then requests exclusive lock, it works.
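
[Editorial sketch, not the attached flock_test program.] The workaround in the modified step 'd' could be wrapped as follows. The helper names are hypothetical; only flock(2), open(2) and close(2) are real interfaces. The fallback path unlocks before re-requesting, so another process may take the lock in between.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: upgrade a shared flock() lock on fd to an
 * exclusive one. If the direct upgrade fails with EDEADLK, fall back
 * to an explicit unlock followed by a fresh exclusive request.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
upgrade_to_exclusive(int fd)
{
	assert(fd >= 0);

	if (flock(fd, LOCK_EX) == 0)
		return (0);
	if (errno != EDEADLK)
		return (-1);

	/* Workaround: drop the shared lock, then ask again. */
	if (flock(fd, LOCK_UN) == -1)
		return (-1);
	return (flock(fd, LOCK_EX));
}

/*
 * Usage sketch: take a shared lock on 'path', upgrade it, then release
 * it by closing the descriptor.
 */
int
lock_shared_then_upgrade(const char *path)
{
	int fd, rc;

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd == -1)
		return (-1);
	rc = flock(fd, LOCK_SH);
	if (rc == 0)
		rc = upgrade_to_exclusive(fd);
	close(fd);	/* closing the only descriptor drops the flock() lock */
	return (rc);
}
```

With no other process holding the file, the direct upgrade should succeed and the fallback path is never taken; the fallback only matters in the three-process scenario above.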


The manual page says:

"A shared lock may be upgraded to an exclusive lock, and vice versa, 
simply by specifying the appropriate lock type; this results in the 
previous lock being released and the new lock applied (possibly after 
other processes have gained and released the lock)."

The manual page doesn't mention that flock() can fail with a deadlock.


Our test environment is:
 - 8 core Intel machine running i386 stable
 - 4 core Intel machine running amd64 current (20080508)
 - 4 core Intel machine running amd64 stable  (20080508)
 - 2 core AMD machine running i386 stable (20080418)
 - 2 core AMD machine running i386 stable (20080418)
 - single core (no hyperthreading) i386 stable (20080418)

There appear to have been changes to kern_lockf.c and related code 
around the 10th April to do with deadlock detection.  We don't see the 
problem on 6.2-stable, 7-Release, or 7-stable pre ~10th April.

Paul.