Yeah, that looks like a bug to me. We need to keep the check before the lock 
but otherwise this is fine and should be fixed in 2.0.2.

-Nathan

> On Sep 21, 2016, at 3:16 AM, DEVEZE, PASCAL <pascal.dev...@atos.net> wrote:
> 
> I encountered a deadlock in sync_wait_mt().
>  
> After investigations, it appears that a first thread executing 
> wait_sync_update() decrements sync->count just after a second thread in 
> sync_wait_mt() made the test :
>  
>     if(sync->count <= 0)
>         return (0 == sync->status) ? OPAL_SUCCESS : OPAL_ERROR;
>  
> After that, there is a narrow window in which the first thread may call 
> pthread_cond_signal() before the second thread calls pthread_cond_wait().
>  
> If I protect this test by the sync->lock, this window is closed and the 
> problem does not reproduce.
>  
> To easy reproduce the problem, just add a call to usleep(100) before the call 
> to pthread_mutex(&sync->lock);
>  
> So my proposed patch is:
>  
> diff --git a/opal/threads/wait_sync.c b/opal/threads/wait_sync.c
> index c9b9137..2f90965 100644
> --- a/opal/threads/wait_sync.c
> +++ b/opal/threads/wait_sync.c
> @@ -25,12 +25,14 @@ static ompi_wait_sync_t* wait_sync_list = NULL;
>  
> int sync_wait_mt(ompi_wait_sync_t *sync)
> {
> -    if(sync->count <= 0)
> -        return (0 == sync->status) ? OPAL_SUCCESS : OPAL_ERROR;
> -
>      /* lock so nobody can signal us during the list updating */
>      pthread_mutex_lock(&sync->lock);
>  
> +    if(sync->count <= 0) {
> +        pthread_mutex_unlock(&sync->lock);
> +        return (0 == sync->status) ? OPAL_SUCCESS : OPAL_ERROR;
> +    }
> +
>      /* Insert sync on the list of pending synchronization constructs */
>      OPAL_THREAD_LOCK(&wait_sync_lock);
>      if( NULL == wait_sync_list ) {
>  
> For performance reasons, it is also possible to leave the first test call. So 
> if the request is terminated, we do not spend time to take and free the lock.
>  
>  
>  
> _______________________________________________
> devel mailing list
> devel@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel

_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel

Reply via email to