Yeah, that looks like a bug to me. We need to keep the unlocked check before taking the lock as a fast path, in addition to the new check under the lock, but otherwise this is fine and should be fixed in 2.0.2.
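
For what it's worth, here is a rough sketch of the shape I have in mind: an unlocked fast-path check first, then the same test repeated under the lock before waiting. It uses a made-up demo_sync_t rather than the real ompi_wait_sync_t, and the demo_* names and the 0 / -1 return codes are assumptions for illustration only, not the code in wait_sync.c.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for ompi_wait_sync_t; NOT the real Open MPI type. */
typedef struct {
    atomic_int      count;     /* number of completions still outstanding */
    int             status;    /* 0 means success */
    pthread_mutex_t lock;
    pthread_cond_t  condition;
} demo_sync_t;

static void demo_sync_init(demo_sync_t *sync, int count)
{
    atomic_init(&sync->count, count);
    sync->status = 0;
    pthread_mutex_init(&sync->lock, NULL);
    pthread_cond_init(&sync->condition, NULL);
}

/* Waiter side. Returns 0 on success, -1 on error (assumed codes). */
static int demo_sync_wait(demo_sync_t *sync)
{
    /* Fast path: if everything already completed, skip the lock entirely. */
    if (atomic_load(&sync->count) <= 0)
        return (0 == sync->status) ? 0 : -1;

    pthread_mutex_lock(&sync->lock);
    /* Re-check under the lock. The signaler also takes the lock before
     * signaling, so the wakeup cannot slip in between this test and
     * pthread_cond_wait(). The loop also covers spurious wakeups. */
    while (atomic_load(&sync->count) > 0)
        pthread_cond_wait(&sync->condition, &sync->lock);
    pthread_mutex_unlock(&sync->lock);

    return (0 == sync->status) ? 0 : -1;
}

/* Completer side: decrement the counter, signal once it reaches zero. */
static void demo_sync_update(demo_sync_t *sync, int updates)
{
    /* atomic_fetch_sub returns the value held before the subtraction. */
    if (atomic_fetch_sub(&sync->count, updates) - updates > 0)
        return;

    pthread_mutex_lock(&sync->lock);
    pthread_cond_signal(&sync->condition);
    pthread_mutex_unlock(&sync->lock);
}

/* Thread entry point used to exercise the sketch. */
static void *demo_completer(void *arg)
{
    demo_sync_update((demo_sync_t *)arg, 1);
    return NULL;
}
```

The unlocked test is only an optimization; correctness comes from the test under the lock, which is why both are there.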

-Nathan

> On Sep 21, 2016, at 3:16 AM, DEVEZE, PASCAL <pascal.dev...@atos.net> wrote:
>
> I encountered a deadlock in sync_wait_mt().
>
> After investigation, it appears that a first thread executing
> wait_sync_update() decrements sync->count just after a second thread in
> sync_wait_mt() has performed the test:
>
>     if(sync->count <= 0)
>         return (0 == sync->status) ? OPAL_SUCCESS : OPAL_ERROR;
>
> After that, there is a narrow window in which the first thread may call
> pthread_cond_signal() before the second thread calls pthread_cond_wait().
>
> If I protect this test with sync->lock, this window is closed and the
> problem does not reproduce.
>
> To reproduce the problem easily, just add a call to usleep(100) before the
> call to pthread_mutex_lock(&sync->lock);
>
> So my proposed patch is:
>
> diff --git a/opal/threads/wait_sync.c b/opal/threads/wait_sync.c
> index c9b9137..2f90965 100644
> --- a/opal/threads/wait_sync.c
> +++ b/opal/threads/wait_sync.c
> @@ -25,12 +25,14 @@ static ompi_wait_sync_t* wait_sync_list = NULL;
>
>  int sync_wait_mt(ompi_wait_sync_t *sync)
>  {
> -    if(sync->count <= 0)
> -        return (0 == sync->status) ? OPAL_SUCCESS : OPAL_ERROR;
> -
>      /* lock so nobody can signal us during the list updating */
>      pthread_mutex_lock(&sync->lock);
>
> +    if(sync->count <= 0) {
> +        pthread_mutex_unlock(&sync->lock);
> +        return (0 == sync->status) ? OPAL_SUCCESS : OPAL_ERROR;
> +    }
> +
>      /* Insert sync on the list of pending synchronization constructs */
>      OPAL_THREAD_LOCK(&wait_sync_lock);
>      if( NULL == wait_sync_list ) {
>
> For performance reasons, it is also possible to keep the first test, so
> that if the request has already completed we do not spend time taking and
> releasing the lock.
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel