This isn't thread-safe, but in a "simple" way (if these things ever are simple), because the operation could complete eagerly.
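For readers unfamiliar with the split-phase interface, the pattern BLMVM relies on looks roughly like the sketch below (illustrative only, not code from this thread; two_dots is a made-up name). The deferred reductions queued by the Begin calls live in the PetscSplitReduction attached to the communicator, and with --with-mpi=0 every Vec sits on PETSC_COMM_SELF, so all threads in the process share that one queue.

  /* Minimal sketch of the split-phase dot-product pattern (illustrative,
   * not from this thread).  The Begin calls queue local work on the
   * PetscSplitReduction attached to the Vec's communicator; the End calls
   * must then be issued for the same vectors in the same order. */
  #include <petscvec.h>

  static PetscErrorCode two_dots(Vec x,Vec y,Vec z,PetscScalar *xy,PetscScalar *xz)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = VecDotBegin(x,y,xy);CHKERRQ(ierr);   /* queue x.y */
    ierr = VecDotBegin(x,z,xz);CHKERRQ(ierr);   /* queue x.z */
    /* ... unrelated work could overlap the reduction here ... */
    ierr = VecDotEnd(x,y,xy);CHKERRQ(ierr);     /* collect x.y */
    ierr = VecDotEnd(x,z,xz);CHKERRQ(ierr);     /* collect x.z */
    PetscFunctionReturn(0);
  }

If two threads run a sequence like this concurrently on vectors that share a communicator, their Begin calls interleave in the one shared queue and each thread's End calls no longer match it, which is exactly the "Called VecxxxEnd() in a different order or with a different vector than VecxxxBegin()" failure reported below.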
"Dener, Alp via petsc-users" <petsc-users@mcs.anl.gov> writes: > Hi Krys, > > On Dec 20, 2018, at 10:59 AM, Krzysztof Kamieniecki via petsc-users > <petsc-users@mcs.anl.gov<mailto:petsc-users@mcs.anl.gov>> wrote: > > That example seems to have critical sections around certain Vec calls, and it > looks like my problem occurs in VecDotBegin/VecDotEnd which is called by > TAO/BLMVM. > > The quasi-Newton matrix objects in BLMVM have asynchronous dot products in > the matrix-free forward and inverse product formulations. This is a > relatively recent performance optimization. If avoiding this split phase > communication would solve the problem, and you don’t need other recent PETSc > features, you could revert to 3.9 and use the old version of BLMVM that will > use straight VecDot operations instead. > > Unfortunately I don’t know enough about multithreading to definitively say > whether that will actually solve the problem or not. Other members of the > community can probably provide a more complete answer on that. > > > I assume PetscSplitReductionGet is pulling the PetscSplitReduction for > PETSC_COMM_SELF which is shared across the whole process? > > I tried PetscCommDuplicate/PetscCommDestroy but that does not seem to help. > > PetscErrorCode VecDotBegin(Vec x,Vec y,PetscScalar *result) > { > PetscErrorCode ierr; > PetscSplitReduction *sr; > MPI_Comm comm; > > PetscFunctionBegin; > PetscValidHeaderSpecific(x,VEC_CLASSID,1); > PetscValidHeaderSpecific(y,VEC_CLASSID,1); > ierr = PetscObjectGetComm((PetscObject)x,&comm);CHKERRQ(ierr); > ierr = PetscSplitReductionGet(comm,&sr);CHKERRQ(ierr); > if (sr->state != STATE_BEGIN) > SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ORDER,"Called before all VecxxxEnd() > called"); > if (sr->numopsbegin >= sr->maxops) { > ierr = PetscSplitReductionExtend(sr);CHKERRQ(ierr); > } > sr->reducetype[sr->numopsbegin] = PETSC_SR_REDUCE_SUM; > sr->invecs[sr->numopsbegin] = (void*)x; > if (!x->ops->dot_local) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_SUP,"Vector does > not suppport local dots"); > ierr = PetscLogEventBegin(VEC_ReduceArithmetic,0,0,0,0);CHKERRQ(ierr); > ierr = > (*x->ops->dot_local)(x,y,sr->lvalues+sr->numopsbegin++);CHKERRQ(ierr); > ierr = PetscLogEventEnd(VEC_ReduceArithmetic,0,0,0,0);CHKERRQ(ierr); > PetscFunctionReturn(0); > } > > > > > On Thu, Dec 20, 2018 at 11:26 AM Smith, Barry F. > <bsm...@mcs.anl.gov<mailto:bsm...@mcs.anl.gov>> wrote: > > The code src/ksp/ksp/examples/tutorials/ex61f.F90 demonstrates working > with multiple threads each managing their own collection of PETSc objects. > Hope this helps. > > Barry > > >> On Dec 20, 2018, at 9:28 AM, Krzysztof Kamieniecki via petsc-users >> <petsc-users@mcs.anl.gov<mailto:petsc-users@mcs.anl.gov>> wrote: >> >> Hello All, >> >> I have an embarrassingly parallel problem that I would like to use TAO on, >> is there some way to do this with threads as opposed to multiple processes? >> >> I compiled PETSc with the following flags >> ./configure \ >> --prefix=${DEP_INSTALL_DIR} \ >> --with-threadsafety --with-log=0 --download-concurrencykit \ >> --with-openblas=1 \ >> --with-openblas-dir=${DEP_INSTALL_DIR} \ >> --with-mpi=0 \ >> --with-shared=0 \ >> --with-debugging=0 COPTFLAGS='-O3' CXXOPTFLAGS='-O3' FOPTFLAGS='-O3' >> >> When I run TAO in multiple threads I get the error "Called VecxxxEnd() in a >> different order or with a different vector than VecxxxBegin()" >> >> Thanks, >> Krys >> > > > — > Alp