This isn't thread-safe, but in a "simple" way (if these things ever are
simple), because the operation could complete eagerly.
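
To make the hazard concrete, here is a toy sketch (illustrative only, not
code from PETSc or from this thread; it assumes a --with-threadsafety
--with-mpi=0 build like the configure line quoted further down, and error
checking is omitted for brevity). Each thread owns its own vector, but both
vectors live on PETSC_COMM_SELF and therefore share one PetscSplitReduction,
so interleaved VecDotBegin/VecDotEnd calls from the two threads can trip the
"different order or with a different vector" check, or corrupt the shared
counters outright.

  #include <petscvec.h>
  #include <pthread.h>

  /* Hypothetical helper for illustration: repeatedly run a split-phase
     dot product on this thread's own vector. */
  static void *thread_dot(void *arg)
  {
    Vec         x = (Vec)arg;
    PetscScalar dot;

    for (int i = 0; i < 1000; i++) {
      VecDotBegin(x, x, &dot);  /* queues x in the communicator's shared state */
      /* if the other thread's Begin/End lands here, the shared queue no
         longer matches what this thread's VecDotEnd expects */
      VecDotEnd(x, x, &dot);
    }
    return NULL;
  }

  int main(int argc, char **argv)
  {
    pthread_t t[2];
    Vec       x[2];

    PetscInitialize(&argc, &argv, NULL, NULL);
    for (int i = 0; i < 2; i++) {
      VecCreateSeq(PETSC_COMM_SELF, 100, &x[i]);  /* per-thread vector */
      VecSet(x[i], 1.0);
      pthread_create(&t[i], NULL, thread_dot, (void*)x[i]);
    }
    for (int i = 0; i < 2; i++) pthread_join(t[i], NULL);
    for (int i = 0; i < 2; i++) VecDestroy(&x[i]);
    PetscFinalize();
    return 0;
  }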

"Dener, Alp via petsc-users" <petsc-users@mcs.anl.gov> writes:

> Hi Krys,
>
> On Dec 20, 2018, at 10:59 AM, Krzysztof Kamieniecki via petsc-users 
> <petsc-users@mcs.anl.gov> wrote:
>
> That example seems to have critical sections around certain Vec calls, and it 
> looks like my problem occurs in VecDotBegin/VecDotEnd, which are called by 
> TAO/BLMVM.
>
> The quasi-Newton matrix objects in BLMVM have asynchronous dot products in 
> the matrix-free forward and inverse product formulations. This is a 
> relatively recent performance optimization. If avoiding this split-phase 
> communication would solve the problem, and you don’t need other recent PETSc 
> features, you could revert to 3.9 and use the old version of BLMVM that will 
> use straight VecDot operations instead.
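
For illustration, a minimal sketch of the difference (not BLMVM's actual
code; the helper names here are made up): the split-phase form queues
several local dots on the communicator and completes them with a single
reduction, which is where the shared state between Begin and End comes
from; the 3.9-era blocking form pays one reduction per dot but leaves no
state on the communicator in between.

  #include <petscvec.h>

  /* split-phase: both dots are queued, one communication finishes both */
  static PetscErrorCode two_dots_split(Vec x, Vec y, Vec z, PetscScalar *a, PetscScalar *b)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = VecDotBegin(x, y, a);CHKERRQ(ierr);
    ierr = VecDotBegin(x, z, b);CHKERRQ(ierr);
    ierr = VecDotEnd(x, y, a);CHKERRQ(ierr);   /* Ends must match the Begin order */
    ierr = VecDotEnd(x, z, b);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

  /* blocking: one reduction each, nothing left pending on the communicator */
  static PetscErrorCode two_dots_blocking(Vec x, Vec y, Vec z, PetscScalar *a, PetscScalar *b)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = VecDot(x, y, a);CHKERRQ(ierr);
    ierr = VecDot(x, z, b);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }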
>
> Unfortunately I don’t know enough about multithreading to definitively say 
> whether that will actually solve the problem or not. Other members of the 
> community can probably provide a more complete answer on that.
>
>
> I assume PetscSplitReductionGet is pulling the PetscSplitReduction for 
> PETSC_COMM_SELF which is shared across the whole process?
>
> I tried PetscCommDuplicate/PetscCommDestroy but that does not seem to help.
>
> PetscErrorCode  VecDotBegin(Vec x,Vec y,PetscScalar *result)
> {
>   PetscErrorCode      ierr;
>   PetscSplitReduction *sr;
>   MPI_Comm            comm;
>
>   PetscFunctionBegin;
>   PetscValidHeaderSpecific(x,VEC_CLASSID,1);
>   PetscValidHeaderSpecific(y,VEC_CLASSID,1);
>   ierr = PetscObjectGetComm((PetscObject)x,&comm);CHKERRQ(ierr);
>   ierr = PetscSplitReductionGet(comm,&sr);CHKERRQ(ierr);
>   if (sr->state != STATE_BEGIN) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ORDER,"Called before all VecxxxEnd() called");
>   if (sr->numopsbegin >= sr->maxops) {
>     ierr = PetscSplitReductionExtend(sr);CHKERRQ(ierr);
>   }
>   sr->reducetype[sr->numopsbegin] = PETSC_SR_REDUCE_SUM;
>   sr->invecs[sr->numopsbegin]     = (void*)x;
>   if (!x->ops->dot_local) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_SUP,"Vector does not suppport local dots");
>   ierr = PetscLogEventBegin(VEC_ReduceArithmetic,0,0,0,0);CHKERRQ(ierr);
>   ierr = (*x->ops->dot_local)(x,y,sr->lvalues+sr->numopsbegin++);CHKERRQ(ierr);
>   ierr = PetscLogEventEnd(VEC_ReduceArithmetic,0,0,0,0);CHKERRQ(ierr);
>   PetscFunctionReturn(0);
> }
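
That assumption looks right: PetscSplitReductionGet caches the
split-reduction state as an attribute on the communicator itself, so every
vector living on PETSC_COMM_SELF in the process, from any thread, gets the
same object. That would also explain why PetscCommDuplicate didn't help:
it returns the inner communicator already attached to PETSC_COMM_SELF,
which is again one per process. A rough paraphrase of the lookup (from
memory, not the exact source; identifier names may differ):

  /* sketch only -- paraphrased, not the exact PETSc source */
  PetscErrorCode PetscSplitReductionGet_Sketch(MPI_Comm comm, PetscSplitReduction **sr)
  {
    PetscErrorCode ierr;
    PetscMPIInt    flag;

    PetscFunctionBegin;
    /* look for split-reduction state already attached to this communicator */
    ierr = MPI_Comm_get_attr(comm, Petsc_Reduction_keyval, (void**)sr, &flag);CHKERRQ(ierr);
    if (!flag) {
      /* first use on this communicator: create the state and attach it, so
         every later caller on the same communicator gets this same object */
      ierr = PetscSplitReductionCreate(comm, sr);CHKERRQ(ierr);
      ierr = MPI_Comm_set_attr(comm, Petsc_Reduction_keyval, *sr);CHKERRQ(ierr);
    }
    PetscFunctionReturn(0);
  }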
>
>
>
>
> On Thu, Dec 20, 2018 at 11:26 AM Smith, Barry F. 
> <bsm...@mcs.anl.gov> wrote:
>
>    The code src/ksp/ksp/examples/tutorials/ex61f.F90 demonstrates working 
> with multiple threads each managing their own collection of PETSc objects. 
> Hope this helps.
>
>     Barry
>
>
>> On Dec 20, 2018, at 9:28 AM, Krzysztof Kamieniecki via petsc-users 
>> <petsc-users@mcs.anl.gov> wrote:
>>
>> Hello All,
>>
>> I have an embarrassingly parallel problem that I would like to use TAO on; 
>> is there some way to do this with threads as opposed to multiple processes?
>>
>> I compiled PETSc with the following flags:
>> ./configure \
>> --prefix=${DEP_INSTALL_DIR} \
>> --with-threadsafety --with-log=0 --download-concurrencykit \
>> --with-openblas=1 \
>> --with-openblas-dir=${DEP_INSTALL_DIR} \
>> --with-mpi=0 \
>> --with-shared=0 \
>> --with-debugging=0 COPTFLAGS='-O3' CXXOPTFLAGS='-O3' FOPTFLAGS='-O3'
>>
>> When I run TAO in multiple threads I get the error "Called VecxxxEnd() in a 
>> different order or with a different vector than VecxxxBegin()"
>>
>> Thanks,
>> Krys
>>
>
>
> —
> Alp
