> On Mar 23, 2017, at 6:05 PM, Jed Brown <[email protected]> wrote:
> 
> Barry Smith <[email protected]> writes:
> 
>>  Wim,
>> 
>>    VecDotBegin/End() work by accumulating the partial values in a data 
>> structure associated with the MPI communicator until a 
>> PetscCommSplitReductionBegin() (or a VecXXXEnd()) is seen. Thus in the 
>> current model only a single collection of reductions can be outstanding at 
>> the same time. 
>> 
>>   For your needs we will need to extend the functionality so there can be 
>> multiple independent sets of outstanding reductions. 
> 
> Instead of this integer, I would prefer to change
> PetscSplitReductionGet() to give a request object that can be completed.
> If it is necessary to be able to start a new norm or dot product with
> the same arguments before completing the last, then
> 
>  VecNormBegin(X,&request);
>  VecNormEnd(X,request,&nrm);
> 
> The request above could be a pointer or an integer.

   Jed, how would you handle the chaining of several reductions into a single 
MPI communication? I don't think that would work; you'd need a wider API, for example

    VecNormBegin(X,&request);
    VecNormBeginWithRequest(Y,request);
    VecNormEnd(X,request,&nrm);
    VecNormEnd(Y,request,&nrm2);

   Ugly.

   Less ugly, you could have something like

    PetscSplitReductionGetRequest(MPI_Comm,&request);
    VecNormBegin(X,request);
    VecNormBegin(Y,request);
    VecNormEnd(X,request,&nrm);
    VecNormEnd(Y,request,&nrm2);
    PetscSplitReductionRestoreRequest(MPI_Comm,&request);
   
  and to interleave multiple reductions 

    PetscSplitReductionGetRequest(MPI_Comm,&request);
    VecNormBegin(X,request);
    VecNormBegin(Y,request);

    PetscSplitReductionGetRequest(MPI_Comm,&request2);
    VecNormBegin(Z,request2);

    VecNormEnd(X,request,&nrm);
    VecNormEnd(Y,request,&nrm2);
    PetscSplitReductionRestoreRequest(MPI_Comm,&request);
    ....
    VecNormEnd(Z,request2,&nrm3);
    PetscSplitReductionRestoreRequest(MPI_Comm,&request2);

   This is like my "integer" model except, as I said initially, we "hoist" the 
PetscSplitReduction object (the request) into visible space. Same functionality 
(using integer or hoisted object), just a different style. You could argue the 
hoisted style is more PETSc-like and we should use it.

   Note that with the hoisted model one would not need to attach the split 
information to the MPI_Comm as is currently done, but if the begin and end are 
in different routines one must carry the hoisted request variable around to get 
it to the correct final location. Of course, tracking the integer around is a 
bit too much like hardwiring integer values for MPI tags (very dangerous), so 
hoisting is the way to go?

   Barry

> 
>>   Jed will likely have better ideas on this, but the simplest extension I can see 
>> is to add an additional integer argument to each call that indicates the sub 
>> collection of reductions. Thus something like
>> 
>> ierr = VecDotBegin(R,U,&gamma,0); CHKERRQ(ierr);
>> 
>> ierr = KSP_MatMult(ksp,Amat, ..., ... ); CHKERRQ(ierr);
>> 
>> ierr = VecDotBegin(W,V,&delta,1); CHKERRQ(ierr);
>> 
>> ierr = KSP_MatMult(ksp,Amat,M,N); CHKERRQ(ierr);
>> 
>> ierr = VecDotEnd(R,U,&gamma,0); CHKERRQ(ierr);
>> ierr = VecDotBegin(X,Y,&psi,2); CHKERRQ(ierr);
>> .... 
>> 
>> ierr = VecDotEnd(W,V,&delta,1); CHKERRQ(ierr);
>> ierr = VecDotEnd(X,Y,&psi,2); CHKERRQ(ierr);
>> 
>> The integer would be used internally by the routines to access different 
>> PetscSplitReduction objects associated with the MPI_Comm. In user code once 
>> you have completely Ended an operation with a particular integer you can 
>> recycle the integer and use it again for a new set.
>> 
>> An alternative to using integers is to hoist the PetscSplitReduction up to 
>> be visible to the calling code thus allowing multiple ones associated with 
>> different sets of reductions. This approach would result in a larger change 
>> to the public API so I would only do it if there is a fatal flaw in the 
>> integer approach.
>> 
>>  Jed, how do you suggest solving this ?
>> 
>>  Barry
>> 
>> 
>> 
>> 
>>> On Mar 23, 2017, at 9:41 AM, Wim Vanroose <[email protected]> wrote:
>>> 
>>> Dear  Petsc-Dev, 
>>> 
>>> Over the last few years we have contributed several pipelined Krylov 
>>> solvers, such as KSPPIPECG and, most recently, pipelined bicgstab 
>>> (pipebcgs). 
>>> These make use of asynchronous global reductions using VecDotBegin and 
>>> VecDotEnd to overlap the calculation of a dot product with the matrix 
>>> vector product. 
>>> Experiments by various authors show that these methods can offer better 
>>> scaling in the extremely large system limit. 
>>> 
>>> We are now trying to introduce Krylov methods with longer pipelines, such 
>>> that the dot product can take multiple matrix-vector products to complete. 
>>> 
>>> Below is a sketch. After the first SpMV we would like to start a 
>>> VecDotBegin that would only complete two or more SpMVs later. 
>>> After each SpMV we would start such a global reduction. 
>>> <out (1).png>
>>> While trying to implement a length-l version of pipelined CG in PETSc, we 
>>> ran into trouble with the following type of construction, 
>>> which is representative of the problem above. Let R, U, V, W, X and Y be 
>>> KSP work vectors, and let gamma, delta and psi be PetscScalars:
>>> 
>>> ierr = VecDotBegin(R,U,&gamma); CHKERRQ(ierr);
>>> 
>>> ierr = KSP_MatMult(ksp,Amat, ..., ... ); CHKERRQ(ierr);
>>> 
>>> ierr = VecDotBegin(W,V,&delta); CHKERRQ(ierr);
>>> 
>>> ierr = KSP_MatMult(ksp,Amat,M,N); CHKERRQ(ierr);
>>> 
>>> ierr = VecDotEnd(R,U,&gamma); CHKERRQ(ierr);
>>> ierr = VecDotBegin(X,Y,&psi); CHKERRQ(ierr);
>>> .... 
>>> 
>>> ierr = VecDotEnd(W,V,&delta); CHKERRQ(ierr);
>>> ierr = VecDotEnd(X,Y,&psi); CHKERRQ(ierr);
>>> 
>>> Maybe this is a trivial remark, but it appears that it is not possible to 
>>> put a new VecDotBegin (line 7) in between two VecDotEnd's (lines 6 and 8). 
>>> Do you have any ideas on why this can't be done (is it intrinsic to 
>>> VecDotBegin?), and whether a work-around for this issue is available?
>>> 
>>> Are there other methods in PETSc that we should use? Or are 
>>> VecDotBegin and VecDotEnd not designed to be used in this way?
>>> 
>>> Thanks a lot for the input,
>>>
