Dear All,

We have a working prototype of pipe(l) CG in PETSc, in which dot products take multiple iterations to complete. Due to the limitations of VecDotBegin, we had to use MPI_Iallreduce and MPI_Wait directly. A high-level overview of the communication is given in the figure. The preprint of the paper is at https://arxiv.org/abs/1801.04728

How should we proceed? Can we contribute this routine to KSP while it still uses primitive MPI calls? Or should we work with petsc-dev to see whether VecDotBegin and VecDotEnd can be redesigned to handle these cases, and then rewrite the prototype with the new calls? Can we talk about this at SIAM PP18?

Wim Vanroose
