Barry,
   Thanks for the illustration. Is there an easy way to mimic the
implementation using a shell matrix? I have been studying how the sMvctx
scatter is created and it seems pretty involved.
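Roughly, the scaffolding I have in mind is sketched below; ShellCtx and
ShellMult are just placeholder names, and building the combined scatter
that plays the role of sMvctx is exactly the part that looks involved:

  #include <petscmat.h>

  /* Placeholder context for a shell matrix that would mimic
     MatMult_MPISBAIJ (names are illustrative only). */
  typedef struct {
    Mat        Ad;     /* local diagonal block, stored symmetric             */
    Mat        Bo;     /* local upper off-diagonal block                     */
    VecScatter scat;   /* single combined scatter, the analogue of a->sMvctx */
    Vec        slvec0, slvec1; /* packed send / receive work vectors         */
  } ShellCtx;

  static PetscErrorCode ShellMult(Mat S, Vec x, Vec y)
  {
    ShellCtx       *ctx;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatShellGetContext(S, &ctx);CHKERRQ(ierr);
    /* 1. apply the local symmetric diagonal block (no communication)        */
    /* 2. apply the transpose of the off-diagonal block to the local x       */
    /* 3. pack x and that result into slvec0, do one scatter with ADD_VALUES */
    /* 4. add the off-diagonal block applied to the received ghost values    */
    /* ... the concrete calls would mirror MatMult_MPISBAIJ quoted below ... */
    PetscFunctionReturn(0);
  }

  static PetscErrorCode CreateShellOperator(MPI_Comm comm, PetscInt m,
                                            PetscInt M, ShellCtx *ctx, Mat *S)
  {
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatCreateShell(comm, m, m, M, M, ctx, S);CHKERRQ(ierr);
    ierr = MatShellSetOperation(*S, MATOP_MULT, (void (*)(void))ShellMult);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

Registering MATOP_MULT this way is straightforward; the open question is
how to build the scat/slvec pieces, which is what the sMvctx setup does
internally.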
Thanks,
Sam

On Mon, Mar 21, 2022 at 2:48 PM Barry Smith <[email protected]> wrote:

> On Mar 21, 2022, at 4:36 PM, Sam Guo <[email protected]> wrote:
>
> Barry,
>    Thanks. Could you elaborate? I try to implement the matrix-vector
> multiplication for a symmetric matrix using a shell matrix.
>
>
>   Consider, with three ranks,
>
>      (a) = ( A  B  D ) (x)
>      (b)   ( B' C  E ) (y)
>      (c)   ( D' E' F ) (w)
>
>   Only the blocks without the ' are stored on the rank, so for example
> B is stored on rank 0.
>
>   Rank 0 computes A x and keeps it in a. Rank 1 computes C y and keeps
> it in b. Rank 2 computes F w and keeps it in c.
>
>   Rank 0 computes B'x and D'x. It puts the nonzero entries of these
> values, as well as the values of x, into slvec0.
>
>   Rank 1 computes E'y and puts the nonzero entries, as well as the
> values of y, into slvec0.
>
>   Rank 2 puts the values of w needed by the other ranks into slvec0.
>
>   Rank 0 does B y_h + D w_h, where it gets the y_h and w_h values from
> slvec1, and adds the result to a.
>
>   Rank 1 takes the B'x from slvec1 and adds it to b; it then takes the
> E w_h values, where the w_h are pulled from slvec1, and adds them to b.
>
>   Rank 2 takes the D'x and E'y from slvec1 and adds them to c.
>
>
> Thanks,
> Sam
>
> On Mon, Mar 21, 2022 at 12:56 PM Barry Smith <[email protected]> wrote:
>
>>    The "trick" is that though "more" communication is needed to
>> complete the product, the communication can still be done in a single
>> VecScatter instead of two separate calls to VecScatter. We simply pack
>> both pieces of information that need to be sent into a single vector.
>>
>>   /* copy x into the vec slvec0 */
>>   VecGetArray(a->slvec0,&from);
>>   VecGetArrayRead(xx,&x);
>>   PetscArraycpy(from,x,bs*mbs);
>>   VecRestoreArray(a->slvec0,&from);
>>   VecRestoreArrayRead(xx,&x);
>>
>>   VecScatterBegin(a->sMvctx,a->slvec0,a->slvec1,ADD_VALUES,SCATTER_FORWARD);
>>   VecScatterEnd(a->sMvctx,a->slvec0,a->slvec1,ADD_VALUES,SCATTER_FORWARD);
>>
>>    If you create two symmetric matrices, one with SBAIJ and one with
>> BAIJ, and compare the time to do the product, you will find that the
>> SBAIJ is not significantly slower but does save memory.
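To make the packed vector concrete, here is a minimal sketch of two
logical pieces of data (the local x values and the locally computed B'x
contributions) sharing one buffer, so that a single VecScatter on the
full vector moves both. The sizes and names are made up for illustration,
and this is not copied from the PETSc sources:

  #include <petscvec.h>

  int main(int argc, char **argv)
  {
    PetscErrorCode ierr;
    PetscInt       nx = 6, nb = 3; /* local length of x, number of B'x entries to send */
    PetscScalar    *buf;
    Vec            packed, xpart, bpart;

    ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
    ierr = PetscMalloc1(nx + nb, &buf);CHKERRQ(ierr);

    /* one vector over the whole buffer: this is what the single scatter acts on */
    ierr = VecCreateSeqWithArray(PETSC_COMM_SELF, 1, nx + nb, buf, &packed);CHKERRQ(ierr);

    /* two "views" into the same storage: fill xpart with x and bpart with B'x,
       then one VecScatterBegin/End on packed (ADD_VALUES, SCATTER_FORWARD)
       delivers both pieces of information in a single communication round */
    ierr = VecCreateSeqWithArray(PETSC_COMM_SELF, 1, nx, buf, &xpart);CHKERRQ(ierr);
    ierr = VecCreateSeqWithArray(PETSC_COMM_SELF, 1, nb, buf + nx, &bpart);CHKERRQ(ierr);

    ierr = VecDestroy(&xpart);CHKERRQ(ierr);
    ierr = VecDestroy(&bpart);CHKERRQ(ierr);
    ierr = VecDestroy(&packed);CHKERRQ(ierr);
    ierr = PetscFree(buf);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return ierr;
  }

In MatMult_MPISBAIJ (quoted further down), slvec0/slvec0b and
slvec1/slvec1a/slvec1b play these roles, and sMvctx is the combined
scatter.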
>> On Mar 21, 2022, at 3:26 PM, Sam Guo <[email protected]> wrote:
>>
>> Using the following example from the MatCreateSBAIJ documentation:
>>
>>            0 1 2 3 4 5 6 7 8 9 10 11
>>           --------------------------
>>    row 3  |. . . d d d o o o o o o
>>    row 4  |. . . d d d o o o o o o
>>    row 5  |. . . d d d o o o o o o
>>           --------------------------
>>
>> On a processor that owns rows 3, 4 and 5, the information from rows
>> 0-2 is still needed. Is it that the processor that owns rows 0-2 will
>> apply B symmetrically and then send the result to the processor that
>> owns rows 3-5?
>>
>> On Mon, Mar 21, 2022 at 12:14 PM Mark Adams <[email protected]> wrote:
>>
>>> PETSc stores parallel matrices as two serial matrices: one for the
>>> diagonal (d or A) block and one for the rest (o or B).
>>> I would guess that for symmetric matrices it has a symmetric matrix
>>> for the diagonal block and a full AIJ matrix for the (upper)
>>> off-diagonal block. So the multtranspose is applying B symmetrically.
>>> The lower off-diagonal part and the diagonal block can be done
>>> without communication. Then the off-processor values are collected
>>> and the upper off-diagonal part is applied.
>>>
>>> On Mon, Mar 21, 2022 at 2:35 PM Sam Guo <[email protected]> wrote:
>>>
>>>> I am most interested in how the lower triangular part is
>>>> redistributed. It seems that SBAIJ saves memory but requires more
>>>> communication than BAIJ.
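For the AIJ analogue of the A/B split Mark describes, the two locally
stored serial blocks can be pulled out and inspected directly; below is
a small sketch (it uses MatMPIAIJGetSeqAIJ, which applies to an MPIAIJ
matrix; I am not aware of an equivalent public accessor for the SBAIJ
case, so this only illustrates the storage split):

  #include <petscmat.h>

  /* Peek at the two serial pieces of an MPIAIJ matrix: Ad is the
     "diagonal" block (the columns owned by this rank), Ao holds the
     remaining "off-diagonal" columns in compressed form. */
  static PetscErrorCode InspectBlocks(Mat A)
  {
    Mat            Ad, Ao;
    const PetscInt *colmap;
    PetscInt       m, n;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatMPIAIJGetSeqAIJ(A, &Ad, &Ao, &colmap);CHKERRQ(ierr);
    ierr = MatGetSize(Ad, &m, &n);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_SELF, "diagonal block:     %d x %d\n", (int)m, (int)n);CHKERRQ(ierr);
    ierr = MatGetSize(Ao, &m, &n);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_SELF, "off-diagonal block: %d x %d\n", (int)m, (int)n);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

For SBAIJ the analogous internal pieces are a->A and a->B in the
MatMult_MPISBAIJ code quoted below.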
>>>> On Mon, Mar 21, 2022 at 11:27 AM Sam Guo <[email protected]> wrote:
>>>>
>>>>> Mark, thanks for the quick response. I am more interested in the
>>>>> parallel implementation of MatMult for SBAIJ. I found the following:
>>>>>
>>>>> PetscErrorCode MatMult_MPISBAIJ(Mat A,Vec xx,Vec yy)
>>>>> {
>>>>>   Mat_MPISBAIJ      *a = (Mat_MPISBAIJ*)A->data;
>>>>>   PetscErrorCode    ierr;
>>>>>   PetscInt          mbs=a->mbs,bs=A->rmap->bs;
>>>>>   PetscScalar       *from;
>>>>>   const PetscScalar *x;
>>>>>
>>>>>   /* diagonal part */
>>>>>   (*a->A->ops->mult)(a->A,xx,a->slvec1a);
>>>>>   VecSet(a->slvec1b,0.0);
>>>>>
>>>>>   /* subdiagonal part */
>>>>>   (*a->B->ops->multtranspose)(a->B,xx,a->slvec0b);
>>>>>
>>>>>   /* copy x into the vec slvec0 */
>>>>>   VecGetArray(a->slvec0,&from);
>>>>>   VecGetArrayRead(xx,&x);
>>>>>   PetscArraycpy(from,x,bs*mbs);
>>>>>   VecRestoreArray(a->slvec0,&from);
>>>>>   VecRestoreArrayRead(xx,&x);
>>>>>
>>>>>   VecScatterBegin(a->sMvctx,a->slvec0,a->slvec1,ADD_VALUES,SCATTER_FORWARD);
>>>>>   VecScatterEnd(a->sMvctx,a->slvec0,a->slvec1,ADD_VALUES,SCATTER_FORWARD);
>>>>>
>>>>>   /* superdiagonal part */
>>>>>   (*a->B->ops->multadd)(a->B,a->slvec1b,a->slvec1a,yy);
>>>>>   return(0);
>>>>> }
>>>>>
>>>>> I am trying to understand the algorithm.
>>>>>
>>>>> Thanks,
>>>>> Sam
>>>>>
>>>>> On Mon, Mar 21, 2022 at 11:14 AM Mark Adams <[email protected]> wrote:
>>>>>
>>>>>> This code looks fine to me, and the code is in
>>>>>> src/mat/impls/sbaij/seq/sbaij2.c.
>>>>>>
>>>>>> On Mon, Mar 21, 2022 at 2:02 PM Sam Guo <[email protected]> wrote:
>>>>>>
>>>>>>> Dear PETSc dev team,
>>>>>>>    The documentation for MatCreateSBAIJ has the following:
>>>>>>> "It is recommended that one use the MatCreate(), MatSetType()
>>>>>>> and/or MatSetFromOptions(), MatXXXXSetPreallocation() paradigm
>>>>>>> instead of this routine directly. [MatXXXXSetPreallocation() is,
>>>>>>> for example, MatSeqAIJSetPreallocation]"
>>>>>>>    I currently call MatCreateSBAIJ directly as follows:
>>>>>>>
>>>>>>>   MatCreateSBAIJ (with d_nnz and o_nnz)
>>>>>>>   MatSetValues (to add values row by row)
>>>>>>>   MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
>>>>>>>   MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);
>>>>>>>   MatSetOption(A, MAT_SYMMETRIC, PETSC_TRUE);
>>>>>>>
>>>>>>> Two questions:
>>>>>>> (1) I am wondering whether what I am doing is the most efficient.
>>>>>>> (2) I am trying to find out how the matrix-vector multiplication
>>>>>>>     is implemented in PETSc for SBAIJ storage.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sam
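Regarding the paradigm recommended in that documentation excerpt, here is
a sketch of what the MatCreate()/MatSetType()/MatXXXXSetPreallocation()
sequence looks like for SBAIJ (bs, m, M, d_nnz and o_nnz are placeholders
supplied by the caller; calling both the Seq and MPI preallocation
routines is the usual pattern, since only the one matching the actual
type takes effect):

  #include <petscmat.h>

  /* Create an SBAIJ matrix with the MatCreate/MatSetType/preallocation
     paradigm instead of MatCreateSBAIJ(). */
  static PetscErrorCode CreateSymmetricMatrix(MPI_Comm comm, PetscInt bs,
                                              PetscInt m, PetscInt M,
                                              const PetscInt d_nnz[],
                                              const PetscInt o_nnz[], Mat *A)
  {
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatCreate(comm, A);CHKERRQ(ierr);
    ierr = MatSetSizes(*A, m, m, M, M);CHKERRQ(ierr);
    ierr = MatSetType(*A, MATSBAIJ);CHKERRQ(ierr);
    ierr = MatSetFromOptions(*A);CHKERRQ(ierr); /* allows -mat_type overrides */
    ierr = MatSeqSBAIJSetPreallocation(*A, bs, 0, d_nnz);CHKERRQ(ierr);
    ierr = MatMPISBAIJSetPreallocation(*A, bs, 0, d_nnz, 0, o_nnz);CHKERRQ(ierr);
    ierr = MatSetOption(*A, MAT_SYMMETRIC, PETSC_TRUE);CHKERRQ(ierr);
    /* then MatSetValues() row by row and MatAssemblyBegin/End(), as before */
    PetscFunctionReturn(0);
  }

One practical gain over calling MatCreateSBAIJ() directly is that the
type can then be changed from the options database, which, together with
matching preallocation calls, makes the SBAIJ-versus-BAIJ timing
comparison Barry suggests easy to try.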
