I think this will be useful for SLEPc as it is, but I cannot test it yet because some changes are required in SLEPc. I will try to find time to implement them in the coming days.

Jose
> On 12 May 2020, at 17:28, Pierre Jolivet <pierre.joli...@enseeiht.fr> wrote:
>
> MatShellSetMatProductOperation looks really nice to me, thanks!
> Pierre
>
>> On 12 May 2020, at 12:13 PM, Stefano Zampini <stefano.zamp...@gmail.com> wrote:
>>
>> Pierre and Jose,
>>
>> I have added support for MatMat callback operations for MATSHELL; you may want to take a look here for how to use it:
>> https://gitlab.com/petsc/petsc/-/merge_requests/2712/diffs?commit_id=7f809910e2bafe055242a87d70afd114664ffaf8
>> This is the relevant commit:
>> https://gitlab.com/petsc/petsc/-/merge_requests/2712/diffs?commit_id=e01f573d2ca0ec07c54db508a94a042fba4038de
>>
>> Let me know if you need more customization (e.g. attaching data to the product in a more systematic way) or if it already fits your frameworks.
>>
>> Best,
>> Stefano
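For reference, a minimal sketch of how registering such a callback might look, based on the interface in the linked commit (the *_User callback names are made up here; the exact signatures should be taken from the MR itself):

  #include <petscmat.h>

  /* symbolic phase: set up any scratch data for the product C = S*B */
  static PetscErrorCode MatMatSymbolic_User(Mat S,Mat B,Mat C,void **ctx)
  {
    PetscFunctionBegin;
    *ctx = NULL; /* nothing to set up in this sketch */
    PetscFunctionReturn(0);
  }

  /* numeric phase: fill the dense matrix C, e.g. by applying the
     shell's MatMult to each column of B */
  static PetscErrorCode MatMatNumeric_User(Mat S,Mat B,Mat C,void *ctx)
  {
    PetscFunctionBegin;
    PetscFunctionReturn(0);
  }

  /* destroy phase: free whatever the symbolic phase allocated */
  static PetscErrorCode MatMatDestroy_User(void *ctx)
  {
    PetscFunctionBegin;
    PetscFunctionReturn(0);
  }

  /* registration, in main() after MatCreateShell(...,&S): handle
     C = S*B when B and C are sequential dense CUDA matrices */
  ierr = MatShellSetMatProductOperation(S,MATPRODUCT_AB,
           MatMatSymbolic_User,MatMatNumeric_User,MatMatDestroy_User,
           MATSEQDENSECUDA,MATSEQDENSECUDA);CHKERRQ(ierr);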
>> On Sun, 10 May 2020 at 21:04, Stefano Zampini <stefano.zamp...@gmail.com> wrote:
>>
>>> On May 10, 2020, at 8:56 PM, Jose E. Roman <jro...@dsic.upv.es> wrote:
>>>
>>>> On 10 May 2020, at 19:42, Stefano Zampini <stefano.zamp...@gmail.com> wrote:
>>>>
>>>> Glad to hear it works. Anyway, without the MatShellSetVecType call the code was erroring for me, not leaking memory.
>>>> Were you also providing -vec_type cuda on the command line, or what?
>>>> Mark recently noted a similar leak, and I was wondering what the cause was in your case. A MWE would be great.
>>>
>>> I never use -vec_type cuda, because all SLEPc tests that use CUDA create vectors with MatCreateVecs(). Until now I had never tested examples with shell matrices.
>>>
>>> This kind of issue would be easier to detect if macros such as PetscCheckSameTypeAndComm() actually compared type_name.
>>
>> The VECCUDA-specific code checks for type names (either VECSEQCUDA or VECMPICUDA), and these errored in my case. (Try it by simply removing MatShellSetVecType:)
>>
>> zampins@jasmine:~/petsc/src/mat/tests$ git diff
>> diff --git a/src/mat/tests/ex69.c b/src/mat/tests/ex69.c
>> index b04652d..2a9374d 100644
>> --- a/src/mat/tests/ex69.c
>> +++ b/src/mat/tests/ex69.c
>> @@ -89,7 +89,7 @@ int main(int argc,char **argv)
>>    if (use_shell) {
>>      ierr = MatCreateShell(PetscObjectComm((PetscObject)v),nloc,nloc,n,n,A,&S);CHKERRQ(ierr);
>>      ierr = MatShellSetOperation(S,MATOP_MULT,(void(*)(void))MatMult_S);CHKERRQ(ierr);
>> -    ierr = MatShellSetVecType(S,vtype);CHKERRQ(ierr);
>> +    //ierr = MatShellSetVecType(S,vtype);CHKERRQ(ierr);
>>      /* we could have called the general convertor also */
>>      /* ierr = MatConvert(A,MATSHELL,MAT_INITIAL_MATRIX,&S);CHKERRQ(ierr); */
>>    } else {
>> zampins@jasmine:~/petsc/src/mat/tests$ ./ex69 -use_shell
>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [0]PETSC ERROR: Invalid argument
>> [0]PETSC ERROR: Object (seq) is not seqcuda or mpicuda
>> [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.13-213-g441560f  GIT Date: 2020-04-25 17:01:13 +0300
>> [0]PETSC ERROR: ./ex69 on a arch-gpu-double-openmp-openblas named jasmine by zampins Sun May 10 21:03:09 2020
>> [0]PETSC ERROR: Configure options --download-cub=1 --download-hara-commit=HEAD --download-hara=1 --download-kblas=1 --download-magma=1 --download-openblas-use-pthreads=1 --download-openblas=1 --download-superlu_dist-commit=HEAD --download-superlu_dist=1 --download-mumps=1 --download-scalapack=1 --with-cc=/opt/ecrc/intel/2018/compilers_and_libraries/linux/mpi/intel64/bin/mpicc --with-cuda-gencodearch=37 --with-cuda=1 --with-cudac=/opt/ecrc/cuda/10.1/bin/nvcc --with-cxx-dialect=C++11 --with-cxx=/opt/ecrc/intel/2018/compilers_and_libraries/linux/mpi/intel64/bin/mpicxx --with-fc=/opt/ecrc/intel/2018/compilers_and_libraries/linux/mpi/intel64/bin/mpif90 --with-fortran-bindings=0 --with-magma-fortran-bindings=0 --with-opencl-include=/opt/ecrc/cuda/10.1/include --with-opencl-lib="-L/opt/ecrc/cuda/10.1/lib64 -lOpenCL" --with-openmp=1 --with-precision=double PETSC_ARCH=arch-gpu-double-openmp-openblas
>> [0]PETSC ERROR: #1 VecCUDAGetArrayRead() line 2206 in /home/zampins/petsc/src/vec/vec/interface/rvector.c
>> [0]PETSC ERROR: #2 MatMultAdd_SeqAIJCUSPARSE() line 1448 in /home/zampins/petsc/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu
>> [0]PETSC ERROR: #3 MatMult_SeqAIJCUSPARSE() line 1375 in /home/zampins/petsc/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu
>> [0]PETSC ERROR: #4 MatMult() line 2392 in /home/zampins/petsc/src/mat/interface/matrix.c
>> [0]PETSC ERROR: #5 MatMult_S() line 12 in ex69.c
>> [0]PETSC ERROR: #6 MatMult_Shell() line 586 in /home/zampins/petsc/src/mat/impls/shell/shell.c
>> [0]PETSC ERROR: #7 MatMult() line 2392 in /home/zampins/petsc/src/mat/interface/matrix.c
>> [0]PETSC ERROR: #8 MatProductNumeric_X_Dense() line 305 in /home/zampins/petsc/src/mat/interface/matproduct.c
>> [0]PETSC ERROR: #9 MatProductNumeric() line 976 in /home/zampins/petsc/src/mat/interface/matproduct.c
>> [0]PETSC ERROR: #10 main() line 109 in ex69.c
>> [0]PETSC ERROR: PETSc Option Table entries:
>> [0]PETSC ERROR: -use_gpu_aware_mpi 0
>> [0]PETSC ERROR: -use_shell
>> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-ma...@mcs.anl.gov----------
>> application called MPI_Abort(MPI_COMM_SELF, 109062) - process 0
>> [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=109062 :
>> system msg for write_line failure : Bad file descriptor
>>
>>>> BTW, the branch also provides MatDenseReplaceArray() now.
>>>
>>> Yes, I saw it. Thanks.
>>>
>>>> The last thing I want to do is to let the user provide MatMat (AB and AtB at least) callbacks for MATSHELL.
>>>> Can SLEPc benefit from such a feature?
>>>
>>> Some solvers, yes.
>>>
>>>>> On May 10, 2020, at 7:47 PM, Jose E. Roman <jro...@dsic.upv.es> wrote:
>>>>>
>>>>> Thanks for the hints. I have modified my branch. I was missing the MatShellSetVecType() call. Now everything works fine and all tests are clean.
>>>>>
>>>>> Jose
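The fix Jose mentions is a one-liner. A sketch of the relevant setup (mirroring the ex69.c excerpt above; here the vector type is written out as VECCUDA instead of the queried vtype):

  /* S is a MATSHELL wrapping the MATAIJCUSPARSE matrix A; without the
     MatShellSetVecType() call, vectors created from S default to the
     host type and the inner VecCUDAGetArrayRead() fails as in the
     trace above */
  ierr = MatCreateShell(PetscObjectComm((PetscObject)A),nloc,nloc,n,n,A,&S);CHKERRQ(ierr);
  ierr = MatShellSetOperation(S,MATOP_MULT,(void(*)(void))MatMult_S);CHKERRQ(ierr);
  ierr = MatShellSetVecType(S,VECCUDA);CHKERRQ(ierr);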
>>>>>> On 9 May 2020, at 21:32, Stefano Zampini <stefano.zamp...@gmail.com> wrote:
>>>>>>
>>>>>> Jose,
>>>>>>
>>>>>> I have just pushed an updated example with the MatMat operation, and I do not see the memory leak. Can you check?
>>>>>>
>>>>>> zampins@jasmine:~/petsc$ make -f gmakefile.test test search='mat%' searchin='ex69' PETSC_OPTIONS='-malloc -malloc_dump -malloc_debug'
>>>>>> /usr/bin/python /home/zampins/petsc/config/gmakegentest.py --petsc-dir=/home/zampins/petsc --petsc-arch=arch-gpu-double-openmp-openblas --testdir=./arch-gpu-double-openmp-openblas/tests
>>>>>> Using MAKEFLAGS: -- PETSC_OPTIONS=-malloc -malloc_dump -malloc_debug searchin=ex69 search=mat%
>>>>>>           CC arch-gpu-double-openmp-openblas/tests/mat/tests/ex69.o
>>>>>>      CLINKER arch-gpu-double-openmp-openblas/tests/mat/tests/ex69
>>>>>>         TEST arch-gpu-double-openmp-openblas/tests/counts/mat_tests-ex69_1.counts
>>>>>> ok mat_tests-ex69_1+nsize-1test-0_l-0_use_shell-0
>>>>>> ok diff-mat_tests-ex69_1+nsize-1test-0_l-0_use_shell-0
>>>>>> ok mat_tests-ex69_1+nsize-1test-0_l-0_use_shell-1
>>>>>> ok diff-mat_tests-ex69_1+nsize-1test-0_l-0_use_shell-1
>>>>>> ok mat_tests-ex69_1+nsize-1test-0_l-5_use_shell-0
>>>>>> ok diff-mat_tests-ex69_1+nsize-1test-0_l-5_use_shell-0
>>>>>> ok mat_tests-ex69_1+nsize-1test-0_l-5_use_shell-1
>>>>>> ok diff-mat_tests-ex69_1+nsize-1test-0_l-5_use_shell-1
>>>>>> ok mat_tests-ex69_1+nsize-1test-1_l-0_use_shell-0
>>>>>> ok diff-mat_tests-ex69_1+nsize-1test-1_l-0_use_shell-0
>>>>>> ok mat_tests-ex69_1+nsize-1test-1_l-0_use_shell-1
>>>>>> ok diff-mat_tests-ex69_1+nsize-1test-1_l-0_use_shell-1
>>>>>> ok mat_tests-ex69_1+nsize-1test-1_l-5_use_shell-0
>>>>>> ok diff-mat_tests-ex69_1+nsize-1test-1_l-5_use_shell-0
>>>>>> ok mat_tests-ex69_1+nsize-1test-1_l-5_use_shell-1
>>>>>> ok diff-mat_tests-ex69_1+nsize-1test-1_l-5_use_shell-1
>>>>>> ok mat_tests-ex69_1+nsize-1test-2_l-0_use_shell-0
>>>>>> ok diff-mat_tests-ex69_1+nsize-1test-2_l-0_use_shell-0
>>>>>> ok mat_tests-ex69_1+nsize-1test-2_l-0_use_shell-1
>>>>>> ok diff-mat_tests-ex69_1+nsize-1test-2_l-0_use_shell-1
>>>>>> ok mat_tests-ex69_1+nsize-1test-2_l-5_use_shell-0
>>>>>> ok diff-mat_tests-ex69_1+nsize-1test-2_l-5_use_shell-0
>>>>>> ok mat_tests-ex69_1+nsize-1test-2_l-5_use_shell-1
>>>>>> ok diff-mat_tests-ex69_1+nsize-1test-2_l-5_use_shell-1
>>>>>> ok mat_tests-ex69_1+nsize-2test-0_l-0_use_shell-0
>>>>>> ok diff-mat_tests-ex69_1+nsize-2test-0_l-0_use_shell-0
>>>>>> ok mat_tests-ex69_1+nsize-2test-0_l-0_use_shell-1
>>>>>> ok diff-mat_tests-ex69_1+nsize-2test-0_l-0_use_shell-1
>>>>>> ok mat_tests-ex69_1+nsize-2test-0_l-5_use_shell-0
>>>>>> ok diff-mat_tests-ex69_1+nsize-2test-0_l-5_use_shell-0
>>>>>> ok mat_tests-ex69_1+nsize-2test-0_l-5_use_shell-1
>>>>>> ok diff-mat_tests-ex69_1+nsize-2test-0_l-5_use_shell-1
>>>>>> ok mat_tests-ex69_1+nsize-2test-1_l-0_use_shell-0
>>>>>> ok diff-mat_tests-ex69_1+nsize-2test-1_l-0_use_shell-0
>>>>>> ok mat_tests-ex69_1+nsize-2test-1_l-0_use_shell-1
>>>>>> ok diff-mat_tests-ex69_1+nsize-2test-1_l-0_use_shell-1
>>>>>> ok mat_tests-ex69_1+nsize-2test-1_l-5_use_shell-0
>>>>>> ok diff-mat_tests-ex69_1+nsize-2test-1_l-5_use_shell-0
>>>>>> ok mat_tests-ex69_1+nsize-2test-1_l-5_use_shell-1
>>>>>> ok diff-mat_tests-ex69_1+nsize-2test-1_l-5_use_shell-1
>>>>>> ok mat_tests-ex69_1+nsize-2test-2_l-0_use_shell-0
>>>>>> ok diff-mat_tests-ex69_1+nsize-2test-2_l-0_use_shell-0
>>>>>> ok mat_tests-ex69_1+nsize-2test-2_l-0_use_shell-1
>>>>>> ok diff-mat_tests-ex69_1+nsize-2test-2_l-0_use_shell-1
>>>>>> ok mat_tests-ex69_1+nsize-2test-2_l-5_use_shell-0
>>>>>> ok diff-mat_tests-ex69_1+nsize-2test-2_l-5_use_shell-0
>>>>>> ok mat_tests-ex69_1+nsize-2test-2_l-5_use_shell-1
>>>>>> ok diff-mat_tests-ex69_1+nsize-2test-2_l-5_use_shell-1
>>>>>>
>>>>>> # -------------
>>>>>> #   Summary
>>>>>> # -------------
>>>>>> # success 48/48 tests (100.0%)
>>>>>> # failed 0/48 tests (0.0%)
>>>>>> # todo 0/48 tests (0.0%)
>>>>>> # skip 0/48 tests (0.0%)
>>>>>> #
>>>>>> # Wall clock time for tests: 58 sec
>>>>>> # Approximate CPU time (not incl. build time): 62.11 sec
>>>>>> #
>>>>>> # Timing summary (actual test time / total CPU time):
>>>>>> #   mat_tests-ex69_1: 2.30 sec / 62.11 sec
>>>>>>
>>>>>>> On May 9, 2020, at 9:28 PM, Jose E. Roman <jro...@dsic.upv.es> wrote:
>>>>>>>
>>>>>>>> On 9 May 2020, at 20:00, Stefano Zampini <stefano.zamp...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> On Sat, 9 May 2020 at 19:43, Jose E. Roman <jro...@dsic.upv.es> wrote:
>>>>>>>>
>>>>>>>>> On 9 May 2020, at 12:45, Stefano Zampini <stefano.zamp...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Jose,
>>>>>>>>>
>>>>>>>>> I have just pushed a test:
>>>>>>>>> https://gitlab.com/petsc/petsc/-/blob/d64c2bc63c8d5d1a8c689f1abc762ae2722bba26/src/mat/tests/ex69.c
>>>>>>>>> See if it fits your framework, and feel free to modify the test to add more checks.
>>>>>>>>
>>>>>>>> Almost good. The following modification of the example fails with -test 1:
>>>>>>>>
>>>>>>>> diff --git a/src/mat/tests/ex69.c b/src/mat/tests/ex69.c
>>>>>>>> index e562f1e2e3..2df2c89be1 100644
>>>>>>>> --- a/src/mat/tests/ex69.c
>>>>>>>> +++ b/src/mat/tests/ex69.c
>>>>>>>> @@ -84,6 +84,10 @@ int main(int argc,char **argv)
>>>>>>>>    }
>>>>>>>>    ierr = VecCUDARestoreArray(v,&vv);CHKERRQ(ierr);
>>>>>>>>
>>>>>>>> +  if (test==1) {
>>>>>>>> +    ierr = MatDenseCUDAGetArray(B,&aa);CHKERRQ(ierr);
>>>>>>>> +    if (aa) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_USER,"Expected a null pointer");
>>>>>>>> +  }
>>>>>>>>
>>>>>>>>    /* free work space */
>>>>>>>>    ierr = MatDestroy(&B);CHKERRQ(ierr);
>>>>>>>>
>>>>>>>> I would expect that after MatDenseCUDAResetArray() the pointer is NULL, because it was set so at line 60. In the CPU counterpart it works as expected.
>>>>>>>>
>>>>>>>> Pushed a fix for this, thanks.
>>>>>>>>
>>>>>>>> Another comment: at line 60 you have changed MatDenseCUDAPlaceArray() to MatDenseCUDAReplaceArray(). This is OK, but it is strange because MatDenseReplaceArray() does not exist. So the interface differs between GPU and CPU, but I guess it is necessary here.
>>>>>>>>
>>>>>>>> I think we do not support calling PlaceArray twice anywhere in PETSc. This is why I have added MatDenseCUDAReplaceArray(). If you need support for the CPU case too, I can add it.
>>>>>>>
>>>>>>> Yes, please. It is better to have the same thing in both cases.
>>>>>>>
>>>>>>> I am attaching the modified example, which now performs a mat-mat product. If I do A*B it works well, but if I replace A with a shell matrix I get a memory leak:
>>>>>>>
>>>>>>> [ 0]32 bytes VecCUDAAllocateCheck() line 34 in /home/users/proy/copa/jroman/soft/petsc/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
>>>>>>> [ 0]32 bytes VecCUDAAllocateCheck() line 34 in /home/users/proy/copa/jroman/soft/petsc/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>> Jose
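To spell out the Place/Reset/Replace semantics being converged on in this exchange (a sketch of intended usage, not code from the example; d_user and d_new stand for hypothetical device allocations):

  PetscScalar *d_user,*d_new; /* device buffers, e.g. from cudaMalloc() */

  /* MatDenseCUDAPlaceArray() temporarily substitutes a user device array
     and must be undone with MatDenseCUDAResetArray() before any further
     Place; calling PlaceArray twice in a row is unsupported. Where a
     second PlaceArray would be needed, MatDenseCUDAReplaceArray() can be
     used instead: it hands the array over to the matrix for good. */
  ierr = MatDenseCUDAPlaceArray(B,d_user);CHKERRQ(ierr);  /* B now wraps d_user */
  /* ... use B ... */
  ierr = MatDenseCUDAResetArray(B);CHKERRQ(ierr);         /* previous array (possibly NULL) is restored */
  ierr = MatDenseCUDAReplaceArray(B,d_new);CHKERRQ(ierr); /* B takes ownership of d_new */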
>>>>>>>>> On Fri, 8 May 2020 at 18:48, Jose E. Roman <jro...@dsic.upv.es> wrote:
>>>>>>>>>
>>>>>>>>> Attached. Run with -test 1 or -test 2.
>>>>>>>>>
>>>>>>>>>> On 8 May 2020, at 17:14, Stefano Zampini <stefano.zamp...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Jose,
>>>>>>>>>>
>>>>>>>>>> Just send me a MWE and I'll fix the case for you.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Stefano
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Stefano
>>>>>>>>
>>>>>>>> --
>>>>>>>> Stefano
>>>>>>>
>>>>>>> <ex69.c>
>>
>> --
>> Stefano