> -mg_coarse_sub_mat_solver_type value: cusparse
It is a PC factor option, so it is -mg_coarse_sub_pc_factor_mat_solver_type cusparse. -help | grep mg_coarse_sub should have found it.

   Barry

> On Jul 21, 2019, at 8:12 PM, Mark Adams <mfad...@lbl.gov> wrote:
>
> Barry,
>
> Option left: name:-mg_coarse_mat_solver_type value: cusparse
>
> I tried this too:
>
> Option left: name:-mg_coarse_sub_mat_solver_type value: cusparse
>
> Here is the view. cuda did not get into the factor type.
>
> PC Object: 24 MPI processes
>   type: gamg
>     type is MULTIPLICATIVE, levels=5 cycles=v
>       Cycles per PCApply=1
>       Using externally compute Galerkin coarse grid matrices
>       GAMG specific options
>         Threshold for dropping small values in graph on each level = 0.05 0.025 0.0125
>         Threshold scaling factor for each level not specified = 0.5
>         AGG specific options
>           Symmetric graph false
>           Number of levels to square graph 10
>           Number smoothing steps 1
>         Complexity:    grid = 1.14213
>   Coarse grid solver -- level -------------------------------
>     KSP Object: (mg_coarse_) 24 MPI processes
>       type: preonly
>       maximum iterations=10000, initial guess is zero
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>       left preconditioning
>       using NONE norm type for convergence test
>     PC Object: (mg_coarse_) 24 MPI processes
>       type: bjacobi
>         number of blocks = 24
>         Local solve is same for all blocks, in the following KSP and PC objects:
>       KSP Object: (mg_coarse_sub_) 1 MPI processes
>         type: preonly
>         maximum iterations=1, initial guess is zero
>         tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>         left preconditioning
>         using NONE norm type for convergence test
>       PC Object: (mg_coarse_sub_) 1 MPI processes
>         type: lu
>           out-of-place factorization
>           tolerance for zero pivot 2.22045e-14
>           using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
>           matrix ordering: nd
>           factor fill ratio given 5., needed 1.
>             Factored matrix follows:
>               Mat Object: 1 MPI processes
>                 type: seqaij
>                 rows=6, cols=6
>                 package used to perform factorization: petsc
>                 total: nonzeros=36, allocated nonzeros=36
>                 total number of mallocs used during MatSetValues calls =0
>                   using I-node routines: found 2 nodes, limit used is 5
>         linear system matrix = precond matrix:
>         Mat Object: 1 MPI processes
>           type: seqaijcusparse
>           rows=6, cols=6
>           total: nonzeros=36, allocated nonzeros=36
>           total number of mallocs used during MatSetValues calls =0
>             using I-node routines: found 2 nodes, limit used is 5
>   linear system matrix = precond matrix:
>   Mat Object: 24 MPI processes
>     type: mpiaijcusparse
>     rows=6, cols=6, bs=6
>     total: nonzeros=36, allocated nonzeros=36
>     total number of mallocs used during MatSetValues calls =0
>       using scalable MatPtAP() implementation
>       using I-node (on process 0) routines: found 2 nodes, limit used is 5
>   Down solver (pre-smoother) on level 1 -------------------------------
>
>
> On Sun, Jul 21, 2019 at 3:58 PM Mark Adams <mfad...@lbl.gov> wrote:
> Barry, I do NOT see communication. This is what made me think it was not running on the GPU. I added print statements and found that MatSolverTypeRegister_CUSPARSE IS called but (what it registers) MatGetFactor_seqaijcusparse_cusparse does NOT get called.
>
> I have a job waiting in the queue. I'll send ksp_view when it runs. I will try -mg_coarse_mat_solver_type cusparse. That is probably the problem. Maybe I should set the coarse grid solver in a more robust way in GAMG, like use the matrix somehow? I currently use PCSetType(pc, PCLU).
>
> I can't get an interactive shell now to run DDT, but I can try stepping through from MatGetFactor to see what it's doing.
>
> Thanks,
> Mark
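A minimal sketch of the programmatic route Mark is asking about, assuming a sequential coarse-grid block PC held in a variable named pc2 (mirroring the GAMG snippet quoted later in this thread); this is not the actual GAMG source. PCFactorSetMatSolverType() selects the factorization package, which is the in-code equivalent of -mg_coarse_sub_pc_factor_mat_solver_type cusparse on the command line:

#include <petscksp.h>

/* Sketch: ask for an LU factorization performed by the cusparse package
   instead of the default petsc package.  The PC (e.g. the mg_coarse_sub_
   block PC) is assumed to already exist. */
static PetscErrorCode SetCoarseLUCusparse(PC pc2)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PCSetType(pc2, PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverType(pc2, MATSOLVERCUSPARSE);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}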
> On Sun, Jul 21, 2019 at 11:14 AM Smith, Barry F. <bsm...@mcs.anl.gov> wrote:
>
> > On Jul 21, 2019, at 8:55 AM, Mark Adams via petsc-dev <petsc-dev@mcs.anl.gov> wrote:
> >
> > I am running ex56 with -ex56_dm_vec_type cuda -ex56_dm_mat_type aijcusparse and I see no GPU communication in MatSolve (the serial LU coarse grid solver).
>
>   Do you mean to say, you DO see communication?
>
>   What does -ksp_view show you? It should show the factor type in the information about the coarse grid solve.
>
>   You might need something like -mg_coarse_mat_solver_type cusparse (because it may default to the PETSc one; it may be possible to have it default to cusparse if it exists and the matrix is of type MATSEQAIJCUSPARSE).
>
>   The determination in MatGetFactor() is a bit involved, including pasting together strings and string compares, and could be finding a CPU factorization.
>
>   I would run on one MPI_Rank() in the debugger and put a break point in MatGetFactor() and track along to see what it picks and why. You could do this debugging without GAMG first, just -pc_type lu.
>
> > GAMG does set the coarse grid solver to LU manually like this: ierr = PCSetType(pc2, PCLU);CHKERRQ(ierr);
>
>   For parallel runs this won't work using the GPU code and only sequential direct solvers, so it must be using BJACOBI in that case?
>
>   Barry
>
> > I am thinking the dispatch of the CUDA version of this got dropped somehow.
> >
> > I see that this is getting called:
> >
> > PETSC_EXTERN PetscErrorCode MatSolverTypeRegister_CUSPARSE(void)
> > {
> >   PetscErrorCode ierr;
> >
> >   PetscFunctionBegin;
> >   ierr = MatSolverTypeRegister(MATSOLVERCUSPARSE,MATSEQAIJCUSPARSE,MAT_FACTOR_LU,MatGetFactor_seqaijcusparse_cusparse);CHKERRQ(ierr);
> >   ierr = MatSolverTypeRegister(MATSOLVERCUSPARSE,MATSEQAIJCUSPARSE,MAT_FACTOR_CHOLESKY,MatGetFactor_seqaijcusparse_cusparse);CHKERRQ(ierr);
> >   ierr = MatSolverTypeRegister(MATSOLVERCUSPARSE,MATSEQAIJCUSPARSE,MAT_FACTOR_ILU,MatGetFactor_seqaijcusparse_cusparse);CHKERRQ(ierr);
> >   ierr = MatSolverTypeRegister(MATSOLVERCUSPARSE,MATSEQAIJCUSPARSE,MAT_FACTOR_ICC,MatGetFactor_seqaijcusparse_cusparse);CHKERRQ(ierr);
> >   PetscFunctionReturn(0);
> > }
> >
> > but MatGetFactor_seqaijcusparse_cusparse is not getting called.
> >
> > GAMG does set the coarse grid solver to LU manually like this: ierr = PCSetType(pc2, PCLU);CHKERRQ(ierr);
> >
> > Any ideas?
> >
> > Thanks,
> > Mark
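Barry's single-rank debugging suggestion can also be exercised outside GAMG with a small standalone check. The following is only a sketch (it is not from the thread): it builds a tiny MATSEQAIJCUSPARSE matrix and calls MatGetFactor() directly, so a breakpoint in MatGetFactor() or MatGetFactor_seqaijcusparse_cusparse shows which package gets picked. It assumes a CUDA-enabled PETSc build and a single MPI rank:

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A, F;
  MatSolverType  stype;
  PetscInt       i;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

  /* Small sequential matrix of the cusparse AIJ type */
  ierr = MatCreate(PETSC_COMM_SELF, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, 6, 6, 6, 6);CHKERRQ(ierr);
  ierr = MatSetType(A, MATSEQAIJCUSPARSE);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  for (i = 0; i < 6; i++) {
    ierr = MatSetValue(A, i, i, 1.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* Break here: this should reach MatGetFactor_seqaijcusparse_cusparse
     if the MatSolverTypeRegister_CUSPARSE registration is being found */
  ierr = MatGetFactor(A, MATSOLVERCUSPARSE, MAT_FACTOR_LU, &F);CHKERRQ(ierr);
  ierr = MatFactorGetSolverType(F, &stype);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_SELF, "factor solver type: %s\n", stype);CHKERRQ(ierr);

  ierr = MatDestroy(&F);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}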