Hello Sreeram,

KSPCG (the PETSc implementation of CG) does not handle solves with multiple columns at once. The only native PETSc KSP implementation that does is KSPPREONLY.
If you use --download-hpddm, you can use a CG implementation (or GMRES, or more advanced methods) which handles solves with multiple columns at once, either via the options -ksp_type hpddm -ksp_hpddm_type cg, or via KSPSetType(ksp, KSPHPDDM); KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);.
I’m the main author of HPDDM. There is preliminary support for device matrices, but if it’s not working as intended, or not faster than column by column, I’d be happy to have a deeper look (maybe in private), because most (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e., solvers that treat all right-hand sides in a single go) are using plain host matrices.
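To make the above concrete, here is a minimal sketch of a batched solve with HPDDM’s block CG; the names ksp, A, B, and X are placeholders for your solver, operator, block of right-hand sides, and block of solutions (error checking omitted):

  #include <petscksp.h>

  KSP ksp;
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A);              /* A: the system matrix       */
  KSPSetType(ksp, KSPHPDDM);               /* same as -ksp_type hpddm    */
  KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG); /* same as -ksp_hpddm_type cg */
  KSPSetFromOptions(ksp);
  KSPMatSolve(ksp, B, X); /* B, X: dense Mats; all columns treated in a single go */
  KSPDestroy(&ksp);

You can also combine this with a “PCMatApply()-ready” preconditioner, e.g., -ksp_type hpddm -ksp_hpddm_type cg -pc_type hypre -pc_hypre_type boomeramg.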
Thanks,
Pierre

PS: you could have a look at https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to understand the philosophy behind block iterative methods in PETSc (and in HPDDM). src/mat/tests/ex237.c, the benchmark I mentioned earlier, was developed in the context of this paper to produce Figures 2 and 3. Note that this paper is now slightly outdated: since then, PCHYPRE and PCMG (among others) have been made “PCMatApply()-ready”. A sketch of the dense-wrapper approach discussed in the quoted thread is appended at the bottom of this message.

> On 13 Dec 2023, at 11:05 PM, Sreeram R Venkat <srven...@utexas.edu> wrote:
> 
> Hello Pierre,
> 
> I am trying out the KSPMatSolve with the BoomerAMG preconditioner. However, I am noticing that it is still solving column by column (this is stated explicitly in the info dump attached). I looked at the code for KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is true, it should do the batched solve, though I'm not sure where that gets set.
> 
> I am using the options -pc_type hypre -pc_hypre_type boomeramg when running the code.
> 
> Can you please help me with this?
> 
> Thanks,
> Sreeram
> 
> On Thu, Dec 7, 2023 at 4:04 PM Mark Adams <mfad...@lbl.gov> wrote:
>> N.B., the AMGX interface is a bit experimental.
>> Mark
>> 
>> On Thu, Dec 7, 2023 at 4:11 PM Sreeram R Venkat <srven...@utexas.edu> wrote:
>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build correctly was also tricky, so hopefully the HYPRE build will be easier.
>>> 
>>> Thanks,
>>> Sreeram
>>> 
>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet <pie...@joliv.et> wrote:
>>>> 
>>>>> On 7 Dec 2023, at 9:37 PM, Sreeram R Venkat <srven...@utexas.edu> wrote:
>>>>> 
>>>>> Thank you Barry and Pierre; I will proceed with the first option.
>>>>> 
>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out and see how it performs.
>>>> 
>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has no PCMatApply() implementation.
>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG() implementation.
>>>> But let us know if you need assistance figuring things out.
>>>> 
>>>> Thanks,
>>>> Pierre
>>>> 
>>>>> Thanks,
>>>>> Sreeram
>>>>> 
>>>>> On Thu, Dec 7, 2023 at 2:02 PM Pierre Jolivet <pie...@joliv.et> wrote:
>>>>>> To expand on Barry’s answer, we have observed repeatedly that MatMatMult with MatAIJ performs better than MatMult with MatMAIJ; you can reproduce this on your own with https://petsc.org/release/src/mat/tests/ex237.c.html.
>>>>>> Also, I’m guessing you are using some sort of preconditioner within your KSP.
>>>>>> Not all are “KSPMatSolve-ready”, i.e., they may treat blocks of right-hand sides column by column, which is very inefficient.
>>>>>> You could run your code with -info dump and send us dump.0 to see what needs to be done on our end to make things more efficient, should you not be satisfied with the current performance of the code.
>>>>>> 
>>>>>> Thanks,
>>>>>> Pierre
>>>>>> 
>>>>>>> On 7 Dec 2023, at 8:34 PM, Barry Smith <bsm...@petsc.dev> wrote:
>>>>>>> 
>>>>>>>> On Dec 7, 2023, at 1:17 PM, Sreeram R Venkat <srven...@utexas.edu> wrote:
>>>>>>>> 
>>>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE, of size n x n) and a vector v of size n*m, v = [v_1, v_2, ..., v_m], where each v_i has size n. The data for v can be stored either in column-major or row-major order. Now, I want to do 2 types of operations:
>>>>>>>> 
>>>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m.
>>>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m.
>>>>>>>> 
>>>>>>>> From what I have read in the documentation, I can think of 2 approaches.
>>>>>>>> 
>>>>>>>> 1. Get the pointer to the data in v (column-major) and use it to create a dense matrix V. Then do a MatMatMult with M*V = W, and take the data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve with R and V.
>>>>>>>> 
>>>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with the vector v. I don't know if KSPSolve with the MATMAIJ will know that it is a multiple-RHS system and act accordingly.
>>>>>>>> 
>>>>>>>> Which would be the more efficient option?
>>>>>>> 
>>>>>>> Use 1.
>>>>>>> 
>>>>>>>> As a side note, I am also wondering if there is a way to use row-major storage of the vector v.
>>>>>>> 
>>>>>>> No.
>>>>>>> 
>>>>>>>> The reason is that this could allow for more coalesced memory access when doing matvecs.
>>>>>>> 
>>>>>>> PETSc matrix-vector products use BLAS GEMV matrix-vector products for the computation, so in theory they should already be well-optimized.
>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Sreeram
> 
> <dump.0>
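PPS: regarding the first approach from the quoted thread (wrapping the existing column-major storage of v in a dense Mat, then using MatMatMult() and KSPMatSolve()), here is a minimal host-side sketch; the names M, R, v, n, and m are from Sreeram’s message, ksp is assumed to have been created with R as its operator, error checking and cleanup are omitted, and for device data you would use the corresponding CUDA variants (e.g., MatCreateDenseCUDA()):

  PetscScalar *varray;
  Mat          V, W, X;

  /* Reuse the storage of v (length n*m, column-major) as an n x m dense
     matrix V, so that column i of V aliases v_i (no copy). */
  VecGetArray(v, &varray);
  MatCreateDense(PETSC_COMM_SELF, n, m, n, m, varray, &V);
  /* 1. All matvecs M*v_i = w_i at once: W = M*V. */
  MatMatMult(M, V, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &W);
  /* 2. All solves R*x_i = v_i at once, column i of X holding x_i. */
  MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &X);
  KSPMatSolve(ksp, V, X);
  VecRestoreArray(v, &varray);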