Hi Wenbo, I think your approach should work. But before taking this extra step with gpu_comm, have you tried mapping multiple MPI ranks (CPUs) to one GPU using NVIDIA's Multi-Process Service (MPS)? If MPS works well, you can avoid the extra complexity.
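
To make the comparison concrete, here is a rough, untested sketch of that simpler path: every rank keeps its own rows and assembles the AIJCUSPARSE matrix directly via the COO interface, much like bench_kspsolve.c, and MPS (started outside the program, e.g. with nvidia-cuda-mps-control) takes care of several ranks sharing one GPU. The 1D Laplacian and the sizes below are made up purely for illustration:

#include <petscksp.h>

int main(int argc, char **argv)
{
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  PetscMPIInt rank, size;
  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));

  const PetscInt nloc = 1000;        /* toy local size, made up */
  const PetscInt N    = nloc * size; /* global size */
  const PetscInt r0   = rank * nloc; /* first global row owned by this rank */

  /* Each rank computes COO triplets for its rows of a 1D Laplacian on the CPU;
     in your application this would be the expensive coefficient computation. */
  PetscInt     n = 0, *ci, *cj;
  PetscScalar *cv;
  PetscCall(PetscMalloc3(3 * nloc, &ci, 3 * nloc, &cj, 3 * nloc, &cv));
  for (PetscInt i = r0; i < r0 + nloc; i++) {
    ci[n] = i; cj[n] = i; cv[n++] = 2.0;
    if (i > 0)     { ci[n] = i; cj[n] = i - 1; cv[n++] = -1.0; }
    if (i < N - 1) { ci[n] = i; cj[n] = i + 1; cv[n++] = -1.0; }
  }

  /* Every rank assembles its block of the GPU matrix directly via COO and solves */
  Mat A;
  Vec x, b;
  KSP ksp;
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, nloc, nloc, N, N));
  PetscCall(MatSetType(A, MATAIJCUSPARSE));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSetPreallocationCOO(A, n, ci, cj));
  PetscCall(MatSetValuesCOO(A, cv, INSERT_VALUES));
  PetscCall(MatCreateVecs(A, &x, &b));
  PetscCall(VecSet(b, 1.0));
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp));
  PetscCall(KSPSolve(ksp, b, x));
  PetscCall(KSPDestroy(&ksp));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&b));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFree3(ci, cj, cv));
  PetscCall(PetscFinalize());
  return 0;
}

If I remember correctly, PETSc assigns devices to ranks round-robin within a node by default, so with MPS running you would simply launch more ranks than GPUs and let the ranks share the devices.
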
--Junchao Zhang

On Tue, Nov 11, 2025 at 7:50 PM Wenbo Zhao <[email protected]> wrote:
> Dear all,
>
> We are trying to solve a linear system with KSP on GPUs.
> We found the example src/ksp/ksp/tutorials/bench_kspsolve.c, in which the
> matrix is created and assembled using the COO interface provided by PETSc.
> In this example, the number of CPUs is the same as the number of GPUs.
> In our case, the matrix coefficients are computed on the CPUs, and this is
> expensive: it can take half of the total time or even more.
>
> We want to use more CPUs to compute these coefficients in parallel, and to
> create a smaller communicator (say, gpu_comm) for the CPUs attached to the
> GPUs. The coefficients are computed by all of the CPUs (in MPI_COMM_WORLD)
> and then sent to the gpu_comm ranks via MPI. The matrix (of type
> aijcusparse) is then created and assembled within gpu_comm. Finally,
> KSPSolve is performed on the GPUs.
>
> I'm not sure if this approach will work in practice. Are there any
> comparable examples I can look to for guidance?
>
> Best,
> Wenbo
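
For reference, a rough and untested sketch of the gpu_comm idea described above could look like the following. It assumes 4 CPU ranks feed each GPU, uses a toy 1D Laplacian for the coefficient computation, and gathers the COO triplets onto the team roots with MPI_Gatherv; ranks_per_gpu, the problem size, and the gather layout are placeholders, not taken from an existing PETSc example:

#include <petscksp.h>

int main(int argc, char **argv)
{
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  PetscMPIInt rank, size;
  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  PetscCallMPI(MPI_Comm_size(PETSC_COMM_WORLD, &size));

  const PetscMPIInt ranks_per_gpu = 4;    /* assumption: 4 CPU ranks per GPU */
  const PetscInt    rows_per_rank = 1000; /* toy problem size, made up */
  const PetscInt    N             = rows_per_rank * size;
  const PetscInt    row0          = rank * rows_per_rank;

  /* Team = the CPU ranks that feed one GPU; team rank 0 owns that GPU */
  MPI_Comm    team;
  PetscMPIInt trank, tsize;
  PetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, rank / ranks_per_gpu, rank, &team));
  PetscCallMPI(MPI_Comm_rank(team, &trank));
  PetscCallMPI(MPI_Comm_size(team, &tsize));

  /* gpu_comm = the team roots only; the other ranks get MPI_COMM_NULL */
  MPI_Comm gpu_comm;
  PetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, trank == 0 ? 0 : MPI_UNDEFINED, rank, &gpu_comm));

  /* 1) Every rank computes COO triplets of a 1D Laplacian for its rows (CPU work) */
  PetscInt     nmax = 3 * rows_per_rank, n = 0, *ci, *cj;
  PetscScalar *cv;
  PetscCall(PetscMalloc3(nmax, &ci, nmax, &cj, nmax, &cv));
  for (PetscInt i = row0; i < row0 + rows_per_rank; i++) {
    ci[n] = i; cj[n] = i; cv[n++] = 2.0;
    if (i > 0)     { ci[n] = i; cj[n] = i - 1; cv[n++] = -1.0; }
    if (i < N - 1) { ci[n] = i; cj[n] = i + 1; cv[n++] = -1.0; }
  }

  /* 2) Gather the triplets from the whole team onto the team root with plain MPI */
  PetscMPIInt  nm = (PetscMPIInt)n, *counts = NULL, *displs = NULL;
  PetscInt    *gi = NULL, *gj = NULL, ntot = 0;
  PetscScalar *gv = NULL;
  if (trank == 0) PetscCall(PetscMalloc2(tsize, &counts, tsize, &displs));
  PetscCallMPI(MPI_Gather(&nm, 1, MPI_INT, counts, 1, MPI_INT, 0, team));
  if (trank == 0) {
    for (PetscMPIInt r = 0; r < tsize; r++) { displs[r] = (PetscMPIInt)ntot; ntot += counts[r]; }
    PetscCall(PetscMalloc3(ntot, &gi, ntot, &gj, ntot, &gv));
  }
  PetscCallMPI(MPI_Gatherv(ci, nm, MPIU_INT, gi, counts, displs, MPIU_INT, 0, team));
  PetscCallMPI(MPI_Gatherv(cj, nm, MPIU_INT, gj, counts, displs, MPIU_INT, 0, team));
  PetscCallMPI(MPI_Gatherv(cv, nm, MPIU_SCALAR, gv, counts, displs, MPIU_SCALAR, 0, team));

  /* 3) Team roots assemble the AIJCUSPARSE matrix and solve on gpu_comm */
  if (trank == 0) {
    Mat A;
    Vec x, b;
    KSP ksp;
    PetscCall(MatCreate(gpu_comm, &A));
    PetscCall(MatSetSizes(A, tsize * rows_per_rank, tsize * rows_per_rank, N, N));
    PetscCall(MatSetType(A, MATAIJCUSPARSE));
    PetscCall(MatSetFromOptions(A));
    PetscCall(MatSetPreallocationCOO(A, ntot, gi, gj));
    PetscCall(MatSetValuesCOO(A, gv, INSERT_VALUES));
    PetscCall(MatCreateVecs(A, &x, &b));
    PetscCall(VecSet(b, 1.0));
    PetscCall(KSPCreate(gpu_comm, &ksp));
    PetscCall(KSPSetOperators(ksp, A, A));
    PetscCall(KSPSetFromOptions(ksp));
    PetscCall(KSPSolve(ksp, b, x));
    PetscCall(KSPDestroy(&ksp));
    PetscCall(VecDestroy(&x));
    PetscCall(VecDestroy(&b));
    PetscCall(MatDestroy(&A));
    PetscCall(PetscFree3(gi, gj, gv));
    PetscCall(PetscFree2(counts, displs));
    PetscCallMPI(MPI_Comm_free(&gpu_comm));
  }
  PetscCall(PetscFree3(ci, cj, cv));
  PetscCallMPI(MPI_Comm_free(&team));
  PetscCall(PetscFinalize());
  return 0;
}

The two points this sketch tries to show are that only the team roots pass a defined color to the second MPI_Comm_split, so gpu_comm exists only on the ranks that drive a GPU, and that MatSetPreallocationCOO / MatSetValuesCOO and KSPSolve are called only on gpu_comm while the coefficient computation runs on all of MPI_COMM_WORLD.
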
