Yes, I'm calling VecRestoreArray, but I now realize that I exported the vectors to MATLAB before doing so. Apparently that worked anyway on the CPU, but on the GPU it didn't. If I call VecRestoreArray before exporting, then everything works fine on the GPU as well. Thanks for pointing this out!
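A minimal sketch of the access pattern resolved above (the function name FillGradient and the length argument are illustrative, not from the thread): VecRestoreArray must be called before the vector is used elsewhere, e.g. exported to MATLAB. On the GPU this matters because the restore call is what tells PETSc the host-side array was modified and needs to be synchronized back to the device.

```c
/* Sketch only: assumes a working PETSc installation; error handling abbreviated.
 * "dfdx" and "df" follow the naming used in this thread. */
#include <petscvec.h>

PetscErrorCode FillGradient(Vec dfdx)
{
  PetscScalar *df;
  PetscInt     n;

  PetscFunctionBeginUser;
  PetscCall(VecGetLocalSize(dfdx, &n));
  PetscCall(VecGetArray(dfdx, &df));            /* host-side view of the data  */
  for (PetscInt i = 0; i < n; ++i) df[i] = 0.0; /* ... compute gradient entries ... */
  PetscCall(VecRestoreArray(dfdx, &df));        /* restore BEFORE exporting:    */
                                                /* marks host copy as modified  */
  /* Only now export, e.g. with VecView on a MATLAB viewer. */
  PetscFunctionReturn(0);
}
```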
From: Stefano Zampini <stefano.zamp...@gmail.com>
Sent: 1 November 2022 17:11
To: Carl-Johan Thore <carl-johan.th...@liu.se>
Cc: Mark Adams <mfad...@lbl.gov>; PETSc users list <petsc-users@mcs.anl.gov>
Subject: Re: [petsc-users] KSP on GPU

Are you calling VecRestoreArray when you are done inserting the values?

On Tue, Nov 1, 2022, 18:42 Carl-Johan Thore via petsc-users <petsc-users@mcs.anl.gov> wrote:

Thanks for the tips! The suggested settings for GAMG did not yield better results, but hypre worked well right away, giving very good convergence!

A follow-up question then (I hope that's OK; it could be related to GAMG not working, and I'll check that). Once everything was running, I discovered that my gradient vector dfdx, which I populate via an array df obtained from VecGetArray(dfdx, &df), doesn't get filled properly; it always contains only zeros. This is not the case when I run on the CPU, and df itself gets filled as it should even on the GPU, suggesting that either I'm not using VecGetArray properly, or that I shouldn't use it at all for GPU computations?

Kind regards,
Carl-Johan

From: Mark Adams <mfad...@lbl.gov>
Sent: 31 October 2022 13:30
To: Carl-Johan Thore <carl-johan.th...@liu.se>
Cc: Matthew Knepley <knep...@gmail.com>; Barry Smith <bsm...@petsc.dev>; petsc-users@mcs.anl.gov
Subject: Re: [petsc-users] KSP on GPU

* You could try hypre or another preconditioner that you can afford, like LU or ASM, that works.

* If this matrix is SPD, you want to use

    -fieldsplit_0_pc_gamg_esteig_ksp_type cg
    -fieldsplit_0_pc_gamg_esteig_ksp_max_it 10

  These will give better eigen estimates, and that is important. The differences between these estimates is not too bad.
  There is a safety factor (1.05 is the default) that you could increase with:

    -fieldsplit_0_mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.1

* Finally, you could try -fieldsplit_0_pc_gamg_reuse_interpolation 1 if GAMG is still not working.

Use -fieldsplit_0_ksp_converged_reason and check the iteration count. And it is a good idea to check with hypre to make sure something is not going badly in terms of performance anyway. AMG is hard, and hypre is a good solver.

Mark

On Mon, Oct 31, 2022 at 1:56 AM Carl-Johan Thore via petsc-users <petsc-users@mcs.anl.gov> wrote:

The GPU supports double precision, and I didn't explicitly tell PETSc to use float when compiling, so I guess it uses double? What's the easiest way to check?

Barry, running with -ksp_view shows that the solver options are the same for CPU and GPU. The only difference is the coarse-grid solver for GAMG ("the package used to perform factorization:"), which is petsc for the CPU and cusparse for the GPU. I tried forcing the GPU to use petsc via -fieldsplit_0_mg_coarse_sub_pc_factor_mat_solver_type, but then KSP failed to converge even on the first topology optimization iteration.

-ksp_view also shows differences in the eigenvalues from the Chebyshev smoother. For example, on the GPU:

    Down solver (pre-smoother) on level 2 -------------------------------
      KSP Object: (fieldsplit_0_mg_levels_2_) 1 MPI process
        type: chebyshev
          eigenvalue targets used: min 0.109245, max 1.2017
          eigenvalues provided (min 0.889134, max 1.09245)

whereas on the CPU:

          eigenvalue targets used: min 0.112623, max 1.23886
          eigenvalues provided (min 0.879582, max 1.12623)

But I guess such differences are expected?
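For reference, Mark's GAMG tuning suggestions from earlier in the thread, collected into a single options fragment (the fieldsplit_0 prefix matches the solver setup used in this thread; all flags are quoted from Mark's message):

```sh
# GAMG tuning options suggested above, as one options fragment
-fieldsplit_0_pc_gamg_esteig_ksp_type cg          # CG eigen estimates (for SPD blocks)
-fieldsplit_0_pc_gamg_esteig_ksp_max_it 10        # more iterations -> better estimates
-fieldsplit_0_mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.1  # larger safety factor (default 1.05)
-fieldsplit_0_pc_gamg_reuse_interpolation 1       # try if GAMG is still not working
-fieldsplit_0_ksp_converged_reason                # check the iteration count
```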
/Carl-Johan

From: Matthew Knepley <knep...@gmail.com>
Sent: 30 October 2022 22:00
To: Barry Smith <bsm...@petsc.dev>
Cc: Carl-Johan Thore <carl-johan.th...@liu.se>; petsc-users@mcs.anl.gov
Subject: Re: [petsc-users] KSP on GPU

On Sun, Oct 30, 2022 at 3:52 PM Barry Smith <bsm...@petsc.dev> wrote:

In general you should expect similar, but not identical, convergence behavior. I suggest running with all the monitoring you can:

    -ksp_monitor_true_residual
    -fieldsplit_0_monitor_true_residual
    -fieldsplit_1_monitor_true_residual

and comparing the convergence between the CPU and GPU. Also run with -ksp_view and check that the various solver options are the same (they should be).

Is the GPU using float or double?

  Matt

  Barry

On Oct 30, 2022, at 11:02 AM, Carl-Johan Thore via petsc-users <petsc-users@mcs.anl.gov> wrote:

Hi,

I'm solving a topology optimization problem with Stokes flow, discretized by a stabilized Q1-Q0 finite element method, using BiCGStab with the fieldsplit preconditioner to solve the linear systems. The implementation is based on DMStag, runs on Ubuntu via WSL2, and works fine with PETSc 3.18.1 on multiple CPU cores with the following options for the preconditioner:

    -fieldsplit_0_ksp_type preonly \
    -fieldsplit_0_pc_type gamg \
    -fieldsplit_0_pc_gamg_reuse_interpolation 0 \
    -fieldsplit_1_ksp_type preonly \
    -fieldsplit_1_pc_type jacobi

However, when I enable GPU computations by adding two options:

    -dm_vec_type cuda \
    -dm_mat_type aijcusparse \
    -fieldsplit_0_ksp_type preonly \
    -fieldsplit_0_pc_type gamg \
    -fieldsplit_0_pc_gamg_reuse_interpolation 0 \
    -fieldsplit_1_ksp_type preonly \
    -fieldsplit_1_pc_type jacobi

KSP still works fine for the first couple of topology optimization iterations but then stops with "Linear solve did not converge due to DIVERGED_DTOL ..".
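Barry's monitoring suggestions combined with the GPU options from the original message could look like this on the command line (the executable name ./stokes_topopt is hypothetical; every flag is quoted from this thread):

```sh
./stokes_topopt \
  -dm_vec_type cuda -dm_mat_type aijcusparse \
  -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg \
  -fieldsplit_0_pc_gamg_reuse_interpolation 0 \
  -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type jacobi \
  -ksp_monitor_true_residual \
  -fieldsplit_0_monitor_true_residual \
  -fieldsplit_1_monitor_true_residual \
  -ksp_view
```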
My question is whether I should expect the GPU versions of the linear solvers and preconditioners to function exactly as their CPU counterparts (I got this impression from the documentation), in which case I've probably made some mistake in my own code, or whether there are other or additional settings or modifications I should use to run on the GPU (an NVIDIA Quadro T2000)?

Kind regards,
Carl-Johan

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/