Thank you for your detailed response. Yes, using aijcusparse worked (though, as you 
mentioned, it is not optimal).

My objective is to run an unstructured-mesh problem (which I believe would be of type 
DMPlex) at varying sizes, using different preconditioners and comparing their 
performance on CPUs and GPUs. I understand that is not currently possible with 
hypre. Leaving hypre aside, I would appreciate a recommendation for other 
unstructured (DMPlex) example problems where I can change the size of the domain 
via command-line input.

Kind regards,
Karthik.

From: Mark Adams <mfad...@lbl.gov>
Date: Tuesday, 14 December 2021 at 18:58
To: "Chockalingam, Karthikeyan (STFC,DL,HC)" 
<karthikeyan.chockalin...@stfc.ac.uk>
Cc: Matthew Knepley <knep...@gmail.com>, "petsc-users@mcs.anl.gov" 
<petsc-users@mcs.anl.gov>
Subject: Re: [petsc-users] Unstructured mesh

I was able to get hypre to work on ex56 (snes) with -ex56_dm_mat_type 
aijcusparse -ex56_dm_vec_type cuda (not hypre matrix).
This should copy the cusparse matrix to a hypre matrix before the solve. So not 
optimal but the actual solve should be the same.
Hypre is not yet supported on this example so you might not want to spend too 
much time on it.
In particular, we do not have an example that uses DMPlex and hypre on a GPU.
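For reference, a run along these lines (a sketch: the options are the ones from the 
run quoted further down in this thread, with only the matrix type changed from hypre 
to aijcusparse) exercises that combination, i.e. BoomerAMG as the preconditioner with 
a cusparse matrix assembled on the GPU:

./ex56 -cells 4,4,2 -max_conv_its 1 -lx 1. -alpha .01 -petscspace_degree 1 -ksp_type cg -ksp_monitor -ksp_rtol 1.e-8 -pc_type hypre -pc_hypre_type boomeramg -snes_monitor -use_mat_nearnullspace true -snes_rtol 1.e-10 -ex56_dm_view -ex56_dm_vec_type cuda -ex56_dm_mat_type aijcusparse -log_view -options_left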

src/ksp/ksp/tutorials/ex56 is old and does not use DMPlex, and it is missing a hypre 
call like the one in KSP ex4, namely:
#if defined(PETSC_HAVE_HYPRE)
  ierr = MatHYPRESetPreallocation(A,5,NULL,5,NULL);CHKERRQ(ierr);
#endif
If you add that it might work, but again this is all pretty fragile at this 
point.
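To show where such a call would sit, here is a minimal standalone sketch (not taken 
from ex56; the global matrix size and the nonzero estimates of 5 per row are 
placeholders) that preallocates the AIJ formats and, when hypre is available, the 
hypre format as well:

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 100, 100);CHKERRQ(ierr); /* placeholder global size */
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);                                 /* picks up e.g. -mat_type hypre */
  ierr = MatSeqAIJSetPreallocation(A, 5, NULL);CHKERRQ(ierr);                /* placeholder: ~5 nonzeros/row */
  ierr = MatMPIAIJSetPreallocation(A, 5, NULL, 5, NULL);CHKERRQ(ierr);
#if defined(PETSC_HAVE_HYPRE)
  ierr = MatHYPRESetPreallocation(A, 5, NULL, 5, NULL);CHKERRQ(ierr);        /* the call the old ksp ex56 is missing */
#endif
  /* ... MatSetValues()/MatAssemblyBegin()/MatAssemblyEnd(), then hand A to a KSP ... */
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

As far as I understand, MatHYPRESetPreallocation is ignored unless the matrix type is 
actually hypre, so it can sit alongside the AIJ preallocation calls.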

Mark

On Tue, Dec 14, 2021 at 2:42 AM Karthikeyan Chockalingam - STFC UKRI 
<karthikeyan.chockalin...@stfc.ac.uk> wrote:
I tried adding -mat_block_size 3 but I still get the same error message.

Thanks,
Karthik.

From: Mark Adams <mfad...@lbl.gov>
Date: Monday, 13 December 2021 at 19:54
To: "Chockalingam, Karthikeyan (STFC,DL,HC)" <karthikeyan.chockalin...@stfc.ac.uk>
Cc: Matthew Knepley <knep...@gmail.com>, "petsc-users@mcs.anl.gov" <petsc-users@mcs.anl.gov>
Subject: Re: [petsc-users] Unstructured mesh

Try adding -mat_block_size 3


On Mon, Dec 13, 2021 at 11:57 AM Karthikeyan Chockalingam - STFC UKRI 
<karthikeyan.chockalin...@stfc.ac.uk> wrote:

I tried to run the problem using -pc_type hypre but it errored out:


./ex56 -cells 4,4,2  -max_conv_its 1 -lx 1. -alpha .01 -petscspace_degree 1 
-ksp_type cg -ksp_monitor -ksp_rtol 1.e-8 -pc_type hypre -pc_hypre_type 
boomeramg  -snes_monitor -use_mat_nearnullspace true -snes_rtol 1.e-10 
-ex56_dm_view -log_view -ex56_dm_vec_type cuda -ex56_dm_mat_type hypre  
-options_left

[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Petsc has generated inconsistent data
[0]PETSC ERROR: Blocksize of layout 1 must match that of mapping 3 (or the latter must be 1)
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.16.1-353-g887dddf386  GIT Date: 2021-11-19 20:24:41 +0000
[0]PETSC ERROR: ./ex56 on a arch-linux2-c-opt named sqg2b13.bullx by kxc07-lxm25 Mon Dec 13 16:50:02 2021
[0]PETSC ERROR: Configure options --with-debugging=0 --with-blaslapack-dir=/lustre/scafellpike/local/apps/intel/intel_cs/2018.0.128/mkl --with-cuda=1 --with-cuda-arch=70 --download-hypre=yes --download-hypre-configure-arguments="--with-cuda=yes --enable-gpu-profiling=yes --enable-cusparse=yes --enable-cublas=yes --enable-curand=yes  --enable-unified-memory=yes HYPRE_CUDA_SM=70" --with-shared-libraries=1 --known-mpi-shared-libraries=1 --with-cc=mpicc --with-cxx=mpicxx -with-fc=mpif90
[0]PETSC ERROR: #1 PetscLayoutSetISLocalToGlobalMapping() at /lustre/scafellpike/local/HT04048/lxm25/kxc07-lxm25/petsc-main/petsc/src/vec/is/utils/pmap.c:371
[0]PETSC ERROR: #2 MatSetLocalToGlobalMapping() at /lustre/scafellpike/local/HT04048/lxm25/kxc07-lxm25/petsc-main/petsc/src/mat/interface/matrix.c:2089
[0]PETSC ERROR: #3 DMCreateMatrix_Plex() at /lustre/scafellpike/local/HT04048/lxm25/kxc07-lxm25/petsc-main/petsc/src/dm/impls/plex/plex.c:2460
[0]PETSC ERROR: #4 DMCreateMatrix() at /lustre/scafellpike/local/HT04048/lxm25/kxc07-lxm25/petsc-main/petsc/src/dm/interface/dm.c:1445
[0]PETSC ERROR: #5 main() at ex56.c:439
[0]PETSC ERROR: PETSc Option Table entries:
[0]PETSC ERROR: -alpha .01
[0]PETSC ERROR: -cells 4,4,2
[0]PETSC ERROR: -ex56_dm_mat_type hypre
[0]PETSC ERROR: -ex56_dm_vec_type cuda
[0]PETSC ERROR: -ex56_dm_view
[0]PETSC ERROR: -ksp_monitor
[0]PETSC ERROR: -ksp_rtol 1.e-8
[0]PETSC ERROR: -ksp_type cg
[0]PETSC ERROR: -log_view
[0]PETSC ERROR: -lx 1.
[0]PETSC ERROR: -max_conv_its 1
[0]PETSC ERROR: -options_left
[0]PETSC ERROR: -pc_hypre_type boomeramg
[0]PETSC ERROR: -pc_type hypre
[0]PETSC ERROR: -petscspace_degree 1
[0]PETSC ERROR: -snes_monitor
[0]PETSC ERROR: -snes_rtol 1.e-10
[0]PETSC ERROR: -use_gpu_aware_mpi 0
[0]PETSC ERROR: -use_mat_nearnullspace true
[0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-ma...@mcs.anl.gov----------

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 77.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------


From: Mark Adams <mfad...@lbl.gov>
Date: Monday, 13 December 2021 at 13:58
To: "Chockalingam, Karthikeyan (STFC,DL,HC)" <karthikeyan.chockalin...@stfc.ac.uk>
Cc: Matthew Knepley <knep...@gmail.com>, "petsc-users@mcs.anl.gov" <petsc-users@mcs.anl.gov>
Subject: Re: [petsc-users] Unstructured mesh



On Mon, Dec 13, 2021 at 8:35 AM Karthikeyan Chockalingam - STFC UKRI 
<karthikeyan.chockalin...@stfc.ac.uk> wrote:
Thanks Matt. A couple of weeks back you mentioned:
“There are many unstructured grid examples, e.g. SNES ex13, ex17, ex56. The 
solver can run on the GPU, but the vector/matrix FEM assembly does not. I am 
working on that now.”

I am able to run other examples in ksp/tutorials on GPUs. I compiled ex56 in 
snes/tutorials no differently. The only difference is that I didn’t specify 
_dm_vec_type and _dm_mat_type (as you mentioned, they are not assembled on GPUs 
anyway, and since I am working on an unstructured grid I thought _dm was not the 
right type for this problem). I was hoping to see GPU flops recorded for KSPSolve, 
but I didn’t.

Okay, I will wait for Mark to comment.

This (DM) example works like any other as far as the GPU goes, just with a prefix: 
-ex56_dm_vec_type cuda and -ex56_dm_mat_type cusparse, or aijkokkos/kokkos, etc.
Run with -options_left to verify that these options are used.
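For example, taking the ASM run quoted further down in this thread and adding the 
GPU types (a sketch; aijcusparse is used here for the matrix type, and the other 
options are copied from that run), the command would look something like:

./ex56 -cells 1,1,1 -max_conv_its 3 -lx 1. -alpha .01 -petscspace_degree 1 -ksp_type cg -ksp_monitor -ksp_rtol 1.e-8 -pc_type asm -snes_monitor -use_mat_nearnullspace true -snes_rtol 1.e-10 -ex56_dm_view -ex56_dm_vec_type cuda -ex56_dm_mat_type aijcusparse -log_view -options_left

With the cuda/aijcusparse types set, the KSPSolve event in the -log_view output 
should then show nonzero GPU flops.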


Kind regards,
Karthik.

From: Matthew Knepley <knep...@gmail.com>
Date: Monday, 13 December 2021 at 13:17
To: "Chockalingam, Karthikeyan (STFC,DL,HC)" <karthikeyan.chockalin...@stfc.ac.uk>
Cc: Mark Adams <mfad...@lbl.gov>, "petsc-users@mcs.anl.gov" <petsc-users@mcs.anl.gov>
Subject: Re: [petsc-users] Unstructured mesh

On Mon, Dec 13, 2021 at 7:15 AM Karthikeyan Chockalingam - STFC UKRI 
<karthikeyan.chockalin...@stfc.ac.uk> wrote:

Thank you. I was able to confirm that both of the options below produce the same mesh:

./ex56 -cells 2,2,1 -max_conv_its 2
./ex56 -cells 4,4,2 -max_conv_its 1
Good

But I didn’t get how -cells i,j,k <1,1,1> is related to the number of MPI processes.
It is not. The number of processes is specified independently using 'mpiexec -n 
<p>' or when using the test system NP=<p>.
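As a concrete sketch (the options here are the ones already used in this thread; the 
process count is arbitrary):

mpiexec -n 4 ./ex56 -cells 2,2,1 -max_conv_its 2 -petscspace_degree 1 -ksp_type cg -pc_type asm

runs the -cells 2,2,1 problem on 4 MPI ranks; changing the -n argument changes only 
how the mesh is partitioned, not the mesh itself.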

(i)     Say I start with -cells 1,1,1 -max_conv_its 7; would that eventually leave all 
of the refinement, up to level 7, running on 1 MPI process?

(ii)    Say I start with -cells 2,2,1 -max_conv_its n; is it recommended to run on 4 
MPI processes?
No, those options do not influence the number of processes.

I am running ex56 on the GPU; I am looking at KSPSolve (and other events), but no 
GPU flops are recorded in the -log_view output.

I do not think you are running on the GPU then. Mark can comment, but we 
usually specify GPU execution using the Vec and Mat types
through -dm_vec_type and -dm_mat_type.

  Thanks,

     Matt

For your reference, here are the flags I used:

./ex56 -cells 1,1,1 -max_conv_its 3 -lx 1. -alpha .01 -petscspace_degree 1 
-ksp_type cg -ksp_monitor -ksp_rtol 1.e-8 -pc_type asm -snes_monitor 
-use_mat_nearnullspace true -snes_rtol 1.e-10 -ex56_dm_view -log_view

Kind regards,
Karthik.


From: Mark Adams <mfad...@lbl.gov>
Date: Sunday, 12 December 2021 at 23:00
To: "Chockalingam, Karthikeyan (STFC,DL,HC)" <karthikeyan.chockalin...@stfc.ac.uk>
Cc: Matthew Knepley <knep...@gmail.com>, "petsc-users@mcs.anl.gov" <petsc-users@mcs.anl.gov>
Subject: Re: [petsc-users] Unstructured mesh



On Sun, Dec 12, 2021 at 3:19 PM Karthikeyan Chockalingam - STFC UKRI 
<karthikeyan.chockalin...@stfc.ac.uk> wrote:
Thank you for your response, that was helpful. I have a couple of questions:


(i)     How can I control the level of refinement? I tried to pass the flag 
“-ex56_dm_refine 0” but that didn’t stop the refinement from turning 8 cubes into 
32 cubes.

I answered this question recently, but ex56 clobbers ex56_dm_refine in the 
convergence loop. I have an MR that prints a warning if you provide 
-ex56_dm_refine.

* snes/ex56 runs a convergence study and confusingly sets the options manually, 
thus erasing your -ex56_dm_refine.

* To refine, use -max_conv_its N <3>; this sets the number of steps of refinement, 
that is, the length of the convergence study.

* You can adjust where it starts from with -cells i,j,k <1,1,1>. You do want to set 
this if you have multiple MPI processes, so that the size of this mesh matches the 
number of processes. That way it starts with one cell per process and refines from 
there (see the sketch just below).
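As an illustration of that last point (a sketch; the rank count of 8 is my own choice):

mpiexec -n 8 ./ex56 -cells 2,2,2 -max_conv_its 3 ...

starts from mesh_0 with 2 x 2 x 2 = 8 cells, i.e. one cell per MPI rank, and the 
convergence loop then refines that mesh uniformly at each step.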


(ii)    What does -cells 2,2,1 correspond to?

The initial mesh, or mesh_0. The convergence test uniformly refines this mesh, so if 
you want this mesh refined twice, you could use -cells 8,8,4.


How can I determine the total number of dofs?
Unfortunately, that is not printed, but you can calculate it from the initial cell 
grid, the order of the elements, and the refinement at each iteration of the 
convergence test.
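As a rough worked example (my assumptions: hexahedral cells with -petscspace_degree 1, 
i.e. trilinear elements, and the 3 displacement dofs per vertex of this elasticity 
example, before any boundary conditions are eliminated): starting from -cells nx,ny,nz, 
after r uniform refinements the vertex grid is (nx*2^r + 1) x (ny*2^r + 1) x (nz*2^r + 1), 
so

  ndof = 3 * (nx*2^r + 1) * (ny*2^r + 1) * (nz*2^r + 1)

For instance, -cells 4,4,2 with no refinement gives 3 * 5 * 5 * 3 = 225 dofs, and one 
uniform refinement gives 3 * 9 * 9 * 5 = 1215.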


So that I can perform a scaling study by changing the input to the -cells flag.


You can, and the convergence test gives you data for a strong-scaling (speedup) study 
in one run. Each solve is put in its own "stage" of the output, and you want to look 
at the KSPSolve lines in the -log_view output.



--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
