On Mon, Dec 13, 2021 at 9:40 AM Karthikeyan Chockalingam - STFC UKRI < karthikeyan.chockalin...@stfc.ac.uk> wrote:
> @Mark Adams <mfad...@lbl.gov> Yes, it worked with
> -ex56_dm_mat_type mpiaijcusparse; otherwise it crashes with the error
> message
>
> [0]PETSC ERROR: Unknown Mat type given: cusparse
>
> @Matthew Knepley <knep...@gmail.com> Usually PETSc -log_view reports
> the GPU flops. Alternatively, if you are using an external package such
> as hypre, where gpu flops are not recorded by petsc, profiling using
> Nvidia’s nsight captures them. So one could tell if the problem is
> running on gpus or not.

Yes, that is how we measure GPU flops. I was asking how you tell the
example to run on the GPU. I suggested using -ex56_dm_mat_type. You said
that you were not using this but still running on a GPU. I did not see how
this could be possible, so I was asking about that.

  Thanks,

     Matt

> Kind regards,
>
> Karthik.
>
> From: Mark Adams <mfad...@lbl.gov>
> Date: Monday, 13 December 2021 at 13:58
> To: "Chockalingam, Karthikeyan (STFC,DL,HC)"
>     <karthikeyan.chockalin...@stfc.ac.uk>
> Cc: Matthew Knepley <knep...@gmail.com>, "petsc-users@mcs.anl.gov"
>     <petsc-users@mcs.anl.gov>
> Subject: Re: [petsc-users] Unstructured mesh
>
> On Mon, Dec 13, 2021 at 8:35 AM Karthikeyan Chockalingam - STFC UKRI
> <karthikeyan.chockalin...@stfc.ac.uk> wrote:
>
> Thanks Matt. A couple of weeks back you mentioned
>
> “There are many unstructured grid examples, e.g. SNES ex13, ex17, ex56.
> The solver can run on the GPU, but the vector/matrix FEM assembly does
> not. I am working on that now.”
>
> I am able to run other examples in ksp/tutorials on gpus. I compiled
> ex56 in snes/tutorials no differently. The only difference is that I
> didn’t specify _dm_vec_type and _dm_mat_type (as you mentioned, they are
> not assembled on gpus anyway, plus I am working on an unstructured grid
> and thought _dm is not the right type for this problem). I was hoping to
> see gpu flops recorded for KSPSolve, which I didn’t.
>
> Okay, I will wait for Mark to comment.
>
> This (DM) example works like any other, with a prefix, as far as GPU:
> -ex56_dm_vec_type cuda and -ex56_dm_mat_type cusparse, or
> aijkokkos/kokkos, etc.
>
> Run with -options_left to verify that these are used.
>
> Kind regards,
>
> Karthik.
>
> From: Matthew Knepley <knep...@gmail.com>
> Date: Monday, 13 December 2021 at 13:17
> To: "Chockalingam, Karthikeyan (STFC,DL,HC)"
>     <karthikeyan.chockalin...@stfc.ac.uk>
> Cc: Mark Adams <mfad...@lbl.gov>, "petsc-users@mcs.anl.gov"
>     <petsc-users@mcs.anl.gov>
> Subject: Re: [petsc-users] Unstructured mesh
>
> On Mon, Dec 13, 2021 at 7:15 AM Karthikeyan Chockalingam - STFC UKRI
> <karthikeyan.chockalin...@stfc.ac.uk> wrote:
>
> Thank you. I was able to confirm that both of the options below
> produced the same mesh:
>
> ./ex56 -cells 2,2,1 -max_conv_its 2
>
> ./ex56 -cells 4,4,2 -max_conv_its 1
>
> Good
>
> But I didn’t get how -cells i,j,k <1,1,1> is related to the number of
> MPI processes.
>
> It is not. The number of processes is specified independently using
> 'mpiexec -n <p>' or, when using the test system, NP=<p>.
>
> (i) Say I start with -cells 1,1,1 -max_conv_its 7; would that
> eventually leave all refinement on level 7 running on 1 MPI process?
>
> (ii) Say I start with -cells 2,2,1 -max_conv_its n; is it recommended
> to run on 4 MPI processes?
>
> No, those options do not influence the number of processes.
>
> I am running ex56 on gpu; I am looking at KSPSolve (or any other event)
> but no gpu flops are recorded in the -log_view?
>
> I do not think you are running on the GPU then. Mark can comment, but we
> usually specify GPU execution using the Vec and Mat types through
> -dm_vec_type and -dm_mat_type.
>
> Thanks,
>
> Matt
>
> For your reference I used the below flags:
>
> ./ex56 -cells 1,1,1 -max_conv_its 3 -lx 1. -alpha .01 -petscspace_degree 1 -ksp_type cg -ksp_monitor -ksp_rtol 1.e-8 -pc_type asm -snes_monitor -use_mat_nearnullspace true -snes_rtol 1.e-10 -ex56_dm_view -log_view
>
> Kind regards,
>
> Karthik.
>
> From: Mark Adams <mfad...@lbl.gov>
> Date: Sunday, 12 December 2021 at 23:00
> To: "Chockalingam, Karthikeyan (STFC,DL,HC)"
>     <karthikeyan.chockalin...@stfc.ac.uk>
> Cc: Matthew Knepley <knep...@gmail.com>, "petsc-users@mcs.anl.gov"
>     <petsc-users@mcs.anl.gov>
> Subject: Re: [petsc-users] Unstructured mesh
>
> On Sun, Dec 12, 2021 at 3:19 PM Karthikeyan Chockalingam - STFC UKRI
> <karthikeyan.chockalin...@stfc.ac.uk> wrote:
>
> Thank you for your response; that was helpful. I have a couple of
> questions:
>
> (i) How can I control the level of refinement? I tried to pass the flag
> “-ex56_dm_refine 0” but that didn’t stop the refinement from 8 giving 32
> cubes.
>
> I answered this question recently but ex56 clobbers ex56_dm_refine in
> the convergence loop. I have an MR that prints a warning if you provide
> an ex56_dm_refine.
>
> * snes/ex56 runs a convergence study and confusingly sets the options
>   manually, thus erasing your -ex56_dm_refine.
>
> * To refine, use -max_conv_its N <3>; this sets the number of steps of
>   refinement, that is, the length of the convergence study.
>
> * You can adjust where it starts from with -cells i,j,k <1,1,1>
>
> You do want to set this if you have multiple MPI processes so that the
> size of this mesh is the number of processes. That way it starts with
> one cell per process and refines from there.
>
> (ii) What does -cells 2,2,1 correspond to?
>
> The initial mesh or mesh_0. The convergence test uniformly refines this
> mesh. So if you want to refine this twice you could use -cells 8,8,4
>
> How can I determine the total number of dofs?
>
> Unfortunately, that is not printed, but you can calculate it from the
> initial cell grid, the order of the element and the refinement in each
> iteration of the convergence tests.
>
> So that I can perform a scaling study by changing the input of the flag
> -cells.
>
> You can, and the convergence test gives you data for a strong speedup
> study in one run. Each solve is put in its own "stage" of the output and
> you want to look at the KSPSolve lines in the log_view output.

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
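
The dof calculation described in the thread (initial cell grid, element order, refinement per convergence iteration) can be sketched as below. This is a rough estimate under assumptions that the thread does not confirm: hexahedral cells on a box mesh, continuous Lagrange elements of the degree given by -petscspace_degree, three displacement unknowns per node, each uniform refinement doubling the cell count in every direction, and no dofs removed for boundary conditions. The function name estimate_dofs is made up for illustration and is not part of PETSc.

```python
def estimate_dofs(cells, degree=1, refinements=0, components=3):
    """Estimate the unknowns on a structured hexahedral box mesh.

    cells       -- initial (i, j, k) cell counts, as passed to -cells
    degree      -- Lagrange polynomial degree, as in -petscspace_degree
    refinements -- number of uniform refinements applied to the initial mesh
    components  -- unknowns per node (assumed 3, for a 3D displacement field)
    """
    nodes = 1
    for n in cells:
        # Assumption: each uniform refinement doubles the cells per direction.
        refined = n * 2 ** refinements
        # A continuous degree-p Lagrange space has p * cells + 1 nodes
        # along each direction of a structured hexahedral grid.
        nodes *= degree * refined + 1
    # Boundary conditions are ignored, so this is an upper bound.
    return components * nodes


if __name__ == "__main__":
    # Problem sizes for a study that starts from -cells 2,2,1 and runs
    # three convergence steps (steps 0, 1 and 2).
    for step in range(3):
        print(step, estimate_dofs((2, 2, 1), degree=1, refinements=step))
```

Under these assumptions, -cells 2,2,1 refined once and -cells 4,4,2 unrefined give the same count, which is consistent with the observation in the thread that both produce the same mesh.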