800k is a pretty small problem for GPUs. 

  We would need to see the runs with output from -ksp_view -log_view to see if 
the timing results are reasonable.

> On Apr 12, 2024, at 1:48 PM, Ng, Cho-Kuen <c...@slac.stanford.edu> wrote:
> 
> I performed tests on comparison using KSP with and without cuda backend on 
> NERSC's Perlmutter. For a finite element solve with 800k degrees of freedom, 
> the best times obtained using MPI and MPI+GPU were
> 
> o MPI - 128 MPI tasks, 27 s
> 
> o MPI+GPU - 4 MPI tasks, 4 GPU's, 32 s
> 
> Is that the performance one would expect using the hybrid mode of 
> computation. Attached image shows the scaling on a single node.
> 
> Thanks,
> Cho
> From: Ng, Cho-Kuen <c...@slac.stanford.edu <mailto:c...@slac.stanford.edu>>
> Sent: Saturday, August 12, 2023 8:08 AM
> To: Jacob Faibussowitsch <jacob....@gmail.com <mailto:jacob....@gmail.com>>
> Cc: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>; petsc-users 
> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using PETSc GPU backend
>  
> Thanks Jacob.
> From: Jacob Faibussowitsch <jacob....@gmail.com <mailto:jacob....@gmail.com>>
> Sent: Saturday, August 12, 2023 5:02 AM
> To: Ng, Cho-Kuen <c...@slac.stanford.edu <mailto:c...@slac.stanford.edu>>
> Cc: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>; petsc-users 
> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using PETSc GPU backend
>  
> > Can petsc show the number of GPUs used?
> 
> -device_view
> 
> Best regards,
> 
> Jacob Faibussowitsch
> (Jacob Fai - booss - oh - vitch)
> 
> > On Aug 12, 2023, at 00:53, Ng, Cho-Kuen via petsc-users 
> > <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>> wrote:
> > 
> > Barry,
> > 
> > I tried again today on Perlmutter and running on multiple GPU nodes worked. 
> > Likely, I had messed up something the other day. Also, I was able to have 
> > multiple MPI tasks on a GPU using Nvidia MPS. The petsc output shows the 
> > number of MPI tasks:
> > 
> > KSP Object: 32 MPI processes
> > 
> > Can petsc show the number of GPUs used?
> > 
> > Thanks,
> > Cho
> > 
> > From: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>
> > Sent: Wednesday, August 9, 2023 4:09 PM
> > To: Ng, Cho-Kuen <c...@slac.stanford.edu <mailto:c...@slac.stanford.edu>>
> > Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> > <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> > Subject: Re: [petsc-users] Using PETSc GPU backend
> >  
> >   We would need more information about "hanging". Do PETSc examples and 
> > tiny problems "hang" on multiple nodes? If you run with -info what are the 
> > last messages printed? Can you run with a debugger to see where it is 
> > "hanging"?
> > 
> > 
> > 
> >> On Aug 9, 2023, at 5:59 PM, Ng, Cho-Kuen <c...@slac.stanford.edu 
> >> <mailto:c...@slac.stanford.edu>> wrote:
> >> 
> >> Barry and Matt,
> >> 
> >> Thanks for your help. Now I can use petsc GPU backend on Perlmutter: 1 
> >> node, 4 MPI tasks and 4 GPUs. However, I ran into problems with multiple 
> >> nodes: 2 nodes, 8 MPI tasks and 8 GPUs. The run hung on KSPSolve. How can 
> >> I fix this?
> >> 
> >> Best,
> >> Cho
> >> 
> >>  From: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>
> >> Sent: Monday, July 17, 2023 6:58 AM
> >> To: Ng, Cho-Kuen <c...@slac.stanford.edu <mailto:c...@slac.stanford.edu>>
> >> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> >> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>  
> >>  The examples that use DM, in particular DMDA all trivially support using 
> >> the GPU with -dm_mat_type aijcusparse -dm_vec_type cuda
> >> 
> >> 
> >> 
> >>> On Jul 17, 2023, at 1:45 AM, Ng, Cho-Kuen <c...@slac.stanford.edu 
> >>> <mailto:c...@slac.stanford.edu>> wrote:
> >>> 
> >>> Barry,
> >>> 
> >>> Thank you so much for the clarification. 
> >>> 
> >>> I see that ex104.c and ex300.c use  MatXAIJSetPreallocation(). Are there 
> >>> other tutorials available?
> >>> 
> >>> Cho
> >>>  From: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>
> >>> Sent: Saturday, July 15, 2023 8:36 AM
> >>> To: Ng, Cho-Kuen <c...@slac.stanford.edu <mailto:c...@slac.stanford.edu>>
> >>> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> >>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>  
> >>>      Cho,
> >>> 
> >>>     We currently have a crappy API for turning on GPU support, and our 
> >>> documentation is misleading in places.
> >>> 
> >>>     People constantly say "to use GPU's with PETSc you only need to use 
> >>> -mat_type aijcusparse (for example)" This is incorrect.
> >>> 
> >>>  This does not work with code that uses the convenience Mat constructors 
> >>> such as MatCreateAIJ(), MatCreateAIJWithArrays etc. It only works if you 
> >>> use the constructor approach of MatCreate(), MatSetSizes(), 
> >>> MatSetFromOptions(), MatXXXSetPreallocation(). ...  Similarly you need to 
> >>> use VecCreate(), VecSetSizes(), VecSetFromOptions() and -vec_type cuda
> >>> 
> >>>    If you use DM to create the matrices and vectors then you can use 
> >>> -dm_mat_type aijcusparse -dm_vec_type cuda
> >>> 
> >>>    Sorry for the confusion.
> >>> 
> >>>    Barry
> >>> 
> >>> 
> >>> 
> >>> 
> >>>> On Jul 15, 2023, at 8:03 AM, Matthew Knepley <knep...@gmail.com 
> >>>> <mailto:knep...@gmail.com>> wrote:
> >>>> 
> >>>> On Sat, Jul 15, 2023 at 1:44 AM Ng, Cho-Kuen <c...@slac.stanford.edu 
> >>>> <mailto:c...@slac.stanford.edu>> wrote:
> >>>> Matt,
> >>>> 
> >>>> After inserting 2 lines in the code:
> >>>> 
> >>>>   ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);                   
> >>>>    
> >>>>   ierr = MatSetFromOptions(A);CHKERRQ(ierr);
> >>>>   ierr = MatCreateAIJ(PETSC_COMM_WORLD,mlocal,mlocal,m,n,
> >>>>                       d_nz,PETSC_NULL,o_nz,PETSC_NULL,&A);;CHKERRQ(ierr);
> >>>> 
> >>>> "There are no unused options." However, there is no improvement on the 
> >>>> GPU performance.
> >>>> 
> >>>> 1. MatCreateAIJ() sets the type, and in fact it overwrites the Mat you 
> >>>> created in steps 1 and 2. This is detailed in the manual.
> >>>> 
> >>>> 2. You should replace MatCreateAIJ(), with MatSetSizes() before 
> >>>> MatSetFromOptions().
> >>>> 
> >>>>   THanks,
> >>>> 
> >>>>     Matt
> >>>>  Thanks,
> >>>> Cho
> >>>> 
> >>>>  From: Matthew Knepley <knep...@gmail.com <mailto:knep...@gmail.com>>
> >>>> Sent: Friday, July 14, 2023 5:57 PM
> >>>> To: Ng, Cho-Kuen <c...@slac.stanford.edu <mailto:c...@slac.stanford.edu>>
> >>>> Cc: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>; Mark Adams 
> >>>> <mfad...@lbl.gov <mailto:mfad...@lbl.gov>>; petsc-users@mcs.anl.gov 
> >>>> <mailto:petsc-users@mcs.anl.gov> <petsc-users@mcs.anl.gov 
> >>>> <mailto:petsc-users@mcs.anl.gov>>
> >>>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>>  On Fri, Jul 14, 2023 at 7:57 PM Ng, Cho-Kuen <c...@slac.stanford.edu 
> >>>> <mailto:c...@slac.stanford.edu>> wrote:
> >>>> I managed to pass the following options to PETSc using a GPU node on 
> >>>> Perlmutter.
> >>>> 
> >>>>     -mat_type aijcusparse -vec_type cuda -log_view -options_left
> >>>> 
> >>>> Below is a summary of the test using 4 MPI tasks and 1 GPU per task.
> >>>> 
> >>>> o #PETSc Option Table entries:
> >>>>    -log_view
> >>>>    -mat_type aijcusparse
> >>>>    -options_left
> >>>>    -vec_type cuda
> >>>>    #End of PETSc Option Table entries
> >>>>    WARNING! There are options you set that were not used!
> >>>>    WARNING! could be spelling mistake, etc!
> >>>>    There is one unused database option. It is:
> >>>>    Option left: name:-mat_type value: aijcusparse
> >>>> 
> >>>> The -mat_type option has not been used. In the application code, we use
> >>>> 
> >>>>     ierr = MatCreateAIJ(PETSC_COMM_WORLD,mlocal,mlocal,m,n,
> >>>>              d_nz,PETSC_NULL,o_nz,PETSC_NULL,&A);;CHKERRQ(ierr);
> >>>> 
> >>>> 
> >>>> If you create the Mat this way, then you need MatSetFromOptions() in 
> >>>> order to set the type from the command line.
> >>>> 
> >>>>   Thanks,
> >>>> 
> >>>>      Matt
> >>>>  o The percent flops on the GPU for KSPSolve is 17%.
> >>>> 
> >>>> In comparison with a CPU run using 16 MPI tasks, the GPU run is an order 
> >>>> of magnitude slower. How can I improve the GPU performance?
> >>>> 
> >>>> Thanks,
> >>>> Cho
> >>>>  From: Ng, Cho-Kuen <c...@slac.stanford.edu 
> >>>> <mailto:c...@slac.stanford.edu>>
> >>>> Sent: Friday, June 30, 2023 7:57 AM
> >>>> To: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>; Mark Adams 
> >>>> <mfad...@lbl.gov <mailto:mfad...@lbl.gov>>
> >>>> Cc: Matthew Knepley <knep...@gmail.com <mailto:knep...@gmail.com>>; 
> >>>> petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >>>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> >>>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>>  Barry, Mark and Matt,
> >>>> 
> >>>> Thank you all for the suggestions. I will modify the code so we can pass 
> >>>> runtime options.
> >>>> 
> >>>> Cho
> >>>>  From: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>
> >>>> Sent: Friday, June 30, 2023 7:01 AM
> >>>> To: Mark Adams <mfad...@lbl.gov <mailto:mfad...@lbl.gov>>
> >>>> Cc: Matthew Knepley <knep...@gmail.com <mailto:knep...@gmail.com>>; Ng, 
> >>>> Cho-Kuen <c...@slac.stanford.edu <mailto:c...@slac.stanford.edu>>; 
> >>>> petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >>>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> >>>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>>  
> >>>>   Note that options like -mat_type aijcusparse  -vec_type cuda only work 
> >>>> if the program is set up to allow runtime swapping of matrix and vector 
> >>>> types. If you have a call to MatCreateMPIAIJ() or other specific types 
> >>>> then then these options do nothing but because Mark had you use 
> >>>> -options_left the program will tell you at the end that it did not use 
> >>>> the option so you will know.
> >>>> 
> >>>>> On Jun 30, 2023, at 9:30 AM, Mark Adams <mfad...@lbl.gov 
> >>>>> <mailto:mfad...@lbl.gov>> wrote:
> >>>>> 
> >>>>> PetscCall(PetscInitialize(&argc, &argv, NULL, help)); gives us the args 
> >>>>> and you run:
> >>>>> 
> >>>>> a.out -mat_type aijcusparse -vec_type cuda -log_view -options_left
> >>>>> 
> >>>>> Mark
> >>>>> 
> >>>>> On Fri, Jun 30, 2023 at 6:16 AM Matthew Knepley <knep...@gmail.com 
> >>>>> <mailto:knep...@gmail.com>> wrote:
> >>>>> On Fri, Jun 30, 2023 at 1:13 AM Ng, Cho-Kuen via petsc-users 
> >>>>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>> wrote:
> >>>>> Mark,
> >>>>> 
> >>>>> The application code reads in parameters from an input file, where we 
> >>>>> can put the PETSc runtime options. Then we pass the options to 
> >>>>> PetscInitialize(...). Does that sounds right?
> >>>>> 
> >>>>> PETSc will read command line argument automatically in 
> >>>>> PetscInitialize() unless you shut it off.
> >>>>> 
> >>>>>   Thanks,
> >>>>> 
> >>>>>     Matt
> >>>>>  Cho
> >>>>>  From: Ng, Cho-Kuen <c...@slac.stanford.edu 
> >>>>> <mailto:c...@slac.stanford.edu>>
> >>>>> Sent: Thursday, June 29, 2023 8:32 PM
> >>>>> To: Mark Adams <mfad...@lbl.gov <mailto:mfad...@lbl.gov>>
> >>>>> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >>>>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> >>>>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>>>  Mark,
> >>>>> 
> >>>>> Thanks for the information. How do I put the runtime options for the 
> >>>>> executable, say, a.out, which does not have the provision to append 
> >>>>> arguments? Do I need to change the C++ main to read in the options?
> >>>>> 
> >>>>> Cho
> >>>>>  From: Mark Adams <mfad...@lbl.gov <mailto:mfad...@lbl.gov>>
> >>>>> Sent: Thursday, June 29, 2023 5:55 PM
> >>>>> To: Ng, Cho-Kuen <c...@slac.stanford.edu 
> >>>>> <mailto:c...@slac.stanford.edu>>
> >>>>> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >>>>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> >>>>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>>>  Run with options: -mat_type aijcusparse -vec_type cuda -log_view 
> >>>>> -options_left
> >>>>> The last column of the performance data (from -log_view) will be the 
> >>>>> percent flops on the GPU. Check that that is > 0.
> >>>>> 
> >>>>> The end of the output will list the options that were used and options 
> >>>>> that were _not_ used (if any). Check that there are no options left.
> >>>>> 
> >>>>> Mark
> >>>>> 
> >>>>> On Thu, Jun 29, 2023 at 7:50 PM Ng, Cho-Kuen via petsc-users 
> >>>>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>> wrote:
> >>>>> I installed PETSc on Perlmutter using "spack install petsc+cuda+zoltan" 
> >>>>> and used it by "spack load petsc/fwge6pf". Then I compiled the 
> >>>>> application code (purely CPU code) linking to the petsc package, hoping 
> >>>>> that I can get performance improvement using the petsc GPU backend. 
> >>>>> However, the timing was the same using the same number of MPI tasks 
> >>>>> with and without GPU accelerators. Have I missed something in the 
> >>>>> process, for example, setting up PETSc options at runtime to use the 
> >>>>> GPU backend?
> >>>>> 
> >>>>> Thanks,
> >>>>> Cho
> >>>>> 
> >>>>> 
> >>>>> -- 
> >>>>> What most experimenters take for granted before they begin their 
> >>>>> experiments is infinitely more interesting than any results to which 
> >>>>> their experiments lead.
> >>>>> -- Norbert Wiener
> >>>>> 
> >>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fMikK_wvRIVkv5jLV6EHt_rPhWLibqlxAAYjRVMbAEGOUp417LWCH59TvzCtcD3j4dOd4xR_tUy2MRnqU1N7kew$
> >>>>>  
> >>>> 
> >>>> 
> >>>> 
> >>>> -- 
> >>>> What most experimenters take for granted before they begin their 
> >>>> experiments is infinitely more interesting than any results to which 
> >>>> their experiments lead.
> >>>> -- Norbert Wiener
> >>>> 
> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fMikK_wvRIVkv5jLV6EHt_rPhWLibqlxAAYjRVMbAEGOUp417LWCH59TvzCtcD3j4dOd4xR_tUy2MRnqU1N7kew$
> >>>>  
> >>>> 
> >>>> 
> >>>> -- 
> >>>> What most experimenters take for granted before they begin their 
> >>>> experiments is infinitely more interesting than any results to which 
> >>>> their experiments lead.
> >>>> -- Norbert Wiener
> >>>> 
> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fMikK_wvRIVkv5jLV6EHt_rPhWLibqlxAAYjRVMbAEGOUp417LWCH59TvzCtcD3j4dOd4xR_tUy2MRnqU1N7kew$
> >>>>  
> >> 
> >> 
> >> 
> >>  From: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>
> >> Sent: Monday, July 17, 2023 6:58 AM
> >> To: Ng, Cho-Kuen <c...@slac.stanford.edu <mailto:c...@slac.stanford.edu>>
> >> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> >> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>  
> >>  The examples that use DM, in particular DMDA all trivially support using 
> >> the GPU with -dm_mat_type aijcusparse -dm_vec_type cuda
> >> 
> >> 
> >> 
> >>> On Jul 17, 2023, at 1:45 AM, Ng, Cho-Kuen <c...@slac.stanford.edu 
> >>> <mailto:c...@slac.stanford.edu>> wrote:
> >>> 
> >>> Barry,
> >>> 
> >>> Thank you so much for the clarification. 
> >>> 
> >>> I see that ex104.c and ex300.c use  MatXAIJSetPreallocation(). Are there 
> >>> other tutorials available?
> >>> 
> >>> Cho
> >>>  From: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>
> >>> Sent: Saturday, July 15, 2023 8:36 AM
> >>> To: Ng, Cho-Kuen <c...@slac.stanford.edu <mailto:c...@slac.stanford.edu>>
> >>> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> >>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>  
> >>>      Cho,
> >>> 
> >>>     We currently have a crappy API for turning on GPU support, and our 
> >>> documentation is misleading in places.
> >>> 
> >>>     People constantly say "to use GPU's with PETSc you only need to use 
> >>> -mat_type aijcusparse (for example)" This is incorrect.
> >>> 
> >>>  This does not work with code that uses the convenience Mat constructors 
> >>> such as MatCreateAIJ(), MatCreateAIJWithArrays etc. It only works if you 
> >>> use the constructor approach of MatCreate(), MatSetSizes(), 
> >>> MatSetFromOptions(), MatXXXSetPreallocation(). ...  Similarly you need to 
> >>> use VecCreate(), VecSetSizes(), VecSetFromOptions() and -vec_type cuda
> >>> 
> >>>    If you use DM to create the matrices and vectors then you can use 
> >>> -dm_mat_type aijcusparse -dm_vec_type cuda
> >>> 
> >>>    Sorry for the confusion.
> >>> 
> >>>    Barry
> >>> 
> >>> 
> >>> 
> >>> 
> >>>> On Jul 15, 2023, at 8:03 AM, Matthew Knepley <knep...@gmail.com 
> >>>> <mailto:knep...@gmail.com>> wrote:
> >>>> 
> >>>> On Sat, Jul 15, 2023 at 1:44 AM Ng, Cho-Kuen <c...@slac.stanford.edu 
> >>>> <mailto:c...@slac.stanford.edu>> wrote:
> >>>> Matt,
> >>>> 
> >>>> After inserting 2 lines in the code:
> >>>> 
> >>>>   ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);                   
> >>>>    
> >>>>   ierr = MatSetFromOptions(A);CHKERRQ(ierr);
> >>>>   ierr = MatCreateAIJ(PETSC_COMM_WORLD,mlocal,mlocal,m,n,
> >>>>                       d_nz,PETSC_NULL,o_nz,PETSC_NULL,&A);;CHKERRQ(ierr);
> >>>> 
> >>>> "There are no unused options." However, there is no improvement on the 
> >>>> GPU performance.
> >>>> 
> >>>> 1. MatCreateAIJ() sets the type, and in fact it overwrites the Mat you 
> >>>> created in steps 1 and 2. This is detailed in the manual.
> >>>> 
> >>>> 2. You should replace MatCreateAIJ(), with MatSetSizes() before 
> >>>> MatSetFromOptions().
> >>>> 
> >>>>   THanks,
> >>>> 
> >>>>     Matt
> >>>>  Thanks,
> >>>> Cho
> >>>> 
> >>>>  From: Matthew Knepley <knep...@gmail.com <mailto:knep...@gmail.com>>
> >>>> Sent: Friday, July 14, 2023 5:57 PM
> >>>> To: Ng, Cho-Kuen <c...@slac.stanford.edu <mailto:c...@slac.stanford.edu>>
> >>>> Cc: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>; Mark Adams 
> >>>> <mfad...@lbl.gov <mailto:mfad...@lbl.gov>>; petsc-users@mcs.anl.gov 
> >>>> <mailto:petsc-users@mcs.anl.gov> <petsc-users@mcs.anl.gov 
> >>>> <mailto:petsc-users@mcs.anl.gov>>
> >>>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>>  On Fri, Jul 14, 2023 at 7:57 PM Ng, Cho-Kuen <c...@slac.stanford.edu 
> >>>> <mailto:c...@slac.stanford.edu>> wrote:
> >>>> I managed to pass the following options to PETSc using a GPU node on 
> >>>> Perlmutter.
> >>>> 
> >>>>     -mat_type aijcusparse -vec_type cuda -log_view -options_left
> >>>> 
> >>>> Below is a summary of the test using 4 MPI tasks and 1 GPU per task.
> >>>> 
> >>>> o #PETSc Option Table entries:
> >>>>    -log_view
> >>>>    -mat_type aijcusparse
> >>>>    -options_left
> >>>>    -vec_type cuda
> >>>>    #End of PETSc Option Table entries
> >>>>    WARNING! There are options you set that were not used!
> >>>>    WARNING! could be spelling mistake, etc!
> >>>>    There is one unused database option. It is:
> >>>>    Option left: name:-mat_type value: aijcusparse
> >>>> 
> >>>> The -mat_type option has not been used. In the application code, we use
> >>>> 
> >>>>     ierr = MatCreateAIJ(PETSC_COMM_WORLD,mlocal,mlocal,m,n,
> >>>>              d_nz,PETSC_NULL,o_nz,PETSC_NULL,&A);;CHKERRQ(ierr);
> >>>> 
> >>>> 
> >>>> If you create the Mat this way, then you need MatSetFromOptions() in 
> >>>> order to set the type from the command line.
> >>>> 
> >>>>   Thanks,
> >>>> 
> >>>>      Matt
> >>>>  o The percent flops on the GPU for KSPSolve is 17%.
> >>>> 
> >>>> In comparison with a CPU run using 16 MPI tasks, the GPU run is an order 
> >>>> of magnitude slower. How can I improve the GPU performance?
> >>>> 
> >>>> Thanks,
> >>>> Cho
> >>>>  From: Ng, Cho-Kuen <c...@slac.stanford.edu 
> >>>> <mailto:c...@slac.stanford.edu>>
> >>>> Sent: Friday, June 30, 2023 7:57 AM
> >>>> To: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>; Mark Adams 
> >>>> <mfad...@lbl.gov <mailto:mfad...@lbl.gov>>
> >>>> Cc: Matthew Knepley <knep...@gmail.com <mailto:knep...@gmail.com>>; 
> >>>> petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >>>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> >>>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>>  Barry, Mark and Matt,
> >>>> 
> >>>> Thank you all for the suggestions. I will modify the code so we can pass 
> >>>> runtime options.
> >>>> 
> >>>> Cho
> >>>>  From: Barry Smith <bsm...@petsc.dev <mailto:bsm...@petsc.dev>>
> >>>> Sent: Friday, June 30, 2023 7:01 AM
> >>>> To: Mark Adams <mfad...@lbl.gov <mailto:mfad...@lbl.gov>>
> >>>> Cc: Matthew Knepley <knep...@gmail.com <mailto:knep...@gmail.com>>; Ng, 
> >>>> Cho-Kuen <c...@slac.stanford.edu <mailto:c...@slac.stanford.edu>>; 
> >>>> petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >>>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> >>>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>>  
> >>>>   Note that options like -mat_type aijcusparse  -vec_type cuda only work 
> >>>> if the program is set up to allow runtime swapping of matrix and vector 
> >>>> types. If you have a call to MatCreateMPIAIJ() or other specific types 
> >>>> then then these options do nothing but because Mark had you use 
> >>>> -options_left the program will tell you at the end that it did not use 
> >>>> the option so you will know.
> >>>> 
> >>>>> On Jun 30, 2023, at 9:30 AM, Mark Adams <mfad...@lbl.gov 
> >>>>> <mailto:mfad...@lbl.gov>> wrote:
> >>>>> 
> >>>>> PetscCall(PetscInitialize(&argc, &argv, NULL, help)); gives us the args 
> >>>>> and you run:
> >>>>> 
> >>>>> a.out -mat_type aijcusparse -vec_type cuda -log_view -options_left
> >>>>> 
> >>>>> Mark
> >>>>> 
> >>>>> On Fri, Jun 30, 2023 at 6:16 AM Matthew Knepley <knep...@gmail.com 
> >>>>> <mailto:knep...@gmail.com>> wrote:
> >>>>> On Fri, Jun 30, 2023 at 1:13 AM Ng, Cho-Kuen via petsc-users 
> >>>>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>> wrote:
> >>>>> Mark,
> >>>>> 
> >>>>> The application code reads in parameters from an input file, where we 
> >>>>> can put the PETSc runtime options. Then we pass the options to 
> >>>>> PetscInitialize(...). Does that sounds right?
> >>>>> 
> >>>>> PETSc will read command line argument automatically in 
> >>>>> PetscInitialize() unless you shut it off.
> >>>>> 
> >>>>>   Thanks,
> >>>>> 
> >>>>>     Matt
> >>>>>  Cho
> >>>>>  From: Ng, Cho-Kuen <c...@slac.stanford.edu 
> >>>>> <mailto:c...@slac.stanford.edu>>
> >>>>> Sent: Thursday, June 29, 2023 8:32 PM
> >>>>> To: Mark Adams <mfad...@lbl.gov <mailto:mfad...@lbl.gov>>
> >>>>> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >>>>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> >>>>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>>>  Mark,
> >>>>> 
> >>>>> Thanks for the information. How do I put the runtime options for the 
> >>>>> executable, say, a.out, which does not have the provision to append 
> >>>>> arguments? Do I need to change the C++ main to read in the options?
> >>>>> 
> >>>>> Cho
> >>>>>  From: Mark Adams <mfad...@lbl.gov <mailto:mfad...@lbl.gov>>
> >>>>> Sent: Thursday, June 29, 2023 5:55 PM
> >>>>> To: Ng, Cho-Kuen <c...@slac.stanford.edu 
> >>>>> <mailto:c...@slac.stanford.edu>>
> >>>>> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> >>>>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>>
> >>>>> Subject: Re: [petsc-users] Using PETSc GPU backend
> >>>>>  Run with options: -mat_type aijcusparse -vec_type cuda -log_view 
> >>>>> -options_left
> >>>>> The last column of the performance data (from -log_view) will be the 
> >>>>> percent flops on the GPU. Check that that is > 0.
> >>>>> 
> >>>>> The end of the output will list the options that were used and options 
> >>>>> that were _not_ used (if any). Check that there are no options left.
> >>>>> 
> >>>>> Mark
> >>>>> 
> >>>>> On Thu, Jun 29, 2023 at 7:50 PM Ng, Cho-Kuen via petsc-users 
> >>>>> <petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>> wrote:
> >>>>> I installed PETSc on Perlmutter using "spack install petsc+cuda+zoltan" 
> >>>>> and used it by "spack load petsc/fwge6pf". Then I compiled the 
> >>>>> application code (purely CPU code) linking to the petsc package, hoping 
> >>>>> that I can get performance improvement using the petsc GPU backend. 
> >>>>> However, the timing was the same using the same number of MPI tasks 
> >>>>> with and without GPU accelerators. Have I missed something in the 
> >>>>> process, for example, setting up PETSc options at runtime to use the 
> >>>>> GPU backend?
> >>>>> 
> >>>>> Thanks,
> >>>>> Cho
> >>>>> 
> >>>>> 
> >>>>> -- 
> >>>>> What most experimenters take for granted before they begin their 
> >>>>> experiments is infinitely more interesting than any results to which 
> >>>>> their experiments lead.
> >>>>> -- Norbert Wiener
> >>>>> 
> >>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fMikK_wvRIVkv5jLV6EHt_rPhWLibqlxAAYjRVMbAEGOUp417LWCH59TvzCtcD3j4dOd4xR_tUy2MRnqU1N7kew$
> >>>>>  
> >>>> 
> >>>> 
> >>>> 
> >>>> -- 
> >>>> What most experimenters take for granted before they begin their 
> >>>> experiments is infinitely more interesting than any results to which 
> >>>> their experiments lead.
> >>>> -- Norbert Wiener
> >>>> 
> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fMikK_wvRIVkv5jLV6EHt_rPhWLibqlxAAYjRVMbAEGOUp417LWCH59TvzCtcD3j4dOd4xR_tUy2MRnqU1N7kew$
> >>>>  
> >>>> 
> >>>> 
> >>>> -- 
> >>>> What most experimenters take for granted before they begin their 
> >>>> experiments is infinitely more interesting than any results to which 
> >>>> their experiments lead.
> >>>> -- Norbert Wiener
> >>>> 
> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fMikK_wvRIVkv5jLV6EHt_rPhWLibqlxAAYjRVMbAEGOUp417LWCH59TvzCtcD3j4dOd4xR_tUy2MRnqU1N7kew$
> >>>>  
> 
> 
> <Untitled.png>

Reply via email to