I ran tests comparing KSP with and without the CUDA backend on NERSC's 
Perlmutter. For a finite element solve with 800k degrees of freedom, 
the best times obtained using MPI and MPI+GPU were

o MPI - 128 MPI tasks, 27 s

o MPI+GPU - 4 MPI tasks, 4 GPUs, 32 s

Is that the performance one would expect from the hybrid mode of computation? 
The attached image shows the scaling on a single node.

Thanks,
Cho
________________________________
From: Ng, Cho-Kuen <c...@slac.stanford.edu>
Sent: Saturday, August 12, 2023 8:08 AM
To: Jacob Faibussowitsch <jacob....@gmail.com>
Cc: Barry Smith <bsm...@petsc.dev>; petsc-users <petsc-users@mcs.anl.gov>
Subject: Re: [petsc-users] Using PETSc GPU backend

Thanks Jacob.
________________________________
From: Jacob Faibussowitsch <jacob....@gmail.com>
Sent: Saturday, August 12, 2023 5:02 AM
To: Ng, Cho-Kuen <c...@slac.stanford.edu>
Cc: Barry Smith <bsm...@petsc.dev>; petsc-users <petsc-users@mcs.anl.gov>
Subject: Re: [petsc-users] Using PETSc GPU backend

> Can petsc show the number of GPUs used?

-device_view

Best regards,

Jacob Faibussowitsch
(Jacob Fai - booss - oh - vitch)

> On Aug 12, 2023, at 00:53, Ng, Cho-Kuen via petsc-users 
> <petsc-users@mcs.anl.gov> wrote:
>
> Barry,
>
> I tried again today on Perlmutter and running on multiple GPU nodes worked. 
> Likely, I had messed up something the other day. Also, I was able to have 
> multiple MPI tasks on a GPU using Nvidia MPS. The petsc output shows the 
> number of MPI tasks:
>
> KSP Object: 32 MPI processes
>
> Can petsc show the number of GPUs used?
>
> Thanks,
> Cho
>
> From: Barry Smith <bsm...@petsc.dev>
> Sent: Wednesday, August 9, 2023 4:09 PM
> To: Ng, Cho-Kuen <c...@slac.stanford.edu>
> Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
> Subject: Re: [petsc-users] Using PETSc GPU backend
>
>   We would need more information about "hanging". Do PETSc examples and tiny 
> problems "hang" on multiple nodes? If you run with -info what are the last 
> messages printed? Can you run with a debugger to see where it is "hanging"?
>
>
>
>> On Aug 9, 2023, at 5:59 PM, Ng, Cho-Kuen <c...@slac.stanford.edu> wrote:
>>
>> Barry and Matt,
>>
>> Thanks for your help. Now I can use petsc GPU backend on Perlmutter: 1 node, 
>> 4 MPI tasks and 4 GPUs. However, I ran into problems with multiple nodes: 2 
>> nodes, 8 MPI tasks and 8 GPUs. The run hung on KSPSolve. How can I fix this?
>>
>> Best,
>> Cho
>>
>>  From: Barry Smith <bsm...@petsc.dev>
>> Sent: Monday, July 17, 2023 6:58 AM
>> To: Ng, Cho-Kuen <c...@slac.stanford.edu>
>> Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
>> Subject: Re: [petsc-users] Using PETSc GPU backend
>>
>>  The examples that use DM, in particular DMDA all trivially support using 
>> the GPU with -dm_mat_type aijcusparse -dm_vec_type cuda
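
A minimal sketch of the DM route mentioned above, assuming PETSc 3.17 or later (for PetscCall()) built with CUDA support. The 1D grid and its sizes are purely illustrative and not taken from the application discussed in this thread.

```c
#include <petscdmda.h>

int main(int argc, char **argv)
{
  DM      da;
  Mat     A;
  Vec     x;
  MatType mtype;
  VecType vtype;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Illustrative 1D grid: 64 points, 1 dof per point, stencil width 1 */
  PetscCall(DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 64, 1, 1, NULL, &da));
  /* Reads -dm_mat_type and -dm_vec_type from the options database */
  PetscCall(DMSetFromOptions(da));
  PetscCall(DMSetUp(da));

  /* Objects created from the DM inherit the types chosen at runtime */
  PetscCall(DMCreateMatrix(da, &A));
  PetscCall(DMCreateGlobalVector(da, &x));

  PetscCall(MatGetType(A, &mtype));
  PetscCall(VecGetType(x, &vtype));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Mat type: %s, Vec type: %s\n", mtype, vtype));

  PetscCall(VecDestroy(&x));
  PetscCall(MatDestroy(&A));
  PetscCall(DMDestroy(&da));
  PetscCall(PetscFinalize());
  return 0;
}
```

Running it with -dm_mat_type aijcusparse -dm_vec_type cuda -options_left should print the CUDA variants of the types and report no unused options.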
>>
>>
>>
>>> On Jul 17, 2023, at 1:45 AM, Ng, Cho-Kuen <c...@slac.stanford.edu> wrote:
>>>
>>> Barry,
>>>
>>> Thank you so much for the clarification.
>>>
>>> I see that ex104.c and ex300.c use  MatXAIJSetPreallocation(). Are there 
>>> other tutorials available?
>>>
>>> Cho
>>>  From: Barry Smith <bsm...@petsc.dev>
>>> Sent: Saturday, July 15, 2023 8:36 AM
>>> To: Ng, Cho-Kuen <c...@slac.stanford.edu>
>>> Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
>>> Subject: Re: [petsc-users] Using PETSc GPU backend
>>>
>>>      Cho,
>>>
>>>     We currently have a crappy API for turning on GPU support, and our 
>>> documentation is misleading in places.
>>>
>>>     People constantly say "to use GPUs with PETSc you only need to use 
>>> -mat_type aijcusparse (for example)". This is incorrect.
>>>
>>>  This does not work with code that uses the convenience Mat constructors 
>>> such as MatCreateAIJ(), MatCreateAIJWithArrays(), etc. It only works if you 
>>> use the constructor approach of MatCreate(), MatSetSizes(), 
>>> MatSetFromOptions(), MatXXXSetPreallocation(), ... Similarly, you need to 
>>> use VecCreate(), VecSetSizes(), VecSetFromOptions() and -vec_type cuda.
>>>
>>>    If you use DM to create the matrices and vectors then you can use 
>>> -dm_mat_type aijcusparse -dm_vec_type cuda
>>>
>>>    Sorry for the confusion.
>>>
>>>    Barry
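
A minimal sketch of the constructor sequence described above, assuming PETSc 3.19 or later (for PetscCall() and PETSC_SUCCESS) built with CUDA support. The helper name CreateRuntimeTypedObjects and the sizes used in main() are hypothetical; it is one way to arrange the calls, not the application's actual code.

```c
#include <petscmat.h>

/* Hypothetical helper: creates a square matrix and a matching vector whose
   concrete types can be chosen at runtime with -mat_type and -vec_type. */
static PetscErrorCode CreateRuntimeTypedObjects(MPI_Comm comm, PetscInt mlocal, PetscInt d_nz, PetscInt o_nz, Mat *A, Vec *x)
{
  PetscFunctionBeginUser;
  PetscCall(MatCreate(comm, A));
  PetscCall(MatSetSizes(*A, mlocal, mlocal, PETSC_DETERMINE, PETSC_DETERMINE));
  /* Picks up -mat_type (e.g. aijcusparse); must come before preallocation */
  PetscCall(MatSetFromOptions(*A));
  /* Calling both is the usual idiom; the one that does not match the
     actual matrix type is ignored */
  PetscCall(MatSeqAIJSetPreallocation(*A, d_nz, NULL));
  PetscCall(MatMPIAIJSetPreallocation(*A, d_nz, NULL, o_nz, NULL));

  PetscCall(VecCreate(comm, x));
  PetscCall(VecSetSizes(*x, mlocal, PETSC_DETERMINE));
  /* Picks up -vec_type (e.g. cuda) */
  PetscCall(VecSetFromOptions(*x));
  PetscFunctionReturn(PETSC_SUCCESS);
}

int main(int argc, char **argv)
{
  Mat     A;
  Vec     x;
  MatType mtype;
  VecType vtype;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* Illustrative sizes: 4 local rows, 3 diagonal and 1 off-diagonal nonzeros per row */
  PetscCall(CreateRuntimeTypedObjects(PETSC_COMM_WORLD, 4, 3, 1, &A, &x));
  PetscCall(MatGetType(A, &mtype));
  PetscCall(VecGetType(x, &vtype));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Mat type: %s, Vec type: %s\n", mtype, vtype));
  PetscCall(VecDestroy(&x));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}
```

Running it with -mat_type aijcusparse -vec_type cuda -options_left should print the CUDA variants of the types and list no unused options, which is a quick check before timing KSPSolve.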
>>>
>>>
>>>
>>>
>>>> On Jul 15, 2023, at 8:03 AM, Matthew Knepley <knep...@gmail.com> wrote:
>>>>
>>>> On Sat, Jul 15, 2023 at 1:44 AM Ng, Cho-Kuen <c...@slac.stanford.edu> 
>>>> wrote:
>>>> Matt,
>>>>
>>>> After inserting 2 lines in the code:
>>>>
>>>>   ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
>>>>   ierr = MatSetFromOptions(A);CHKERRQ(ierr);
>>>>   ierr = MatCreateAIJ(PETSC_COMM_WORLD,mlocal,mlocal,m,n,
>>>>                       d_nz,PETSC_NULL,o_nz,PETSC_NULL,&A);CHKERRQ(ierr);
>>>>
>>>> "There are no unused options." However, there is no improvement on the GPU 
>>>> performance.
>>>>
>>>> 1. MatCreateAIJ() sets the type, and in fact it overwrites the Mat you 
>>>> created in steps 1 and 2. This is detailed in the manual.
>>>>
>>>> 2. You should replace MatCreateAIJ() with MatSetSizes(), called before 
>>>> MatSetFromOptions().
>>>>
>>>>   Thanks,
>>>>
>>>>     Matt
>>>>  Thanks,
>>>> Cho
>>>>
>>>>  From: Matthew Knepley <knep...@gmail.com>
>>>> Sent: Friday, July 14, 2023 5:57 PM
>>>> To: Ng, Cho-Kuen <c...@slac.stanford.edu>
>>>> Cc: Barry Smith <bsm...@petsc.dev>; Mark Adams <mfad...@lbl.gov>; 
>>>> petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
>>>> Subject: Re: [petsc-users] Using PETSc GPU backend
>>>>  On Fri, Jul 14, 2023 at 7:57 PM Ng, Cho-Kuen <c...@slac.stanford.edu> 
>>>> wrote:
>>>> I managed to pass the following options to PETSc using a GPU node on 
>>>> Perlmutter.
>>>>
>>>>     -mat_type aijcusparse -vec_type cuda -log_view -options_left
>>>>
>>>> Below is a summary of the test using 4 MPI tasks and 1 GPU per task.
>>>>
>>>> o #PETSc Option Table entries:
>>>>    -log_view
>>>>    -mat_type aijcusparse
>>>>    -options_left
>>>>    -vec_type cuda
>>>>    #End of PETSc Option Table entries
>>>>    WARNING! There are options you set that were not used!
>>>>    WARNING! could be spelling mistake, etc!
>>>>    There is one unused database option. It is:
>>>>    Option left: name:-mat_type value: aijcusparse
>>>>
>>>> The -mat_type option has not been used. In the application code, we use
>>>>
>>>>     ierr = MatCreateAIJ(PETSC_COMM_WORLD,mlocal,mlocal,m,n,
>>>>              d_nz,PETSC_NULL,o_nz,PETSC_NULL,&A);CHKERRQ(ierr);
>>>>
>>>>
>>>> If you create the Mat this way, then you need MatSetFromOptions() in order 
>>>> to set the type from the command line.
>>>>
>>>>   Thanks,
>>>>
>>>>      Matt
>>>>  o The percent flops on the GPU for KSPSolve is 17%.
>>>>
>>>> In comparison with a CPU run using 16 MPI tasks, the GPU run is an order 
>>>> of magnitude slower. How can I improve the GPU performance?
>>>>
>>>> Thanks,
>>>> Cho
>>>>  From: Ng, Cho-Kuen <c...@slac.stanford.edu>
>>>> Sent: Friday, June 30, 2023 7:57 AM
>>>> To: Barry Smith <bsm...@petsc.dev>; Mark Adams <mfad...@lbl.gov>
>>>> Cc: Matthew Knepley <knep...@gmail.com>; petsc-users@mcs.anl.gov 
>>>> <petsc-users@mcs.anl.gov>
>>>> Subject: Re: [petsc-users] Using PETSc GPU backend
>>>>  Barry, Mark and Matt,
>>>>
>>>> Thank you all for the suggestions. I will modify the code so we can pass 
>>>> runtime options.
>>>>
>>>> Cho
>>>>  From: Barry Smith <bsm...@petsc.dev>
>>>> Sent: Friday, June 30, 2023 7:01 AM
>>>> To: Mark Adams <mfad...@lbl.gov>
>>>> Cc: Matthew Knepley <knep...@gmail.com>; Ng, Cho-Kuen 
>>>> <c...@slac.stanford.edu>; petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
>>>> Subject: Re: [petsc-users] Using PETSc GPU backend
>>>>
>>>>   Note that options like -mat_type aijcusparse -vec_type cuda only work 
>>>> if the program is set up to allow runtime swapping of matrix and vector 
>>>> types. If you have a call to MatCreateMPIAIJ() or another type-specific 
>>>> constructor, then these options do nothing. But because Mark had you use 
>>>> -options_left, the program will tell you at the end that it did not use 
>>>> the option, so you will know.
>>>>
>>>>> On Jun 30, 2023, at 9:30 AM, Mark Adams <mfad...@lbl.gov> wrote:
>>>>>
>>>>> PetscCall(PetscInitialize(&argc, &argv, NULL, help)); gives us the args 
>>>>> and you run:
>>>>>
>>>>> a.out -mat_type aijcusparse -vec_type cuda -log_view -options_left
>>>>>
>>>>> Mark
>>>>>
>>>>> On Fri, Jun 30, 2023 at 6:16 AM Matthew Knepley <knep...@gmail.com> wrote:
>>>>> On Fri, Jun 30, 2023 at 1:13 AM Ng, Cho-Kuen via petsc-users 
>>>>> <petsc-users@mcs.anl.gov> wrote:
>>>>> Mark,
>>>>>
>>>>> The application code reads in parameters from an input file, where we can 
>>>>> put the PETSc runtime options. Then we pass the options to 
>>>>> PetscInitialize(...). Does that sound right?
>>>>>
>>>>> PETSc will read command-line arguments automatically in PetscInitialize() 
>>>>> unless you shut it off.
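
For an application that keeps its PETSc options in its own input file, one possible approach (not suggested in this thread, so treat it as an assumption) is to insert the option string into the options database after PetscInitialize() using PetscOptionsInsertString(). The sketch below assumes PETSc 3.17 or later; the variable from_input_file is a hypothetical stand-in for whatever string the application parses from its file.

```c
#include <petscsys.h>

int main(int argc, char **argv)
{
  char      value[64];
  PetscBool set;
  /* Hypothetical: a real application would read this string from its input file */
  const char *from_input_file = "-mat_type aijcusparse -vec_type cuda";

  /* Command-line arguments are still read here as usual */
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Add the options from the application's own input to the database */
  PetscCall(PetscOptionsInsertString(NULL, from_input_file));

  /* Confirm the option is now visible to later MatSetFromOptions() calls */
  PetscCall(PetscOptionsGetString(NULL, NULL, "-mat_type", value, sizeof(value), &set));
  if (set) PetscCall(PetscPrintf(PETSC_COMM_WORLD, "-mat_type = %s\n", value));

  PetscCall(PetscFinalize());
  return 0;
}
```

The options only take effect for objects that later call MatSetFromOptions() or VecSetFromOptions(), so the insertion has to happen before those objects are created.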
>>>>>
>>>>>   Thanks,
>>>>>
>>>>>     Matt
>>>>>  Cho
>>>>>  From: Ng, Cho-Kuen <c...@slac.stanford.edu>
>>>>> Sent: Thursday, June 29, 2023 8:32 PM
>>>>> To: Mark Adams <mfad...@lbl.gov>
>>>>> Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
>>>>> Subject: Re: [petsc-users] Using PETSc GPU backend
>>>>>  Mark,
>>>>>
>>>>> Thanks for the information. How do I put the runtime options for the 
>>>>> executable, say, a.out, which does not have the provision to append 
>>>>> arguments? Do I need to change the C++ main to read in the options?
>>>>>
>>>>> Cho
>>>>>  From: Mark Adams <mfad...@lbl.gov>
>>>>> Sent: Thursday, June 29, 2023 5:55 PM
>>>>> To: Ng, Cho-Kuen <c...@slac.stanford.edu>
>>>>> Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
>>>>> Subject: Re: [petsc-users] Using PETSc GPU backend
>>>>>  Run with options: -mat_type aijcusparse -vec_type cuda -log_view 
>>>>> -options_left
>>>>> The last column of the performance data (from -log_view) will be the 
>>>>> percent flops on the GPU. Check that that is > 0.
>>>>>
>>>>> The end of the output will list the options that were used and options 
>>>>> that were _not_ used (if any). Check that there are no options left.
>>>>>
>>>>> Mark
>>>>>
>>>>> On Thu, Jun 29, 2023 at 7:50 PM Ng, Cho-Kuen via petsc-users 
>>>>> <petsc-users@mcs.anl.gov> wrote:
>>>>> I installed PETSc on Perlmutter using "spack install petsc+cuda+zoltan" 
>>>>> and used it by "spack load petsc/fwge6pf". Then I compiled the 
>>>>> application code (purely CPU code) linking to the petsc package, hoping 
>>>>> that I can get performance improvement using the petsc GPU backend. 
>>>>> However, the timing was the same using the same number of MPI tasks with 
>>>>> and without GPU accelerators. Have I missed something in the process, for 
>>>>> example, setting up PETSc options at runtime to use the GPU backend?
>>>>>
>>>>> Thanks,
>>>>> Cho
>>>>>
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their 
>>>>> experiments is infinitely more interesting than any results to which 
>>>>> their experiments lead.
>>>>> -- Norbert Wiener
>>>>>
>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>  

