Hi,

I’ve built PETSc with NVIDIA support for our GPU machine 
(https://cirrus.readthedocs.io/en/master/user-guide/gpu.html), and then 
compiled our executable against this PETSc (using version 3.13.3). I should add 
that the MPI on our system is not GPU-aware, so I have to run with -use_gpu_aware_mpi 0.

When running this, I put the following in my .petscrc:

-dm_vec_type cuda
-dm_mat_type aijcusparse

as suggested on the PETSc GPU page 
(https://www.mcs.anl.gov/petsc/features/gpus.html) to enable CUDA for DMs (all 
our PETSc data structures are managed through DMs). I have also ensured that I 
am using the jacobi preconditioner, which, according to the same page, 
definitely runs on the GPU.
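
To verify that the CUDA types are actually picked up, I understand I can run 
with -ksp_view, which should report the vector and matrix types in use (e.g. 
mpicuda rather than mpi):

mpirun -n 4 ./twophase.x -ksp_view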

When I run this, I note that the GPU has memory allocated on it by my 
executable, but it appears to be doing no computation:

Wed Aug  5 13:10:23 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64.00    Driver Version: 440.64.00    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:1A:00.0 Off |                  Off |
| N/A   43C    P0    64W / 300W |    490MiB / 16160MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     33712      C   .../z04/gpsgibb/TPLS/TPLS-GPU/./twophase.x   479MiB |
+-----------------------------------------------------------------------------+

I then ran the same example without the -dm_vec_type cuda and -dm_mat_type 
aijcusparse arguments, and found the same behaviour (479MiB allocated on the 
GPU, 0% GPU utilisation).

In both cases the runtimes are nearly identical, suggesting that both runs are 
essentially doing the same thing.

As a further test I compiled PETSc without CUDA support and ran the same 
example again, and found the same runtime as with the GPU build and (as 
expected) no GPU memory allocated. I then tried to run this example with the 
-dm_vec_type cuda and -dm_mat_type aijcusparse arguments, and it ran without 
complaint. I would have expected it to throw an error, or at least a warning, 
if invalid arguments were passed to it.
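
I understand PETSc also provides -options_left, which reports any options that 
were set but never used, so running with it should show whether the CUDA 
options are being consumed at all:

mpirun -n 4 ./twophase.x -options_left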

All this suggests to me that PETSc is ignoring my requests to use the GPUs. 
The GPU-enabled PETSc allocates memory on the GPUs but performs no 
calculations on them, regardless of whether I request GPU use or not. The 
non-GPU-enabled PETSc accepts my requests to use the GPUs, but does not throw 
an error.

What am I doing wrong?

Thanks in advance,

Gordon
-----------------------------------------------
Dr Gordon P S Gibb
EPCC, The University of Edinburgh
Tel: +44 131 651 3459

The University of Edinburgh is a charitable body, registered in Scotland, with 
registration number SC005336.
