Colleagues,

I did not notice this, but Junchao's MR, "Directly pass root/leafdata to MPI in SF when possible"

  https://gitlab.com/petsc/petsc/-/merge_requests/2506

that was merged into master over the weekend causes PETSc to error out if PETSc has been configured with GPU support but the MPI implementation is not GPU-aware, unless the user has specified "-use_gpu_aware_mpi 0":

> [0]PETSC ERROR: PETSc is configured with GPU support, but your MPI is not GPU-aware. For better performance, please use a GPU-aware MPI.
> [0]PETSC ERROR: For IBM Spectrum MPI on OLCF Summit, you may need jsrun --smpiargs=-gpu.
> [0]PETSC ERROR: For OpenMPI, you need to configure it --with-cuda (https://www.open-mpi.org/faq/?category=buildcuda)
> [0]PETSC ERROR: For MVAPICH2-GDR, you need to set MV2_USE_CUDA=1 (http://mvapich.cse.ohio-state.edu/userguide/gdr/)
> [0]PETSC ERROR: For Cray-MPICH, you need to set MPICH_RDMA_ENABLED_CUDA=1 (https://www.olcf.ornl.gov/tutorials/gpudirect-mpich-enabled-cuda/)
> [0]PETSC ERROR: If you do not care, use option -use_gpu_aware_mpi 0, then PETSc will copy data from GPU to CPU for communication.
> application called MPI_Abort(MPI_COMM_WORLD, 90693076) - process 0

I like that we are warning users about a potential performance problem, but this seems like something that should print a warning rather than exit with an error. So I am wondering:

1) Do people agree that this should be a warning instead of an error?

and

2) Shouldn't we add a standard mechanism for reporting these sorts of warnings at runtime?

--Richard
