Glad to hear you found a way. Did you use Frontera at TACC? If so, I could give it a try.
--Junchao Zhang

On Tue, Apr 16, 2024 at 8:35 PM Sreeram R Venkat <srven...@utexas.edu> wrote:
> I finally figured out a way to make it work. I had to build PETSc and my
> application using the (non-GPU-aware) Intel MPI. Then, before running, I
> switch to the MVAPICH2-GDR.
> I'm not sure why that works, but it's the only way I've found to compile
> and run successfully without throwing any errors about not having a
> GPU-aware MPI.
>
> On Fri, Dec 8, 2023 at 5:30 PM Mark Adams <mfad...@lbl.gov> wrote:
>> You may need to set some env variables. This can be system specific, so
>> you might want to look at the docs or ask TACC how to run with GPU-aware
>> MPI.
>>
>> Mark
>>
>> On Fri, Dec 8, 2023 at 5:17 PM Sreeram R Venkat <srven...@utexas.edu> wrote:
>>> Actually, when I compile my program with this build of PETSc and run, I
>>> still get the error:
>>>
>>> PETSC ERROR: PETSc is configured with GPU support, but your MPI is not
>>> GPU-aware. For better performance, please use a GPU-aware MPI.
>>>
>>> I have the mvapich2-gdr module loaded and MV2_USE_CUDA=1.
>>>
>>> Is there anything else I need to do?
>>>
>>> Thanks,
>>> Sreeram
>>>
>>> On Fri, Dec 8, 2023 at 3:29 PM Sreeram R Venkat <srven...@utexas.edu> wrote:
>>>> Thank you, changing to CUDA 11.4 fixed the issue.
>>>> The mvapich2-gdr module didn't require CUDA 11.4 as a dependency, so I
>>>> was using 12.0.
>>>>
>>>> On Fri, Dec 8, 2023 at 1:15 PM Satish Balay <ba...@mcs.anl.gov> wrote:
>>>>> Executing: mpicc -show
>>>>> stdout: icc -I/opt/apps/cuda/11.4/include
>>>>> -I/opt/apps/cuda/11.4/include -lcuda -L/opt/apps/cuda/11.4/lib64/stubs
>>>>> -L/opt/apps/cuda/11.4/lib64 -lcudart -lrt
>>>>> -Wl,-rpath,/opt/apps/cuda/11.4/lib64 -Wl,-rpath,XORIGIN/placeholder
>>>>> -Wl,--build-id -L/opt/apps/cuda/11.4/lib64/ -lm
>>>>> -I/opt/apps/intel19/mvapich2-gdr/2.3.7/include
>>>>> -L/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,-rpath
>>>>> -Wl,/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,--enable-new-dtags
>>>>> -lmpi
>>>>>
>>>>> Checking for program /opt/apps/cuda/12.0/bin/nvcc...found
>>>>>
>>>>> Looks like you are trying to mix two different CUDA versions in this
>>>>> build. Perhaps you need to use cuda-11.4 with this install of mvapich.
>>>>>
>>>>> Satish
>>>>>
>>>>> On Fri, 8 Dec 2023, Matthew Knepley wrote:
>>>>>
>>>>>> On Fri, Dec 8, 2023 at 1:54 PM Sreeram R Venkat <srven...@utexas.edu> wrote:
>>>>>>
>>>>>>> I am trying to build PETSc with CUDA using the CUDA-aware MVAPICH2-GDR.
>>>>>>>
>>>>>>> Here is my configure command:
>>>>>>>
>>>>>>> ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre
>>>>>>> --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true
>>>>>>> --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis
>>>>>>> --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90
>>>>>>>
>>>>>>> which errors with:
>>>>>>>
>>>>>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details):
>>>>>>> ---------------------------------------------------------------------------------------------
>>>>>>> CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14
>>>>>>> -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode
>>>>>>> arch=compute_80,code=sm_80"
>>>>>>> generated from "--with-cuda-arch=80"
>>>>>>>
>>>>>>> The same configure command works when I use the Intel MPI, and I can
>>>>>>> build with CUDA. The full config.log file is attached. Please let me
>>>>>>> know if you need any other information. I appreciate your help with
>>>>>>> this.
>>>>>>
>>>>>> The proximate error is
>>>>>>
>>>>>> Executing: nvcc -c -o /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o
>>>>>> -I/tmp/petsc-kn3f29gl/config.setCompilers
>>>>>> -I/tmp/petsc-kn3f29gl/config.types
>>>>>> -I/tmp/petsc-kn3f29gl/config.packages.cuda -ccbin mpic++ -std=c++14
>>>>>> -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode
>>>>>> arch=compute_80,code=sm_80 /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu
>>>>>> stdout:
>>>>>> /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one
>>>>>> instance of overloaded function "__nv_associate_access_property_impl"
>>>>>> has "C" linkage
>>>>>> 1 error detected in the compilation of
>>>>>> "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu".
>>>>>> Possible ERROR while running compiler: exit code 1
>>>>>> stderr:
>>>>>> /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one
>>>>>> instance of overloaded function "__nv_associate_access_property_impl"
>>>>>> has "C" linkage
>>>>>>
>>>>>> 1 error detected in the compilation of
>>>>>> "/tmp/petsc-kn3f29gl/config.packages.cuda
>>>>>>
>>>>>> This looks like screwed-up headers to me, but I will let someone who
>>>>>> understands CUDA compilation reply.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Matt
>>>>>>
>>>>>>> Thanks,
>>>>>>> Sreeram
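[Editor's note] The configure failure discussed in this thread came down to a CUDA version mismatch: mvapich2-gdr was linked against CUDA 11.4, while configure picked up nvcc from CUDA 12.0. A sketch of how that can be checked and avoided is below; the module and path names come from the TACC system in the thread and will differ elsewhere:

```shell
# Make the loaded CUDA module match the one mvapich2-gdr was built against
# (11.4 here, per the `mpicc -show` output in the thread), not 12.0.
module load cuda/11.4
module load mvapich2-gdr/2.3.7

# Sanity check: mpicc and nvcc should point at the same CUDA installation.
mpicc -show | grep -o '/opt/apps/cuda/[0-9.]*' | sort -u
which nvcc    # should live under the same /opt/apps/cuda/11.4 tree

# Then re-run the same PETSc configure command from the thread.
./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --with-cuda=true \
  --cuda-dir=$TACC_CUDA_DIR --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90
```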
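[Editor's note] The eventual workaround described at the top of the thread (build against the non-GPU-aware Intel MPI, then run under MVAPICH2-GDR) might look roughly like this on a TACC-style system. The specific `module` commands and the `ibrun` launcher are assumptions based on TACC conventions, not commands quoted from the thread:

```shell
# 1) Build PETSc and the application against the (non-GPU-aware) Intel MPI.
module load intel impi cuda
./configure --with-cuda=true --cuda-dir=$TACC_CUDA_DIR \
  --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90
make all

# 2) Before running, switch to the CUDA-aware MVAPICH2-GDR and enable its
#    GPU path; MV2_USE_CUDA=1 tells MVAPICH2-GDR to handle GPU buffers.
module swap impi mvapich2-gdr
export MV2_USE_CUDA=1
ibrun ./my_app    # ibrun is TACC's MPI launch wrapper
```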