Glad to hear you found a way. Did you use Frontera at TACC? If so, I
could give it a try.

--Junchao Zhang


On Tue, Apr 16, 2024 at 8:35 PM Sreeram R Venkat <srven...@utexas.edu>
wrote:

> I finally figured out a way to make it work. I had to build PETSc and my
> application using the (non GPU-aware) Intel MPI. Then, before running, I
> switch to the MVAPICH2-GDR.
> I'm not sure why that works, but it's the only way I've found to compile
> and run successfully without throwing any errors about not having a
> GPU-aware MPI.
>
>
>
> On Fri, Dec 8, 2023 at 5:30 PM Mark Adams <mfad...@lbl.gov> wrote:
>
>> You may need to set some env variables. This can be system specific so
>> you might want to look at docs or ask TACC how to run with GPU-aware MPI.
>>
>> Mark
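
A minimal sketch of how the environment-variable suggestion could be checked before launching a job. MV2_USE_CUDA=1 is the only setting named in this thread; the helper name missing_gpu_mpi_settings and the idea of keeping a dictionary of required settings are hypothetical, and whatever else a given site requires would have to come from its own documentation.

```python
import os


def missing_gpu_mpi_settings(env, required):
    """Return the names in `required` whose value in `env` is absent or differs."""
    return [name for name, expected in required.items() if env.get(name) != expected]


if __name__ == "__main__":
    # MV2_USE_CUDA=1 is the setting mentioned in the thread; extend this
    # dictionary with whatever the site documentation calls for (assumption).
    required = {"MV2_USE_CUDA": "1"}
    problems = missing_gpu_mpi_settings(dict(os.environ), required)
    if problems:
        print("unset or unexpected:", ", ".join(problems))
    else:
        print("all listed settings present")
```

Note that this only inspects the environment of the process that runs it; the variables still have to reach every MPI rank at launch time.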
>>
>> On Fri, Dec 8, 2023 at 5:17 PM Sreeram R Venkat <srven...@utexas.edu>
>> wrote:
>>
>>> Actually, when I compile my program with this build of PETSc and run, I
>>> still get the error:
>>>
>>> PETSC ERROR: PETSc is configured with GPU support, but your MPI is not
>>> GPU-aware. For better performance, please use a GPU-aware MPI.
>>>
>>> I have the mvapich2-gdr module loaded and MV2_USE_CUDA=1.
>>>
>>> Is there anything else I need to do?
>>>
>>> Thanks,
>>> Sreeram
>>>
>>> On Fri, Dec 8, 2023 at 3:29 PM Sreeram R Venkat <srven...@utexas.edu>
>>> wrote:
>>>
>>>> Thank you, changing to CUDA 11.4 fixed the issue. The mvapich2-gdr
>>>> module didn't require CUDA 11.4 as a dependency, so I was using 12.0.
>>>>
>>>> On Fri, Dec 8, 2023 at 1:15 PM Satish Balay <ba...@mcs.anl.gov> wrote:
>>>>
>>>>> Executing: mpicc -show
>>>>> stdout: icc -I/opt/apps/cuda/11.4/include
>>>>> -I/opt/apps/cuda/11.4/include -lcuda -L/opt/apps/cuda/11.4/lib64/stubs
>>>>> -L/opt/apps/cuda/11.4/lib64 -lcudart -lrt
>>>>> -Wl,-rpath,/opt/apps/cuda/11.4/lib64 -Wl,-rpath,XORIGIN/placeholder
>>>>> -Wl,--build-id -L/opt/apps/cuda/11.4/lib64/ -lm
>>>>> -I/opt/apps/intel19/mvapich2-gdr/2.3.7/include
>>>>> -L/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,-rpath
>>>>> -Wl,/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -Wl,--enable-new-dtags 
>>>>> -lmpi
>>>>>
>>>>>     Checking for program /opt/apps/cuda/12.0/bin/nvcc...found
>>>>>
>>>>> Looks like you are mixing two different CUDA versions in this build:
>>>>> the mpicc wrapper points at /opt/apps/cuda/11.4, but configure found
>>>>> nvcc in /opt/apps/cuda/12.0.
>>>>>
>>>>> Perhaps you need to use cuda-11.4 with this install of mvapich.
>>>>>
>>>>> Satish
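
The mismatch spotted above can also be caught mechanically. The following is a rough sketch, not part of PETSc's configure: it pulls CUDA version numbers out of install paths in the `mpicc -show` output and compares them with the version in the nvcc path. The function names and the assumption that the toolkit lives under a `.../cuda/<major>.<minor>/...` path (as it does in the log above) are illustrative only.

```python
import re

# Matches CUDA toolkit versions embedded in install paths such as
# /opt/apps/cuda/11.4/include (path layout taken from the log above).
_CUDA_PATH = re.compile(r"/cuda/(\d+\.\d+)(?=/|\s|$)")


def cuda_versions(text):
    """Return the set of CUDA versions that appear in paths within `text`."""
    return set(_CUDA_PATH.findall(text))


def check_consistency(mpicc_show, nvcc_path):
    """Compare CUDA versions in the MPI wrapper flags against the nvcc path.

    Returns (ok, wrapper_versions, nvcc_versions); ok is True only when both
    sides mention a CUDA version and the sets are identical.
    """
    wrapper = cuda_versions(mpicc_show)
    nvcc = cuda_versions(nvcc_path)
    ok = bool(wrapper) and bool(nvcc) and wrapper == nvcc
    return ok, wrapper, nvcc


if __name__ == "__main__":
    # Abbreviated copy of the wrapper output and nvcc path quoted in the thread.
    show = ("icc -I/opt/apps/cuda/11.4/include -L/opt/apps/cuda/11.4/lib64 "
            "-lcudart -L/opt/apps/intel19/mvapich2-gdr/2.3.7/lib64 -lmpi")
    ok, wrapper, nvcc = check_consistency(show, "/opt/apps/cuda/12.0/bin/nvcc")
    print("consistent" if ok else f"mismatch: wrapper={wrapper} nvcc={nvcc}")
```

In practice the first argument would be the captured output of `mpicc -show` and the second the result of `which nvcc`; sites that install CUDA under a differently shaped path would need a different pattern.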
>>>>>
>>>>> On Fri, 8 Dec 2023, Matthew Knepley wrote:
>>>>>
>>>>> > On Fri, Dec 8, 2023 at 1:54 PM Sreeram R Venkat <srven...@utexas.edu> wrote:
>>>>> >
>>>>> > > I am trying to build PETSc with CUDA using the CUDA-Aware MVAPICH2-GDR.
>>>>> > >
>>>>> > > Here is my configure command:
>>>>> > >
>>>>> > > ./configure PETSC_ARCH=linux-c-debug-mvapich2-gdr --download-hypre
>>>>> > >  --with-cuda=true --cuda-dir=$TACC_CUDA_DIR --with-hdf5=true
>>>>> > > --with-hdf5-dir=$TACC_PHDF5_DIR --download-elemental --download-metis
>>>>> > > --download-parmetis --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90
>>>>> > >
>>>>> > > which errors with:
>>>>> > >
>>>>> > >           UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details):
>>>>> > >
>>>>> > >
>>>>> ---------------------------------------------------------------------------------------------
>>>>> > >   CUDA compile failed with arch flags " -ccbin mpic++ -std=c++14
>>>>> > > -Xcompiler -fPIC
>>>>> > >   -Xcompiler -fvisibility=hidden -g -lineinfo -gencode
>>>>> > > arch=compute_80,code=sm_80"
>>>>> > >   generated from "--with-cuda-arch=80"
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> > > The same configure command works when I use the Intel MPI, and I can
>>>>> > > build with CUDA. The full configure.log file is attached. Please let
>>>>> > > me know if you need any other information. I appreciate your help
>>>>> > > with this.
>>>>> > >
>>>>> >
>>>>> > The proximate error is
>>>>> >
>>>>> > Executing: nvcc -c -o /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.o
>>>>> > -I/tmp/petsc-kn3f29gl/config.setCompilers
>>>>> > -I/tmp/petsc-kn3f29gl/config.types
>>>>> > -I/tmp/petsc-kn3f29gl/config.packages.cuda  -ccbin mpic++ -std=c++14
>>>>> > -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -g -lineinfo -gencode
>>>>> > arch=compute_80,code=sm_80  /tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu
>>>>> > stdout:
>>>>> > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one
>>>>> > instance of overloaded function "__nv_associate_access_property_impl" has
>>>>> > "C" linkage
>>>>> > 1 error detected in the compilation of
>>>>> > "/tmp/petsc-kn3f29gl/config.packages.cuda/conftest.cu".
>>>>> > Possible ERROR while running compiler: exit code 1
>>>>> > stderr:
>>>>> > /opt/apps/cuda/11.4/include/crt/sm_80_rt.hpp(141): error: more than one
>>>>> > instance of overloaded function "__nv_associate_access_property_impl" has
>>>>> > "C" linkage
>>>>> >
>>>>> > 1 error detected in the compilation of
>>>>> > "/tmp/petsc-kn3f29gl/config.packages.cuda
>>>>> >
>>>>> > This looks like screwed up headers to me, but I will let someone that
>>>>> > understands CUDA compilation reply.
>>>>> >
>>>>> >   Thanks,
>>>>> >
>>>>> >      Matt
>>>>> >
>>>>> > > Thanks,
>>>>> > > Sreeram
>>>>> > >
>>>>> >
>>>>> >
>>>>> >
>>>>
>>>>
