OK, Stefano got this to build. I now get the error below. We have seen this before and thought that you needed to run with GPU-aware MPI turned off. The test failed in make without PETSC_OPTIONS='-use_gpu_aware_mpi 0'. After adding it I get a runtime error, so my PETSC_OPTIONS setting did seem to take effect.
Treb, maybe this will work for you. This might be from our testing makefile.

17:14 stefanozampini/hypre-gpu> /gpfs/alpine/csc314/scratch/adams/petsc2$ *export PETSC_OPTIONS='-use_gpu_aware_mpi 0'*
17:15 stefanozampini/hypre-gpu> /gpfs/alpine/csc314/scratch/adams/petsc2$ make PETSC_DIR=/gpfs/alpine/csc314/scratch/adams/petsc2 PETSC_ARCH=arch-summit-hypre-cuda-dbg2 *-f gmakefile.test test search='ksp_ksp_tutorials-ex4_hypre_device'*
Using MAKEFLAGS: -- search=ksp_ksp_tutorials-ex4_hypre_device PETSC_ARCH=arch-summit-hypre-cuda-dbg2 PETSC_DIR=/gpfs/alpine/csc314/scratch/adams/petsc2
TEST arch-summit-hypre-cuda-dbg2/tests/counts/ksp_ksp_tutorials-ex4_hypre_device.counts
ok ksp_ksp_tutorials-ex4_hypre_device+nsize-1
ok diff-ksp_ksp_tutorials-ex4_hypre_device+nsize-1
not ok ksp_ksp_tutorials-ex4_hypre_device+nsize-2 # Error code: 1
# [1] (1117677) Warning: Could not find key lid0:0:2 in cache <=========================
# [1] (1117677) Warning: Could not find key qpn0:0:0:2 in cache <=========================
# Unable to connect queue-pairs
# [g13n10:1117677] Error: common_pami.c:1094 - ompi_common_pami_init() 1: Unable to create 1 PAMI communication context(s) rc=1
# --------------------------------------------------------------------------
# No components were able to be opened in the pml framework.
#
# This typically means that either no components of this type were
# installed, or none of the installed components can be loaded.
# Sometimes this means that shared libraries required by these
# components are unable to be found/loaded.
#
# Host: g13n10
# Framework: pml
# --------------------------------------------------------------------------
# [g13n10:1117677] PML pami cannot be selected
ok ksp_ksp_tutorials-ex4_hypre_device # SKIP Command failed so no diff
# FAILED ksp_ksp_tutorials-ex4_hypre_device+nsize-2
#
# To rerun failed tests:
# /usr/bin/gmake -f gmakefile test test-fail=1

On Mon, Aug 30, 2021 at 3:35 PM Mark Adams <mfad...@lbl.gov> wrote:

> I see you have an MR. Should I try these changes in my repo?
>
> On Mon, Aug 30, 2021 at 3:32 PM Mark Adams <mfad...@lbl.gov> wrote:
>
>> I am in a branch of Stefano's and a user wants this for a milestone asap.
>> Maybe you can send me the fix and I can add it manually.
>> My branch is in a funny "main<>" state and I'm not sure how to pull,
>> etc., without Stefano.
>> Thanks,
>> Mark
>>
>>
>> On Mon, Aug 30, 2021 at 3:28 PM Jacob Faibussowitsch <jacob....@gmail.com> wrote:
>>
>>> That did not seem to work.
>>>
>>>
>>> So gcc didn’t ignore the hand-coded definitions in
>>> src/sys/objects/device/interface/cupminterface.cxx
>>>
>>> See https://gitlab.com/petsc/petsc/-/merge_requests/4271 where I swap
>>> constexpr for const and see if it works.
>>>
>>> Best regards,
>>>
>>> Jacob Faibussowitsch
>>> (Jacob Fai - booss - oh - vitch)
>>>
>>> On Aug 30, 2021, at 14:14, Mark Adams <mfad...@lbl.gov> wrote:
>>>
>>> That did not seem to work.
>>>
>>> 15:09 main<> /gpfs/alpine/csc314/scratch/adams/petsc2$ mpicc --version
>>> gcc (GCC) 9.1.0
>>> Copyright (C) 2019 Free Software Foundation, Inc.
>>> This is free software; see the source for copying conditions. There is NO
>>> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>>>
>>> I have read "GCC 8.x (and later) fully supports all of C++17."
>>>
>>>
>>>
>>> On Mon, Aug 30, 2021 at 3:07 PM Jacob Faibussowitsch <jacob....@gmail.com> wrote:
>>>
>>>> Yeah I suppose so, all the values we alias are integral types so static
>>>> const should have equivalent compile-time assurance as constexpr.
>>>>
>>>> Best regards,
>>>>
>>>> Jacob Faibussowitsch
>>>> (Jacob Fai - booss - oh - vitch)
>>>>
>>>> On Aug 30, 2021, at 13:44, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>>>>
>>>> Can you use less fancy 'static const int'?
>>>> --Junchao Zhang
>>>>
>>>>
>>>> On Mon, Aug 30, 2021 at 1:02 PM Jacob Faibussowitsch <jacob....@gmail.com> wrote:
>>>>
>>>>> No luck with C++14
>>>>>
>>>>>
>>>>> TL;DR: you need to have host and device compiler either both using
>>>>> c++17 or neither using c++17.
>>>>>
>>>>> Long version:
>>>>> C++17 among other things changed how static constexpr member variables
>>>>> for classes work. Previously, if I had a class with a static constexpr
>>>>> member variable, I had to not only declare it within the class but also
>>>>> define it in a source file; otherwise the variable would not actually
>>>>> have any physical memory address:
>>>>>
>>>>> // foo.hpp
>>>>> class foo
>>>>> {
>>>>>   static constexpr int bar = 5;
>>>>> };
>>>>>
>>>>> // foo.cpp
>>>>> constexpr int foo::bar;
>>>>>
>>>>> In c++17 however this changed because you can have static “inline”
>>>>> variables. All this does is force the compiler to define the variable for
>>>>> you instead. The issue of course is that static constexpr implicitly makes
>>>>> that variable inline in c++17. So to sum it up:
>>>>>
>>>>> 1. The c++17 compiler (nvcc) sees the static constexpr variable, goes
>>>>> “hmm ok I will define this in some undefined location”.
>>>>> 2. The c++11/14 compiler comes along, sees your hand-coded definition
>>>>> of the variable and goes “ah but I think I’ve seen this before, I’ll
>>>>> ignore it”. This silent rejection is due to the hand-coded definition
>>>>> idiom being deprecated from c++17 onwards. Stupid, I know.
>>>>> 3. The linker (driven by the c++11/14 compiler since PETSc links using
>>>>> the host compiler) comes along and now suddenly cannot find the literal
>>>>> definition, because it doesn’t know what the c++17 compiler did. Disaster!
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Jacob Faibussowitsch
>>>>> (Jacob Fai - booss - oh - vitch)
>>>>>
>>>>> On Aug 30, 2021, at 10:12, Mark Adams <mfad...@lbl.gov> wrote:
>>>>>
>>>>> No luck with C++14
>>>>>
>>>>> CUDAC arch-summit-hypre-cuda-dbg/obj/vec/is/sf/impls/basic/cuda/sfcuda.o
>>>>> CUDAC.dep arch-summit-hypre-cuda-dbg/obj/vec/is/sf/impls/basic/cuda/sfcuda.o
>>>>> CLINKER arch-summit-hypre-cuda-dbg/lib/libpetsc.so.3.015.3
>>>>> arch-summit-hypre-cuda-dbg/obj/sys/objects/device/impls/cupm/cuda/cupmcontext.o:(.rodata._ZN5Petsc13CUPMInterfaceILNS_14CUPMDeviceKindE0EE21cupmStreamNonBlockingE[_ZN5Petsc13CUPMInterfaceILNS_14CUPMDeviceKindE0EE21cupmStreamNonBlockingE]+0x0):
>>>>> multiple definition of `Petsc::CUPMInterface<(Petsc::CUPMDeviceKind)0>::cupmStreamNonBlocking'
>>>>> arch-summit-hypre-cuda-dbg/obj/sys/objects/device/interface/cupminterface.o:(.rodata+0x44): first defined here
>>>>> /usr/bin/ld: link errors found, deleting executable `arch-summit-hypre-cuda-dbg/lib/libpetsc.so.3.015.3'
>>>>> collect2: error: ld returned 1 exit status
>>>>> gmake[3]: *** [gmakefile:113: arch-summit-hypre-cuda-dbg/lib/libpetsc.so.3.015.3] Error 1
>>>>> gmake[2]: *** [/gpfs/alpine/csc314/scratch/adams/petsc2/lib/petsc/conf/rules:50: libs] Error 2
>>>>> **************************ERROR*************************************
>>>>> Error during compile, check arch-summit-hypre-cuda-dbg/lib/petsc/conf/make.log
>>>>> Send it and arch-summit-hypre-cuda-dbg/lib/petsc/conf/configure.log to petsc-ma...@mcs.anl.gov
>>>>> ********************************************************************
>>>>> gmake[1]: *** [makefile:40: all] Error 1
>>>>>
>>>>> On Mon, Aug 30, 2021 at 10:50 AM Mark Adams <mfad...@lbl.gov> wrote:
>>>>>
>>>>>> Stefano suggested C++14 in configure. I was using C++11.
>>>>>>
>>>>>> On Mon, Aug 30, 2021 at 10:46 AM Junchao Zhang <junchao.zh...@gmail.com> wrote:
>>>>>>
>>>>>>> Petsc::CUPMInterface
>>>>>>> @Jacob Faibussowitsch <jacob....@gmail.com>
>>>>>>> --Junchao Zhang
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Aug 30, 2021 at 9:35 AM Mark Adams <mfad...@lbl.gov> wrote:
>>>>>>>
>>>>>>>> I was running fine this AM and am bouncing between modules to help
>>>>>>>> two apps (ECP milestone season) at the same time and something broke.
>>>>>>>> I did update main and I get the same error in main and a hypre branch
>>>>>>>> of Stefano's.
>>>>>>>> I started with a clean build and checked my modules...
>>>>>>>> Any ideas?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Mark
>>>>>>>>
>>>>>>>> CC arch-summit-hypre-cuda-dbg/obj/tao/interface/taosolver.o
>>>>>>>> CC arch-summit-hypre-cuda-dbg/obj/ts/interface/ts.o
>>>>>>>> CUDAC arch-summit-hypre-cuda-dbg/obj/mat/impls/dense/seq/cuda/densecuda.o
>>>>>>>> CUDAC.dep arch-summit-hypre-cuda-dbg/obj/mat/impls/dense/seq/cuda/densecuda.o
>>>>>>>> CUDAC arch-summit-hypre-cuda-dbg/obj/mat/impls/aij/seq/seqcusparse/aijcusparseband.o
>>>>>>>> CUDAC.dep arch-summit-hypre-cuda-dbg/obj/mat/impls/aij/seq/seqcusparse/aijcusparseband.o
>>>>>>>> CUDAC arch-summit-hypre-cuda-dbg/obj/ts/utils/dmplexlandau/cuda/landaucu.o
>>>>>>>> CUDAC.dep arch-summit-hypre-cuda-dbg/obj/ts/utils/dmplexlandau/cuda/landaucu.o
>>>>>>>> CUDAC arch-summit-hypre-cuda-dbg/obj/vec/vec/impls/seq/seqcuda/veccuda2.o
>>>>>>>> CUDAC.dep arch-summit-hypre-cuda-dbg/obj/vec/vec/impls/seq/seqcuda/veccuda2.o
>>>>>>>> CUDAC arch-summit-hypre-cuda-dbg/obj/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.o
>>>>>>>> CUDAC.dep arch-summit-hypre-cuda-dbg/obj/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.o
>>>>>>>> CUDAC arch-summit-hypre-cuda-dbg/obj/mat/impls/aij/seq/seqcusparse/aijcusparse.o
>>>>>>>> CUDAC.dep arch-summit-hypre-cuda-dbg/obj/mat/impls/aij/seq/seqcusparse/aijcusparse.o
>>>>>>>> CUDAC arch-summit-hypre-cuda-dbg/obj/vec/is/sf/impls/basic/cuda/sfcuda.o
>>>>>>>> CUDAC.dep arch-summit-hypre-cuda-dbg/obj/vec/is/sf/impls/basic/cuda/sfcuda.o
>>>>>>>> CLINKER arch-summit-hypre-cuda-dbg/lib/libpetsc.so.3.015.3
>>>>>>>> arch-summit-hypre-cuda-dbg/obj/sys/objects/device/impls/cupm/cuda/cupmcontext.o:(.rodata._ZN5Petsc13CUPMInterfaceILNS_14CUPMDeviceKindE0EE21cupmStreamNonBlockingE[_ZN5Petsc13CUPMInterfaceILNS_14CUPMDeviceKindE0EE21cupmStreamNonBlockingE]+0x0):
>>>>>>>> multiple definition of `Petsc::CUPMInterface<(Petsc::CUPMDeviceKind)0>::cupmStreamNonBlocking'
>>>>>>>> arch-summit-hypre-cuda-dbg/obj/sys/objects/device/interface/cupminterface.o:(.rodata+0x44): first defined here
>>>>>>>> /usr/bin/ld: link errors found, deleting executable `arch-summit-hypre-cuda-dbg/lib/libpetsc.so.3.015.3'
>>>>>>>> collect2: error: ld returned 1 exit status
>>>>>>>> gmake[3]: *** [gmakefile:113: arch-summit-hypre-cuda-dbg/lib/libpetsc.so.3.015.3] Error 1
>>>>>>>> gmake[2]: *** [/gpfs/alpine/csc314/scratch/adams/petsc2/lib/petsc/conf/rules:50: libs] Error 2
>>>>>>>> **************************ERROR*************************************
>>>>>>>> Error during compile, check arch-summit-hypre-cuda-dbg/lib/petsc/conf/make.log
>>>>>>>> Send it and arch-summit-hypre-cuda-dbg/lib/petsc/conf/configure.log to petsc-ma...@mcs.anl.gov
>>>>>>>> ********************************************************************
>>>>>>>> gmake[1]: *** [makefile:40: all] Error 1
>>>>>>>> make: *** [GNUmakefile:9: all] Error 2
>>>>>>>>
>>>>>>>
>>>>>
>>>> <make.log><configure.log>
>>>
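
For reference, the static constexpr point in Jacob's explanation quoted above can be sketched in a single file. This is only an illustration, not the PETSc code: `foo`, `bar`, `address_of_bar` and `check_bar` are made-up names, and the `__cplusplus` guard is my assumption about one way to keep a single source valid both before and after C++17. It does not by itself help the mixed case in this thread, where the host and device compilers use different standards.

```cpp
#include <cassert>

// Hypothetical stand-in for a class that exposes a constant as a
// static constexpr data member.
struct foo {
  static constexpr int bar = 5;  // in-class declaration with initializer
};

#if __cplusplus < 201703L
// Before C++17 the member also needs exactly one out-of-class definition
// (without an initializer) as soon as it is odr-used, e.g. when its
// address is taken. From C++17 on the member is implicitly inline, so this
// line is redundant (still accepted, but deprecated) and is skipped here.
constexpr int foo::bar;
#endif

// The value is usable at compile time under every standard.
static_assert(foo::bar == 5, "bar is a compile-time constant");

// Taking the address odr-uses foo::bar, so a definition must exist at link time.
const int *address_of_bar() { return &foo::bar; }

// Small runtime check that the definition the linker picked has the expected value.
void check_bar() { assert(*address_of_bar() == 5); }
```

Building this once with -std=c++14 and once with -std=c++17 and calling check_bar() from a small main should link and run in both modes, as long as every object file that sees foo is compiled with the same standard.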