Dear deal.II Community,

I am working on solving a nonlinear coupled problem involving a vector displacement field and a scalar phase-field variable. The code is MPI-parallelized using parallel::distributed::Triangulation (p:d:t) and TrilinosWrappers for linear algebra.
Usually I use CG+AMG to solve the systems of linear equations (SLEs) for each of the variables within a staggered scheme. For certain scenarios, however, the iterative linear solver fails and we switch to the Amesos_Superludist direct solver. The code is run on 2 nodes (144 MPI processes in total). As shown by the code performance monitor, once the switch from the iterative to the direct solver occurs, the flop count of one of the nodes drops to (almost) zero and only one node seems to be doing the computations. Please see the attached flops and memory bandwidth plots; the blue and red lines represent the two nodes. Similar observations were also made for a larger problem involving 8 nodes.

These plots seem to hint that the SuperLU_DIST solver does not scale across multiple nodes. One possible reason I could think of is that I missed some option while installing deal.II with Trilinos and SuperLU_DIST using spack. I also attach the spack spec which I installed on the cluster. The gcc compiler and the corresponding openmpi@4.1.2 are available from the cluster.

Any ideas on solving this issue would be of great help.

Kind regards,
Paras Kumar

Input spec
--------------------------------
dealii@9.0.1%gcc@11.2.0 ^trilinos@12.10.1~explicit_template_instantiation+superlu-dist

Concretized
--------------------------------
dealii@9.0.1%gcc@11.2.0+adol-c~arborx+arpack+assimp~cuda~doc+examples~ginkgo+gmsh+gsl+hdf5~int64~ipo+metis+mpi+muparser~nanoflann~netcdf+oce+optflags+p4est+petsc~python+scalapack~simplex+slepc+sundials~symengine+threads+trilinos build_type=DebugRelease cuda_arch=none cxxstd=default patches=4282b32e96f2f5d376eb34f3fddcc4615fcd99b40004cca784eb874288d1b31c,61f217744b70f352965be265d2f06e8c1276685e2944ca0a88b7297dd55755da,6f876dc8eadafe2c4ec2a6673864fb451c6627ca80511b6e16f3c401946fdf33 arch=linux-almalinux8-icelake ^adol-c@2.7.2%gcc@11.2.0~advanced_branching+atrig_erf~boost+doc+examples~openmp~sparse arch=linux-almalinux8-icelake ^arpack-ng@3.8.0%gcc@11.2.0+mpi+shared arch=linux-almalinux8-icelake ^cmake@3.21.4%gcc@11.2.0~doc+ncurses~openssl+ownlibs~qt build_type=Release arch=linux-almalinux8-icelake ^ncurses@6.2%gcc@11.2.0~symlinks+termlib abi=none arch=linux-almalinux8-icelake ^pkgconf@1.8.0%gcc@11.2.0 arch=linux-almalinux8-icelake ^openblas@0.3.18%gcc@11.2.0~bignuma~consistent_fpcsr~ilp64+locking+pic+shared threads=none arch=linux-almalinux8-icelake ^perl@5.34.0%gcc@11.2.0+cpanm+shared+threads arch=linux-almalinux8-icelake ^berkeley-db@18.1.40%gcc@11.2.0+cxx~docs+stl patches=b231fcc4d5cff05e5c3a4814f6a5af0e9a966428dc2176540d2c05aff41de522 arch=linux-almalinux8-icelake ^bzip2@1.0.8%gcc@11.2.0~debug~pic+shared arch=linux-almalinux8-icelake ^diffutils@3.8%gcc@11.2.0 arch=linux-almalinux8-icelake ^libiconv@1.16%gcc@11.2.0 libs=shared,static arch=linux-almalinux8-icelake ^gdbm@1.19%gcc@11.2.0 arch=linux-almalinux8-icelake ^readline@8.1%gcc@11.2.0 arch=linux-almalinux8-icelake ^zlib@1.2.11%gcc@11.2.0+optimize+pic+shared arch=linux-almalinux8-icelake 
^openmpi@4.1.2%gcc@11.2.0~atomics~cuda~cxx~cxx_exceptions+gpfs~internal-hwloc~java+legacylaunchers~lustre~memchecker+pmi~pmix~singularity~sqlite3+static~thread_multiple+vt+wrapper-rpath fabrics=ucx schedulers=slurm arch=linux-almalinux8-icelake ^assimp@5.0.1%gcc@11.2.0~ipo+shared build_type=RelWithDebInfo arch=linux-almalinux8-icelake ^boost@1.76.0%gcc@11.2.0+atomic+chrono~clanglibcpp~container~context~coroutine+date_time~debug+exception~fiber+filesystem+graph~icu+iostreams+locale+log+math~mpi+multithreaded~numpy~pic+program_options~python+random+regex+serialization+shared+signals~singlethreaded+system~taggedlayout+test+thread+timer~versionedlayout+wave cxxstd=98 visibility=hidden arch=linux-almalinux8-icelake ^gmsh@4.8.4%gcc@11.2.0+alglib~cairo+cgns+compression~eigen~external+fltk+gmp~hdf5~ipo+med+metis+mmg~mpi+netgen+oce~opencascade~openmp~petsc~privateapi+shared~slepc+tetgen+voropp build_type=RelWithDebInfo arch=linux-almalinux8-icelake ^cgns@4.2.0%gcc@11.2.0~base_scope~fortran+hdf5~int64~ipo~legacy~mem_debug+mpi+scoping+shared~static~testing build_type=RelWithDebInfo arch=linux-almalinux8-icelake ^hdf5@1.10.7%gcc@11.2.0~cxx+fortran+hl~ipo~java+mpi+shared~szip~threadsafe+tools api=default build_type=RelWithDebInfo arch=linux-almalinux8-icelake ^numactl@2.0.14%gcc@11.2.0 patches=4e1d78cbbb85de625bad28705e748856033eaafab92a66dffd383a3d7e00cc94,62fc8a8bf7665a60e8f4c93ebbd535647cebf74198f7afafec4c085a8825c006,ff37630df599cfabf0740518b91ec8daaf18e8f288b19adaae5364dc1f6b2296 arch=linux-almalinux8-icelake ^autoconf@2.69%gcc@11.2.0 patches=35c449281546376449766f92d49fc121ca50e330e60fefcfc9be2af3253082c2,7793209b33013dc0f81208718c68440c5aae80e7a1c4b8d336e382525af791a7,a49dd5bac3b62daa0ff688ab4d508d71dbd2f4f8d7e2a02321926346161bf3ee arch=linux-almalinux8-icelake ^m4@1.4.19%gcc@11.2.0+sigsegv patches=9dc5fbd0d5cb1037ab1e6d0ecc74a30df218d0a94bdd5a02759a97f62daca573,bfdffa7c2eb01021d5849b36972c069693654ad826c1a20b53534009a4ec7a89 arch=linux-almalinux8-icelake 
^libsigsegv@2.13%gcc@11.2.0 arch=linux-almalinux8-icelake ^automake@1.16.3%gcc@11.2.0 arch=linux-almalinux8-icelake ^libtool@2.4.6%gcc@11.2.0 arch=linux-almalinux8-icelake ^fltk@1.3.7%gcc@11.2.0+gl+shared~xft arch=linux-almalinux8-icelake ^libx11@1.7.0%gcc@11.2.0 arch=linux-almalinux8-icelake ^inputproto@2.3.2%gcc@11.2.0 arch=linux-almalinux8-icelake ^util-macros@1.19.3%gcc@11.2.0 arch=linux-almalinux8-icelake ^kbproto@1.0.7%gcc@11.2.0 arch=linux-almalinux8-icelake ^libxcb@1.14%gcc@11.2.0 arch=linux-almalinux8-icelake ^libpthread-stubs@0.4%gcc@11.2.0 arch=linux-almalinux8-icelake ^libxau@1.0.8%gcc@11.2.0 arch=linux-almalinux8-icelake ^xproto@7.0.31%gcc@11.2.0 arch=linux-almalinux8-icelake ^libxdmcp@1.1.2%gcc@11.2.0 arch=linux-almalinux8-icelake ^libbsd@0.11.3%gcc@11.2.0 arch=linux-almalinux8-icelake ^libmd@1.0.3%gcc@11.2.0 arch=linux-almalinux8-icelake ^xcb-proto@1.14.1%gcc@11.2.0 arch=linux-almalinux8-icelake ^xextproto@7.3.0%gcc@11.2.0 arch=linux-almalinux8-icelake ^xtrans@1.3.5%gcc@11.2.0 arch=linux-almalinux8-icelake ^mesa@21.2.3%gcc@11.2.0+glx+llvm+opengl~opengles+osmesa~strip buildtype=debugoptimized default_library=shared swr=auto arch=linux-almalinux8-icelake ^bison@3.8.2%gcc@11.2.0 arch=linux-almalinux8-icelake ^expat@2.4.1%gcc@11.2.0+libbsd arch=linux-almalinux8-icelake ^flex@2.6.4%gcc@11.2.0+lex~nls patches=09c22e5c6fef327d3e48eb23f0d610dcd3a35ab9207f12e0f875701c677978d3 arch=linux-almalinux8-icelake ^findutils@4.8.0%gcc@11.2.0 arch=linux-almalinux8-icelake ^gettext@0.21%gcc@11.2.0+bzip2+curses+git~libunistring+libxml2+tar+xz arch=linux-almalinux8-icelake ^libxml2@2.9.12%gcc@11.2.0~python arch=linux-almalinux8-icelake ^xz@5.2.5%gcc@11.2.0~pic libs=shared,static arch=linux-almalinux8-icelake ^tar@1.34%gcc@11.2.0 arch=linux-almalinux8-icelake ^help2man@1.47.16%gcc@11.2.0 arch=linux-almalinux8-icelake ^glproto@1.4.17%gcc@11.2.0 arch=linux-almalinux8-icelake ^libxext@1.3.3%gcc@11.2.0 arch=linux-almalinux8-icelake ^libxt@1.1.5%gcc@11.2.0 
arch=linux-almalinux8-icelake ^libice@1.0.9%gcc@11.2.0 arch=linux-almalinux8-icelake ^libsm@1.2.3%gcc@11.2.0 arch=linux-almalinux8-icelake ^util-linux-uuid@2.36.2%gcc@11.2.0 arch=linux-almalinux8-icelake ^llvm@12.0.1%gcc@11.2.0~all_targets+clang~code_signing+compiler-rt~cuda~flang+gold+internal_unwind~ipo+libcxx+lld+lldb~llvm_dylib~mlir+omp_as_runtime~omp_debug~omp_tsan+polly~python~shared_libs~split_dwarf build_type=Release cuda_arch=none arch=linux-almalinux8-icelake ^binutils@2.37%gcc@11.2.0~gas+gold~headers~interwork+ld~libiberty~lto+nls+plugins libs=shared,static arch=linux-almalinux8-icelake ^hwloc@2.6.0%gcc@11.2.0~cairo~cuda~gl~libudev+libxml2~netloc~nvml~opencl+pci~rocm+shared arch=linux-almalinux8-icelake ^libpciaccess@0.16%gcc@11.2.0 arch=linux-almalinux8-icelake ^libedit@3.1-20210216%gcc@11.2.0 arch=linux-almalinux8-icelake ^perl-data-dumper@2.173%gcc@11.2.0 arch=linux-almalinux8-icelake ^python@3.8.12%gcc@11.2.0+bz2+ctypes+dbm~debug+libxml2+lzma~nis~optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3~ssl~tix~tkinter~ucs4+uuid+zlib patches=0d98e93189bc278fbc37a50ed7f183bd8aaf249a8e1670a465f0db6bb4f8cf87,4c2457325f2b608b1b6a2c63087df8c26e07db3e3d493caf36a56f0ecf6fb768,f2fd060afc4b4618fe8104c4c5d771f36dc55b1db5a4623785a4ea707ec72fb4 arch=linux-almalinux8-icelake ^libffi@3.3%gcc@11.2.0 patches=26f26c6f29a7ce9bf370ad3ab2610f99365b4bdd7b82e7c31df41a3370d685c0 arch=linux-almalinux8-icelake ^sqlite@3.36.0%gcc@11.2.0+column_metadata+fts~functions~rtree arch=linux-almalinux8-icelake ^swig@4.0.2%gcc@11.2.0 arch=linux-almalinux8-icelake ^pcre@8.44%gcc@11.2.0~jit+multibyte+utf arch=linux-almalinux8-icelake ^z3@4.8.9%gcc@11.2.0~gmp~ipo~python build_type=RelWithDebInfo arch=linux-almalinux8-icelake ^meson@0.60.0%gcc@11.2.0 patches=aa6c50d5a2aeb1a487d16f6712be4357fefb923aae37ab830699b07338388287 arch=linux-almalinux8-icelake ^ninja@1.10.2%gcc@11.2.0 arch=linux-almalinux8-icelake ^py-setuptools@58.2.0%gcc@11.2.0 arch=linux-almalinux8-icelake 
^py-mako@1.1.4%gcc@11.2.0 arch=linux-almalinux8-icelake ^py-markupsafe@2.0.1%gcc@11.2.0 arch=linux-almalinux8-icelake ^xrandr@1.5.0%gcc@11.2.0 arch=linux-almalinux8-icelake ^libxrandr@1.5.0%gcc@11.2.0 arch=linux-almalinux8-icelake ^libxrender@0.9.10%gcc@11.2.0 arch=linux-almalinux8-icelake ^renderproto@0.11.1%gcc@11.2.0 arch=linux-almalinux8-icelake ^randrproto@1.5.0%gcc@11.2.0 arch=linux-almalinux8-icelake ^freetype@2.11.0%gcc@11.2.0 arch=linux-almalinux8-icelake ^libpng@1.6.37%gcc@11.2.0 arch=linux-almalinux8-icelake ^gmp@6.2.1%gcc@11.2.0 arch=linux-almalinux8-icelake ^med@4.0.0%gcc@11.2.0+api23~fortran~ipo+mpi~shared build_type=RelWithDebInfo patches=ba351973779de38d658c62db0f97180449b40540c7e5be28dccf6098966cbf2a arch=linux-almalinux8-icelake ^mesa-glu@9.0.1%gcc@11.2.0 arch=linux-almalinux8-icelake ^mmg@5.5.2%gcc@11.2.0~doc~ipo+scotch+shared build_type=RelWithDebInfo arch=linux-almalinux8-icelake ^scotch@6.1.1%gcc@11.2.0+compression~esmumps~int64~metis+mpi+shared arch=linux-almalinux8-icelake ^oce@0.18.3%gcc@11.2.0~X11+tbb arch=linux-almalinux8-icelake ^intel-tbb@2020.3%gcc@11.2.0~ipo+shared+tm build_type=RelWithDebInfo cxxstd=default patches=62ba015ebd1819c45bef47411540b789b493e31ca668c4ff4cb2afcbc306b476,ce1fb16fb932ce86a82ca87cf0431d1a8c83652af9f552b264213b2ff2945d73,d62cb666de4010998c339cde6f41c7623a07e9fc69e498f2e149821c0c2c6dd0 arch=linux-almalinux8-icelake ^gsl@2.7%gcc@11.2.0~external-cblas arch=linux-almalinux8-icelake ^metis@5.1.0%gcc@11.2.0~gdb~int64~real64+shared build_type=Release patches=4991da938c1d3a1d3dea78e49bbebecba00273f98df2a656e38b83d55b281da1,b1225da886605ea558db7ac08dd8054742ea5afe5ed61ad4d0fe7a495b1270d2 arch=linux-almalinux8-icelake ^muparser@2.2.6.1%gcc@11.2.0 arch=linux-almalinux8-icelake ^netlib-scalapack@2.1.0%gcc@11.2.0~ipo~pic+shared build_type=Release patches=1c9ce5fee1451a08c2de3cc87f446aeda0b818ebbce4ad0d980ddf2f2a0b2dc4,f2baedde688ffe4c20943c334f580eb298e04d6f35c86b90a1f4e8cb7ae344a2 arch=linux-almalinux8-icelake 
^p4est@2.8%gcc@11.2.0+mpi~openmp arch=linux-almalinux8-icelake ^petsc@3.16.1%gcc@11.2.0~X~batch~cgns~complex~cuda~debug+double~exodusii~fftw~giflib+hdf5~hpddm~hwloc+hypre~int64~jpeg~knl~libpng~libyaml~memkind+metis~mkl-pardiso~mmg~moab~mpfr+mpi~mumps~openmp~p4est~parmmg~ptscotch~random123~rocm~saws~scalapack+shared~strumpack~suite-sparse~superlu-dist~tetgen~trilinos~valgrind amdgpu_target=none clanguage=C cuda_arch=none arch=linux-almalinux8-icelake ^hypre@2.23.0%gcc@11.2.0~complex~cuda~debug+fortran~int64~internal-superlu~mixedint+mpi~openmp+shared~superlu-dist~unified-memory cuda_arch=none arch=linux-almalinux8-icelake ^parmetis@4.0.3%gcc@11.2.0~gdb~int64~ipo+shared build_type=RelWithDebInfo patches=4f892531eb0a807eb1b82e683a416d3e35154a455274cf9b162fb02054d11a5b,50ed2081bc939269689789942067c58b3e522c269269a430d5d34c00edbc5870,704b84f7c7444d4372cb59cca6e1209df4ef3b033bc4ee3cf50f369bce972a9d arch=linux-almalinux8-icelake ^slepc@3.16.0%gcc@11.2.0+arpack~blopex~cuda~rocm amdgpu_target=none cuda_arch=none arch=linux-almalinux8-icelake ^suite-sparse@5.10.1%gcc@11.2.0~cuda~openmp+pic~tbb arch=linux-almalinux8-icelake ^mpfr@4.1.0%gcc@11.2.0 arch=linux-almalinux8-icelake ^autoconf-archive@2019.01.06%gcc@11.2.0 arch=linux-almalinux8-icelake ^texinfo@6.5%gcc@11.2.0 patches=12f6edb0c6b270b8c8dba2ce17998c580db01182d871ee32b7b6e4129bd1d23a,1732115f651cff98989cb0215d8f64da5e0f7911ebf0c13b064920f088f2ffe1 arch=linux-almalinux8-icelake ^sundials@3.2.1%gcc@11.2.0+ARKODE+CVODE+CVODES+IDA+IDAS+KINSOL~cuda+examples+examples-install~f2003~fcmix+generic-math~hypre~int64~ipo~klu~lapack~monitoring+mpi~openmp~petsc~pthread~raja~rocm+shared+static~superlu-dist~superlu-mt~sycl~trilinos amdgpu_target=none build_type=RelWithDebInfo cuda_arch=none precision=double arch=linux-almalinux8-icelake 
^trilinos@12.10.1%gcc@11.2.0~adios2+amesos+amesos2+anasazi+aztec~basker+belos~boost~chaco~complex~cuda~cuda_rdc~debug~dtk+epetra+epetraext~epetraextbtf~epetraextexperimental~epetraextgraphreorderings~exodus~explicit_template_instantiation~float+fortran~gtest~hdf5~hypre+ifpack+ifpack2~intrepid~intrepid2~ipo~isorropia+kokkos~mesquite~minitensor+ml+mpi+muelu~mumps~nox~openmp~phalanx~piro~python~rol~rythmos+sacado~scorec~shards+shared~shylu~stk~stokhos~stratimikos~strumpack~suite-sparse~superlu+superlu-dist~teko~tempus+tpetra~trilinoscouplings~wrapper~x11~zoltan~zoltan2 build_type=RelWithDebInfo cuda_arch=none cxxstd=14 gotype=long_long arch=linux-almalinux8-icelake ^superlu-dist@5.3.0%gcc@11.2.0~cuda~int64~ipo~openmp+shared build_type=RelWithDebInfo cuda_arch=none arch=linux-almalinux8-icelake