Dear dealii Community,

I am working on solving a nonlinear coupled problem involving a vector 
displacement field and a scalar phase-field variable. The code is MPI 
parallelized using parallel::distributed::Triangulation (p:d:T) and 
TrilinosWrappers for linear algebra. 

I usually use CG+AMG to solve the linear systems for each of the variables 
within a staggered scheme. In certain scenarios, however, the iterative 
linear solver fails and we switch to the Amesos_Superludist solver. The 
code is run on 2 nodes (144 MPI processes in total). As shown by the code 
performance monitor, once the switch from the iterative to the direct 
solver occurs, the flop count of one of the nodes drops to (almost) zero 
and only one node seems to be doing the computations. Please see the 
attached flops and memory bandwidth plots. The blue and red lines 
represent the two nodes. Similar observations were also made for a larger 
problem involving 8 nodes.
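
To make the solver switch concrete, here is a minimal, library-free sketch 
of the fallback logic described above. It is not the actual application 
code: NoConvergence, SolverUsed and solve_with_fallback are hypothetical 
names introduced only for illustration, and the two callables are 
stand-ins for the real solves. In deal.II terms, the first callable would 
wrap the CG+AMG solve, which signals failure by throwing 
SolverControl::NoConvergence, and the second would wrap a 
TrilinosWrappers::SolverDirect solve with solver_type set to 
"Amesos_Superludist".

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for the exception an iterative solver raises when
// it does not converge (deal.II uses SolverControl::NoConvergence here).
struct NoConvergence : std::runtime_error
{
  using std::runtime_error::runtime_error;
};

// Records which path produced the solution of one linear solve.
enum class SolverUsed
{
  iterative,
  direct
};

// Try the iterative solver first. If it reports non-convergence, run the
// direct solver instead. Both arguments are caller-supplied callables.
inline SolverUsed
solve_with_fallback(const std::function<void()> &iterative_solve,
                    const std::function<void()> &direct_solve)
{
  assert(iterative_solve && direct_solve);
  try
    {
      iterative_solve();
      return SolverUsed::iterative;
    }
  catch (const NoConvergence &)
    {
      direct_solve();
      return SolverUsed::direct;
    }
}
```

Returning which path was taken makes it easy to log, per staggered 
iteration, when the direct solver was used and to correlate those steps 
with the drop in flop count seen in the plots.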

These plots seem to hint that the SuperLU_DIST solver does not scale across 
multiple nodes. One possible reason I could think of is that I missed some 
option while installing deal.II with Trilinos and SuperLU_DIST using 
spack. I also attach the spack spec which I installed on the cluster. The 
gcc compiler and the corresponding openmpi@4.1.2 are provided by the 
cluster.

Any ideas on solving this issue would be of great help.

Kind regards,
Paras Kumar







Input spec
--------------------------------
dealii@9.0.1%gcc@11.2.0
    ^trilinos@12.10.1~explicit_template_instantiation+superlu-dist

Concretized
--------------------------------
dealii@9.0.1%gcc@11.2.0+adol-c~arborx+arpack+assimp~cuda~doc+examples~ginkgo+gmsh+gsl+hdf5~int64~ipo+metis+mpi+muparser~nanoflann~netcdf+oce+optflags+p4est+petsc~python+scalapack~simplex+slepc+sundials~symengine+threads+trilinos build_type=DebugRelease cuda_arch=none cxxstd=default patches=4282b32e96f2f5d376eb34f3fddcc4615fcd99b40004cca784eb874288d1b31c,61f217744b70f352965be265d2f06e8c1276685e2944ca0a88b7297dd55755da,6f876dc8eadafe2c4ec2a6673864fb451c6627ca80511b6e16f3c401946fdf33 arch=linux-almalinux8-icelake
    
^adol-c@2.7.2%gcc@11.2.0~advanced_branching+atrig_erf~boost+doc+examples~openmp~sparse
 arch=linux-almalinux8-icelake
    ^arpack-ng@3.8.0%gcc@11.2.0+mpi+shared arch=linux-almalinux8-icelake
        ^cmake@3.21.4%gcc@11.2.0~doc+ncurses~openssl+ownlibs~qt 
build_type=Release arch=linux-almalinux8-icelake
            ^ncurses@6.2%gcc@11.2.0~symlinks+termlib abi=none 
arch=linux-almalinux8-icelake
                ^pkgconf@1.8.0%gcc@11.2.0 arch=linux-almalinux8-icelake
        
^openblas@0.3.18%gcc@11.2.0~bignuma~consistent_fpcsr~ilp64+locking+pic+shared 
threads=none arch=linux-almalinux8-icelake
            ^perl@5.34.0%gcc@11.2.0+cpanm+shared+threads 
arch=linux-almalinux8-icelake
                ^berkeley-db@18.1.40%gcc@11.2.0+cxx~docs+stl 
patches=b231fcc4d5cff05e5c3a4814f6a5af0e9a966428dc2176540d2c05aff41de522 
arch=linux-almalinux8-icelake
                ^bzip2@1.0.8%gcc@11.2.0~debug~pic+shared 
arch=linux-almalinux8-icelake
                    ^diffutils@3.8%gcc@11.2.0 arch=linux-almalinux8-icelake
                        ^libiconv@1.16%gcc@11.2.0 libs=shared,static 
arch=linux-almalinux8-icelake
                ^gdbm@1.19%gcc@11.2.0 arch=linux-almalinux8-icelake
                    ^readline@8.1%gcc@11.2.0 arch=linux-almalinux8-icelake
                ^zlib@1.2.11%gcc@11.2.0+optimize+pic+shared 
arch=linux-almalinux8-icelake
        
^openmpi@4.1.2%gcc@11.2.0~atomics~cuda~cxx~cxx_exceptions+gpfs~internal-hwloc~java+legacylaunchers~lustre~memchecker+pmi~pmix~singularity~sqlite3+static~thread_multiple+vt+wrapper-rpath
 fabrics=ucx schedulers=slurm arch=linux-almalinux8-icelake
    ^assimp@5.0.1%gcc@11.2.0~ipo+shared build_type=RelWithDebInfo 
arch=linux-almalinux8-icelake
        
^boost@1.76.0%gcc@11.2.0+atomic+chrono~clanglibcpp~container~context~coroutine+date_time~debug+exception~fiber+filesystem+graph~icu+iostreams+locale+log+math~mpi+multithreaded~numpy~pic+program_options~python+random+regex+serialization+shared+signals~singlethreaded+system~taggedlayout+test+thread+timer~versionedlayout+wave
 cxxstd=98 visibility=hidden arch=linux-almalinux8-icelake
    
^gmsh@4.8.4%gcc@11.2.0+alglib~cairo+cgns+compression~eigen~external+fltk+gmp~hdf5~ipo+med+metis+mmg~mpi+netgen+oce~opencascade~openmp~petsc~privateapi+shared~slepc+tetgen+voropp
 build_type=RelWithDebInfo arch=linux-almalinux8-icelake
        
^cgns@4.2.0%gcc@11.2.0~base_scope~fortran+hdf5~int64~ipo~legacy~mem_debug+mpi+scoping+shared~static~testing
 build_type=RelWithDebInfo arch=linux-almalinux8-icelake
            
^hdf5@1.10.7%gcc@11.2.0~cxx+fortran+hl~ipo~java+mpi+shared~szip~threadsafe+tools
 api=default build_type=RelWithDebInfo arch=linux-almalinux8-icelake
                ^numactl@2.0.14%gcc@11.2.0 
patches=4e1d78cbbb85de625bad28705e748856033eaafab92a66dffd383a3d7e00cc94,62fc8a8bf7665a60e8f4c93ebbd535647cebf74198f7afafec4c085a8825c006,ff37630df599cfabf0740518b91ec8daaf18e8f288b19adaae5364dc1f6b2296
 arch=linux-almalinux8-icelake
                    ^autoconf@2.69%gcc@11.2.0 
patches=35c449281546376449766f92d49fc121ca50e330e60fefcfc9be2af3253082c2,7793209b33013dc0f81208718c68440c5aae80e7a1c4b8d336e382525af791a7,a49dd5bac3b62daa0ff688ab4d508d71dbd2f4f8d7e2a02321926346161bf3ee
 arch=linux-almalinux8-icelake
                        ^m4@1.4.19%gcc@11.2.0+sigsegv 
patches=9dc5fbd0d5cb1037ab1e6d0ecc74a30df218d0a94bdd5a02759a97f62daca573,bfdffa7c2eb01021d5849b36972c069693654ad826c1a20b53534009a4ec7a89
 arch=linux-almalinux8-icelake
                            ^libsigsegv@2.13%gcc@11.2.0 
arch=linux-almalinux8-icelake
                    ^automake@1.16.3%gcc@11.2.0 arch=linux-almalinux8-icelake
                    ^libtool@2.4.6%gcc@11.2.0 arch=linux-almalinux8-icelake
        ^fltk@1.3.7%gcc@11.2.0+gl+shared~xft arch=linux-almalinux8-icelake
            ^libx11@1.7.0%gcc@11.2.0 arch=linux-almalinux8-icelake
                ^inputproto@2.3.2%gcc@11.2.0 arch=linux-almalinux8-icelake
                    ^util-macros@1.19.3%gcc@11.2.0 arch=linux-almalinux8-icelake
                ^kbproto@1.0.7%gcc@11.2.0 arch=linux-almalinux8-icelake
                ^libxcb@1.14%gcc@11.2.0 arch=linux-almalinux8-icelake
                    ^libpthread-stubs@0.4%gcc@11.2.0 
arch=linux-almalinux8-icelake
                    ^libxau@1.0.8%gcc@11.2.0 arch=linux-almalinux8-icelake
                        ^xproto@7.0.31%gcc@11.2.0 arch=linux-almalinux8-icelake
                    ^libxdmcp@1.1.2%gcc@11.2.0 arch=linux-almalinux8-icelake
                        ^libbsd@0.11.3%gcc@11.2.0 arch=linux-almalinux8-icelake
                            ^libmd@1.0.3%gcc@11.2.0 
arch=linux-almalinux8-icelake
                    ^xcb-proto@1.14.1%gcc@11.2.0 arch=linux-almalinux8-icelake
                ^xextproto@7.3.0%gcc@11.2.0 arch=linux-almalinux8-icelake
                ^xtrans@1.3.5%gcc@11.2.0 arch=linux-almalinux8-icelake
            ^mesa@21.2.3%gcc@11.2.0+glx+llvm+opengl~opengles+osmesa~strip 
buildtype=debugoptimized default_library=shared swr=auto 
arch=linux-almalinux8-icelake
                ^bison@3.8.2%gcc@11.2.0 arch=linux-almalinux8-icelake
                ^expat@2.4.1%gcc@11.2.0+libbsd arch=linux-almalinux8-icelake
                ^flex@2.6.4%gcc@11.2.0+lex~nls 
patches=09c22e5c6fef327d3e48eb23f0d610dcd3a35ab9207f12e0f875701c677978d3 
arch=linux-almalinux8-icelake
                    ^findutils@4.8.0%gcc@11.2.0 arch=linux-almalinux8-icelake
                    
^gettext@0.21%gcc@11.2.0+bzip2+curses+git~libunistring+libxml2+tar+xz 
arch=linux-almalinux8-icelake
                        ^libxml2@2.9.12%gcc@11.2.0~python 
arch=linux-almalinux8-icelake
                            ^xz@5.2.5%gcc@11.2.0~pic libs=shared,static 
arch=linux-almalinux8-icelake
                        ^tar@1.34%gcc@11.2.0 arch=linux-almalinux8-icelake
                    ^help2man@1.47.16%gcc@11.2.0 arch=linux-almalinux8-icelake
                ^glproto@1.4.17%gcc@11.2.0 arch=linux-almalinux8-icelake
                ^libxext@1.3.3%gcc@11.2.0 arch=linux-almalinux8-icelake
                ^libxt@1.1.5%gcc@11.2.0 arch=linux-almalinux8-icelake
                    ^libice@1.0.9%gcc@11.2.0 arch=linux-almalinux8-icelake
                    ^libsm@1.2.3%gcc@11.2.0 arch=linux-almalinux8-icelake
                        ^util-linux-uuid@2.36.2%gcc@11.2.0 
arch=linux-almalinux8-icelake
                
^llvm@12.0.1%gcc@11.2.0~all_targets+clang~code_signing+compiler-rt~cuda~flang+gold+internal_unwind~ipo+libcxx+lld+lldb~llvm_dylib~mlir+omp_as_runtime~omp_debug~omp_tsan+polly~python~shared_libs~split_dwarf
 build_type=Release cuda_arch=none arch=linux-almalinux8-icelake
                    
^binutils@2.37%gcc@11.2.0~gas+gold~headers~interwork+ld~libiberty~lto+nls+plugins
 libs=shared,static arch=linux-almalinux8-icelake
                    
^hwloc@2.6.0%gcc@11.2.0~cairo~cuda~gl~libudev+libxml2~netloc~nvml~opencl+pci~rocm+shared
 arch=linux-almalinux8-icelake
                        ^libpciaccess@0.16%gcc@11.2.0 
arch=linux-almalinux8-icelake
                    ^libedit@3.1-20210216%gcc@11.2.0 
arch=linux-almalinux8-icelake
                    ^perl-data-dumper@2.173%gcc@11.2.0 
arch=linux-almalinux8-icelake
                    
^python@3.8.12%gcc@11.2.0+bz2+ctypes+dbm~debug+libxml2+lzma~nis~optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3~ssl~tix~tkinter~ucs4+uuid+zlib
 
patches=0d98e93189bc278fbc37a50ed7f183bd8aaf249a8e1670a465f0db6bb4f8cf87,4c2457325f2b608b1b6a2c63087df8c26e07db3e3d493caf36a56f0ecf6fb768,f2fd060afc4b4618fe8104c4c5d771f36dc55b1db5a4623785a4ea707ec72fb4
 arch=linux-almalinux8-icelake
                        ^libffi@3.3%gcc@11.2.0 
patches=26f26c6f29a7ce9bf370ad3ab2610f99365b4bdd7b82e7c31df41a3370d685c0 
arch=linux-almalinux8-icelake
                        
^sqlite@3.36.0%gcc@11.2.0+column_metadata+fts~functions~rtree 
arch=linux-almalinux8-icelake
                    ^swig@4.0.2%gcc@11.2.0 arch=linux-almalinux8-icelake
                        ^pcre@8.44%gcc@11.2.0~jit+multibyte+utf 
arch=linux-almalinux8-icelake
                    ^z3@4.8.9%gcc@11.2.0~gmp~ipo~python 
build_type=RelWithDebInfo arch=linux-almalinux8-icelake
                ^meson@0.60.0%gcc@11.2.0 
patches=aa6c50d5a2aeb1a487d16f6712be4357fefb923aae37ab830699b07338388287 
arch=linux-almalinux8-icelake
                    ^ninja@1.10.2%gcc@11.2.0 arch=linux-almalinux8-icelake
                    ^py-setuptools@58.2.0%gcc@11.2.0 
arch=linux-almalinux8-icelake
                ^py-mako@1.1.4%gcc@11.2.0 arch=linux-almalinux8-icelake
                    ^py-markupsafe@2.0.1%gcc@11.2.0 
arch=linux-almalinux8-icelake
                ^xrandr@1.5.0%gcc@11.2.0 arch=linux-almalinux8-icelake
                    ^libxrandr@1.5.0%gcc@11.2.0 arch=linux-almalinux8-icelake
                        ^libxrender@0.9.10%gcc@11.2.0 
arch=linux-almalinux8-icelake
                            ^renderproto@0.11.1%gcc@11.2.0 
arch=linux-almalinux8-icelake
                        ^randrproto@1.5.0%gcc@11.2.0 
arch=linux-almalinux8-icelake
        ^freetype@2.11.0%gcc@11.2.0 arch=linux-almalinux8-icelake
            ^libpng@1.6.37%gcc@11.2.0 arch=linux-almalinux8-icelake
        ^gmp@6.2.1%gcc@11.2.0 arch=linux-almalinux8-icelake
        ^med@4.0.0%gcc@11.2.0+api23~fortran~ipo+mpi~shared 
build_type=RelWithDebInfo 
patches=ba351973779de38d658c62db0f97180449b40540c7e5be28dccf6098966cbf2a 
arch=linux-almalinux8-icelake
        ^mesa-glu@9.0.1%gcc@11.2.0 arch=linux-almalinux8-icelake
        ^mmg@5.5.2%gcc@11.2.0~doc~ipo+scotch+shared build_type=RelWithDebInfo 
arch=linux-almalinux8-icelake
            ^scotch@6.1.1%gcc@11.2.0+compression~esmumps~int64~metis+mpi+shared 
arch=linux-almalinux8-icelake
        ^oce@0.18.3%gcc@11.2.0~X11+tbb arch=linux-almalinux8-icelake
            ^intel-tbb@2020.3%gcc@11.2.0~ipo+shared+tm 
build_type=RelWithDebInfo cxxstd=default 
patches=62ba015ebd1819c45bef47411540b789b493e31ca668c4ff4cb2afcbc306b476,ce1fb16fb932ce86a82ca87cf0431d1a8c83652af9f552b264213b2ff2945d73,d62cb666de4010998c339cde6f41c7623a07e9fc69e498f2e149821c0c2c6dd0
 arch=linux-almalinux8-icelake
    ^gsl@2.7%gcc@11.2.0~external-cblas arch=linux-almalinux8-icelake
    ^metis@5.1.0%gcc@11.2.0~gdb~int64~real64+shared build_type=Release 
patches=4991da938c1d3a1d3dea78e49bbebecba00273f98df2a656e38b83d55b281da1,b1225da886605ea558db7ac08dd8054742ea5afe5ed61ad4d0fe7a495b1270d2
 arch=linux-almalinux8-icelake
    ^muparser@2.2.6.1%gcc@11.2.0 arch=linux-almalinux8-icelake
    ^netlib-scalapack@2.1.0%gcc@11.2.0~ipo~pic+shared build_type=Release 
patches=1c9ce5fee1451a08c2de3cc87f446aeda0b818ebbce4ad0d980ddf2f2a0b2dc4,f2baedde688ffe4c20943c334f580eb298e04d6f35c86b90a1f4e8cb7ae344a2
 arch=linux-almalinux8-icelake
    ^p4est@2.8%gcc@11.2.0+mpi~openmp arch=linux-almalinux8-icelake
    
^petsc@3.16.1%gcc@11.2.0~X~batch~cgns~complex~cuda~debug+double~exodusii~fftw~giflib+hdf5~hpddm~hwloc+hypre~int64~jpeg~knl~libpng~libyaml~memkind+metis~mkl-pardiso~mmg~moab~mpfr+mpi~mumps~openmp~p4est~parmmg~ptscotch~random123~rocm~saws~scalapack+shared~strumpack~suite-sparse~superlu-dist~tetgen~trilinos~valgrind
 amdgpu_target=none clanguage=C cuda_arch=none arch=linux-almalinux8-icelake
        
^hypre@2.23.0%gcc@11.2.0~complex~cuda~debug+fortran~int64~internal-superlu~mixedint+mpi~openmp+shared~superlu-dist~unified-memory
 cuda_arch=none arch=linux-almalinux8-icelake
        ^parmetis@4.0.3%gcc@11.2.0~gdb~int64~ipo+shared 
build_type=RelWithDebInfo 
patches=4f892531eb0a807eb1b82e683a416d3e35154a455274cf9b162fb02054d11a5b,50ed2081bc939269689789942067c58b3e522c269269a430d5d34c00edbc5870,704b84f7c7444d4372cb59cca6e1209df4ef3b033bc4ee3cf50f369bce972a9d
 arch=linux-almalinux8-icelake
    ^slepc@3.16.0%gcc@11.2.0+arpack~blopex~cuda~rocm amdgpu_target=none 
cuda_arch=none arch=linux-almalinux8-icelake
    ^suite-sparse@5.10.1%gcc@11.2.0~cuda~openmp+pic~tbb 
arch=linux-almalinux8-icelake
        ^mpfr@4.1.0%gcc@11.2.0 arch=linux-almalinux8-icelake
            ^autoconf-archive@2019.01.06%gcc@11.2.0 
arch=linux-almalinux8-icelake
            ^texinfo@6.5%gcc@11.2.0 
patches=12f6edb0c6b270b8c8dba2ce17998c580db01182d871ee32b7b6e4129bd1d23a,1732115f651cff98989cb0215d8f64da5e0f7911ebf0c13b064920f088f2ffe1
 arch=linux-almalinux8-icelake
    
^sundials@3.2.1%gcc@11.2.0+ARKODE+CVODE+CVODES+IDA+IDAS+KINSOL~cuda+examples+examples-install~f2003~fcmix+generic-math~hypre~int64~ipo~klu~lapack~monitoring+mpi~openmp~petsc~pthread~raja~rocm+shared+static~superlu-dist~superlu-mt~sycl~trilinos
 amdgpu_target=none build_type=RelWithDebInfo cuda_arch=none precision=double 
arch=linux-almalinux8-icelake
    
    ^trilinos@12.10.1%gcc@11.2.0~adios2+amesos+amesos2+anasazi+aztec~basker+belos~boost~chaco~complex~cuda~cuda_rdc~debug~dtk+epetra+epetraext~epetraextbtf~epetraextexperimental~epetraextgraphreorderings~exodus~explicit_template_instantiation~float+fortran~gtest~hdf5~hypre+ifpack+ifpack2~intrepid~intrepid2~ipo~isorropia+kokkos~mesquite~minitensor+ml+mpi+muelu~mumps~nox~openmp~phalanx~piro~python~rol~rythmos+sacado~scorec~shards+shared~shylu~stk~stokhos~stratimikos~strumpack~suite-sparse~superlu+superlu-dist~teko~tempus+tpetra~trilinoscouplings~wrapper~x11~zoltan~zoltan2 build_type=RelWithDebInfo cuda_arch=none cxxstd=14 gotype=long_long arch=linux-almalinux8-icelake
        ^superlu-dist@5.3.0%gcc@11.2.0~cuda~int64~ipo~openmp+shared build_type=RelWithDebInfo cuda_arch=none arch=linux-almalinux8-icelake
