On 7/14/22 03:34, Paras Kumar wrote:

I am working on solving a nonlinear coupled problem involving a vector-valued displacement field and a scalar phase-field variable. The code is MPI-parallelized using parallel::distributed::Triangulation and the TrilinosWrappers linear algebra classes.

Usually I use CG+AMG to solve the linear systems for each of the variables within a staggered scheme. For certain scenarios, however, the iterative linear solver fails and we switch to the Amesos_Superludist direct solver. The code is run on 2 nodes (144 MPI processes in total), and as shown by the cluster's performance monitor, the flop count of one of the nodes drops to (almost) zero once the switch from the iterative to the direct solver occurs; only one node then appears to be doing any computation. Please see the attached flops and memory-bandwidth plots, where the blue and red lines represent the two nodes. We made similar observations for a larger problem run on 8 nodes.
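For context, the iterative-solve-with-direct-fallback logic described above might look roughly like the following sketch. This is not the poster's actual code: the names `system_matrix`, `solution`, and `system_rhs`, the tolerances, and the bare-bones AMG setup are all placeholders; only the deal.II class and solver-type names are real.

```cpp
// Sketch: CG+AMG first, fall back to Amesos_Superludist on failure.
// Assumes deal.II built with Trilinos support.
#include <deal.II/lac/trilinos_precondition.h>
#include <deal.II/lac/trilinos_solver.h>
#include <deal.II/lac/trilinos_sparse_matrix.h>
#include <deal.II/lac/trilinos_vector.h>

using namespace dealii;

void solve_linear_system(const TrilinosWrappers::SparseMatrix &system_matrix,
                         TrilinosWrappers::MPI::Vector &      solution,
                         const TrilinosWrappers::MPI::Vector &system_rhs)
{
  SolverControl control(1000, 1e-10 * system_rhs.l2_norm());
  try
    {
      TrilinosWrappers::PreconditionAMG preconditioner;
      preconditioner.initialize(system_matrix);

      TrilinosWrappers::SolverCG cg(control);
      cg.solve(system_matrix, solution, system_rhs, preconditioner);
    }
  catch (const SolverControl::NoConvergence &)
    {
      // Iterative solver failed: switch to the SuperLU_DIST direct
      // solver via the Amesos interface.
      TrilinosWrappers::SolverDirect::AdditionalData data(
        /*output_solver_details=*/false, "Amesos_Superludist");
      TrilinosWrappers::SolverDirect direct(control, data);
      direct.solve(system_matrix, solution, system_rhs);
    }
}
```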

These plots seem to hint that the SuperLU_DIST solver does not scale across multiple nodes. One possible reason I can think of is that I missed some option while installing deal.II with Trilinos and SuperLU_DIST via Spack. I also attach the Spack spec that I installed on the cluster. The GCC compiler and the corresponding openmpi@4.1.2 are provided by the cluster.
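For reference, a Spack spec along the following lines (a sketch only; the exact versions, compiler, and extra variants will differ from the attached spec) ensures that Trilinos is built with its Amesos and SuperLU_DIST support enabled:

```shell
# Hypothetical Spack spec; adjust versions and compiler to your cluster.
# The important part is building trilinos with +amesos +superlu-dist,
# which pulls in superlu-dist as a dependency.
spack install dealii+trilinos+mpi \
  ^trilinos+amesos+superlu-dist \
  ^openmpi@4.1.2
```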


Paras:
I'm not sure any of us have experience with Amesos/SuperLU_DIST, so I'm not sure anyone will know right away what the problem may be.

But here are a couple of questions:
* What happens if you run the program with just two MPI processes on one machine? In that case, you can watch what the two processes are doing by running 'top' in a separate window.
* How do you distribute the matrix and right-hand side? Are they both fully distributed?
* Is the solution you get correct?
* If the answer to the last question is yes, then either Amesos or SuperLU is apparently copying the data of the linear system from all other processes to a single process that then solves the linear system. It might be useful to run the program with just two MPI processes under a debugger, step into the Amesos routines to see whether you reach a place where that copying happens, and then read the code there to find out what flags need to be set so that the solve really does happen in a distributed way.
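One common way to get a debugger on each rank of a small MPI run (a sketch; it assumes a working X display and that the executable is called ./myprog, which is a placeholder name) is to start one xterm-with-gdb per process:

```shell
# Opens one gdb session per MPI rank, each in its own xterm window.
# Inside each gdb, set a breakpoint in the Amesos solve routine,
# then type 'run'.
mpirun -np 2 xterm -e gdb ./myprog
```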

That's about all I can offer.
Best
 W.

--
------------------------------------------------------------------------
Wolfgang Bangerth          email:                 bange...@colostate.edu
                           www: http://www.math.colostate.edu/~bangerth/

--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see 
https://groups.google.com/d/forum/dealii?hl=en
--- You received this message because you are subscribed to the Google Groups "deal.II User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to dealii+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/dealii/b207e535-5f6b-f06a-e902-87628bbcf5e5%40colostate.edu.