Dear PETSc team,

I am solving a linear transient dynamic problem discretized with finite elements. To do that, I am using FGMRES with GAMG as a preconditioner. The run covers 10 time steps. The problem has about 118e6 dof and I am running on 1000, 1500 and 2000 processes, so I have roughly 118e3, 79e3 and 59e3 dof per process. I notice that the performance deteriorates when I increase the number of processes. The log_view output of the runs and the detailed definition of the KSP are attached.

Is the problem too small to run on that number of processes, or is there something wrong with my use of GAMG?

Thank you in advance for your help,
Nicolas
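For reference, the load per process and the strong-scaling efficiency between the 1000- and 2000-process runs can be worked out from the figures in the attached summaries. The following is only a rough sketch: the helper names are made up for illustration, and the times are the "Max" values copied by hand from the "Time (sec)" line and the KSPSolve event of each log.

```python
# Figures copied by hand from the message and the attached log_view summaries.
N_DOF = 117_874_305  # rows of the fine-level matrix reported in the KSP view

# process count -> (total wall time in s, KSPSolve max time in s)
RUNS = {
    1000: (1.661e+02, 7.9545e+01),
    2000: (2.837e+02, 6.5991e+01),
}


def dof_per_process(n_dof, n_procs):
    """Average number of unknowns owned by one MPI process."""
    return n_dof / n_procs


def strong_scaling_efficiency(t_base, p_base, t, p):
    """Speedup relative to the base run divided by the growth in process count."""
    return (t_base / t) / (p / p_base)


for procs in sorted(RUNS):
    print(f"{procs} processes: {dof_per_process(N_DOF, procs):.0f} dof/process")

total_1000, solve_1000 = RUNS[1000]
total_2000, solve_2000 = RUNS[2000]
eff_total = strong_scaling_efficiency(total_1000, 1000, total_2000, 2000)
eff_solve = strong_scaling_efficiency(solve_1000, 1000, solve_2000, 2000)
print(f"efficiency 1000 -> 2000, total time: {eff_total:.2f}")
print(f"efficiency 1000 -> 2000, KSPSolve:   {eff_solve:.2f}")
```

Computing the efficiency separately for the total time and for KSPSolve shows whether the slowdown sits inside the linear solve or in the rest of the run.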
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Unknown Name on a arch-linux2-c-opt-mpi-ml-hypre named eocn0117 with 1000 processors, by B07947 Thu Nov 15 16:14:46 2018
Using Petsc Release Version 3.8.2, Nov, 09, 2017

                         Max       Max/Min        Avg      Total
Time (sec):           1.661e+02      1.00034   1.661e+02
Objects:              1.401e+03      1.00143   1.399e+03
Flop:                 7.695e+10      1.13672   7.354e+10  7.354e+13
Flop/sec:             4.633e+08      1.13672   4.428e+08  4.428e+11
MPI Messages:         3.697e+05     12.46258   1.179e+05  1.179e+08
MPI Message Lengths:  8.786e+08      3.98485   4.086e+03  4.817e+11
MPI Reductions:       2.635e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                          e.g., VecAXPY() for real vectors of length N --> 2N flop
                          and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.6608e+02 100.0%  7.3541e+13 100.0%  1.178e+08  99.9%  4.081e+03       99.9%  2.603e+03  98.8%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 7342 1.0 4.4956e+01 1.4 4.09e+10 1.2 9.6e+07 4.3e+03 0.0e+00 23 53 81 86 0 23 53 81 86 0 859939 MatMultAdd 1130 1.0 3.4048e+00 2.3 1.55e+09 1.1 8.4e+06 8.2e+02 0.0e+00 2 2 7 1 0 2 2 7 1 0 434274 MatMultTranspose 1130 1.0 4.7555e+00 3.8 1.55e+09 1.1 8.4e+06 8.2e+02 0.0e+00 1 2 7 1 0 1 2 7 1 0 310924 MatSolve 226 0.0 6.8927e-04 0.0 6.24e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 90 MatSOR 6835 1.0 3.6061e+01 1.4 2.85e+10 1.1 0.0e+00 0.0e+00 0.0e+00 20 37 0 0 0 20 37 0 0 0 760198 MatLUFactorSym 1 1.0 1.0800e-0390.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 8.0395e-04421.5 1.09e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1 MatScale 15 1.0 1.7925e-02 1.8 9.12e+06 1.1 6.6e+04 1.1e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 485856 MatResidual 1130 1.0 6.3576e+00 1.5 5.31e+09 1.2 1.5e+07 3.7e+03 0.0e+00 3 7 13 11 0 3 7 13 11 0 781728 MatAssemblyBegin 112 1.0 9.9765e-01 3.0 0.00e+00 0.0 2.1e+05 7.8e+04 7.4e+01 0 0 0 3 3 0 0 0 3 3 0 MatAssemblyEnd 112 1.0 6.8845e-01 1.1 0.00e+00 0.0 8.3e+05 3.4e+02 2.6e+02 0 0 1 0 10 0 0 1 0 10 0 MatGetRow 582170 1.0 8.5022e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 0.0 2.0885e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMat 6 1.0 3.7804e-02 
1.0 0.00e+00 0.0 5.6e+04 2.8e+03 1.0e+02 0 0 0 0 4 0 0 0 0 4 0 MatGetOrdering 1 0.0 4.4608e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 5 1.0 3.2871e-02 1.1 0.00e+00 0.0 1.2e+06 4.9e+02 5.2e+01 0 0 1 0 2 0 0 1 0 2 0 MatZeroEntries 5 1.0 6.6769e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatView 90 1.3 8.9249e-0216.6 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+01 0 0 0 0 3 0 0 0 0 3 0 MatAXPY 5 1.0 6.4984e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 MatMatMult 5 1.0 6.8333e-01 1.0 1.41e+08 1.2 3.7e+05 1.0e+04 8.2e+01 0 0 0 1 3 0 0 0 1 3 193093 MatMatMultSym 5 1.0 4.8541e-01 1.0 0.00e+00 0.0 3.0e+05 7.8e+03 7.0e+01 0 0 0 0 3 0 0 0 0 3 0 MatMatMultNum 5 1.0 1.9432e-01 1.0 1.41e+08 1.2 6.6e+04 2.2e+04 1.0e+01 0 0 0 0 0 0 0 0 0 0 679018 MatPtAP 5 1.0 4.2329e+00 1.0 1.54e+09 1.5 8.3e+05 4.3e+04 8.7e+01 3 2 1 7 3 3 2 1 7 3 292103 MatPtAPSymbolic 5 1.0 2.7832e+00 1.0 0.00e+00 0.0 3.5e+05 5.6e+04 3.7e+01 2 0 0 4 1 2 0 0 4 1 0 MatPtAPNumeric 5 1.0 1.4511e+00 1.0 1.54e+09 1.5 4.8e+05 3.3e+04 5.0e+01 1 2 0 3 2 1 2 0 3 2 852080 MatTrnMatMult 1 1.0 1.5337e+00 1.0 5.87e+07 1.3 6.9e+04 8.1e+04 1.9e+01 1 0 0 1 1 1 0 0 1 1 36505 MatTrnMatMultSym 1 1.0 9.2151e-01 1.0 0.00e+00 0.0 5.7e+04 3.4e+04 1.7e+01 1 0 0 0 1 1 0 0 0 1 0 MatTrnMatMultNum 1 1.0 6.1297e-01 1.0 5.87e+07 1.3 1.1e+04 3.2e+05 2.0e+00 0 0 0 1 0 0 0 0 1 0 91341 MatGetLocalMat 17 1.0 5.4432e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 15 1.0 7.0758e-02 2.1 0.00e+00 0.0 4.6e+05 4.2e+04 0.0e+00 0 0 0 4 0 0 0 0 4 0 0 VecMDot 329 1.0 6.2030e+0013.7 6.68e+08 1.0 0.0e+00 0.0e+00 3.3e+02 1 1 0 0 12 1 1 0 0 13 106230 VecNorm 595 1.0 1.1655e+00 8.5 1.21e+08 1.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 23 0 0 0 0 23 102761 VecScale 349 1.0 6.6033e-02 4.6 3.13e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 467735 VecCopy 1386 1.0 1.0624e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 4392 1.0 8.6035e-02 1.1 
0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 246 1.0 4.8357e-02 1.4 5.69e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1160750 VecAYPX 9276 1.0 4.4571e-01 1.4 3.11e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 687917 VecAXPBYCZ 4520 1.0 2.8744e-01 1.4 5.66e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1939847 VecMAXPY 575 1.0 8.4132e-01 1.5 1.36e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1600021 VecAssemblyBegin 185 1.0 6.6342e-02 1.3 0.00e+00 0.0 2.3e+04 2.2e+04 5.5e+02 0 0 0 0 21 0 0 0 0 21 0 VecAssemblyEnd 185 1.0 4.2391e-04 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 55 1.0 3.7224e-03 1.5 1.38e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 364534 VecScatterBegin 9786 1.0 8.6765e-01 5.5 0.00e+00 0.0 1.1e+08 3.8e+03 0.0e+00 0 0 97 90 0 0 0 97 90 0 0 VecScatterEnd 9786 1.0 1.9699e+01 9.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0 VecSetRandom 5 1.0 4.3778e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 113 1.0 9.7592e-02 3.3 9.34e+06 1.0 0.0e+00 0.0e+00 1.1e+02 0 0 0 0 4 0 0 0 0 4 94297 KSPGMRESOrthog 326 1.0 6.4559e+00 9.1 1.33e+09 1.0 0.0e+00 0.0e+00 3.3e+02 1 2 0 0 12 1 2 0 0 13 203262 KSPSetUp 18 1.0 1.4065e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 10 1.0 7.9545e+01 1.0 7.50e+10 1.1 1.1e+08 3.8e+03 8.1e+02 48 98 95 89 31 48 98 95 89 31 903224 PCGAMGGraph_AGG 5 1.0 1.2315e+00 1.0 2.25e+06 1.2 3.3e+05 4.2e+02 1.3e+02 1 0 0 0 5 1 0 0 0 5 1759 PCGAMGCoarse_AGG 5 1.0 1.5847e+00 1.0 5.87e+07 1.3 1.3e+06 5.2e+03 8.7e+01 1 0 1 1 3 1 0 1 1 3 35331 PCGAMGProl_AGG 5 1.0 3.5152e-01 1.0 0.00e+00 0.0 2.3e+06 1.5e+03 9.0e+02 0 0 2 1 34 0 0 2 1 35 0 PCGAMGPOpt_AGG 5 1.0 1.0543e+00 1.0 4.17e+08 1.2 1.0e+06 6.1e+03 2.4e+02 1 1 1 1 9 1 1 1 1 9 372220 GAMG: createProl 5 1.0 4.2217e+00 1.0 4.78e+08 1.2 5.0e+06 3.3e+03 1.4e+03 3 1 4 3 52 3 1 4 3 52 106734 Graph 10 1.0 1.2300e+00 1.0 2.25e+06 1.2 3.3e+05 4.2e+02 
1.3e+02 1 0 0 0 5 1 0 0 0 5 1761 MIS/Agg 5 1.0 3.2935e-02 1.1 0.00e+00 0.0 1.2e+06 4.9e+02 5.2e+01 0 0 1 0 2 0 0 1 0 2 0 SA: col data 5 1.0 1.3732e-01 1.0 0.00e+00 0.0 2.2e+06 1.2e+03 8.4e+02 0 0 2 1 32 0 0 2 1 32 0 SA: frmProl0 5 1.0 2.0841e-01 1.0 0.00e+00 0.0 9.5e+04 7.1e+03 5.0e+01 0 0 0 0 2 0 0 0 0 2 0 SA: smooth 5 1.0 7.6907e-01 1.0 1.48e+08 1.2 3.7e+05 1.0e+04 1.0e+02 0 0 0 1 4 0 0 0 1 4 180072 GAMG: partLevel 5 1.0 4.2824e+00 1.0 1.54e+09 1.5 8.9e+05 4.0e+04 2.5e+02 3 2 1 7 9 3 2 1 7 9 288729 repartition 3 1.0 2.1951e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 1 0 0 0 0 1 0 Invert-Sort 3 1.0 2.9290e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 Move A 3 1.0 2.8378e-02 1.1 0.00e+00 0.0 3.0e+04 5.2e+03 5.4e+01 0 0 0 0 2 0 0 0 0 2 0 Move P 3 1.0 1.5999e-02 1.2 0.00e+00 0.0 2.6e+04 4.0e+01 5.4e+01 0 0 0 0 2 0 0 0 0 2 0 PCSetUp 2 1.0 8.5208e+00 1.0 2.01e+09 1.4 5.8e+06 8.9e+03 1.6e+03 5 2 5 11 62 5 2 5 11 63 197991 PCSetUpOnBlocks 226 1.0 1.7779e-0310.4 1.09e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1 PCApply 226 1.0 6.9594e+01 1.1 6.40e+10 1.1 1.1e+08 3.3e+03 1.0e+02 41 83 90 72 4 41 83 90 72 4 878121 SFSetGraph 5 1.0 5.3883e-0556.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 62 1.0 9.9101e-03 1.7 0.00e+00 0.0 1.2e+06 4.9e+02 0.0e+00 0 0 1 0 0 0 0 1 0 0 0 SFBcastEnd 62 1.0 6.8467e-03 6.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BuildTwoSided 5 1.0 7.4060e-03 2.8 0.00e+00 0.0 3.3e+04 4.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 154 154 347782424 0. Matrix Coarsen 5 5 3140 0. Matrix Null Space 1 1 688 0. Vector 1035 1035 582369520 0. Vector Scatter 36 36 39744 0. Index Set 112 112 484084 0. 
Krylov Solver 18 18 330336 0. Preconditioner 13 13 12868 0. PetscRandom 10 10 6380 0. Star Forest Graph 5 5 4280 0. Viewer 12 11 9152 0. ======================================================================================================================== 0 KSP unpreconditioned resid norm 3.738834777485e+08 true resid norm 3.738834777485e+08 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 1.256707592053e+08 true resid norm 1.256707592053e+08 ||r(i)||/||b|| 3.361227940911e-01 2 KSP unpreconditioned resid norm 1.824938621520e+07 true resid norm 1.824938621520e+07 ||r(i)||/||b|| 4.881035750789e-02 3 KSP unpreconditioned resid norm 6.102002718084e+06 true resid norm 6.102002718084e+06 ||r(i)||/||b|| 1.632060008329e-02 4 KSP unpreconditioned resid norm 2.562432902883e+06 true resid norm 2.562432902883e+06 ||r(i)||/||b|| 6.853560147439e-03 5 KSP unpreconditioned resid norm 1.188336046012e+06 true resid norm 1.188336046012e+06 ||r(i)||/||b|| 3.178359346520e-03 6 KSP unpreconditioned resid norm 5.326022866065e+05 true resid norm 5.326022866065e+05 ||r(i)||/||b|| 1.424514102131e-03 7 KSP unpreconditioned resid norm 2.433972087119e+05 true resid norm 2.433972087119e+05 ||r(i)||/||b|| 6.509974984122e-04 8 KSP unpreconditioned resid norm 1.095996827533e+05 true resid norm 1.095996827533e+05 ||r(i)||/||b|| 2.931386094225e-04 9 KSP unpreconditioned resid norm 4.986951871355e+04 true resid norm 4.986951871355e+04 ||r(i)||/||b|| 1.333825153597e-04 10 KSP unpreconditioned resid norm 2.330078182947e+04 true resid norm 2.330078182946e+04 ||r(i)||/||b|| 6.232097221779e-05 11 KSP unpreconditioned resid norm 1.084965391397e+04 true resid norm 1.084965391396e+04 ||r(i)||/||b|| 2.901881083191e-05 12 KSP unpreconditioned resid norm 5.108480961660e+03 true resid norm 5.108480961647e+03 ||r(i)||/||b|| 1.366329689776e-05 13 KSP unpreconditioned resid norm 2.450752492671e+03 true resid norm 2.450752492670e+03 ||r(i)||/||b|| 6.554856361741e-06 14 KSP unpreconditioned resid 
norm 1.181086403619e+03 true resid norm 1.181086403614e+03 ||r(i)||/||b|| 3.158969234817e-06 15 KSP unpreconditioned resid norm 5.606721134498e+02 true resid norm 5.606721134433e+02 ||r(i)||/||b|| 1.499590505629e-06 16 KSP unpreconditioned resid norm 2.700319247455e+02 true resid norm 2.700319247344e+02 ||r(i)||/||b|| 7.222355113430e-07 17 KSP unpreconditioned resid norm 1.314293551958e+02 true resid norm 1.314293551859e+02 ||r(i)||/||b|| 3.515249081809e-07 18 KSP unpreconditioned resid norm 6.357572858020e+01 true resid norm 6.357572858253e+01 ||r(i)||/||b|| 1.700415567047e-07 19 KSP unpreconditioned resid norm 3.077536939056e+01 true resid norm 3.077536939188e+01 ||r(i)||/||b|| 8.231272902779e-08 20 KSP unpreconditioned resid norm 1.504910881547e+01 true resid norm 1.504910882709e+01 ||r(i)||/||b|| 4.025079930707e-08 21 KSP unpreconditioned resid norm 7.400345249992e+00 true resid norm 7.400345259132e+00 ||r(i)||/||b|| 1.979318611161e-08 22 KSP unpreconditioned resid norm 3.607811417234e+00 true resid norm 3.607811420482e+00 ||r(i)||/||b|| 9.649560986776e-09 Linear solve converged due to CONVERGED_RTOL iterations 22 KSP Object: 1000 MPI processes type: fgmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000. right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1000 MPI processes type: gamg type is MULTIPLICATIVE, levels=6 cycles=v Cycles per PCApply=1 Using externally compute Galerkin coarse grid matrices GAMG specific options Threshold for dropping small values in graph on each level = 0. 0. 0. 0. Threshold scaling factor for each level not specified = 1. 
AGG specific options Symmetric graph false Number of levels to square graph 1 Number smoothing steps 1 Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1000 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1000 MPI processes type: bjacobi number of blocks = 1000 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=12, cols=12, bs=6 package used to perform factorization: petsc total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=12, cols=12, bs=6 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1000 MPI processes type: mpiaij rows=12, cols=12, bs=6 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 3 nodes, limit used is 5 Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1000 MPI processes type: chebyshev eigenvalue estimates used: min = 0.0999997, max = 1.1 eigenvalues estimate via gmres min 0.0078618, max 0.999997 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_1_esteig_) 1000 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 1000 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 1000 MPI processes type: mpiaij rows=288, cols=288, bs=6 total: nonzeros=78408, allocated nonzeros=78408 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 86 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1000 MPI processes type: chebyshev eigenvalue estimates used: min = 0.139457, max = 1.53403 eigenvalues estimate via gmres min 0.077969, max 1.39457 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_2_esteig_) 1000 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 1000 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 1000 MPI processes type: mpiaij rows=10254, cols=10254, bs=6 total: nonzeros=12883716, allocated nonzeros=12883716 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 8 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 1000 MPI processes type: chebyshev eigenvalue estimates used: min = 0.14493, max = 1.59423 eigenvalues estimate via gmres min 0.356008, max 1.4493 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_3_esteig_) 1000 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_) 1000 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 1000 MPI processes type: mpiaij rows=332466, cols=332466, bs=6 total: nonzeros=286141284, allocated nonzeros=286141284 total number of mallocs used during MatSetValues calls =0 using nonscalable MatPtAP() implementation using I-node (on process 0) routines: found 88 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 4 ------------------------------- KSP Object: (mg_levels_4_) 1000 MPI processes type: chebyshev eigenvalue estimates used: min = 0.175972, max = 1.93569 eigenvalues estimate via gmres min 0.145536, max 1.75972 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_4_esteig_) 1000 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_4_) 1000 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 1000 MPI processes type: mpiaij rows=5142126, cols=5142126, bs=6 total: nonzeros=1363101804, allocated nonzeros=1363101804 total number of mallocs used during MatSetValues calls =0 using nonscalable MatPtAP() implementation using I-node (on process 0) routines: found 1522 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 5 ------------------------------- KSP Object: (mg_levels_5_) 1000 MPI processes type: chebyshev eigenvalue estimates used: min = 0.234733, max = 2.58207 eigenvalues estimate via gmres min 0.061528, max 2.34733 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_5_esteig_) 1000 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_5_) 1000 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 1000 MPI processes type: mpiaij rows=117874305, cols=117874305, bs=3 total: nonzeros=9333251991, allocated nonzeros=9333251991 total number of mallocs used during MatSetValues calls =0 has attached near null space using I-node (on process 0) routines: found 39198 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 1000 MPI processes type: mpiaij rows=117874305, cols=117874305, bs=3 total: nonzeros=9333251991, allocated nonzeros=9333251991 total number of mallocs used during MatSetValues calls =0 has attached near null space using I-node (on process 0) routines: found 39198 nodes, limit used is 5
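The level sizes in the KSP view above can be condensed into a few standard multigrid indicators. The sketch below is illustrative only: the helper names are hypothetical, the (rows, nonzeros) pairs are copied by hand from the view of the 1000-process run, and the rows-per-process figure assumes each level is spread evenly over all 1000 processes, which GAMG does not necessarily do on coarse levels.

```python
# (level, rows, nonzeros) copied by hand from the KSP view of the 1000-process run.
# Level 0 is the coarse grid (mg_coarse_), level 5 is the fine grid (mg_levels_5_).
LEVELS = [
    (0, 12, 144),
    (1, 288, 78_408),
    (2, 10_254, 12_883_716),
    (3, 332_466, 286_141_284),
    (4, 5_142_126, 1_363_101_804),
    (5, 117_874_305, 9_333_251_991),
]
N_PROCS = 1000  # assumption: every level is shared evenly by all processes


def operator_complexity(levels):
    """Sum of nonzeros over all levels divided by the fine-level nonzeros."""
    fine_nnz = levels[-1][2]
    return sum(nnz for _, _, nnz in levels) / fine_nnz


def grid_complexity(levels):
    """Sum of rows over all levels divided by the fine-level rows."""
    fine_rows = levels[-1][1]
    return sum(rows for _, rows, _ in levels) / fine_rows


def nnz_per_row(rows, nnz):
    """Average number of stored nonzeros in one matrix row."""
    return nnz / rows


print(f"operator complexity: {operator_complexity(LEVELS):.3f}")
print(f"grid complexity:     {grid_complexity(LEVELS):.3f}")
for level, rows, nnz in reversed(LEVELS):
    print(
        f"level {level}: {rows:>11d} rows, "
        f"{nnz_per_row(rows, nnz):8.1f} nnz/row, "
        f"{rows / N_PROCS:10.1f} rows/process"
    )
```

The nonzeros-per-row and rows-per-process columns give a quick view of how dense and how small the coarse levels become relative to the number of processes.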
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Unknown Name on a arch-linux2-c-opt-mpi-ml-hypre named eobm0011 with 2000 processors, by B07947 Thu Nov 15 15:47:29 2018
Using Petsc Release Version 3.8.2, Nov, 09, 2017

                         Max       Max/Min        Avg      Total
Time (sec):           2.837e+02      1.00021   2.836e+02
Objects:              1.409e+03      1.00142   1.407e+03
Flop:                 3.920e+10      1.16752   3.710e+10  7.420e+13
Flop/sec:             1.382e+08      1.16751   1.308e+08  2.616e+11
MPI Messages:         4.031e+05     10.62284   1.243e+05  2.486e+08
MPI Message Lengths:  6.348e+08      4.13328   2.721e+03  6.762e+11
MPI Reductions:       2.654e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                          e.g., VecAXPY() for real vectors of length N --> 2N flop
                          and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 2.8364e+02 100.0%  7.4202e+13 100.0%  2.484e+08  99.9%  2.718e+03       99.9%  2.622e+03  98.8%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 7470 1.0 4.7611e+01 1.9 2.11e+10 1.2 2.0e+08 2.9e+03 0.0e+00 11 53 81 86 0 11 53 81 86 0 827107 MatMultAdd 1150 1.0 3.8834e+00 3.5 8.06e+08 1.2 1.7e+07 5.7e+02 0.0e+00 1 2 7 1 0 1 2 7 1 0 388724 MatMultTranspose 1150 1.0 6.7493e+00 7.4 8.06e+08 1.2 1.7e+07 5.7e+02 0.0e+00 1 2 7 1 0 1 2 7 1 0 223663 MatSolve 230 0.0 8.3327e-04 0.0 6.35e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 76 MatSOR 6955 1.0 2.9793e+01 2.8 1.41e+10 1.1 0.0e+00 0.0e+00 0.0e+00 9 37 0 0 0 9 37 0 0 0 912909 MatLUFactorSym 1 1.0 4.5509e-03561.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 3.5341e-031852.9 1.09e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 15 1.0 1.7186e-02 3.3 4.62e+06 1.2 1.4e+05 7.0e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 508009 MatResidual 1150 1.0 7.0952e+00 2.3 2.75e+09 1.2 3.1e+07 2.5e+03 0.0e+00 2 7 13 12 0 2 7 13 12 0 713964 MatAssemblyBegin 112 1.0 1.0418e+00 4.7 0.00e+00 0.0 4.3e+05 5.3e+04 7.4e+01 0 0 0 3 3 0 0 0 3 3 0 MatAssemblyEnd 112 1.0 5.9064e-01 1.1 0.00e+00 0.0 1.6e+06 2.4e+02 2.6e+02 0 0 1 0 10 0 0 1 0 10 0 MatGetRow 291670 1.0 3.9900e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 0.0 4.3106e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMat 6 1.0 4.7464e-02 
1.0 0.00e+00 0.0 7.5e+04 2.0e+03 1.0e+02 0 0 0 0 4 0 0 0 0 4 0 MatGetOrdering 1 0.0 1.0009e-03 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 5 1.0 3.4372e-02 1.1 0.00e+00 0.0 3.0e+06 3.0e+02 5.9e+01 0 0 1 0 2 0 0 1 0 2 0 MatZeroEntries 5 1.0 5.3163e-03 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatView 90 1.3 6.1949e-01 5.2 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+01 0 0 0 0 3 0 0 0 0 3 0 MatAXPY 5 1.0 4.1116e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 MatMatMult 5 1.0 4.7434e-01 1.2 7.18e+07 1.2 7.5e+05 7.0e+03 8.2e+01 0 0 0 1 3 0 0 0 1 3 278596 MatMatMultSym 5 1.0 2.8504e-01 1.0 0.00e+00 0.0 6.2e+05 5.3e+03 7.0e+01 0 0 0 0 3 0 0 0 0 3 0 MatMatMultNum 5 1.0 1.1035e-01 1.0 7.18e+07 1.2 1.4e+05 1.5e+04 1.0e+01 0 0 0 0 0 0 0 0 0 0 1197494 MatPtAP 5 1.0 2.6336e+00 1.0 8.35e+08 1.7 1.7e+06 3.0e+04 8.7e+01 1 2 1 7 3 1 2 1 7 3 472910 MatPtAPSymbolic 5 1.0 1.6345e+00 1.0 0.00e+00 0.0 7.2e+05 3.8e+04 3.7e+01 1 0 0 4 1 1 0 0 4 1 0 MatPtAPNumeric 5 1.0 1.0015e+00 1.0 8.35e+08 1.7 9.3e+05 2.3e+04 5.0e+01 0 2 0 3 2 0 2 0 3 2 1243604 MatTrnMatMult 1 1.0 8.1209e-01 1.0 2.97e+07 1.3 1.5e+05 5.0e+04 1.9e+01 0 0 0 1 1 0 0 0 1 1 69321 MatTrnMatMultSym 1 1.0 4.7897e-01 1.0 0.00e+00 0.0 1.3e+05 2.1e+04 1.7e+01 0 0 0 0 1 0 0 0 0 1 0 MatTrnMatMultNum 1 1.0 3.3517e-01 1.0 2.97e+07 1.3 2.4e+04 2.0e+05 2.0e+00 0 0 0 1 0 0 0 0 1 0 167958 MatGetLocalMat 17 1.0 3.8855e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 15 1.0 6.2124e-02 2.5 0.00e+00 0.0 9.5e+05 2.9e+04 0.0e+00 0 0 0 4 0 0 0 0 4 0 0 VecMDot 333 1.0 1.2028e+0113.7 3.44e+08 1.0 0.0e+00 0.0e+00 3.3e+02 1 1 0 0 13 1 1 0 0 13 56587 VecNorm 603 1.0 2.7685e+00 4.5 6.16e+07 1.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 23 0 0 0 0 23 43942 VecScale 353 1.0 9.8841e-03 1.8 1.59e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3172544 VecCopy 1410 1.0 7.7031e-02 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 4468 1.0 5.7269e-02 1.6 
0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 250 1.0 3.3906e-02 2.1 2.89e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1683301 VecAYPX 9440 1.0 3.6537e-01 2.9 1.59e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 854103 VecAXPBYCZ 4600 1.0 2.7121e-01 3.2 2.89e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 2092658 VecMAXPY 583 1.0 7.1103e-01 2.7 7.03e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1955563 VecAssemblyBegin 185 1.0 1.1164e-01 1.4 0.00e+00 0.0 4.9e+04 1.4e+04 5.5e+02 0 0 0 0 21 0 0 0 0 21 0 VecAssemblyEnd 185 1.0 3.3379e-04 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 55 1.0 2.9728e-03 2.9 6.90e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 456523 VecScatterBegin 9954 1.0 1.0825e+00 7.3 0.00e+00 0.0 2.4e+08 2.5e+03 0.0e+00 0 0 97 90 0 0 0 97 90 0 0 VecScatterEnd 9954 1.0 3.8453e+0111.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0 VecSetRandom 5 1.0 2.1403e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 113 1.0 7.4105e-02 6.0 4.68e+06 1.0 0.0e+00 0.0e+00 1.1e+02 0 0 0 0 4 0 0 0 0 4 124201 KSPGMRESOrthog 330 1.0 1.2168e+0110.7 6.86e+08 1.0 0.0e+00 0.0e+00 3.3e+02 1 2 0 0 12 1 2 0 0 13 111406 KSPSetUp 18 1.0 1.2172e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 1 0 0 0 0 1 0 KSPSolve 10 1.0 6.5991e+01 1.0 3.82e+10 1.2 2.4e+08 2.6e+03 8.2e+02 23 98 95 89 31 23 98 95 89 31 1098603 PCGAMGGraph_AGG 5 1.0 6.7798e-01 1.0 1.13e+06 1.2 6.8e+05 2.8e+02 1.3e+02 0 0 0 0 5 0 0 0 0 5 3197 PCGAMGCoarse_AGG 5 1.0 8.5740e-01 1.0 2.97e+07 1.3 3.3e+06 2.7e+03 9.4e+01 0 0 1 1 4 0 0 1 1 4 65658 PCGAMGProl_AGG 5 1.0 2.4710e-01 1.0 0.00e+00 0.0 4.8e+06 9.8e+02 9.0e+02 0 0 2 1 34 0 0 2 1 35 0 PCGAMGPOpt_AGG 5 1.0 7.5785e-01 1.0 2.12e+08 1.2 2.1e+06 4.1e+03 2.4e+02 0 1 1 1 9 0 1 1 1 9 518589 GAMG: createProl 5 1.0 2.5407e+00 1.0 2.43e+08 1.2 1.1e+07 2.1e+03 1.4e+03 1 1 4 3 51 1 1 4 3 52 177698 Graph 10 1.0 6.7570e-01 1.0 1.13e+06 1.2 6.8e+05 2.8e+02 
1.3e+02 0 0 0 0 5 0 0 0 0 5 3208 MIS/Agg 5 1.0 3.4434e-02 1.1 0.00e+00 0.0 3.0e+06 3.0e+02 5.9e+01 0 0 1 0 2 0 0 1 0 2 0 SA: col data 5 1.0 1.3094e-01 1.0 0.00e+00 0.0 4.6e+06 8.2e+02 8.4e+02 0 0 2 1 31 0 0 2 1 32 0 SA: frmProl0 5 1.0 1.1028e-01 1.0 0.00e+00 0.0 1.9e+05 4.7e+03 5.0e+01 0 0 0 0 2 0 0 0 0 2 0 SA: smooth 5 1.0 5.2676e-01 1.2 7.53e+07 1.2 7.5e+05 7.0e+03 1.0e+02 0 0 0 1 4 0 0 0 1 4 263330 GAMG: partLevel 5 1.0 2.7087e+00 1.0 8.35e+08 1.7 1.7e+06 2.9e+04 2.5e+02 1 2 1 7 9 1 2 1 7 9 459805 repartition 3 1.0 5.6183e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 1 0 0 0 0 1 0 Invert-Sort 3 1.0 7.8020e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 Move A 3 1.0 4.1104e-02 1.1 0.00e+00 0.0 3.2e+04 4.5e+03 5.4e+01 0 0 0 0 2 0 0 0 0 2 0 Move P 3 1.0 1.8200e-02 1.3 0.00e+00 0.0 4.3e+04 3.6e+01 5.4e+01 0 0 0 0 2 0 0 0 0 2 0 PCSetUp 2 1.0 5.2812e+00 1.0 1.08e+09 1.5 1.3e+07 5.7e+03 1.6e+03 2 2 5 11 62 2 2 5 11 63 321316 PCSetUpOnBlocks 230 1.0 6.2256e-0319.5 1.09e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 230 1.0 5.7271e+01 1.3 3.26e+10 1.2 2.2e+08 2.2e+03 1.0e+02 20 83 90 73 4 20 83 90 73 4 1074640 SFSetGraph 5 1.0 5.3167e-0555.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 69 1.0 1.1146e-02 1.9 0.00e+00 0.0 3.0e+06 3.0e+02 0.0e+00 0 0 1 0 0 0 0 1 0 0 0 SFBcastEnd 69 1.0 7.9596e-03 7.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BuildTwoSided 5 1.0 7.5631e-03 2.8 0.00e+00 0.0 6.9e+04 4.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 154 154 176666644 0. Matrix Coarsen 5 5 3140 0. Matrix Null Space 1 1 688 0. Vector 1043 1043 297405224 0. Vector Scatter 36 36 39456 0. Index Set 112 112 395240 0. 
Krylov Solver        18 18 330336 0.
Preconditioner       13 13 12868 0.
PetscRandom          10 10 6380 0.
Star Forest Graph    5 5 4280 0.
Viewer               12 11 9152 0.
========================================================================================================================

  0 KSP unpreconditioned resid norm 3.738834778097e+08 true resid norm 3.738834778097e+08 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP unpreconditioned resid norm 1.256561415764e+08 true resid norm 1.256561415764e+08 ||r(i)||/||b|| 3.360836972859e-01
  2 KSP unpreconditioned resid norm 1.843932942229e+07 true resid norm 1.843932942229e+07 ||r(i)||/||b|| 4.931838531703e-02
  3 KSP unpreconditioned resid norm 6.189553415818e+06 true resid norm 6.189553415818e+06 ||r(i)||/||b|| 1.655476581120e-02
  4 KSP unpreconditioned resid norm 2.614928212473e+06 true resid norm 2.614928212473e+06 ||r(i)||/||b|| 6.993965680944e-03
  5 KSP unpreconditioned resid norm 1.208975553355e+06 true resid norm 1.208975553355e+06 ||r(i)||/||b|| 3.233562393388e-03
  6 KSP unpreconditioned resid norm 5.481792905733e+05 true resid norm 5.481792905733e+05 ||r(i)||/||b|| 1.466176825423e-03
  7 KSP unpreconditioned resid norm 2.526854282559e+05 true resid norm 2.526854282559e+05 ||r(i)||/||b|| 6.758400497828e-04
  8 KSP unpreconditioned resid norm 1.150052500229e+05 true resid norm 1.150052500229e+05 ||r(i)||/||b|| 3.075965022488e-04
  9 KSP unpreconditioned resid norm 5.289416146528e+04 true resid norm 5.289416146528e+04 ||r(i)||/||b|| 1.414723158540e-04
 10 KSP unpreconditioned resid norm 2.495584369428e+04 true resid norm 2.495584369427e+04 ||r(i)||/||b|| 6.674765047246e-05
 11 KSP unpreconditioned resid norm 1.184780633606e+04 true resid norm 1.184780633605e+04 ||r(i)||/||b|| 3.168849932994e-05
 12 KSP unpreconditioned resid norm 5.709557885707e+03 true resid norm 5.709557885717e+03 ||r(i)||/||b|| 1.527095532321e-05
 13 KSP unpreconditioned resid norm 2.811037623050e+03 true resid norm 2.811037623058e+03 ||r(i)||/||b|| 7.518485811476e-06
 14 KSP unpreconditioned resid norm 1.399589249024e+03 true resid norm 1.399589249031e+03 ||r(i)||/||b|| 3.743383519460e-06
 15 KSP unpreconditioned resid norm 6.919705622362e+02 true resid norm 6.919705622376e+02 ||r(i)||/||b|| 1.850765287333e-06
 16 KSP unpreconditioned resid norm 3.469221128804e+02 true resid norm 3.469221128823e+02 ||r(i)||/||b|| 9.278883220907e-07
 17 KSP unpreconditioned resid norm 1.747835577077e+02 true resid norm 1.747835577094e+02 ||r(i)||/||b|| 4.674813627318e-07
 18 KSP unpreconditioned resid norm 8.648881836541e+01 true resid norm 8.648881835829e+01 ||r(i)||/||b|| 2.313255960519e-07
 19 KSP unpreconditioned resid norm 4.247581916935e+01 true resid norm 4.247581916507e+01 ||r(i)||/||b|| 1.136071040472e-07
 20 KSP unpreconditioned resid norm 2.086023330347e+01 true resid norm 2.086023330200e+01 ||r(i)||/||b|| 5.579340767933e-08
 21 KSP unpreconditioned resid norm 1.023525173739e+01 true resid norm 1.023525174086e+01 ||r(i)||/||b|| 2.737551228746e-08
 22 KSP unpreconditioned resid norm 4.963414450847e+00 true resid norm 4.963414447514e+00 ||r(i)||/||b|| 1.327529763174e-08
 23 KSP unpreconditioned resid norm 2.415620601642e+00 true resid norm 2.415620604831e+00 ||r(i)||/||b|| 6.460891556327e-09
Linear solve converged due to CONVERGED_RTOL iterations 23
KSP Object: 2000 MPI processes
  type: fgmres
    restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
  right preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 2000 MPI processes
  type: gamg
    type is MULTIPLICATIVE, levels=6 cycles=v
      Cycles per PCApply=1
      Using externally compute Galerkin coarse grid matrices
      GAMG specific options
        Threshold for dropping small values in graph on each level = 0. 0. 0. 0.
        Threshold scaling factor for each level not specified = 1.
AGG specific options Symmetric graph false Number of levels to square graph 1 Number smoothing steps 1 Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 2000 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 2000 MPI processes type: bjacobi number of blocks = 2000 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=12, cols=12, bs=6 package used to perform factorization: petsc total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=12, cols=12, bs=6 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 2000 MPI processes type: mpiaij rows=12, cols=12, bs=6 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 3 nodes, limit used is 5 Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 2000 MPI processes type: chebyshev eigenvalue estimates used: min = 0.0999937, max = 1.09993 eigenvalues estimate via gmres min 0.075342, max 0.999937 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_1_esteig_) 2000 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 2000 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 2000 MPI processes type: mpiaij rows=318, cols=318, bs=6 total: nonzeros=90828, allocated nonzeros=90828 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 87 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 2000 MPI processes type: chebyshev eigenvalue estimates used: min = 0.130639, max = 1.43703 eigenvalues estimate via gmres min 0.077106, max 1.30639 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_2_esteig_) 2000 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 2000 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 2000 MPI processes type: mpiaij rows=9870, cols=9870, bs=6 total: nonzeros=11941884, allocated nonzeros=11941884 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 9 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 2000 MPI processes type: chebyshev eigenvalue estimates used: min = 0.151779, max = 1.66957 eigenvalues estimate via gmres min 0.352485, max 1.51779 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_3_esteig_) 2000 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_) 2000 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 2000 MPI processes type: mpiaij rows=334476, cols=334476, bs=6 total: nonzeros=292009536, allocated nonzeros=292009536 total number of mallocs used during MatSetValues calls =0 using nonscalable MatPtAP() implementation using I-node (on process 0) routines: found 50 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 4 ------------------------------- KSP Object: (mg_levels_4_) 2000 MPI processes type: chebyshev eigenvalue estimates used: min = 0.181248, max = 1.99372 eigenvalues estimate via gmres min 0.141976, max 1.81248 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_4_esteig_) 2000 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_4_) 2000 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 2000 MPI processes type: mpiaij rows=5160228, cols=5160228, bs=6 total: nonzeros=1375082208, allocated nonzeros=1375082208 total number of mallocs used during MatSetValues calls =0 using nonscalable MatPtAP() implementation using I-node (on process 0) routines: found 792 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 5 ------------------------------- KSP Object: (mg_levels_5_) 2000 MPI processes type: chebyshev eigenvalue estimates used: min = 0.23761, max = 2.61371 eigenvalues estimate via gmres min 0.0632228, max 2.3761 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_5_esteig_) 2000 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_5_) 2000 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 2000 MPI processes type: mpiaij rows=117874305, cols=117874305, bs=3 total: nonzeros=9333251991, allocated nonzeros=9333251991 total number of mallocs used during MatSetValues calls =0 has attached near null space using I-node (on process 0) routines: found 19690 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 2000 MPI processes type: mpiaij rows=117874305, cols=117874305, bs=3 total: nonzeros=9333251991, allocated nonzeros=9333251991 total number of mallocs used during MatSetValues calls =0 has attached near null space using I-node (on process 0) routines: found 19690 nodes, limit used is 5
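To relate the hierarchy printed above to the question of whether the problem is too small for this many processes, here is a minimal sketch that divides the row counts of each multigrid level by the number of MPI ranks. The row and nonzero counts are copied from the 2000-process KSP view; the function names (`rows_per_rank`, `summarize`) are made up for this illustration and are not part of PETSc. The result is only an average over all ranks: it assumes every rank holds a share of every level, which the output above does not confirm, so treat the coarse-level numbers as a rough indication.

```python
# Row and nonzero counts per level, copied from the 2000-process KSP view.
# Index 0 is the coarse grid (mg_coarse_), index 5 the fine grid (mg_levels_5_).
LEVEL_ROWS = [12, 318, 9870, 334476, 5160228, 117874305]
LEVEL_NNZ = [144, 90828, 11941884, 292009536, 1375082208, 9333251991]


def rows_per_rank(rows, nranks):
    """Average number of matrix rows per MPI rank (assumes all ranks take part)."""
    return rows / nranks


def summarize(nranks):
    """Return (level, rows, average rows per rank, average nonzeros per row) per level."""
    return [
        (level, rows, rows_per_rank(rows, nranks), nnz / rows)
        for level, (rows, nnz) in enumerate(zip(LEVEL_ROWS, LEVEL_NNZ))
    ]


if __name__ == "__main__":
    # Fine-grid load for the three process counts mentioned in the post.
    for nranks in (1000, 1500, 2000):
        print(f"{nranks} ranks: {rows_per_rank(LEVEL_ROWS[-1], nranks):.0f} fine-grid rows per rank")

    # Per-level view for the 2000-process run.
    for level, rows, per_rank, nnz_per_row in summarize(2000):
        print(f"level {level}: rows={rows} rows/rank={per_rank:.3f} nnz/row={nnz_per_row:.1f}")
```

On the fine grid this gives roughly 118e3, 79e3 and 59e3 rows per rank for 1000, 1500 and 2000 processes. On levels 0 to 2 the average drops below five rows per rank at 2000 processes, so those levels carry very little local work per rank.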
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

Unknown Name on a arch-linux2-c-opt-mpi-ml-hypre named eocn0055 with 1500 processors, by B07947 Thu Nov 15 15:55:02 2018
Using Petsc Release Version 3.8.2, Nov, 09, 2017

                         Max       Max/Min        Avg      Total
Time (sec):           2.296e+02      1.00007   2.296e+02
Objects:              1.409e+03      1.00142   1.407e+03
Flop:                 5.219e+10      1.14806   4.965e+10  7.447e+13
Flop/sec:             2.273e+08      1.14806   2.162e+08  3.243e+11
MPI Messages:         4.774e+05     14.16274   1.262e+05  1.893e+08
MPI Message Lengths:  7.718e+08      4.12637   3.102e+03  5.872e+11
MPI Reductions:       2.667e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                          e.g., VecAXPY() for real vectors of length N --> 2N flop
                          and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 2.2961e+02 100.0%  7.4472e+13 100.0%  1.892e+08  99.9%  3.099e+03       99.9%  2.635e+03  98.8%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event            Count      Time (sec)     Flop                             --- Global ---  --- Stage ---   Total
                 Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult          7470 1.0 6.1871e+01 1.8 2.79e+10 1.2 1.5e+08 3.3e+03 0.0e+00 18 53 81 86 0 18 53 81 86 0 636062
MatMultAdd       1150 1.0 4.4228e+00 3.1 1.07e+09 1.2 1.3e+07 6.4e+02 0.0e+00 1 2 7 1 0 1 2 7 1 0 340660
MatMultTranspose 1150 1.0 5.8074e+00 4.5 1.07e+09 1.2 1.3e+07 6.4e+02 0.0e+00 1 2 7 1 0 1 2 7 1 0 259436
MatSolve         230 0.0 7.8106e-04 0.0 6.35e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 81
MatSOR           6955 1.0 4.0051e+01 2.6 1.90e+10 1.1 0.0e+00 0.0e+00 0.0e+00 15 37 0 0 0 15 37 0 0 0 686820
MatLUFactorSym   1 1.0 1.9209e-03 175.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum   1 1.0 1.7691e-03 927.5 1.09e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1
MatScale         15 1.0 2.3391e-02 4.6 6.13e+06 1.1 1.0e+05 8.0e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 372687
MatResidual      1150 1.0 9.1807e+00 2.2 3.63e+09 1.2 2.4e+07 2.8e+03 0.0e+00 2 7 13 12 0 2 7 13 12 0 551315
MatAssemblyBegin 112 1.0 9.0080e-01 2.5 0.00e+00 0.0 3.2e+05 6.2e+04 7.4e+01 0 0 0 3 3 0 0 0 3 3 0
MatAssemblyEnd   112 1.0 6.8422e-01 1.1 0.00e+00 0.0 1.2e+06 2.7e+02 2.6e+02 0 0 1 0 10 0 0 1 0 10 0
MatGetRow        388852 1.0 5.5644e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ      1 0.0 1.6968e-03 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatCreateSubMat  6 1.0 6.8178e-02 1.0 0.00e+00 0.0 8.2e+04 1.8e+03 1.0e+02 0 0 0 0 4 0 0 0 0 4 0
MatGetOrdering   1 0.0 1.8709e-03 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatCoarsen       5 1.0 3.8623e-02 1.1 0.00e+00 0.0 2.7e+06 2.9e+02 7.2e+01 0 0 1 0 3 0 0 1 0 3 0
MatZeroEntries   5 1.0 6.3353e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView          90 1.3 6.1051e-01 5.3 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+01 0 0 0 0 3 0 0 0 0 3 0
MatAXPY          5 1.0 6.4298e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0
MatMatMult       5 1.0 5.4698e-01 1.0 9.46e+07 1.2 5.7e+05 8.1e+03 8.2e+01 0 0 0 1 3 0 0 0 1 3 241395
MatMatMultSym    5 1.0 3.6737e-01 1.0 0.00e+00 0.0 4.6e+05 6.1e+03 7.0e+01 0 0 0 0 3 0 0 0 0 3 0
MatMatMultNum    5 1.0 1.7525e-01 1.0 9.46e+07 1.2 1.0e+05 1.7e+04 1.0e+01 0 0 0 0 0 0 0 0 0 0 753412
MatPtAP          5 1.0 3.4278e+00 1.0 1.10e+09 1.6 1.2e+06 3.4e+04 8.7e+01 1 2 1 7 3 1 2 1 7 3 361157
MatPtAPSymbolic  5 1.0 2.2084e+00 1.0 0.00e+00 0.0 5.4e+05 4.4e+04 3.7e+01 1 0 0 4 1 1 0 0 4 1 0
MatPtAPNumeric   5 1.0 1.2233e+00 1.0 1.10e+09 1.6 7.0e+05 2.7e+04 5.0e+01 1 2 0 3 2 1 2 0 3 2 1011960
MatTrnMatMult    1 1.0 1.0668e+00 1.0 3.95e+07 1.3 1.1e+05 6.0e+04 1.9e+01 0 0 0 1 1 0 0 0 1 1 52637
MatTrnMatMultSym 1 1.0 6.2306e-01 1.0 0.00e+00 0.0 9.0e+04 2.5e+04 1.7e+01 0 0 0 0 1 0 0 0 0 1 0
MatTrnMatMultNum 1 1.0 4.4524e-01 1.0 3.95e+07 1.3 1.8e+04 2.4e+05 2.0e+00 0 0 0 1 0 0 0 0 1 0 126116
MatGetLocalMat   17 1.0 5.2980e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol    15 1.0 7.0928e-02 2.7 0.00e+00 0.0 7.2e+05 3.2e+04 0.0e+00 0 0 0 4 0 0 0 0 4 0 0
VecMDot          333 1.0 6.9139e+00 5.8 4.59e+08 1.0 0.0e+00 0.0e+00 3.3e+02 1 1 0 0 12 1 1 0 0 13 98444
VecNorm          603 1.0 3.4630e+00 7.2 8.20e+07 1.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 23 0 0 0 0 23 35129
VecScale         353 1.0 5.6857e-02 5.8 2.11e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 551520
VecCopy          1410 1.0 1.0825e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet           4468 1.0 8.2095e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY          250 1.0 5.0242e-02 2.3 3.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1135966
VecAYPX          9440 1.0 5.1576e-01 2.6 2.11e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 605004
VecAXPBYCZ       4600 1.0 3.6595e-01 2.7 3.85e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1550752
VecMAXPY         583 1.0 1.0786e+00 3.4 9.38e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1289187
VecAssemblyBegin 185 1.0 7.6495e-02 1.2 0.00e+00 0.0 3.5e+04 1.6e+04 5.5e+02 0 0 0 0 20 0 0 0 0 21 0
VecAssemblyEnd   185 1.0 3.8767e-04 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecPointwiseMult 55 1.0 4.6344e-03 2.8 9.20e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 292821
VecScatterBegin  9954 1.0 1.1589e+00 7.6 0.00e+00 0.0 1.8e+08 2.9e+03 0.0e+00 0 0 97 90 0 0 0 97 90 0 0
VecScatterEnd    9954 1.0 4.8668e+01 11.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 6 0 0 0 0 6 0 0 0 0 0
VecSetRandom     5 1.0 2.8229e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize     113 1.0 1.3515e-01 3.0 6.23e+06 1.0 0.0e+00 0.0e+00 1.1e+02 0 0 0 0 4 0 0 0 0 4 68095
KSPGMRESOrthog   330 1.0 7.1848e+00 4.6 9.14e+08 1.0 0.0e+00 0.0e+00 3.3e+02 1 2 0 0 12 1 2 0 0 13 188679
KSPSetUp         18 1.0 1.2331e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 1 0 0 0 0 1 0
KSPSolve         10 1.0 8.7217e+01 1.0 5.09e+10 1.1 1.8e+08 2.9e+03 8.2e+02 38 98 94 89 31 38 98 94 89 31 834427
PCGAMGGraph_AGG  5 1.0 8.6509e-01 1.0 1.50e+06 1.2 5.2e+05 3.2e+02 1.3e+02 0 0 0 0 5 0 0 0 0 5 2505
PCGAMGCoarse_AGG 5 1.0 1.1177e+00 1.0 3.95e+07 1.3 2.9e+06 2.7e+03 1.1e+02 0 0 2 1 4 0 0 2 1 4 50240
PCGAMGProl_AGG   5 1.0 3.2632e-01 1.0 0.00e+00 0.0 3.7e+06 1.1e+03 9.0e+02 0 0 2 1 34 0 0 2 1 34 0
PCGAMGPOpt_AGG   5 1.0 9.1948e-01 1.0 2.80e+08 1.2 1.6e+06 4.7e+03 2.4e+02 0 1 1 1 9 0 1 1 1 9 427090
GAMG: createProl 5 1.0 3.2296e+00 1.0 3.20e+08 1.2 8.7e+06 2.3e+03 1.4e+03 1 1 5 3 52 1 1 5 3 52 139652
  Graph          10 1.0 8.6263e-01 1.0 1.50e+06 1.2 5.2e+05 3.2e+02 1.3e+02 0 0 0 0 5 0 0 0 0 5 2512
  MIS/Agg        5 1.0 3.8683e-02 1.1 0.00e+00 0.0 2.7e+06 2.9e+02 7.2e+01 0 0 1 0 3 0 0 1 0 3 0
  SA: col data   5 1.0 1.6884e-01 1.0 0.00e+00 0.0 3.5e+06 9.4e+02 8.4e+02 0 0 2 1 31 0 0 2 1 32 0
  SA: frmProl0   5 1.0 1.5309e-01 1.0 0.00e+00 0.0 1.4e+05 5.5e+03 5.0e+01 0 0 0 0 2 0 0 0 0 2 0
  SA: smooth     5 1.0 6.2292e-01 1.0 9.92e+07 1.2 5.7e+05 8.1e+03 1.0e+02 0 0 0 1 4 0 0 0 1 4 222484
GAMG: partLevel  5 1.0 3.5234e+00 1.0 1.10e+09 1.6 1.3e+06 3.2e+04 2.5e+02 2 2 1 7 9 2 2 1 7 9 351357
  repartition    3 1.0 4.2057e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 1 0 0 0 0 1 0
  Invert-Sort    3 1.0 3.6008e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
  Move A         3 1.0 5.2288e-02 1.1 0.00e+00 0.0 4.3e+04 3.4e+03 5.4e+01 0 0 0 0 2 0 0 0 0 2 0
  Move P         3 1.0 2.8588e-02 1.2 0.00e+00 0.0 3.8e+04 3.3e+01 5.4e+01 0 0 0 0 2 0 0 0 0 2 0
PCSetUp          2 1.0 6.7734e+00 1.0 1.42e+09 1.5 1.0e+07 6.2e+03 1.7e+03 3 2 5 11 62 3 2 5 11 63 249358
PCSetUpOnBlocks  230 1.0 3.0496e-03 6.5 1.09e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply          230 1.0 7.5882e+01 1.1 4.35e+10 1.2 1.7e+08 2.5e+03 1.0e+02 32 83 90 73 4 32 83 90 73 4 814742
SFSetGraph       5 1.0 2.3127e-05 24.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFBcastBegin     82 1.0 1.2156e-02 1.9 0.00e+00 0.0 2.7e+06 2.9e+02 0.0e+00 0 0 1 0 0 0 0 1 0 0 0
SFBcastEnd       82 1.0 9.9132e-03 9.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
BuildTwoSided    5 1.0 7.6580e-03 2.6 0.00e+00 0.0 5.2e+04 4.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

Matrix               154 154 245066064 0.
Matrix Coarsen       5 5 3140 0.
Matrix Null Space    1 1 688 0.
Vector               1043 1043 395730096 0.
Vector Scatter       36 36 39456 0.
Index Set            112 112 566408 0.
Krylov Solver        18 18 330336 0.
Preconditioner       13 13 12868 0.
PetscRandom          10 10 6380 0.
Star Forest Graph    5 5 4280 0.
Viewer               12 11 9152 0.
========================================================================================================================

  0 KSP unpreconditioned resid norm 3.738834778994e+08 true resid norm 3.738834778994e+08 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP unpreconditioned resid norm 1.279113569974e+08 true resid norm 1.279113569974e+08 ||r(i)||/||b|| 3.421155642289e-01
  2 KSP unpreconditioned resid norm 1.874944207644e+07 true resid norm 1.874944207644e+07 ||r(i)||/||b|| 5.014782194118e-02
  3 KSP unpreconditioned resid norm 6.305464086727e+06 true resid norm 6.305464086727e+06 ||r(i)||/||b|| 1.686478397536e-02
  4 KSP unpreconditioned resid norm 2.648974672476e+06 true resid norm 2.648974672476e+06 ||r(i)||/||b|| 7.085027365634e-03
  5 KSP unpreconditioned resid norm 1.239886218685e+06 true resid norm 1.239886218685e+06 ||r(i)||/||b|| 3.316236988195e-03
  6 KSP unpreconditioned resid norm 5.641563718944e+05 true resid norm 5.641563718944e+05 ||r(i)||/||b|| 1.508909607517e-03
  7 KSP unpreconditioned resid norm 2.606746938444e+05 true resid norm 2.606746938444e+05 ||r(i)||/||b|| 6.972083797577e-04
  8 KSP unpreconditioned resid norm 1.184535518381e+05 true resid norm 1.184535518381e+05 ||r(i)||/||b|| 3.168194339682e-04
  9 KSP unpreconditioned resid norm 5.392667623794e+04 true resid norm 5.392667623794e+04 ||r(i)||/||b|| 1.442339108990e-04
 10 KSP unpreconditioned resid norm 2.520203694105e+04 true resid norm 2.520203694106e+04 ||r(i)||/||b|| 6.740612632217e-05
 11 KSP unpreconditioned resid norm 1.185967319435e+04 true resid norm 1.185967319434e+04 ||r(i)||/||b|| 3.172023877859e-05
 12 KSP unpreconditioned resid norm 5.627359926956e+03 true resid norm 5.627359926969e+03 ||r(i)||/||b|| 1.505110618577e-05
 13 KSP unpreconditioned resid norm 2.702021069922e+03 true resid norm 2.702021069923e+03 ||r(i)||/||b|| 7.226906856392e-06
 14 KSP unpreconditioned resid norm 1.307500233445e+03 true resid norm 1.307500233448e+03 ||r(i)||/||b|| 3.497079466561e-06
 15 KSP unpreconditioned resid norm 6.250158790292e+02 true resid norm 6.250158790312e+02 ||r(i)||/||b|| 1.671686276545e-06
 16 KSP unpreconditioned resid norm 3.038680168367e+02 true resid norm 3.038680168345e+02 ||r(i)||/||b|| 8.127345410977e-07
 17 KSP unpreconditioned resid norm 1.504350436399e+02 true resid norm 1.504350436436e+02 ||r(i)||/||b|| 4.023580942618e-07
 18 KSP unpreconditioned resid norm 7.388944694136e+01 true resid norm 7.388944694645e+01 ||r(i)||/||b|| 1.976269381081e-07
 19 KSP unpreconditioned resid norm 3.596911660459e+01 true resid norm 3.596911660288e+01 ||r(i)||/||b|| 9.620408156298e-08
 20 KSP unpreconditioned resid norm 1.769248937152e+01 true resid norm 1.769248936529e+01 ||r(i)||/||b|| 4.732086441662e-08
 21 KSP unpreconditioned resid norm 8.746482066795e+00 true resid norm 8.746482056876e+00 ||r(i)||/||b|| 2.339360408761e-08
 22 KSP unpreconditioned resid norm 4.283455600167e+00 true resid norm 4.283455596172e+00 ||r(i)||/||b|| 1.145665922506e-08
 23 KSP unpreconditioned resid norm 2.096047551274e+00 true resid norm 2.096047547699e+00 ||r(i)||/||b|| 5.606151840341e-09
Linear solve converged due to CONVERGED_RTOL iterations 23
KSP Object: 1500 MPI processes
  type: fgmres
    restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
  right preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 1500 MPI processes
  type: gamg
    type is MULTIPLICATIVE, levels=6 cycles=v
      Cycles per PCApply=1
      Using externally compute Galerkin coarse grid matrices
      GAMG specific options
        Threshold for dropping small values in graph on each level = 0. 0. 0. 0.
        Threshold scaling factor for each level not specified = 1.
AGG specific options Symmetric graph false Number of levels to square graph 1 Number smoothing steps 1 Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1500 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1500 MPI processes type: bjacobi number of blocks = 1500 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=12, cols=12, bs=6 package used to perform factorization: petsc total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=12, cols=12, bs=6 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 3 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1500 MPI processes type: mpiaij rows=12, cols=12, bs=6 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 3 nodes, limit used is 5 Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1500 MPI processes type: chebyshev eigenvalue estimates used: min = 0.0999807, max = 1.09979 eigenvalues estimate via gmres min 0.310311, max 0.999807 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_1_esteig_) 1500 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 1500 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 1500 MPI processes type: mpiaij rows=312, cols=312, bs=6 total: nonzeros=90792, allocated nonzeros=90792 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 87 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1500 MPI processes type: chebyshev eigenvalue estimates used: min = 0.128747, max = 1.41622 eigenvalues estimate via gmres min 0.191833, max 1.28747 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_2_esteig_) 1500 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 1500 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 1500 MPI processes type: mpiaij rows=9990, cols=9990, bs=6 total: nonzeros=11862180, allocated nonzeros=11862180 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 6 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 1500 MPI processes type: chebyshev eigenvalue estimates used: min = 0.149515, max = 1.64466 eigenvalues estimate via gmres min 0.342896, max 1.49515 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_3_esteig_) 1500 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_) 1500 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
      linear system matrix = precond matrix:
      Mat Object: 1500 MPI processes
        type: mpiaij
        rows=333960, cols=333960, bs=6
        total: nonzeros=289654416, allocated nonzeros=289654416
        total number of mallocs used during MatSetValues calls =0
          using nonscalable MatPtAP() implementation
          using I-node (on process 0) routines: found 39 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 4 -------------------------------
    KSP Object: (mg_levels_4_) 1500 MPI processes
      type: chebyshev
        eigenvalue estimates used: min = 0.173537, max = 1.90891
        eigenvalues estimate via gmres min 0.143849, max 1.73537
        eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
        KSP Object: (mg_levels_4_esteig_) 1500 MPI processes
          type: gmres
            restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
            happy breakdown tolerance 1e-30
          maximum iterations=10, initial guess is zero
          tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
          left preconditioning
          using PRECONDITIONED norm type for convergence test
        estimating eigenvalues using noisy right hand side
      maximum iterations=2, nonzero initial guess
      tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_levels_4_) 1500 MPI processes
      type: sor
        type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
      linear system matrix = precond matrix:
      Mat Object: 1500 MPI processes
        type: mpiaij
        rows=5149116, cols=5149116, bs=6
        total: nonzeros=1368332496, allocated nonzeros=1368332496
        total number of mallocs used during MatSetValues calls =0
          using nonscalable MatPtAP() implementation
          using I-node (on process 0) routines: found 976 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 5 -------------------------------
    KSP Object: (mg_levels_5_) 1500 MPI processes
      type: chebyshev
        eigenvalue estimates used: min = 0.241719, max = 2.65891
        eigenvalues estimate via gmres min 0.0638427, max 2.41719
        eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
        KSP Object: (mg_levels_5_esteig_) 1500 MPI processes
          type: gmres
            restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
            happy breakdown tolerance 1e-30
          maximum iterations=10, initial guess is zero
          tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
          left preconditioning
          using PRECONDITIONED norm type for convergence test
        estimating eigenvalues using noisy right hand side
      maximum iterations=2, nonzero initial guess
      tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_levels_5_) 1500 MPI processes
      type: sor
        type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
      linear system matrix = precond matrix:
      Mat Object: 1500 MPI processes
        type: mpiaij
        rows=117874305, cols=117874305, bs=3
        total: nonzeros=9333251991, allocated nonzeros=9333251991
        total number of mallocs used during MatSetValues calls =0
          has attached near null space
          using I-node (on process 0) routines: found 26223 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Mat Object: 1500 MPI processes
    type: mpiaij
    rows=117874305, cols=117874305, bs=3
    total: nonzeros=9333251991, allocated nonzeros=9333251991
    total number of mallocs used during MatSetValues calls =0
      has attached near null space
      using I-node (on process 0) routines: found 26223 nodes, limit used is 5
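
As a reading aid for the hierarchy printed above, here is a minimal Python sketch that tabulates, for each listed level, the average rows per process, the nonzeros per row, the coarsening ratio to the next coarser level, and the operator complexity. The row and nonzero counts are copied from the view of the 1500-process run. The level indices are inferred from where each Mat block sits in the view, and any level not shown in this excerpt is left out. Dividing by all 1500 ranks is an assumption: GAMG may place coarse levels on fewer ranks, which this view does not show. The helper names are made up for this illustration and are not part of PETSc.

```python
# Level data copied from the -ksp_view output above (1500-process run).
# Level indices are inferred from the position of each Mat block in the view.
LEVELS = [
    # (level, rows, nonzeros)
    (1, 312, 90792),
    (2, 9990, 11862180),
    (3, 333960, 289654416),
    (4, 5149116, 1368332496),
    (5, 117874305, 9333251991),
]
NPROCS = 1500


def rows_per_process(rows, nprocs):
    """Average rows per rank, ASSUMING the level is spread over all ranks."""
    return rows / nprocs


def nnz_per_row(rows, nnz):
    """Average number of stored nonzeros per matrix row."""
    return nnz / rows


def operator_complexity(levels):
    """Nonzeros summed over the listed levels, divided by the finest-level nonzeros."""
    finest_nnz = max(levels, key=lambda lv: lv[1])[2]
    return sum(nnz for _, _, nnz in levels) / finest_nnz


def coarsening_ratios(levels):
    """Row-count ratio between each level and the next coarser listed level."""
    ordered = sorted(levels, key=lambda lv: lv[1])
    return [(fine[0], fine[1] / coarse[1])
            for coarse, fine in zip(ordered, ordered[1:])]


def main():
    ratios = dict(coarsening_ratios(LEVELS))
    print(f"{'level':>5} {'rows':>12} {'rows/proc':>12} {'nnz/row':>10} {'coarsening':>11}")
    for level, rows, nnz in sorted(LEVELS, reverse=True):
        ratio = ratios.get(level)
        ratio_text = f"{ratio:11.1f}" if ratio is not None else f"{'-':>11}"
        print(f"{level:>5} {rows:>12} {rows_per_process(rows, NPROCS):>12.2f} "
              f"{nnz_per_row(rows, nnz):>10.1f} {ratio_text}")
    print(f"operator complexity (listed levels only): {operator_complexity(LEVELS):.3f}")


if __name__ == "__main__":
    main()
```

On the finest level this gives roughly 78e3 rows per process, consistent with the figure quoted in the message, while the coarsest listed levels average only a few rows per process or fewer under the all-ranks assumption.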