Hi, I ran KSP example 45 on a single node with 32 cores and 125GB memory using 1, 16 and 32 MPI processes. Here's a comparison of the time spent during KSP.solve:
- 1 MPI process: ~98 sec, speedup: 1X
- 16 MPI processes: ~12 sec, speedup: ~8X
- 32 MPI processes: ~11 sec, speedup: ~9X

Since the problem size is large enough (8M unknowns), I expected a speedup much closer to 32X, rather than 9X. Is this expected? If yes, how can it be improved?

I've attached three log files for more details.

Sincerely,
Amin
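For reference, the speedup figures quoted above can be reproduced with a short sketch. It is only an illustration: the timings are the rounded values from this message, the iteration counts (208 for 1 process, 265 for 32 processes) are read from the last `-ksp_monitor` line of the logs below, and the helper names are made up for this example. The per-iteration comparison is included because the 32-process run needs more iterations to converge, so the raw solve-time ratio somewhat understates how the cost of a single iteration scales.

```python
# Rounded KSPSolve wall-clock times (seconds) as reported in this message.
times = {1: 98.0, 16: 12.0, 32: 11.0}

# Iteration counts taken from the final -ksp_monitor lines in the logs below.
iterations = {1: 208, 32: 265}


def speedup(t_serial, t_parallel):
    """Ratio of serial time to parallel time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, nprocs):
    """Speedup divided by the number of processes (1.0 would be ideal)."""
    return speedup(t_serial, t_parallel) / nprocs


def per_iteration_speedup(t_serial, it_serial, t_parallel, it_parallel):
    """Speedup of the average cost of one Krylov iteration."""
    return (t_serial / it_serial) / (t_parallel / it_parallel)


if __name__ == "__main__":
    for n, t in times.items():
        s = speedup(times[1], t)
        e = efficiency(times[1], t, n)
        print(f"{n:2d} processes: speedup {s:5.2f}X, efficiency {e:6.1%}")

    s_iter = per_iteration_speedup(times[1], iterations[1],
                                   times[32], iterations[32])
    print(f"32 processes, per-iteration speedup: {s_iter:5.2f}X")
```

With these rounded inputs, the parallel efficiency comes out at roughly one half on 16 processes and a bit over one quarter on 32.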
[aminsad@gra798 tutorials]$ export OMP_NUM_THREADS=1; time mpirun -n 1 ./ex45 -da_grid_x 200 -da_grid_y 200 -da_grid_z 200 -ksp_monitor -log_view 0 KSP Residual norm 5.258064405273e+02 1 KSP Residual norm 1.278253394418e+02 2 KSP Residual norm 6.511271570096e+01 3 KSP Residual norm 4.135214140632e+01 4 KSP Residual norm 2.922513996425e+01 5 KSP Residual norm 2.204747487144e+01 6 KSP Residual norm 1.739713045931e+01 7 KSP Residual norm 1.417931744122e+01 8 KSP Residual norm 1.184374519260e+01 9 KSP Residual norm 1.008506076330e+01 10 KSP Residual norm 8.721445668356e+00 11 KSP Residual norm 7.638849276263e+00 12 KSP Residual norm 6.762286552928e+00 13 KSP Residual norm 6.040701673926e+00 14 KSP Residual norm 5.438219991460e+00 15 KSP Residual norm 4.928968227009e+00 16 KSP Residual norm 4.493866926479e+00 17 KSP Residual norm 4.118572030763e+00 18 KSP Residual norm 3.792124134425e+00 19 KSP Residual norm 3.506027000640e+00 20 KSP Residual norm 3.253607008689e+00 21 KSP Residual norm 3.029552816399e+00 22 KSP Residual norm 2.829585518871e+00 23 KSP Residual norm 2.650219011338e+00 24 KSP Residual norm 2.488587406800e+00 25 KSP Residual norm 2.342315612358e+00 26 KSP Residual norm 2.209421401090e+00 27 KSP Residual norm 2.088238753752e+00 28 KSP Residual norm 1.977358853814e+00 29 KSP Residual norm 1.875582791372e+00 30 KSP Residual norm 1.781883004600e+00 31 KSP Residual norm 1.736355592493e+00 32 KSP Residual norm 1.689802418566e+00 33 KSP Residual norm 1.642330945653e+00 34 KSP Residual norm 1.594067680872e+00 35 KSP Residual norm 1.545141620575e+00 36 KSP Residual norm 1.495696012728e+00 37 KSP Residual norm 1.445872233985e+00 38 KSP Residual norm 1.395827586448e+00 39 KSP Residual norm 1.345725260919e+00 40 KSP Residual norm 1.295731272357e+00 41 KSP Residual norm 1.246028548392e+00 42 KSP Residual norm 1.196795241049e+00 43 KSP Residual norm 1.148224207298e+00 44 KSP Residual norm 1.100512793196e+00 45 KSP Residual norm 1.053857218299e+00 46 KSP Residual norm 
1.008393212219e+00 47 KSP Residual norm 9.643431709584e-01 48 KSP Residual norm 9.220075828693e-01 49 KSP Residual norm 8.815371326016e-01 50 KSP Residual norm 8.430754941240e-01 51 KSP Residual norm 8.069283181947e-01 52 KSP Residual norm 7.736058784267e-01 53 KSP Residual norm 7.438791922022e-01 54 KSP Residual norm 7.187496343169e-01 55 KSP Residual norm 6.989924349694e-01 56 KSP Residual norm 6.842383310219e-01 57 KSP Residual norm 6.725928197691e-01 58 KSP Residual norm 6.614246390768e-01 59 KSP Residual norm 6.488228189072e-01 60 KSP Residual norm 6.362497219776e-01 61 KSP Residual norm 6.310576622808e-01 62 KSP Residual norm 6.222805856447e-01 63 KSP Residual norm 6.135102320894e-01 64 KSP Residual norm 6.045478751124e-01 65 KSP Residual norm 5.953900252814e-01 66 KSP Residual norm 5.831923828721e-01 67 KSP Residual norm 5.674340239745e-01 68 KSP Residual norm 5.514386482427e-01 69 KSP Residual norm 5.331703769160e-01 70 KSP Residual norm 5.130105727821e-01 71 KSP Residual norm 4.939642780919e-01 72 KSP Residual norm 4.736283190935e-01 73 KSP Residual norm 4.543755273874e-01 74 KSP Residual norm 4.352475628588e-01 75 KSP Residual norm 4.163157126078e-01 76 KSP Residual norm 3.990162130316e-01 77 KSP Residual norm 3.809381189489e-01 78 KSP Residual norm 3.652493983140e-01 79 KSP Residual norm 3.492146621424e-01 80 KSP Residual norm 3.349556893166e-01 81 KSP Residual norm 3.211628960531e-01 82 KSP Residual norm 3.078715656898e-01 83 KSP Residual norm 2.952352033285e-01 84 KSP Residual norm 2.829258065907e-01 85 KSP Residual norm 2.722494228035e-01 86 KSP Residual norm 2.632988026599e-01 87 KSP Residual norm 2.559836740725e-01 88 KSP Residual norm 2.494846515621e-01 89 KSP Residual norm 2.424910256174e-01 90 KSP Residual norm 2.350783047703e-01 91 KSP Residual norm 2.297176627239e-01 92 KSP Residual norm 2.239807796082e-01 93 KSP Residual norm 2.189487728757e-01 94 KSP Residual norm 2.129028764448e-01 95 KSP Residual norm 2.064486156208e-01 96 KSP Residual norm 
1.996150634880e-01 97 KSP Residual norm 1.922502635756e-01 98 KSP Residual norm 1.850232436496e-01 99 KSP Residual norm 1.780506972401e-01 100 KSP Residual norm 1.712736120490e-01 101 KSP Residual norm 1.646455202490e-01 102 KSP Residual norm 1.579011664896e-01 103 KSP Residual norm 1.516406195159e-01 104 KSP Residual norm 1.450996365325e-01 105 KSP Residual norm 1.390117360981e-01 106 KSP Residual norm 1.330497628570e-01 107 KSP Residual norm 1.272144816438e-01 108 KSP Residual norm 1.218709660754e-01 109 KSP Residual norm 1.165759575597e-01 110 KSP Residual norm 1.119527604341e-01 111 KSP Residual norm 1.074909304456e-01 112 KSP Residual norm 1.035850549917e-01 113 KSP Residual norm 1.001922844713e-01 114 KSP Residual norm 9.700108691235e-02 115 KSP Residual norm 9.452063838471e-02 116 KSP Residual norm 9.276670318311e-02 117 KSP Residual norm 9.127775937271e-02 118 KSP Residual norm 8.992664232562e-02 119 KSP Residual norm 8.851464629866e-02 120 KSP Residual norm 8.745513674656e-02 121 KSP Residual norm 8.661966267267e-02 122 KSP Residual norm 8.542514522208e-02 123 KSP Residual norm 8.431368422075e-02 124 KSP Residual norm 8.296383824558e-02 125 KSP Residual norm 8.156677423151e-02 126 KSP Residual norm 7.983727365337e-02 127 KSP Residual norm 7.774553931311e-02 128 KSP Residual norm 7.555801619595e-02 129 KSP Residual norm 7.321019002668e-02 130 KSP Residual norm 7.070087447108e-02 131 KSP Residual norm 6.828925902033e-02 132 KSP Residual norm 6.574372961312e-02 133 KSP Residual norm 6.328247502396e-02 134 KSP Residual norm 6.088052234370e-02 135 KSP Residual norm 5.847754305832e-02 136 KSP Residual norm 5.623613704991e-02 137 KSP Residual norm 5.393769750917e-02 138 KSP Residual norm 5.183677393935e-02 139 KSP Residual norm 4.971148791332e-02 140 KSP Residual norm 4.770277944095e-02 141 KSP Residual norm 4.580275494510e-02 142 KSP Residual norm 4.392995310708e-02 143 KSP Residual norm 4.207936132269e-02 144 KSP Residual norm 4.030913906792e-02 145 KSP 
Residual norm 3.869518096455e-02 146 KSP Residual norm 3.722715926612e-02 147 KSP Residual norm 3.599475062909e-02 148 KSP Residual norm 3.504162173459e-02 149 KSP Residual norm 3.404313925572e-02 150 KSP Residual norm 3.301315457551e-02 151 KSP Residual norm 3.213179590346e-02 152 KSP Residual norm 3.127474629457e-02 153 KSP Residual norm 3.052656900310e-02 154 KSP Residual norm 2.956526736750e-02 155 KSP Residual norm 2.855745174838e-02 156 KSP Residual norm 2.756702651158e-02 157 KSP Residual norm 2.656022478111e-02 158 KSP Residual norm 2.557239572212e-02 159 KSP Residual norm 2.465750156399e-02 160 KSP Residual norm 2.378538302862e-02 161 KSP Residual norm 2.292826286681e-02 162 KSP Residual norm 2.208131385340e-02 163 KSP Residual norm 2.127634140024e-02 164 KSP Residual norm 2.046614836530e-02 165 KSP Residual norm 1.969475181587e-02 166 KSP Residual norm 1.893471336015e-02 167 KSP Residual norm 1.820165451421e-02 168 KSP Residual norm 1.749361165350e-02 169 KSP Residual norm 1.680434282700e-02 170 KSP Residual norm 1.617372763782e-02 171 KSP Residual norm 1.556252932546e-02 172 KSP Residual norm 1.501093993224e-02 173 KSP Residual norm 1.451897308535e-02 174 KSP Residual norm 1.406166019521e-02 175 KSP Residual norm 1.369049275789e-02 176 KSP Residual norm 1.340585882977e-02 177 KSP Residual norm 1.315163647586e-02 178 KSP Residual norm 1.293845397043e-02 179 KSP Residual norm 1.271957482225e-02 180 KSP Residual norm 1.255431736443e-02 181 KSP Residual norm 1.240910020801e-02 182 KSP Residual norm 1.221463732114e-02 183 KSP Residual norm 1.203762451604e-02 184 KSP Residual norm 1.182658007866e-02 185 KSP Residual norm 1.161039292815e-02 186 KSP Residual norm 1.135620909067e-02 187 KSP Residual norm 1.106423678918e-02 188 KSP Residual norm 1.076461690513e-02 189 KSP Residual norm 1.045535666748e-02 190 KSP Residual norm 1.013265363660e-02 191 KSP Residual norm 9.824591528394e-03 192 KSP Residual norm 9.511814641049e-03 193 KSP Residual norm 
9.200333439991e-03 194 KSP Residual norm 8.904549160573e-03 195 KSP Residual norm 8.602773933280e-03 196 KSP Residual norm 8.316780036425e-03 197 KSP Residual norm 8.026706223186e-03 198 KSP Residual norm 7.746536046276e-03 199 KSP Residual norm 7.466624619223e-03 200 KSP Residual norm 7.189528445225e-03 201 KSP Residual norm 6.920182299197e-03 202 KSP Residual norm 6.647762013570e-03 203 KSP Residual norm 6.365955470441e-03 204 KSP Residual norm 6.092249265175e-03 205 KSP Residual norm 5.833208737328e-03 206 KSP Residual norm 5.584672393183e-03 207 KSP Residual norm 5.372408417020e-03 208 KSP Residual norm 5.210010429499e-03 Residual norm 2.99117e-05 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex45 on a arch-linux-c-opt named gra798 with 1 processor, by aminsad Wed Mar 25 12:34:19 2020 Using 1 OpenMP threads Using Petsc Development GIT revision: v3.12.4-1057-g94d548e326 GIT Date: 2020-03-24 15:34:20 +0000 Max Max/Min Avg Total Time (sec): 9.836e+01 1.000 9.836e+01 Objects: 5.700e+01 1.000 5.700e+01 Flop: 1.557e+11 1.000 1.557e+11 1.557e+11 Flop/sec: 1.583e+09 1.000 1.583e+09 1.583e+09 MPI Messages: 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Reductions: 0.000e+00 0.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg 
%Total Count %Total Avg %Total Count %Total 0: Main Stage: 9.8355e+01 100.0% 1.5568e+11 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 215 1.0 1.6618e+01 1.0 2.23e+10 1.0 0.0e+00 0.0e+00 0.0e+00 17 14 0 0 0 17 14 0 0 0 1339 MatSolve 215 1.0 2.5040e+01 1.0 2.23e+10 1.0 0.0e+00 0.0e+00 0.0e+00 25 14 0 0 0 25 14 0 0 0 889 MatLUFactorNum 1 1.0 4.6073e-01 1.0 1.71e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 371 MatILUFactorSym 1 1.0 3.3528e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 2 1.0 9.6485e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 2 1.0 2.0322e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 
0 0 0 0 MatGetRowIJ 1 1.0 2.3982e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 2.1713e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 1 1.0 8.8390e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 9.8112e+01 1.0 1.56e+11 1.0 0.0e+00 0.0e+00 0.0e+00100100 0 0 0 100100 0 0 0 1585 KSPGMRESOrthog 208 1.0 4.5682e+01 1.0 1.02e+11 1.0 0.0e+00 0.0e+00 0.0e+00 46 66 0 0 0 46 66 0 0 0 2239 DMCreateMat 1 1.0 3.2282e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 VecMDot 208 1.0 2.1305e+01 1.0 5.11e+10 1.0 0.0e+00 0.0e+00 0.0e+00 22 33 0 0 0 22 33 0 0 0 2400 VecNorm 216 1.0 1.2016e+00 1.0 3.46e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 2876 VecScale 215 1.0 1.5751e+00 1.0 1.72e+09 1.0 0.0e+00 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 1092 VecCopy 7 1.0 5.2071e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 45 1.0 7.5333e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 14 1.0 1.8468e-01 1.0 2.24e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1213 VecMAXPY 215 1.0 2.5949e+01 1.0 5.45e+10 1.0 0.0e+00 0.0e+00 0.0e+00 26 35 0 0 0 26 35 0 0 0 2099 VecNormalize 215 1.0 2.7716e+00 1.0 5.16e+09 1.0 0.0e+00 0.0e+00 0.0e+00 3 3 0 0 0 3 3 0 0 0 1862 PCSetUp 1 1.0 8.1784e-01 1.0 1.71e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 209 PCApply 215 1.0 2.5040e+01 1.0 2.23e+10 1.0 0.0e+00 0.0e+00 0.0e+00 25 14 0 0 0 25 14 0 0 0 889 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Krylov Solver 1 1 18648 0. DMKSP interface 1 1 664 0. Matrix 2 2 1626246524 0. Distributed Mesh 1 1 5552 0. Index Set 5 5 128004520 0. IS L to G Mapping 1 1 32000680 0. Star Forest Graph 2 2 2272 0. 
Discrete System 1 1 936 0. Vec Scatter 1 1 784 0. Vector 39 39 2368061280 0. Preconditioner 1 1 1016 0. Viewer 2 1 848 0. ======================================================================================================================== Average time to get PetscTime(): 3.20841e-08 #PETSc Option Table entries: -da_grid_x 200 -da_grid_y 200 -da_grid_z 200 -ksp_monitor -log_view #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-metis=1 --download-parmetis=1 --download-mumps=1 --download-superlu_dist=1 --download-pastix=1 --download-ptscotch=1 --download-hwloc=1 --download-mpi4py=1 --download-petsc4py=1 --with-python-exec=/home/aminsad/env/bin/python3.7 --with-debugging=0 --with-openmp=1 --with-blaslapack=1 --with-blaslapack-dir=/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl --with-scalapack=1 --with-scalapack-dir=/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" ----------------------------------------- Libraries compiled on 2020-03-24 19:22:01 on gra-login1 Machine characteristics: Linux-3.10.0-957.12.2.el7.x86_64-x86_64-with-centos-7.5.1804-Core Using PETSc directory: /project/6003554/aminsad/petsc Using PETSc arch: arch-linux-c-opt ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -O3 -march=native -mtune=native -fopenmp Using Fortran compiler: mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -O3 -march=native -mtune=native -fopenmp ----------------------------------------- Using include paths: -I/project/6003554/aminsad/petsc/include 
-I/project/6003554/aminsad/petsc/arch-linux-c-opt/include ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/project/6003554/aminsad/petsc/arch-linux-c-opt/lib -L/project/6003554/aminsad/petsc/arch-linux-c-opt/lib -lpetsc -Wl,-rpath,/project/6003554/aminsad/petsc/arch-linux-c-opt/lib -L/project/6003554/aminsad/petsc/arch-linux-c-opt/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl/lib/intel64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl/lib/intel64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/CUDA/gcc7.3/cuda10.0/openmpi/3.1.2/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/CUDA/gcc7.3/cuda10.0/openmpi/3.1.2/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/cuda/10.0.130/lib64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/cuda/10.0.130/lib64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-7.3.0/lib64 -L/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-7.3.0/lib64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib64 -L/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0 -L/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib64 -L/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib64 
-Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc7.3/cuda10.0/openmpi3.1/scalapack/2.0.2/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc7.3/cuda10.0/openmpi3.1/scalapack/2.0.2/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/openblas/0.2.20/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/openblas/0.2.20/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/CUDA/gcc7.3/cuda10.0/ucx/1.5.2/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/CUDA/gcc7.3/cuda10.0/ucx/1.5.2/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/lib/intel64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/lib/intel64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/scipy-stack/2019b/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/scipy-stack/2019b/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-7.3.0/lib -L/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-7.3.0/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/ifort/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/ifort/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -Wl,-rpath,/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/ifort/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -L/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/ifort/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 
-Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/icc/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/icc/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -Wl,-rpath,/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/icc/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -L/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/icc/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib -L/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib -L/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lpastix -lsuperlu_dist -lmkl_intel_lp64 -lmkl_core -lmkl_gnu_thread -lmkl_def -lpthread -lptesmumps -lptscotchparmetis -lptscotch -lptscotcherr -lesmumps -lscotch -lscotcherr -lhwloc -lparmetis -lmetis -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgcc_s -lquadmath -lrt -lm -lrt -lquadmath -lstdc++ -ldl ----------------------------------------- real 1m39.926s user 1m37.868s sys 0m1.408s
[aminsad@gra798 tutorials]$ export OMP_NUM_THREADS=1; time mpirun -n 32 ./ex45 -da_grid_x 200 -da_grid_y 200 -da_grid_z 200 -ksp_monitor -log_view 0 KSP Residual norm 5.249608433925e+02 1 KSP Residual norm 1.265428762530e+02 2 KSP Residual norm 6.440385867407e+01 3 KSP Residual norm 4.102998736678e+01 4 KSP Residual norm 2.919142735608e+01 5 KSP Residual norm 2.245685657122e+01 6 KSP Residual norm 1.816342040249e+01 7 KSP Residual norm 1.497714976981e+01 8 KSP Residual norm 1.259234289384e+01 9 KSP Residual norm 1.101676015503e+01 10 KSP Residual norm 9.716196503380e+00 11 KSP Residual norm 8.488499161510e+00 12 KSP Residual norm 7.519258438529e+00 13 KSP Residual norm 6.818253860737e+00 14 KSP Residual norm 6.199119869588e+00 15 KSP Residual norm 5.635205086411e+00 16 KSP Residual norm 5.180073034576e+00 17 KSP Residual norm 4.765696046660e+00 18 KSP Residual norm 4.392918698690e+00 19 KSP Residual norm 4.093518855484e+00 20 KSP Residual norm 3.816667635253e+00 21 KSP Residual norm 3.555324027975e+00 22 KSP Residual norm 3.339142343195e+00 23 KSP Residual norm 3.138747892495e+00 24 KSP Residual norm 2.947488891891e+00 25 KSP Residual norm 2.787612374551e+00 26 KSP Residual norm 2.640611798979e+00 27 KSP Residual norm 2.494516282558e+00 28 KSP Residual norm 2.366601361911e+00 29 KSP Residual norm 2.250829868652e+00 30 KSP Residual norm 2.137276910926e+00 31 KSP Residual norm 2.082175156166e+00 32 KSP Residual norm 2.026525914625e+00 33 KSP Residual norm 1.968243173977e+00 34 KSP Residual norm 1.910107186678e+00 35 KSP Residual norm 1.854773908180e+00 36 KSP Residual norm 1.800716724524e+00 37 KSP Residual norm 1.748707570595e+00 38 KSP Residual norm 1.698444649528e+00 39 KSP Residual norm 1.648677808684e+00 40 KSP Residual norm 1.594546744178e+00 41 KSP Residual norm 1.537145110870e+00 42 KSP Residual norm 1.480597675851e+00 43 KSP Residual norm 1.424275121339e+00 44 KSP Residual norm 1.369697596159e+00 45 KSP Residual norm 1.319809654662e+00 46 KSP Residual norm 
1.268228019997e+00 47 KSP Residual norm 1.215066065426e+00 48 KSP Residual norm 1.166081801324e+00 49 KSP Residual norm 1.118860048643e+00 50 KSP Residual norm 1.074350417538e+00 51 KSP Residual norm 1.033868164618e+00 52 KSP Residual norm 9.915336088696e-01 53 KSP Residual norm 9.515792599450e-01 54 KSP Residual norm 9.173665467694e-01 55 KSP Residual norm 8.872289804587e-01 56 KSP Residual norm 8.586539974717e-01 57 KSP Residual norm 8.317648048018e-01 58 KSP Residual norm 8.088300296813e-01 59 KSP Residual norm 7.898600584047e-01 60 KSP Residual norm 7.741169695461e-01 61 KSP Residual norm 7.666034505350e-01 62 KSP Residual norm 7.591356508830e-01 63 KSP Residual norm 7.504803488370e-01 64 KSP Residual norm 7.368231316409e-01 65 KSP Residual norm 7.233978289913e-01 66 KSP Residual norm 7.099398146419e-01 67 KSP Residual norm 6.973011549847e-01 68 KSP Residual norm 6.846076709045e-01 69 KSP Residual norm 6.714910021982e-01 70 KSP Residual norm 6.580035107203e-01 71 KSP Residual norm 6.441069344681e-01 72 KSP Residual norm 6.306745126752e-01 73 KSP Residual norm 6.165112664881e-01 74 KSP Residual norm 6.018151573093e-01 75 KSP Residual norm 5.883190756643e-01 76 KSP Residual norm 5.751191789096e-01 77 KSP Residual norm 5.613318177006e-01 78 KSP Residual norm 5.462482695909e-01 79 KSP Residual norm 5.296806580957e-01 80 KSP Residual norm 5.138832977245e-01 81 KSP Residual norm 4.994269968227e-01 82 KSP Residual norm 4.838638867626e-01 83 KSP Residual norm 4.675107521663e-01 84 KSP Residual norm 4.531422219857e-01 85 KSP Residual norm 4.395557563929e-01 86 KSP Residual norm 4.262082723883e-01 87 KSP Residual norm 4.126357626332e-01 88 KSP Residual norm 3.995884224758e-01 89 KSP Residual norm 3.859947796179e-01 90 KSP Residual norm 3.702372344967e-01 91 KSP Residual norm 3.586865166601e-01 92 KSP Residual norm 3.489986560940e-01 93 KSP Residual norm 3.383015999447e-01 94 KSP Residual norm 3.254486490032e-01 95 KSP Residual norm 3.136283114457e-01 96 KSP Residual norm 
3.034660846628e-01 97 KSP Residual norm 2.947968085267e-01 98 KSP Residual norm 2.864246436519e-01 99 KSP Residual norm 2.787719933067e-01 100 KSP Residual norm 2.711310945618e-01 101 KSP Residual norm 2.626624591672e-01 102 KSP Residual norm 2.545627706202e-01 103 KSP Residual norm 2.475841435827e-01 104 KSP Residual norm 2.416913604981e-01 105 KSP Residual norm 2.357527439001e-01 106 KSP Residual norm 2.295081907110e-01 107 KSP Residual norm 2.230040397502e-01 108 KSP Residual norm 2.174076642272e-01 109 KSP Residual norm 2.127545919621e-01 110 KSP Residual norm 2.085650347045e-01 111 KSP Residual norm 2.046563493208e-01 112 KSP Residual norm 2.008842244034e-01 113 KSP Residual norm 1.977449444019e-01 114 KSP Residual norm 1.950160913793e-01 115 KSP Residual norm 1.922131940159e-01 116 KSP Residual norm 1.889806169176e-01 117 KSP Residual norm 1.856470539408e-01 118 KSP Residual norm 1.832144729719e-01 119 KSP Residual norm 1.812124847051e-01 120 KSP Residual norm 1.793225308149e-01 121 KSP Residual norm 1.777373185059e-01 122 KSP Residual norm 1.758846604621e-01 123 KSP Residual norm 1.740208152185e-01 124 KSP Residual norm 1.712608586427e-01 125 KSP Residual norm 1.682805634955e-01 126 KSP Residual norm 1.652070403112e-01 127 KSP Residual norm 1.621375181753e-01 128 KSP Residual norm 1.592076976400e-01 129 KSP Residual norm 1.559842697067e-01 130 KSP Residual norm 1.526186414683e-01 131 KSP Residual norm 1.496657406367e-01 132 KSP Residual norm 1.468223239123e-01 133 KSP Residual norm 1.434373417124e-01 134 KSP Residual norm 1.396489916561e-01 135 KSP Residual norm 1.362772917475e-01 136 KSP Residual norm 1.334398618975e-01 137 KSP Residual norm 1.308775665505e-01 138 KSP Residual norm 1.279045646846e-01 139 KSP Residual norm 1.240511974151e-01 140 KSP Residual norm 1.203373892093e-01 141 KSP Residual norm 1.169500635143e-01 142 KSP Residual norm 1.133518041936e-01 143 KSP Residual norm 1.096762803745e-01 144 KSP Residual norm 1.065389028869e-01 145 KSP 
Residual norm 1.031650968952e-01 146 KSP Residual norm 9.972429382486e-02 147 KSP Residual norm 9.604990248094e-02 148 KSP Residual norm 9.280538885248e-02 149 KSP Residual norm 8.989799298809e-02 150 KSP Residual norm 8.622191257494e-02 151 KSP Residual norm 8.314297347813e-02 152 KSP Residual norm 8.079175656005e-02 153 KSP Residual norm 7.814331368067e-02 154 KSP Residual norm 7.499791735298e-02 155 KSP Residual norm 7.209580047382e-02 156 KSP Residual norm 6.953599436091e-02 157 KSP Residual norm 6.752121273945e-02 158 KSP Residual norm 6.556027566041e-02 159 KSP Residual norm 6.382199997339e-02 160 KSP Residual norm 6.207621031792e-02 161 KSP Residual norm 6.007190088213e-02 162 KSP Residual norm 5.813887730789e-02 163 KSP Residual norm 5.669097434266e-02 164 KSP Residual norm 5.557761371067e-02 165 KSP Residual norm 5.439832688168e-02 166 KSP Residual norm 5.299234152664e-02 167 KSP Residual norm 5.135580432085e-02 168 KSP Residual norm 4.994592829824e-02 169 KSP Residual norm 4.895876289634e-02 170 KSP Residual norm 4.814728496529e-02 171 KSP Residual norm 4.726731374541e-02 172 KSP Residual norm 4.639401434967e-02 173 KSP Residual norm 4.567042835366e-02 174 KSP Residual norm 4.501736767734e-02 175 KSP Residual norm 4.437612950997e-02 176 KSP Residual norm 4.366449825969e-02 177 KSP Residual norm 4.298455757737e-02 178 KSP Residual norm 4.246209670699e-02 179 KSP Residual norm 4.197500879820e-02 180 KSP Residual norm 4.158882418128e-02 181 KSP Residual norm 4.124933522563e-02 182 KSP Residual norm 4.080868474458e-02 183 KSP Residual norm 4.041796941975e-02 184 KSP Residual norm 3.987523091238e-02 185 KSP Residual norm 3.924764595191e-02 186 KSP Residual norm 3.857662927742e-02 187 KSP Residual norm 3.778359654969e-02 188 KSP Residual norm 3.709854700634e-02 189 KSP Residual norm 3.626492070798e-02 190 KSP Residual norm 3.542795264245e-02 191 KSP Residual norm 3.479394462214e-02 192 KSP Residual norm 3.412742365093e-02 193 KSP Residual norm 
3.328834953716e-02 194 KSP Residual norm 3.235962174909e-02 195 KSP Residual norm 3.153985569560e-02 196 KSP Residual norm 3.089595206732e-02 197 KSP Residual norm 3.033963440988e-02 198 KSP Residual norm 2.972637364100e-02 199 KSP Residual norm 2.880857317841e-02 200 KSP Residual norm 2.791389267586e-02 201 KSP Residual norm 2.710631270510e-02 202 KSP Residual norm 2.623553047862e-02 203 KSP Residual norm 2.538105620349e-02 204 KSP Residual norm 2.464820283216e-02 205 KSP Residual norm 2.376911985539e-02 206 KSP Residual norm 2.288581248338e-02 207 KSP Residual norm 2.192429050312e-02 208 KSP Residual norm 2.109943969796e-02 209 KSP Residual norm 2.044847892581e-02 210 KSP Residual norm 1.936339356184e-02 211 KSP Residual norm 1.844506606510e-02 212 KSP Residual norm 1.790565782681e-02 213 KSP Residual norm 1.725557688860e-02 214 KSP Residual norm 1.650832896252e-02 215 KSP Residual norm 1.582635222852e-02 216 KSP Residual norm 1.518356674587e-02 217 KSP Residual norm 1.471502538185e-02 218 KSP Residual norm 1.425851357051e-02 219 KSP Residual norm 1.387101825755e-02 220 KSP Residual norm 1.348994679578e-02 221 KSP Residual norm 1.304218934250e-02 222 KSP Residual norm 1.259732486832e-02 223 KSP Residual norm 1.230062109218e-02 224 KSP Residual norm 1.208052227556e-02 225 KSP Residual norm 1.185227963847e-02 226 KSP Residual norm 1.154121294881e-02 227 KSP Residual norm 1.114997137422e-02 228 KSP Residual norm 1.079989440263e-02 229 KSP Residual norm 1.058649071326e-02 230 KSP Residual norm 1.043810756544e-02 231 KSP Residual norm 1.025235384775e-02 232 KSP Residual norm 1.005047849820e-02 233 KSP Residual norm 9.883211153379e-03 234 KSP Residual norm 9.720655200692e-03 235 KSP Residual norm 9.585728202389e-03 236 KSP Residual norm 9.444639131060e-03 237 KSP Residual norm 9.321738384777e-03 238 KSP Residual norm 9.210712667920e-03 239 KSP Residual norm 9.090842501285e-03 240 KSP Residual norm 9.012724152274e-03 241 KSP Residual norm 8.944248827931e-03 242 KSP 
Residual norm 8.845894448884e-03 243 KSP Residual norm 8.764885513498e-03 244 KSP Residual norm 8.662950571828e-03 245 KSP Residual norm 8.543186887738e-03 246 KSP Residual norm 8.413744234456e-03 247 KSP Residual norm 8.224759693722e-03 248 KSP Residual norm 8.072472182981e-03 249 KSP Residual norm 7.856167731929e-03 250 KSP Residual norm 7.668475040020e-03 251 KSP Residual norm 7.547778724446e-03 252 KSP Residual norm 7.388324757543e-03 253 KSP Residual norm 7.208369897534e-03 254 KSP Residual norm 6.998054296952e-03 255 KSP Residual norm 6.807694421481e-03 256 KSP Residual norm 6.666664532810e-03 257 KSP Residual norm 6.549661309380e-03 258 KSP Residual norm 6.434743070010e-03 259 KSP Residual norm 6.233702928992e-03 260 KSP Residual norm 6.041040743172e-03 261 KSP Residual norm 5.853097295784e-03 262 KSP Residual norm 5.635354146687e-03 263 KSP Residual norm 5.466832089504e-03 264 KSP Residual norm 5.293037425952e-03 265 KSP Residual norm 5.087345356073e-03 Residual norm 3.29946e-05 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex45 on a arch-linux-c-opt named gra798 with 32 processors, by aminsad Wed Mar 25 12:31:13 2020 Using 1 OpenMP threads Using Petsc Development GIT revision: v3.12.4-1057-g94d548e326 GIT Date: 2020-03-24 15:34:20 +0000 Max Max/Min Avg Total Time (sec): 1.119e+01 1.000 1.119e+01 Objects: 7.000e+01 1.000 7.000e+01 Flop: 6.164e+09 1.001 6.161e+09 1.971e+11 Flop/sec: 5.511e+08 1.001 5.508e+08 1.763e+10 MPI Messages: 1.385e+03 1.667 1.108e+03 3.546e+04 MPI Message Lengths: 4.950e+07 1.800 3.475e+04 1.232e+09 MPI Reductions: 6.400e+02 1.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 1.1186e+01 100.0% 1.9715e+11 100.0% 3.546e+04 100.0% 3.475e+04 100.0% 6.330e+02 98.9% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 2 1.0 8.7132e-0324.3 0.00e+00 0.0 1.3e+02 4.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BuildTwoSidedF 2 1.0 3.0992e-0254.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMult 274 1.0 2.1757e+00 1.1 8.89e+08 1.0 3.5e+04 3.5e+04 0.0e+00 19 14 99100 0 19 14 99100 0 13037 MatSolve 274 1.0 2.2525e+00 1.1 8.77e+08 1.0 0.0e+00 0.0e+00 0.0e+00 20 14 0 0 0 20 14 0 0 0 12456 MatLUFactorNum 1 1.0 2.3524e-02 1.6 5.33e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 7142 MatILUFactorSym 1 1.0 2.2375e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 2 1.0 3.1108e-0244.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 2 1.0 2.2871e-02 1.0 0.00e+00 0.0 1.9e+02 1.2e+04 4.0e+00 0 0 1 0 1 0 0 1 0 1 0 MatGetRowIJ 1 1.0 4.5050e-06 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 2.1350e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 2 1.0 7.7485e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 2 0 0 0 0 2 0 KSPSolve 1 1.0 1.1128e+01 1.0 6.16e+09 1.0 3.5e+04 3.5e+04 6.2e+02 99100 99 99 96 99100 99 99 97 17705 KSPGMRESOrthog 265 1.0 5.8372e+00 1.0 4.04e+09 1.0 0.0e+00 0.0e+00 2.6e+02 51 66 0 0 41 51 66 0 0 42 22175 DMCreateMat 1 1.0 2.5385e-01 1.0 0.00e+00 0.0 
1.9e+02 1.2e+04 6.0e+00 2 0 1 0 1 2 0 1 0 1 0 SFSetGraph 2 1.0 1.8198e-04 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 2 1.0 1.6453e-02 1.4 0.00e+00 0.0 3.8e+02 1.2e+04 0.0e+00 0 0 1 0 0 0 0 1 0 0 0 SFBcastOpBegin 274 1.0 1.6522e-01 7.1 0.00e+00 0.0 3.5e+04 3.5e+04 0.0e+00 0 0 99100 0 0 0 99100 0 0 SFBcastOpEnd 274 1.0 1.2824e-01 6.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFPack 274 1.0 1.4009e-01 8.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFUnpack 274 1.0 1.4459e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 265 1.0 2.8243e+00 1.1 2.02e+09 1.0 0.0e+00 0.0e+00 2.6e+02 24 33 0 0 41 24 33 0 0 42 22915 VecNorm 275 1.0 3.0538e-01 1.3 1.38e+08 1.0 0.0e+00 0.0e+00 2.8e+02 2 2 0 0 43 2 2 0 0 43 14408 VecScale 274 1.0 5.1492e-02 1.2 6.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 42570 VecCopy 9 1.0 1.2801e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 285 1.0 1.9370e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 18 1.0 2.2425e-02 1.1 9.00e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12843 VecMAXPY 274 1.0 3.2605e+00 1.0 2.16e+09 1.0 0.0e+00 0.0e+00 0.0e+00 29 35 0 0 0 29 35 0 0 0 21150 VecScatterBegin 274 1.0 1.6638e-01 7.1 0.00e+00 0.0 3.5e+04 3.5e+04 0.0e+00 0 0 99100 0 0 0 99100 0 0 VecScatterEnd 274 1.0 1.2893e-01 6.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 274 1.0 3.5557e-01 1.3 2.06e+08 1.0 0.0e+00 0.0e+00 2.7e+02 3 3 0 0 43 3 3 0 0 43 18494 PCSetUp 2 1.0 4.7702e-02 1.3 5.33e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3522 PCSetUpOnBlocks 1 1.0 4.7516e-02 1.3 5.33e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3536 PCApply 274 1.0 2.3890e+00 1.1 8.77e+08 1.0 0.0e+00 0.0e+00 0.0e+00 21 14 0 0 0 21 14 0 0 0 11745 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in 
bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Krylov Solver 2 2 20056 0. DMKSP interface 1 1 664 0. Matrix 4 4 55612836 0. Distributed Mesh 1 1 5552 0. Index Set 7 7 4106320 0. IS L to G Mapping 1 1 1051484 0. Star Forest Graph 4 4 4544 0. Discrete System 1 1 936 0. Vec Scatter 2 2 1632 0. Vector 43 43 74172416 0. Preconditioner 2 2 1928 0. Viewer 2 1 848 0. ======================================================================================================================== Average time to get PetscTime(): 3.2899e-08 Average time for MPI_Barrier(): 1.99434e-05 Average time for zero size MPI_Send(): 5.92269e-06 #PETSc Option Table entries: -da_grid_x 200 -da_grid_y 200 -da_grid_z 200 -ksp_monitor -log_view #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-metis=1 --download-parmetis=1 --download-mumps=1 --download-superlu_dist=1 --download-pastix=1 --download-ptscotch=1 --download-hwloc=1 --download-mpi4py=1 --download-petsc4py=1 --with-python-exec=/home/aminsad/env/bin/python3.7 --with-debugging=0 --with-openmp=1 --with-blaslapack=1 --with-blaslapack-dir=/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl --with-scalapack=1 --with-scalapack-dir=/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" ----------------------------------------- Libraries compiled on 2020-03-24 19:22:01 on gra-login1 Machine characteristics: Linux-3.10.0-957.12.2.el7.x86_64-x86_64-with-centos-7.5.1804-Core Using PETSc directory: /project/6003554/aminsad/petsc Using PETSc arch: arch-linux-c-opt 
----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -O3 -march=native -mtune=native -fopenmp Using Fortran compiler: mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -O3 -march=native -mtune=native -fopenmp ----------------------------------------- Using include paths: -I/project/6003554/aminsad/petsc/include -I/project/6003554/aminsad/petsc/arch-linux-c-opt/include ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/project/6003554/aminsad/petsc/arch-linux-c-opt/lib -L/project/6003554/aminsad/petsc/arch-linux-c-opt/lib -lpetsc -Wl,-rpath,/project/6003554/aminsad/petsc/arch-linux-c-opt/lib -L/project/6003554/aminsad/petsc/arch-linux-c-opt/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl/lib/intel64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl/lib/intel64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/CUDA/gcc7.3/cuda10.0/openmpi/3.1.2/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/CUDA/gcc7.3/cuda10.0/openmpi/3.1.2/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/cuda/10.0.130/lib64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/cuda/10.0.130/lib64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-7.3.0/lib64 -L/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-7.3.0/lib64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib64 -L/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib64 
-Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0 -L/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib64 -L/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc7.3/cuda10.0/openmpi3.1/scalapack/2.0.2/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc7.3/cuda10.0/openmpi3.1/scalapack/2.0.2/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/openblas/0.2.20/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/openblas/0.2.20/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/CUDA/gcc7.3/cuda10.0/ucx/1.5.2/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/CUDA/gcc7.3/cuda10.0/ucx/1.5.2/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/lib/intel64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/lib/intel64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/scipy-stack/2019b/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/scipy-stack/2019b/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-7.3.0/lib -L/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-7.3.0/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/ifort/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 
-L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/ifort/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -Wl,-rpath,/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/ifort/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -L/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/ifort/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/icc/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/icc/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -Wl,-rpath,/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/icc/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -L/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/icc/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib -L/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib -L/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lpastix -lsuperlu_dist -lmkl_intel_lp64 -lmkl_core -lmkl_gnu_thread -lmkl_def -lpthread -lptesmumps -lptscotchparmetis -lptscotch -lptscotcherr -lesmumps -lscotch -lscotcherr -lhwloc -lparmetis -lmetis -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgcc_s -lquadmath -lrt -lm -lrt -lquadmath -lstdc++ -ldl ----------------------------------------- real 0m30.642s user 6m9.528s sys 0m19.209s
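For reference, the speedup and parallel efficiency implied by these timings can be worked out with a few lines of Python. This is only a sketch under stated assumptions: the 16- and 32-process times are the KSPSolve "Max" times reported in the two -log_view summaries in this message, the 1-process time is the approximate ~98 sec figure quoted at the top, and the names ksp_solve_time and scaling are made up for illustration.

```python
# KSPSolve wall-clock times in seconds, keyed by number of MPI processes.
# 16 and 32: KSPSolve "Max" time from the -log_view summaries.
# 1: approximate value quoted in the message (~98 s), so treat it as rough.
ksp_solve_time = {1: 98.0, 16: 1.2206e+01, 32: 1.1128e+01}


def scaling(times):
    """Return (nprocs, speedup, efficiency) tuples relative to the 1-process run."""
    base = times[1]
    rows = []
    for nprocs in sorted(times):
        speedup = base / times[nprocs]   # how many times faster than 1 process
        efficiency = speedup / nprocs    # fraction of ideal linear speedup
        rows.append((nprocs, speedup, efficiency))
    return rows


for nprocs, speedup, efficiency in scaling(ksp_solve_time):
    print(f"{nprocs:3d} procs: speedup {speedup:5.2f}x, efficiency {efficiency:6.1%}")
```

With these inputs the speedup comes out at roughly 8X on 16 processes and roughly 9X on 32, i.e. about 50% and under 30% parallel efficiency, which matches the figures quoted at the top of the message.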
[aminsad@gra798 tutorials]$ export OMP_NUM_THREADS=1; time mpirun -n 16 ./ex45 -da_grid_x 200 -da_grid_y 200 -da_grid_z 200 -ksp_monitor -log_view 0 KSP Residual norm 5.252011296306e+02 1 KSP Residual norm 1.269085380469e+02 2 KSP Residual norm 6.460593785942e+01 3 KSP Residual norm 4.112218358818e+01 4 KSP Residual norm 2.920564121051e+01 5 KSP Residual norm 2.238167545114e+01 6 KSP Residual norm 1.809801707825e+01 7 KSP Residual norm 1.496957966410e+01 8 KSP Residual norm 1.255482561725e+01 9 KSP Residual norm 1.089128192313e+01 10 KSP Residual norm 9.684749849860e+00 11 KSP Residual norm 8.498509768478e+00 12 KSP Residual norm 7.506462480505e+00 13 KSP Residual norm 6.774188282259e+00 14 KSP Residual norm 6.185966707924e+00 15 KSP Residual norm 5.626846567250e+00 16 KSP Residual norm 5.165136084636e+00 17 KSP Residual norm 4.763836534515e+00 18 KSP Residual norm 4.389745575850e+00 19 KSP Residual norm 4.083551740617e+00 20 KSP Residual norm 3.817824424805e+00 21 KSP Residual norm 3.555959460704e+00 22 KSP Residual norm 3.333383120792e+00 23 KSP Residual norm 3.140588514974e+00 24 KSP Residual norm 2.949787099521e+00 25 KSP Residual norm 2.783310493965e+00 26 KSP Residual norm 2.641514376639e+00 27 KSP Residual norm 2.497958700290e+00 28 KSP Residual norm 2.366197082156e+00 29 KSP Residual norm 2.252538884755e+00 30 KSP Residual norm 2.142135867206e+00 31 KSP Residual norm 2.088087731083e+00 32 KSP Residual norm 2.034863255730e+00 33 KSP Residual norm 1.978362295133e+00 34 KSP Residual norm 1.924644303719e+00 35 KSP Residual norm 1.871125469563e+00 36 KSP Residual norm 1.815549551568e+00 37 KSP Residual norm 1.763447437183e+00 38 KSP Residual norm 1.710625609888e+00 39 KSP Residual norm 1.655761966307e+00 40 KSP Residual norm 1.600464900493e+00 41 KSP Residual norm 1.542616350842e+00 42 KSP Residual norm 1.484353337038e+00 43 KSP Residual norm 1.428963224773e+00 44 KSP Residual norm 1.374856636946e+00 45 KSP Residual norm 1.323327598365e+00 46 KSP Residual norm 
1.271482597914e+00 47 KSP Residual norm 1.217659131364e+00 48 KSP Residual norm 1.167512041138e+00 49 KSP Residual norm 1.121080161762e+00 50 KSP Residual norm 1.076188916407e+00 51 KSP Residual norm 1.034032758160e+00 52 KSP Residual norm 9.919283225046e-01 53 KSP Residual norm 9.521150298374e-01 54 KSP Residual norm 9.179270989461e-01 55 KSP Residual norm 8.869975263710e-01 56 KSP Residual norm 8.570446420244e-01 57 KSP Residual norm 8.306474411941e-01 58 KSP Residual norm 8.082961780665e-01 59 KSP Residual norm 7.884194927067e-01 60 KSP Residual norm 7.721027976459e-01 61 KSP Residual norm 7.646063457499e-01 62 KSP Residual norm 7.570861714499e-01 63 KSP Residual norm 7.484174740567e-01 64 KSP Residual norm 7.348471428925e-01 65 KSP Residual norm 7.215526430293e-01 66 KSP Residual norm 7.084601029373e-01 67 KSP Residual norm 6.962837818918e-01 68 KSP Residual norm 6.833243360672e-01 69 KSP Residual norm 6.699878556765e-01 70 KSP Residual norm 6.554086625701e-01 71 KSP Residual norm 6.408373624372e-01 72 KSP Residual norm 6.277446823055e-01 73 KSP Residual norm 6.142948104248e-01 74 KSP Residual norm 6.000022305918e-01 75 KSP Residual norm 5.857795765289e-01 76 KSP Residual norm 5.705887930506e-01 77 KSP Residual norm 5.548180698718e-01 78 KSP Residual norm 5.383621101021e-01 79 KSP Residual norm 5.216104157021e-01 80 KSP Residual norm 5.074386555473e-01 81 KSP Residual norm 4.939002143431e-01 82 KSP Residual norm 4.786223296316e-01 83 KSP Residual norm 4.629791233076e-01 84 KSP Residual norm 4.484223822265e-01 85 KSP Residual norm 4.344808222946e-01 86 KSP Residual norm 4.205901809383e-01 87 KSP Residual norm 4.066855269507e-01 88 KSP Residual norm 3.936812618644e-01 89 KSP Residual norm 3.798557677341e-01 90 KSP Residual norm 3.642688829039e-01 91 KSP Residual norm 3.531546202076e-01 92 KSP Residual norm 3.436697706775e-01 93 KSP Residual norm 3.334139390420e-01 94 KSP Residual norm 3.214827362686e-01 95 KSP Residual norm 3.105830361752e-01 96 KSP Residual norm 
3.006573372550e-01 97 KSP Residual norm 2.919852226668e-01 98 KSP Residual norm 2.837053971355e-01 99 KSP Residual norm 2.760621413616e-01 100 KSP Residual norm 2.687346237461e-01 101 KSP Residual norm 2.609819595786e-01 102 KSP Residual norm 2.528946160957e-01 103 KSP Residual norm 2.457133881066e-01 104 KSP Residual norm 2.392686662912e-01 105 KSP Residual norm 2.329573066980e-01 106 KSP Residual norm 2.266279841906e-01 107 KSP Residual norm 2.203850216323e-01 108 KSP Residual norm 2.149475637202e-01 109 KSP Residual norm 2.102078084395e-01 110 KSP Residual norm 2.057138299439e-01 111 KSP Residual norm 2.014233338469e-01 112 KSP Residual norm 1.973487860007e-01 113 KSP Residual norm 1.940436641312e-01 114 KSP Residual norm 1.913409920407e-01 115 KSP Residual norm 1.885713686476e-01 116 KSP Residual norm 1.855042991634e-01 117 KSP Residual norm 1.824220293074e-01 118 KSP Residual norm 1.800292281797e-01 119 KSP Residual norm 1.779251083190e-01 120 KSP Residual norm 1.759341208617e-01 121 KSP Residual norm 1.743150235495e-01 122 KSP Residual norm 1.724795638604e-01 123 KSP Residual norm 1.705221288713e-01 124 KSP Residual norm 1.676932169980e-01 125 KSP Residual norm 1.646268740100e-01 126 KSP Residual norm 1.615002730863e-01 127 KSP Residual norm 1.586188448183e-01 128 KSP Residual norm 1.557645078418e-01 129 KSP Residual norm 1.527526170434e-01 130 KSP Residual norm 1.493176088185e-01 131 KSP Residual norm 1.462297870194e-01 132 KSP Residual norm 1.434565929145e-01 133 KSP Residual norm 1.404892355857e-01 134 KSP Residual norm 1.371870451937e-01 135 KSP Residual norm 1.339384352225e-01 136 KSP Residual norm 1.306471807638e-01 137 KSP Residual norm 1.274325029632e-01 138 KSP Residual norm 1.240057257976e-01 139 KSP Residual norm 1.203816729048e-01 140 KSP Residual norm 1.171656093590e-01 141 KSP Residual norm 1.140688414886e-01 142 KSP Residual norm 1.105775315359e-01 143 KSP Residual norm 1.070877351950e-01 144 KSP Residual norm 1.040724523655e-01 145 KSP 
Residual norm 1.008073971422e-01 146 KSP Residual norm 9.737181518732e-02 147 KSP Residual norm 9.379374041303e-02 148 KSP Residual norm 9.068718405460e-02 149 KSP Residual norm 8.779877197248e-02 150 KSP Residual norm 8.453450720005e-02 151 KSP Residual norm 8.181413112968e-02 152 KSP Residual norm 7.953718155597e-02 153 KSP Residual norm 7.704925325943e-02 154 KSP Residual norm 7.406755829295e-02 155 KSP Residual norm 7.132448853025e-02 156 KSP Residual norm 6.893057585077e-02 157 KSP Residual norm 6.700553958734e-02 158 KSP Residual norm 6.513559184924e-02 159 KSP Residual norm 6.341814343195e-02 160 KSP Residual norm 6.175395638360e-02 161 KSP Residual norm 5.993905178657e-02 162 KSP Residual norm 5.809390206444e-02 163 KSP Residual norm 5.651625392691e-02 164 KSP Residual norm 5.516481163454e-02 165 KSP Residual norm 5.382898256093e-02 166 KSP Residual norm 5.246446292218e-02 167 KSP Residual norm 5.102866595862e-02 168 KSP Residual norm 4.975691207155e-02 169 KSP Residual norm 4.875888692860e-02 170 KSP Residual norm 4.784209425806e-02 171 KSP Residual norm 4.687139140795e-02 172 KSP Residual norm 4.597973826973e-02 173 KSP Residual norm 4.526136800521e-02 174 KSP Residual norm 4.465092395672e-02 175 KSP Residual norm 4.401801892455e-02 176 KSP Residual norm 4.332105664660e-02 177 KSP Residual norm 4.266489234954e-02 178 KSP Residual norm 4.214113121537e-02 179 KSP Residual norm 4.165673847455e-02 180 KSP Residual norm 4.125022092097e-02 181 KSP Residual norm 4.089238732284e-02 182 KSP Residual norm 4.045419997201e-02 183 KSP Residual norm 4.002800443263e-02 184 KSP Residual norm 3.944889951992e-02 185 KSP Residual norm 3.876192471825e-02 186 KSP Residual norm 3.803411760940e-02 187 KSP Residual norm 3.728927681576e-02 188 KSP Residual norm 3.662378887183e-02 189 KSP Residual norm 3.590447142495e-02 190 KSP Residual norm 3.504576585543e-02 191 KSP Residual norm 3.436164820657e-02 192 KSP Residual norm 3.373410870737e-02 193 KSP Residual norm 
3.301397179984e-02 194 KSP Residual norm 3.222407328719e-02 195 KSP Residual norm 3.146621879336e-02 196 KSP Residual norm 3.072565954475e-02 197 KSP Residual norm 3.002574169124e-02 198 KSP Residual norm 2.924916714637e-02 199 KSP Residual norm 2.833670491994e-02 200 KSP Residual norm 2.751925323545e-02 201 KSP Residual norm 2.676985878486e-02 202 KSP Residual norm 2.591786520764e-02 203 KSP Residual norm 2.508534332660e-02 204 KSP Residual norm 2.440138052528e-02 205 KSP Residual norm 2.359185540008e-02 206 KSP Residual norm 2.271856404241e-02 207 KSP Residual norm 2.177295471962e-02 208 KSP Residual norm 2.099615013328e-02 209 KSP Residual norm 2.032804701703e-02 210 KSP Residual norm 1.951790621987e-02 211 KSP Residual norm 1.881891438066e-02 212 KSP Residual norm 1.828074450719e-02 213 KSP Residual norm 1.766999245313e-02 214 KSP Residual norm 1.691535865060e-02 215 KSP Residual norm 1.623473463454e-02 216 KSP Residual norm 1.565194875541e-02 217 KSP Residual norm 1.522264335160e-02 218 KSP Residual norm 1.478831877580e-02 219 KSP Residual norm 1.438968625930e-02 220 KSP Residual norm 1.400621242186e-02 221 KSP Residual norm 1.356595900212e-02 222 KSP Residual norm 1.310962940697e-02 223 KSP Residual norm 1.275296841695e-02 224 KSP Residual norm 1.246410895588e-02 225 KSP Residual norm 1.218040732758e-02 226 KSP Residual norm 1.188115502230e-02 227 KSP Residual norm 1.154571508336e-02 228 KSP Residual norm 1.123810375785e-02 229 KSP Residual norm 1.102650608938e-02 230 KSP Residual norm 1.084564289822e-02 231 KSP Residual norm 1.062524692085e-02 232 KSP Residual norm 1.041943556772e-02 233 KSP Residual norm 1.025490221536e-02 234 KSP Residual norm 1.010581183942e-02 235 KSP Residual norm 9.963807247023e-03 236 KSP Residual norm 9.811275591639e-03 237 KSP Residual norm 9.680100844331e-03 238 KSP Residual norm 9.564404658743e-03 239 KSP Residual norm 9.448271153436e-03 240 KSP Residual norm 9.360235918394e-03 241 KSP Residual norm 9.281648889877e-03 242 KSP 
Residual norm 9.180477620278e-03 243 KSP Residual norm 9.088600775967e-03 244 KSP Residual norm 8.972789876572e-03 245 KSP Residual norm 8.826074058557e-03 246 KSP Residual norm 8.669810341657e-03 247 KSP Residual norm 8.483400503458e-03 248 KSP Residual norm 8.334114168226e-03 249 KSP Residual norm 8.156873150450e-03 250 KSP Residual norm 7.952242283128e-03 251 KSP Residual norm 7.810417349836e-03 252 KSP Residual norm 7.667573073352e-03 253 KSP Residual norm 7.498257108287e-03 254 KSP Residual norm 7.322123501512e-03 255 KSP Residual norm 7.152360213063e-03 256 KSP Residual norm 6.986230021998e-03 257 KSP Residual norm 6.826448226367e-03 258 KSP Residual norm 6.655864004730e-03 259 KSP Residual norm 6.439232794595e-03 260 KSP Residual norm 6.241440743907e-03 261 KSP Residual norm 6.069475631917e-03 262 KSP Residual norm 5.864587364995e-03 263 KSP Residual norm 5.675068210926e-03 264 KSP Residual norm 5.525430271317e-03 265 KSP Residual norm 5.327127858897e-03 266 KSP Residual norm 5.110152028873e-03 Residual norm 3.21743e-05 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex45 on a arch-linux-c-opt named gra798 with 16 processors, by aminsad Wed Mar 25 12:50:28 2020 Using 1 OpenMP threads Using Petsc Development GIT revision: v3.12.4-1057-g94d548e326 GIT Date: 2020-03-24 15:34:20 +0000 Max Max/Min Avg Total Time (sec): 1.229e+01 1.000 1.229e+01 Objects: 7.000e+01 1.000 7.000e+01 Flop: 1.240e+10 1.000 1.239e+10 1.983e+11 Flop/sec: 1.009e+09 1.000 1.008e+09 1.613e+10 MPI Messages: 1.112e+03 1.333 9.730e+02 1.557e+04 MPI Message Lengths: 6.624e+07 1.500 5.673e+04 8.832e+08 MPI Reductions: 6.420e+02 1.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 1.2291e+01 100.0% 1.9832e+11 100.0% 1.557e+04 100.0% 5.673e+04 100.0% 6.350e+02 98.9% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage BuildTwoSided 2 1.0 1.7242e-0245.8 0.00e+00 0.0 5.6e+01 4.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BuildTwoSidedF 2 1.0 8.6203e-02799.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMult 275 1.0 2.2680e+00 1.0 1.78e+09 1.0 1.5e+04 5.7e+04 0.0e+00 18 14 99100 0 18 14 99100 0 12552 MatSolve 275 1.0 2.6305e+00 1.0 1.77e+09 1.0 0.0e+00 0.0e+00 0.0e+00 21 14 0 0 0 21 14 0 0 0 10739 MatLUFactorNum 1 1.0 3.7381e-02 1.2 1.06e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4517 MatILUFactorSym 1 1.0 2.8429e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 2 1.0 8.6353e-02354.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 2 1.0 3.6191e-02 1.0 0.00e+00 0.0 8.4e+01 1.9e+04 4.0e+00 0 0 1 0 1 0 0 1 0 1 0 MatGetRowIJ 1 1.0 6.7940e-06 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 3.3286e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 2 1.0 1.1705e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 2 0 0 0 0 2 0 KSPSolve 1 1.0 1.2206e+01 1.0 1.24e+10 1.0 1.5e+04 5.7e+04 6.2e+02 99100 99 99 96 99100 99 99 97 16236 KSPGMRESOrthog 266 1.0 5.9548e+00 1.0 8.14e+09 1.0 0.0e+00 0.0e+00 2.7e+02 48 66 0 0 41 48 66 0 0 42 21877 DMCreateMat 1 1.0 4.0670e-01 1.0 0.00e+00 
0.0 8.4e+01 1.9e+04 6.0e+00 3 0 1 0 1 3 0 1 0 1 0 SFSetGraph 2 1.0 2.7592e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 2 1.0 3.1398e-02 1.4 0.00e+00 0.0 1.7e+02 1.9e+04 0.0e+00 0 0 1 0 0 0 0 1 0 0 0 SFBcastOpBegin 275 1.0 6.2169e-02 2.7 0.00e+00 0.0 1.5e+04 5.7e+04 0.0e+00 0 0 99100 0 0 0 99100 0 0 SFBcastOpEnd 275 1.0 9.5892e-02 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFPack 275 1.0 5.3985e-02 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFUnpack 275 1.0 8.3809e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 266 1.0 2.7815e+00 1.0 4.07e+09 1.0 0.0e+00 0.0e+00 2.7e+02 22 33 0 0 41 22 33 0 0 42 23418 VecNorm 276 1.0 2.7810e-01 1.4 2.76e+08 1.0 0.0e+00 0.0e+00 2.8e+02 2 2 0 0 43 2 2 0 0 43 15879 VecScale 275 1.0 8.1842e-02 1.1 1.38e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 26881 VecCopy 9 1.0 1.2840e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 286 1.0 2.3125e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 VecAXPY 18 1.0 2.6113e-02 1.1 1.80e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 11029 VecMAXPY 275 1.0 3.4257e+00 1.0 4.34e+09 1.0 0.0e+00 0.0e+00 0.0e+00 27 35 0 0 0 27 35 0 0 0 20257 VecScatterBegin 275 1.0 6.2997e-02 2.7 0.00e+00 0.0 1.5e+04 5.7e+04 0.0e+00 0 0 99100 0 0 0 99100 0 0 VecScatterEnd 275 1.0 9.6426e-02 5.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 275 1.0 3.5998e-01 1.3 4.12e+08 1.0 0.0e+00 0.0e+00 2.8e+02 3 3 0 0 43 3 3 0 0 43 18334 PCSetUp 2 1.0 6.9312e-02 1.2 1.06e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 2436 PCSetUpOnBlocks 1 1.0 6.9197e-02 1.2 1.06e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 2440 PCApply 275 1.0 2.8535e+00 1.0 1.77e+09 1.0 0.0e+00 0.0e+00 0.0e+00 23 14 0 0 0 23 14 0 0 0 9899 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in 
bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Krylov Solver 2 2 20056 0. DMKSP interface 1 1 664 0. Matrix 4 4 111372836 0. Distributed Mesh 1 1 5552 0. Index Set 7 7 8166320 0. IS L to G Mapping 1 1 2081684 0. Star Forest Graph 4 4 4544 0. Discrete System 1 1 936 0. Vec Scatter 2 2 1632 0. Vector 43 43 148232416 0. Preconditioner 2 2 1928 0. Viewer 2 1 848 0. ======================================================================================================================== Average time to get PetscTime(): 3.32948e-08 Average time for MPI_Barrier(): 3.54443e-06 Average time for zero size MPI_Send(): 5.24169e-06 #PETSc Option Table entries: -da_grid_x 200 -da_grid_y 200 -da_grid_z 200 -ksp_monitor -log_view #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-metis=1 --download-parmetis=1 --download-mumps=1 --download-superlu_dist=1 --download-pastix=1 --download-ptscotch=1 --download-hwloc=1 --download-mpi4py=1 --download-petsc4py=1 --with-python-exec=/home/aminsad/env/bin/python3.7 --with-debugging=0 --with-openmp=1 --with-blaslapack=1 --with-blaslapack-dir=/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl --with-scalapack=1 --with-scalapack-dir=/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" ----------------------------------------- Libraries compiled on 2020-03-24 19:22:01 on gra-login1 Machine characteristics: Linux-3.10.0-957.12.2.el7.x86_64-x86_64-with-centos-7.5.1804-Core Using PETSc directory: /project/6003554/aminsad/petsc Using PETSc arch: arch-linux-c-opt 
-----------------------------------------
Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -O3 -march=native -mtune=native -fopenmp
Using Fortran compiler: mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -O3 -march=native -mtune=native -fopenmp
-----------------------------------------
Using include paths: -I/project/6003554/aminsad/petsc/include -I/project/6003554/aminsad/petsc/arch-linux-c-opt/include
-----------------------------------------
Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/project/6003554/aminsad/petsc/arch-linux-c-opt/lib -L/project/6003554/aminsad/petsc/arch-linux-c-opt/lib -lpetsc -Wl,-rpath,/project/6003554/aminsad/petsc/arch-linux-c-opt/lib -L/project/6003554/aminsad/petsc/arch-linux-c-opt/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl/lib/intel64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/mkl/lib/intel64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/CUDA/gcc7.3/cuda10.0/openmpi/3.1.2/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/CUDA/gcc7.3/cuda10.0/openmpi/3.1.2/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/cuda/10.0.130/lib64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/cuda/10.0.130/lib64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-7.3.0/lib64 -L/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-7.3.0/lib64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib64 -L/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib64
-Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0 -L/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib64 -L/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc7.3/cuda10.0/openmpi3.1/scalapack/2.0.2/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/MPI/gcc7.3/cuda10.0/openmpi3.1/scalapack/2.0.2/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/openblas/0.2.20/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/openblas/0.2.20/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/CUDA/gcc7.3/cuda10.0/ucx/1.5.2/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/CUDA/gcc7.3/cuda10.0/ucx/1.5.2/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/lib/intel64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2019.3.199/lib/intel64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/scipy-stack/2019b/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/scipy-stack/2019b/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-7.3.0/lib -L/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/gcc-7.3.0/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/ifort/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 
-L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/ifort/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -Wl,-rpath,/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/ifort/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -L/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/ifort/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/icc/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/icc/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -Wl,-rpath,/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/icc/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -L/cvmfs/restricted.computecanada.ca/easybuild/software/2017/Core/icc/2016.4.258/compilers_and_libraries_2016.4.258/linux/compiler/lib/intel64 -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib -L/cvmfs/soft.computecanada.ca/nix/var/nix/profiles/16.09/lib -Wl,-rpath,/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib -L/cvmfs/soft.computecanada.ca/nix/store/c9qaklf3dvjvlbky3fiakmafb1p8l106-gfortran-7.3.0/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lpastix -lsuperlu_dist -lmkl_intel_lp64 -lmkl_core -lmkl_gnu_thread -lmkl_def -lpthread -lptesmumps -lptscotchparmetis -lptscotch -lptscotcherr -lesmumps -lscotch -lscotcherr -lhwloc -lparmetis -lmetis -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgcc_s -lquadmath -lrt -lm -lrt -lquadmath -lstdc++ -ldl
-----------------------------------------

real 0m22.546s
user 3m19.315s
sys 0m8.018s
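
For reference, the speedups quoted at the top of this message can be restated as parallel efficiencies (speedup divided by the number of MPI processes). The snippet below is only a rough sketch: it uses the rounded KSPSolve times from my summary (~98 s, ~12 s, ~11 s), not values parsed from the logs, and the name `parallel_efficiency` is made up for illustration.

```python
# Rounded KSPSolve wall-clock times (seconds) from the summary at the top
# of this message, keyed by number of MPI processes. These are approximate.
timings = {1: 98.0, 16: 12.0, 32: 11.0}


def parallel_efficiency(timings):
    """Return {nprocs: (speedup, efficiency)} relative to the 1-process run.

    speedup    = T(1) / T(n)
    efficiency = speedup / n
    """
    t1 = timings[1]
    return {n: (t1 / t, t1 / (t * n)) for n, t in sorted(timings.items())}


results = parallel_efficiency(timings)
for n, (speedup, eff) in results.items():
    print(f"{n:3d} procs: speedup {speedup:5.2f}x, efficiency {eff:6.1%}")
```

With these rounded numbers the efficiency comes out at roughly 51% on 16 processes and roughly 28% on 32 processes, which is the gap I am asking about.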