Hello!
I want to compute the 3 smallest eigenvalues, and the attachment is the -log_view output. I can see that EPSSolve takes most of the time, but I cannot tell from the log why it is so expensive. Can you see anything in it?
Thank you!
Runfeng Jin
On 6/4/2022 01:37, Jose E. Roman <[email protected]> wrote:
Convergence depends on the distribution of the eigenvalues you want to compute. On the other hand, the cost also depends on the time it takes to build the preconditioner. Use -log_view to see the cost of the different steps of the computation.
Jose
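For example, a diagnostic run might add standard SLEPc monitoring options alongside -log_view (just a sketch of a typical invocation; "./MRCI" stands in for the actual executable):

    mpiexec -n 32 ./MRCI -eps_type gd -eps_nev 3 -eps_smallest_real -eps_monitor_conv -eps_view -log_view

Here -eps_monitor_conv prints each eigenvalue as it converges and -eps_view reports the solver settings actually used, which helps separate slow convergence from expensive iterations.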
On 3 Jun 2022, at 18:50, jsfaraway <[email protected]> wrote:
Hello!
I am trying to use EPS GD to compute a matrix's smallest eigenvalue, and I have found a strange thing. There are two matrices, A (900000 x 900000) and B (90000 x 90000). Solving A takes 371 iterations and only 30.83 s, while solving B takes 22 iterations and 38885 s! What could be the reason for this? Or what can I do to find the reason?
I use "-eps_type gd -eps_ncv 300 -eps_nev 3 -eps_smallest_real".
One difference I can tell is that matrix B has many small values, whose absolute values are less than 10^-6. Could this be the reason?
Thank you!
Runfeng Jin
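For reference, a minimal standalone SLEPc sketch of the kind of GD setup described above (this is only an illustration on a stand-in tridiagonal test matrix, not the actual MRCI code; the API calls are standard SLEPc, everything else is a placeholder):

    static char help[] = "Compute the 3 smallest eigenvalues of a test matrix with the GD solver.\n";

    #include <slepceps.h>

    int main(int argc,char **argv)
    {
      Mat            A;
      EPS            eps;
      PetscInt       n = 1000, i, Istart, Iend, nconv;
      PetscScalar    kr, ki;
      PetscErrorCode ierr;

      ierr = SlepcInitialize(&argc,&argv,(char*)0,help);if (ierr) return ierr;

      /* Stand-in symmetric matrix: 1-D Laplacian (tridiagonal), assembled in parallel */
      ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
      ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
      ierr = MatSetFromOptions(A);CHKERRQ(ierr);
      ierr = MatSetUp(A);CHKERRQ(ierr);
      ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
      for (i=Istart;i<Iend;i++) {
        if (i>0)   { ierr = MatSetValue(A,i,i-1,-1.0,INSERT_VALUES);CHKERRQ(ierr); }
        if (i<n-1) { ierr = MatSetValue(A,i,i+1,-1.0,INSERT_VALUES);CHKERRQ(ierr); }
        ierr = MatSetValue(A,i,i,2.0,INSERT_VALUES);CHKERRQ(ierr);
      }
      ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

      /* GD solver for the 3 smallest (leftmost) eigenvalues of the standard Hermitian problem */
      ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr);
      ierr = EPSSetOperators(eps,A,NULL);CHKERRQ(ierr);
      ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr);
      ierr = EPSSetType(eps,EPSGD);CHKERRQ(ierr);
      ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL);CHKERRQ(ierr);
      ierr = EPSSetDimensions(eps,3,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr);
      ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);  /* picks up -eps_ncv, -eps_gd_blocksize, ... */

      ierr = EPSSolve(eps);CHKERRQ(ierr);
      ierr = EPSGetConverged(eps,&nconv);CHKERRQ(ierr);
      for (i=0;i<nconv;i++) {
        ierr = EPSGetEigenpair(eps,i,&kr,&ki,NULL,NULL);CHKERRQ(ierr);
        ierr = PetscPrintf(PETSC_COMM_WORLD,"lambda[%D] = %g\n",i,(double)PetscRealPart(kr));CHKERRQ(ierr);
      }

      ierr = EPSDestroy(&eps);CHKERRQ(ierr);
      ierr = MatDestroy(&A);CHKERRQ(ierr);
      ierr = SlepcFinalize();
      return ierr;
    }

Running such a program with "-eps_ncv 300 -log_view" produces the same kind of profile as the one attached below.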
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document             ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
##########################################################
# #
# WARNING!!! #
# #
# This code was compiled with a debugging option. #
# To get timing results run ./configure #
# using --with-debugging=no, the performance will #
# be generally two or three times faster. #
# #
##########################################################
/public/home/jrf/works/ecMRCI-shaula/MRCI on a named h04r1n11 with 32 processors, by jrf Fri Jun 10 16:41:22 2022
Using Petsc Release Version 3.15.1, Jun 17, 2021
                         Max       Max/Min     Avg       Total
Time (sec):           1.115e+03     1.000   1.115e+03
Objects:              1.789e+03     1.178   1.553e+03
Flop:                 3.989e+07     1.129   3.715e+07  1.189e+09
Flop/sec:             3.578e+04     1.129   3.332e+04  1.066e+06
Memory:               2.899e+09     6.128   5.491e+08  1.757e+10
MPI Messages:         5.472e+03     1.460   4.932e+03  1.578e+05
MPI Message Lengths:  4.115e+06     7.568   3.233e+02  5.103e+07
MPI Reductions:       3.091e+04     1.000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flop
                            and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total    Count   %Total     Avg        %Total    Count   %Total
 0:      Main Stage: 1.1148e+03 100.0%  1.1887e+09 100.0%  1.578e+05 100.0%  3.233e+02     100.0%  3.088e+04  99.9%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   AvgLen: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase            %F - percent flop in this phase
      %M - percent messages in this phase        %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
##########################################################
# #
# WARNING!!! #
# #
# This code was compiled with a debugging option. #
# To get timing results run ./configure #
# using --with-debugging=no, the performance will #
# be generally two or three times faster. #
# #
##########################################################
Event                Count      Time (sec)      Flop                             --- Global ---  --- Stage ----  Total
                   Max Ratio  Max      Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
BuildTwoSided          3 1.0 4.3622e-01   1.4 0.00e+00 0.0 9.3e+02 4.0e+00 6.0e+00  0  0  1  0  0   0  0  1  0  0     0
BuildTwoSidedF         2 1.0 3.1168e-01   1.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMult              168 1.0 4.0797e+00   1.1 5.52e+06 2.3 1.6e+05 3.0e+02 2.0e+00  0 10 100 92  0   0 10 100 92  0    29
MatSolve             327 1.0 3.3158e-02   6.3 3.08e+06 1.8 0.0e+00 0.0e+00 0.0e+00  0  6  0  0  0   0  6  0  0  0  2201
MatLUFactorNum         1 1.0 8.0269e-04   2.9 1.85e+05 3.4 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4309
MatILUFactorSym        1 1.0 7.4272e-03  66.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       3 1.0 5.7829e-01   1.6 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         3 1.0 1.5648e+00   1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.8e+01  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            1 1.0 7.4893e-05  97.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCreateSubMat        1 1.0 2.7604e+00   1.0 0.00e+00 0.0 1.6e+02 2.6e+04 5.5e+01  0  0  0  8  0   0  0  0  8  0     0
MatGetOrdering         1 1.0 3.0672e-04   2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatZeroEntries        60 1.0 1.2657e+00   1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView                1 1.0 2.4659e+01   1.0 0.00e+00 0.0 2.2e+02 1.9e+04 6.7e+01  2  0  0  8  0   2  0  0  8  0     0
VecNorm                3 1.0 2.1451e-01   1.4 4.80e+02 1.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCopy              834 1.0 1.4403e-02  11.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               338 1.0 7.0060e-04   1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                3 1.0 5.4741e-05   5.5 4.80e+02 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   279
VecScatterBegin      171 1.0 4.4001e-01   1.3 0.00e+00 0.0 1.6e+05 3.0e+02 5.0e+00  0  0 100 92  0   0  0 100 92  0     0
VecScatterEnd        171 1.0 3.9038e+00   1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSetRandom           3 1.0 9.7709e-03 312.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceArith       549 1.0 1.1454e-03   1.2 8.73e+04 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2426
VecReduceComm        384 1.0 2.4445e+01   1.0 0.00e+00 0.0 0.0e+00 0.0e+00 7.7e+02  2  0  0  0  2   2  0  0  0  2     0
SFSetGraph             2 1.0 1.5990e-05   3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFSetUp                4 1.0 2.6563e-01   1.2 0.00e+00 0.0 1.9e+03 7.7e+01 2.0e+00  0  0  1  0  0   0  0  1  0  0     0
SFPack               171 1.0 1.9103e-02  25.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFUnpack             171 1.0 1.6528e-04   1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
EPSSetUp               1 1.0 1.0828e+01   1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.5e+02  1  0  0  0  1   1  0  0  0  1     0
EPSSolve               1 1.0 1.0803e+03   1.0 3.98e+07 1.1 1.5e+05 3.0e+02 3.1e+04 97 100 98 90 99  97 100 98 90 99     1
STSetUp                1 1.0 3.4610e-01   1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
STComputeOperatr       1 1.0 2.8391e-01   1.2 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
BVCreate             169 1.0 2.6636e+02   1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.9e+03 24  0  0  0 19  24  0  0  0 19     0
BVCopy               336 1.0 4.2180e+01   1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.3e+03  4  0  0  0  4   4  0  0  0  4     0
BVMultVec            972 1.0 2.6155e+01   1.0 1.09e+07 1.0 0.0e+00 0.0e+00 8.3e+02  2 29  0  0  3   2 29  0  0  3    13
BVMultInPlace        171 1.0 4.6021e-02   8.8 1.28e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0 34  0  0  0   0 34  0  0  0  8827
BVDot                274 1.0 1.7569e+01   1.0 4.38e+06 1.0 0.0e+00 0.0e+00 5.5e+02  2 12  0  0  2   2 12  0  0  2     8
BVDotVec             369 1.0 4.9438e+01   1.0 3.06e+06 1.0 0.0e+00 0.0e+00 1.6e+03  4  8  0  0  5   4  8  0  0  5     2
BVOrthogonalizeV     165 1.0 9.1103e+01   1.0 6.04e+06 1.0 0.0e+00 0.0e+00 2.9e+03  8 16  0  0  9   8 16  0  0  9     2
BVScale              219 1.0 8.6838e-03  16.3 1.75e+04 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    64
BVSetRandom            3 1.0 4.2140e-01   1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
BVMatProject         220 1.0 1.7618e+01   1.0 4.38e+06 1.0 0.0e+00 0.0e+00 5.5e+02  2 12  0  0  2   2 12  0  0  2     8
DSSolve               58 1.0 1.9047e+00   1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
DSVectors            330 1.0 1.8081e-02  10.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
DSOther              117 1.0 3.8670e-02   5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp               1 1.0 1.6990e-05  23.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve             327 1.0 5.3268e-02   5.0 3.08e+06 1.8 0.0e+00 0.0e+00 0.0e+00  0  6  0  0  0   0  6  0  0  0  1370
PCSetUp                2 1.0 1.0324e-02  10.9 1.85e+05 3.4 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   335
PCApply              327 1.0 5.8199e-02   3.7 3.26e+06 1.9 0.0e+00 0.0e+00 0.0e+00  0  6  0  0  0   0  6  0  0  0  1313
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 624 624 175636580 0.
Vector 695 695 58611248 0.
Index Set 15 15 30496 0.
Star Forest Graph 5 5 6600 0.
Viewer 3 2 1696 0.
EPS Solver 1 1 94404 0.
Spectral Transform 1 1 908 0.
Basis Vectors 170 170 537168 0.
Region 1 1 680 0.
Direct Solver 1 1 259622228 0.
Krylov Solver 2 2 3200 0.
Preconditioner 2 2 1936 0.
PetscRandom 1 1 670 0.
========================================================================================================================
Average time to get PetscTime(): 5e-08
Average time for MPI_Barrier(): 0.0273054
rank 1Time used for building and solve ec-Hamiltonian is 1250.734 s
rank 2Time used for building and solve ec-Hamiltonian is 1250.649 s
rank 3Time used for building and solve ec-Hamiltonian is 1250.625 s
rank 4Time used for building and solve ec-Hamiltonian is 1250.596 s
rank 5Time used for building and solve ec-Hamiltonian is 1250.554 s
rank 7Time used for building and solve ec-Hamiltonian is 1250.46 s
rank 8Time used for building and solve ec-Hamiltonian is 1250.417 s
rank 6Time used for building and solve ec-Hamiltonian is 1250.526 s
rank 9Time used for building and solve ec-Hamiltonian is 1250.37 s
rank 11Time used for building and solve ec-Hamiltonian is 1250.339 s
rank 10Time used for building and solve ec-Hamiltonian is 1250.628 s
rank 12Time used for building and solve ec-Hamiltonian is 1250.352 s
rank 13Time used for building and solve ec-Hamiltonian is 1250.579 s
rank 14Time used for building and solve ec-Hamiltonian is 1250.32 s
rank 15Time used for building and solve ec-Hamiltonian is 1250.365 s
rank 16Time used for building and solve ec-Hamiltonian is 1250.387 s
rank 17Time used for building and solve ec-Hamiltonian is 1250.268 s
rank 18Time used for building and solve ec-Hamiltonian is 1250.458 s
rank 19Time used for building and solve ec-Hamiltonian is 1250.428 s
rank 21Time used for building and solve ec-Hamiltonian is 1250.407 s
rank 24Time used for building and solve ec-Hamiltonian is 1250.393 s
rank 25Time used for building and solve ec-Hamiltonian is 1250.424 s
rank 20Time used for building and solve ec-Hamiltonian is 1250.341 s
rank 23Time used for building and solve ec-Hamiltonian is 1250.435 s
rank 22Time used for building and solve ec-Hamiltonian is 1250.26 s
rank 26Time used for building and solve ec-Hamiltonian is 1250.268 s
rank 27Time used for building and solve ec-Hamiltonian is 1250.274 s
rank 28Time used for building and solve ec-Hamiltonian is 1250.356 s
rank 30Time used for building and solve ec-Hamiltonian is 1250.224 s
rank 29Time used for building and solve ec-Hamiltonian is 1250.396 s
rank 31Time used for building and solve ec-Hamiltonian is 1250.324 s
Average time for zero size MPI_Send(): 0.00394064
#PETSc Option Table entries:
-eps_gd_blocksize 3
-eps_gd_initial_size 3
-eps_ncv 5000
-eps_type gd
-log_view
-mat_view ::ascii_matlab
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90
--with-blaslapack=1
--with-blaslapack-dir=/public/software/compiler/intel/oneapi/mkl/2021.3.0
--with-64-bit-blas-indices=0 --with-boost=1
--with-boost-dir=/public/home/jrf/tools/boost_1_73_0/gcc7.3.1
--prefix=/public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices
--with-valgrind-dir=/public/home/jrf/tools/valgrind
--LDFLAGS=-Wl,-rpath=/opt/rh/devtoolset-7/root/usr/lib64
-Wl,-rpath=/opt/rh/devtoolset-7/root/usr/lib --with-64-bit-indices=0
--with-petsc-arch=gcc7.3.1-32indices
-----------------------------------------
Libraries compiled on 2021-09-27 14:16:48 on login09
Machine characteristics: Linux-3.10.0-957.el7.x86_64-x86_64-with-centos-7.6.1810-Core
Using PETSc directory: /public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices
Using PETSc arch:
-----------------------------------------
Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3
Using Fortran compiler: mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g
-----------------------------------------
Using include paths: -I/public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices/include -I/public/home/jrf/tools/boost_1_73_0/gcc7.3.1/include -I/public/home/jrf/tools/valgrind/include
-----------------------------------------
Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries:
-Wl,-rpath,/public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices/lib
-L/public/home/jrf/tools/petsc3.15.1/gcc7.3.1-32indices/lib -lpetsc
-Wl,-rpath,/public/software/compiler/intel/oneapi/mkl/2021.3.0/lib/intel64
-L/public/software/compiler/intel/oneapi/mkl/2021.3.0/lib/intel64
-Wl,-rpath,/opt/hpc/software/mpi/hwloc/lib -L/opt/hpc/software/mpi/hwloc/lib
-Wl,-rpath,/opt/hpc/software/mpi/hpcx/v2.7.4/gcc-7.3.1/lib
-L/opt/hpc/software/mpi/hpcx/v2.7.4/gcc-7.3.1/lib
-Wl,-rpath,/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7
-L/opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7
-Wl,-rpath,/opt/rh/devtoolset-7/root/usr/lib64
-L/opt/rh/devtoolset-7/root/usr/lib64
-Wl,-rpath,/opt/hpc/software/mpi/hpcx/v2.7.4/sharp/lib
-L/opt/hpc/software/mpi/hpcx/v2.7.4/sharp/lib
-Wl,-rpath,/opt/hpc/software/mpi/hpcx/v2.7.4/hcoll/lib
-L/opt/hpc/software/mpi/hpcx/v2.7.4/hcoll/lib
-Wl,-rpath,/opt/hpc/software/mpi/hpcx/v2.7.4/ucx_without_rocm/lib
-L/opt/hpc/software/mpi/hpcx/v2.7.4/ucx_without_rocm/lib
-Wl,-rpath,/opt/rh/devtoolset-7/root/usr/lib
-L/opt/rh/devtoolset-7/root/usr/lib -lmkl_intel_lp64 -lmkl_core
-lmkl_sequential -lpthread -lm -lX11 -lstdc++ -ldl -lmpi_usempif08
-lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s
-lquadmath -lpthread -lquadmath -lstdc++ -ldl
-----------------------------------------
