Dear Matthew,

Thanks for your comments. I will prepare the log summary. To activate a separate log stage for each iteration (i.e., each RHS vector), I tried the code below but got an error. Could you comment on it? Does PetscLogView() allow users to specify different output file names so that each iteration generates its own log file?
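What I am ultimately hoping for is roughly the following (a sketch only, not tested code; the PetscViewerASCIIOpen() call, the per-iteration file name pattern "log_rhs_%d.txt", and the stage names are just my guesses at the intended usage):

/* start logging once, before the loop */
PetscLogBegin();
for (int i = 0; i < 1000; i++) {
  char          logname[PETSC_MAX_PATH_LEN];
  PetscLogStage stage;
  PetscViewer   viewer;

  /* one log stage per RHS vector, named after the iteration */
  PetscSNPrintf(logname, sizeof(logname), "log_rhs_%d.txt", i);
  PetscLogStageRegister(logname, &stage);
  PetscLogStagePush(stage);
  /* work to do for this RHS vector */
  PetscLogStagePop();

  /* create the viewer on a per-iteration file name before calling PetscLogView() */
  PetscViewerASCIIOpen(PETSC_COMM_WORLD, logname, &viewer);
  PetscLogView(viewer);
  PetscViewerDestroy(&viewer);
}

(I assume each file would then contain everything logged up to that iteration; is that right?)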
Evan

/* work */
PetscViewer viewer;
for (int i=0; i<1000; i++) {
  PetscLogBegin();
  /* work to do */
  PetscLogView(viewer);
}
/* work */

Errors:
Number of PCG iterations:1
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Corrupt argument: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: Invalid type of object: Parameter # 1
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.5.0, Jun, 30, 2014
[0]PETSC ERROR: fetdem3dp on a arch-linux2-c-debug named n0024.voltaire0 by esum Sun Nov 16 08:08:53 2014
[0]PETSC ERROR: Configure options --prefix=/clusterfs/voltaire/home/software/modules/petsc/3.5.0 --download-fblaslapack=1 --download-mumps=1 --download-parmetis=1 --with-ptscotch=1 --with-ptscotch-dir=/clusterfs/voltaire/home/software/modules/scotch/5.1.12-gcc/ --download-scalapack --download-metis=1 --download-superlu=1 --download-superlu_dist=1 --download-hypre=1 --with-mpi-dir=/global/software/sl-6.x86_64/modules/gcc/4.4.7/openmpi/1.6.5-gcc/
[0]PETSC ERROR: #1 PetscObjectTypeCompare() line 145 in /clusterfs/voltaire/home/software/source/petsc-3.5.0/src/sys/objects/destroy.c
[0]PETSC ERROR: #2 PetscLogView() line 1807 in /clusterfs/voltaire/home/software/source/petsc-3.5.0/src/sys/logging/plog.c

On Sun, Nov 16, 2014 at 3:57 AM, Matthew Knepley <knep...@gmail.com> wrote:

> On Sat, Nov 15, 2014 at 9:24 PM, Evan Um <eva...@gmail.com> wrote:
>
>> Dear PETSc users,
>>
>> I would like to show you a performance issue that arises when a Cholesky
>> factor is reused as a direct solver or preconditioner many times with many
>> right-hand-side (RHS) vectors. Can anyone suggest a solution to this issue?
>> Thanks in advance for your help.
>>
>> Regards,
>> Evan
>>
>> Example 1: I used MUMPS as a direct solver and measured the backward/forward
>> substitution time for selected RHS vectors. A simplified test code is shown
>> in Code 1 and the time measurements in Result 1. As shown below, the
>> backward/forward solution time is nearly constant (about 2.2 seconds) early
>> on, but later it increases overall, reaching about 12.8 seconds. It is hard
>> to understand why the backward/forward solution time gets longer although
>> the same numerical operation is carried out for each RHS vector (the loop
>> over i). For many RHS vectors, is there any parameter that I need to reset
>> so that a large number of RHS vectors is handled reliably?
>>
>> Example 2: Here I use the Cholesky factor as a preconditioner for the CG
>> solver, and a single iteration is performed. The sample code and time
>> measurements are shown in Code 2 and Result 2. Again, KSPSolve() gets slower
>> as the preconditioner is reused with many RHS vectors: early on it took
>> about 4.6 seconds, later about 16 seconds. Has anyone observed such a
>> performance issue? Do you know of a solution to this problem?
>> In the two experiments, I expected Example 2 to show a shorter solution time
>> per RHS vector than Example 1, because Example 2 uses scalable matrix-vector
>> multiplication, whereas MUMPS in Example 1 carries out backward/forward
>> substitutions that are inherently unscalable. However, my experiments
>> consistently showed that the backward/forward substitution of MUMPS is
>> faster than a single PCG iteration with the Cholesky preconditioner. Can
>> anyone explain this?
>>
>> FYI, in both Examples 1 and 2, I used a very large test problem (a sparse
>> SPD matrix of about 4,000,000-by-4,000,000).
>>
>> Code 1:
>> KSPCreate(PETSC_COMM_WORLD, &ksp_fetd_dt);
>> KSPSetOperators(ksp_fetd_dt, A_dt, A_dt);
>> KSPSetType(ksp_fetd_dt, KSPPREONLY);
>> KSPGetPC(ksp_fetd_dt, &pc_fetd_dt);
>> MatSetOption(A_dt, MAT_SPD, PETSC_TRUE);
>> PCSetType(pc_fetd_dt, PCCHOLESKY);
>> PCFactorSetMatSolverPackage(pc_fetd_dt, MATSOLVERMUMPS);
>> PCFactorSetUpMatSolverPackage(pc_fetd_dt);
>> PCFactorGetMatrix(pc_fetd_dt, &F_dt);
>> for (int i=0; i<1000; i++) {
>>   // Create a new RHS vector B_dt
>
> Are you calling VecCreate() each time here? If so, you could be filling up
> the memory of your computer as you run, causing it to slow down due to
> swapping.
>
> For any performance question, ALWAYS send the output of -log_summary. For
> your question, also divide this loop into log stages for each iteration, so
> we can see which specific operations slow down.
>
>    Matt
>
>>   time1=my_timer_function();
>>   KSPSolve(ksp_fetd_dt, B_dt, solution);
>>   time2=my_timer_function();
>>   // Output solution time = time2-time1;
>> }
>>
>> Result 1:
>> nTS:12 Time(s):2.400e-04 Sol. time(s):2.246e+00 PCG niter:1
>> nTS:13 Time(s):2.600e-04 Sol. time(s):2.329e+00 PCG niter:1
>> nTS:14 Time(s):2.800e-04 Sol. time(s):2.289e+00 PCG niter:1
>> nTS:15 Time(s):3.000e-04 Sol. time(s):2.239e+00 PCG niter:1
>> .
>> .
>> nTS:267 Time(s):5.340e-03 Sol. time(s):2.152e+00 PCG niter:1
>> nTS:280 Time(s):5.600e-03 Sol. time(s):2.255e+00 PCG niter:1
>> nTS:293 Time(s):5.860e-03 Sol. time(s):1.087e+01 PCG niter:1
>> nTS:307 Time(s):6.140e-03 Sol. time(s):1.225e+01 PCG niter:1
>> nTS:321 Time(s):6.420e-03 Sol. time(s):1.280e+01 PCG niter:1
>> nTS:337 Time(s):6.740e-03 Sol. time(s):1.243e+01 PCG niter:1
>> nTS:353 Time(s):7.060e-03 Sol. time(s):6.953e+00 PCG niter:1
>> nTS:369 Time(s):7.380e-03 Sol. time(s):8.712e+00 PCG niter:1
>> nTS:387 Time(s):7.740e-03 Sol. time(s):9.048e+00 PCG niter:1
>> nTS:405 Time(s):8.100e-03 Sol. time(s):1.099e+01 PCG niter:1
>> nTS:424 Time(s):8.480e-03 Sol. time(s):1.171e+01 PCG niter:1
>> nTS:445 Time(s):8.900e-03 Sol. time(s):1.294e+01 PCG niter:1
>> nTS:466 Time(s):9.320e-03 Sol. time(s):1.227e+01 PCG niter:1
>> nTS:488 Time(s):9.760e-03 Sol. time(s):8.581e+00 PCG niter:1
>> nTS:511 Time(s):1.022e-02 Sol. time(s):1.059e+01 PCG niter:1
>> nTS:535 Time(s):1.070e-02 Sol. time(s):8.452e+00 PCG niter:1
>>
>> Code 2:
>> KSPCreate(PETSC_COMM_WORLD, &ksp_fetd_dt);
>> KSPSetOperators(ksp_fetd_dt, A_dt, A_dt);
>> KSPSetType(ksp_fetd_dt, KSPPREONLY);
>> KSPGetPC(ksp_fetd_dt, &pc_fetd_dt);
>> MatSetOption(A_dt, MAT_SPD, PETSC_TRUE);
>> PCSetType(pc_fetd_dt, PCCHOLESKY);
>> PCFactorSetMatSolverPackage(pc_fetd_dt, MATSOLVERMUMPS);
>> PCFactorSetUpMatSolverPackage(pc_fetd_dt);
>> PCFactorGetMatrix(pc_fetd_dt, &F_dt);
>> KSPSetType(ksp_fetd_dt, KSPCG);
>> KSPSetTolerances(ksp_fetd_dt, 1e-9, 1.0e-50, 1.0e10, ksp_iter);
>> for (int i=0; i<1000; i++) {
>>   // Create a new RHS vector B_dt
>>   time1=my_timer_function();
>>   KSPSolve(ksp_fetd_dt, B_dt, solution);
>>   time2=my_timer_function();
>>   // Output solution time = time2-time1;
>> }
>>
>> Result 2:
>> nTS:10 Time(s):2.000e-04 Sol. time(s):4.644e+00 PCG niter:1 Norm:2.937e-10
>> nTS:11 Time(s):2.200e-04 Sol. time(s):4.585e+00 PCG niter:1 Norm:3.151e-10
>> nTS:12 Time(s):2.400e-04 Sol. time(s):4.737e+00 PCG niter:1 Norm:2.719e-10
>> nTS:13 Time(s):2.600e-04 Sol. time(s):4.537e+00 PCG niter:1 Norm:3.982e-10
>> nTS:14 Time(s):2.800e-04 Sol. time(s):4.578e+00 PCG niter:1 Norm:3.221e-10
>> nTS:15 Time(s):3.000e-04 Sol. time(s):4.678e+00 PCG niter:1 Norm:3.008e-10
>> nTS:16 Time(s):3.200e-04 Sol. time(s):4.619e+00 PCG niter:1 Norm:3.005e-10
>> nTS:46 Time(s):9.200e-04 Sol. time(s):6.862e+00 PCG niter:1 Norm:2.759e-10
>> nTS:48 Time(s):9.600e-04 Sol. time(s):4.475e+00 PCG niter:1 Norm:3.071e-10
>> nTS:50 Time(s):1.000e-03 Sol. time(s):6.068e+00 PCG niter:1 Norm:3.848e-10
>> nTS:52 Time(s):1.040e-03 Sol. time(s):5.024e+00 PCG niter:1 Norm:3.639e-10
>> nTS:55 Time(s):1.100e-03 Sol. time(s):1.333e+01 PCG niter:1 Norm:3.049e-10
>> nTS:58 Time(s):1.160e-03 Sol. time(s):1.440e+01 PCG niter:1 Norm:3.467e-10
>> nTS:60 Time(s):1.200e-03 Sol. time(s):1.475e+01 PCG niter:1 Norm:2.951e-10
>> nTS:63 Time(s):1.260e-03 Sol. time(s):9.899e+00 PCG niter:1 Norm:3.018e-10
>> nTS:66 Time(s):1.320e-03 Sol. time(s):1.097e+01 PCG niter:1 Norm:3.320e-10
>> nTS:69 Time(s):1.380e-03 Sol. time(s):1.485e+01 PCG niter:1 Norm:2.811e-10
>> nTS:73 Time(s):1.460e-03 Sol. time(s):1.199e+01 PCG niter:1 Norm:2.999e-10
>> nTS:76 Time(s):1.520e-03 Sol. time(s):1.109e+01 PCG niter:1 Norm:3.359e-10
>> nTS:80 Time(s):1.600e-03 Sol. time(s):1.473e+01 PCG niter:1 Norm:3.336e-10
>> nTS:84 Time(s):1.680e-03 Sol. time(s):1.444e+01 PCG niter:1 Norm:3.181e-10
>> nTS:88 Time(s):1.760e-03 Sol. time(s):1.639e+01 PCG niter:1 Norm:2.956e-10
>> nTS:92 Time(s):1.840e-03 Sol. time(s):1.713e+01 PCG niter:1 Norm:3.552e-10
>> nTS:96 Time(s):1.920e-03 Sol. time(s):1.562e+01 PCG niter:1 Norm:2.843e-10
>> nTS:101 Time(s):2.020e-03 Sol. time(s):1.631e+01 PCG niter:1 Norm:3.687e-10
>> nTS:105 Time(s):2.100e-03 Sol. time(s):1.496e+01 PCG niter:1 Norm:3.567e-10
>> nTS:111 Time(s):2.220e-03 Sol. time(s):1.607e+01 PCG niter:1 Norm:3.392e-10
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
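P.S. Regarding the VecCreate() question: below is roughly how I could restructure the loop so that each RHS vector is destroyed right after its solve and vectors do not accumulate in memory (a sketch only; duplicating B_dt from the solution vector and the placeholder assembly comment are just for illustration):

for (int i = 0; i < 1000; i++) {
  Vec B_dt;
  /* new RHS vector with the same layout as the solution vector */
  VecDuplicate(solution, &B_dt);
  /* ... fill B_dt for time step i (VecSetValues + VecAssemblyBegin/End) ... */
  time1 = my_timer_function();
  KSPSolve(ksp_fetd_dt, B_dt, solution);
  time2 = my_timer_function();
  /* release the RHS before the next time step */
  VecDestroy(&B_dt);
}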