Barry,

Your performance data is identical to mine.  Could you repost?

Thanks,
================================
 Keita Teranishi
 Scientific Library Group
 Cray, Inc.
 keita at cray.com
================================

From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-boun...@mcs.anl.gov] 
On Behalf Of Barry Smith
Sent: Tuesday, August 31, 2010 1:38 PM
To: For users of the development version of PETSc
Subject: Re: [petsc-dev] [GPU] Performance of ex19


  Interesting. Some numbers are worse than on our older system (MAXPY), some 
are a bit better, but nothing is hugely better. Here is the older one:

VecDot                 2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecMDot             2024 1.0 1.1760e+00 1.0 2.54e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 18 29  0  0  0  32 29  0  0  0  2163
VecNorm             2096 1.0 3.1199e-01 1.0 1.68e+08 1.0 0.0e+00 0.0e+00 
0.0e+00  5  2  0  0  0   9  2  0  0  0   537
VecScale            2092 1.0 1.7600e-01 1.0 8.37e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  3  1  0  0  0   5  1  0  0  0   475
VecCopy             2072 1.0 9.1996e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  1  0  0  0  0   3  0  0  0  0     0
VecSet                70 1.0 3.9999e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              108 1.0 1.5999e-02 1.0 8.64e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0   540
VecWAXPY              68 1.0 7.9999e-03 1.0 2.72e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0   340
VecMAXPY            2092 1.0 7.0399e-01 1.0 2.71e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 11 31  0  0  0  19 31  0  0  0  3844
VecScatterBegin        5 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceArith         2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceComm          1 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCUDACopyTo         10 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCUDACopyFrom        5 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SNESSolve              1 1.0 3.6199e+00 1.0 8.87e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 56100  0  0  0 100100  0  0  0  2451
SNESLineSearch         2 1.0 7.9999e-03 1.0 5.49e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0   687
SNESFunctionEval       3 1.0 3.9999e-03 1.0 2.52e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0   630
SNESJacobianEval       2 1.0 3.0399e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  5  0  0  0  0   8  0  0  0  0   127
KSPGMRESOrthog      2024 1.0 1.8280e+00 1.0 5.09e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 28 57  0  0  0  50 57  0  0  0  2783
KSPSetup               2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               2 1.0 3.3079e+00 1.0 8.83e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 51 99  0  0  0  91 99  0  0  0  2668
PCSetUp                2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply             2024 1.0 8.7996e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  1  0  0  0  0   2  0  0  0  0     0
MatMult             2092 1.0 8.3197e-01 1.0 3.32e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 13 37  0  0  0  23 37  0  0  0  3991
MatAssemblyBegin       2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         2 1.0 7.9989e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatZeroEntries         2 1.0 4.0002e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatFDColorApply        2 1.0 3.0399e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  5  0  0  0  0   8  0  0  0  0   127
MatFDColorFunc        42 1.0 1.2000e-02 1.0 3.53e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0  2940


On Aug 31, 2010, at 2:21 PM, Keita Teranishi wrote:


Barry,

Here it is.  The flops rate is better, but the solver is not multilevel 
anymore :(.

Thanks,

--- Event Stage 0: Main Stage

PetscBarrier           1 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0

--- Event Stage 1: SetUp

MatAssemblyBegin       1 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         1 1.0 8.0001e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   9  0  0  0  0     0
MatFDColorCreate       1 1.0 3.5999e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  1  0  0  0  0  41  0  0  0  0     0

--- Event Stage 2: Solve

VecDot                 2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecMDot             2024 1.0 1.1760e+00 1.0 2.54e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 18 29  0  0  0  32 29  0  0  0  2163
VecNorm             2096 1.0 3.1199e-01 1.0 1.68e+08 1.0 0.0e+00 0.0e+00 
0.0e+00  5  2  0  0  0   9  2  0  0  0   537
VecScale            2092 1.0 1.7600e-01 1.0 8.37e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  3  1  0  0  0   5  1  0  0  0   475
VecCopy             2072 1.0 9.1996e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  1  0  0  0  0   3  0  0  0  0     0
VecSet                70 1.0 3.9999e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              108 1.0 1.5999e-02 1.0 8.64e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0   540
VecWAXPY              68 1.0 7.9999e-03 1.0 2.72e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0   340
VecMAXPY            2092 1.0 7.0399e-01 1.0 2.71e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 11 31  0  0  0  19 31  0  0  0  3844
VecScatterBegin        5 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceArith         2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceComm          1 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCUDACopyTo         10 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCUDACopyFrom        5 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SNESSolve              1 1.0 3.6199e+00 1.0 8.87e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 56100  0  0  0 100100  0  0  0  2451
SNESLineSearch         2 1.0 7.9999e-03 1.0 5.49e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0   687
SNESFunctionEval       3 1.0 3.9999e-03 1.0 2.52e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0   630
SNESJacobianEval       2 1.0 3.0399e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  5  0  0  0  0   8  0  0  0  0   127
KSPGMRESOrthog      2024 1.0 1.8280e+00 1.0 5.09e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 28 57  0  0  0  50 57  0  0  0  2783
KSPSetup               2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               2 1.0 3.3079e+00 1.0 8.83e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 51 99  0  0  0  91 99  0  0  0  2668
PCSetUp                2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply             2024 1.0 8.7996e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  1  0  0  0  0   2  0  0  0  0     0
MatMult             2092 1.0 8.3197e-01 1.0 3.32e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 13 37  0  0  0  23 37  0  0  0  3991
MatAssemblyBegin       2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         2 1.0 7.9989e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatZeroEntries         2 1.0 4.0002e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatFDColorApply        2 1.0 3.0399e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  5  0  0  0  0   8  0  0  0  0   127
MatFDColorFunc        42 1.0 1.2000e-02 1.0 3.53e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0  2940
------------------------------------------------------------------------------------------------------------------------
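
Editorial note: the last column of each row above is the Mflop/s rate, and on a single process it can be recomputed as flops / time / 1e6 from the fourth and sixth columns. The sketch below does this for three Solve-stage events taken from the two runs in this thread. The dictionary and function names are illustrative only, and the two runs differ in preconditioner as well as in the number of levels, so the ratio is a rough comparison rather than a controlled one.

```python
# Values copied from the Solve-stage rows in this thread:
# (time in seconds, flops, Mflop/s as printed in the last column).
single_level = {  # run with -dmmg_nlevels 1 -pc_type none
    "VecMDot":  (1.1760e+00, 2.54e+09, 2163),
    "VecMAXPY": (7.0399e-01, 2.71e+09, 3844),
    "MatMult":  (8.3197e-01, 3.32e+09, 3991),
}
multilevel = {    # run with -dmmg_nlevels 5 -pc_type jacobi
    "VecMDot":  (5.5599e-01, 2.95e+08, 530),
    "VecMAXPY": (2.9199e-01, 3.14e+08, 1074),
    "MatMult":  (1.3600e-01, 3.83e+08, 2815),
}


def mflops(seconds, flops):
    """Recompute the Mflop/s rate from the time and flops columns."""
    return flops / seconds / 1.0e6


def rate_ratio(event):
    """Ratio of the recomputed single-level rate to the multilevel rate."""
    t1, f1, _ = single_level[event]
    t5, f5, _ = multilevel[event]
    return mflops(t1, f1) / mflops(t5, f5)


for name in single_level:
    t1, f1, printed1 = single_level[name]
    t5, f5, printed5 = multilevel[name]
    print(f"{name:10s} 1 level: {mflops(t1, f1):7.0f} (printed {printed1})   "
          f"5 levels: {mflops(t5, f5):7.0f} (printed {printed5})   "
          f"ratio: {rate_ratio(name):.2f}")
```

The recomputed rates differ slightly from the printed ones because the flops column is rounded to three significant digits in the log.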

From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-boun...@mcs.anl.gov] 
On Behalf Of Barry Smith
Sent: Tuesday, August 31, 2010 10:53 AM
To: For users of the development version of PETSc
Subject: Re: [petsc-dev] [GPU] Performance of ex19


  Please run with the options ./ex19 -da_vec_type seqcuda -da_mat_type 
seqaijcuda -pc_type none -dmmg_nlevels 1 -da_grid_x 100 -da_grid_y 100 
-log_summary -mat_no_inode -preload off -cuda_synchronize
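
Editorial note: the command above is wrapped across three lines by the archive. As a convenience, the sketch below rebuilds it as a single line from an argument list. The option names and values are copied from the message; the function name and its parameters are hypothetical and only exist to make the grid size and level count easy to vary.

```python
import shlex


def ex19_command(grid_x=100, grid_y=100, levels=1):
    """Return the argument list for the run requested in the message above.

    The defaults reproduce the options exactly as posted; the parameters
    are an assumption added for illustration.
    """
    return [
        "./ex19",
        "-da_vec_type", "seqcuda",
        "-da_mat_type", "seqaijcuda",
        "-pc_type", "none",
        "-dmmg_nlevels", str(levels),
        "-da_grid_x", str(grid_x),
        "-da_grid_y", str(grid_y),
        "-log_summary",
        "-mat_no_inode",
        "-preload", "off",
        "-cuda_synchronize",
    ]


# Join into one shell-safe line for copy and paste.
command_line = shlex.join(ex19_command())
print(command_line)
```

The same list could be passed to `subprocess.run` on a machine where the ex19 example has been built; that step is left out here because it depends on the local PETSc installation.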


On Aug 31, 2010, at 11:45 AM, Keita Teranishi wrote:



Hi PETSc Developer team,

I have just measured the performance of the ex19 program running on a Fermi 
GPU.  I hope it will help you develop GPU-enabled PETSc further.

Thanks,

Keita

./ex19 -pc_type jacobi -dmmg_nlevels 5 -da_vec_type cuda -da_mat_type aijcuda 
-log_summary -cuda_synchronize


--- Event Stage 0: Main Stage

PetscBarrier           2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0

--- Event Stage 1: SetUp

VecSet                 8 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCUDACopyFrom        8 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMultTranspose       4 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0 58  0  0  0     0
MatAssemblyBegin       9 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         9 1.0 3.9999e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0  14  0  0  0  0     0
MatFDColorCreate       5 1.0 1.2000e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0  43  0  0  0  0     0

--- Event Stage 2: Solve

VecDot                 2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecMDot              980 1.0 5.5599e-01 1.0 2.95e+08 1.0 0.0e+00 0.0e+00 
0.0e+00 10 14  0  0  0  39 28  0  0  0   530
VecNorm             1025 1.0 1.2399e-01 1.0 1.95e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  2  1  0  0  0   9  2  0  0  0   158
VecScale            1013 1.0 9.9998e-02 1.0 9.73e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  2  0  0  0  0   7  1  0  0  0    97
VecCopy              208 1.0 3.9999e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                45 1.0 7.9989e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   1  0  0  0  0     0
VecAXPY              233 1.0 3.9999e-03 1.0 1.68e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0   419
VecWAXPY              33 1.0 3.9990e-03 1.0 3.17e+05 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0    79
VecMAXPY            1013 1.0 2.9199e-01 1.0 3.14e+08 1.0 0.0e+00 0.0e+00 
0.0e+00  5 15  0  0  0  21 30  0  0  0  1074
VecPointwiseMult     988 1.0 9.5995e-02 1.0 9.42e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  2  0  0  0  0   7  1  0  0  0    98
VecScatterBegin       13 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceArith         2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceComm          1 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCUDACopyTo         24 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCUDACopyFrom       21 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMult             1013 1.0 1.3600e-01 1.0 3.83e+08 1.0 0.0e+00 0.0e+00 
0.0e+00  2 18  0  0  0  10 37  0  0  0  2815
MatMultTranspose       8 1.0 3.9999e-03 1.0 1.15e+05 1.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0    29
MatAssemblyBegin      10 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd        10 1.0 8.0001e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   1  0  0  0  0     0
MatZeroEntries        10 1.0 4.0002e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatFDColorApply       10 1.0 8.7998e-02 1.0 1.26e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  2  1  0  0  0   6  1  0  0  0   143
MatFDColorFunc       210 1.0 1.2000e-02 1.0 1.15e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  0  1  0  0  0   1  1  0  0  0   958
SNESSolve              1 1.0 1.4160e+00 1.0 1.04e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 25 50  0  0  0 100100  0  0  0   737
SNESLineSearch         2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SNESFunctionEval       3 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SNESJacobianEval       2 1.0 9.1998e-02 1.0 1.27e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  2  1  0  0  0   6  1  0  0  0   138
KSPGMRESOrthog       980 1.0 8.3199e-01 1.0 5.89e+08 1.0 0.0e+00 0.0e+00 
0.0e+00 15 28  0  0  0  59 56  0  0  0   708
KSPSetup               2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               2 1.0 1.3240e+00 1.0 1.03e+09 1.0 0.0e+00 0.0e+00 
0.0e+00 23 49  0  0  0  93 99  0  0  0   778
PCSetUp                2 1.0 3.9999e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply              980 1.0 9.5995e-02 1.0 9.41e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  2  0  0  0  0   7  1  0  0  0    98
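
Editorial note: each log row in this thread is wrapped onto two lines by the archive, and some percentage columns run together (for example "100100" in the SNESSolve row). A minimal sketch for reading such rows is given below. It assumes that a continuation line always starts with a digit and an event name always starts with a letter, which holds for the rows shown here but is not guaranteed for PETSc logs in general. It reads only the leading fields and the final Mflop/s field, so the fused percentage columns do not matter. All function names are illustrative.

```python
def join_wrapped_rows(lines):
    """Rejoin rows that the archive wrapped onto a second line."""
    rows = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("---"):
            continue  # skip blank lines, stage headers, and separators
        if line[0].isdigit() and rows:
            rows[-1] = rows[-1] + " " + line  # continuation of previous row
        else:
            rows.append(line)
    return rows


def parse_row(row):
    """Extract event name, count, time, flops, and Mflop/s from one row."""
    tokens = row.split()
    return {
        "event": tokens[0],
        "count": int(tokens[1]),
        "seconds": float(tokens[3]),
        "flops": float(tokens[5]),
        "mflops": int(tokens[-1]),
    }


# Two rows copied from the Solve stage of the multilevel run above.
sample = [
    "--- Event Stage 2: Solve",
    "",
    "VecMDot              980 1.0 5.5599e-01 1.0 2.95e+08 1.0 0.0e+00 0.0e+00 ",
    "0.0e+00 10 14  0  0  0  39 28  0  0  0   530",
    "SNESSolve              1 1.0 1.4160e+00 1.0 1.04e+09 1.0 0.0e+00 0.0e+00 ",
    "0.0e+00 25 50  0  0  0 100100  0  0  0   737",
]

parsed = [parse_row(row) for row in join_wrapped_rows(sample)]
for entry in parsed:
    print(entry)
```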

